Best Practices on Posit Open Source

Rapp 0.3.0

Tomasz Kalinowski — Wed, 18 Feb 2026 00:00:00 +0000

We’re excited to share our first tidyverse blog post for Rapp, alongside the 0.3.0 release. Rapp helps you turn R scripts into polished command-line tools, with argument parsing and help generation built in.

Why a command-line interface for R?

A command-line interface (CLI) lets you run programs from a terminal, without opening an IDE or starting an interactive R session. This is useful when you want to:

automate tasks via cron jobs, scheduled tasks, or CI/CD pipelines
chain R scripts together with other tools in data pipelines
let others run your R code without needing to know R
package reusable tools that feel native to the terminal
expose specific actions through a clean interface that LLM agents can invoke

There are several established packages for building CLIs in R, including argparse, optparse, and docopt, where you explicitly parse and handle command-line arguments in code. Rapp takes a different approach: it derives the CLI surface from the structure of your R script and injects values at runtime, so you never need to handle CLI arguments manually.

How Rapp works

At its core, Rapp is an alternative front-end to R: a drop-in replacement for Rscript that automatically turns common R expression patterns into command-line options, switches, positional arguments, and subcommands. You write normal R code and Rapp handles the CLI surface.

Rapp also uses special #| comments (similar to Quarto’s YAML-in-comments syntax) to add metadata such as help descriptions and short aliases.

A tiny example

Here’s a complete Rapp script (from the package examples), a coin flipper:

#!/usr/bin/env Rapp
#| name: flip-coin
#| description: |
#|   Flip a coin.

#| description: Number of coin flips
#| short: 'n'
flips <- 1L

sep <- " "
wrap <- TRUE

seed <- NA_integer_
if (!is.na(seed)) {
  set.seed(seed)
}

cat(sample(c("heads", "tails"), flips, TRUE), sep = sep, fill = wrap)

Let’s break down how Rapp interprets this script:

R code	Generated CLI option	What it does
`flips <- 1L`	`--flips` or `-n`	Integer option with default of 1
`sep <- " "`	`--sep`	String option with default of `" "`
`wrap <- TRUE`	`--wrap` / `--no-wrap`	Boolean toggle (TRUE/FALSE becomes on/off)
`seed <- NA_integer_`	`--seed`	Optional integer (NA means “not set”)

The #| short: 'n' comment adds -n as a short alias for --flips. The #!/usr/bin/env Rapp line (called a “shebang”) lets you run the script directly on macOS and Linux without typing Rapp first.

Running the script

With Rapp installed and flip-coin available on your PATH (see Get started below), you can run the app from the terminal:

flip-coin -n 3
#> heads tails heads

flip-coin --seed 42 -n 5
#> tails heads tails tails heads

Auto-generated help

Rapp generates --help from your script (and --help-yaml if you want a machine-readable spec):

flip-coin --help

Usage: flip-coin [OPTIONS]

Flip a coin.

Options:
  -n, --flips   Number of coin flips [default: 1] [type: integer]
  --sep           [default: " "] [type: string]
  --wrap / --no-wrap   [default: true] Disable with `--no-wrap`.
  --seed         [default: NA] [type: integer]

Breaking change in 0.3.0: positional arguments are now required by default

If you’re upgrading from an earlier version of Rapp, note that positional arguments are now required unless explicitly marked optional.

# Before 0.3.0: this positional was optional
name <- NULL

# In 0.3.0+: add this comment to keep it optional
#| required: false
name <- NULL

If your scripts use positional arguments with NULL defaults that should remain optional, add #| required: false above them.

Highlights in 0.3.0

Rapp will be new to most readers, so rather than listing every change, here are the main ideas (and what’s improved in 0.3.0).

Options, switches, and repeatable flags from plain R

Rapp recognizes a small set of “declarative” patterns at the top level of your script:

Scalar literals like flips <- 1L become options like --flips 10.
Logical defaults like wrap <- TRUE become toggles like --wrap / --no-wrap.
#| short: n adds a short alias like -n (new in 0.3.0).
c() and list() defaults declare repeatable options (new in 0.3.0): callers can supply the same flag multiple times and values are appended.
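
As a rough sketch of the last pattern, the script below declares a repeatable option with an empty c() default. The app name greet-all, the option name person, and the invocation shown in the comment are illustrative assumptions, not taken from the Rapp documentation.

```r
#!/usr/bin/env Rapp
#| name: greet-all
#| description: Greet each person named on the command line.

#| description: Person to greet (may be given more than once)
#| short: p
person <- c()

# Hypothetical invocation: greet-all -p Ada -p Grace
# When no --person is supplied, the loop body never runs.
for (p in person) {
  cat("Hello,", p, "\n")
}
```

Because the #| lines are ordinary comments, the same file still runs as plain R, where person is simply empty.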

Subcommands with `switch()`

Rapp can now turn a switch() block into subcommands (and you can nest switch() blocks for nested commands). Here’s a small sketch of a todo-style app:

#!/usr/bin/env Rapp
#| name: todo
#| description: Manage a simple todo list.

#| description: Path to the todo list file.
#| short: s
store <- ".todo.yml"

switch(
  command <- "",

  #| description: Display the todos
  list = {
    limit <- 30L
    # ...
  },

  #| description: Add a new todo
  add = {
    task <- NULL
    # ...
  }
)

Help is scoped to the command you’re asking about, so todo --help lists the commands, and todo list --help shows just the options/arguments for list (plus any parent/global options).

Installable launchers for package CLIs

A big part of sharing CLI tools is making them easy to run after installation. In 0.3.0, install_pkg_cli_apps() installs lightweight launchers for scripts in a package’s exec/ directory that use either #!/usr/bin/env Rapp or #!/usr/bin/env Rscript:

Rapp::install_pkg_cli_apps("mypackage")

(There’s also uninstall_pkg_cli_apps() to remove a package’s launchers.)

Get started

Here’s the quickest path to your first Rapp script:

# 1. Install the package
install.packages("Rapp")

# 2. Install the command-line launcher
Rapp::install_pkg_cli_apps("Rapp")

Then create a script (e.g., hello.R):

#!/usr/bin/env Rapp
#| name: hello
#| description: Say hello

name <- "world"
cat("Hello,", name, "\n")

And run it:

Rapp hello.R --name "R users"
#> Hello, R users

Learn more

To dig deeper into Rapp:

browse examples in the package: system.file("examples", package = "Rapp")
read the full documentation: https://github.com/r-lib/Rapp
note that Rapp requires R ≥ 4.1.0

If you try Rapp, we’d love feedback! We especially want to hear about your experiences with edge cases in argument parsing, help output, and how commands should feel. Issues and ideas are welcome at https://github.com/r-lib/Rapp/issues.

mirai 2.6.0

Charlie Gao — Thu, 12 Feb 2026 00:00:00 +0000

mirai 2.6.0 is now on CRAN. mirai is R’s framework for parallel and asynchronous computing. If you’re fitting models, running simulations, or building Shiny apps, mirai lets you spread that work across multiple processes – locally or on remote infrastructure.

With this release, it bridges the gap between your laptop and enterprise infrastructure – the same code you prototype locally now deploys to Posit Workbench or any cloud HTTP API, with a single function call.

You can install it from CRAN with:

install.packages("mirai")

The flagship feature for this release is the HTTP launcher for deploying daemons to cloud and enterprise platforms. This release also brings a C-level dispatcher for minimal task dispatch overhead, race_mirai() for process-as-completed patterns, synchronous mode for debugging, and daemon synchronization for remote deployments. You can see a full list of changes in the release notes.

How mirai works

If you’ve ever waited for a loop to finish fitting models, processing files, or calling APIs, mirai can help. Any task that’s repeated independently across items is a candidate for parallel execution.

The previous release post covered mirai’s design philosophy in detail. Here’s a brief overview for readers encountering mirai for the first time.

library(mirai)
# Set up 4 background processes
daemons(4)

# Send work -- non-blocking, returns immediately
m <- mirai({
  Sys.sleep(1)
  100 + 42
})
m
#> < mirai [] >

# Collect the result when ready
m[]
#> [1] 142

# Shut down
daemons(0)

That’s mirai in a nutshell: daemons() to set up workers, mirai() to send work, [] to collect results. Everything else builds on this.
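
Because mirai() returns immediately, the host session can keep working while a task runs. The sketch below polls with unresolved() instead of blocking on []; the half-second sleep and the ticks counter are placeholders for real work.

```r
library(mirai)
daemons(1)

m <- mirai({
  Sys.sleep(0.5)
  "done"
})

# The host session is free to do other work while the task runs
ticks <- 0L
while (unresolved(m)) {
  ticks <- ticks + 1L   # stand-in for useful work in the host session
  Sys.sleep(0.1)
}

# Once resolved, the value is available without blocking
m$data

daemons(0)
```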

In mirai’s hub architecture, the host session listens at a URL and daemons – background R processes that do the actual work – connect to it. You send tasks with mirai(), and the dispatcher routes them to available daemons in first-in, first-out (FIFO) order.

This design enables dynamic scaling: daemons can connect and disconnect at any time without disrupting the host. Add capacity when you need it, release it when you don’t.

A single compute profile can mix daemons launched by different methods, and you can run multiple profiles simultaneously to direct different tasks to different resources. The basic syntax for each deployment method:

Deploy to	Setup
Local	`daemons(4)`
Remote (SSH)	`daemons(url = host_url(), remote = ssh_config(...))`
HPC cluster (Slurm, SGE, PBS, LSF)	`daemons(url = host_url(), remote = cluster_config())`
HTTP API / Posit Workbench	`daemons(url = host_url(), remote = http_config())`

Change one line and your local prototype runs on a Slurm cluster. Change it again and it runs on Posit Workbench. Your analysis code stays identical.
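
To illustrate running more than one profile at once, here is a minimal local sketch using the .compute argument of daemons() and mirai(). The profile names "cpu" and "io" are arbitrary labels chosen for this example.

```r
library(mirai)

# Two independent compute profiles; the names are arbitrary labels
daemons(2, .compute = "cpu")
daemons(1, .compute = "io")

# Each task is routed only to daemons in the named profile
a <- mirai(Sys.getpid(), .compute = "cpu")
b <- mirai(Sys.getpid(), .compute = "io")

pid_cpu <- a[]
pid_io <- b[]

# The two tasks ran in different background processes
pid_cpu != pid_io

# Each profile is shut down separately
daemons(0, .compute = "cpu")
daemons(0, .compute = "io")
```

In a real deployment, one profile might point at local daemons and the other at remote ones, with the task code unchanged.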

The async foundation for the modern R stack

mirai has become the convergence point for asynchronous and parallel computing across the R ecosystem.

It is the recommended async backend for Shiny – if you’re building production Shiny apps, you should be using mirai. It is the only async backend for the next-generation plumber2 – if you’re building APIs with plumber2, you’re already using mirai.

It is the parallel backend for purrr – if you use map(), mirai is how you make it parallel. Wrap your function in in_parallel(), set up daemons, and your map calls run across all of them:

library(purrr)
daemons(4)
models <- split(mtcars, mtcars$cyl) |>
  map(in_parallel(\(x) lm(mpg ~ wt + hp, data = x)))
daemons(0)

It powers targets – the pipeline orchestration tool for reproducible analysis. And most recently, ragnar – the Tidyverse package for retrieval-augmented generation (RAG) – adopted mirai for its parallel processing.

As an official alternative communications backend for R’s parallel package, mirai underpins workflows from interactive web applications to pipeline orchestration to AI-powered document processing.

Learn mirai, and you’ve learned the async primitive that powers the modern R stack. The same two concepts – daemons() to set up workers, mirai() to send work – are all you need to keep a Shiny app responsive or run async tasks in production.

HTTP launcher

This release extends the “deploy everywhere” principle with http_config(), a new remote launch configuration that deploys daemons via HTTP API calls, so it works with any platform that exposes an HTTP API for launching jobs.

Posit Workbench

Many organizations use Posit Workbench to run research and data science at scale. mirai now integrates directly with it.¹ Call http_config() with no arguments and it auto-configures using the Workbench environment:

daemons(n = 4, url = host_url(), remote = http_config())

That’s it. Four daemons launch as Workbench jobs, connect back to your session, and you can start sending work to them.

Here’s what that looks like in practice: you’re developing a model in your Workbench session. Fitting it locally is slow. Add that line, and those fits fan out across four Workbench-managed compute jobs. When you’re done, daemons(0) releases them. No YAML, no job scripts, no leaving your R session – resource allocation, access control, and job lifecycle are all handled by the platform.

If you’ve been bitten by expired tokens in long-running sessions, http_config() is designed to prevent that. Under the hood, it stores functions rather than static values for credentials and endpoint URLs. These functions are called at the moment daemons actually launch, so session cookies and API tokens are always fresh – even if you created the configuration hours earlier.
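
The difference between a stored value and a stored function can be seen in plain R. This is only an illustration of the idea, not mirai’s internals; the environment variable DEMO_TOKEN and its values are made up for the example.

```r
# Hypothetical environment variable standing in for a session credential
Sys.setenv(DEMO_TOKEN = "token-issued-at-09:00")

static_token <- Sys.getenv("DEMO_TOKEN")            # captured once, right now
lazy_token <- function() Sys.getenv("DEMO_TOKEN")   # looked up on every call

# Later, the platform rotates the credential
Sys.setenv(DEMO_TOKEN = "token-issued-at-13:00")

static_token   # still holds the 09:00 value
lazy_token()   # reflects the rotated 13:00 value
```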

See the mirai vignette for troubleshooting remote launches.

Custom APIs

The HTTP launcher works with any HTTP API, not just Workbench. Supply your own endpoint, authentication, and request body:

daemons(
  n = 2,
  url = host_url(),
  remote = http_config(
    url = "https://api.example.com/launch",
    method = "POST",
    token = function() Sys.getenv("MY_API_KEY"),
    data = '{"command": "%s"}'
  )
)

The "%s" placeholder in data is where mirai inserts the daemon launch command at launch time. Each argument can be a plain value or a function – use functions for anything that changes between launches (tokens, cookies, dynamic URLs).

This opens up a wide range of deployment targets: Kubernetes job APIs, other cloud container services, or any internal job scheduler with an HTTP interface. If you can launch a process with an HTTP call, mirai can use it.

C-level dispatcher

The overhead of distributing your tasks is now negligible. In a mirai_map() over thousands of items, what you measure is the time of your actual computation, not the framework – per-task dispatch overhead is now in the tens of microseconds, where existing R parallelism solutions typically operate in the millisecond range.

Under the hood, the dispatcher – the process that sits between your session and the daemons, routing tasks to available workers – has been re-implemented entirely in C code within nanonext. This eliminates the R interpreter overhead that remained, while the dispatcher continues to be event-driven and consume zero CPU when idle.

This also removes the bottleneck when coordinating large numbers of daemons, which matters directly for the kind of scaled-out deployments that the HTTP launcher enables – dozens of Workbench jobs or cloud instances all connecting to a single dispatcher. The two features are designed to work together: deploy broadly, dispatch efficiently. mirai is built to scale from 2 cores on your laptop to 200 across a cluster, without the framework slowing you down.
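
One rough way to get a feel for per-task overhead on your own machine is to time a map over trivial tasks. This sketch measures the whole round trip (serialization, dispatch, and collection), so it gives an upper bound rather than the dispatcher’s cost alone; the task count of 1000 is arbitrary.

```r
library(mirai)
daemons(2)

n <- 1000L

# Each task does essentially no work, so elapsed time is mostly framework overhead
elapsed <- system.time(
  res <- mirai_map(seq_len(n), \(x) x)[]
)[["elapsed"]]

cat(sprintf("approx. %.1f microseconds per task (round trip)\n", 1e6 * elapsed / n))

daemons(0)
```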

`race_mirai()`

race_mirai() lets you process results as they arrive, rather than waiting for the slowest task. Suppose you’re fitting 10 models with different hyperparameters in parallel – some converge quickly, others take much longer. Without race_mirai(), you wait for the slowest fit to complete before seeing any results. With it, you can inspect or save each model the instant it finishes – updating a progress display, freeing memory, or deciding whether to continue the remaining fits at all.

race_mirai() returns the integer index of the first resolved mirai. This makes the “process as completed” pattern clean and efficient:

daemons(4)

# Launch 10 model fits in parallel
fits <- lapply(param_grid, function(p) mirai(fit_model(data, p), data = data, p = p))

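# Process each result as soon as it's ready
remaining <- fits
while (length(remaining) > 0) {
  # race_mirai() waits until at least one mirai has resolved and returns its index
  idx <- race_mirai(remaining)
  # assumes fit_model() returns a list that records its parameters as `p`
  cat("Finished model with params:", remaining[[idx]]$data$p, "\n")
  remaining <- remaining[-idx]
}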

daemons(0)

Send off a batch of tasks, then process results in the order they finish – no polling, no wasted time waiting on the slowest one. If any mirai is already resolved when you call race_mirai(), it returns immediately. This pattern applies whenever tasks have variable completion times – parallel model fits, API calls, simulations, or any batch where you want to stream results as they land.

Synchronous mode

When tasks don’t behave as expected, you need a way to inspect them interactively.

Without synchronous mode, errors in a mirai return as miraiError objects – you can see that something went wrong, but you can’t step through the code to find out why. The task ran in a separate process, and by the time you see the error, that process has moved on.

daemons(sync = TRUE), introduced in 2.5.1, solves this. It runs everything in the current process – no background processes, no networking – just sequential execution. You can use browser() and other interactive debugging tools directly:

daemons(sync = TRUE)
mirai(
  {
    browser()
    mypkg::some_complex_function(x)
  },
  x = my_data
)

You can scope synchronous mode to a specific compute profile, isolating the problematic task for inspection while the rest of your pipeline keeps running in parallel.

Daemon synchronization with `everywhere()`

everywhere() runs setup operations on all daemons – loading packages, sourcing scripts, or preparing datasets – so they’re ready before you send work.

When launching remote daemons – via SSH, HPC schedulers, or the new HTTP launcher – there’s an inherent delay between requesting a daemon and that daemon being ready to accept work. The new .min argument ensures that setup has completed on at least that many daemons before returning:

daemons(n = 8, url = host_url(), remote = http_config())

# Wait until all 8 daemons are connected before continuing
everywhere(library(mypackage), .min = 8)

# Now send work once all daemons are ready
mp <- mirai_map(tasks, process)

This creates a synchronization point, ensuring your pipeline doesn’t start sending work before all daemons are ready. It’s especially useful for remote deployments where connection times are unpredictable.

Minor improvements and fixes

miraiError objects now have conditionCall() and conditionMessage() methods, making them easier to use with R’s standard condition handling.
The default exit behavior for daemons has been updated with a 200ms grace period before forceful termination, which allows OpenTelemetry disconnection events to be traced.
OpenTelemetry span names and attributes have been revised to better follow semantic conventions.
daemons() now properly validates that url is a character value where supplied.
Fixed a bug where repeated mirai cancellation could sometimes cause a daemon to exit prematurely.
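
The first item above can be tried with a short sketch. It assumes mirai 2.6.0 or later for the conditionMessage() method; the error text is made up for the example.

```r
library(mirai)
daemons(1)

# A task that fails on the daemon
m <- mirai(stop("model failed to converge"))
call_mirai(m)           # wait for the task to resolve

res <- m$data
is_mirai_error(res)     # the error came back as a value, not a thrown condition
conditionMessage(res)   # standard accessor, available for miraiError in this release

daemons(0)
```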

Try it now

install.packages("mirai")
library(mirai)

daemons(4)
system.time(mirai_map(1:4, \(x) Sys.sleep(1))[])
#>    user  system elapsed
#>   0.000   0.001   1.003
daemons(0)

Four one-second tasks, one second of wall time. If those were four model fits that each took a minute, you’d go from four minutes down to one – and if you needed more power, switching to Workbench or a Slurm cluster is a one-line change. Visit mirai.r-lib.org for the full documentation.

Acknowledgements

A big thank you to all the folks who helped make this release happen:

@agilly, @aimundo, @barnabasharris, @beevabeeva, @boshek, @eliocamp, @jan-swissre, @jeroenjanssens, @kentqin-cve, @mcol, @michaelmayer2, @pmac0451, @r2evans, @shikokuchuo, @t-kalinowski, @VincentGuyader, @wlandau, and @xwanner.

¹ Requires Posit Workbench version 2026.01 or later, which enables launcher authentication using the session cookie.

nanonext 1.8.0

Charlie Gao — Mon, 09 Feb 2026 00:00:00 +0000

When we introduced nanonext last year, we showed how it connects R directly to Python, Go, Rust, and other languages through NNG’s messaging protocols. We hinted at its web capabilities – but that was just the beginning.

R already has excellent web infrastructure. Shiny and plumber2 are the go-to tools for building interactive applications and REST APIs in R. They are both powered by httpuv. nanonext adds a complementary option at the httpuv level of the stack – a low-level streaming HTTP and WebSocket server built on NNG, giving developers fine-grained control over connections, streaming, and static file serving over TLS. nanonext is for when you need lower-level control – custom protocols, infrastructure endpoints, or embedding a server alongside an existing Shiny or plumber2 application.

You can install it from CRAN with:

install.packages("nanonext")

You can see a full list of changes in the release notes.

Streaming HTTP/WebSocket server

The flagship feature of this release is http_server(), a streaming HTTP and WebSocket server with full TLS support. Built on NNG’s HTTP server architecture, it brings the same performance that powers nanonext’s messaging layer to web serving.

One server, one port – HTTP endpoints, WebSocket connections, and streaming all coexist. Static files bypass R entirely, served natively by NNG. WebSocket and streaming connections run callbacks on R’s main thread via the later package. Mbed TLS is built in for HTTPS/WSS, and there’s no need to run separate processes or bind additional ports.

Because it shares the same event loop that Shiny uses, http_server() can run alongside a Shiny app in the same R process. You could spin up a nanonext server to handle health checks, serve static assets, or stream real-time events – while Shiny or plumber2 handles the application logic. They’re designed to work together.

As part of our investment in expanding what’s possible with R, we’re already using nanonext at Posit to explore new real-time capabilities, and we’re excited to see what the community builds with it.

Basic HTTP server

Where frameworks like plumber2 give you a full-featured API layer with routing, serialization, and documentation out of the box, http_server() gives you direct control over requests and responses – the kind of direct access you’d reach for when building custom infrastructure or embedding a server inside a larger system.

library(nanonext)

server <- http_server(
  url = "http://127.0.0.1:8080",
  handlers = list(
    handler("/", function(req) {
      list(status = 200L, body = "Hello from nanonext!")
    }),
    handler("/api/data", function(req) {
      list(
        status = 200L,
        headers = c("Content-Type" = "application/json"),
        body = '{"value": 42}'
      )
    }, method = "GET")
  )
)
server$start()

Handlers receive a request and return a response list. You can freely mix handler types in a single server:

Handler	Purpose
`handler()`	HTTP request/response with R callback
`handler_ws()`	WebSocket with `on_message`, `on_open`, `on_close` callbacks
`handler_stream()`	Chunked HTTP streaming (SSE, NDJSON, custom)
`handler_file()`	Serve a single static file
`handler_directory()`	Serve a directory tree with automatic MIME types
`handler_inline()`	Serve in-memory content
`handler_redirect()`	HTTP redirect

Specifying port 0 in the URL lets the operating system assign an available port. The actual port is reflected in server$url after $start(), so you can set up test servers without worrying about port conflicts.

Static file serving

Static handlers bypass R entirely – NNG serves content directly and efficiently:

handler_directory("/static", "www/assets")  # serve a folder
handler_file("/favicon.ico", "favicon.ico") # serve a single file
handler_inline(
  "/robots.txt",
  "User-agent: *\nDisallow:",
  content_type = "text/plain"
) # serve in-memory content

For example, you can serve a rendered Quarto website with a single handler:

server <- http_server(
  url = "http://127.0.0.1:0",
  handlers = handler_directory("/", "_site")
)
server$start()
server$url
# Browse to the URL to see your Quarto site

WebSocket server

WebSockets provide full bidirectional communication – the server can push messages to the client, and the client can send messages back. WebSocket and HTTP handlers share the same server and port, so a browser can load a page over HTTP and open a WebSocket to the same origin – no cross-origin configuration needed:

server <- http_server(
  url = "http://127.0.0.1:8080",
  handlers = list(
    handler("/", function(req) list(status = 200L, body = "...")),
    handler_ws("/ws",
      on_message = function(ws, data) ws$send(data),
      on_open = function(ws) cat("connected:", ws$id, "\n"),
      on_close = function(ws) cat("disconnected:", ws$id, "\n")
    )
  )
)

This makes it easy to build lightweight real-time services – monitoring endpoints or live-updating feeds that push results to the browser as they arrive.

HTTP streaming and Server-Sent Events

When you only need to push data in one direction – server to client – streaming is a lighter-weight alternative to WebSockets. It works over plain HTTP, so any client that speaks HTTP can consume the stream without needing a WebSocket library. handler_stream() enables chunked transfer encoding for streaming responses:

conns <- list()

handler_stream("/events",
  on_request = function(conn, req) {
    conn$set_header("Content-Type", "text/event-stream")
    conn$set_header("Cache-Control", "no-cache")
    conns[[as.character(conn$id)]] <<- conn
    conn$send(format_sse(data = "connected", id = "1"))
  },
  on_close = function(conn) {
    conns[[as.character(conn$id)]] <<- NULL
  }
)

The format_sse() helper formats messages per the SSE specification. On the browser side, updates arrive automatically as they happen – no page refreshes or repeated requests needed. Streaming also supports NDJSON and custom formats – useful for streaming model training progress, sensor readings, monitoring endpoints, or pipeline notifications.
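
For readers unfamiliar with the wire format, the base R sketch below builds a Server-Sent Events message by hand. The function sse_message() is a hypothetical stand-in written for this illustration; it is not nanonext’s format_sse() and may differ from it in details.

```r
# Hypothetical helper showing the SSE wire format; not nanonext's implementation
sse_message <- function(data, event = NULL, id = NULL) {
  lines <- c(
    if (!is.null(event)) paste0("event: ", event),
    if (!is.null(id)) paste0("id: ", id),
    # each line of a multi-line payload gets its own "data:" field
    paste0("data: ", strsplit(as.character(data), "\n", fixed = TRUE)[[1]])
  )
  # a blank line terminates the event
  paste0(paste(lines, collapse = "\n"), "\n\n")
}

cat(sse_message("connected", id = "1"))
#> id: 1
#> data: connected
```

A browser’s EventSource splits the stream on those blank lines and delivers one event per block.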

TLS/SSL support

For HTTPS, pass a TLS configuration. nanonext bundles Mbed TLS, so there’s nothing extra to install:

cert <- write_cert(cn = "127.0.0.1")
server <- http_server(
  url = "https://127.0.0.1:0",
  handlers = handler("/", function(req) list(status = 200L, body = "Secure!")),
  tls = tls_config(server = cert$server)
)

Full response headers for HTTP client

ncurl() now accepts response = TRUE to return all response headers:

resp <- ncurl("https://postman-echo.com/get", response = TRUE)
resp$headers |> names()
#>  [1] "Date"                          "Content-Type"                 
#>  [3] "Content-Length"                "Connection"                   
#>  [5] "CF-RAY"                        "etag"                         
#>  [7] "vary"                          "Set-Cookie"                   
#>  [9] "x-envoy-upstream-service-time" "cf-cache-status"              
#> [11] "Server"

Previously you could only request specific headers by name. Now you can retrieve the complete set – useful for inspecting rate limits, caching directives, and other metadata from REST APIs.

Async HTTP with Shiny

If your Shiny app calls a REST API, a slow or unresponsive endpoint will block the R process and freeze the app for all users, not just the one who triggered the request. ncurl_aio() avoids this – it performs the HTTP call on a background thread and returns a promise, so the R process stays free to serve other sessions. It works anywhere that accepts a promise, including Shiny’s ExtendedTask:

library(shiny)
library(bslib)
library(nanonext)

ui <- page_fluid(
  p("The time is ", textOutput("current_time", inline = TRUE)),
  hr(),
  input_task_button("btn", "Fetch data"),
  verbatimTextOutput("result")
)

server <- function(input, output, session) {
  output$current_time <- renderText({
    invalidateLater(1000)
    format(Sys.time(), "%I:%M:%S %p")
  })

  task <- ExtendedTask$new(
    function() ncurl_aio("https://postman-echo.com/get", response = TRUE)
  ) |> bind_task_button("btn")

  observeEvent(input$btn, task$invoke())
  output$result <- renderPrint(task$result()$headers)
}

shinyApp(ui, server)

New documentation

The package documentation has been reorganized into focused, self-contained guides:

Guide	Topics
Quick Reference	At-a-glance API overview
Messaging	Cross-language exchange, async I/O, synchronization
Protocols	req/rep, pub/sub, surveyor/respondent
Configuration	TLS, options, serialization
Web Toolkit	HTTP client/server, WebSocket, streaming

Whether you need a quick API cheatsheet or a deep dive into WebSocket chat servers, the new vignettes are designed to get you up and running fast.

Bug fixes and improvements

A new race_aio() function returns the index of the first resolved async operation in a list – useful when waiting on multiple concurrent operations and you want to act on whichever completes first.

This release also fixes two critical issues – one affecting TLS operations in fresh sessions with newer system versions of Mbed TLS, another when custom serialization hooks threw errors. Error handling is now more graceful throughout, with closed streams returning error values instead of throwing. Under the hood, serialization, streaming, and async sends are all faster, and the bundled Mbed TLS is updated to 3.6.5 LTS. Building from source no longer requires xz.

Looking ahead

nanonext gives R a new building block for web infrastructure – one that complements httpuv. We see it as part of a broader investment in making R a first-class platform for real-time, connected applications. If you want to dig deeper, visit the package website or explore the source on GitHub.

Acknowledgements

A big thank you to everyone who contributed to this release:

@jeroenjanssens and @shikokuchuo.

yaml12: YAML 1.2 for R and Python

Tomasz Kalinowski — Wed, 07 Jan 2026 00:00:00 +0000

Today we’re announcing two new packages for parsing and emitting YAML 1.2: yaml12 for R and py-yaml12 for Python.

Both packages are implemented in Rust and built on the excellent saphyr crate. They share the same design goals: predictable YAML 1.2 typing, explicit control over tag interpretation via handlers, and clean round-tripping of unhandled tags.

Before we get into the details, a quick note on how this relates to the existing R yaml package. The R yaml package is now in r-lib, and we’ve taken over maintenance after years of stewardship by its original author, Jeremy Stephens, and later by Shawn Garbett.

If yaml already works for you, there’s no need to switch. yaml12 is an experiment providing consistent R and Python bindings to a new Rust library specifically for YAML 1.2, which, as we’ll see below, has some particular advantages.

Install

Install the R package from CRAN:

install.packages("yaml12")

Install the Python package from PyPI:

pip install py-yaml12

Quick start (R)

library(yaml12)

yaml <- "
title: A modern YAML parser and emitter written in Rust
properties: [fast, correct, safe, simple]
"

doc <- parse_yaml(yaml)
str(doc)
#> List of 2
#>  $ title     : chr "A modern YAML parser and emitter written in Rust"
#>  $ properties: chr [1:4] "fast" "correct" "safe" "simple"

Round-trip back to YAML:

obj <- list(
  seq = 1:2,
  map = list(key = "value"),
  tagged = structure("1 + 1", yaml_tag = "!expr")
)
write_yaml(obj)
#> ---
#> seq:
#>   - 1
#>   - 2
#> map:
#>   key: value
#> tagged: !expr 1 + 1
#> ...

identical(obj, parse_yaml(format_yaml(obj)))
#> [1] TRUE

Quick start (Python)

# Install from PyPI:
#   python -m pip install py-yaml12
from yaml12 import parse_yaml, format_yaml, Yaml

yaml_text = """
title: A modern YAML parser and emitter written in Rust
properties: [fast, correct, safe, simple]
"""

doc = parse_yaml(yaml_text)

assert doc == {
  "title": "A modern YAML parser and emitter written in Rust",
  "properties": ["fast", "correct", "safe", "simple"]
}

assert doc == parse_yaml(format_yaml(doc))

# Tagged values
tagged = parse_yaml("!expr 1 + 1")
assert tagged == Yaml(value="1 + 1", tag="!expr")

Why YAML 1.2?

YAML 1.2 tightened up a number of ambiguous implicit conversions. In particular, plain scalars like on/off/yes/no/y/n are strings in the 1.2 core schema, and YAML 1.2 removed sexagesimal (base-60) parsing, so values like 1:2 are not treated as numbers.

YAML 1.2 also removed !!timestamp, !!binary, and !!omap from the set of core types, which further reduces implicit coercions (for example, getting a date/time object when you expected a string). If you want to interpret those values, you can do so explicitly via tags and handlers.

That makes YAML 1.2 a better default for configuration files, front matter, and data interchange. You get fewer surprises and fewer “why did this become a boolean?” moments (or “why did this become a date?”).
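
A quick way to check this behavior yourself is to parse the scalars mentioned above one at a time. This is a small sketch assuming yaml12 is installed; per the description above, each of these plain scalars should come back as a string, while true remains a boolean in the 1.2 core schema.

```r
library(yaml12)

# Plain scalars that YAML 1.1 parsers commonly coerce to booleans or numbers
ambiguous <- c("on", "off", "yes", "no", "y", "n", "1:2")

# Parse each one as its own single-scalar document
parsed <- lapply(ambiguous, parse_yaml)
all_strings <- all(vapply(parsed, is.character, logical(1)))
all_strings

# For contrast, `true` is still a boolean
is.logical(parse_yaml("true"))
```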

Highlights

A consistent API in R and Python

The two packages intentionally share the same high-level functions:

parse_yaml() : Parse YAML from a string
read_yaml() : Read YAML from a file
format_yaml() : Format values as YAML (to a string)
write_yaml() : Write YAML to a file (or stdout)

Tags and handlers (opt-in meaning, safe defaults)

In YAML, tags are explicit annotations like !expr or !!timestamp that attach type and meaning to a value.

Tags are preserved by default:

In R, tags are kept in a yaml_tag attribute.
In Python, tags are kept by wrapping values in a Yaml() object.

Handlers let you opt into custom behavior for tags (including tags on mapping keys) while keeping parsing as a data-only operation by default.

If you used the R yaml package’s !expr tag to evaluate expressions, you can recreate that behavior by registering a handler. This is only recommended when parsing trusted YAML, since evaluating arbitrary code is a security risk. For untrusted input, the default behavior is safer because it keeps !expr as data and does not execute code.

R example:

# by default, tags are kept as data
dput(parse_yaml("!expr 1 + 1"))
#> structure("1 + 1", yaml_tag = "!expr")

# Add a handler to process tagged nodes (like the {yaml} package does)
handlers <- list("!expr" = \(x) eval(str2expression(x), globalenv()))
parse_yaml("!expr 1 + 1", handlers = handlers)
#> [1] 2

Python example:

from yaml12 import parse_yaml

handlers = {"!expr": eval}  # use with trusted input only
parse_yaml("!expr 1 + 1", handlers=handlers)
#> 2

Simplification and missing values (R)

In R, parse_yaml() can simplify homogeneous sequences to vectors. When it does, YAML null becomes the appropriate NA type:

parse_yaml("[1, 2, 3, null]")
#> [1]  1  2  3 NA

str(parse_yaml("[1, 2, 3, null]", simplify = FALSE))
#> List of 4
#>  $ : int 1
#>  $ : int 2
#>  $ : int 3
#>  $ : NULL

Non-string mapping keys

YAML allows mapping keys that aren’t plain strings (numbers, booleans, tagged scalars, even sequences and mappings). Both packages preserve these safely:

In R, you’ll get a regular named list plus a yaml_keys attribute when needed.
In Python, unhashable keys (like lists/dicts) are wrapped in Yaml so they can still be used as dict keys and round-trip correctly.

R example:

dput(parse_yaml("{a: b}: c"))
#> structure(list("c"), names = "", yaml_keys = list(list(a = "b")))

Python example:

from yaml12 import parse_yaml, Yaml

doc = parse_yaml("{a: b}: c")
assert doc == {Yaml({'a': 'b'}): 'c'}

Mapping order is preserved

YAML mappings are ordered. yaml12 preserves mapping/dictionary order when parsing and formatting, so the order you see in a YAML file (or emit) round-trips in both R and Python.

Document streams and front matter

Both packages support multi-document YAML streams with multi = TRUE. When multi = FALSE (the default), parsing stops after the first document, which is handy for extracting YAML front matter from text that continues with non-YAML content.

Example:

yaml <- "
---
title: Extracting YAML front matter
---
This is technically now the second document in a YAML stream
"
str(parse_yaml(yaml))
#> List of 1
#>  $ title: chr "Extracting YAML front matter"
str(parse_yaml(yaml, multi = TRUE))
#> List of 2
#>  $ :List of 1
#>   ..$ title: chr "Extracting YAML front matter"
#>  $ : chr "This is technically now the second document in a YAML stream"
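For intuition about where the first document ends, here is a small Python sketch that separates front matter from the body by looking for document markers at the start of a line. It is a line-based stand-in for illustration only, not how yaml12 parses a stream, and `split_front_matter` is a hypothetical helper name.

```python
def split_front_matter(text):
    """Return (front_matter, body); front_matter is None when no block is found."""
    lines = text.lstrip("\n").splitlines(keepends=True)
    # Front matter must open with a `---` marker on the first line.
    if not lines or lines[0].rstrip() != "---":
        return None, text
    for i, line in enumerate(lines[1:], start=1):
        # The block ends at the next document start (---) or document end (...) marker.
        if line.rstrip() in ("---", "..."):
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    return None, text


page = """
---
title: Extracting YAML front matter
---
This is the body of the page.
"""

front, body = split_front_matter(page)
print(repr(front))
print(repr(body))
```

A real parser is preferable whenever the front matter may contain block scalars or quoted strings that include a `---` line, which this sketch would misread as the end of the block.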

Performance and safety notes

yaml12 is implemented in Rust and written with performance and safety in mind. It avoids unnecessary allocations, copies, and extra traversals where possible. In Python, py-yaml12 (imported as yaml12) also releases the GIL for large parses and serializations.

In typical usage, the R package yaml12 is ~2× faster than the yaml package, and the Python package py-yaml12 is ≥50× faster than default PyYAML in the benchmarks (R benchmarks; Python benchmarks).

Tags are preserved by default, and interpreting them (including any kind of evaluation) is always an explicit opt-in via handlers. Plain scalars follow the YAML 1.2 core schema rules for predictable typing.

In Python, py-yaml12 ships prebuilt wheels for common platforms. If you do need to build from source, you’ll need a Rust toolchain. In R, yaml12 is available from CRAN (including binaries on common platforms).

Wrapping up

If you work with YAML as a data format for configuration, front matter, or data interchange, we hope yaml12 (R) and py-yaml12 (Python) help you parse and emit YAML 1.2 predictably. If you run into YAML that doesn’t behave as expected, we’d love to hear about it in the issue trackers: r-yaml12 and py-yaml12.

Learn more

R package docs: https://posit-dev.github.io/r-yaml12/
R package on CRAN: https://cran.r-project.org/package=yaml12
Python package docs: https://posit-dev.github.io/py-yaml12/
Python package on PyPI: https://pypi.org/project/py-yaml12/

Acknowledgements

Both packages build on the fantastic work in the YAML ecosystem, especially the saphyr Rust crate and the yaml-test-suite.

testthat 3.3.0

Hadley Wickham — Thu, 13 Nov 2025 00:00:00 +0000

We’re chuffed to announce the release of testthat 3.3.0. testthat is a testing framework for R that makes it easy to turn your existing informal tests into formal, automated tests that you can rerun quickly and easily.

You can install it from CRAN with:

install.packages("testthat")

This blog post highlights the most important changes in this release, including lifecycle changes that removed long-deprecated mocking functions, improvements to expectations and their error messages, and a variety of new features that make testing easier and more robust. You can see a full list of changes in the release notes.

library(testthat)

Claude Code experiences

Before we dive into the changes, I wanted to talk a little bit about some changes to my development process, as I used this release as an opportunity to learn Claude Code . This is the first package where I’ve really used AI to support the development of many features and I thought it might be useful to share my experience.

Overall it was a successful experiment. It helped me close over 100 issues in what felt like less time than usual. I don’t have any hard numbers, but my gut feeling is that it was maybe a 10-20% improvement to my development velocity. This is still significant, especially since I’m an experienced R programmer and my workflow has been pretty stable for the last few years. I mostly used Claude for smaller, well-defined tasks where I had a good sense of what was needed. I found it particularly useful for refactoring, where it was easy to say precisely what I wanted, but executing the changes required a bunch of fiddly edits across many files.

I also found it generally useful for getting over the “activation energy hump”: there were a few issues that had been stagnating for years because they felt like they were going to be hard to do and with relatively limited payoff. I let Claude Code loose on a few of these and found it super useful. It only produced code I was really happy with a couple of times, but every time it gave me something to react to (often with strong negative feelings!) and that got me started actually engaging with the problem.

If you’re interested in using Claude Code yourself, there are a couple of files you might find useful. My CLAUDE.md tells Claude how to execute a devtools-based workflow, along with a few pointers to resolve common issues. My settings.json allows Claude to run longer without human intervention, doing things that should mostly be safe. One note of caution: these settings do allow Claude to run R code, which does allow it to do practically anything. In my experience, Claude only used R to run tests or documentation.

I also experimented with using Claude Code to review PRs. It was just barely useful enough that I kept it turned on for my own PRs, but I didn’t bother trying to get it to work for contributed PRs. Most of the time it either gave a thumbs up or bad advice, but every now and then it would pick up a small error.

(I’ve also used Claude Code to proofread this blog post!)

Lifecycle changes

The biggest change in this release is that local_mock() and with_mock() are defunct. They were deprecated in 3.0.0 (2020-10-31) because it was becoming clear that the technique that made them work would be disallowed in a future version of R. This has now happened in R 4.5.0, so the functions have been removed. Removing local_mock() and with_mock() was a fairly disruptive change, affecting ~100 CRAN packages, but it had to be done, and I’ve been working on notifying package developers since January so everyone had plenty of time to update. Fortunately, the needed changes are generally small, since the newer local_mocked_bindings() and with_mocked_bindings() can solve most additional needs. (If you haven’t heard of mocking before, you can read the new vignette("mocking") to learn what it is and why you might want to use it.)

Other lifecycle changes:

testthat now requires R 4.1. This follows our supported version policy , which documents our commitment to support five versions of R (the current version and four previous versions). We’re excited to be able to finally take advantage of the base pipe and compact anonymous functions (i.e. \(x) x + 1)!
is_null()/matches(), deprecated in 2.0.0 (2017-12-19), and is_true()/is_false(), deprecated in 2.1.0 (2019-04-23), have been removed. These conflicted with other tidyverse functions so we pushed their deprecation through, even though we have generally left the old test_that() API untouched.
expect_snapshot(binary), soft deprecated in 3.0.3 (2021-06-16), is now fully deprecated. test_files(wrap), deprecated in 3.0.0 (2020-10-31), has now been removed.
There were a few other changes that broke existing packages. The most impactful change was to start checking the inputs to expect() which, despite the name, is actually an internal helper. That revealed a surprising number of packages were accidentally using expect() instead of expect_true() or expect_equal() . We don’t technically consider this a breaking change because it revealed off-label function usage: the function API hasn’t changed; you just now learn when you’re using it incorrectly.

If you’re interested in the process we use to manage the release of a package that breaks its reverse dependencies, you might like to read the issue where I track all the problems and prepare PRs to fix them.

Expectations and the interactive testing experience

A lot of work in this release was prompted by an overhaul of vignette("custom-expectations"), which describes how to create your own expectations that work just like testthat’s. This is a long time coming, and as I was working on it, I realized that I didn’t really know how to write new expectations, which had led to a lot of variation in the existing implementations. This kicked off a bunch of experimentation and iterating, leading to a swath of improvements:

All expectations have new failure messages: they now state what was expected, what was actually received, and, if possible, they clearly illustrate the difference.
Expectations now consistently return the value of the first argument, regardless of whether the expectation succeeds or fails (the only exception is expect_error() and friends which return the captured condition so that you can perform additional checks on the condition object). This is a relatively subtle change that won’t affect tests that already pass, but it does improve failures when you pipe together multiple expectations.
A new pass() function makes it clear how to signal when an expectation succeeds. All existing expectations were rewritten to use pass() and (the existing) fail() instead of expect() , which I think makes the flow of logic easier to understand.
Improved expect_success() and expect_failure() expectations now test that an expectation always returns exactly one success or failure (this ensures that the counts that you see in the reporters are correct).

This new framework helped us write six new expectations:

expect_all_equal() , expect_all_true() , and expect_all_false() check that every element of a vector has the same value, giving better error messages than expect_true(all(...)):

test_that("some test", {
  x <- c(0.408, 0.961, 0.883, 0.46, 0.537, 0.961, 0.851, 0.887, 0.023)
  expect_all_true(x < 0.95)
})
#> ── Failure: some test ────────────────────────────────────────────────
#> Expected every element of `x < 0.95` to equal TRUE.
#> Differences:
#> `actual`:   TRUE FALSE TRUE TRUE TRUE FALSE TRUE TRUE TRUE
#> `expected`: TRUE TRUE  TRUE TRUE TRUE TRUE  TRUE TRUE TRUE
#> Error:
#> ! Test failed with 1 failure and 0 successes.

expect_disjoint() , by @stibu81 , expects values to be absent:

test_that("", {
  expect_disjoint(c("a", "b", "c"), c("c", "d", "e"))
})
#> ── Failure:  ─────────────────────────────────────────────────────────
#> Expected `c("a", "b", "c")` to be disjoint from `c("c", "d", "e")`.
#> Actual: "a", "b", "c"
#> Expected: None of "c", "d", "e"
#> Invalid: "c"
#> Error:
#> ! Test failed with 1 failure and 0 successes.

expect_r6_class() expects an R6 object:

test_that("", {
  x <- 10
  expect_r6_class(x, "foo")

  x <- R6::R6Class("bar")$new()
  expect_r6_class(x, "foo")
})
#> ── Failure:  ─────────────────────────────────────────────────────────
#> Expected `x` to be an R6 object.
#> Actual OO type: none.
#> ── Failure:  ─────────────────────────────────────────────────────────
#> Expected `x` to inherit from "foo".
#> Actual class: "bar"/"R6".
#> Error:
#> ! Test failed with 2 failures and 0 successes.

expect_shape() , by @michaelchirico , expects a specific shape (i.e., nrow() , ncol() , or dim() ):

test_that("show off expect_shape() failure messages", {
  x <- matrix(1:9, nrow = 3)
  expect_shape(x, nrow = 4)
  expect_shape(x, dim = c(3, 3, 3))
  expect_shape(x, dim = c(3, 4))
})
#> ── Failure: show off expect_shape() failure messages ─────────────────
#> Expected `x` to have 4 rows.
#> Actual rows: 3.
#> ── Failure: show off expect_shape() failure messages ─────────────────
#> Expected `x` to have 3 dimensions.
#> Actual dimensions: 2.
#> ── Failure: show off expect_shape() failure messages ─────────────────
#> Expected `x` to have dim (3, 4).
#> Actual dim: (3, 3).
#> Error:
#> ! Test failed with 3 failures and 0 successes.

As you can see from the examples above, when you run a single test interactively (i.e. not as a part of a test suite) you now see exactly how many expectations succeeded and failed.

Other new features

testthat generally does a better job of handling nested tests, aka subtests, where you put a test_that() inside another test_that() , or more typically it() inside of describe() . Subtests will now generate more informative failure messages, free from duplication, with more informative skips if any subtests don’t contain any expectations.
The snapshot experience has been significantly improved, with all known bugs fixed and some new helpers added: snapshot_reject() rejects all modified snapshots by deleting the .new variants, and snapshot_download_gh() makes it easy to get snapshots off GitHub and into your local package. Additionally, expect_snapshot() and friends will now fail when creating a new snapshot on CI, as that’s usually a signal that you’ve forgotten to run the snapshot code locally before committing.
On CRAN, test_that() will automatically skip if a package is not installed, which means that you no longer need to check if suggested packages are installed in your tests.
vignette("mocking") explains mocking in detail, and new local_mocked_s3_method() , local_mocked_s4_method() , and local_mocked_r6_class() make it easier to mock S3 and S4 methods and R6 classes.
test_dir() , test_check() , and friends gain a shuffle argument that uses sample() to randomly reorder the top-level expressions in each test file. This random reordering surfaces dependencies between tests and code outside of any test, as well as dependencies between tests, helping you find and eliminate unintentional dependencies.

try_again() is now publicized, as it’s a useful tool for testing flaky code:

flaky_function <- function() {
  if (runif(1) < 0.1) 0 else 1
}

# 10% chance of failure:
test_that("my flaky test is ok", {
  skip_on_cran()
  expect_equal(flaky_function(), 1)
})

# 1% chance of failure:
test_that("my flaky test is ok", {
  skip_on_cran()
  try_again(1, expect_equal(flaky_function(), 1))
})

# 0.1% chance of failure:
test_that("my flaky test is ok", {
  skip_on_cran()
  try_again(2, expect_equal(flaky_function(), 1))
})

Note that it’s still good practice to skip such tests on CRAN.
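The percentages in those comments follow from simple arithmetic, assuming each attempt fails independently with probability 0.1: a test with n retries fails overall only if the first attempt and all n retries fail. A short Python sketch of that calculation (the function name is hypothetical):

```python
def failure_probability(p_fail, retries):
    """Chance that the first attempt and every retry all fail, assuming independence."""
    return p_fail ** (retries + 1)


for retries in (0, 1, 2):
    print(retries, f"{failure_probability(0.1, retries):.1%}")
# 0 10.0%
# 1 1.0%
# 2 0.1%
```

The independence assumption matters: if a failure has a persistent cause, such as a service that stays down, retrying does not reduce the failure rate this way.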

New skip_unless_r() skips tests on unsuitable versions of R. It has a convenient syntax so you can use, e.g., skip_unless_r(">= 4.1.0") to skip tests that require ...names() .
New SlowReporter makes it easier to find the slowest tests in your package. You can run it with devtools::test(reporter = "slow").
New vignette("challenging-functions") provides an index to other documentation organized by various challenges.

Acknowledgements

A big thank you to all the folks who helped make this release happen: @3styleJam , @afinez , @andybeet , @atheriel , @averissimo , @d-morrison , @DanChaltiel , @DanielHermosilla , @eitsupi , @EmilHvitfeldt , @emstruong , @gaborcsardi , @gael-millot , @hadley , @hoeflerb , @jamesfowkes , @jan-swissre , @jdblischak , @jennybc , @jeroenjanssens , @kevinushey , @krivit , @kubajal , @lawalter , @m-muecke , @maelle , @math-mcshane , @mcol , @metanoid , @MichaelChirico , @moodymudskipper , @njtierney , @nunotexbsd , @pabangan , @pachadotdev , @plietar , @schloerke , @schuemie , @sebkopf , @shikokuchuo , @snystrom , @stibu81 , @TimTaylor , and @tylermorganwall .

pkgdown 2.2.0

Hadley Wickham — Thu, 06 Nov 2025 00:00:00 +0000

We’re delighted to announce the release of pkgdown 2.2.0. pkgdown is designed to make it quick and easy to build a beautiful and accessible website for your package.

You can install it from CRAN with:

install.packages("pkgdown")

This version of pkgdown has one major change: a new pkgdown::build_llm_docs() function that automatically creates files that make it easier for LLMs to read your documentation. Concretely, this means two things:

You’ll get an llms.txt at the root directory of your site. llms.txt is an emerging standard that provides an easy way for an LLM to get an overview of your site. pkgdown creates an overview by combining your README, your function index, and your article index: this should give the LLM a broad overview of what your package does, along with links to find out more.
Every existing .html on your site gets a corresponding .md file. These are generally easier for LLMs to understand because they contain just the content of the site, without any extraneous styling.

If you don’t want to generate these files, just add the following to your _pkgdown.yaml:

llm-docs: false

This release also includes new translations for Dutch and Japanese, removal of the long-deprecated autolink_html() and preview_page(), and a handful of other bug fixes and minor improvements. You can read about them all in the release notes.

Acknowledgements

As always, a big thanks to everyone who helped make this release possible: @cderv , @chabld , @Danny-dK , @davidorme , @dmurdoch , @hadley , @hfrick , @IndrajeetPatil , @jayhesselberth , @jeroenjanssens , @jmgirard , @krlmlr , @lorenzwalthert , @maelle , @MichaelChirico , @pepijn-devries , @remlapmot , @rempsyc , @Rohit-Satyam , @royfrancis , @rparmm , @schloerke , @TimTaylor , and @usrbinr .

mirai 2.5.0

Charlie Gao — Fri, 05 Sep 2025 00:00:00 +0000

We’re excited to announce mirai 2.5.0, bringing production-grade async computing to R!

This milestone release delivers enhanced observability through OpenTelemetry, reproducible parallel RNG, and key user interface improvements. We’ve also packed in twice as many changes as usual - going all out in delivering a round of quality-of-life fixes to make your use of mirai even smoother!

You can install it from CRAN with:

install.packages("mirai")

Introduction to mirai

mirai (Japanese for ‘future’) provides a clean, modern approach to parallel computing in R. Built on current communication technologies, it delivers extreme performance through professional-grade scheduling and an event-driven architecture.

It continues to evolve as the foundation for asynchronous and parallel computing across the R ecosystem, powering everything from async Shiny applications to parallel map in purrr to hyperparameter tuning in tidymodels.

library(mirai)

# Set up persistent background processes
daemons(4)

# Async evaluation - non-blocking
m <- mirai({
  Sys.sleep(1)
  100 + 42
})
m
#> < mirai [] >

# Results are available when ready
m[]
#> [1] 142

# Shut down persistent background processes
daemons(0)

A unique design philosophy

Modern foundation: mirai builds on nanonext , the R binding to Nanomsg Next Generation, a high-performance messaging library designed for distributed systems. This means that it’s using the very latest technologies, and supports the most optimal connections out of the box: IPC (inter-process communications), TCP or secure TLS. It also extends base R’s serialization mechanism to support custom serialization of newer cross-language data formats such as safetensors, Arrow and Polars.

Extreme performance: as a consequence of its solid technological foundation, mirai has the proven capacity to scale to millions of concurrent tasks over thousands of connections. Moreover, it delivers up to 1,000x the efficiency and responsiveness of other alternatives. A key innovation is the implementation of event-driven promises that react with zero latency - this provides an extra edge for real-time applications such as live inference or Shiny apps.

Production first: mirai provides a clear mental model for parallel computation, with a clean separation between a user’s current environment and the one in which a mirai is evaluated. This explicitness and simplicity help avoid common pitfalls that can afflict parallel processing, such as capturing incorrect or extraneous variables. Transparency and robustness are key to mirai’s design, and are achieved by minimizing complexity and eliminating all hidden state, with no reliance on options or environment variables. Finally, its integration with OpenTelemetry provides production-grade observability.

Deploy everywhere: deployment of daemon processes is made through a consistent interface across local, remote (SSH), and HPC environments (Slurm, SGE, PBS, LSF). Compute profiles are daemons settings that are managed independently, such that you can be connected to all three resource types simultaneously. You then have the freedom to distribute workload to the most appropriate resource for any given task - especially important if tasks have differing requirements such as GPU compute.

OpenTelemetry integration

New in mirai 2.5.0: complete observability of mirai requests through OpenTelemetry traces. This is a core feature that completes the final pillar in mirai’s ‘production first’ design philosophy.

When tracing is enabled via the otel and otelsdk packages, you can monitor the entire lifecycle of your async computations, from creation through to evaluation, making it easier to debug and optimize performance in production environments. This is especially powerful when used in conjunction with other otel-enabled packages (such as an upcoming Shiny release), providing end-to-end observability across your entire application stack.

Reproducible parallel RNG

Introduced in mirai 2.4.1: reproducible parallel random number generation. Developed in consultation with our tidymodels colleagues and core members of the mlr team, this is a great example of the R community pulling together to solve a common problem. It addresses a long-standing challenge in parallel computing in R, important for reproducible science.

mirai has, since its early days, used L’Ecuyer-CMRG streams for statistically-sound parallel RNG. Streams essentially cut into the RNG’s period (a very long sequence of pseudo-random numbers) at intervals that are so far apart from each other that they do not in practice overlap. This ensures that statistical results obtained from parallel computations remain correct and valid.

Previously, we only offered the following option, matching the behaviour of base R’s parallel package:

Default behaviour daemons(seed = NULL): creates independent streams for each daemon. This ensures statistical validity but not numerical reproducibility between runs.

Now, we also offer the following option:

Reproducible mode daemons(seed = integer): creates a stream for each mirai() call rather than each daemon. This guarantees identical results across runs, regardless of the number of daemons used.

# Always provides identical results:

with(
  daemons(3, seed = 1234L),
  mirai_map(1:3, rnorm, .args = list(mean = 20, sd = 2))[]
)
#> [[1]]
#> [1] 19.86409
#> 
#> [[2]]
#> [1] 19.55834 22.30159
#> 
#> [[3]]
#> [1] 20.62193 23.06144 19.61896
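The reason a per-call stream gives reproducibility is that the random numbers a task sees no longer depend on which worker happens to pick it up. The Python sketch below illustrates only that idea, using the standard library. It is not mirai's implementation, and it does not use L’Ecuyer-CMRG streams: the seeding scheme (base seed combined with task index) is a hypothetical stand-in without the same statistical guarantees.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def draw(task):
    """Draw n normal variates from a generator that belongs to this task alone."""
    base_seed, index, n = task
    # Hypothetical scheme: one generator per task, keyed by the base seed and task index.
    rng = random.Random(base_seed * 1_000_003 + index)
    return [rng.gauss(20, 2) for _ in range(n)]


def run(base_seed, sizes, workers):
    """Run one task per entry in sizes on a pool with the given number of workers."""
    tasks = [(base_seed, index, n) for index, n in enumerate(sizes)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(draw, tasks))


one_worker = run(1234, [1, 2, 3], workers=1)
three_workers = run(1234, [1, 2, 3], workers=3)
print(one_worker == three_workers)  # True
```

Had each worker owned a single generator instead, the values returned for a task would depend on scheduling, which is the situation the default per-daemon mode describes.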

User interface improvements

Compute profile helper functions

with_daemons() and local_daemons() make working with compute profiles much more convenient by allowing the temporary switching of contexts. This means that developers can continue to write mirai code without worrying about the resources on which it is eventually run. End-users now have the ability to change the destination of any mirai computation dynamically using one of these scoped helpers.

# Work with specific compute profiles
with_daemons("gpu", {
  result <- mirai(gpu_intensive_task())
})

# Local version for use inside functions
async_gpu_intensive_task <- function() {
  local_daemons("gpu")
  mirai(gpu_intensive_task())
}

Re-designed daemons()

Creating new daemons is now more ergonomic, as it automatically resets existing ones. This provides for more convenient use in contexts such as notebooks, where cells may be run out of order. Manual daemons(0) calls are no longer required to reset daemons.

# Old approach
daemons(0)  # Had to reset first
daemons(4)

# New approach - automatic reset
daemons(4)  # Just works, resets if needed

New info() function

info() provides a more succinct alternative to status() for reporting key statistics. It is optimized and is now a supported developer interface for programmatic use.

info()
#> connections  cumulative    awaiting   executing   completed 
#>           4           4           8           4           2

Acknowledgements

We extend our gratitude to the R community for their continued feedback and contributions. Special thanks to all contributors who helped shape this release through feature requests, bug reports, and code contributions: @agilly , @D3SL , @DavZim , @dipterix , @eliocamp , @erydit , @karangattu , @louisaslett , @mikkmart , @sebffischer , @shikokuchuo , and @wlandau .

nanonext 1.7.0

Charlie Gao — Tue, 02 Sep 2025 00:00:00 +0000

Introducing nanonext: breaking down language barriers in data science

We’re excited to welcome nanonext to the r-lib family! nanonext is R’s binding to NNG (Nanomsg Next Generation), a high-performance C messaging library that implements scalability protocols for distributed systems. Because NNG has bindings and ports across multiple languages—including Python, Go, Rust, C++, and many others—nanonext enables seamless interoperability between R and other modern programming languages.

The latest version 1.7.0 brings enhanced reliability with improved HTTP client functionality, better handling of custom serialization methods, and more robust event-driven promises.

Get started by installing the package from CRAN now:

install.packages("nanonext")

The challenge: multi-language data science

If you’ve worked in data science, you’ve likely encountered this scenario: your workflow spans multiple programming languages. Perhaps you have Python models, R analysis scripts, Go services, or C++ performance libraries that all need to work together.

Traditionally, making these components communicate means:

Writing data to files and reading them back
Building REST APIs and making HTTP calls
Handling different serialization formats like JSON or protocol buffers
Dealing with the latency and complexity that comes with each approach

The solution: NNG’s scalability protocols

nanonext changes this by bringing NNG’s scalability protocols to R. NNG is a high-performance C library that implements standardized communication patterns like request/reply, publish/subscribe, and pipeline architectures. Any process using NNG can communicate directly with any other NNG process using the same protocol—regardless of the programming language.

This means that for simple atomic vector types, your R data doesn’t even need to be serialized and converted to a common format, enabling direct, real-time communication between processes. For more complex data types, as long as serialization and unserialization methods exist in both languages, then nanonext can be used to send and receive arbitrary binary data.

What can you do with nanonext?

nanonext opens up several powerful possibilities for R users:

🔗 Cross-language integration: Connect R directly with Python machine learning models, Go microservices, Rust compute engines, or C libraries without intermediate files or complex APIs.

⚡ Real-time data pipelines: Build data systems where different components process data as it flows, perfect for live dashboards or high-frequency analytics.

📡 Modern web integration: Create WebSocket clients, make asynchronous HTTP requests, or build real-time APIs that integrate with web services.

🚀 Asynchronous programming: Write non-blocking R code that can handle multiple operations simultaneously, improving performance for concurrent tasks.

Python ↔ R interoperability example

Here’s a concrete example of R and Python working together through NNG’s protocols. Both processes use their respective NNG bindings to establish a direct communication channel using NNG’s ‘pair’ protocol—a simple one-to-one bidirectional communication pattern.

Python side (using pynng):

import numpy as np
import pynng

# Create socket to communicate with R
socket = pynng.Pair0(listen="ipc:///tmp/nanonext.socket")

# Wait for data from R
raw = socket.recv()
array = np.frombuffer(raw)
print(array)  # [1.1 2.2 3.3 4.4 5.5]

# Process data (could be ML model prediction, transformation, etc.)
processed = array * 2  # Simple example: multiply by 2

# Send back to R as bytes
socket.send(processed.tobytes())
socket.close()

R side (using nanonext):

library(nanonext)

# Connect to Python process
sock <- socket("pair", dial = "ipc:///tmp/nanonext.socket")

# Send numeric data directly to Python (no serialization needed!)
sock |> send(c(1.1, 2.2, 3.3, 4.4, 5.5), mode = "raw")

# Receive processed results back from Python
processed_data <- sock |> recv(mode = "double")
print(processed_data)  # [1] 2.2 4.4 6.6 8.8 11.0

# Continue with R-specific analysis
summary(processed_data)
plot(processed_data)

close(sock)

What just happened? Your R session sent numerical data directly to Python’s memory space, Python processed it (possibly performing inference on a complex ML model), and sent results back—all happening in memory without touching the disk. This is orders of magnitude faster than traditional file-based approaches.
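To see why no serialization step is needed, it helps to look at the bytes themselves. The sketch below uses only Python's standard library and assumes, as the np.frombuffer call in the example implies, that the payload is simply the vector's 8-byte IEEE 754 doubles laid end to end in the machine's native byte order. The helper names are hypothetical.

```python
import struct


def doubles_to_bytes(values):
    """Pack floats as consecutive native-endian 8-byte doubles."""
    return struct.pack(f"={len(values)}d", *values)


def bytes_to_doubles(payload):
    """Reinterpret a byte buffer as doubles; its length must be a multiple of 8."""
    count, remainder = divmod(len(payload), 8)
    if remainder:
        raise ValueError("payload length is not a multiple of 8 bytes")
    return list(struct.unpack(f"={count}d", payload))


payload = doubles_to_bytes([1.1, 2.2, 3.3, 4.4, 5.5])
print(len(payload))  # 40
print(bytes_to_doubles(payload))
```

Because both processes in the example share one machine over IPC, byte order is not a concern there. Across machines with different architectures, the two sides would need to agree on it explicitly.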

Why this matters

Speed: Direct memory communication is dramatically faster than file I/O or HTTP requests. Your R session doesn’t wait for disk writes or network round-trips.

Simplicity: No need to design REST APIs, manage file formats, or handle serialization. Send R vectors directly and receive results back.

Flexibility: nanonext supports different communication patterns (one-to-one, one-to-many, publish/subscribe) and transport methods (IPC, TCP, WebSocket).

Reliability: Built on NNG, a mature library that powers mission-critical systems in technology, finance and many other industries.

Asynchronous: Your R session stays responsive. Send a request for a long-running computation and continue working while it processes in the background.

The future is multi-language

nanonext facilitates a shift toward more flexible, performance-oriented data science workflows. Instead of being locked into a single language or accepting the overhead of traditional integration methods, you can now build systems where each component uses the best tool for the job.

We’re excited to see how the R community uses these capabilities.

Ready to try it? Visit the package website for comprehensive documentation, or explore the code at the GitHub repository.

Air 0.7.0

Davis Vaughan — Wed, 11 Jun 2025 00:00:00 +0000

We’re very excited to announce Air 0.7.0, a new release of our extremely fast R formatter. This post will act as a roundup of releases 0.5.0 through 0.7.0, including: even better Positron support, a new feature called autobracing, and an official GitHub Action! If you haven’t heard of Air, read our announcement blog post first to get up to speed. To install Air, read our editors guide.

Positron

The Air extension is now included in Positron by default, and will automatically keep itself up to date. We’ve been working hard to ensure that Air leaves a positive first impression, and we think that having Positron come batteries included with Air really helps with that! Positron now also ships with Ruff, the extremely fast Python formatter and linter, ensuring that you have a great editing experience out of the box, no matter which language you prefer.

We’ve also streamlined the process of adding Air to a new or existing project. With the development version of usethis, you can now run usethis::use_air() to automatically configure recommended Air settings. In particular, this will:

Create an empty air.toml.

Create .vscode/settings.json filled with the following settings. This enables Format on Save within your workspace.

{
    "[r]": {
        "editor.formatOnSave": true,
        "editor.defaultFormatter": "Posit.air-vscode"
    }
}

Create .vscode/extensions.json filled with the following settings. This automatically prompts contributors that don’t have the Air extension to install it when they open your workspace, ensuring that everyone is using the same formatter!

{
    "recommendations": [
        "Posit.air-vscode"
    ]
}

Update your .Rbuildignore to exclude Air related configuration, if you’re working on an R package.

Once you’ve used usethis to configure Air, you can now immediately reformat your entire workspace by running Air: Format Workspace Folder from the Command Palette (accessible via Cmd + Shift + P on Mac, or Ctrl + Shift + P on Windows/Linux). I’ve found that this is invaluable for adopting Air in an existing project!

To summarize, we’ve reduced our advice on adding Air to an existing project down to:

Open Positron
Run usethis::use_air()
Run Air: Format Workspace Folder
Commit, push, and then enjoy using Format on Save forevermore 😄

More editors!

Positron isn’t the only editor that’s received some love! We now have official documentation for using Air in the following editors:

We’re very proud of the fact that Air can be used within any editor, not just RStudio and Positron! This documentation was a community effort - thanks in particular to @taplasz, @PMassicotte, @m-muecke, @TymekDev, and @wurli.

Autobracing

Autobracing is the process of adding braces (i.e. { }) to if statements, loops, and function definitions to create more consistent, readable, and portable code. It looks like this:

for (i in seq_along(x)) x[[i]] <- x[[i]] + 1L

# Becomes:
for (i in seq_along(x)) {
  x[[i]] <- x[[i]] + 1L
}

function(x, y)
  call_that_spans_lines(
    x,
    y,
    fixed_option = FALSE
  )

# Becomes:
function(x, y) {
  call_that_spans_lines(
    x,
    y,
    fixed_option = FALSE
  )
}

It’s particularly important to autobrace multiline if statements for portability, which we roughly define as the ability to copy and paste that if statement into any context and have it still parse correctly. Consider the following if statement:

do_something <- function(this = TRUE) {
  if (this)
    do_this()
  else 
    do_that()
}

As written, this is correct R code, but if you were to pull out the if statement and place it in a file at “top level” and try to run it, you’d see a parse error:

if (this)
  do_this()
else 
  do_that()
#> Error: unexpected 'else'

In practice, this typically bites you when you’re debugging and you send a chunk of lines to the console.

Air autobraces this if statement to the following, which has no issues with portability:

do_something <- function(this = TRUE) {
  if (this) {
    do_this()
  } else {
    do_that()
  }
}

Give side effects some Air

We believe that code which creates side effects, by modifying state or affecting control flow, is important enough to live on its own line. For example, the following stop() call is an example of a side effect, so it moves to its own line and is autobraced:

if (anyNA(x)) stop("`x` can't contain missing values.")

# Becomes:
if (anyNA(x)) {
  stop("`x` can't contain missing values.")
}

You might be thinking, “But I like my single line if statements!” We do too! Air still allows single line if statements if they look to be used for their value rather than for their side effect. These single line if statements are still allowed:

x <- if (condition) this else that

x <- x %||% if (condition) this else that

list(a = if (condition) this else that)

Similarly, single line function definitions are also still allowed if they don’t already have braces and don’t exceed the line length:

add_one <- function(x) x + 1

bools <- map_lgl(xs, function(x) is.logical(x) && length(x) == 1L && !is.na(x))

For the full set of rules, check out our documentation on autobracing.

Empty braces

You may have noticed the following forced expansion of empty {} in previous versions of Air:

dummy <- function() {}

# Previously became:
dummy <- function() {
}

tryCatch(fn, error = function(e) {})

# Previously became:
tryCatch(fn, error = function(e) {
})

my_fn(expr = {}, option = TRUE)

# Previously became:
my_fn(
  expr = {
  }, 
  option = TRUE
)

As of 0.7.0, empty braces {} are now never expanded, which retains the original form of each of these examples.

`skip` configuration

In our release post, we detailed how to disable formatting using a # fmt: skip comment for a single expression, or a # fmt: skip file comment for an entire file. Skip comments are useful for disabling formatting for one-off function calls, but sometimes you may find yourself repeatedly using functions from a domain-specific language (DSL) that doesn’t follow conventional formatting rules. For example, the igraph package contains a DSL for constructing a graph from a literal representation:

igraph::graph_from_literal(A +-+ B +---+ C ++ D + E)

By default, Air would format this as:

igraph::graph_from_literal(A + -+B + ---+C + +D + E)

If you use graph_from_literal() often, it would be annoying to add # fmt: skip comments at every call site. Instead, air.toml now supports a skip field that allows you to specify function names that you never want formatting for. Specifying this would retain the original formatting of the graph_from_literal() call, even without a # fmt: skip comment:

skip = ["graph_from_literal"]

In the short term, you may also want to use this for tibble::tribble() calls, i.e. skip = ["tribble"]. In the long term, we’re hoping to provide more sophisticated tooling for formatting using a specified alignment.

GitHub Action

Air now has an official GitHub Action, setup-air. This action really only has one job - to get Air installed on your GitHub runner and put it on the PATH. The basic usage is:

- name: Install Air
  uses: posit-dev/setup-air@v1

If you need to pin a version:

- name: Install Air 0.4.4
  uses: posit-dev/setup-air@v1
  with:
    version: "0.4.4"

From there, you can call Air’s CLI in downstream steps. A minimal workflow that errors if any files require formatting might look like:

- name: Install Air
  uses: posit-dev/setup-air@v1

- name: Check formatting
  run: air format . --check

Rather than creating the workflow file yourself, we instead recommend using usethis to pull in our example workflow:

usethis::use_github_action(url = "https://github.com/posit-dev/setup-air/blob/main/examples/format-suggest.yaml")

This is a special workflow that runs on pull requests. It calls air format and then uses reviewdog/action-suggester to push any formatting diffs as GitHub Suggestion comments on your pull request.

You can accept all suggestions in a single batch, which will then rerun the format check, along with any other GitHub workflows (like an R package check), so you can feel confident that accepting the changes hasn’t broken anything.

We like this workflow because it provides an easy way for external contributors who aren’t using Air to still abide by your formatting rules. The external contributor can even accept the suggestions themselves, so by the time you look at their pull request it’s already good to go from a formatting perspective ✅!

Acknowledgements

A big thanks to the 49 users who helped make this release possible by finding bugs, discussing issues, contributing documentation, and writing code: @adisarid, @aronatkins, @ateucher, @avhz, @aymennasri, @christophe-gouel, @dkStevensNZed, @eitsupi, @ELICHOS, @fh-mthomson, @fzenoni, @gaborcsardi, @grasshoppermouse, @hadley, @idavydov, @j-dobner, @jacpete, @jeffkeller-einc, @jhk0530, @joakimlinde, @JosephBARBIERDARNAL, @JosiahParry, @kkanden, @krlmlr, @Kupac, @kv9898, @lcolladotor, @lulunac27a, @m-muecke, @maelle, @matanhakim, @njtierney, @novica, @ntluong95, @philibe, @PMassicotte, @RobinKohrs, @salim-b, @sawelch-NIVA, @schochastics, @Sebastian-T-T, @stevenpav-helm, @t-kalinowski, @taplasz, @tbadams45cdm, @wurli, @xx02al, @Yunuuuu, and @yutannihilation.

Fonts in R

Thomas Lin Pedersen — Mon, 12 May 2025 00:00:00 +0000

(An updated version of this blog post will be available at the systemfonts webpage)

The purpose of this document is to give you a thorough overview of fonts in R. However, for this to be possible, you’ll first need a basic understanding of fonts in general. If you already have a thorough understanding of digital typography you can skip to the next section.

Digital typography

Many books could be, and have been, written about the subject of typography. This blog post is not meant to be an exhaustive deep dive into all areas of this vast subject. Rather, it is meant to give you just enough understanding of core concepts and terminology to appreciate how it all plays into using fonts in R.

Typeface or font?

There is a good chance that you, like 99% of the world, use “font” as the term describing “the look” of the letters you type. You may, perhaps, have heard the term “typeface” as well and thought it synonymous. This is in fact slightly wrong, and a great deal of typography snobbery has been dealt out on that account (much like the distinction between packages and libraries in R). It is a rather inconsequential mix-up for the most part, especially because 99% of the population wouldn’t bat an eye if you use them interchangeably. However, the distinction between the two serves as a good starting point to talk about other terms in digital typography as well as the nature of font files, so let’s dive in.

When most people use the word “font” or “font family”, what they are actually describing is a typeface. A typeface is a style of lettering that forms a cohesive whole. As an example, consider the well-known “Helvetica” typeface. This name embraces many different weights (bold, normal, light) as well as slanted (italic) and upright styles. However, each of these variations is as much Helvetica as the others - they are all part of the same typeface.

A font is a subset of a typeface, describing a particular variation of the typeface, i.e. the combination of weight, width, and slant that comes together to describe the specific subset of a typeface that is used. We typically give a specific combination of these features a name, like “bold” or “medium” or “italic”, which we call the font style¹. In other words, a font is a particular style within a typeface.

Different fonts from the Avenir Next typeface

In the rest of this document we will use the terms typeface and font with the meaning described above.

Font files

Next, we need to talk about how typefaces are represented for use by computers. Font files record information on how to draw the individual glyphs (characters), but also instructions about how to draw sequences of glyphs like distance adjustments (kerning) and substitution rules (ligatures). Font files typically encode a single font but can encode a full typeface:

typefaces <- systemfonts::system_fonts()[, c("path", "index", "family", "style")]

# Full typeface in one file
typefaces[typefaces$family == "Helvetica", ]
#> # A tibble: 6 × 4
#>   path                                index family    style        
#>   <chr>                               <int> <chr>     <chr>        
#> 1 /System/Library/Fonts/Helvetica.ttc     2 Helvetica Oblique      
#> 2 /System/Library/Fonts/Helvetica.ttc     4 Helvetica Light        
#> 3 /System/Library/Fonts/Helvetica.ttc     5 Helvetica Light Oblique
#> 4 /System/Library/Fonts/Helvetica.ttc     1 Helvetica Bold         
#> 5 /System/Library/Fonts/Helvetica.ttc     3 Helvetica Bold Oblique 
#> 6 /System/Library/Fonts/Helvetica.ttc     0 Helvetica Regular

# One font per font file
typefaces[typefaces$family == "Arial", ]
#> # A tibble: 4 × 4
#>   path                                                     index family style      
#>   <chr>                                                    <int> <chr>  <chr>      
#> 1 /System/Library/Fonts/Supplemental/Arial.ttf                 0 Arial  Regular    
#> 2 /System/Library/Fonts/Supplemental/Arial Bold.ttf            0 Arial  Bold       
#> 3 /System/Library/Fonts/Supplemental/Arial Bold Italic.ttf     0 Arial  Bold Italic
#> 4 /System/Library/Fonts/Supplemental/Arial Italic.ttf          0 Arial  Italic

Here, each row is a font, with family giving the name of the typeface, and style the font style.

It took a considerable number of tries before the world managed to nail the digital representation of fonts, leading to a proliferation of file types. As an R user, there are three formats that are particularly important:

TrueType (ttf/ttc). TrueType is the baseline format that all modern formats stand on top of. It was developed by Apple in the ’80s and became popular due to its great balance between size and quality. Fonts can be encoded either as scalable paths or as bitmaps of various sizes, the former generally being preferred as it allows for seamless scaling and small file size at the same time.
OpenType (otf/otc). OpenType was created by Microsoft and Adobe to improve upon TrueType. While TrueType was a great success, the number of glyphs it could contain was limited and so was its support for selecting different features during shaping. OpenType resolved these issues, so if you want access to advanced typography features you’ll need a font in OpenType format.
Web Open Font Format (woff/woff2). TrueType and OpenType tend to create large files. Since a large percentage of the text consumed today is delivered over the internet this creates a problem. WOFF resolves this problem by acting as a compression wrapper around TrueType/OpenType to reduce file sizes while also limiting the number of advanced features provided to those relevant to web fonts. The woff2 format is basically identical to woff except it uses the more efficient Brotli compression algorithm. WOFF was designed specifically to be delivered over the internet and support is still a bit limited outside of browsers.

While we have mainly talked about font files as containers for the shape of glyphs, they also carry a lot of other information needed for rendering text in a way pleasant for reading. Font level information records a lot of stylistic information about the typeface/font, statistics on the number of glyphs and how many different mappings between character encodings and glyphs it contains, and overall sizing information such as the maximum descent of the font, the position of an underline relative to the baseline etc. systemfonts provides a convenient way to access this data from R:

dplyr::glimpse(systemfonts::font_info(family = "Helvetica"))
#> Rows: 1
#> Columns: 24
#> $ path                "/System/Library/Fonts/Helvetica.ttc"
#> $ index               0
#> $ family              "Helvetica"
#> $ style               "Regular"
#> $ italic              FALSE
#> $ bold                FALSE
#> $ monospace           FALSE
#> $ weight              normal
#> $ width               normal
#> $ kerning             FALSE
#> $ color               FALSE
#> $ scalable            TRUE
#> $ vertical            FALSE
#> $ n_glyphs            2252
#> $ n_sizes             0
#> $ n_charmaps          10
#> $ bbox                <-11.406250, 17.343750, -5.765625, 13.453125>
#> $ max_ascend          9.234375
#> $ max_descend         -2.765625
#> $ max_advance_width   18
#> $ max_advance_height  12
#> $ lineheight          12
#> $ underline_pos       -1.203125
#> $ underline_size      0.59375

Further, for each glyph there is a range of information in addition to its shape:

systemfonts::glyph_info("j", family = "Helvetica", size = 30)
#> # A tibble: 1 × 9
#>   glyph index width height x_bearing y_bearing x_advance y_advance bbox     
#>   <chr> <int> <dbl>  <dbl>     <dbl>     <dbl>     <dbl>     <dbl> <list>   
#> 1 j        77     6     27        -1        21         7         0 <dbl [4]>

These terms are more easily understood with a diagram:

The x_advance in particular is important when rendering text because it tells you how far to move to the right before rendering the next glyph (ignoring for a bit the concept of kerning).
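As a rough, language-agnostic illustration of how the advance is used, the sketch below places glyphs along a line by accumulating each glyph's x_advance. It is written in Python rather than R purely for illustration; the function name `layout_glyphs` is hypothetical, and apart from the advance of 7 reported for "j" above, the numbers are made up.

```python
def layout_glyphs(advances, origin=0.0):
    """Return the x position of each glyph's origin, ignoring kerning."""
    positions = []
    pen_x = origin
    for advance in advances:
        positions.append(pen_x)
        pen_x += advance  # move the pen right by this glyph's x_advance
    return positions


# Hypothetical advances for three glyphs; only the 7.0 mirrors the "j" above.
advances = [7.0, 17.0, 7.0]
print(layout_glyphs(advances))
```

Each glyph starts where the previous one's advance ended, which is why a wrong advance shifts every glyph that follows it.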

Text shaping

The next important concept to understand is text shaping, which, in the simplest of terms, is to convert a succession of characters into a sequence of glyphs along with their locations. Important here is the distinction between characters, the things you think of as letters, and glyphs, which is what the font will draw. For example, think of the character “f”, which is often tricky to draw because the “hook” of the f can interfere with other characters. To solve this problem, many typefaces include ligatures, like “ﬁ”, which are used for specific pairs of characters. Ligatures are extremely important for languages like Arabic.

A few of the challenges of text shaping include kerning, bidirectional text, and font substitution. Kerning is the adjustment of distance between specific pairs of characters. For example, you can put “VM” a little closer together but “OO” needs to be a little further apart. Kerning is an integral part of all modern text rendering and you will almost only notice it when it is absent (or worse, wrongly applied).
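To give a feel for what pair kerning does mechanically, here is a small sketch in Python (again only as an illustration, not how textshaping is implemented). The advance and kerning values are invented, and `layout_with_kerning` is a hypothetical helper: it simply nudges the pen by a per-pair adjustment before placing the second glyph of each pair.

```python
def layout_with_kerning(text, advances, kerning):
    """Return the x position of each glyph, applying pair kerning."""
    positions = []
    pen_x = 0.0
    previous = None
    for char in text:
        if previous is not None:
            # Kerning shifts the pen before the second glyph of a pair;
            # pairs without an entry get no adjustment.
            pen_x += kerning.get((previous, char), 0.0)
        positions.append(pen_x)
        pen_x += advances[char]
        previous = char
    return positions


# Invented metrics: "VM" is pulled together, "OO" is pushed apart.
advances = {"V": 20.0, "M": 25.0, "O": 23.0}
kerning = {("V", "M"): -1.5, ("O", "O"): 0.5}

print(layout_with_kerning("VM", advances, kerning))
print(layout_with_kerning("OO", advances, kerning))
```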

Not every language writes text in the same direction, but regardless of your native script, you are likely to use Arabic numerals which are always written left-to-right. This gives rise to the challenge of bidirectional (or bidi) text, which mixes text flowing in different directions. This imposes a whole new range of challenges!

Finally, you might request a character that a font doesn’t contain. One way to deal with this is to render a glyph representing a missing glyph, usually an empty box or a question mark. But it’s typically more useful to use the correct glyph from a different font. This is called font fallback and happens all the time for emojis, but can also happen when you suddenly change script without bothering to pick a new font. Font fallback is an imprecise science, typically relying on an operating system font that has a very large number of characters, but might look very different from your existing font.
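The idea behind font fallback can be sketched in a few lines of Python. This is a deliberately simplified model, not how systemfonts or any operating system actually does it: each font is reduced to a name plus the set of characters it covers, both of which are invented here, and `pick_font` and `split_runs` are hypothetical helpers.

```python
def pick_font(char, fonts):
    """Return the first font whose coverage includes `char`, or None."""
    for name, coverage in fonts:
        if char in coverage:
            return name
    return None  # no font has it: a "missing glyph" box would be drawn


def split_runs(text, fonts):
    """Group consecutive characters that resolve to the same font."""
    runs = []
    for char in text:
        font = pick_font(char, fonts)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + char)
        else:
            runs.append((font, char))
    return runs


# Invented coverage: the requested font first, then a fallback font.
fonts = [
    ("Futura", set("FUTURA ")),
    ("Emoji Fallback", {"🎉"}),
]

print(split_runs("FUTURA 🎉", fonts))
```

The order of the list matters: the requested font is tried first, so the fallback is only consulted for characters it cannot draw.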

Once you have determined the order and location of glyphs, you are still not done. Text often needs to be wrapped to fit into a specific width, it may need a specific justification, perhaps indentation or tracking must be applied, etc. Thankfully, all of this is generally a matter of (often gnarly) math that you just have to get right. That is, all except text wrapping, which should happen at the right boundaries, and may need to break up a word and insert a hyphen etc.

Like I said, the pit of despair is bottomless…

Font handling in R

You hopefully arrive at this section with an appreciation of the horrors that go into rendering text. If not, maybe this blog post will convince you.

Are you still here? Good.

Now that you understand the basics of what goes into handling fonts and text, we can discuss the details of fonts in R specifically.

Fonts and text from a user perspective

The user’s perception of working with fonts in R is largely shaped by plots. This means using either base or grid graphics or one of the packages that have been built on top of them, like ggplot2. While the choice of tool will affect where you specify the font to use, they generally agree on how to specify it.

| Graphic system | How arguments are passed | Typeface | Font | Size |
| --- | --- | --- | --- | --- |
| Base | Passed to `par()` to set globally or directly to the call that renders text (e.g. `text()`) | `family` | `font` | `cra` (pixels) or `cin` (inches) multiplied by `cex` |
| Grid | Passed to the `gp` argument of relevant grobs using the `gpar()` constructor | `fontfamily` | `fontface` | `fontsize` (points) multiplied by `cex` |
| ggplot2 | Set in `element_text()` to alter theme fonts or directly in the geom call to alter geom fonts | `family` | `face` (in `element_text()`) or `fontface` (in geoms) | `size` (points when used in `element_text()`, depends on the value of `size.unit` argument when used in geom) |

From the table it is clear that in R fontfamily/family is used to describe the typeface and font/fontface/face is used to select a font from the typeface. Size settings are just a plain mess.

The major limitation in fontface (and friends) is that it takes a number, not a string, and you can only select from four options: 1: plain, 2: bold, 3: italic, and 4: bold-italic. This means, for example, that there’s no way to select Futura Condensed Extra Bold. Another limitation is that it’s not possible to specify any font variations such as using tabular numbers or stylistic ligatures.

Fonts and text from a graphics device perspective

In R, a graphics device is the part responsible for doing the rendering you request and putting it on your screen or in a file. When you call png() or ragg::agg_png() you open up a graphics device that will receive all the plotting instructions from R. Both graphics devices will ultimately produce the same file type (PNG), but how they choose to handle and respond to the plotting instructions may differ (greatly). Nowhere is this difference more apparent than when it comes to text rendering.

After a user has made a call that renders some text, it is funneled through the graphic system (base or grid), handed off to the graphics engine, which ultimately asks the graphics device to render the text. The graphics device is presented with much the same information that the user provided. The text() method of the device is given an array of characters, the typeface, the size in points, and an integer denoting if the style is regular, bold, italic, or bold-italic.

This means that it is up to the graphics device to find the appropriate font file (using the provided typeface and font style) and shape the text with all that that entails. This is a lot of work, which is why text is handled so inconsistently between graphics devices. Issues can range from not being able to find fonts installed on the computer, to not providing font fallback mechanisms, or even failing to handle right-to-left text. It may also be that certain font file formats are not well supported so that e.g. color emojis are not rendered correctly.

There have been a number of efforts to resolve these problems over the years:

extrafont: Developed by Winston Chang, extrafont sought to mainly improve the situation for the pdf() device, which generally only had access to the PostScript fonts that come with R. The package allows the pdf() device to get access to TrueType fonts installed on the computer, as well as providing a means for embedding the font into the PDF so that it can be opened on systems where the font is not installed. (It also provides these capabilities to the Windows png() device.)
sysfonts and showtext. These packages are developed by Yixuan Qiu and provide support for system fonts to all graphics devices, by hijacking the text() method of the graphics device to treat text as polygons or raster images. This guarantees your plots will look the same on every device, but it doesn’t do advanced text shaping, so there’s no support for ligatures or font substitution. Additionally, it produces large files with inaccessible text when used to produce pdf and svg outputs.
systemfonts and textshaping. These packages are developed by me to provide a soup-to-nuts solution to text rendering for graphics devices. systemfonts provides access to fonts installed on the system along with font fallback mechanisms, registration of non-system fonts, reading of font files etc. textshaping builds on top of systemfonts and provides a fully modern engine for shaping text. The functionality is exposed both at the R level and at the C level, so that graphics devices can directly access font lookup and shaping.

We will focus on systemfonts, because it’s designed to give R a modern text rendering stack. That’s unfortunately impossible without coordination with the graphics device, which means that to use all these features you need a supported graphics device. There are currently two options:

The ragg package provides graphics devices for rendering raster graphics in a variety of formats (PNG, JPEG, TIFF) and uses systemfonts and textshaping extensively.
The svglite package provides a graphics device for rendering vector graphics to SVG using systemfonts and textshaping for text.

You might notice there’s currently a big hole in this workflow: PDFs. This is something we plan to work on in the future.

A systemfonts based workflow

With all that said, how do you actually use systemfonts to use custom fonts in your plots? First, you’ll need to use ragg or svglite.

Using ragg

While there is no way to unilaterally make ragg::agg_png() the default everywhere, it’s possible to get close:

Positron: recent versions automatically use ragg for the plot pane if it’s installed.
RStudio IDE: set “AGG” as the backend under Global Options > General > Graphics.
ggplot2::ggsave(): ragg will be automatically used for raster output if installed.

R Markdown and Quarto: you need to set the dev option to "ragg_png". You can either do this with code:

#| include: false
knitr::opts_chunk$set(dev = "ragg_png")

Or in Quarto, you can set it in the yaml metadata:

---
title: "My Document"
format: html
knitr:
  opts_chunk:
    dev: "ragg_png"
---

If you want to use a font installed on your computer, you’re done!

grid::grid.text(
  "FUTURA 🎉",
  gp = grid::gpar(fontfamily = "Futura", fontface = 3, fontsize = 30)
)

Or, if using ggplot2:

library(ggplot2)

# `penguins` ships with the datasets package in R 4.5.0 and later
ggplot(na.omit(penguins)) +
  geom_point(aes(x = bill_len, y = body_mass, colour = species)) +
  labs(x = "Bill Length", y = "Body Mass", colour = "Species") +
  theme_minimal(base_family = "Futura")

If the results don’t look as you expect, you can use various systemfonts helpers to diagnose the problem:

systemfonts::match_fonts("Futura", weight = "bold")
#> # A tibble: 1 × 3
#>   path                                          index features  
#>                                                 
#> 1 /System/Library/Fonts/Supplemental/Futura.ttc     2 
systemfonts::font_fallback("🎉", family = "Futura", weight = "bold")
#>                                          path index
#> 1 /System/Library/Fonts/Apple Color Emoji.ttc     0

If you want to see all the fonts that are available for use, you can use systemfonts::system_fonts():

systemfonts::system_fonts()

#> # A tibble: 570 × 9
#>    path                                         index name  family style weight width italic monospace
#>    <chr>                                        <int> <chr> <chr>  <chr> <ord>  <ord> <lgl>  <lgl>    
#>  1 /System/Library/Fonts/Supplemental/Rockwell…     2 Rock… Rockw… Bold  bold   norm… FALSE  FALSE    
#>  2 /System/Library/Fonts/Noteworthy.ttc             0 Note… Notew… Light normal norm… FALSE  FALSE    
#>  3 /System/Library/Fonts/Supplemental/Devanaga…     1 Deva… Devan… Bold  bold   norm… FALSE  FALSE    
#>  4 /System/Library/Fonts/Supplemental/Kannada …     0 Kann… Kanna… Regu… normal norm… FALSE  FALSE    
#>  5 /System/Library/Fonts/Supplemental/Verdana …     0 Verd… Verda… Bold  bold   norm… FALSE  FALSE    
#>  6 /System/Library/Fonts/ArialHB.ttc                8 Aria… Arial… Light light  norm… FALSE  FALSE    
#>  7 /System/Library/Fonts/AppleSDGothicNeo.ttc      10 Appl… Apple… Thin  thin   norm… FALSE  FALSE    
#>  8 /System/Library/Fonts/Supplemental/DecoType…     0 Deco… DecoT… Regu… normal norm… FALSE  FALSE    
#>  9 /System/Library/Fonts/Supplemental/Trebuche…     0 Treb… Trebu… Ital… normal norm… TRUE   FALSE    
#> 10 /System/Library/Fonts/Supplemental/Khmer MN…     0 Khme… Khmer… Regu… normal norm… FALSE  FALSE    
#> # ℹ 560 more rows

Extra font styles

As we discussed above, the R interface only allows you to select between four styles: plain, italic, bold, and bold-italic. If you want to use a thin font, you have no way of communicating this wish to the device. To overcome this, systemfonts provides register_variant() which allows you to register a font with a new typeface name. For example, to use the thin font from the Avenir Next typeface you can register it as follows:

systemfonts::register_variant(
  name = "Avenir Thin",
  family = "Avenir Next",
  weight = "thin"
)

Now you can use Avenir Thin where you would otherwise specify the typeface:

grid::grid.text(
  "Thin weight is soo classy",
  gp = grid::gpar(fontfamily = "Avenir Thin", fontsize = 30)
)

register_variant() also allows you to turn on font features otherwise hidden away:

systemfonts::register_variant(
  name = "Avenir Small Caps",
  family = "Avenir Next",
  features = systemfonts::font_feature(
    letters = "small_caps"
  )
)
grid::grid.text(
  "All caps — Small caps",
  gp = grid::gpar(fontfamily = "Avenir Small Caps", fontsize = 30)
)

Fonts from other places

Historically, systemfonts’ primary role was to access the fonts installed on your computer, the system fonts. But what if you’re using a computer where you don’t have the rights to install new fonts, or you don’t want the hassle of installing a font just to use it for a single plot? That’s the problem solved by systemfonts::add_font() which makes it easy to use a font based on a path. But in many cases you don’t even need that as systemfonts now scans ./fonts and ~/fonts and adds any font files it finds. This means that you can put personal fonts in a fonts folder in your home directory, and project fonts in a fonts directory at the root of the project. This is a great way to ensure that specific fonts are available when you deploy some code to a server.

And you don’t even need to leave R to populate these folders. systemfonts::get_from_google_fonts() will download and install a Google font in ~/fonts:

systemfonts::get_from_google_fonts("Barrio")

grid::grid.text(
  "A new font a day keeps Tufte away",
  gp = grid::gpar(fontfamily = "Barrio", fontsize = 30)
)

And if you want to make sure this code works for anyone using your code (regardless of whether or not they already have the font installed), you can use systemfonts::require_font(). If the font isn’t already installed, this function downloads it from one of the repositories it knows about. If it can’t find it, it will either throw an error (the default) or remap the name to another font so that plotting will still succeed.

systemfonts::require_font("Rubik Distressed")
#> Trying Google Fonts... Found! Downloading font to /var/folders/l4/tvfrd0ps4dqdr2z7kvnl9xh40000gn/T//Rtmp2qw4bE

grid::grid.text(
  "There are no bad fonts\nonly bad text",
  gp = grid::gpar(fontfamily = "Rubik Distressed", fontsize = 30)
)

By default, require_font() places new fonts in a temporary folder so it doesn’t pollute your carefully curated collection of fonts.

Font embedding in SVG

Fonts work a little differently in vector formats like SVG. These formats include the raw text and only render the font when you open the file. This makes for small, accessible files with crisp text at every level of zoom. But it comes with a price: since the text is rendered when it’s opened, it relies on the font in use being available on the viewer’s computer. This obviously puts you at the mercy of their font selection, so if you want consistent outputs you’ll need to embed the font.

In SVG, you can embed fonts using an @import statement in the stylesheet, and can point to a web resource so the SVG doesn’t need to contain the entire font. systemfonts provides facilities to generate URLs for import statements and can provide them in a variety of formats:

systemfonts::fonts_as_import("Barrio")
#> [1] "https://fonts.bunny.net/css2?family=Barrio&display=swap"
systemfonts::fonts_as_import("Rubik Distressed", type = "link")
#> [1] ""

Further, if the font is not available from an online repository, it can embed the font data directly into the URL:

substr(systemfonts::fonts_as_import("Chalkduster"), 1, 200)
#> [1] "data:text/css,@font-face%20%7B%0A%20%20font-family:%20%22Chalkduster%22;%0A%20%20src:%20url(data:font/ttf;charset=utf-8;base64,AAEAAAAMAIAAAwC4T1MvMmk8+wsAAAFIAAAAYGNtYXBJhgfNAAAEOAAACspnbHlmLDPYGwAAf"
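To make the @import idea concrete, here is a minimal hand-rolled sketch in base R. It is only an illustration: svg_with_import() is a made-up helper, not part of systemfonts or svglite, and the URL is simply the one returned for Barrio above. Wrapping the stylesheet in CDATA is a choice made here so the & in the URL doesn’t need escaping.

```r
# Hypothetical helper for illustration only; svglite's web_fonts argument
# handles this for you in practice.
svg_with_import <- function(url, text, family) {
  paste0(
    "<svg xmlns='http://www.w3.org/2000/svg' width='300' height='60'>\n",
    # CDATA keeps the '&' in the URL from being read as XML markup
    "  <style><![CDATA[@import url('", url, "');]]></style>\n",
    "  <text x='10' y='40' font-family='", family, "' font-size='30'>",
    text, "</text>\n",
    "</svg>"
  )
}

# URL copied from the fonts_as_import("Barrio") output shown earlier
import_url <- "https://fonts.bunny.net/css2?family=Barrio&display=swap"

svg_doc <- svg_with_import(import_url, "Example", "Barrio")
cat(svg_doc)
```

The viewer fetches the stylesheet when the file is opened, so the SVG itself stays small.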

svglite uses this feature to allow seamless font embedding with the web_fonts argument. It can take a URL as returned by fonts_as_import() or just the name of the typeface and the URL will automatically be resolved. Look at line 6 in the SVG generated below:

svg <- svglite::svgstring(web_fonts = "Barrio")
grid::grid.text("Example", gp = grid::gpar(fontfamily = "Barrio"))
invisible(dev.off())
svg()
#> 
#> 
#> 
#> 
#>   
#> 
#> 
#> 
#>   
#>     
#>   
#> 
#> 
#> Example
#> 
#> 
#>

Want more?

This document has mainly focused on how to use the fonts you desire from within R. R has other limitations when it comes to text rendering, specifically how to render text that consists of a mix of fonts. This has been solved by marquee, and the curious soul can continue there in order to up their skills in rendering text with R:

grid::grid.draw(
  marquee::marquee_grob(
    "_This_ **is** the {.red end}",
    marquee::classic_style(base_size = 30)
  )
)

Be aware that the style name is at the discretion of the developer of the typeface. It is very common to see discrepancies between the style name and e.g. the weight reported by the font (e.g. Avenir Next Ultra Light is a thin weight font).

S7 0.2.0

Tomasz Kalinowski — Thu, 07 Nov 2024 00:00:00 +0000

We’re excited to announce that S7 v0.2.0 is now available on CRAN! S7 is a new object-oriented programming (OOP) system designed to supersede both S3 and S4. You might wonder why R needs a new OOP system when we already have two. The reason lies in the history of R’s OOP journey: S3 is a simple and effective system for single dispatch, while S4 adds formal class definitions and multiple dispatch, but at the cost of complexity. This has forced developers to choose between the simplicity of S3 and the sophistication of S4.

The goal of S7 is to unify the OOP landscape by building on S3’s existing dispatch system and incorporating the most useful features of S4 (along with some new ones), all with a simpler syntax. S7’s design and implementation have been a collaborative effort by a working group from the R Consortium, including representatives from R-Core, Bioconductor, tidyverse/Posit, ROpenSci, and the wider R community. Since S7 builds on S3, it is fully compatible with existing S3-based code. It’s also been thoughtfully designed to work with S4, and as we learn more about the challenges of transitioning from S4 to S7, we’ll continue to add features to ease this process.

Our long-term goal is to include S7 in base R, but for now, you can install it from CRAN:

install.packages("S7")

What’s new in the second release

The second release of S7 brings refinements and bug fixes. Highlights include:

Support for lazy property defaults, making class setup more flexible.
Custom property setters now run on object initialization.
Significant speed improvements for setting and getting properties with @ and @<-.
Expanded compatibility with base S3 classes.
convert() now provides a default method for transforming a parent class into a subclass.

Additionally, there are numerous bug fixes and quality-of-life improvements, such as better error messages, improved support for base Ops methods, and compatibility improvements for using @ in R versions prior to 4.3. You can see a full list of changes in the release notes.
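As a quick illustration of lazy property defaults, here’s a minimal sketch. The Ticket class and its id property are made up for this example, and it assumes (per the S7 0.2.0 release notes) that a quoted call passed to new_property(default = ) is evaluated each time an object is constructed rather than once when the class is defined:

```r
library(S7)

# Hypothetical class for illustration; not part of S7 itself
Ticket <- new_class("Ticket",
  properties = list(
    # The quoted call is stored unevaluated and run at construction time,
    # so each new object draws its own id
    id = new_property(class_double, default = quote(runif(1)))
  )
)

a <- Ticket()
b <- Ticket()

# The two defaults were computed separately
a@id
b@id
```

The same pattern is handy for defaults like timestamps, where a value computed once at class definition would quickly go stale.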

Who should use S7

S7 is a great fit for R users who like to try new things but don’t need to be the first. It’s already used in several CRAN packages, and the tidyverse team is applying it in new projects. While you may still run into a few issues, many early problems have been resolved.

Usage

library(S7)

Let’s dive into the basics of S7. To learn more, check out the package vignettes, including a more detailed introduction in vignette("S7"), coverage of generics and methods in vignette("generics-methods"), and classes and objects in vignette("classes-objects").

Classes and objects

S7 classes have formal definitions, specified by new_class(), which includes a list of properties and an optional validator. For example, the following code creates a Range class with start and end properties, and a validator to ensure that end is never less than start:

Range <- new_class("Range",
  properties = list(
    start = class_double,
    end = class_double
  ),
  validator = function(self) {
    if (length(self@start) != 1) {
      "@start must be length 1"
    } else if (length(self@end) != 1) {
      "@end must be length 1"
    } else if (self@end < self@start) {
      "@end must be greater than or equal to @start"
    }
  }
)

new_class() returns the class object, which also serves as the constructor to create instances of the class:

x <- Range(start = 1, end = 10)
x
#> <Range>
#>  @ start: num 1
#>  @ end  : num 10

Properties

The data an object holds are called its properties. Use @ to get and set properties:

x@start
#> [1] 1
x@end <- 20
x
#> <Range>
#>  @ start: num 1
#>  @ end  : num 20

Properties are automatically validated against the type declared in new_class() (in this case, double) and checked by the class validator:

x@end <- "x"
#> Error: <Range>@end must be <double>, not <character>
x@end <- -1
#> Error: <Range> object is invalid:
#> - @end must be greater than or equal to @start

Generics and methods

Like S3 and S4, S7 uses functional OOP, where methods belong to generic functions, and method calls look like regular function calls: generic(object, arg2, arg3). A generic uses the types of its arguments to automatically pick the appropriate method implementation.

You can create a new generic with new_generic(), specifying the arguments to dispatch on:

inside <- new_generic("inside", "x")

To define a method for a specific class, use method(generic, class) <- implementation:

method(inside, Range) <- function(x, y) {
  y >= x@start & y <= x@end
}

inside(x, c(0, 5, 10, 15))
#> [1] FALSE  TRUE  TRUE  TRUE

Printing the generic shows its methods:

inside
#> <S7_generic> inside(x, ...) with 1 methods:
#> 1: method(inside, Range)

And you can retrieve the method for a specific class:

method(inside, Range)
#> <S7_method> method(inside, Range)
#> function (x, y) 
#> {
#>     y >= x@start & y <= x@end
#> }
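To see dispatch choose between implementations, here’s a small self-contained sketch. The Circle and Square classes and the area generic are invented for this example; only new_class(), new_generic(), and method<- come from S7:

```r
library(S7)

# Two hypothetical classes with different properties
Circle <- new_class("Circle", properties = list(r = class_double))
Square <- new_class("Square", properties = list(side = class_double))

# One generic that dispatches on its first argument
area <- new_generic("area", "shape")

# One method per class; the generic picks the right one at call time
method(area, Circle) <- function(shape, ...) pi * shape@r^2
method(area, Square) <- function(shape, ...) shape@side^2

area(Circle(r = 1))
area(Square(side = 2))
```

Both calls look identical from the outside; the class of the argument alone decides which function body runs.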

Known limitations

While we are pleased with S7’s design, there are still some limitations:

S7 objects can be serialized to disk (with saveRDS()), but the current implementation saves the entire class specification with each object. This may change in the future.
Support for implicit S3 classes "array" and "matrix" is still in development.

We expect the community will uncover more issues as S7 is more widely adopted. If you encounter any problems, please file an issue at https://github.com/RConsortium/OOP-WG/issues. We appreciate your feedback in helping us make S7 even better! 😃

Acknowledgements

Thank you to all the people who have contributed issues, code, and comments to this release:

@calderonsamuel , @Crosita , @DavisVaughan , @dipterix , @guslipkin , @gvelasq , @hadley , @jeffkimbrel , @jl5000 , @jmbarbone , @jmiahjones , @jonthegeek , @JosiahParry , @jtlandis , @lawremi , @MarcellGranat , @mikmart , @mmaechler , @mynanshan , @rikivillalba , @sjcowtan , @t-kalinowski , @teunbrand , and @waynelapierre .

pkgdown 2.1.0

Hadley Wickham — Mon, 08 Jul 2024 00:00:00 +0000

We’re delighted to announce the release of pkgdown 2.1.0. pkgdown is designed to make it quick and easy to build a beautiful and accessible website for your package.

You can install it from CRAN with:

install.packages("pkgdown")

This is a massive release with a bunch of new features. I’ll highlight the most important here, but as always, I highly recommend skimming the release notes for other smaller improvements and bug fixes.

First, and most importantly, please join me in welcoming two new authors to pkgdown: Olivier Roy and Salim Brüggemann. They have both contributed many improvements to the package and I’m very happy to officially have them aboard as package authors.

library(pkgdown)

Lifecycle changes

Let’s get started with the important stuff, the lifecycle updates. Most importantly, we’ve decided to deprecate support for Bootstrap 3, which was superseded in December 2021. We’re starting to more directly encourage folks to move away from it as maintaining two separate sets of site templates is a time sink. If you’re still using BS3, now’s the time to upgrade.

There are three other changes that are less likely to affect folks:

The document argument to build_site() and build_reference() has been removed after being deprecated in pkgdown 1.4.0; use the devel argument instead.
autolink_html() was deprecated in pkgdown 1.6.0 and now warns every time you use it; use downlit::downlit_html_path() instead.
preview_page() has been deprecated; use preview_site() instead.

Major new features

pkgdown 2.1.0 has two major new features: support for Quarto vignettes and a new light switch that toggles between light and dark modes.

Quarto support

build_article()/build_articles() now support articles and vignettes written with Quarto. To use it, make sure you have the latest version of Quarto, 1.5, which was released last week. By and large you should be able to just write in Quarto and things will just work, but you will need to make a small change to your GitHub action. Learn more at vignette("quarto").

Combining the individual Quarto and pkgdown templating systems is a delicate art, so while I’ve done my best to make it work, there may be some rough edges. Check out the current known limitations in vignette("quarto"), and please file an issue if you encounter a Quarto feature that doesn’t work quite right.

Light switch

pkgdown sites can now provide a “light switch” that allows the reader to switch between light and dark modes (based on work in bslib by @gadenbuie). You can try it out on https://pkgdown.r-lib.org: the light switch appears at the far right of the navbar and remembers the user’s choice between visits to your site.

(Note that the light switch works differently to Quarto dark mode. In Quarto, you can provide two completely different themes for light and dark mode. In pkgdown, dark mode is a relatively thin overlay that is based on your light theme colours.)

For now, you’ll need to opt-in to the light-switch by adding the following to your _pkgdown.yml:

template:
  light-switch: true

In the future we hope to turn it on automatically.

You can learn more about customising the light switch in vignette("customise"): you can choose to select your own syntax highlighting scheme for dark mode, override dark-specific bslib variables, and move its location in the navbar.

User experience

We’ve made a bunch of small changes to enhance the user experience of pkgdown sites:

We’ve continued in our efforts to make pkgdown sites as accessible as possible by now warning if you’ve forgotten to add alt text to images (including plots) in your articles. We’ve also added a new vignette("accessibility") which describes additional manual tasks you can perform to make your site as accessible as possible.
build_reference() adds anchors to arguments, making it possible to link directly to an argument. This is very useful when you’re trying to direct folks to the documentation for a specific argument, e.g. https://pkgdown.r-lib.org/reference/build_site.html#arg-devel.
build_reference_index() now displays function lifecycle badges next to the function name. If you want to gather together (e.g.) all the deprecated functions in one spot in the reference index, you can use the new topic selector has_lifecycle("deprecated").
The new template.math-rendering option allows you to control how math is rendered on your site. The default uses mathml which is zero dependency but has the lowest fidelity. If you use a lot of math on your site, you can switch back to the previous method with mathjax, or try out katex, a faster alternative.
pkgdown sites no longer depend on external content distribution networks (CDN) for common javascript, CSS, and font files. CDNs no longer provide any performance advantages and make deployment harder inside certain locked-down corporate environments.
pkgdown includes translations for more terms including “Abstract” and “Search site”. A big thanks to @jplecavalier, @dieghernan, @krlmlr, @LDalby, @rich-iannone, @jmaspons, and @mine-cetinkaya-rundel for providing updated translations in French, Spanish, Portuguese, German, Catalan, and Turkish!

I’ve also written vignette("translations"), a brief vignette that discusses how translation works for non-English sites, and includes how you can create translations for new languages. (This is a great way to contribute to pkgdown if you are multi-lingual!)

Developer experience

We’ve also made a bunch of minor improvements to the package developer experience:

YAML validation has been substantially improved so you should get much clearer errors if you have made a mistake in your _pkgdown.yml. Please file an issue if you find a case where the error message is not helpful.
The build_*() functions (apart from build_site()) no longer automatically preview in interactive sessions since they all emit clickable links to any files that have changed. You can continue to use preview_site() to open the site in your browser.
The build_*() functions now work better if you’re previewing just part of a site and haven’t built the whole thing. It should no longer be necessary to run init_site() in most cases, and you shouldn’t be able to get into a state where you’re told to run init_site() and then it doesn’t work.
We give more and clearer details of the site building process including reporting on exactly what is generated by bslib, what is copied from templates, and what redirects are generated.

Acknowledgements

A big thanks to all 212 folks who contributed to this release! @Adafede , @AEBilgrau , @albertocasagrande , @alex-d13 , @AliSajid , @arkadiuszbeer , @ArneBab , @asadow , @ateucher , @avhz , @banfai , @barcaroli , @BartJanvanRossum , @bastistician , @ben18785 , @bijoychandraAU , @Bisaloo , @bkmgit , @bnprks , @brycefrank , @bschilder , @bundfussr , @cararthompson , @Carol-seven , @cbailiss , @cboettig , @cderv , @chlebowa , @chuxinyuan , @cromanpa94 , @cthombor , @d-morrison , @DanChaltiel , @DarioS , @davidchall , @DavisVaughan , @dbosak01 , @dchiu911 , @ddsjoberg , @DeepanshKhurana , @dhersz , @dieghernan , @djhocking , @dkarletsos , @dmurdoch , @dshemetov , @dsweber2 , @dvg-p4 , @DyfanJones , @ecmerkle , @eddelbuettel , @eeholmes , @eitsupi , @eliocamp , @elong0527 , @EmilHvitfeldt , @erikarasnick , @esimms999 , @espinielli , @etiennebacher , @ewenharrison , @filipsch , @FlukeAndFeather , @francoisluc , @friendly , @fweber144 , @gaborcsardi , @gadenbuie , @galachad , @gangstR , @gavinsimpson , @GeoBosh , @GFabien , @ggcostoya , @ghost , @givison , @gladkia , @glin , @gmbecker , @gravesti , @GregorDeCillia , @gregorypenn , @gsmolinski , @gsrohde , @gungorMetehan , @hadley , @harshkrishna17 , @HenrikBengtsson , @hfrick , @hrecht , @hsloot , @idavydov , @idmn , @igordot , @IndrajeetPatil , @jabenninghoff , @jack-davison , @jangorecki , @jayhesselberth , @jennybc , @jeroen , @JerryWho , @jhelvy , @jmaspons , @john-harrold , @john-ioannides , @jonasmuench , @jonnybaik , @josherrickson , @joshualerickson , @JosiahParry , @jplecavalier , @JSchoenbachler , @juliasilge , @jwimberl , @kalaschnik , @kevinushey , @klmr , @krlmlr , @LDalby , @ldecicco-USGS , @lhdjung , @LiNk-NY , @lionel- , @Liripo , @lorenzwalthert , @lschneiderbauer , @mabesa , @maelle , @maRce10 , @margotbligh , @marine-ecologist , @markfairbanks , @martinlaw , @matt-dray , @mattfidler , @matthewjnield , @MattPM , @mccarthy-m-g , @MEO265 , @merliseclyde , @MichaelChirico , @mikeblazanin , @mikeroswell , 
@mine-cetinkaya-rundel , @MLopez-Ibanez , @Moohan , @mpadge , @mrcaseb , @mrchypark , @ms609 , @msberends , @musvaage , @nanxstats , @nathaneastwood , @netique , @nicholascarey , @nicolerg , @olivroy , @pearsonca , @peterdesmet , @phauchamps , @przmv , @quantsch , @ramiromagno , @rcannood , @rempsyc , @rgaiacs , @rich-iannone , @rickhelmus , @rmflight , @robmoss , @royfrancis , @rsangole , @ryantibs , @salim-b , @samuel-marsh , @SebKrantz , @SESjo , @sgvignali , @spsanderson , @srfall , @stefanoborini , @stephenashton-dhsc , @strengejacke , @swsoyee , @t-kalinowski , @talgalili , @tanho63 , @tedmoorman , @telphick , @TFKentUSDA , @ThierryO , @thisisnic , @thomasp85 , @tomsing1 , @tony-aw , @trevorld , @tylerlittlefield , @uriahf , @urswilke , @ValValetl , @venpopov , @vincentvanhees , @wangq13 , @willgearty , @wviechtb , @xuyiqing , @yjunechoe , @ynsec37 , @zeehio , and @zkamvar .

withr 3.0.0

Lionel Henry — Thu, 18 Jan 2024 00:00:00 +0000

It’s not without jubilant bearing that we announce the release of the 3.0.0 version of withr, the tidyverse solution for automatic cleanup of resources! In this release, the internals of withr were rewritten to improve the performance and increase the compatibility with base R’s on.exit() mechanism.

You can install it from CRAN with:

install.packages("withr")

In this blog post we’ll go over the changes that made this rewrite possible, but first we’ll review the cleanup strategies made possible by withr.

You can see a full list of changes in the release notes.

Cleaning up resources with base R and with withr

Traditionally, resource cleanup in R is done with base::on.exit(). Cleaning up in the on-exit hook ensures that the cleanup happens both in the normal case, when the code has finished running without error, and in the error case, when something went wrong and execution is interrupted.

on.exit() is meant to be used inside functions but it also works within local(), which we’ll use here for our examples:

local({
  on.exit(message("Cleaning time!"))
  print(1 + 2)
})
#> [1] 3
#> Cleaning time!

local({
  on.exit(message("Cleaning time!"))
  stop("uh oh")
  print(1 + 2)
})
#> Error:
#> ! uh oh
#> Cleaning time!

on.exit() is guaranteed to run no matter what and this property makes it invaluable for resource cleaning. No more accidental littering!

However, the process of cleaning up this way can be a bit verbose and feel too manual. Here is how you’d create and clean up a temporary file, for instance:

local({
  my_file <- tempfile()

  file.create(my_file)
  on.exit(file.remove(my_file))

  writeLines(c("a", "b"), con = my_file)
})

Wouldn’t it be great if we could wrap this code up in a function? That’s the goal of withr’s local_-prefixed functions. They combine both the creation or modification of a resource and its (eventual) restoration to the original state into a single function:

local({
  my_file <- withr::local_tempfile()

  writeLines(c("a", "b"), con = my_file)
})

In this case we have created a resource (a file), but the same principle applies to modifying resources such as global options:

local({
  # Let's temporarily print with a single decimal place
  withr::local_options(digits = 1)
  print(1/3)
})
#> [1] 0.3

# The original option value has been restored
getOption("digits")
#> [1] 7

print(1/3)
#> [1] 0.3333333

You can equivalently use the with_-prefixed variants (from which the package takes its name!). This way you don’t need to wrap in local():

withr::with_options(list(digits = 1), print(1/3))
#> [1] 0.3

The with_ functions are useful for creating very small scopes for given resources, inside or outside a function.
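To show the idea behind a with_ function, here’s a minimal base R sketch. with_digits() is a hypothetical helper written for this post, not withr’s actual implementation: it sets an option, evaluates the code it was given, and restores the old value on exit.

```r
# Hypothetical helper for illustration; withr::with_options() is the real tool
with_digits <- function(digits, code) {
  # options() returns the previous values, which we restore on exit
  old <- options(digits = digits)
  on.exit(options(old), add = TRUE)

  # `code` is a lazy argument, so it is only evaluated here,
  # after the option has been changed
  force(code)
}

before <- getOption("digits")
inside <- with_digits(1, format(1 / 3))
after <- getOption("digits")

inside
identical(before, after)
```

Because R evaluates arguments lazily, the temporary option is already in place by the time the code runs, and the scope ends as soon as the function returns.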

The withr 3.0.0 rewrite

Traditionally, withr implemented its own exit event system on top of on.exit(). We needed an extra layer because of a couple of missing features:

When multiple resources are managed by a piece of code, the order in which these resources are restored or cleaned up sometimes matters. The most consistent order for cleanup is last-in first-out (LIFO). In other words the oldest resource, on which younger resources might depend, is cleaned up last. But historically R only supported first-in first-out (FIFO) order.
The other missing piece was being able to inspect the contents of the exit hook. The sys.on.exit() R helper was created for this purpose but was affected by a bug that prevented it from working inside functions.

We contributed two changes to R 3.5.0 that filled these missing pieces, fixing the sys.on.exit() bug and adding an after argument to on.exit() to allow last-in first-out ordering.
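Here’s a small base R sketch of what the after argument changes (it needs R 3.5.0 or later). The run_cleanup() helper and the event labels are made up for illustration; the only real API used is on.exit() with its add and after arguments:

```r
# Hypothetical helper: registers two handlers and records the order they run in
run_cleanup <- function(after) {
  events <- character()

  worker <- function() {
    # Registered first: imagine the resource that was created first
    on.exit(events <<- c(events, "close connection"), add = TRUE, after = after)
    # Registered second: imagine a resource that depends on the first one
    on.exit(events <<- c(events, "delete temp file"), add = TRUE, after = after)
    invisible(NULL)
  }

  worker()
  events
}

fifo <- run_cleanup(after = TRUE)   # the default ordering
lifo <- run_cleanup(after = FALSE)  # last registered runs first

fifo
lifo
```

With after = FALSE each new handler is placed in front of the existing ones, so the most recently created resource is cleaned up first.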

Until now, we haven’t been able to leverage these contributions because of our policy of supporting the current and previous four versions of R. Now that enough time has passed, it was time for a rewrite! Our version of base::on.exit() is withr::defer(). Along with better default behaviour, withr::defer() allows the cleanup of resources non-locally (ironically an essential feature for implementing local_ functions). Given the changes in R 3.5.0, withr::defer() can now be implemented as a simple wrapper around on.exit().
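To illustrate what “non-locally” means, here’s a minimal sketch of a home-made local_ function. local_scratch_file() and count_lines() are hypothetical names; the sketch relies only on the envir argument of withr::defer(), which attaches the cleanup to the caller’s frame instead of the helper’s own:

```r
# Hypothetical local_-style helper built on withr::defer()
local_scratch_file <- function(lines, .local_envir = parent.frame()) {
  path <- tempfile(fileext = ".txt")
  writeLines(lines, path)

  # Register the cleanup on the *caller's* frame, so the file survives
  # this function returning and is removed when the caller exits
  withr::defer(unlink(path), envir = .local_envir)

  path
}

count_lines <- function() {
  path <- local_scratch_file(c("a", "b"))
  # The file still exists here because cleanup is tied to count_lines()
  length(readLines(path))
}

count_lines()
```

Had the helper used a plain on.exit(), the file would have been deleted before the caller ever got to read it.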

One benefit of the rewrite is that mixing withr tools and on.exit() in the same function now correctly interleaves cleanup:

local({
  on.exit(print(1))

  withr::defer(print(2))

  on.exit(print(3), add = TRUE, after = FALSE)

  withr::defer(print(4))

  print(5)
})
#> [1] 5
#> [1] 4
#> [1] 3
#> [1] 2
#> [1] 1

But the main benefit is increased performance. Here is how defer() compared to on.exit() in the previous version:

base <- function() on.exit(NULL)
withr <- function() defer(NULL)

# withr 2.5.2
bench::mark(base(), withr(), check = FALSE)[1:8]
#> # A tibble: 2 × 8
#>   expression      min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
#>                 
#> 1 base()            0   82ns  6954952.        0B    696.   9999     1
#> 2 withr()      26.2µs 27.9µs    35172.    88.4KB     52.8  9985    15

withr 3.0.0 has now caught up to on.exit() quite a bit:

# withr 3.0.0
bench::mark(base(), withr(), check = FALSE)[1:8]
#> # A tibble: 2 × 8
#>   expression      min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
#>                 
#> 1 base()            0   82ns  7329829.        0B       0  10000     0
#> 2 withr()      2.95µs  3.4µs   280858.        0B     225.  9992     8

Of course on.exit() is still much faster, in part because defer() supports more features (more on that below), but mostly because on.exit() is a primitive function whereas defer() is implemented as a normal R function. That said, we hope that we now have made defer() (and the local_ and with_ functions that use it) sufficiently fast to be used even in performance-critical micro-tools.

Improved withr features

Over the successive releases of withr we’ve improved the behaviour of cleanup expressions interactively, in scripts executed with source(), and in knitr. on.exit() is a bit inconsistent when it is used outside of a function:

Interactively, it doesn’t do anything.
In source() and in knitr, it runs immediately instead of at the end of the script.

withr::defer() and the withr::local_ helpers try to be more helpful for these cases.

Interactively, it saves the cleanup action in a special global hook and you get information about how to actually perform the cleanup:

file <- withr::local_tempfile()
#> Setting global deferred event(s).
#> i These will be run:
#>   * Automatically, when the R session ends.
#>   * On demand, if you call `withr::deferred_run()`.
#> i Use `withr::deferred_clear()` to clear them without executing.

# Clean up now
withr::deferred_run()
#> Ran 1/1 deferred expressions

In knitr or source()¹, the cleanup is performed at the end of the document or of the script. If you need chunk-level cleanup, use local() as we’ve been doing in the examples of this blog post:

Cleaning up at the end of the document:

```r
document_wide_file <- withr::local_tempfile()
```

Cleaning up at the end of the chunk:

```r
local({
  local_file <- withr::local_tempfile()
})
```

Starting from withr 3.0.0, you can also run deferred_run() inside of a chunk:

```r
withr::deferred_run()
#> Ran 1/1 deferred expressions
```

Acknowledgements

Thanks to the GitHub contributors who helped us with this release!

@ashbythorpe , @bastistician , @DavisVaughan , @fkohrt , @gaborcsardi , @gdurif , @hadley , @HenrikBengtsson , @honghaoli42 , @IndrajeetPatil , @jameslairdsmith , @jennybc , @jonkeane , @krlmlr , @lionel- , @maelle , @MichaelChirico , @MLopez-Ibanez , @moodymudskipper , @multimeric , @orichters , @pfuehrlich-pik , @solmos , @tillea , and @vanhry .

¹ source() is only supported by default when running in the global environment, which is usually the case. For the special case of sourcing in a local environment, you need to set options(withr.hook_source = TRUE) first.

roxygen2 7.3.0

Hadley Wickham — Thu, 11 Jan 2024 00:00:00 +0000

We’re well pleased to announce the release of roxygen2 7.3.0. roxygen2 allows you to write specially formatted R comments that generate R documentation files (man/*.Rd) and the NAMESPACE file. roxygen2 is used by over 13,000 CRAN packages.

You can install it from CRAN with:

install.packages("roxygen2")

There are four major improvements in this release:

The NAMESPACE roclet now reports if you have S3 methods that are missing an @export tag. All S3 methods need to be @exported even if the generic is not. This avoids rare, but hard to debug, problems. If you think this is giving a false positive, please file an issue and suppress the warning with @exportS3Method NULL.

I’ve also considerably revamped the documentation for S3 methods in vignette("namespace"). The docs now discuss what exporting an S3 method really means, and why it would be technically better to call it registering the method.
Finally, the NAMESPACE roclet once again regenerates imports before loading package code and parsing roxygen blocks. This has been the goal for a long time, but we accidentally broke it when adding support for code execution in markdown blocks. This change resolves a family of problems where you somehow bork your NAMESPACE and can’t easily fix it because you can’t re-document the package because you can’t load your package because your NAMESPACE is borked.
@docType package now works like "_PACKAGE", including creating a {packagename}-package alias automatically. This resolves a bug introduced in roxygen2 7.0.0 that meant that many packages lacked the correct alias for their package documentation topic.
"_PACKAGE" does a better job of automatically generating aliases. In particular, it will no longer generate a duplicate alias if you have a function with the same name as your package (like glue::glue() or reprex::reprex()). If you’ve previously had to hack around this bug, you can now delete any custom @aliases tags associated with the "_PACKAGE" docs.
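For reference, here’s a minimal sketch of what these tags look like in a package source file. The temperature class is invented for the example; the tags themselves (@export on an S3 method and the "_PACKAGE" sentinel) are the ones discussed above:

```r
#' Print a temperature
#'
#' @param x A `temperature` object (a hypothetical class for this example).
#' @param ... Unused.
#' @export
print.temperature <- function(x, ...) {
  # S3 methods need @export even when the generic lives elsewhere;
  # roxygen2 turns this into an S3method() registration in NAMESPACE
  cat("<temperature> ", unclass(x), "\n", sep = "")
  invisible(x)
}

# Package-level documentation: roxygen2 generates the package topic
# and its aliases from this sentinel
#' @keywords internal
"_PACKAGE"
```

If you forget the @export on print.temperature(), roxygen2 7.3.0 will now tell you when you re-document the package.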

You can see a full list of other minor improvements and bug fixes in the release notes.

Acknowledgements

A big thanks to the 46 folks who helped make this release possible through their thoughtful questions and carefully crafted code! @andrewmarx , @ashbythorpe , @ateucher , @bahadzie , @bastistician , @beginb , @brodieG , @bryanhanson , @cbielow , @daattali , @DanChaltiel , @dpprdan , @dsweber2 , @espinielli , @hadley , @hughjonesd , @jeroen , @jmbarbone , @johnbaums , @jonocarroll , @kathi-munk , @krlmlr , @kylebutts , @lionel- , @LouisLeNezet , @maelle , @MaximilianPi , @MichaelChirico , @moodymudskipper , @msberends , @multimeric , @musvaage , @neshvig10 , @olivroy , @ralmond , @RMHogervorst , @Robinlovelace , @rossellhayes , @rsbivand , @sbgraves237 , @schradj , @sebffischer , @simonpcouch , @stemangiola , @tau31 , and @trusch139 .

Three ways errors are about to get better in tidymodels

Simon Couch — Fri, 10 Nov 2023 00:00:00 +0000

Twice a year, the tidymodels team comes together for “spring cleaning,” a week-long project devoted to package maintenance. Ahead of the week, we come up with a list of maintenance tasks that we’d like to see consistently implemented across our packages. Many of these tasks can be completed by running one usethis function, while others are much more involved, like issue triage.¹ In tidymodels, triaging issues in our core packages helps us to better understand common ways that users struggle to wrap their heads around an API choice we’ve made or find the information they need. So, among other things, refinements to the wording of our error messages are a common output of our spring cleanings. This blog post will call out three kinds of changes to our erroring that came out of this spring cleaning:

Improving existing errors: The outcome went missing
Do something where we once did nothing: Predicting with things that can’t predict
Make a place and point to it: Model formulas

To demonstrate, we’ll walk through some examples using the tidymodels packages:

library(tidymodels)
#> ── Attaching packages ──────────────────────────── tidymodels 1.1.1 ──
#> ✔ broom        1.0.5          ✔ recipes      1.0.8.9000
#> ✔ dials        1.2.0          ✔ rsample      1.2.0     
#> ✔ dplyr        1.1.3          ✔ tibble       3.2.1     
#> ✔ ggplot2      3.4.4          ✔ tidyr        1.3.0     
#> ✔ infer        1.0.5          ✔ tune         1.1.2.9000
#> ✔ modeldata    1.2.0          ✔ workflows    1.1.3     
#> ✔ parsnip      1.1.1.9001     ✔ workflowsets 1.0.1     
#> ✔ purrr        1.0.2          ✔ yardstick    1.2.0
#> ── Conflicts ─────────────────────────────── tidymodels_conflicts() ──
#> ✖ purrr::discard() masks scales::discard()
#> ✖ dplyr::filter()  masks stats::filter()
#> ✖ dplyr::lag()     masks stats::lag()
#> ✖ recipes::step()  masks stats::step()
#> • Use suppressPackageStartupMessages() to eliminate package startup messages

Note that my installed versions include the current dev version of a few tidymodels packages. You can install those versions with:

pak::pak(paste0("tidymodels/", c("tune", "parsnip", "recipes")))

The outcome went missing 👻

The tidymodels packages focus on supervised machine learning problems, predicting the value of an outcome using predictors.² For example, in the code:

linear_spec <- linear_reg()

linear_fit <- fit(linear_spec, mpg ~ hp, mtcars)

The mpg variable is the outcome. There are many ways that an analyst may mistakenly fail to pass an outcome. In the most straightforward case, they might omit the outcome on the LHS of the formula:

fit(linear_spec, ~ hp, mtcars)
#> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
#>   incompatible dimensions

In this case, parsnip used to defer to the modeling engine to raise an error, which may or may not be informative.

There are many less obvious ways an analyst may mistakenly supply no outcome variable. For example, try spotting the issue in the following code, defining a recipe to perform principal component analysis (PCA) on the numeric variables in the data before fitting the model:

mtcars_rec <-
  recipe(mpg ~ ., mtcars) %>%
  step_pca(all_numeric())

workflow(mtcars_rec, linear_spec) %>% fit(mtcars)
#> Error: object '.' not found

A head-scratcher! To help diagnose what’s happening here, we could first try seeing what data is actually being passed to the model.

mtcars_rec_trained <-
  mtcars_rec %>% 
  prep(mtcars) 

mtcars_rec_trained %>% bake(NULL)
#> # A tibble: 32 × 5
#>      PC1   PC2    PC3     PC4    PC5
#>    <dbl> <dbl>  <dbl>   <dbl>  <dbl>
#>  1 -195.  12.8 -11.4   0.0164  2.17 
#>  2 -195.  12.9 -11.7  -0.479   2.11 
#>  3 -142.  25.9 -16.0  -1.34   -1.18 
#>  4 -279. -38.3 -14.0   0.157  -0.817
#>  5 -399. -37.3  -1.38  2.56   -0.444
#>  6 -248. -25.6 -12.2  -3.01   -1.08 
#>  7 -435.  20.9  13.9   0.801  -0.916
#>  8 -160. -20.0 -23.3  -1.06    0.787
#>  9 -172.  10.8 -18.3  -4.40   -0.836
#> 10 -209.  19.7  -8.94 -2.58    1.33 
#> # ℹ 22 more rows

Mmm. What happened to mpg? We mistakenly told step_pca() to perform PCA on all of the numeric variables, not just the numeric predictors! As a result, it incorporated mpg into the principal components, removing each of the original numeric variables after the fact. Rewriting using the correct tidyselect specification all_numeric_predictors():

mtcars_rec_new <- 
  recipe(mpg ~ ., mtcars) %>%
  step_pca(all_numeric_predictors())

workflow(mtcars_rec_new, linear_spec) %>% fit(mtcars)
#> ══ Workflow [trained] ════════════════════════════════════════════════
#> Preprocessor: Recipe
#> Model: linear_reg()
#> 
#> ── Preprocessor ──────────────────────────────────────────────────────
#> 1 Recipe Step
#> 
#> • step_pca()
#> 
#> ── Model ─────────────────────────────────────────────────────────────
#> 
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#> 
#> Coefficients:
#> (Intercept)          PC1          PC2          PC3          PC4  
#>    43.39293      0.07609     -0.05266      0.57892      0.94890  
#>         PC5  
#>    -1.72569

Works like a charm. The errors we saw previously could be much more helpful, though. With the current development version of parsnip, the formula example now errors with:

fit(linear_spec, ~ hp, mtcars)
#> Error:
#> ! `linear_reg()` was unable to find an outcome.
#> ℹ Ensure that you have specified an outcome column and that it hasn't
#>   been removed in pre-processing.

Or, with workflows:

workflow(mtcars_rec, linear_spec) %>% fit(mtcars)
#> Error:
#> ! `linear_reg()` was unable to find an outcome.
#> ℹ Ensure that you have specified an outcome column and that it hasn't
#>   been removed in pre-processing.

Much better.
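
To make the idea of checking up front (rather than deferring to the engine) concrete, here is a minimal base R sketch. The helper `check_has_outcome()` is hypothetical and is not how parsnip implements its check; it only illustrates the two failure modes above: a formula with no left-hand side, and an outcome column that has gone missing from the data.

```r
# Hypothetical helper for illustration only; not part of parsnip.
check_has_outcome <- function(formula, data) {
  # A two-sided formula has length 3; a one-sided formula like `~ hp` has length 2.
  if (length(formula) != 3L) {
    stop("Unable to find an outcome: the formula has no left-hand side.",
         call. = FALSE)
  }

  # Variables named on the left-hand side must still exist in the data.
  outcome <- all.vars(formula[[2]])
  missing_cols <- setdiff(outcome, names(data))
  if (length(missing_cols) > 0) {
    stop("Unable to find an outcome: column(s) ",
         paste(missing_cols, collapse = ", "),
         " are not in the data. Were they removed in pre-processing?",
         call. = FALSE)
  }

  invisible(TRUE)
}

# Passes silently
check_has_outcome(mpg ~ hp, mtcars)

# Both of these raise an informative error before any model is fitted
no_lhs <- tryCatch(check_has_outcome(~ hp, mtcars), error = conditionMessage)
pcs_only <- data.frame(PC1 = c(1, 2, 3), PC2 = c(3, 2, 1))
dropped <- tryCatch(check_has_outcome(mpg ~ ., pcs_only), error = conditionMessage)
```

The point of the design is that the check runs on the inputs the user actually supplied, so the message can talk about outcomes rather than about matrix dimensions inside the engine.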

Predicting with things that can’t predict

Earlier this year, Dr. Louise E. Sinks put out a wonderful blog post documenting what it felt like to approach the various object types defined in tidymodels as a newcomer to the collection of packages. They wrote:

I found it confusing that fit, last_fit, fit_resamples, etc., did not all produce objects that contained the same information and could be acted on by the same functions.

This makes sense. While we try to forefront the intended mental model for fitting and predicting with tidymodels in our APIs and documentation, we also need to be proactive in anticipating common challenges in constructing that mental model.

For example, we’ve found that it’s sometimes not clear to users which outputs they can call predict() on. One such situation, as Louise points out, is with fit_resamples():

# fit a linear regression model to bootstrap resamples of mtcars
mtcars_res <- fit_resamples(linear_reg(), mpg ~ ., bootstraps(mtcars))

mtcars_res
#> # Resampling results
#> # Bootstrap sampling 
#> # A tibble: 25 × 4
#>    splits          id          .metrics         .notes          
#>    <list>          <chr>       <list>           <list>
#>  1  Bootstrap01  
#>  2  Bootstrap02  
#>  3  Bootstrap03  
#>  4  Bootstrap04  
#>  5  Bootstrap05  
#>  6  Bootstrap06  
#>  7  Bootstrap07  
#>  8  Bootstrap08  
#>  9  Bootstrap09  
#> 10  Bootstrap10  
#> # ℹ 15 more rows

With previous tidymodels versions, mistakenly trying to predict with this object resulted in the following output:

predict(mtcars_res)
#> Error in UseMethod("predict") : 
#>   no applicable method for 'predict' applied to an object of class
#>   "c('resample_results', 'tune_results', 'tbl_df', 'tbl', 'data.frame')"

Some R developers may recognize this error as what results when we didn’t define any predict() method for tune_results objects. We didn’t do so because prediction isn’t well-defined for tuning results. But, this error message does little to help a user understand why that’s the case.

We’ve recently made some changes to error more informatively in this case. We do so by defining a “dummy” predict() method for tuning results, implemented only for the sake of erroring more informatively. The same code will now give the following output:

predict(mtcars_res)
#> Error in `predict()`:
#> ! `predict()` is not well-defined for tuning results.
#> ℹ To predict with the optimal model configuration from tuning
#>   results, ensure that the tuning result was generated with the
#>   control option `save_workflow = TRUE`, run `fit_best()`, and
#>   then predict using `predict()` on its output.
#> ℹ To collect predictions from tuning results, ensure that the
#>   tuning result was generated with the control option `save_pred
#>   = TRUE` and run `collect_predictions()`.
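
The “dummy method” pattern described above can be sketched in base R. This is an illustration under stated assumptions, not tune's actual code: the class name `demo_tune_results` is a stand-in, and base `stop()` is swapped in for the cli-based erroring that the real method uses.

```r
# An S3 method that exists only to raise a helpful error.
# "demo_tune_results" is a hypothetical class used for this sketch.
predict.demo_tune_results <- function(object, ...) {
  stop(
    "`predict()` is not well-defined for tuning results. ",
    "Fit a final model first, then call `predict()` on that fit.",
    call. = FALSE
  )
}

demo_res <- structure(list(), class = "demo_tune_results")

# Dispatch now reaches the method above instead of failing inside UseMethod()
msg <- tryCatch(predict(demo_res), error = conditionMessage)
msg
```

Because the method is found by ordinary S3 dispatch, users never see the generic "no applicable method" message for this class.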

References to important concepts or functions, like control options, fit_best(), and collect_predictions(), link to the help files for those functions using cli’s erroring tools.

We hope new error messages like this will help to get folks back on track.
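
Following the second hint in that message might look like the sketch below. It assumes the tidymodels packages are installed; the seed, the number of bootstrap resamples, and the object names are arbitrary choices for illustration.

```r
library(tidymodels)

set.seed(123)
boots <- bootstraps(mtcars, times = 5)

# save_pred = TRUE keeps the held-out predictions from each resample
mtcars_res <- fit_resamples(
  linear_reg(),
  mpg ~ .,
  resamples = boots,
  control = control_resamples(save_pred = TRUE)
)

# One row per held-out observation per resample
preds <- collect_predictions(mtcars_res)
head(preds)
```

These are predictions on the assessment sets of the resamples, which is what resampling results can meaningfully provide; predicting on new data still requires a model fitted with fit().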

Model formulas

In R, formulas provide a compact, symbolic notation to specify model terms. Many modeling functions in R make use of “specials,” or nonstandard notations used in formulas. Specials are defined and handled as a special case by a given modeling package. parsnip defers to engine packages to handle specials, so you can work with them as usual. For example, the mgcv package provides support for generalized additive models in R, and defines a special called s() to indicate smoothing terms. You can interface with it via tidymodels like so:

# define a generalized additive model specification
gam_spec <- gen_additive_mod("regression")

# fit the specification using a formula with specials
fit(gam_spec, mpg ~ cyl + s(disp, k = 5), mtcars)
#> parsnip model object
#> 
#> 
#> Family: gaussian 
#> Link function: identity 
#> 
#> Formula:
#> mpg ~ cyl + s(disp, k = 5)
#> 
#> Estimated degrees of freedom:
#> 3.39  total = 5.39 
#> 
#> GCV score: 6.380152

While parsnip can handle specials just fine, the package is often used in conjunction with the greater tidymodels package ecosystem, which defines its own pre-processing infrastructure and functionality via packages like hardhat and recipes. The specials defined in many modeling packages introduce conflicts with that infrastructure. To support specials while also maintaining consistent syntax elsewhere in the ecosystem, tidymodels delineates between two types of formulas: preprocessing formulas and model formulas. Preprocessing formulas determine the input variables, while model formulas determine the model structure.

This is a tricky abstraction, and one that users have tripped up on in the past. Users could generate all sorts of different errors by 1) mistakenly passing model formulas where preprocessing formulas were expected, or 2) forgetting to pass a model formula where it’s needed. For an example of 1), we could pass recipes the same formula we passed to parsnip:

recipe(mpg ~ cyl + s(disp, k = 5), mtcars)
#> Error in `inline_check()`:
#> ! No in-line functions should be used here; use steps to 
#>   define baking actions.

But we just used a special with another tidymodels function! Rude!

Or, to demonstrate 2), we pass the preprocessing formula as we ought to but forget to provide the model formula:

gam_wflow <- 
  workflow() %>%
  add_formula(mpg ~ .) %>%
  add_model(gam_spec) 

gam_wflow %>% fit(mtcars)
#> Error in `fit_xy()`:
#> ! `fit()` must be used with GAM models (due to its use of formulas).

Uh, but I did just use fit()!

Since the distinction between model formulas and preprocessor formulas comes up in functions across tidymodels, we decided to create a central page that documents the concept itself, hopefully making the syntax associated with it come more easily to users. Then, we linked to it all over the place. For example, those errors now look like:

recipe(mpg ~ cyl + s(disp, k = 5), mtcars)
#> Error in `inline_check()`:
#> ✖ No in-line functions should be used here.
#> ℹ The following function was found: `s`.
#> ℹ Use steps to do transformations instead.
#> ℹ If your modeling engine uses special terms in formulas, pass that
#>   formula to workflows as a model formula
#>   (`?parsnip::model_formula()`).

Or:

gam_wflow %>% fit(mtcars)
#> Error:
#> ! When working with generalized additive models, please supply
#>   the model specification to `workflows::add_model()` along with a
#>   `formula` argument.
#> ℹ See `?parsnip::model_formula()` to learn more.
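
Acting on the advice in these messages might look like the following sketch. It assumes the tidymodels and mgcv packages are installed; the preprocessing formula `mpg ~ cyl + disp` is an illustrative choice.

```r
library(tidymodels)

gam_spec <- gen_additive_mod("regression")

gam_wflow <-
  workflow() %>%
  # preprocessing formula: which columns are used, and in which roles
  add_formula(mpg ~ cyl + disp) %>%
  # model formula: how those columns enter the model, specials included
  add_model(gam_spec, formula = mpg ~ cyl + s(disp, k = 5))

gam_fit <- fit(gam_wflow, mtcars)
```

The special s() appears only in the model formula passed to add_model(), so the preprocessing infrastructure never has to interpret it.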

While I’ve only outlined three, there are all sorts of improvements to error messages on their way to the tidymodels packages in upcoming releases. If you happen to stumble across them, we hope they quickly set you back on the right path. 🗺

¹ Issue triage consists of categorizing, prioritizing, and consolidating issues in a repository’s issue tracker.
² See the tidyclust package for unsupervised learning with tidymodels!

testthat 3.2.0

Hadley Wickham — Sun, 08 Oct 2023 00:00:00 +0000

We’re chuffed to announce the release of testthat 3.2.0. testthat makes it easy to turn your existing informal tests into formal, automated tests that you can rerun quickly and easily. testthat is the most popular unit-testing package for R, and is used by almost 9,000 CRAN and Bioconductor packages. You can learn more about unit testing at https://r-pkgs.org/tests.html.

You can install it from CRAN with:

install.packages("testthat")

testthat 3.2.0 includes relatively few new features, but there have been ten patch releases since testthat 3.1.0. These patch releases contained a bunch of experiments that we now believe are ready for the world. So this blog post summarises the changes in 3.1.1, 3.1.2, 3.1.3, 3.1.4, 3.1.5, 3.1.6, 3.1.7, 3.1.8, 3.1.9, and 3.1.10 over the last two years.

Here we’ll focus on the biggest news: new expectations, tweaks to the way that error snapshots are reported, support for mocking, a new way to detect if a test has changed global state, and a bunch of smaller UI improvements.

library(testthat)

Documentation

The first and most important thing to point out is that the second edition of R Packages contains updated and much expanded coverage of testing. Coverage of testing is now split up over three chapters:

Testing basics
Designing your test suite
Advanced testing techniques

There’s also a new vignette about special files (vignette("special-files")) which describes the various special files that you find in tests/testthat and when you might need to use them.

New expectations

There are a handful of notable new expectations. expect_contains() and expect_in() work similarly to expect_true(all(expected %in% object)) or expect_true(all(object %in% expected)) but give more informative failure messages:

fruits <- c("apple", "banana", "pear")
expect_contains(fruits, "apple")
expect_contains(fruits, "pineapple")
#> Error: `fruits` (`actual`) doesn't fully contain all the values in "pineapple" (`expected`).
#> * Missing from `actual`: "pineapple"
#> * Present in `actual`:   "apple", "banana", "pear"

x <- c(TRUE, FALSE, TRUE, FALSE)
expect_in(x, c(TRUE, FALSE))
x <- c(TRUE, FALSE, TRUE, NA, FALSE)
expect_in(x, c(TRUE, FALSE))
#> Error: `x` (`actual`) isn't fully contained within c(TRUE, FALSE) (`expected`).
#> * Missing from `expected`: NA
#> * Present in `expected`:   TRUE, FALSE

expect_no_error(), expect_no_warning(), and expect_no_message() make it easier (and clearer) to confirm that code runs without errors, warnings, or messages. By default, they fail if there is any error/warning/message, but you can optionally supply either the message or class arguments to confirm the absence of a specific error/warning/message.

foo <- function(x) {
  if (x < 0) {
    x + "10"
  } else {
    x = 20
  }
}

expect_no_error(foo(-10))
#> Error: Expected `foo(-10)` to run without any errors.
#> ℹ Actually got a <simpleError> with text:
#>   non-numeric argument to binary operator

# No difference here but will lead to a better failure later
# once you've fixed this problem and later introduce a new one
expect_no_error(foo(-10), message = "non-numeric argument")
#> Error: Expected `foo(-10)` to run without any errors matching pattern 'non-numeric argument'.
#> ℹ Actually got a <simpleError> with text:
#>   non-numeric argument to binary operator
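
The class argument works along the same lines. The sketch below assumes testthat 3.2.0 or later is installed; `old_api()` and the class names are hypothetical. The first expectation should pass because the warning that is signalled has a different class, while the second should fail because it matches.

```r
library(testthat)

# Hypothetical function that signals a classed warning
old_api <- function() {
  warning(warningCondition("old_api() is deprecated", class = "deprecated_warning"))
  invisible("result")
}

# Only warnings of class "some_other_warning" count as failures here.
# suppressWarnings() just keeps the unrelated warning out of the console.
suppressWarnings(
  expect_no_warning(old_api(), class = "some_other_warning")
)

# Here the signalled warning has the class we want to rule out, so the
# expectation fails; outside of test_that() a failure is raised as an error.
outcome <- tryCatch(
  {
    expect_no_warning(old_api(), class = "deprecated_warning")
    "passed"
  },
  error = function(e) "failed"
)
outcome
```

Matching on class tends to be more robust than matching on message text, since the wording of a message can change between releases.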

Snapshotting changes

expect_snapshot(error = TRUE) has a new display of error messages that strives to be closer to what you see interactively. In particular, you’ll no longer see the error class and you will now see the error call.

Old display:

Code
  f()
Error <simpleError>
  baz

New display:

Code
  f()
Condition
  Error in `f()`:
  ! baz

If you have used expect_snapshot(error = TRUE) in your package, this means that you will need to re-run and approve your snapshots. We hope this is not too annoying and we believe it is worth it given the more accurate reflection of generated error messages. This will not affect checks on CRAN because, by default, snapshot tests are not run on CRAN.

Mocking

Mocking¹ is a tool for temporarily replacing the implementation of a function in order to make testing easier. Sometimes when testing a function, one part of it is challenging to run in your test environment (maybe it requires human interaction, a live database connection, or maybe it just takes a long time to run). For example, take the following imaginary function. It has a bunch of straightforward computation that would be easy to test but right in the middle of the function it calls complicated() which is hard to test:

my_function <- function(x, y, z) {
  a <- f(x, y)
  b <- g(y, z)
  c <- h(a, b)
  
  d <- complicated(c)
  
  i(d, 1, TRUE)
}

Mocking allows you to temporarily replace complicated() with something simpler, allowing you to test the rest of the function. testthat now supports mocking with local_mocked_bindings(), which temporarily replaces the implementation of a function. For example, to test my_function() you might write something like this:

test_that("my_function() returns expected result", {
  local_mocked_bindings(
    complicated = function(x) TRUE
  )
  ...
})
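
To make “temporarily replaces the implementation” concrete, here is a base R sketch of the general idea: swap a binding, run some code, and restore the original on exit. This is not how testthat implements local_mocked_bindings(); the helper `with_replaced_binding()` and the environment `pkg` (a stand-in for a package namespace) are hypothetical.

```r
# A stand-in for a package namespace, so the example is self-contained
pkg <- new.env()
evalq({
  complicated <- function(x) stop("needs a live database connection")
  my_function <- function(x) complicated(x) + 1
}, pkg)

# Hypothetical helper: swap a binding, evaluate `code`, then restore the original
with_replaced_binding <- function(env, name, replacement, code) {
  original <- get(name, envir = env, inherits = FALSE)
  on.exit(assign(name, original, envir = env), add = TRUE)
  assign(name, replacement, envir = env)
  force(code)  # `code` is lazy, so it only runs once the replacement is in place
}

mocked <- with_replaced_binding(
  pkg, "complicated", function(x) 10,
  pkg$my_function(1)
)
mocked
#> [1] 11
```

After the call returns, `pkg$complicated` is the original function again, which is the property that makes this kind of replacement safe to use inside a single test.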

testthat has a complicated past with mocking. testthat introduced with_mock() in v0.9 (way back in 2014), but we started discovering problems with the implementation in v2.0.0 (2017), leading to its deprecation in v3.0.0 (2020). A few packages arose to fill the gap (like mockery, mockr, and mockthat) but none of their implementations were completely satisfactory. Earlier this year a new approach occurred to me that avoids many of the problems of the previous approaches. This is now implemented in with_mocked_bindings() and local_mocked_bindings(); we’ve been using these new functions for a few months now without problems, and it feels like time to announce them to the world.

State inspector

In times gone by it was very easy to accidentally change the state of the world in a test:

test_that("side-by-side diffs work", {
  options(width = 20)
  expect_snapshot(
    waldo::compare(c("X", letters), c(letters, "X"))
  )
})

When you look at a single test it’s easy to spot the problem, and switch to a more appropriate way of temporarily changing the options, like withr::local_options(). But sometimes this mistake crept in a long time ago and is now hiding amongst hundreds or thousands of tests.
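
As a small illustration of that more appropriate approach, the sketch below uses withr::local_options() inside a function; the same call works as the first line of a test_that() block. It assumes the withr package is installed, and `narrow_width()` is a hypothetical name.

```r
library(withr)

narrow_width <- function() {
  # The option is reset automatically when this function (or test) exits
  local_options(width = 20)
  getOption("width")
}

before <- getOption("width")
inside <- narrow_width()
after <- getOption("width")

inside
identical(before, after)
```

The change is visible only while the function is running, so later tests see the option exactly as it was before.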

In earlier versions of testthat, finding tests that accidentally changed the world was painful: the only way was to painstakingly review each test. Now you can use set_state_inspector() to register a function that’s called before and after every test. If the function returns different values, testthat will let you know. You’ll typically do this either in tests/testthat/setup.R or an existing helper file.

So, for example, to detect if any of your tests have modified options you could use this state inspector:

set_state_inspector(function() {
  list(options = options())
})

Or maybe you’ve seen an R CMD check warning that you’ve forgotten to close a connection:

set_state_inspector(function() {
  list(connections = nrow(showConnections()))
})

And you can of course combine multiple checks just by returning a more complicated list.
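
A combined inspector might look like the sketch below. It assumes testthat is installed; which pieces of state to track is a judgment call, and the environment variables and working directory are added here purely as illustrative extras.

```r
library(testthat)

# Each element is compared before and after every test
inspect_state <- function() {
  list(
    options = options(),
    connections = nrow(showConnections()),
    envvars = Sys.getenv(),
    wd = getwd()
  )
}

set_state_inspector(inspect_state)
```

Keeping the inspector as a named function makes it easy to call it by hand when you want to see exactly what is being compared.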

UI improvements

testthat 3.2.0 includes a bunch of minor user interface improvements that should make day-to-day use of testthat more enjoyable. Some of our favourite highlights are:

Parallel testing now works much better with snapshot tests. (And updates to the processx package means that testthat no longer leaves processes around if you terminate a test process early.)
We use an improved algorithm to find the source reference associated with an expectation/error/warning/skip. We now look for the most recent call (within test_that()) that has a known source. This generally gives more specific locations than the previous approach and gives much better locations if an error occurs in an exit handler.
Tracebacks are no longer truncated and we use rlang’s default tree display; this should make it easier to track down problems when testing in non-interactive contexts.
Assuming you have a recent RStudio, test failures are now clickable, taking you to the line where the problem occurred. Similarly, when a snapshot test changes, you can now click that suggested code to run the appropriate snapshot_accept() call.
Skips are now only shown at the end of reporter summaries, not as tests are run. This makes them less intrusive in interactive tests while still allowing you to verify that the correct tests are skipped.

Acknowledgements

A big thanks to all 127 contributors who helped make these last 10 releases of testthat happen, whether it be through contributed code or filing issues: @ALanguillaume, @alessandroaccettulli, @ambica-aas, @annweideman, @aronatkins, @ashander, @AshesITR, @astayleraz, @ateucher, @avraam-inside, @b-steve, @bersbersbers, @billdenney, @Bisaloo, @cboettig, @cderv, @chendaniely, @ChrisBeeley, @ColinFay, @CorradoLanera, @daattali, @damianooldoni, @DanChaltiel, @danielinteractive, @DavisVaughan, @daynefiler, @dbdimitrov, @dcaseykc, @dgkf, @dhicks, @dimfalk, @dougwyu, @dpprdan, @dvg-p4, @elong0527, @Enchufa2, @etiennebacher, @FlippieCoetser, @florisvdh, @gaborcsardi, @gareth-j, @gavinsimpson, @ghill-fusion, @hadley, @heavywatal, @hfrick, @hhau, @hpages, @hsloot, @hughjonesd, @IndrajeetPatil, @jameslairdsmith, @jamieRowen, @jayruffell, @JBGruber, @jennybc, @JohnCoene, @jonathanvoelkle, @jonthegeek, @josherrickson, @kalaschnik, @kapsner, @kevinushey, @kjytay, @krivit, @krlmlr, @larmarange, @lionel-, @llrs, @luma-sb, @machow, @maciekbanas, @maelle, @majr-red, @maksymiuks, @mardam, @MarkMc1089, @markschat, @MatthieuStigler, @maurolepore, @maxheld83, @mbojan, @mcol, @mgirlich, @MichaelChirico, @mkb13, @mkoohafkan, @MKyhos, @moodymudskipper, @Mosk915, @mpjashby, @ms609, @mtmorgan, @musvaage, @nealrichardson, @netique, @njtierney, @olivroy, @osorensen, @pbulsink, @peterdesmet, @r2evans, @radbasa, @remlapmot, @rfineman, @rgayler, @romainfrancois, @s-fleck, @salim-b, @schloerke, @sorhawell, @StatisMike, @StatsMan53, @stela2502, @stla, @t-kalinowski, @tansaku, @tomliptrot, @torres-pedro, @wes-brooks, @wfmueller29, @wleoncio, @wurli, @yogat3ch, @yuliaUU, @yutannihilation, and @zsigmas.

¹ Think mimicking, like a mockingbird, not making fun of.

pak 0.6.0

Gábor Csárdi — Tue, 05 Sep 2023 00:00:00 +0000

We’re delighted to announce the release of pak 0.6.0. pak helps with the installation of R packages and many related tasks.

You can install pak from CRAN with:

install.packages("pak")

If you use an older R version, or a platform that CRAN does not have binary packages for, it is faster and simpler to install pak from our repository. See the details in the manual.

This blog post focuses on the exciting new improvements in the matching and installation of system requirements on Linux systems.

You can see a full list of changes in the release notes.

System requirements

Many R packages require the installation of external software, otherwise they do not work, or even load. For example, the RPostgres R package requires the PostgreSQL client library, and by default dynamically links to it on Linux systems. This means that you (or the administrators of your system) need to install this library, typically in the form of a system package: libpq-dev on Ubuntu and Debian systems, or postgresql-server-devel or postgresql-devel on Red Hat, Fedora, etc. systems.

The good news is that pak now helps you with this:

it looks up the required system packages when installing R packages,
it lets you know if any required system packages are missing from your system, before the installation, and
it installs them automatically, if you are a superuser, or if you can use password-less sudo to start a superuser shell.

In addition, pak now also has some functions to query system requirements and system packages.

Supported platforms

pak 0.6.0 supports the following Linux systems currently:

Ubuntu Linux,
Debian Linux,
Red Hat Enterprise Linux,
SUSE Linux Enterprise,
OpenSUSE,
CentOS,
Rocky Linux,
Fedora Linux.

Call pak::sysreqs_platforms() to query the current list of supported platforms:

pak::sysreqs_platforms()[,1:3]
#>                        name    os distribution
#> 1              Ubuntu Linux linux       ubuntu
#> 2              Debian Linux linux       debian
#> 3              CentOS Linux linux       centos
#> 4               Rocky Linux linux   rockylinux
#> 5  Red Hat Enterprise Linux linux       redhat
#> 6  Red Hat Enterprise Linux linux       redhat
#> 7  Red Hat Enterprise Linux linux       redhat
#> 8              Fedora Linux linux       fedora
#> 9            openSUSE Linux linux     opensuse
#> 10    SUSE Linux Enterprise linux          sle

Call pak::system_r_platform() to check if pak has detected your platform correctly, and pak::sysreqs_is_supported() to see if it is supported:

pak::system_r_platform()
#> [1] "x86_64-pc-linux-gnu-ubuntu-22.04"

pak::sysreqs_is_supported()
#> [1] TRUE

R package installation

If you are using pak as the root user, on a supported platform, then during package installation pak will look up the required system packages, and will install the missing ones. Here is an example:

pak::pkg_install("RPostgres")
#> v Loading metadata database ... done
#>  
#> > Will install 12 packages.
#> > Will download 12 packages with unknown size.
#> + DBI          1.1.3  [dl]
#> + RPostgres    1.4.5  [dl] + x libpq-dev
#> + Rcpp         1.0.11 [dl]
#> + bit          4.0.5  [dl]
#> + bit64        4.0.5  [dl]
#> + blob         1.2.4  [dl]
#> + generics     0.1.3  [dl]
#> + hms          1.1.3  [dl]
#> + lubridate    1.9.2  [dl]
#> + pkgconfig    2.0.3  [dl]
#> + timechange   0.2.0  [dl]
#> + withr        2.5.0  [dl]
#> > Will install 1 system package:
#> + libpq-dev  - RPostgres
#> i Getting 12 pkgs with unknown sizes
#> v Got blob 1.2.4 (x86_64-pc-linux-gnu-ubuntu-22.04) (45.94 kB)
#> v Got generics 0.1.3 (x86_64-pc-linux-gnu-ubuntu-22.04) (76.24 kB)
#> v Got hms 1.1.3 (x86_64-pc-linux-gnu-ubuntu-22.04) (98.35 kB)
#> v Got RPostgres 1.4.5 (x86_64-pc-linux-gnu-ubuntu-22.04) (455.11 kB)
#> v Got bit64 4.0.5 (x86_64-pc-linux-gnu-ubuntu-22.04) (475.41 kB)
#> v Got pkgconfig 2.0.3 (x86_64-pc-linux-gnu-ubuntu-22.04) (17.58 kB)
#> v Got timechange 0.2.0 (x86_64-pc-linux-gnu-ubuntu-22.04) (169.26 kB)
#> v Got DBI 1.1.3 (x86_64-pc-linux-gnu-ubuntu-22.04) (759.31 kB)
#> v Got withr 2.5.0 (x86_64-pc-linux-gnu-ubuntu-22.04) (228.73 kB)
#> v Got bit 4.0.5 (x86_64-pc-linux-gnu-ubuntu-22.04) (1.13 MB)
#> v Got lubridate 1.9.2 (x86_64-pc-linux-gnu-ubuntu-22.04) (980.37 kB)
#> v Got Rcpp 1.0.11 (x86_64-pc-linux-gnu-ubuntu-22.04) (2.15 MB)
#> i Installing system requirements
#> i Executing `sh -c apt-get -y update`
#> i Executing `sh -c apt-get -y install libpq-dev`
#> v Installed DBI 1.1.3  (1.1s)
#> v Installed RPostgres 1.4.5  (1.1s)
#> v Installed Rcpp 1.0.11  (1.2s)
#> v Installed bit 4.0.5  (1.2s)
#> v Installed bit64 4.0.5  (126ms)
#> v Installed blob 1.2.4  (86ms)
#> v Installed generics 0.1.3  (83ms)
#> v Installed hms 1.1.3  (59ms)
#> v Installed lubridate 1.9.2  (1.1s)
#> v Installed pkgconfig 2.0.3  (1.1s)
#> v Installed timechange 0.2.0  (63ms)
#> v Installed withr 2.5.0  (1.1s)
#> v 1 pkg + 16 deps: kept 5, added 12, dld 12 (6.58 MB) [17.1s]

Running R as a regular user

If you don’t want to use R as the superuser, but you can set up sudo without a password, that works as well. pak will detect the password-less sudo capability, and use it to install system packages, as needed.

If you run R as a regular (not root) user, and password-less sudo is not available, then pak will print the system requirements, but it will not try to install or update them.

If you are compiling R packages from source, and they need to link to system libraries, then their installation will probably fail, until you install these system packages.

If you are installing binary R packages (e.g. from P3M), then the installation typically succeeds, but you won’t be able to load these packages into R, until you install the required system packages.

To demonstrate this, let’s remove the system package for the PostgreSQL client library:

system("apt-get remove -y libpq5")

If now we (re)install the binary RPostgres R package, the installation will succeed, but then library() fails because of the missing system package. (We will fix the broken R package below.)

#> v Loading metadata database ... done
#>  
#> > Will install 1 package.
#> > Will download 1 package with unknown size.
#> + RPostgres   1.4.5 [dl] + x libpq-dev
#> x Missing 1 system package. You'll probably need to install it manually:
#> + libpq-dev  - RPostgres
#> i Getting 1 pkg with unknown size
#> v Cached copy of RPostgres 1.4.5 (x86_64-pc-linux-gnu-ubuntu-22.04) is the latest build
#> v Installed RPostgres 1.4.5  (1.1s)
#> v 1 pkg + 16 deps: kept 16, added 1 [5.7s]

library(RPostgres)
#> Error: package or namespace load failed for 'RPostgres' in dyn.load(file, DLLpath = DLLpath, ...):
#>  unable to load shared object '/root/R/x86_64-pc-linux-gnu-library/4.3/RPostgres/libs/RPostgres.so':
#>   libpq.so.5: cannot open shared object file: No such file or directory
#> Execution halted

Opting out

If you don’t want pak to install system packages for you, set the PKG_SYSREQS environment variable to false, or the pkg.sysreqs option to FALSE. See the complete list of configuration options in the config?pak manual page.
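
Either setting can be applied from R before calling pak, as in this short sketch. Only the PKG_SYSREQS variable and the pkg.sysreqs option come from the text above; placing them in ~/.Renviron or ~/.Rprofile is a general R convention and is suggested here as one possible way to make the choice permanent.

```r
# Option 1: environment variable
# (could also be set as PKG_SYSREQS=false in ~/.Renviron)
Sys.setenv(PKG_SYSREQS = "false")

# Option 2: R option
# (could also be set in ~/.Rprofile)
options(pkg.sysreqs = FALSE)
```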

System requirements queries

pak 0.6.0 also has a number of functions to query system requirements and system packages. The pak::pkg_sysreqs() function is similar to pak::pkg_deps() but in addition to looking up package dependencies, it also looks up system dependencies, and only reports the latter:

pak::pkg_sysreqs(c("curl", "r-lib/xml2", "devtools", "CHRONOS"))
#> v Loading metadata database ... done
#> -- Install scripts --------------------------------------------- Ubuntu 22.04 --
#> apt-get -y update
#> apt-get -y install libcurl4-openssl-dev libssl-dev git make libgit2-dev \
#>   zlib1g-dev pandoc libfreetype6-dev libjpeg-dev libpng-dev libtiff-dev \
#>   libicu-dev libfontconfig1-dev libfribidi-dev libharfbuzz-dev libxml2-dev \
#>   libglpk-dev libgmp3-dev default-jdk
#> R CMD javareconf
#> R CMD javareconf
#> 
#> -- Packages and their system dependencies --------------------------------------
#> CHRONOS     -- default-jdk, pandoc
#> credentials -- git
#> curl        -- libcurl4-openssl-dev, libssl-dev
#> fs          -- make
#> gert        -- libgit2-dev
#> gitcreds    -- git
#> httpuv      -- make, zlib1g-dev
#> igraph      -- libglpk-dev, libgmp3-dev, libxml2-dev
#> knitr       -- pandoc
#> openssl     -- libssl-dev
#> pkgdown     -- pandoc
#> png         -- libpng-dev
#> ragg        -- libfreetype6-dev, libjpeg-dev, libpng-dev, libtiff-dev
#> RCurl       -- libcurl4-openssl-dev, make
#> remotes     -- git
#> rJava       -- default-jdk, make
#> rmarkdown   -- pandoc
#> sass        -- make
#> stringi     -- libicu-dev
#> systemfonts -- libfontconfig1-dev, libfreetype6-dev
#> textshaping -- libfreetype6-dev, libfribidi-dev, libharfbuzz-dev
#> XML         -- libxml2-dev
#> xml2        -- libxml2-dev

See the manual of pak::pkg_sysreqs() to learn how to programmatically extract information from its return value.

pak::sysreqs_check_installed() is a handy function that checks if all system requirements are installed for some or all R packages in your library. This should report our broken RPostgres package:

pak::sysreqs_check_installed()
#> system package installed required by
#> -------------- --        -----------
#> libpq-dev      x         RPostgres

pak::sysreqs_fix_installed() goes one step further and also tries to install the missing system requirements:

pak::sysreqs_fix_installed()
#> i Need to install 1 system package.
#> i Installing system requirements
#> i Executing `sh -c apt-get -y update`
#> i Executing `sh -c apt-get -y install libpq-dev`

Now we can load RPostgres again:

library(RPostgres)

Configuration

There are several pak configuration options you can use to adjust how system requirements are handled. See the complete list in the config?pak manual page.

pak::sysreqs_db_list(), pak::sysreqs_db_match() and pak::sysreqs_db_update() list, query and update the built-in system requirements database.
pak::sysreqs_list_system_packages() lists system packages, including virtual packages and the features they provide.

More information

Acknowledgements

A big thank you to all those who have contributed to pak, or one of its workhorse packages since the v0.5.1 release:

@alexpate30, @averissimo, @ArnaudKunzi, @billdenney, @Darxor, @drmowinckels, @Fan-iX, @gongyh, @hadley, @idavydov, @jefferis, @joan-yanqiong, @kevinushey, @kkmann, @klmr, @krlmlr, @lgaborini, @maelle, @maxheld83, @maximsmol, @michaelmayer2, @mine-cetinkaya-rundel, @olivroy, @pascalgulikers, @pawelru, @royfrancis, @tanho63, @thomasyu888, @vincent-hanlon, and @VincentGuyader.

webR 0.2.0 has been released

George Stagg — Wed, 16 Aug 2023 00:00:00 +0000

We’re absolutely thrilled to announce the release of webR 0.2.0! This release gathers together many updates and improvements to webR over the last few months, including improvements to the HTML canvas graphics device, support for Cairo-based bitmap graphics, accessibility and internationalisation improvements, additional Wasm R package support (including Shiny), a new webR REPL app, and various updates to the webR developer API.

This blog post will take a deep dive through the major breaking changes and new features available in webR 0.2.0. I also plan to record and release a series of companion videos discussing the new release, so keep an eye out if you’re someone who prefers watching and listening over reading long-form articles. I’ll update this post with all the links once they’re available.

WebAssembly and webR

My previous webR release blog post goes into detail about what WebAssembly is, why people are excited about it, and how it relates to the R community and ecosystem in general through webR. I would recommend it as a good place to start, if the project is new to you¹.

A short explanation is that WebAssembly (also known as Wasm) allows software that’s normally compiled for a specific computer system to instead run anywhere, including in web browsers. Wasm is the technology that powers Pyodide (used by Shinylive for Python), and webR brings this technology to the R world. Using webR, it is possible to run R code directly in a web browser², without the need for the traditional supporting R server to execute the code.

Running R code directly in a browser opens the door for many new and exciting uses for R on the web. Applications that I’m personally excited to see developed include:

Live and interactive R code and graphics in documents & presentations,
Tactile educational content for R, with examples that can be remixed on-the-fly by learners,
Reproducible statistics through containerisation and notebook-style literate programming.

Even in these early days, some of this is already being provided by development of downstream projects such as James Balamuta’s quarto-webr extension, allowing Quarto users to easily embed interactive R code blocks in their documents.

Interactive code blocks

One of my favourite demonstrations of what webR can do is interactive code blocks for R code. After a short loading period while the webR binary is downloaded, a Run code button will be enabled below. Using examples like this, R code can be immediately edited and executed – feel free to experiment! Click the “Run code” button to see the resulting box plot, change the colour from mediumseagreen to red and run the code again.

It’s easy to see the potential teaching benefit examples like this could bring to educational content or R package documentation.

The webR REPL app

WebR can be loaded into a web page to be used as a part of a wider web application, and ships with a demo application that does just that. The webR REPL app³ provides a simple R environment directly in your web browser. The app can be accessed at https://webr.r-wasm.org/v0.2.0/ and includes sections for R console input/output, code editing, file management, and graphics device output.

With the webR REPL app, a casual user could get up and running with R in seconds, without having to install any software on their machine. It is entirely feasible that they could perform the basics of data science entirely within their web browser!

Other than interactive code blocks, like in the example earlier, the webR REPL app is perhaps the first thing that users new to webR will interact with. For this reason, we have spent some time working to improve the technical implementation and user experience of using the app. The app has been completely rewritten in the React web framework, replacing the older jQuery library. This allows for better component code organisation and more rapid development of features and updates.

Code editor

The app now comes with a tabbed code editor, allowing for easier editing and execution of R code. The editor integrates with the webR virtual filesystem (VFS), meaning that multiple R scripts can be opened, edited, and saved and they will be available to the running Wasm R process.

The editor pane is built upon the excellent CodeMirror text editor, which provides most of the component’s functionality. CodeMirror provides built-in support for syntax highlighting of R code, which is enabled by default when R source files are displayed.

The editor is integrated with the currently running R process and automatic code suggestions are shown as you type, provided by R’s built-in completion generator. The suggestions are context sensitive and are aware of package and function names, valid arguments, and even objects that exist in the global environment.

The running Wasm R process is also configured at initialisation to use the editor component as its display pager mechanism. With this configuration in place, running commands such as ?rnorm in the app automatically opens a new read-only tab in the editor displaying R’s built-in documentation.

Plotting pane

The plotting pane has been updated to take advantage of improvements in webR’s HTML canvas graphics device, set as the default device as part of initialisation. In particular, multiple plots are now supported and older plots can be directly accessed using the previous and next buttons in the plotting toolbar. You can try this out with R’s built-in graphics demo, by running demo(graphics) and/or demo(persp).

Files pane

The files pane has been completely redesigned, removing its dependency on jQuery and instead making use of the react-accessible-treeview package. As well as a technical improvement, this change means that interacting with the webR filesystem should be more usable to those with web accessibility requirements. We feel it’s important that, where possible, everybody is able to use our software.

Additional buttons have also been added to this pane, allowing users to easily manipulate the virtual file system visible to the running Wasm R process. New files and directories can be created or deleted, and text-based files can be directly opened and modified in the editor pane, removing the need to download, edit and then re-upload files.

Console pane

The R console component shown in the lower left portion of the app is powered by the wonderful xterm.js software, which provides a high performance terminal emulator on the web. R output looks at its best when running in this kind of environment, so that ANSI escape codes can be used to provide a much smoother console experience incorporating cursor placement, box drawing characters, bold text, terminal colours, and more.

An optional accessibility mode is provided by xterm.js so that terminal output is readable by screen reader software, such as macOS’s VoiceOver. The webR REPL app now enables this mode by default to improve the accessibility of terminal output.

HTML Canvas graphics device

The webR support package provides a custom webr::canvas() graphics device that renders output using the Web Canvas API. When the graphics device is used, drawing commands from R are translated into Canvas API calls. The browser renders the graphics and the resulting image data is drawn to a HTML <canvas> element on the page.

With the release of webR 0.2.0, we have improved the performance and added new features to the HTML canvas graphics device.

Performance improvements with `OffscreenCanvas`

Using the Canvas API to draw graphics in a browser is elegant, but presents a problem. R is running via WebAssembly in a JavaScript Web Worker thread, but the <canvas> element the plot image data is written to is on the main thread, part of the web page DOM. And, unfortunately, JavaScript Web Worker threads have no direct access to the DOM.

Previous releases of webR solved this problem in a rather naive way: they simply sent the Canvas API calls to the main thread to be executed there. This led to a few issues:

Canvas API calls are serialised as text to be sent to the main thread. Sufficiently complex plot text must therefore be quoted and escaped.
Each API call is sent in a separate message. For a complex plot this can be thousands of messages to dispatch and handle.
The messaging is one-way, so results of useful methods like measureText() cannot easily be retrieved.
Parsing and executing the API call on the main thread means using JavaScript’s eval() or Function(), leading to poor performance. These functions should also be avoided when possible in any case, for security reasons.

Solid engineering efforts could be made to improve the situation, e.g. through batching API calls and better encoding, but there is a better way: the OffscreenCanvas interface. OffscreenCanvas is designed to solve this exact problem of rendering graphics off-screen, such as in a worker thread. With OffscreenCanvas the Canvas API calls can all be executed on the worker thread, and only a single message containing the completed image data transferred to the main thread when rendering is complete. It is an efficient and technically satisfying solution, except that when webR 0.1.1 was released OffscreenCanvas wasn’t supported by the Safari web browser.

Today, on the other hand, OffscreenCanvas is supported in all major desktop and mobile browsers. Safari has supported it since version 16.4, and so with webR 0.2.0 we have rewritten the webr::canvas() graphics device to take full advantage of the OffscreenCanvas interface. This has led to a significant performance improvement, particularly when creating plots containing many points. The two videos below show the same plot rendered in webR 0.1.1 and 0.2.0; the difference is not just visible, rendering is an order of magnitude faster.

A performance comparison plotting 300000 points in webR 0.1.1 and 0.2.0.

A potential downside is that users of less up-to-date browsers without OffscreenCanvas support won’t be able to use the webr::canvas() graphics device. Such users should instead make use of our additional updates to webR to support the traditional Cairo-based bitmap devices. The built-in graphics devices section discusses that in more detail.

Modern text rendering and internationalisation

With webR 0.1.1, the canvas graphics device had only minimal support for rendering text. The typeface was fixed, the font metrics were estimated with a heuristic, and Unicode characters outside the Basic Latin block often failed to render. It worked most of the time, but it was far from ideal. This area of software engineering is surprisingly difficult to get right, and even native installations of R can have serious text rendering issues.

In comparison, web browser support for text rendering is excellent. Now that we use the OffscreenCanvas interface, we too can take advantage of the years of work behind browsers’ support for text on the web. The example below demonstrates several of the modern text rendering features now supported by webr::canvas().

Any system font available to the web browser can now be used⁴. As well as a nice-to-have, this also provides improved accessibility. For example, there are fonts designed specifically for use by readers with dyslexia and other similar reading barriers⁵ that could be used for drawing text in plots.

Font metrics are now exact, using measureText(), rather than estimating the width and height of Latin glyphs using heuristics. This gives more accurate positioning of rendered text and improves the general quality of resulting plots.

Support for Unicode, font glyph fallback, complex ligatures, and right-to-left (RTL) text have all been improved. This vastly improves results when rendering text for international users, particularly for non-Latin RTL scripts such as the Arabic and Hebrew text in the example above.

Also, colour emoji can now be added to plots. 😃

Paths and winding rules

Additional support for the drawing and filling of paths and polygons, including with different winding rules, has been added to the webR canvas graphics device. An area where this new functionality makes a world of difference is plotting spatial features and maps. Previously broken R code for plotting maps with the ggplot2 and sf packages now works well with webR 0.2.0.
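To make the effect of a winding rule concrete, here is a minimal, self-contained sketch in plain JavaScript. It is not webR's implementation: the function names (`crossings`, `isFilled`) and the example rings are hypothetical, chosen only for illustration. It classifies a point against a path made of several closed rings under the two fill rules accepted by the Canvas API, 'nonzero' and 'evenodd'.

```javascript
// Signed area test: > 0 if p lies to the left of the directed line a -> b,
// < 0 if it lies to the right.
function isLeft(a, b, p) {
  return (b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1]);
}

// Cast a ray from `point` towards +x and inspect every edge it crosses.
// `winding` adds 1 for each upward crossing and subtracts 1 for each
// downward crossing; `count` is the total number of crossings.
function crossings(point, rings) {
  let winding = 0;
  let count = 0;
  for (const ring of rings) {
    for (let i = 0; i < ring.length; i++) {
      const a = ring[i];
      const b = ring[(i + 1) % ring.length]; // rings are implicitly closed
      if (a[1] <= point[1] && b[1] > point[1] && isLeft(a, b, point) > 0) {
        winding += 1;
        count += 1;
      } else if (a[1] > point[1] && b[1] <= point[1] && isLeft(a, b, point) < 0) {
        winding -= 1;
        count += 1;
      }
    }
  }
  return { winding, count };
}

// Decide whether `point` is painted under the given fill rule.
function isFilled(point, rings, rule = 'nonzero') {
  const { winding, count } = crossings(point, rings);
  return rule === 'evenodd' ? count % 2 === 1 : winding !== 0;
}

// Hypothetical path: a large square with a smaller square inside it.
const outer = [[0, 0], [10, 0], [10, 10], [0, 10]];
const innerSameDirection = [[3, 3], [7, 3], [7, 7], [3, 7]];
const innerReversed = [...innerSameDirection].reverse();

console.log(isFilled([5, 5], [outer, innerSameDirection], 'nonzero'));
console.log(isFilled([5, 5], [outer, innerSameDirection], 'evenodd'));
console.log(isFilled([5, 5], [outer, innerReversed], 'nonzero'));
```

With both squares drawn in the same direction, the point (5, 5) is filled under 'nonzero' but left empty under 'evenodd'. Reversing the inner square leaves it empty under both rules, which is one way holes in polygons, such as lakes inside a map region, can be represented.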

Output messages from the canvas graphics device

As a result of the changes to the HTML canvas graphics device, the structure of output messages communicated to the main thread has been redesigned. This is a breaking change and existing webR applications will need to be updated to listen for the new output messaging format.

A Plotting section has been added to the webR documentation describing how plotting works with the webr::canvas() device, and how to handle the output messages in your own web applications.

A 'canvas' type output message with an event property of 'canvasNewPage' indicates the start of a new plot:

{ type: 'canvas', data: { event: 'canvasNewPage' } }

An output message with an event property of 'canvasImage' indicates that there is some graphics data ready to be drawn:

{ type: 'canvas', data: { event: 'canvasImage', image: ImageBitmap } }

The image property in the message data contains a JavaScript ImageBitmap object. This can be drawn to a HTML <canvas> element using the drawImage() method.
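As a rough sketch of how an application might react to these two messages, the function below applies a single output message to a canvas. The function name `handleCanvasMessage` is hypothetical, and clearing the canvas on 'canvasNewPage' is an assumption about the desired behaviour (an application could equally create a fresh canvas for each plot).

```javascript
// Apply one webR output message to a canvas element.
// Returns true if the message was a canvas message and was handled.
function handleCanvasMessage(output, canvas) {
  if (output.type !== 'canvas') return false;
  const ctx = canvas.getContext('2d');
  switch (output.data.event) {
    case 'canvasNewPage':
      // Assumption: a new plot replaces the previous one, so clear the canvas.
      ctx.clearRect(0, 0, canvas.width, canvas.height);
      return true;
    case 'canvasImage':
      // `image` is an ImageBitmap; draw it at the top-left corner.
      ctx.drawImage(output.data.image, 0, 0);
      return true;
    default:
      return false;
  }
}

// Usage in a browser, inside a loop that reads webR output messages:
//   const output = await webR.read();
//   handleCanvasMessage(output, document.querySelector('canvas'));
```

Returning a boolean lets the caller fall through to its own handling for other message types, such as 'stdout' and 'stderr'.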

Built-in bitmap graphics devices

Not all environments where webR could be running support plotting to a HTML <canvas> element. Older browsers may not support the required OffscreenCanvas interface, webR might be running server-side in Node.js, or webR might be running more traditional R code or packages that are unaware of the webr::canvas() graphics device.

For supporting these use cases, with webR 0.2.0 the built-in bitmap graphics devices can now be used, writing their output to the webR VFS. This includes the png(), bmp(), jpeg(), and tiff() devices, and potentially others implemented using the Cairo graphics library.

In the example below, webR is loaded into a JavaScript environment and plotting is done using the built-in png() graphics device. The resulting image is written to the virtual filesystem and its contents can then be obtained using webR’s FS interface, designed to be similar to Emscripten’s filesystem API.

import { WebR } from 'webr';

const webR = new WebR();
await webR.init();

await webR.evalRVoid(`
  png('/tmp/Rplot.png', width = 800, height = 800, res = 144)
  hist(rnorm(1000))
  dev.off()
`);

const plotImageData = await webR.FS.readFile('/tmp/Rplot.png');

The image data is contained in the plotImageData variable as a JavaScript Uint8Array. Once obtained from the VFS, the image can be served to the end user as a Blob file download, displayed on a web page, or, if running webR server-side, returned over the network.
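For instance, wrapping the bytes in a Blob might look like the sketch below. The helper name `pngBytesToBlob` is hypothetical, and the example bytes are only the 8-byte PNG signature standing in for real plot data.

```javascript
// Wrap PNG bytes (a Uint8Array, such as the result of webR.FS.readFile)
// in a Blob with the matching MIME type.
function pngBytesToBlob(bytes) {
  return new Blob([bytes], { type: 'image/png' });
}

// Stand-in data: just the PNG file signature, not a complete image.
const exampleBytes = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const blob = pngBytesToBlob(exampleBytes);

// In a browser, the Blob can then be displayed in an <img> element:
//   document.querySelector('img').src = URL.createObjectURL(blob);
```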

Text rendering and font support

As with the webr::canvas() improvements described in the previous section, we feel it is important that the built-in R graphics devices provide a high level of support for text rendering in webR. Here, however, the approach is different. The built-in graphics devices render image data entirely within the WebAssembly environment, so we can no longer rely on the web browser for high quality text!

The built-in graphics devices are powered by the Cairo graphics library, which can now optionally be compiled for Wasm as part of the webR build process. In addition, when enabled, various other libraries are compiled for Wasm to improve the quality of text rendering in Cairo.

Public releases of webR distributed via GitHub and CDN will be built with these libraries all enabled and included.

Font files on the VFS

When plotting with the built-in bitmap graphics devices, fonts must be accessible to the Cairo library through the webR VFS. A minimal selection of Google’s Noto fonts is bundled with webR when Cairo graphics is enabled.

The fontconfig library is also configured to search the VFS directory /home/web_user/fonts for additional fonts. Users who wish to use custom fonts, or alternative writing systems, may do so by uploading font files to this directory. In the case of international scripts or non-Latin Unicode such as emoji, fontconfig will automatically use font fallback to select reasonable fonts containing the required glyphs.

png(width = 1200, height = 800, res = 180)
plot(
  rnorm(1000), rnorm(1000),
  col = rgb(0, 0, 0, 0.5),
  xlim = c(-5, 5), ylim = c(-5, 5),
  main = "This is the title 🚀",
  xlab = "This is the x label",
  ylab = "This is the y label"
)
text(-3.5, 4, "This is English")
text(-3.5, -4, "هذا مكتوب باللغة العربية")
text(3.5, 4, "これは日本語です")
text(3.5, -4, "זה כתוב בעברית")
dev.off()

This is essentially the same example as in the previous section, demonstrating a selection of advanced font functionality. In this example we are rendering a PNG file using the built-in png() graphics device. We can see that by uploading appropriate fonts to the VFS, the same set of advanced text rendering features that are provided by the browser can also be used with R’s built-in bitmap graphics devices.
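Uploading a font from JavaScript could be sketched as follows, using the same webR FS interface as the earlier readFile() example. The helper name `uploadFont` and the font file name are hypothetical, and the sketch assumes the /home/web_user/fonts directory already exists on the VFS.

```javascript
// The VFS directory that fontconfig is configured to search.
const FONT_DIR = '/home/web_user/fonts';

// Copy a font file into the VFS so that Cairo can find it.
// `webR` is an initialised WebR instance; `fontBytes` is a Uint8Array.
async function uploadFont(webR, fileName, fontBytes) {
  const path = `${FONT_DIR}/${fileName}`;
  await webR.FS.writeFile(path, fontBytes);
  return path;
}

// Usage in a browser, with a hypothetical font URL:
//   const res = await fetch('/fonts/MyFont.ttf');
//   await uploadFont(webR, 'MyFont.ttf', new Uint8Array(await res.arrayBuffer()));
```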

Lazy virtual filesystem

All of the additional features I’ve written about so far come with a price: increased Wasm binary and data download size. Consider the fonts in the previous section: each font file bundled with webR is going to increase the total size of the default webR filesystem by around 500KB.

This is a high price to pay in time and bandwidth when not every user is going to need every feature. A similar principle also applies to other files included with R by default. It’s nice that all the default R documentation, examples, and datasets are available on the VFS, but we don’t necessarily need those files downloaded every time to every client machine.

With webR 0.2.0 a “lazy” virtual filesystem mechanism, powered by a feature of Emscripten’s FS API, is introduced. With this, only the files required to launch R and use the default packages are downloaded at initialisation time. Additional files provided on the VFS are still available for use, but they are only downloaded from the remote server when they are requested in some way by the running Wasm R process.
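The general idea of deferring a download until first access can be illustrated with a few lines of plain JavaScript. This is only a conceptual sketch, not Emscripten's or webR's actual mechanism: the `LazyFile` class, the `fakeDownload` loader, and the file path are all hypothetical.

```javascript
// A file whose contents are retrieved only on first read, then cached.
class LazyFile {
  constructor(path, loader) {
    this.path = path;
    this.loader = loader; // function that retrieves the file contents
    this.contents = null;
    this.loaded = false;
  }

  read() {
    if (!this.loaded) {
      // The "download" happens here, the first time the file is needed.
      this.contents = this.loader(this.path);
      this.loaded = true;
    }
    return this.contents;
  }
}

// Stand-in for a network request; counts how often it is called.
let downloads = 0;
const fakeDownload = (path) => {
  downloads += 1;
  return `contents of ${path}`;
};

// Registering the file costs nothing until something reads it.
const newsFile = new LazyFile('/usr/lib/R/doc/NEWS', fakeDownload);
```

Registering a file is cheap; the loader runs once on the first call to `read()` and later reads reuse the cached contents.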

With the introduction of the lazy virtual filesystem, along with other efficiency improvements, the initial download size for webR is now much smaller, a great improvement.

Component	0.1.1	0.2.0	(% of previous)
`R.bin.data`	25.3MB	5.2MB	20.6%
`R.bin.wasm`	12.8MB	1.7MB	7.5%
Total for the webR REPL app	40.2MB	9.5MB	23.6%

R packages

Since its initial release, webR has supported loading R packages by first installing them to the Emscripten VFS using the helper function webr::install() or by manually placing R packages in the VFS at /usr/lib/R/library. We find that pure R packages usually work well, but R packages with underlying C (or Fortran, or otherwise…) code must be compiled from source for Wasm.

We host a public CRAN-like R package repository containing packages built for Wasm in this way, so that there exists a subset of useful and supported R packages that can be used with webR. The public repository is hosted at https://repo.r-wasm.org and this repo URL is used by default when running webr::install() to install a Wasm R package.
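From the JavaScript side, installing packages from the default repository might be sketched like this, by evaluating a call to webr::install() with evalRVoid(). The helper name `installWasmPackages` is hypothetical, and the package names in the usage comment are only examples.

```javascript
// Build a call to webr::install() and evaluate it in the running R session.
// `webR` is an initialised WebR instance; `packages` is an array of names.
async function installWasmPackages(webR, packages) {
  // JSON.stringify() wraps each name in double quotes, giving R string literals.
  const quoted = packages.map((pkg) => JSON.stringify(pkg)).join(', ');
  const code = `webr::install(c(${quoted}))`;
  await webR.evalRVoid(code);
  return code;
}

// Usage, with example package names:
//   await installWasmPackages(webR, ['dplyr', 'ggplot2']);
//   await webR.evalRVoid('library(dplyr)');
```

No repository URL is passed here, so the default public Wasm repository described above is used.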

It remains the case that building custom R packages for Wasm is not well documented, but we do hope to improve the situation over time as our package build infrastructure develops and matures. In the future, we plan to provide a Wasm R package build system as a set of Docker containers, so that users are able to build their own packages for webR using a container environment.

WebAssembly system libraries for R packages

Many R packages require linking with system libraries to build and run. When building such R packages for WebAssembly, not only does the package code require compiling for Wasm, but also any system libraries that code depends on.

To expand support for R packages, webR 0.2.0 ships with additional recipes to build system libraries from source for Wasm. The libraries consist of a selection of utility, database, graphics, text rendering, geometry, and geospatial support packages, with specific libraries chosen for their possibility to be compiled for Wasm as well as the number of R packages relying on them. I expect that the number of system libraries supported will continue to grow over time as we attempt to build more R packages for Wasm.

As of webR 0.1.1, 219 packages were available to install through our public Wasm R package repo. With the release of webR 0.2.0 and its additional system libraries, the number of available packages is now 10324 (approximately 51% of CRAN packages). It should be noted, though, that these packages have not been tested in detail. Here, “available” just means that the Emscripten compiler successfully built the R package for Wasm, along with its prerequisite packages.

Public Wasm R packages dashboard

While available R packages can be listed using available.packages() with our CRAN-like Wasm R package repo, it’s not the smoothest experience for users simply wanting to check if a given package is available. A dashboard has been added to the repo index page which lists the available packages compiled for Wasm in an interactive table. The table also lists package dependencies, noting which prerequisite packages, if any, are still missing.

It might be interesting to note that this dashboard itself is running under webR, through a fully client-side Shiny app.

Running httpuv & Shiny under webR

Using features new to webR 0.2.0, a httpuv webR package shim has been created that provides the functionality usually provided by the httpuv R package. The package enables R to handle HTTP and WebSocket traffic, and is a prerequisite for the R Shiny package.

The shim works by taking advantage of the JavaScript Service Worker API. Normally Service Workers are used to implement fast offline caching of web content, but they can also be used as a general network proxy. The httpuv shim makes use of a Service Worker to intercept network traffic from a running Shiny web client, and forward that traffic to be handled by an instance of webR.

From the Shiny server’s point of view, it is communicating with the usual httpuv package using its R API. From the point of view of the Shiny web client, it is talking to a Shiny server over the network. Between the two, the JavaScript Service Worker and webR work together to act as a network proxy and handle the traffic entirely within the client⁶.

The httpuv shim package is still in the experimental stage, but it is currently available for testing and is included in our public webR package repository.

An example shiny app

An example Shiny app, making use of the httpuv shim and running fully client-side, is available at https://shiny-standalone-webr-demo.netlify.app.

Once the app has loaded in your browser, it’s possible to confirm that the app is running entirely client-side by observing the Shiny server trace output at the bottom of the screen. You should even be able to disconnect completely from the internet and continue to use the app offline.

The source code for the demo, which includes some information describing how to set up a webR Shiny server in this way, can be found at georgestagg/shiny-standalone-webr-demo. Note that this repository is targeted towards advanced web developers with prior experience of development with JavaScript Web Workers. It is intended as a demonstration of the technology, rather than a tutorial.

A coming-soon version of Shinylive for R will provide a much better user experience for getting fully client-side R Shiny apps up and running, without requiring advanced knowledge of JavaScript’s Worker API. I believe Shinylive with webR integration will pave the way for providing a user-friendly method to build and deploy containerised R Shiny apps, running on WebAssembly.

Changes to the webR developer API

It’s possible for webR to be used in isolation, but it’s likely that developers will want to interface webR with other JavaScript frameworks and tools. The dynamism and interconnectivity of the web is one of its great strengths, and we’d like the same to be true of webR. This section describes changes to webR’s developer API, used to interact with the running R session from the JavaScript environment.

Performance improvements with MessagePack protocol

When working to integrate webR into a wider application, at some point we will need to move data into the running R process, and later return results back to JavaScript. It’s possible to move data into R by evaluating R code directly, but the webR library also provides other ways to transfer raw data to R.

Consider the example below. Data is transferred from JavaScript into the running R process by binding jsData to an R variable in the global environment using webR.objs.globalEnv.bind(). Next, some computation on the data is done, represented as evaluating the do_analysis() R function. Finally the result is returned back to JavaScript, first as a reference to an R object and then transferring the result data back to the JavaScript environment using toJs().

const jsData = [/* ... some large JavaScript dataset ... */];
await webR.objs.globalEnv.bind('data', jsData);

const ret = await webR.evalR("do_analysis(data)");
const result = await ret.toJs();

It’s easy to see how this workflow could be useful as part of a wider application, enabling complex data manipulation or statistical modelling in R that would otherwise be awkward to perform directly in JavaScript.

Behind the scenes, we’ve done work to ensure that data is transferred efficiently to and from the R environment, and in webR 0.2.0 the MessagePack protocol is now used as the main way that data is serialised and transferred, replacing JSON encoding.

This change provides a significant performance improvement. Initial testing shows an order of magnitude speed boost when transferring large sets of data from the JavaScript environment into R. Thanks to @jeroen for prompting me to look into it!

The typing of R object references

When working with webR in TypeScript it is important to keep track of R object types. All references to R objects are instances of the RObject class, and various subclasses implement specific features for each fundamental R data type.

In this example, an RDouble object is returned at runtime, but webR.evalR() is typed to return a generic RObject. Notice that the .toNumber() method exists on RDouble, but not on the RObject superclass. So while this example runs with no problem once compiled to JavaScript, it gives an error under TypeScript!

const obj = await webR.evalR('1.23456');
const data = await obj.toJs();

const num = await obj.toNumber(); // An error under TypeScript!

One solution is to use the as keyword to assert a specific type of RObject subclass. Alternatively, webR also provides variants of the evalR() function that return and convert results to a specific type of JavaScript object.
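As a short sketch of the second approach, the function below uses three of these variants, evalRNumber(), evalRString() and evalRBoolean(), each of which resolves directly to a plain JavaScript value. The function name `readTypedResults` and the R expressions are only illustrative.

```javascript
// Evaluate R code and receive plain JavaScript values, with no RObject
// reference to narrow or convert afterwards.
// `webR` is an initialised WebR instance.
async function readTypedResults(webR) {
  const num = await webR.evalRNumber('1.23456'); // JavaScript number
  const str = await webR.evalRString('R.version.string'); // JavaScript string
  const ok = await webR.evalRBoolean('is.numeric(1.23456)'); // JavaScript boolean
  return { num, str, ok };
}

// Usage:
//   const { num, str, ok } = await readTypedResults(webR);
```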

In many cases these methods will work well, but they require you to know for sure what type of R object has been returned. Additional support has been added in webR 0.2.0 to better handle typing when it is not entirely clear what type of RObject you have.

Type predicate functions

TypeScript supports a kind of return type known as a type predicate. These return types can be used to create user-defined type guards, functions that take an object argument and return a boolean indicating if the object is of a compatible type. With this, TypeScript is able to automatically narrow types based on the return value from the type predicate function.

WebR 0.2.0 ships with a selection of type predicate functions for each fundamental R data type supported by webR. In the following example, the TypeScript error described above is dealt with by using the function isRDouble(). Inside the branch, TypeScript narrows the object type to an RDouble, resolving the issue.

import { isRDouble } from 'webr';

const obj = await webR.evalR('1.23456');

try {
  if (isRDouble(obj)) {
    // In this branch, TypeScript narrows the type of `obj` to an `RDouble`
    const num = await obj.toNumber();
  
    // Do something with `num` ...
  }
} finally {
  webR.destroy(obj);
}

Handling errors with `WebRError`

When executing R code with webR’s evalR() family of functions, by default any error condition from R is converted into a JavaScript Error and thrown. This feature can be very useful, because it allows developers to catch issues while executing R code in the native JavaScript environment.

However, consider the following example:

try {
  const result = await webR.evalR('some_R_code()');
  doSomethingWith(result);
} catch (e) {
  // Handle some error that occurred
}

If an error is thrown, how can we tell if the error came from R or from some issue inside the JavaScript function? Nested try/catch could be used, but this becomes unwieldy quickly. Parsing the error message text is another option, though not so elegant.

With webR 0.2.0 any errors that occur in R code executed using evalR(), or any internal webR issues, are thrown as instances of WebRError. With this change, the instanceof keyword can be used to differentiate between errors occurring in R, and errors in JavaScript code.

import { WebRError } from 'webr';

try {
  const result = await webR.evalR('some_R_code()');
  doSomethingWith(result);
} catch (e) {
  if (e instanceof WebRError) {
    console.error("An error occurred executing R code");
  } else {
    console.error("An error occurred in JavaScript");
  }
  throw e;
}

Safely handling webR termination

Consider the following async loop, a useful pattern to continuously handle webR output messages:

async function run() {
  for (;;) {
    const output = await webR.read();
    switch (output.type) {
      case 'stdout':
      case 'stderr':
        console.log(output.data);
        break;
      default:
        console.warn(`Unhandled output type: ${output.type}.`);
    }
  }
}

Here await webR.read() waits asynchronously for output messages from webR’s communication channel. For example, a running R process might print results between long computational delays. Such occasional printed output might be received as messages with a type property of 'stdout'.

After a message is received, it is handled in a switch statement and then the loop continues around to wait for another output message. This works well while webR is running, but what happens when webR is terminated with webR.close()? The R worker thread is stopped and destroyed, but the loop continues to wait for a message that will never come.

With webR 0.2.0 a new type of message is issued when webR is terminated using webR.close(). After the webR worker thread has been destroyed, a message is emitted on the usual output channel with a type property of 'closed', with no associated data property. The implication is that once this message has been emitted, that particular instance of webR has terminated and the async loop is no longer needed.

With this change, exiting the loop once webR has terminated could be as simple as adding an extra case statement:

async function run() {
  for (;;) {
    const output = await webR.read();
    switch (output.type) {
      case 'stdout':
      case 'stderr':
        console.log(output.data);
        break;
      case 'closed':
        return;
      default:
        console.warn(`Unhandled output type: ${output.type}.`);
    }
  }
}

Installation and next steps

Developers can integrate webR in their own JavaScript or TypeScript projects by installing the webR npm package, or by directly importing webR from CDN. Issues and PRs are accepted and welcome on the main r-wasm/webr GitHub repository.

npm

With this release, the webR npm package name has been updated, simplified from the original @r-wasm/webr package name to simply webr.

npm i webr

The original namespaced package @r-wasm/webr will be deprecated, and from v0.2.0 onwards npm will display a message pointing to the new package name.

CDN URL

Alternatively, webR can be imported directly as a module from CDN.

import { WebR } from "https://webr.r-wasm.org/v0.2.0/webr.mjs"

Binary release packages

Finally, binary webR packages can be downloaded from GitHub on the releases page of the r-wasm/webr repo.

Documentation

The next step of integrating webR into your own software should be to visit the documentation pages, provided at https://docs.r-wasm.org/webr/v0.2.0/. My previous webR release blog post also briefly explains how to get started, though the docs go into much more detail.

Acknowledgements

A big thank you to all of webR’s early adopters, experimenting with the system and providing feedback in the form of GitHub Issues and PRs.

@Anurodhyadav, @arkraieski, @averissimo, @awconway, @bahadzie, @ceciliacsilva, @DanielEWeeks, @eteitelbaum, @fortunewalla, @gedw99, @gwd-at, @hatemhosny, @hrbrmstr, @ivelasq, @JeremyPasco, @jeroen, @jooyoungseo, @jpjais, @kforner, @lauritowal, @lionel-, @matthiasbirkich, @neocarto, @noamross, @Polkas, @qiushiyan, @ries9112, @SugarRayLua, @timelyportfolio, @WebReflection, and @WillemSleegers.

In addition, Danielle Navarro’s webR blog post is very good and Bob Rudis’s webR experiments are well worth exploring, along with his recent NY R conference talk.
Also other JavaScript/Wasm environments, such as Node.js. For example, ROpenSci’s r-universe package platform provides download links for datasets contained in R packages, in a variety of formats, powered by running webR server-side in Node.js.
REPL stands for “Read, Eval, Print, Loop”, and is another name for the R console that you’re probably familiar with. The application is named the “webR REPL app” because the original version simply provided the user with a fullscreen R console in their web browser.
This also includes the world of CSS web fonts, but it is a little tricky. Extra work must be done so that the font is available to the Web Worker. Probably this can be handled better in a future release of webr::canvas().
Dyslexie, Open Dyslexic. Results of research in this area are mixed, but even if these fonts don’t improve the speed of text comprehension, some users may simply prefer or feel more comfortable with them.
Shinylive for Python also uses a JavaScript Service Worker scheme to serve fully client-side apps.

Understanding LoRA with a minimal example

Daniel Falbel — Thu, 22 Jun 2023 00:00:00 +0000

LoRA (Low-Rank Adaptation) is a new technique for fine tuning large scale pre-trained models. Such models are usually trained on general domain data, so as to have the maximum amount of data. In order to obtain better results in tasks like chatting or question answering, these models can be further ‘fine-tuned’ or adapted on domain specific data.

It’s possible to fine-tune a model just by initializing the model with the pre-trained weights and further training on the domain specific data. With the increasing size of pre-trained models, a full forward and backward cycle requires a large amount of computing resources. Fine tuning by simply continuing training also requires a full copy of all parameters for each task/domain that the model is adapted to.

LoRA: Low-Rank Adaptation of Large Language Models proposes a solution for both problems by using a low rank matrix decomposition. It can reduce the number of trainable weights by 10,000 times and GPU memory requirements by 3 times.

Method

The problem of fine-tuning a neural network can be expressed by finding a $\Delta \Theta$ that minimizes $L(X, y; \Theta_0 + \Delta\Theta)$ where $L$ is a loss function, $X$ and $y$ are the data and $\Theta_0$ the weights from a pre-trained model.

We learn the parameters $\Delta \Theta$ with dimension $|\Delta \Theta|$ equal to $|\Theta_0|$. When $|\Theta_0|$ is very large, such as in large scale pre-trained models, finding $\Delta \Theta$ becomes computationally challenging. Also, for each task you need to learn a new $\Delta \Theta$ parameter set, making it even more challenging to deploy fine-tuned models if you have more than a few specific tasks.

LoRA proposes using an approximation $\Delta \Phi \approx \Delta \Theta$ with $|\Delta \Phi| \ll |\Delta \Theta|$. The observation is that neural nets have many dense layers performing matrix multiplication, and while these typically have full rank during pre-training, when adapting to a specific task the weight updates will have a low “intrinsic dimension”.

A simple matrix decomposition is applied for each weight matrix update $\Delta \theta \in \Delta \Theta$. Considering $\Delta \theta_i \in \mathbb{R}^{d \times k}$ the update for the $i$th weight in the network, LoRA approximates it with:

$$\Delta \theta_i \approx \Delta \phi_i = BA$$

where $B \in \mathbb{R}^{d \times r}$, $A \in \mathbb{R}^{r \times k}$ and the rank $r \ll \min(d, k)$. Thus, instead of learning $d \times k$ parameters, we now need to learn only $(d + k) \times r$, which is much smaller whenever $r$ is small relative to $d$ and $k$. In practice, $\Delta \theta_i$ is scaled by $\frac{\alpha}{r}$ before being added to $\theta_i$, which can be interpreted as a ‘learning rate’ for the LoRA update.
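To get a feel for the savings, here is a small arithmetic sketch in base R. The sizes `d`, `k` and `r` are hypothetical values picked for illustration; they do not come from the post or the paper.

```r
# Parameter counts for a single weight-matrix update.
# d, k and r are hypothetical values chosen for illustration.
d <- 4096
k <- 4096
r <- 8

full_update <- d * k        # entries in the full-rank update
lora_update <- (d + k) * r  # entries in B (d x r) plus A (r x k)

full_update                 # 16777216
lora_update                 # 65536
full_update / lora_update   # 256
```

The ratio grows with the layer size and shrinks with the rank, so the savings depend heavily on how small an $r$ the task tolerates.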

LoRA does not increase inference latency, as once fine tuning is done, you can simply update the weights in $\Theta$ by adding their respective $\Delta \theta \approx \Delta \phi$. It also makes it simpler to deploy multiple task specific models on top of one large model, as $|\Delta \Phi|$ is much smaller than $|\Delta \Theta|$.
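The claim about inference latency follows from linearity, which a few lines of base R can check. This is only a sketch on random matrices: the toy sizes and the names `W`, `W_merged`, `y_adapter` and `y_merged` are assumptions made for the illustration, and the layout follows the $\Delta \phi_i = BA$ convention used above.

```r
set.seed(1)
d <- 4; k <- 6; r <- 2            # hypothetical toy sizes
alpha <- 1
scaling <- alpha / r

W <- matrix(rnorm(d * k), d, k)   # frozen pre-trained weight, d x k
B <- matrix(rnorm(d * r), d, r)   # low-rank factor, d x r
A <- matrix(rnorm(r * k), r, k)   # low-rank factor, r x k
x <- rnorm(k)                     # one input vector

# During fine-tuning: base output plus a separate low-rank correction.
y_adapter <- W %*% x + scaling * (B %*% (A %*% x))

# For deployment: fold the update into the weight matrix once.
W_merged <- W + scaling * (B %*% A)
y_merged <- W_merged %*% x

# The two outputs agree up to floating-point tolerance.
all.equal(y_adapter, y_merged)
```

After merging, a forward pass costs exactly one matrix product with a matrix of the original shape, which is why no extra latency remains.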

Implementing in torch

Now that we have an idea of how LoRA works, let’s implement it using torch for a minimal problem. Our plan is the following:

Simulate training data using a simple $y = X \theta$ model, with $\theta \in \mathbb{R}^{1001 \times 1000}$.
Train a full-rank linear model to estimate $\theta$; this will be our ‘pre-trained’ model.
Simulate a different distribution by applying a transformation to $\theta$.
Train a low-rank model using the pre-trained weights.

Let’s start by simulating the training data:

library(torch)

n <- 10000
d_in <- 1001
d_out <- 1000

thetas <- torch_randn(d_in, d_out)

X <- torch_randn(n, d_in)
y <- torch_matmul(X, thetas)

We now define our base model:

model <- nn_linear(d_in, d_out, bias = FALSE)

We also define a function for training a model, which we will reuse later. The function runs the standard training loop in torch using the Adam optimizer. The model weights are updated in-place.

train <- function(model, X, y, batch_size = 128, epochs = 100) {
  opt <- optim_adam(model$parameters)
  n <- X$size(1)  # number of observations, taken from the data passed in

  for (epoch in 1:epochs) {
    for (i in seq_len(n %/% batch_size)) {
      # sample a random mini-batch of rows
      idx <- sample.int(n, size = batch_size)
      loss <- nnf_mse_loss(model(X[idx, ]), y[idx, ])

      # standard update: reset gradients, back-propagate, take a step
      opt$zero_grad()
      loss$backward()
      opt$step()
    }

    # report the loss on the full data set every 10 epochs
    if (epoch %% 10 == 0) {
      with_no_grad({
        loss <- nnf_mse_loss(model(X), y)
      })
      cat("[", epoch, "] Loss:", loss$item(), "\n")
    }
  }
}

The model is then trained:

train(model, X, y)
#> [ 10 ] Loss: 577.075 
#> [ 20 ] Loss: 312.2 
#> [ 30 ] Loss: 155.055 
#> [ 40 ] Loss: 68.49202 
#> [ 50 ] Loss: 25.68243 
#> [ 60 ] Loss: 7.620944 
#> [ 70 ] Loss: 1.607114 
#> [ 80 ] Loss: 0.2077137 
#> [ 90 ] Loss: 0.01392935 
#> [ 100 ] Loss: 0.0004785107

OK, so now we have our pre-trained base model. Let’s suppose that we have data from a slightly different distribution that we simulate using:

thetas2 <- thetas + 1

X2 <- torch_randn(n, d_in)
y2 <- torch_matmul(X2, thetas2)

If we apply our base model to this distribution, we don’t get good performance:

nnf_mse_loss(model(X2), y2)
#> torch_tensor
#> 992.673
#> [ CPUFloatType{} ][ grad_fn =  ]

We now fine-tune our initial model. The distribution of the new data is just slightly different from the initial one: we simply added 1 to all thetas, which shifts every weight by the same amount. This means that the weight updates are not expected to be complex, and we shouldn’t need a full-rank update in order to get good results.
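One way to see why a low rank should suffice here: adding 1 to every weight is the same as adding a matrix of ones, and such a matrix has rank 1. The base R sketch below checks this on small stand-in dimensions; `delta`, `a` and `b` are names introduced only for this illustration.

```r
d_in <- 5; d_out <- 4   # small stand-ins for 1001 and 1000

# The shift applied to the weights: a matrix full of ones.
delta <- matrix(1, d_in, d_out)

# It factors exactly into a column vector times a row vector.
a <- matrix(1, d_in, 1)
b <- matrix(1, 1, d_out)
all.equal(delta, a %*% b)

# Numerical rank as estimated by the QR decomposition.
qr(delta)$rank
```

A rank-1 pair of factors is therefore, in principle, enough to represent this particular update exactly.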

Let’s define a new torch module that implements the LoRA logic:

lora_nn_linear <- nn_module(
  initialize = function(linear, r = 16, alpha = 1) {
    self$linear <- linear
    
    # parameters from the original linear module are frozen, so they are not
    # tracked by autograd. They are treated as constants.
    purrr::walk(self$linear$parameters, \(x) x$requires_grad_(FALSE))
    
    # the low rank parameters that will be trained
    self$A <- nn_parameter(torch_randn(linear$in_features, r))
    self$B <- nn_parameter(torch_zeros(r, linear$out_features))
    
    # the scaling constant
    self$scaling <- alpha / r
  },
  forward = function(x) {
    # the modified forward, that just adds the result from the base model
    # and ABx.
    self$linear(x) + torch_matmul(x, torch_matmul(self$A, self$B)*self$scaling)
  }
)

We now initialize the LoRA model. We will use $r = 1$, meaning that A and B will be just vectors. The base model has 1001x1000 trainable parameters. The LoRA model that we are going to fine-tune has just 1001 + 1000 = 2001, which is roughly 1/500 of the base model’s parameters.

lora <- lora_nn_linear(model, r = 1)

Now let’s train the lora model on the new distribution:

train(lora, X2, y2)
#> [ 10 ] Loss: 798.6073 
#> [ 20 ] Loss: 485.8804 
#> [ 30 ] Loss: 257.3518 
#> [ 40 ] Loss: 118.4895 
#> [ 50 ] Loss: 46.34769 
#> [ 60 ] Loss: 14.46207 
#> [ 70 ] Loss: 3.185689 
#> [ 80 ] Loss: 0.4264134 
#> [ 90 ] Loss: 0.02732975 
#> [ 100 ] Loss: 0.001300132 

If we look at $\Delta \theta$ we will see a matrix full of 1s, the exact transformation that we applied to the weights:

delta_theta <- torch_matmul(lora$A, lora$B)*lora$scaling
delta_theta[1:5, 1:5]
#> torch_tensor
#>  1.0002  1.0001  1.0001  1.0001  1.0001
#>  1.0011  1.0010  1.0011  1.0011  1.0011
#>  0.9999  0.9999  0.9999  0.9999  0.9999
#>  1.0015  1.0014  1.0014  1.0014  1.0014
#>  1.0008  1.0008  1.0008  1.0008  1.0008
#> [ CPUFloatType{5,5} ][ grad_fn =  ]

To avoid the additional inference latency of the separate computation of the deltas, we could modify the original model by adding the estimated deltas to its parameters. We use the add_ method to modify the weight in-place.

with_no_grad({
  model$weight$add_(delta_theta$t())  
})

Now, applying the base model to data from the new distribution yields good performance, so we can say the model is adapted for the new task.

nnf_mse_loss(model(X2), y2)
#> torch_tensor
#> 0.00130013
#> [ CPUFloatType{} ]

Concluding

Now that we have seen how LoRA works in this simple example, we can think about how it could work on large pre-trained models.

It turns out that Transformer models are mostly a clever organization of these matrix multiplications, and applying LoRA only to these layers is enough to reduce the fine-tuning cost by a large amount while still getting good performance. You can see the experiments in the LoRA paper.

Of course, the idea of LoRA is simple enough that it can be applied not only to linear layers. You can apply it to convolutions, embedding layers and actually any other layer.

Image by Hu et al on the LoRA paper

What are Large Language Models? What are they not?

Sigrid Keydana — Tue, 20 Jun 2023 00:00:00 +0000

“At this writing, the only serious ELIZA scripts which exist are some which cause ELIZA to respond roughly as would certain psychotherapists (Rogerians). ELIZA performs best when its human correspondent is initially instructed to “talk” to it, via the typewriter of course, just as one would to a psychiatrist. This mode of conversation was chosen because the psychiatric interview is one of the few examples of categorized dyadic natural language communication in which one of the participating pair is free to assume the pose of knowing almost nothing of the real world. If, for example, one were to tell a psychiatrist “I went for a long boat ride” and he responded “Tell me about boats”, one would not assume that he knew nothing about boats, but that he had some purpose in so directing the subsequent conversation. It is important to note that this assumption is one made by the speaker. Whether it is realistic or not is an altogether separate question. In any case, it has a crucial psychological utility in that it serves the speaker to maintain his sense of being heard and understood. The speaker further defends his impression (which even in real life may be illusory) by attributing to his conversational partner all sorts of background knowledge, insights and reasoning ability. But again, these are the speaker’s contribution to the conversation.”

Joseph Weizenbaum, creator of ELIZA (Weizenbaum 1966).

GPT, the ancestor of all numbered GPTs, was released in June 2018 – five years ago, as I write this. Five years: that’s a long time. It certainly is as measured on the time scale of deep learning, the thing that is, usually, behind when people talk of “AI”. One year later, GPT was followed by GPT-2; another year later, by GPT-3. At this point, public attention was still modest – as expected, really, for these kinds of technologies that require lots of specialist knowledge. (For GPT-2, what may have increased attention beyond the normal, a bit, was OpenAI’s refusal to publish the complete training code and full model weights, supposedly due to the threat posed by the model’s capabilities – alternatively, as argued by others, as a marketing strategy, or yet alternatively, as a way to preserve one’s own competitive advantage just a tiny little bit longer.)

As of 2023, with GPT-3.5 and GPT-4 having followed, everything looks different. (Almost) everyone seems to know GPT, at least when that acronym appears prefixed by a certain syllable. Depending on who you talk to, people don’t seem to stop talking about that fantastic [insert thing here] ChatGPT generated for them, about its enormous usefulness with respect to [insert goal here]… or about the flagrant mistakes it made, and the danger that legal regulation and political enforcement will never be able to catch up.

What made the difference? Obviously, it’s ChatGPT, or put differently, the fact that now, there is a means for people to make active use of such a tool, employing it for whatever their personal needs or interests are¹. In fact, I’d argue it’s more than that: ChatGPT is not some impersonal tool – it talks to you, picking up your clarifications, changes of topic, mood… It is someone rather than something, or at least that’s how it seems. I’ll come back to that point in It’s us, really: Anthropomorphism unleashed. Before that, let’s take a look at the underlying technology.

Large Language Models: What they are

How is it even possible to build a machine that talks to you? One way is to have that machine listen a lot. And listen is what these machines do; they do it a lot. But listening alone would never be enough to attain results as impressive as those we see. Instead, LLMs practice some form of “maximally active listening”: Continuously, they try to predict the speaker’s next utterance. By “continuously”, I mean word-by-word: At each training step, the model is asked to produce the subsequent word in a text.

Maybe in my last sentence, you noted the term “train”. As per common sense, “training” implies some form of supervision. It also implies some form of method. Since learning material is scraped from the internet, the true continuation is always known. The precondition for supervision is thus always fulfilled: A supervisor can just compare model prediction with what really follows in the text. That leaves the question of method. That’s where we need to talk about deep learning, and we’ll do that in Model training.

Overall architecture

Today’s LLMs are, in some way or the other, based on an architecture known as the Transformer. This architecture was originally introduced in a paper catchily titled “Attention is all you need” (Vaswani et al. 2017). Of course, this was not the first attempt at automating natural-language generation – not even in deep learning, the sub-type of machine learning whose defining characteristic is many-layered (“deep”) artificial neural networks. But there, in deep learning, it constituted some kind of paradigm change. Before, models designed to solve sequence-prediction tasks (time-series forecasting, text generation…) tended to be based on some form of recurrent architecture, introduced in the 1990s (eternities ago, on the time scale of deep learning) by Hochreiter and Schmidhuber (1997). Basically, the concept of recurrence, with its associated threading of a latent state, was replaced by “attention”. That’s what the paper’s title was meant to communicate: The authors did not introduce “attention”²; instead, they fundamentally expanded its usage so as to render recurrence superfluous.

How did that ancestral Transformer look? – One prototypical task in natural language processing is machine translation. In translation, be it done by a machine or by a human, there is an input (in one language) and an output (in another). That input, call it a code. Whoever wants to establish its counterpart in the target language first needs to decode it. Indeed, one of two top-level building blocks of the archetypal Transformer was a decoder, or rather, a stack of decoders applied in succession. At its end, out popped a phrase in the target language³. What, then, was the other high-level block? It was an encoder, something that takes text (or tokens, rather, i.e., something that has undergone tokenization) and converts it into a form the decoder can make sense of. (Obviously, there is no analogue to this in human translation.)

From this two-stack architecture, subsequent developments tended to keep just one. The GPT family, together with many others, just kept the decoder stack. Now, doesn’t the decoder need some kind of input – if not to translate to a different language, then to reply to, as in the chatbot scenario? Turns out that no, it doesn’t – and that’s why you can also have the bot initiate the conversation. Unbeknownst to you, there will, in fact, be an input to the model – some kind of token signifying “end of input”. In that case, the model will draw on its training experience to generate a word likely to start out a phrase. That one word will then become the new input to continue from, and so forth. Summing up so far, then, GPT-like LLMs are Transformer Decoders.

The question is, how does such a stack of decoders succeed in fulfilling the task?

GPT-type models up close

In opening the black box, we focus on its two interfaces – input and output – as well as on the internals, its core.

Input

For simplicity, let me speak of words, not tokens. Now imagine a machine that is to work with – more even: “understand”⁴ – words. For a computer to process non-numeric data, a conversion to numbers necessarily has to happen. The straightforward way to effectuate this is to decide on a fixed lexicon, and assign each word a number. And this works: The way deep neural networks are trained, they don’t need semantic relationships to exist between entities in the training data to memorize formal structure. Does this mean they will appear perfect while training, but fail in real-world prediction? – If the training data are representative of how we converse, all will be fine. In a world of perfect surveillance, machines could exist that have internalized our every spoken word. Before that happens, though, the training data will be imperfect.
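To make the “fixed lexicon” idea concrete, here is a minimal base R sketch. The lexicon and the sentence are made up for the illustration; real vocabularies are far larger and operate on tokens rather than whole words.

```r
# A hypothetical, tiny lexicon, sorted alphabetically.
lexicon <- c("a", "boat", "for", "i", "long", "ride", "went")

sentence <- c("i", "went", "for", "a", "long", "boat", "ride")

# Each word is replaced by its position in the lexicon.
word_index <- match(sentence, lexicon)
word_index
# 4 7 3 1 5 2 6
```

Note that the numbers carry no meaning by themselves: “boat” (2) is no closer to “ride” (6) than to any other word. Also, `match()` returns `NA` for any word that is missing from the lexicon.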

A much more promising approach than to simply index words, then, is to represent them in a richer, higher-dimensional space, an embedding space. This idea, popular not just in deep learning but in natural language processing overall, really goes far beyond anything domain-specific – linguistic entities, say⁵. You may be able to fruitfully employ it in virtually any domain – provided you can devise a method to sensibly map the given data into that space. In deep learning, these embeddings are obtained in a clever way: as a by-product of sorts of the overall training workflow. Technically, this is achieved by means of a dedicated neural-network layer⁶ tasked with evolving these mappings. Note how, smart though this strategy may be, it implies that the overall setting – everything from training data via model architecture to optimization algorithms employed – necessarily affects the resulting embeddings. And since these may be extracted and made use of in down-stream tasks, this matters⁷.

As to the GPT family, such an embedding layer constitutes part of its input interface – one “half”, so to say. Technically, the second makes use of the same type of layer, but with a different purpose. To contrast the two, let me spell out clearly what, in the part we’ve talked about already, is getting mapped to what. The mapping is between a word index – a sequence 1, 2, …, vocabulary size – on the one hand and a set of continuous-valued vectors of some length – 100, say – on the other. (One of them could look like this: $\begin{bmatrix} 1.002 & 0.71 & 0.0004 & \dots \end{bmatrix}$) Thus, we obtain an embedding for every word. But language is more than an unordered assembly of words. Rearranging words, if syntactically allowed, may result in drastically changed semantics. In the pre-transformer paradigm, threading a sequentially-updated hidden state took care of this. Put differently, in that type of model, information about input order never got lost throughout the layers. Transformer-type architectures, however, need to find a different way. Here, a variety of rivaling methods exists. Some assume an underlying periodicity in semanto-syntactic structure. Others – and the GPT family, as yet and insofar as we know, has been part of them⁸ – approach the challenge in exactly the same way as for the lexical units: They make learning these so-called position embeddings a by-product of model training. Implementation-wise, the only difference is that now the input to the mapping looks like this: 1, 2, …, maximum position, where “maximum position” reflects the choice of maximal sequence length supported.

Summing up, verbal input is thus encoded – embedded, enriched – twofold as it enters the machine. The two types of embedding are combined and passed on to the model core, the already-mentioned decoder stack.
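The twofold encoding can be sketched as two lookup tables in base R. Everything here is a stand-in: the sizes are hypothetical, the tables are random instead of learned, and combining the two by addition is how GPT-2 does it but should be treated as an assumption for other models.

```r
set.seed(42)
vocab_size   <- 7    # hypothetical sizes, chosen for illustration
max_position <- 10
embed_dim    <- 4

# In a real model both tables are learned; here they are random stand-ins.
token_embedding    <- matrix(rnorm(vocab_size * embed_dim), vocab_size, embed_dim)
position_embedding <- matrix(rnorm(max_position * embed_dim), max_position, embed_dim)

word_index <- c(4, 7, 3)             # three words, by lexicon index
positions  <- seq_along(word_index)  # 1, 2, 3

# Look up one row per word and one row per position, then add them.
model_input <- token_embedding[word_index, ] + position_embedding[positions, ]
dim(model_input)  # 3 rows (words) by 4 columns (embedding dimensions)
```

The same word at a different position thus enters the decoder stack as a different vector, which is how order information survives without a recurrent hidden state.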

Core Processing

The decoder stack is made up of some number of identical blocks (12, in the case of GPT-2). (By “identical” I mean that the architecture is the same; the weights – the place where a neural-network layer stores what it “knows” – are not. More on these “weights” soon.)

Inside each block, some sub-layers are pretty much “business as usual”. One is not: the attention module, the “magic” ingredient that enabled Transformer-based architectures to forego keeping a latent state. To explain how this works, let’s take translation as an example.

In the classical encoder-decoder setup, the one most intuitive for machine translation, imagine the very first decoder in the stack of decoders. It receives as input a length-seven cypher, the encoded version of an original length-seven phrase. Since, due to how the encoder blocks are built, input order is conserved, we have a faithful representation of source-language word order. In the target language, however, word order can be very different. A decoder module, in producing the translation, had rather not do this by translating each word as it appears. Instead, it would be desirable for it to know which among the already-seen tokens is most relevant right now, to generate the very next output token. Put differently, it had better know where to direct its attention.

Thus, figuring out how to distribute focus is what attention modules do. How do they do it? They compute, for each available input-language token, how good a match, a fit, it is for their own current input. Remember that every token, at every processing stage, is encoded as a vector of continuous values. How good a match any of, say, three source-language vectors is is then computed by projecting one’s current input vector onto each of the three. The closer the vectors, the longer the projected vector.⁹ Based on the projection onto each source-input token, that token is weighted, and the attention module passes on the aggregated assessments to the ensuing neural-network module.
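The matching-and-weighting step can be written down in a few lines of base R. This is a simplified sketch, not the full module: the learned projections that produce queries, keys and values in a real Transformer are left out, the vectors are made up, and the same vectors serve as both keys and values. Dot products scaled by the square root of the vector length, followed by a softmax, are the scoring rule from Vaswani et al. (2017).

```r
softmax <- function(z) {
  e <- exp(z - max(z))  # subtract the max for numerical stability
  e / sum(e)
}

# Three source tokens, each a vector of length 4 (made-up values).
keys <- rbind(
  c(1, 0, 0, 0),
  c(0, 1, 0, 0),
  c(1, 1, 0, 0)
)
values <- keys  # for simplicity, aggregate the same vectors we match against

# The current input the module wants to find a match for.
query <- c(2, 0, 0, 0)

# Dot products measure how well the query matches each source token.
scores <- as.vector(keys %*% query) / sqrt(length(query))

# Normalize the scores into weights that sum to one ...
weights <- softmax(scores)

# ... and pass on the weighted combination of the source vectors.
output <- as.vector(weights %*% values)
```

With these values, the first and third source tokens receive equal weight and the second receives less, because only the first and third overlap with the query.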

To explain what attention modules are for, I’ve made use of the machine-translation scenario, a scenario that should lend a certain intuitiveness to the operation. But for GPT-family models, we need to abstract this a bit. First, there is no encoder stack, so “attention” is computed among decoder-resident tokens only. And second – remember I said a stack was built up of identical modules? – this happens in every decoder block. That is, when intermediate results are bubbled up the stack, at each stage the input is weighted as appropriate at that stage. While this is harder to intuit than what happened in the translation scenario, I’d argue that in the abstract, it makes a lot of sense. For an analogy, consider some form of hierarchical categorization of entities. As higher-level categories are built from lower-level ones, at each stage the process needs to look at its input afresh, and decide on a sensible way of subsuming similar-in-some-way categories.

Output

Stack of decoders traversed, the multi-dimensional codes that pop out need to be converted into something that can be compared with the actual phrase continuation we see in the training corpus. Technically, this involves a projection operation as well as a strategy for picking the output word – that word in target-language vocabulary that has the highest probability. How do you decide on a strategy? I’ll say more about that in the section Mechanics of text generation, where I assume a chatbot user’s perspective.

Model training

Before we get there, just a quick word about model training. LLMs are deep neural networks, and as such, they are trained like any network is. First, assuming you have access to the so-called “ground truth”, you can always compare model prediction with the true target. You then quantify the difference – by which algorithm will affect training results. Then, you communicate that difference – the loss – to the network. It, in turn, goes through its modules, from back/top to start/bottom, and updates its stored “knowledge” – matrices of continuous numbers called weights. Since information is passed from layer to layer, in a direction reverse to that followed in computing predictions, this technique is known as back-propagation.

And all that is not triggered once, but iteratively, for a certain number of so-called “epochs”, and modulated by a set of so-called “hyper-parameters”. In practice, a lot of experimentation goes into deciding on the best-working configuration of these settings.
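Stripped down to a single weight, the training recipe above fits in a short base R loop. This is a toy sketch under stated assumptions: the data are simulated, the “network” is one multiplication, and back-propagation collapses to a single application of the chain rule. The names `learning_rate` and `epochs` stand for two typical hyper-parameters.

```r
set.seed(123)
n <- 200
x <- rnorm(n)
y <- 3 * x + rnorm(n, sd = 0.1)  # "ground truth", generated with slope 3

w <- 0                 # the model's single weight, before training
learning_rate <- 0.1   # a hyper-parameter
epochs <- 50           # another one

for (epoch in seq_len(epochs)) {
  prediction <- w * x
  loss <- mean((prediction - y)^2)            # quantify the difference
  gradient <- mean(2 * (prediction - y) * x)  # d(loss)/d(w), by the chain rule
  w <- w - learning_rate * gradient           # update the stored "knowledge"
}

w  # ends up close to 3
```

Changing `learning_rate` or `epochs` changes how quickly, and whether, `w` settles near the true slope, which is the kind of experimentation the paragraph above alludes to.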

Mechanics of text generation

We already know that during model training, predictions are generated word-by-word; at every step, the model’s knowledge about what has been said so far is augmented by one token: the word that really was following at that point. If, making use of a trained model, a bot is asked to reply to a question, its response must by necessity be generated in the same way. However, the actual “correct word” is not known. The only way, then, is to feed back to the model its own most recent prediction. (By necessity, this lends to text generation a very special character, where every decision the bot makes co-determines its future behavior.)

Why, though, talk about decisions? Doesn’t the bot just act on behalf of the core model, the LLM – thus passing on the final output? Not quite. At each prediction step, the model yields a vector, with values as many as there are entries in the vocabulary. As per model design and training rationale, these vectors are “scores” – ratings, sort of, how good a fit a word would be in this situation. Like in life, higher is better. But that doesn’t mean you’d just pick the word with the highest value. In any case, these scores are converted to probabilities, and a suitable probability distribution is used to non-deterministically pick a likely (or likely-ish) word. The probability distribution commonly used is the multinomial distribution, appropriate for discrete choice among more than two alternatives. But what about the conversion to probabilities? Here, there is room for experimentation.
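The generation loop just described can be imitated with a toy stand-in for the model. In the base R sketch below, the “model” is a made-up table of random scores that looks only at the previous word, whereas a real LLM conditions on the whole preceding context; the vocabulary is invented too. What carries over is the mechanism: scores become probabilities, one word is drawn, and that word is fed back as input.

```r
set.seed(2023)
vocab <- c("the", "boat", "ride", "was", "long", ".")

# Stand-in for a trained model: one row of scores per previous word.
scores <- matrix(rnorm(length(vocab)^2), length(vocab), length(vocab),
                 dimnames = list(vocab, vocab))

softmax <- function(z) {
  e <- exp(z - max(z))
  e / sum(e)
}

generated <- "the"   # the starting input
for (step in 1:5) {
  last_word <- generated[length(generated)]
  probs <- softmax(scores[last_word, ])
  # One draw from a multinomial distribution over the vocabulary.
  next_word <- sample(vocab, size = 1, prob = probs)
  # The model's own pick becomes part of the next input.
  generated <- c(generated, next_word)
}

paste(generated, collapse = " ")
```

Since the scores are random, the output is gibberish; the point is only that each sampled word co-determines everything that follows.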

Technically, the algorithm employed is known as the softmax function. It is a simplified version of the Boltzmann distribution, famous in statistical mechanics, used to obtain the probability of a system’s state given that state’s energy and the temperature of the system. But for temperature¹⁰, both formulae are, in fact, identical. In physical systems, temperature modulates probabilities in the following way: The hotter the system, the closer the states’ probabilities are to each other; the colder it gets, the more distinct those probabilities. In the extreme, at very low temperatures there will be a few clear “winners” and a silent majority of “losers”.

In deep learning, a like effect is easy to achieve (by means of a scaling factor). That’s why you may have heard people talk about some weird thing called “temperature” that resulted in [insert adjective here] answers. If the application you use lets you vary that factor, you’ll see that a low temperature will result in deterministic-looking, repetitive, “boring” continuations, while a high one may make the machine appear as though it were on drugs.
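The effect of that scaling factor is easy to try out in base R. In this sketch, `softmax_t` is a hypothetical helper name and the four scores are made up; dividing the scores by the temperature before applying the softmax is the usual way the factor enters.

```r
softmax_t <- function(scores, temperature = 1) {
  z <- scores / temperature
  e <- exp(z - max(z))   # subtract the max for numerical stability
  e / sum(e)
}

scores <- c(2, 1, 0.5, 0)   # made-up scores for four candidate words

round(softmax_t(scores, temperature = 0.1), 3)  # nearly all mass on the top word
round(softmax_t(scores, temperature = 1), 3)    # the plain softmax
round(softmax_t(scores, temperature = 10), 3)   # close to uniform
```

Sampling from the first distribution almost always returns the same word, which is where the repetitive feel of low temperatures comes from; sampling from the last one picks nearly at random.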

That concludes our high-level overview of LLMs. Having seen the machine dissected in this way may already have left you with some sort of opinion of what these models are – not. This topic more than deserves a dedicated exposition – and papers are being written pointing to important aspects all the time – but in this text, I’d like to at least offer some input for thought.

Large Language Models: What they are not

In part one, describing LLMs technically, I’ve sometimes felt tempted to use terms like “understanding” or “knowledge” when applied to the machine. I may have ended up using them; in that case, I’ve tried to remember to always surround them with quotes. The latter, the adding quotes, stands in contrast to many texts, even ones published in an academic context (Bender and Koller 2020). The question is, though: Why did I even feel compelled to use these terms, given I do not think they apply, in their usual meaning? I can think of a simple – shockingly simple, maybe – answer: It’s because we humans think, talk, and share our thoughts in these terms. When I say understand, I surmise you will know what I mean.

Now, why do I think that these machines do not understand human language, in the sense we usually imply when using that word?

A few facts

I’ll start out briefly mentioning empirical results, conclusive thought experiments, and theoretical considerations. All aspects touched upon (and many more) are more than worthy of in-depth discussion, but such discussion is clearly out of scope for this synoptic-in-character text.

First, while it is hard to put a number on the quality of a chatbot’s answers, performance on standardized benchmarks is the “bread and butter” of machine learning – its reporting being an essential part of the prototypical deep-learning publication. (You could even call it the “cookie”, the driving incentive, since models usually are explicitly trained and fine-tuned for good results on these benchmarks.) And such benchmarks exist for most of the down-stream tasks the LLMs are used for: machine translation, generating summaries, text classification, and even rather ambitious-sounding setups associated with – quote/unquote – reasoning.

How do you assess such a capability? Here is an example from a benchmark named “Argument Reasoning Comprehension Task” (Habernal et al. 2018).

Claim: Google is not a harmful monopoly
Reason: People can choose not to use Google
Warrant: Other search engines don’t redirect to Google
Alternative: All other search engines redirect to Google

Here claim and reason together make up the argument. But what, exactly, is it that links them? At first look, this can even be confusing to a human. The missing link is what is called warrant here – add it in, and it all starts to make sense. The task, then, is to decide which of warrant or alternative supports the conclusion, and which one does not.

If you think about it, this is a surprisingly challenging task. Specifically, it seems to inescapably require world knowledge. So if language models, as has been claimed, perform nearly as well as humans, it seems they must have such knowledge – no quotes added. However, in response to such claims, research has been performed to uncover the hidden mechanism that enables such seemingly-superior results. For that benchmark, it has been found (Niven and Kao 2019) that there were spurious statistical cues in the way the dataset was constructed – with those removed, LLM performance was no better than random.

World knowledge, in fact, is one of the main things an LLM lacks. Bender and Koller (2020) convincingly demonstrate its essentiality by means of two thought experiments. One of them, situated on a lone island, imagines an octopus¹¹ inserting itself into some cable-mediated human communication, learning the chit-chat, and finally – having gotten bored – impersonating one of the humans. This works fine, until one day, its communication partner finds themselves in an emergency, and needs to build some rescue tool out of things given in the environment. They urgently ask for advice – and the octopus has no idea what to respond. It has no idea what these words actually refer to.

The other argument comes directly from machine learning, and strikingly simple though it may be, it makes its point very well. Imagine an LLM trained as usual, including on lots of text involving plants. It has also been trained on a dataset of unlabeled photos, the actual task being immaterial – say it had to fill in masked areas. Now, we pull out a picture and ask: How many of that blackberry’s blossoms have already opened? The model has no chance to answer the question.

Now, please look back at the Joseph Weizenbaum quote I opened this article with. It is still true that language-generating machines have no knowledge of the world we live in.

Before moving on, I’d like to just quickly hint at a totally different type of consideration, brought up in a (2003!) paper by Spärck Jones (Spaerck 2004). Though written long before LLMs, and long before deep learning started its winning conquest, on an abstract level it is still very applicable to today’s situation. Today, LLMs are employed to “learn language”, i.e., for language acquisition. That skill is then built upon by specialized models, of task-dependent architecture. Popular real-world¹² down-stream tasks are translation, document retrieval, or text summarization. When the paper was written, there was no such two-stage pipeline. The author was questioning the fit between how language modeling was conceptualized – namely, as a form of recovery – and the character of these down-stream tasks. Was recovery – inferring a piece of text that is missing, for whatever reason – a good model of, say, condensing a long, detailed piece of text into a short, concise, factual one? If not, could the reason it still seemed to work just fine be of a very different nature – a technical, operational, coincidental one?

[…] the crucial characterisation of the relationship between the input and the output is in fact offloaded in the LM approach onto the choice of training data. We can use LM for summarising because we know that some set of training data consists of full texts paired with their summaries.¹³

It seems to me that today’s two-stage process notwithstanding, this is still an aspect worth giving some thought.

It’s us: Language learning, shared goals, and a shared world

We’ve already talked about world knowledge. What else are LLMs missing out on?

In our world, you’ll hardly find anything that does not involve other people. This goes a lot deeper than the easily observable facts: our constantly communicating, reading and typing messages, documenting our lives on social networks… We don’t experience, explore, explain a world of our own. Instead, all these activities are inter-subjectively constructed. Feelings are¹⁴. Cognition is; meaning is. And it goes deeper yet. Implicit assumptions guide us to constantly look for meaning, be it in overheard fragments, mysterious symbols, or life events.

How does this relate to LLMs? For one, they’re islands of their own. When you ask them for advice – to develop a research hypothesis and a matching operationalization, say, or whether a detainee should be released on parole – they have no stakes in the outcome, no motivation (be it intrinsic or extrinsic), no goals. If an innocent person is harmed, they don’t feel the remorse; if an experiment is successful but lacks explanatory power, they don’t sense the shallowness; if the world blows up, it won’t have been their world.

Secondly, it’s us who are not islands. In Bender and Koller’s octopus scenario, the human on one side of the cable plays an active role not just when they speak. In making sense of what the octopus says, they contribute an essential ingredient: namely, what they think the octopus wants, thinks, feels, expects… Anticipating, they reflect on what the octopus anticipates.

As Bender and Koller put it:

It is not that O’s utterances make sense, but rather, that A can make sense of them.

That article (Bender and Koller 2020) also brings impressive evidence from human language acquisition: Our predisposition towards language learning notwithstanding, infants don’t learn from the availability of input alone. A situation of joint attention is needed for them to learn. Psychologizing, one could hypothesize they need to get the impression that these sounds, these words, and the fact they’re linked together, actually matter.

Let me conclude, then, with my final “psychologization”.

It’s us, really: Anthropomorphism unleashed

Yes, it is amazing what these machines do. (And that makes them incredibly dangerous power instruments.) But this in no way affects the human-machine differences that have existed throughout history, and continue to exist today. That we are inclined to think they understand, know, mean – that maybe even they’re conscious: that’s on us. We can experience deep emotions watching a movie; hope that if we just try enough, we can sense what a distant-in-evolutionary-genealogy creature is feeling; see a cloud encouragingly smiling at us; read a sign in an arrangement of pebbles.

Our inclination to anthropomorphize is a gift; but it can sometimes be harmful. And none of this is special to the twenty-first century.

As I began with him, let me conclude with Weizenbaum.

Some subjects have been very hard to convince that ELIZA (with its present script) is not human.

Photo by Marjan Blan on Unsplash

Bender, Emily M., and Alexander Koller. 2020. “Climbing Towards NLU: On Meaning, Form, and Understanding in the Age of Data.” Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (Online), July, 5185–98. https://doi.org/10.18653/v1/2020.acl-main.463 .

Caliskan, Aylin, Pimparkar Parth Ajay, Tessa Charlesworth, Robert Wolfe, and Mahzarin R. Banaji. 2022. “Gender Bias in Word Embeddings.” Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, July. https://doi.org/10.1145/3514094.3534162 .

Habernal, Ivan, Henning Wachsmuth, Iryna Gurevych, and Benno Stein. 2018. “The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants.” Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) (New Orleans, Louisiana), June, 1930–40. https://doi.org/10.18653/v1/N18-1175 .

Hochreiter, Sepp, and Jürgen Schmidhuber. 1997. “Long Short-Term Memory.” Neural Computation 9 (December): 1735–80. https://doi.org/10.1162/neco.1997.9.8.1735 .

Niven, Timothy, and Hung-Yu Kao. 2019. “Probing Neural Network Comprehension of Natural Language Arguments.” CoRR abs/1907.07355. http://arxiv.org/abs/1907.07355 .

Spaerck, Karen. 2004. “Language Modelling’s Generative Model: Is It Rational?”

Vaswani, Ashish, Noam Shazeer, Niki Parmar, et al. 2017. Attention Is All You Need. https://arxiv.org/abs/1706.03762 .

Weizenbaum, Joseph. 1966. “ELIZA - a Computer Program for the Study of Natural Language Communication Between Man and Machine.” Commun. ACM (New York, NY, USA) 9 (1): 36–45. https://doi.org/10.1145/365153.365168 .

¹ Evidently, this is not about singling out ChatGPT as opposed to other chatbots; rather, I’m adopting it as the prototypical such application, since it is the one omnipresent in the media these days.
² I’m using quotes to refer to how attention is operationalized in deep learning, as opposed to how it is conceptualized in cognitive science or psychology.
³ If you’re wondering how that is possible – shouldn’t there be a separate, top-level module for generation? – no, there need not be. That’s because training implies prediction.
⁴ Why the quotes? See Large Language Models: What they are not.
⁵ As a fascinating example from dynamical systems theory, take delay coordinate embeddings.
⁶ Suitably named embedding layer.
⁷ See, for example, (Caliskan et al. 2022).
⁸ For GPT-4, even high-level model information has not been released.
⁹ Mathematically, this is achieved by the dot product, a pretty standard operation used pervasively in machine learning.
¹⁰ … and the Boltzmann constant – but that being a constant, we don’t consider it here.
¹¹ That choice of species is probably not a coincidence: see https://en.wikipedia.org/wiki/Cephalopod_intelligence .
¹² As opposed to the aforementioned problems subsumed under “reasoning”, those having been constructed for research purposes.
¹³ From (Spaerck 2004).
¹⁴ See https://lisafeldmanbarrett.com/books/how-emotions-are-made/ .
