Software by Simon Couch
Events attended by Simon Couch
Posts and resources by Simon Couch
Data analysis with Posit AI-assistants | Sara Altman & Simon Couch | Data Science Lab
The Data Science Lab is a live weekly call. Register at pos.it/dslab! Discord invites go out each week on live calls. We’d love to have you!
The Lab is an open, messy space for learning and asking questions. Think of it like pair coding with a friend or two. Learn something new, and share what you know to help others grow.
On this call, Libby Heeren is joined by Sara Altman who walks through using Posit’s AI assistants to analyze data, including a sneak peek at Posit Assistant, and Simon Couch drops by to give us a demo of the reviewer package! Together, Sara and Simon author the Posit AI Newsletter, the best place to stay up-to-date with all the cool tools and advice on staying an informed and level-headed AI user.
Hosting crew from Posit: Libby Heeren, Isabella Velasquez, Sara Altman, Simon Couch
- Sara’s Bluesky: https://bsky.app/profile/sara-altman.bsky.social
- Sara’s LinkedIn: https://www.linkedin.com/in/sarakaltman/
- Sara’s GitHub: https://github.com/skaltman
- Posit AI Newsletter by Sara and Simon: https://posit.co/blog/?category=roundups
Resources from the hosts and chat:
- Positron IDE → https://positron.posit.co/
- Databot Extension → https://positron.posit.co/databot.html
- Getting started with Positron Assistant → https://positron.posit.co/assistant-getting-started.html
- Posit Assistant (Private Beta) → https://posit-ai-beta.share.connect.posit.cloud/
- Reviewer Package (by Simon Couch) → https://github.com/simonpcouch/reviewer
- ellmer Package → https://ellmer.tidyverse.org/
- chatlas Package → https://github.com/posit-dev/chatlas
- Read the Posit AI Newsletter → https://posit.co/blog/?category=roundups
- Sign up to get the Posit AI Newsletter → http://pos.it/ai-news
- Simon’s blog post about local LLMs not quite being ready for primetime → https://posit.co/blog/local-models-are-not-there-yet/
- Join the waitlist for Posit AI in RStudio → https://posit.co/products/ai/
- Posit AI Known Issues & FAQs → https://posit-ai-beta.share.connect.posit.cloud/#frequently-asked-questions-faqs
- Blog post from Simon and Sara about Privacy and LLMs → https://posit.co/blog/trust-llm-tools/
- DS Lab YouTube playlist → https://youtube.com/playlist?list=PL9HYL-VRX0oSeWeMEGQt0id7adYQXebhT&si=7tmU6EAJpO5S7GBh
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here:
- Website: https://www.posit.co
- The Lab: https://pos.it/dslab
- Hangout: https://pos.it/dsh
- LinkedIn: https://www.linkedin.com/company/posit-software
- Bluesky: https://bsky.app/profile/posit.co
Thanks for learning with us!
Timestamps
00:00 Introduction
07:23 “Would you mind real quick just briefly explaining the differences between Positron Assistant and Databot?”
15:01 “Is there any way to configure reasoning efforts when signing in with GitHub Copilot?”
15:49 “Does DataBot already support other providers beyond Cloud?”
20:36 “What is the cases with monetary penalty in the console output?”
22:14 “Do you happen to know if the column names of the dataset are very, very messy?”
23:18 “Can you add skills to DataBot?”
26:36 “This code isn’t being saved anywhere. So where does it go?”
27:38 “There a way to know what all the slash commands are?”
28:51 Requesting Databot to use the namespace operator
33:58 “Is there a way to search within that Databot pane?”
39:34 “Have you noticed any time differences with how quickly things run in RStudio versus Positron?”
40:33 “What happens if you open that URL that it mentions at the bottom in your browser?”
40:50 Clarifying the difference between Posit Assistant and Positron Assistant
43:18 “What is the typical token burn rate?”
53:31 “Is this on CRAN and working in both Positron and RStudio?”

The mall package: using LLMs with data frames in R & Python | Edgar Ruiz | Data Science Lab
On this call, Libby Heeren is joined by Edgar Ruiz as they walk through how mall works (with ellmer) in R, and then in Python. The mall package lets you use LLMs to process tabular data or vectors: for example, you can feed it a column of reviews and ask mall to use an Anthropic model via ellmer to add a column of summaries or sentiments. Follow along with the code here: https://github.com/LibbyHeeren/mall-package-r
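The idea described above (one model call per row, with the answer stored in a new column) can be sketched in plain Python. This is only an illustration of the pattern, not mall's interface: `add_llm_column` and `fake_llm` are hypothetical names, and `fake_llm` is a keyword-matching stand-in so the example runs offline without any model or API key.

```python
from typing import Callable

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; it only matches keywords."""
    # Look only at the text after the "Text: " marker, not the instruction.
    text = prompt.split("Text: ", 1)[-1].lower()
    if any(word in text for word in ("love", "great", "excellent")):
        return "positive"
    if any(word in text for word in ("hate", "broken", "terrible")):
        return "negative"
    return "neutral"

def add_llm_column(rows: list[dict], source: str, target: str,
                   instruction: str, llm: Callable[[str], str]) -> list[dict]:
    """Return copies of `rows` with `target` filled by one call per row."""
    out = []
    for row in rows:
        prompt = f"{instruction}\nText: {row[source]}"
        out.append({**row, target: llm(prompt)})
    return out

reviews = [
    {"id": 1, "review": "I love this blender, it works great."},
    {"id": 2, "review": "Arrived broken and support was terrible."},
    {"id": 3, "review": "It is a blender."},
]

labelled = add_llm_column(
    reviews, "review", "sentiment",
    "Answer with one word: positive, negative, or neutral.",
    fake_llm,
)
for row in labelled:
    print(row["id"], row["sentiment"])
```

Swapping `fake_llm` for a function that calls a real model would keep the rest of the sketch unchanged, which is the appeal of treating the model as a column-wise function.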
Hosting crew from Posit: Libby Heeren, Isabella Velasquez, Edgar Ruiz
- Edgar’s Bluesky: https://bsky.app/profile/theotheredgar.bsky.social
- Edgar’s LinkedIn: https://www.linkedin.com/in/edgararuiz/
- Edgar’s GitHub: https://github.com/edgararuiz
Resources from the hosts and chat:
- Ollama → https://ollama.com/download
- Posit Data Science Lab → https://posit.co/dslab
- mall package → https://mlverse.github.io/mall/
- ellmer package → https://ellmer.tidyverse.org/
- Libby’s Positron theme (Catppuccin) → https://marketplace.visualstudio.com/items?itemName=Catppuccin.catppuccin-vsc
- GitHub repo with Libby and Edgar’s code → https://github.com/LibbyHeeren/mall-package-r
- LLM providers supported by ellmer → https://ellmer.tidyverse.org/index.html#providers
- vitals package → https://vitals.tidyverse.org/
- chatlas package → https://posit-dev.github.io/chatlas/
- polars package → https://pola.rs/
- narwhals package → https://narwhals-dev.github.io/narwhals/
- pandas package → https://pandas.pydata.org/
- LM Studio → https://lmstudio.ai/
- Simon Couch’s blog → https://www.simonpcouch.com/
- Edgar’s dataset: TidyTuesday Animal Crossing Dataset (May 5, 2020) → https://github.com/rfordatascience/tidytuesday
- Libby’s dataset: Kaggle Tweets Dataset → https://www.kaggle.com/datasets/mmmarchetti/tweets-dataset
- Blog from Sara and Simon on evaluating LLMs → https://posit.co/blog/r-llm-evaluation-03/
- Data Science Lab YouTube playlist → https://www.youtube.com/watch?v=LDHGENv1NP4&list=PL9HYL-VRX0oSeWeMEGQt0id7adYQXebhT&index=2
- AWS Bedrock → https://aws.amazon.com/bedrock/
- Anthropic → https://www.anthropic.com/
- Google Gemini → https://gemini.google.com/
- What is rubber duck debugging anyway?? → https://en.wikipedia.org/wiki/Rubber_duck_debugging
Timestamps
00:00 Introduction to Libby, Isabella, Edgar, and the mall package + ellmer package
07:14 “What’s the difference between using mall for these NLP tasks versus traditional or classical NLP?”
09:37 “Can mall be used with a local LLM?”
17:32 “What kind of laptop specs should I realistically have to make good use of these models?”
22:12 “Are you limited to three output options?”
22:55 “Can mall return the prediction probabilities?”
24:14 “What are a rule of thumb set of specs for a machine so local LLMs are practically feasible?”
24:47 “Would that be in the additional prompt area where you’re defining things?”
25:04 “You could use the vitals package to compare models, right?”
25:24 “Can we use LM Studio instead of Ollama?”
28:35 “How do you iterate and validate the model?”
36:39 “Why use paste if it is all text?”
37:31 “Are these recent tweets (from X) or older ones from actual Twitter?”
40:23 “Is there a playlist for the Data Science Labs on YouTube?”
46:11 “Does that mean that the python version does not work with pandas?”
50:14 “Where is this data set from?”


How to deploy Shiny apps in 2026 | Alex Chisholm | Data Science Lab
On this call, Libby Heeren is joined by Posit product manager Alex Chisholm as he walks through the evolution of Shiny app deployment over the years, how to deploy Shiny apps in the modern era, and peeks into Posit’s roadmap for future development. Do you call it “deployment” or “publishing” when it comes to Shiny apps? 🤔
This is a super friendly and conversational space, and being there live in the Discord chat can’t be beat!! We hope you get to join us sometime soon.
Hosting crew from Posit: Libby Heeren, Isabella Velasquez, Daniel Chen, Alex Chisholm
Alex Chisholm’s LinkedIn: http://www.linkedin.com/in/chisholm1
Resources from the hosts and chat:
- Posit Connect Cloud for deploying Shiny apps in the modern era: https://connect.posit.cloud/
- Install Positron: https://positron.posit.co/
- Simon Couch’s blog post on local LLMs not being good enough yet: https://www.simonpcouch.com/blog/2025-12-04-local-agents/
- Blue-Green Shiny App Deployments using Posit Connect posit::conf(2025) talk by Ryszard Szymański: https://youtu.be/QEEGLWj0nas
- Digital Ocean: https://www.digitalocean.com/
- Ollama local LLM: https://ollama.com/
- py-sidebot app template: https://shiny.posit.co/py/templates/sidebot/
- querychat app template: https://shiny.posit.co/py/templates/querychat/
- Dan Chen mentioned Render in the chat as an alternative to Digital Ocean: https://render.com/
- Alex Chisholm’s AB testing GitHub repo example: https://github.com/alex-chisholm/shiny-r-abtesting
- Edward in the chat shared a GitHub repo for using GitHub actions to execute remote SSH commands: https://github.com/appleboy/ssh-action
- Abu in the chat shared blue-green vs. canary deployments: https://octopus.com/devops/software-deployments/blue-green-vs-canary-deployments/
- Frank in the chat mentioned Simon’s blog on using local LLMs with the chores package: https://www.simonpcouch.com/blog/2025-12-10-chores-0-3-0/
Timestamps
00:00 Introduction
03:03 Meaningful applications and value creation
05:31 The evolution of Shinyapps.io and Posit Connect
08:12 DigitalOcean and Droplets
09:36 DigitalOcean vs. commercial cloud providers
11:48 Comparisons: DigitalOcean, Azure, and AWS
14:47 Replicating local environments with Docker
16:51 The open-source Shiny Server
18:20 Use case: University of Illinois CITL
20:02 Key considerations for deployment decisions
21:53 GitHub Actions and version control
23:31 Addressing single points of failure and maintainability
24:38 Posit Connect Cloud features and portfolio
26:01 Beyond Shiny: Quarto, Streamlit, and Dash
27:07 Handling secrets and database credentials
28:56 Custom vanity links vs. UUIDs
30:04 Blue-Green deployment strategies
31:55 “Is it easy to set up a developer workflow?”
34:46 Guardrails for AI-powered apps and token usage
37:32 Small language models and Ollama
38:29 Sidebot AI demo and LLM integration
39:41 Understanding manifest.json and dependencies
45:00 Automatic publish on GitHub push
46:51 The future of Shinyapps.io and migration
48:33 “Did you just build a custom agent for that specific dashboard?”
51:43 Publishing from RStudio IDE to Connect Cloud
54:16 Preview: Inspecting website APIs for data harvesting

Simon Couch - Practical AI for data science
Practical AI for data science (Simon Couch)
Abstract: While most discourse about AI focuses on glamorous, ungrounded applications, data scientists spend most of their days tackling unglamorous problems in sensitive data. Integrated thoughtfully, LLMs are quite useful in practice for all sorts of everyday data science tasks, even when restricted to secure deployments that protect proprietary information. At Posit, our work on ellmer and related R packages has focused on enabling these practical uses. This talk will outline three practical AI use-cases (structured data extraction, tool calling, and coding) and offer guidance on getting started with LLMs when your data and code are confidential.
Presented at the 2025 R/Pharma Conference Europe/US Track.
Resources mentioned in the presentation:
- {vitals}: Large Language Model Evaluations https://vitals.tidyverse.org/
- {mcptools}: Model Context Protocol for R https://posit-dev.github.io/mcptools/
- {btw}: A complete toolkit for connecting R and LLMs https://posit-dev.github.io/btw/
- {gander}: High-performance, low-friction Large Language Model chat for data scientists https://simonpcouch.github.io/gander/
- {chores}: A collection of large language model assistants https://simonpcouch.github.io/chores/
- {predictive}: A frontend for predictive modeling with tidymodels https://github.com/simonpcouch/predictive
- {kapa}: RAG-based search via the kapa.ai API https://github.com/simonpcouch/kapa
- Databot https://positron.posit.co/dat

Simon Couch - From hours to minutes: Accelerating your tidymodels code
From hours to minutes: Accelerating your tidymodels code - Simon Couch
Abstract: This talk demonstrates a 145-fold speedup in training time for a machine learning pipeline with tidymodels through 4 small changes. By adapting a grid search on a canonical model to use a more performant modeling engine, hooking into a parallel computing framework, transitioning to an optimized search strategy, and defining the grid to search over carefully, users can drastically cut down on the time to develop machine learning models with tidymodels without sacrificing predictive performance.
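One of the four changes in the abstract, the "optimized search strategy", can be illustrated with a small racing-style search. This sketch assumes that phrase refers to racing, where weak candidates are dropped after a few resamples instead of being evaluated on all of them. It is plain Python rather than tidymodels code: `toy_accuracy` is a made-up scoring function, and the fixed `margin` rule is a simplification swapped in for the statistical test a real racing method would use.

```python
def toy_accuracy(candidate: int, fold: int) -> float:
    """Made-up score: best at candidate 6, with a small per-fold wobble."""
    wobble = ((fold * 7 + candidate * 3) % 5 - 2) * 0.001
    return 0.90 - 0.01 * abs(candidate - 6) + wobble

def grid_search(candidates, n_folds, evaluate):
    """Evaluate every candidate on every fold."""
    means, fits = {}, 0
    for c in candidates:
        scores = [evaluate(c, fold) for fold in range(n_folds)]
        fits += n_folds
        means[c] = sum(scores) / n_folds
    return max(means, key=means.get), fits

def race(candidates, n_folds, evaluate, burn_in=3, margin=0.015):
    """Drop candidates whose running mean trails the leader by > margin."""
    scores = {c: [] for c in candidates}
    alive, fits = list(candidates), 0
    for fold in range(n_folds):
        for c in alive:
            scores[c].append(evaluate(c, fold))
            fits += 1
        if fold + 1 >= burn_in:  # only prune after a few folds
            means = {c: sum(scores[c]) / len(scores[c]) for c in alive}
            best = max(means.values())
            alive = [c for c in alive if means[c] >= best - margin]
    final = {c: sum(scores[c]) / len(scores[c]) for c in alive}
    return max(final, key=final.get), fits

candidates = list(range(10))
grid_winner, grid_fits = grid_search(candidates, 10, toy_accuracy)
race_winner, race_fits = race(candidates, 10, toy_accuracy)
print(grid_winner, grid_fits)
print(race_winner, race_fits)
```

On this toy problem both searches pick the same candidate, but the race fits roughly half as many models. The saving on a real problem depends on how quickly poor candidates separate from good ones.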
Resources mentioned in the talk:
- Presentation slides https://simonpcouch.github.io/rpharma-24/#/
- GitHub repository for talk https://github.com/simonpcouch/rpharma-24
- Efficient Machine Learning with R: Low-Compute Predictive Modeling with tidymodels https://emlwr.org
- Optimizing model parameters faster with tidymodels https://www.simonpcouch.com/blog/2023-08-04-parallel-racing/
Presented at the 2024 R/Pharma Conference

Simon Couch - Fair machine learning
In recent years, high-profile analyses have called attention to many contexts where the use of machine learning deepened inequities in our communities. After a year of research and design, the tidymodels team is excited to share a set of tools to help data scientists develop fair machine learning models and communicate about them effectively. This talk will introduce the research field of machine learning fairness and demonstrate a fairness-oriented analysis of a machine learning model with tidymodels.
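To make the idea of a fairness-oriented analysis concrete, here is a plain-Python sketch of one common fairness metric, demographic parity, which compares the rate of positive predictions across groups. The predictions and group labels are made up for illustration, the function names are hypothetical, and this isn't the tidymodels interface shown in the talk.

```python
from collections import defaultdict

def positive_rate_by_group(predictions, groups):
    """Share of positive (1) predictions within each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for pred, group in zip(predictions, groups):
        counts[group][0] += int(pred == 1)
        counts[group][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}

def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group-level positive rate."""
    rates = positive_rate_by_group(predictions, groups)
    return max(rates.values()) - min(rates.values())

# Made-up predictions for two groups of four people each.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = positive_rate_by_group(preds, groups)
gap = demographic_parity_difference(preds, groups)
print(rates)
print(gap)
```

A gap of zero would mean both groups receive positive predictions at the same rate. Whether that is the right notion of fairness for a given problem is a judgment call, and other metrics compare error rates across groups instead.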
Talk by Simon Couch
Slides: https://simonpcouch.github.io/conf-24 GitHub Repo: https://github.com/simonpcouch/conf-24

How to train, evaluate, and deploy a machine learning workflow with tidymodels & Posit Team
Helpful resources:
- GitHub: https://github.com/simonpcouch/mutagen
- Follow-up Q&A Session: https://youtube.com/live/vwBVOBQfc_U
- If you want to book a call with our team to chat more about Posit products: pos.it/chat-with-us
- Don’t want to meet, but curious who else on your team is using Posit? pos.it/connect-us
- Blog post on tidymodels + Posit Connect: https://posit.co/blog/pharmaceutical-machine-learning-with-tidymodels-and-posit-connect/
- Tidy Modeling with R book: https://www.tmwr.org/
Timestamps:
1:44 - Three steps for developing a machine learning model
3:35 - What is a machine learning model?
7:02 - Overview of machine learning with Posit Team
7:36 - Step 1: Understand and clean data
11:05 - Step 2: Train and evaluate models (why you might be interested in using tidymodels)
23:02 - Step 3: Deploying a machine learning model from Posit Workbench to Posit Connect
30:14 - Summary
31:21 - Helpful resources
Machine learning models are all around us, from Netflix movie recommendations to Zillow property value estimates to email spam filters.
As these models play an increasingly large role in our personal and professional lives, understanding and embracing them has never been more important; machine learning helps us make better, data-driven decisions.
The tidymodels framework is a powerful set of tools for building—and getting value out of—machine learning models with R.
Data scientists use tidymodels to:
- Gain access to a wide variety of machine learning methods
- Guard against common mistakes
- Easily deploy models through tidymodels’ integration with vetiver
Join Simon Couch from the tidyverse team on Wednesday, October 25th at 11am ET as he walks through an end-to-end machine learning workflow with Posit Team.
No registration is required to attend - simply add it to your calendar using this link: pos.it/team-demo

posit::conf(2023) Workshop: Introduction to tidymodels
Register now: http://pos.it/conf
Instructors: Hannah Frick, Simon Couch, Emil Hvitfeldt
Workshop Duration: 1-Day Workshop
This workshop is for you if you:
- have intermediate R knowledge, experience with tidyverse packages, and either of the R pipes
- can read data into R, transform and reshape data, and make a wide variety of graphs
- have had some exposure to basic statistical concepts such as linear models, random forests, etc.
Intermediate or expert familiarity with modeling or machine learning is not required.
This workshop will teach you core tidymodels packages and their uses: data splitting/resampling with rsample, model fitting with parsnip, measuring model performance with yardstick, and basic pre-processing with recipes. Time permitting, you’ll be introduced to model optimization using the tune package. You’ll learn tidymodels syntax as well as the process of predictive modeling for tabular data.
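The data splitting and resampling step mentioned above can be sketched without any modeling library. The following plain-Python example holds out a test set and then builds v-fold cross-validation splits from the training rows. The function names, proportions, and seed are assumptions chosen for illustration, and this is not the rsample interface taught in the workshop.

```python
import random

def initial_split(rows, prop=0.75, seed=1):
    """Shuffle the rows, then hold out the last (1 - prop) share for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * prop)
    return shuffled[:cut], shuffled[cut:]

def v_folds(rows, v=5):
    """Return v (analysis, assessment) pairs; each row is assessed once."""
    folds = []
    for i in range(v):
        assessment = rows[i::v]  # every v-th row, starting at offset i
        analysis = [r for j, r in enumerate(rows) if j % v != i]
        folds.append((analysis, assessment))
    return folds

data = list(range(20))  # stand-in for 20 rows of tabular data
train, test = initial_split(data)
folds = v_folds(train, v=5)

print(len(train), len(test))
for analysis, assessment in folds:
    print(len(analysis), len(assessment))
```

The test set is left untouched until the end, while each fold fits on its analysis rows and measures performance on its assessment rows.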



