Simon Couch

Senior Software Engineer

simonpcouch

Software by Simon Couch

Events attended by Simon Couch

Posts and resources by Simon Couch

Data analysis with Posit AI-assistants | Sara Altman & Simon Couch | Data Science Lab

The Data Science Lab is a live weekly call. Register at pos.it/dslab! Discord invites go out each week on live calls. We’d love to have you!

The Lab is an open, messy space for learning and asking questions. Think of it like pair coding with a friend or two. Learn something new, and share what you know to help others grow.

On this call, Libby Heeren is joined by Sara Altman who walks through using Posit’s AI assistants to analyze data, including a sneak peek at Posit Assistant, and Simon Couch drops by to give us a demo of the reviewer package! Together, Sara and Simon author the Posit AI Newsletter, the best place to stay up-to-date with all the cool tools and advice on staying an informed and level-headed AI user.

Hosting crew from Posit: Libby Heeren, Isabella Velasquez, Sara Altman, Simon Couch

Sara’s Bluesky: https://bsky.app/profile/sara-altman.bsky.social
Sara’s LinkedIn: https://www.linkedin.com/in/sarakaltman/
Sara’s GitHub: https://github.com/skaltman
Posit AI Newsletter by Sara and Simon: https://posit.co/blog/?category=roundups

Resources from the hosts and chat:

Positron IDE → https://positron.posit.co/
Databot Extension → https://positron.posit.co/databot.html
Getting started with Positron Assistant → https://positron.posit.co/assistant-getting-started.html
Posit Assistant (Private Beta) → https://posit-ai-beta.share.connect.posit.cloud/
Reviewer Package (by Simon Couch) → https://github.com/simonpcouch/reviewer
ellmer Package → https://elmer.tidyverse.org/
chatlas Package → https://github.com/posit-dev/chatlas
Read the Posit AI Newsletter → https://posit.co/blog/?category=roundups
Sign up to get the Posit AI Newsletter → http://pos.it/ai-news
Simon’s blog post about local LLMs not quite being ready for primetime → https://posit.co/blog/local-models-are-not-there-yet/
Join the waitlist for Posit AI in RStudio → https://posit.co/products/ai/
Posit AI Known Issues & FAQs → https://posit-ai-beta.share.connect.posit.cloud/#frequently-asked-questions-faqs
Blog post from Simon and Sara about Privacy and LLMs → https://posit.co/blog/trust-llm-tools/
DS Lab YouTube playlist → https://youtube.com/playlist?list=PL9HYL-VRX0oSeWeMEGQt0id7adYQXebhT&si=7tmU6EAJpO5S7GBh

► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu

Follow Us Here:
Website: https://www.posit.co
The Lab: https://pos.it/dslab
Hangout: https://pos.it/dsh
LinkedIn: https://www.linkedin.com/company/posit-software
Bluesky: https://bsky.app/profile/posit.co

Thanks for learning with us!

Timestamps
00:00 Introduction
07:23 “Would you mind real quick just briefly explaining the differences between Positron Assistant and Databot?”
15:01 “Is there any way to configure reasoning efforts when signing in with GitHub Copilot?”
15:49 “Does DataBot already support other providers beyond Cloud?”
20:36 “What is the cases with monetary penalty in the console output?”
22:14 “Do you happen to know if the column names of the dataset are very, very messy?”
23:18 “Can you add skills to DataBot?”
26:36 “This code isn’t being saved anywhere. So where does it go?”
27:38 “There a way to know what all the slash commands are?”
28:51 Requesting Databot to use the namespace operator
33:58 “Is there a way to search within that Databot pane?”
39:34 “Have you noticed any time differences with how quickly things run in RStudio versus Positron?”
40:33 “What happens if you open that URL that it mentions at the bottom in your browser?”
40:50 Clarifying the difference between Posit Assistant and Positron Assistant
43:18 “What is the typical token burn rate?”
53:31 “Is this on CRAN and working in both Positron and RStudio?”

Simon Couch

The mall package: using LLMs with data frames in R & Python | Edgar Ruiz | Data Science Lab

The Data Science Lab is a live weekly call. Register at pos.it/dslab! Discord invites go out each week on live calls. We’d love to have you!

The Lab is an open, messy space for learning and asking questions. Think of it like pair coding with a friend or two. Learn something new, and share what you know to help others grow.

On this call, Libby Heeren is joined by Edgar Ruiz as they walk through how mall works (with ellmer) in R, and then in Python. The mall package lets you use LLMs to process tabular data or vectors: for example, you can feed it a column of reviews and ask mall to use an Anthropic model via ellmer to add a column of summaries or sentiments. Follow along with the code here: https://github.com/LibbyHeeren/mall-package-r
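The column-wise pattern described above can be sketched in plain Python. This is an illustration of the idea only, not mall's API: `add_llm_column` and `fake_sentiment_model` are hypothetical names, and the keyword-matching stub stands in for a real LLM call so the example runs without a model or API key.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def add_llm_column(
    rows: List[Row],
    source: str,
    target: str,
    ask_model: Callable[[str], str],
) -> List[Row]:
    """Return copies of `rows` with `target` set to ask_model(row[source])."""
    return [{**row, target: ask_model(row[source])} for row in rows]


def fake_sentiment_model(text: str) -> str:
    # Hypothetical stand-in for an LLM call. A real setup would send the
    # text to a model with a prompt asking for one of the three labels.
    lowered = text.lower()
    if any(word in lowered for word in ("love", "great")):
        return "positive"
    if any(word in lowered for word in ("broke", "terrible")):
        return "negative"
    return "neutral"


# Toy data made up for this example.
reviews = [
    {"review": "I love this toaster, works great"},
    {"review": "The handle broke after two days"},
    {"review": "It arrived on Tuesday"},
]

labeled = add_llm_column(reviews, "review", "sentiment", fake_sentiment_model)
for row in labeled:
    print(row["sentiment"], "|", row["review"])
```

Swapping `fake_sentiment_model` for a function that calls a real model would leave the rest of the code unchanged, which is roughly the convenience a package like mall aims to provide for data frames.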

Hosting crew from Posit: Libby Heeren, Isabella Velasquez, Edgar Ruiz

Edgar’s Bluesky: https://bsky.app/profile/theotheredgar.bsky.social
Edgar’s LinkedIn: https://www.linkedin.com/in/edgararuiz/
Edgar’s GitHub: https://github.com/edgararuiz

Resources from the hosts and chat:

Ollama → https://ollama.com/download
Posit Data Science Lab → https://posit.co/dslab
mall package → https://mlverse.github.io/mall/
ellmer package → https://elmer.tidyverse.org/
Libby’s Positron theme (Catppuccin) → https://marketplace.visualstudio.com/items?itemName=Catppuccin.catppuccin-vsc
GitHub repo with Libby and Edgar’s code → https://github.com/LibbyHeeren/mall-package-r
LLM providers supported by ellmer → https://ellmer.tidyverse.org/index.html#providers
vitals package → https://vitals.tidyverse.org/
chatlas package → https://posit-dev.github.io/chatlas/
polars package → https://pola.rs/
narwhals package → https://narwhals-dev.github.io/narwhals/
pandas package → https://pandas.pydata.org/
LM Studio → https://lmstudio.ai/
Simon Couch’s blog → https://www.simonpcouch.com/
Edgar’s dataset: TidyTuesday Animal Crossing Dataset (May 5, 2020) → https://github.com/rfordatascience/tidytuesday
Libby’s dataset: Kaggle Tweets Dataset → https://www.kaggle.com/datasets/mmmarchetti/tweets-dataset
Blog from Sara and Simon on evaluating LLMs → https://posit.co/blog/r-llm-evaluation-03/
Data Science Lab YouTube playlist → https://www.youtube.com/watch?v=LDHGENv1NP4&list=PL9HYL-VRX0oSeWeMEGQt0id7adYQXebhT&index=2
AWS Bedrock → https://aws.amazon.com/bedrock/
Anthropic → https://www.anthropic.com/
Google Gemini → https://gemini.google.com/
What is rubber duck debugging anyway?? → https://en.wikipedia.org/wiki/Rubber_duck_debugging

► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu

Thanks for learning with us!

Timestamps
00:00 Introduction to Libby, Isabella, Edgar, and the mall package + ellmer package
07:14 “What’s the difference between using mall for these NLP tasks versus traditional or classical NLP?”
09:37 “Can mall be used with a local LLM?”
17:32 “What kind of laptop specs should I realistically have to make good use of these models?”
22:12 “Are you limited to three output options?”
22:55 “Can mall return the prediction probabilities?”
24:14 “What are a rule of thumb set of specs for a machine so local LLMs are practically feasible?”
24:47 “Would that be in the additional prompt area where you’re defining things?”
25:04 “You could use the vitals package to compare models, right?”
25:24 “Can we use LM Studio instead of Ollama?”
28:35 “How do you iterate and validate the model?”
36:39 “Why use paste if it is all text?”
37:31 “Are these recent tweets (from X) or older ones from actual Twitter?”
40:23 “Is there a playlist for the Data Science Labs on YouTube?”
46:11 “Does that mean that the python version does not work with pandas?”
50:14 “Where is this data set from?”

Edgar Ruiz, Simon Couch

How to deploy Shiny apps in 2026 | Alex Chisholm | Data Science Lab

The Data Science Lab is a live weekly call. Register at pos.it/dslab! Discord invites go out each week on live calls. We’d love to have you!

The Lab is an open, messy space for learning and asking questions. Think of it like pair coding with a friend or two. Learn something new, and share what you know to help others grow.

On this call, Libby Heeren is joined by Posit product manager Alex Chisholm as he walks through the evolution of Shiny app deployment over the years, how to deploy Shiny apps in the modern era, and peeks into Posit’s roadmap for future development. Do you call it “deployment” or “publishing” when it comes to Shiny apps? 🤔

This is a super friendly and conversational space, and being there live in the Discord chat can’t be beat!! We hope you get to join us sometime soon.

Hosting crew from Posit: Libby Heeren, Isabella Velasquez, Daniel Chen, Alex Chisholm

Alex Chisholm’s LinkedIn: http://www.linkedin.com/in/chisholm1

Resources from the hosts and chat:

Posit Connect Cloud for deploying Shiny apps in the modern era: https://connect.posit.cloud/
Install Positron: https://positron.posit.co/
Simon Couch’s blog post on local LLMs not being good enough yet: https://www.simonpcouch.com/blog/2025-12-04-local-agents/
Blue-Green Shiny App Deployments using Posit Connect posit::conf(2025) talk by Ryszard Szymański: https://youtu.be/QEEGLWj0nas
Digital Ocean: https://www.digitalocean.com/
Ollama local LLM: https://ollama.com/
py-sidebot app template: https://shiny.posit.co/py/templates/sidebot/
querychat app template: https://shiny.posit.co/py/templates/querychat/
Dan Chen mentioned Render in the chat as an alternative to Digital Ocean: https://render.com/
Alex Chisholm’s AB testing GitHub repo example: https://github.com/alex-chisholm/shiny-r-abtesting
Edward in the chat shared a GitHub repo for using GitHub actions to execute remote SSH commands: https://github.com/appleboy/ssh-action
Abu in the chat shared blue-green vs. canary deployments: https://octopus.com/devops/software-deployments/blue-green-vs-canary-deployments/
Frank in the chat mentioned Simon’s blog on using local LLMs with the chores package: https://www.simonpcouch.com/blog/2025-12-10-chores-0-3-0/

► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu

Thanks for learning with us!

Timestamps
00:00 Introduction
03:03 Meaningful applications and value creation
05:31 The evolution of Shinyapps.io and Posit Connect
08:12 DigitalOcean and Droplets
09:36 DigitalOcean vs. commercial cloud providers
11:48 Comparisons: DigitalOcean, Azure, and AWS
14:47 Replicating local environments with Docker
16:51 The open-source Shiny Server
18:20 Use case: University of Illinois CITL
20:02 Key considerations for deployment decisions
21:53 GitHub Actions and version control
23:31 Addressing single points of failure and maintainability
24:38 Posit Connect Cloud features and portfolio
26:01 Beyond Shiny: Quarto, Streamlit, and Dash
27:07 Handling secrets and database credentials
28:56 Custom vanity links vs. UUIDs
30:04 Blue-Green deployment strategies
31:55 “Is it easy to set up a developer workflow?”
34:46 Guardrails for AI powered apps and token usage
37:32 Small language models and Ollama
38:29 Sidebot AI demo and LLM integration
39:41 Understanding manifest.json and dependencies
45:00 Automatic publish on GitHub push
46:51 The future of Shinyapps.io and migration
48:33 “Did you just build a custom agent for that specific dashboard?”
51:43 Publishing from RStudio IDE to Connect Cloud
54:16 Preview: Inspecting website APIs for data harvesting

Simon Couch

Simon Couch: Fair machine learning

Cascadia R Conf 2024
Regular talk, 10:25-10:40

In recent years, high-profile analyses have called attention to many contexts where the use of machine learning deepened inequities in our communities. A machine learning model resulted in wealthy homeowners being taxed at a significantly lower rate than poorer homeowners; a model used in criminal sentencing disproportionately predicted black defendants would commit a crime in the future compared to white defendants; a recruiting and hiring model penalized feminine-coded words—like the names of historically women’s colleges—when evaluating résumés. In late 2022, a group of Posit employees across teams, roles, and technical backgrounds formed a reading group to engage with literature on machine learning fairness, a research field that aims to define what it means for a statistical model to act unfairly and take measures to address that unfairness. We then designed functionality and resources to help data scientists measure and critique the ways in which the machine learning models they’ve built might disparately impact people affected by that model. This talk will introduce the research field of machine learning fairness and demonstrate a fairness-oriented analysis of a model with tidymodels, a framework for machine learning in R.
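As a rough illustration of what measuring disparate impact can look like, the sketch below computes one commonly used fairness metric, the demographic parity difference: the gap between groups in the rate of positive predictions. It is written in plain Python rather than tidymodels, the group labels and predictions are made-up toy data, and the function names are hypothetical. It is not the analysis shown in the talk.

```python
from collections import defaultdict
from typing import Dict, Sequence


def positive_rate_by_group(
    groups: Sequence[str], predictions: Sequence[int]
) -> Dict[str, float]:
    """Share of positive (1) predictions within each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for group, prediction in zip(groups, predictions):
        counts[group][0] += int(prediction == 1)
        counts[group][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


def demographic_parity_difference(
    groups: Sequence[str], predictions: Sequence[int]
) -> float:
    """Largest gap in positive-prediction rate between any two groups."""
    rates = positive_rate_by_group(groups, predictions)
    return max(rates.values()) - min(rates.values())


# Toy data: a model's binary predictions for people in two groups.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = positive_rate_by_group(groups, predictions)
gap = demographic_parity_difference(groups, predictions)
print(rates)  # {'a': 0.75, 'b': 0.25}
print(gap)    # 0.5
```

A gap of 0 would mean both groups receive positive predictions at the same rate. Whether that is the right notion of fairness for a given model is exactly the kind of question the fairness literature debates, and other metrics can give conflicting answers.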

Pronouns: he/him
Chicago, IL

Simon Couch is a software engineer at Posit PBC (formerly RStudio) where he works on open source statistical software. With an academic background in statistics and sociology, Simon believes that principled tooling has a profound impact on our ability to think rigorously about data. He authors and maintains a number of R packages and blogs about the process at simonpcouch.com.

Simon Couch

How to train, evaluate, and deploy a machine learning workflow with tidymodels & Posit Team

Helpful resources:
GitHub: https://github.com/simonpcouch/mutagen
Follow-up Q&A Session: https://youtube.com/live/vwBVOBQfc_U
If you want to book a call with our team to chat more about Posit products: pos.it/chat-with-us
Don’t want to meet, but curious who else on your team is using Posit? pos.it/connect-us
Blog post on tidymodels + Posit Connect: https://posit.co/blog/pharmaceutical-machine-learning-with-tidymodels-and-posit-connect/
Tidy Modeling with R book: https://www.tmwr.org/

Timestamps:
1:44 - Three steps for developing a machine learning model
3:35 - What is a machine learning model?
7:02 - Overview of machine learning with Posit Team
7:36 - Step 1: Understand and clean data
11:05 - Step 2: Train and evaluate models (why you might be interested in using tidymodels)
23:02 - Step 3: Deploying a machine learning model from Posit Workbench to Posit Connect
30:14 - Summary
31:21 - Helpful resources

Machine learning models are all around us, from Netflix movie recommendations to Zillow property value estimates to email spam filters.

As these models play an increasingly large role in our personal and professional lives, understanding and embracing them has never been more important; machine learning helps us make better, data-driven decisions.

The tidymodels framework is a powerful set of tools for building—and getting value out of—machine learning models with R.

Data scientists use tidymodels to:

Gain access to a wide variety of machine learning methods
Guard against common mistakes
Easily deploy models through tidymodels’ integration with vetiver

Join Simon Couch from the tidyverse team on Wednesday, October 25th at 11am ET as he walks through an end-to-end machine learning workflow with Posit Team.

No registration is required to attend - simply add it to your calendar using this link: pos.it/team-demo

Simon Couch

posit::conf(2023) Workshop: Introduction to tidymodels

Register now: http://pos.it/conf
Instructors: Hannah Frick, Simon Couch, Emil Hvitfeldt
Workshop Duration: 1-Day Workshop

This workshop is for you if you:
• have intermediate R knowledge, experience with tidyverse packages, and either of the R pipes
• can read data into R, transform and reshape data, and make a wide variety of graphs
• have had some exposure to basic statistical concepts such as linear models, random forests, etc.

Intermediate or expert familiarity with modeling or machine learning is not required.

This workshop will teach you core tidymodels packages and their uses: data splitting/resampling with rsample, model fitting with parsnip, measuring model performance with yardstick, and basic pre-processing with recipes. Time permitting, you’ll be introduced to model optimization using the tune package. You’ll learn tidymodels syntax as well as the process of predictive modeling for tabular data.

Emil Hvitfeldt, Hannah Frick, Simon Couch


Simon Couch | tidymodels/stacks: A Grammar for Stacked Ensemble Modeling | RStudio

Full title: tidymodels/stacks, Or, In Preparation for Pesto: A Grammar for Stacked Ensemble Modeling

Through a community survey conducted over the summer, the RStudio tidymodels team learned that users felt the #1 priority for future development in the tidymodels package ecosystem should be ensembling, a statistical modeling technique involving the synthesis of multiple learning algorithms to improve predictive performance. This December, we were delighted to announce the initial release of stacks, a package for tidymodels-aligned ensembling. A particularly statistically-involved pesto recipe will help us get a sense for how the package works and how it advances the tidymodels package ecosystem as a whole.

About Simon: Simon Couch is an R developer and statistics student at Reed College, where he is entering the final semester of his undergraduate degree. He co-authors and maintains R packages including broom, infer, and stacks, leads trainings and workshops as an RStudio-certified tidyverse trainer, and conducts research in algorithmic data privacy. He interned on the RStudio tidymodels team in summer 2020, and is currently applying to doctoral programs in statistics.

Simon Couch