rstudio
RStudio is an integrated development environment (IDE) for R that provides a complete workbench for R programming. It consolidates all essential R development tools—console, source editor, plots, workspace, help, and history—into a single customizable interface.
The IDE features syntax highlighting, code completion, and direct code execution from the editor. It includes full support for authoring Sweave and TeX documents. RStudio runs on Windows, Mac, and Linux, and can be deployed as a server to allow multiple users to access it through a web browser.
Resources featuring rstudio
Regression models still rule in P&C Insurance | Jim Weiss | Data Science Hangout
ADD THE DATA SCIENCE HANGOUT TO YOUR CALENDAR HERE: https://pos.it/dsh - All are welcome! We’d love to see you!
This week’s guest was Jim Weiss, Chief Risk Officer, commercial and executive at Crum and Forster!
Some topics covered in this week’s Hangout were the use of regression modeling (have you ever heard of the Tweedie distribution?) in property and casualty insurance, handling changing model results and communicating them to business stakeholders, using GenAI and “co-opetition” to identify and prevent fraudulent claims, and identifying and managing bias or confounding effects in pricing models.
Resources mentioned in the video and chat:
- The Once and Future C&F → https://www.cfins.com/the-once-and-future-cf-landing/
- Tweedie distribution → https://en.wikipedia.org/wiki/Tweedie_distribution
- Statistical Rethinking Lectures Playlist → https://www.youtube.com/playlist?list=PLDcUM9US4XdNOlqSyhe38US8mFgmqzI14
- Considerations for Managing Potential Bias in Pricing Models → https://eforum.casact.org/article/91188-considerations-for-managing-potential-bias-in-pricing-models
- Pointblank for data validation (Python) → https://posit-dev.github.io/pointblank/
- Pointblank (R) → https://rstudio.github.io/pointblank/
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here:
- Website: https://www.posit.co
- Hangout: https://pos.it/dsh
- The Lab: https://pos.it/dslab
- LinkedIn: https://www.linkedin.com/company/posit-software
- Bluesky: https://bsky.app/profile/posit.co
Thanks for hanging out with us!
Timestamps:
- 00:00 Introduction
- 02:16 “Who is Crum and Forster? What do you do? What types of problems does your team help them solve?”
- 04:14 “What kind of data types would you see in your day to day or your team would see in your day to day? And what is an example of a problem that you feel like you’ve solved lately or that you’ve been working on lately?”
- 10:14 “What types of regression do you use?”
- 13:35 “In your insurance modeling career, what is the most unusual or unexpected variable that has contributed to one of your models?”
- 18:48 “How do you handle it when you see that the results are significantly different across models?”
- 21:45 “Are you Team Bayesian or Team Frequentist when it comes to your statistics?”
- 27:13 “My health care organization is interested in identifying fraudulent claims. Currently, they’re looking at Excel spreadsheets, same time, different person. Do you have any advice on a better way to guide them to automation?”
- 30:51 “What software do you use to do your job?”
- 35:40 “Do you ever use instrumental variables in your models?”
- 47:47 “Do you have any career advice for us? Is there something you wish that you had known when you were first entering the industry?”
- 50:38 “How much do you and your team use AI to help you along?”
- 52:56 “How does your team span expertise in such varied fields?”
Data analysis with Posit AI-assistants | Sara Altman & Simon Couch | Data Science Lab
The Data Science Lab is a live weekly call. Register at pos.it/dslab! Discord invites go out each week on live calls. We’d love to have you!
The Lab is an open, messy space for learning and asking questions. Think of it like pair coding with a friend or two. Learn something new, and share what you know to help others grow.
On this call, Libby Heeren is joined by Sara Altman who walks through using Posit’s AI assistants to analyze data, including a sneak peek at Posit Assistant, and Simon Couch drops by to give us a demo of the reviewer package! Together, Sara and Simon author the Posit AI Newsletter, the best place to stay up-to-date with all the cool tools and advice on staying an informed and level-headed AI user.
Hosting crew from Posit: Libby Heeren, Isabella Velasquez, Sara Altman, Simon Couch
- Sara’s Bluesky: https://bsky.app/profile/sara-altman.bsky.social
- Sara’s LinkedIn: https://www.linkedin.com/in/sarakaltman/
- Sara’s GitHub: https://github.com/skaltman
- Posit AI Newsletter by Sara and Simon: https://posit.co/blog/?category=roundups
Resources from the hosts and chat:
- Positron IDE → https://positron.posit.co/
- Databot Extension → https://positron.posit.co/databot.html
- Getting started with Positron Assistant → https://positron.posit.co/assistant-getting-started.html
- Posit Assistant (Private Beta) → https://posit-ai-beta.share.connect.posit.cloud/
- Reviewer Package (by Simon Couch) → https://github.com/simonpcouch/reviewer
- ellmer Package → https://elmer.tidyverse.org/
- chatlas Package → https://github.com/posit-dev/chatlas
- Read the Posit AI Newsletter → https://posit.co/blog/?category=roundups
- Sign up to get the Posit AI Newsletter → http://pos.it/ai-news
- Simon’s blog post about local LLMs not quite being ready for primetime → https://posit.co/blog/local-models-are-not-there-yet/
- Join the waitlist for Posit AI in RStudio → https://posit.co/products/ai/
- Posit AI Known Issues & FAQs → https://posit-ai-beta.share.connect.posit.cloud/#frequently-asked-questions-faqs
- Blog post from Simon and Sara about Privacy and LLMs → https://posit.co/blog/trust-llm-tools/
- DS Lab YouTube playlist → https://youtube.com/playlist?list=PL9HYL-VRX0oSeWeMEGQt0id7adYQXebhT&si=7tmU6EAJpO5S7GBh
Thanks for learning with us!
Timestamps:
- 00:00 Introduction
- 07:23 “Would you mind real quick just briefly explaining the differences between Positron Assistant and Databot?”
- 15:01 “Is there any way to configure reasoning efforts when signing in with GitHub Copilot?”
- 15:49 “Does Databot already support other providers beyond Cloud?”
- 20:36 “What are the cases with monetary penalty in the console output?”
- 22:14 “Do you happen to know if the column names of the dataset are very, very messy?”
- 23:18 “Can you add skills to Databot?”
- 26:36 “This code isn’t being saved anywhere. So where does it go?”
- 27:38 “Is there a way to know what all the slash commands are?”
- 28:51 Requesting Databot to use the namespace operator
- 33:58 “Is there a way to search within that Databot pane?”
- 39:34 “Have you noticed any time differences with how quickly things run in RStudio versus Positron?”
- 40:33 “What happens if you open that URL that it mentions at the bottom in your browser?”
- 40:50 Clarifying the difference between Posit Assistant and Positron Assistant
- 43:18 “What is the typical token burn rate?”
- 53:31 “Is this on CRAN and working in both Positron and RStudio?”

Positron workflows that make life easier | Andrew Heiss | Data Science Lab
On this call, Libby Heeren is joined by Andrew Heiss as he demonstrates some tips and tricks about his personal workflow and tools that he actually uses to make life easier in Positron. This is the ultimate list of data life hacks to make your workflow soooo much nicer. Check out Andrew’s blog post here to follow along with the tools he mentions: https://andhs.co/dsl
Hosting crew from Posit: Libby Heeren, Isabella Velasquez
- Andrew’s Website: https://www.andrewheiss.com/
- Andrew’s Bluesky: https://bsky.app/profile/andrew.heiss.phd
- Andrew’s LinkedIn: https://www.linkedin.com/in/andrewheiss/
Resources from the hosts and chat:
- Andrew’s blog post containing links to all of the tools he mentions: https://www.andrewheiss.com/blog/2026/01/13/dsl-positron-workflow/
- Open VSX Registry: https://open-vsx.org/
- DataPasta: https://milesmcbain.github.io/datapasta/
- Pastum (like datapasta for Positron): https://open-vsx.org/extension/atsyplenkov/pastum
- Positron Project docs: https://positron.posit.co/migrate-rstudio-rproj.html
- Garrick’s data science extension bundle package: https://github.com/gadenbuie/positron-plus-1-e
- Emil’s keyboard shortcut blog post: https://emilhvitfeldt.com/post/positron-key-bindings/
- Native Tabs for Mac: https://lucasprag.com/posts/underrated-vscode-feature-native-tabs/
- Andrew’s posit::conf(2025) Talk: https://youtu.be/UCloM4GcfVY
- Arc browser that Andrew is using: https://arc.net/
- Andrew’s YAML headers he sets up using espanso: https://github.com/andrewheiss/espanso/blob/52da6c43c6d1ebaf3231770b1b66971d1dfb374a/match/markdown-pandoc-quarto.yml#L118
Timestamps:
- 00:00 Introduction
- 01:50 Switching to Positron full-time
- 03:19 Extensions in Positron
- 04:44 How to evaluate if an extension is safe
- 07:05 Air extension (auto-formatting)
- 08:21 Better Comments extension
- 10:15 Moving the Activity Bar
- 12:20 Pastum extension (DataPasta equivalent)
- 14:26 Rainbow CSV extension
- 15:34 Spell Right extension
- 17:34 Managing projects in Positron vs RStudio
- 20:18 “Do you know if there are extensions… that will conditionally format cells?”
- 20:40 “Do you explicitly add a dot here file?”
- 21:44 Project Manager extension
- 25:34 “How did you discover all of these?”
- 26:38 “How is GitHub integrated into Positron?”
- 29:10 Peacock extension
- 31:16 The Connections pane
- 36:38 “When I change the Peacock color, it’s changing colors for everything.”
- 37:59 “Does he use any DuckDB extensions?”
- 39:05 Raycast
- 43:35 Raycast scripts
- 44:30 NotebookLM
- 45:31 “Is there a hack to manage a repo that is both a project and an R package?”
- 48:00 Espanso
- 53:15 “Is Raycast a replacement for Spotlight and Bartender?”
- 54:00 “Is there an easy way to see all of the shortcuts?”
How to use {pointblank} to understand, validate, and document your data
How to use {pointblank} to understand, validate, and document your data - Rich Iannone
Abstract: This workshop will focus on the data quality and data documentation workflows that the pointblank package makes possible. We will use functions that allow us to: (1) quickly understand a new dataset (2) validate tabular data using rules that are based on our understanding of the data (3) fully document a table by describing its variables and other important details. The pointblank package was created to scale from small validation problems (“Let’s make certain this table fits my expectations before moving on”) to very large (“Let’s validate these 35 database tables every day and ensure data quality is maintained”) and we’ll delve into all sorts of data quality scenarios so you’ll be comfortable using this package in your organization. Data documentation is seemingly and unfortunately less common in organizations (maybe even less than the practice of data validation). We’ll learn all about how this doesn’t have to be a tedious chore. The pointblank package allows you to create informative and beautiful data documentation that will help others understand what’s in all those tables that are so vital to an organization.
Resources mentioned in the workshop:
- Workshop GitHub repository: https://github.com/rich-iannone/pointblank-workshop
- pointblank documentation: https://rstudio.github.io/pointblank/
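The validation step in the abstract above (rules derived from what you know about the data, applied to a table, with failures counted per rule) can be sketched in plain Python. This is only an illustration of the idea under stated assumptions: it does not use the pointblank API, and the `Rule` class, the `validate` function, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass
class Rule:
    """A named expectation that every row of the table should satisfy (hypothetical helper)."""
    name: str
    check: Callable[[Row], bool]


def validate(rows: list[Row], rules: list[Rule]) -> dict[str, int]:
    """Return the number of failing rows for each rule."""
    return {
        rule.name: sum(1 for row in rows if not rule.check(row))
        for rule in rules
    }


# Made-up sample table, used only to exercise the rules.
rows = [
    {"id": 1, "amount": 1200.0},
    {"id": 2, "amount": -50.0},
    {"id": None, "amount": 300.0},
]

rules = [
    Rule("id is not null", lambda r: r["id"] is not None),
    Rule("amount > 0", lambda r: r["amount"] > 0),
]

report = validate(rows, rules)
for name, failures in report.items():
    print(f"{name}: {failures} failing row(s)")
```

Keeping each rule as a named check makes the report readable by people who did not write the rules, which is the same motivation the abstract gives for pairing validation with documentation.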

What even is dbt? An Analytics engineer explains | Laurie Merrell & Michael Chow | Data Science Lab
On this call, Libby Heeren is joined by Jarvis Innovations Lead Analytics Engineer Laurie Merrell and Posit Principal Software Engineer Michael Chow as they walk us through a beginner dbt project and let us ask as many questions as we like (and we do, we ask all the questions, including, WHAT EVEN IS dbt??). This is a super friendly, MESSY, collaborative, and curious peek at dbt. It’s a tool that’s often mysterious to data scientists, and it’s a big enough framework that it can feel tough to get started with. Walking through the basics makes it way easier to get into!
Hosting crew from Posit: Libby Heeren, Isabella Velasquez
Laurie’s LinkedIn: https://www.linkedin.com/in/laurie-merrell/
Michael’s socials and URLs:
- LinkedIn: https://www.linkedin.com/in/michael-a-chow/
- Bluesky: https://bsky.app/profile/mchow.com
- GitHub: https://github.com/machow
Resources from the hosts and chat:
- Michael Chow’s talk about dbt at the Coalesce Conference in 2022: https://www.youtube.com/watch?v=EYdb1x1cO9U
- Beginner dbt project Michael is using: https://github.com/dbt-labs/jaffle_shop_duckdb
- Laurie’s Coalesce talk with Ian and Jenna: https://www.youtube.com/watch?v=6aX7tAfMmIM
- Link to installation page for the DuckDB CLI: https://duckdb.org/install/?platform=macos&environment=cli
- “Why is dbt so important” shared by Jenna in the chat: https://highgrowthengineering.substack.com/p/why-is-dbt-so-important-
- dbtplyr: https://hub.getdbt.com/emilyriederer/dbtplyr/latest/
- Parquet: https://parquet.apache.org/
- From stored procedures to dbt: A modern migration playbook: https://www.getdbt.com/blog/stored-procedures-dbt-migration-playbook
- How to structure our dbt projects: https://docs.getdbt.com/best-practices/how-we-structure/1-guide-overview
- Jenna Jordan’s blog on dbt mesh: https://jennajordan.me/blog/data-mesh-dbt
Timestamps:
- 00:00 Introduction
- 01:09 Guest introductions: Michael Chow and Laurie Merrell
- 04:15 Overview of today’s session
- 05:51 Setting up the GitHub Codespace
- 07:00 The data science workflow vs. organizational needs
- 10:06 Why dbt is hard to learn in the abstract
- 13:34 “Could we back up and explain what dbt is again?”
- 19:12 Running ‘dbt build’
- 20:00 Inspecting the database with DuckDB CLI
- 26:21 “Does dbt have concurrency or dependency capabilities?”
- 27:37 Understanding the ‘ref’ macro
- 29:52 “Is dbt an orchestrator?”
- 31:14 “Starting a project from scratch with just SQL?”
- 32:04 “How is this better than writing Python scripts?”
- 35:46 “Is data source detection dynamic with dbt?”
- 38:36 Generating and serving dbt docs
- 46:51 “Is dbt an IDE like RStudio, but for SQL?”
- 52:32 Branching and development environments
- 53:57 “Where would you begin on a brand new project?”
- 56:38 “How would you validate dependencies and downstream impacts?”
- 57:48 Defining a view versus a table

How to deploy Shiny apps in 2026 | Alex Chisholm | Data Science Lab
On this call, Libby Heeren is joined by Posit product manager Alex Chisholm as he walks through the evolution of Shiny app deployment over the years, how to deploy Shiny apps in the modern era, and peeks into Posit’s roadmap for future development. Do you call it “deployment” or “publishing” when it comes to Shiny apps? 🤔
This is a super friendly and conversational space, and being there live in the Discord chat can’t be beat!! We hope you get to join us sometime soon.
Hosting crew from Posit: Libby Heeren, Isabella Velasquez, Daniel Chen, Alex Chisholm
Alex Chisholm’s LinkedIn: http://www.linkedin.com/in/chisholm1
Resources from the hosts and chat:
- Posit Connect Cloud for deploying Shiny apps in the modern era: https://connect.posit.cloud/
- Install Positron: https://positron.posit.co/
- Simon Couch’s blog post on local LLMs not being good enough yet: https://www.simonpcouch.com/blog/2025-12-04-local-agents/
- Blue-Green Shiny App Deployments using Posit Connect, posit::conf(2025) talk by Ryszard Szymański: https://youtu.be/QEEGLWj0nas
- DigitalOcean: https://www.digitalocean.com/
- Ollama local LLM: https://ollama.com/
- py-sidebot app template: https://shiny.posit.co/py/templates/sidebot/
- querychat app template: https://shiny.posit.co/py/templates/querychat/
- Dan Chen mentioned Render in the chat as an alternative to DigitalOcean: https://render.com/
- Alex Chisholm’s A/B testing GitHub repo example: https://github.com/alex-chisholm/shiny-r-abtesting
- Edward in the chat shared a GitHub repo for using GitHub Actions to execute remote SSH commands: https://github.com/appleboy/ssh-action
- Abu in the chat shared blue-green vs. canary deployments: https://octopus.com/devops/software-deployments/blue-green-vs-canary-deployments/
- Frank in the chat mentioned Simon’s blog on using local LLMs with the chores package: https://www.simonpcouch.com/blog/2025-12-10-chores-0-3-0/
Timestamps:
- 00:00 Introduction
- 03:03 Meaningful applications and value creation
- 05:31 The evolution of Shinyapps.io and Posit Connect
- 08:12 DigitalOcean and Droplets
- 09:36 DigitalOcean vs. commercial cloud providers
- 11:48 Comparisons: DigitalOcean, Azure, and AWS
- 14:47 Replicating local environments with Docker
- 16:51 The open-source Shiny Server
- 18:20 Use case: University of Illinois CITL
- 20:02 Key considerations for deployment decisions
- 21:53 GitHub Actions and version control
- 23:31 Addressing single points of failure and maintainability
- 24:38 Posit Connect Cloud features and portfolio
- 26:01 Beyond Shiny: Quarto, Streamlit, and Dash
- 27:07 Handling secrets and database credentials
- 28:56 Custom vanity links vs. UUIDs
- 30:04 Blue-Green deployment strategies
- 31:55 “Is it easy to set up a developer workflow?”
- 34:46 Guardrails for AI powered apps and token usage
- 37:32 Small language models and Ollama
- 38:29 Sidebot AI demo and LLM integration
- 39:41 Understanding manifest.json and dependencies
- 45:00 Automatic publish on GitHub push
- 46:51 The future of Shinyapps.io and migration
- 48:33 “Did you just build a custom agent for that specific dashboard?”
- 51:43 Publishing from RStudio IDE to Connect Cloud
- 54:16 Preview: Inspecting website APIs for data harvesting

Integrating Shiny with Epic EHR | Matt Maloney | Data Science Hangout
We were recently joined by Matt Maloney, Director of Applied AI and Data Science at City of Hope, to chat about applying data science to cancer care operations, integrating open source data science tools like Shiny with Electronic Health Records (like Epic), and the evolving governance of generative AI in healthcare.
In this Hangout, we explore the technical and operational strategies behind integrating custom data science applications directly into clinical workflows. Matt discusses how his team moves beyond standalone tools by embedding Shiny apps and other solutions into Epic, allowing medical coders and providers to access predictions and summaries without leaving their primary software environment-of-choice. He also mentions the “build vs. buy” decision-making process as vendors release their own AI solutions, emphasizing the importance of validating external models against their specific patient population.
Resources mentioned in the video and Zoom chat:
- City of Hope → https://www.cityofhope.org
- Unity Health Toronto Customer Story → https://posit.co/about/customer-stories/unity-health-toronto/
- pointblank (Data Validation package) → https://rstudio.github.io/pointblank/
If you didn’t join live, one great discussion you missed from the Zoom chat was about where data science teams sit within community members’ organizations and whether they like it or not, specifically the pros and cons of being housed within IT versus embedded inside business units. Participants debated access to infrastructure versus proximity to business stakeholders, with several sharing their own experiences of shifting between these departments (or between companies with different structures). Let us know below if you’d like to hear more about this topic!
Timestamps:
- 00:00 Introduction
- 03:37 “What does the data science function at City of Hope help with?”
- 08:52 “Tell us a little bit more about how you’re integrating Posit with Epic”
- 16:08 “How do you handle the needs of privacy with the push to adopt AI?”
- 18:40 “How do you manage to stay abreast of technical advancements?”
- 22:45 “At what point do you hand off your data work to the software engineering team?”
- 27:23 “How much has development that involves LLMs and generative AI taken hold?”
- 30:38 “Does your team evaluate a lot of the things that Epic might be throwing your way?”
- 34:41 “How does Epic pass an encounter number or a patient ID to Posit Connect?”
- 35:57 “How does your team handle these nuanced pieces of clinical information?”
- 40:29 “Do the administrators appreciate the time that it takes to do things?”
- 44:22 “What happens in the academic division?”
- 46:10 “Do you have a piece of career advice for us?”
Exploring Positron settings | Isabel Zimmerman & Davis Vaughan | Data Science Lab
On this call, Libby Heeren is joined by Posit engineers Isabel Zimmerman and Davis Vaughan as they share some of their favorite settings in Positron, a super customizable data science IDE. Come laugh with us as we can’t seem to figure out that VS Code calls rainbow parentheses “bracket pair colorization.”
Hosting crew from Posit: Libby Heeren, Isabella Velasquez, Daniel Chen, Isabel Zimmerman, Davis Vaughan
Resources from the hosts and chat:
- Install Positron: https://positron.posit.co/
- Positron docs on keyboard shortcuts: https://positron.posit.co/keyboard-shortcuts.html
- Nathan Jeffery’s “click to open a .RDS file” keybinding: https://nathan-jeffery.netlify.app/blog/2025-08-26-read-rds-positron/
- Positron R pipe setting (paste in browser and it’ll open in Positron): positron://settings/positron.r.pipe
- One of Dan Chen’s faves, the native tab feature in VS Code + Positron: https://lucasprag.com/posts/underrated-vscode-feature-native-tabs/
- The list of RStudio keybindings that you get when you turn on RStudio keybindings in Positron: https://positron.posit.co/migrate-rstudio-keybindings.html
- Indent rainbow extension: https://open-vsx.org/extension/oderwat/indent-rainbow
- Rainbow brackets setting (paste in browser and it’ll open in Positron): positron://settings/editor.bracketPairColorization.enabled
- Setting hierarchy (User vs Workspace settings) in Positron: https://code.visualstudio.com/docs/configure/settings#_settings-precedence
- Rainbow CSV extension (not by Posit): https://marketplace.visualstudio.com/items?itemName=mechatroner.rainbow-csv
- Positron +1e, an extension pack for dev and data science, by Garrick Aden-Buie: https://open-vsx.org/extension/grrrck/positron-plus-1-e
- Publishing from VS Code or Positron: https://docs.posit.co/connect/user/publishing-positron-vscode/
- Posit Connect Cloud plans: https://connect.posit.cloud/plans
- Enter Folder extension that Libby mentions: https://open-vsx.org/extension/xiangda/enter-folder
- Catppuccin themes (shared by Rory Lawless, and now some of Libby’s favorites!): https://open-vsx.org/extension/Catppuccin/catppuccin-vsc
Timestamps:
- 00:00 Introduction
- 00:42 Guest Introductions: Isabel and Davis
- 02:41 Positron Settings overview
- 04:11 How to enable “Format on Save”
- 04:34 “How do I open settings in JSON or UI?”
- 05:10 Auto Save on focus change
- 08:26 Enabling RStudio key bindings
- 09:28 “Why doesn’t the cursor move with code edits?”
- 12:18 User vs. Workspace settings
- 14:34 Creating and using Profiles
- 16:13 “Can I use the magrittr pipe with Control+Shift+M?”
- 17:23 Searching and managing keyboard shortcuts
- 19:42 Creating custom code snippets
- 21:31 The Indent Rainbow extension
- 24:04 Enabling rainbow parentheses/brackets
- 25:08 Managing Python and R interpreters
- 26:32 Rearranging and hiding UI panes
- 28:04 Rainbow CSV and favorite extensions
- 29:26 Using the Enter Folder extension
- 31:05 Understanding the setting hierarchy
- 32:48 Adding symbols to Quick Open search
- 36:00 “Is there a way to shift focus using keyboard shortcuts?”
- 38:04 Modifying keybindings JSON for specific languages
- 39:20 “How do you find trustworthy extensions?”
- 43:11 “How can I publish to shinyapps.io from Positron?”
- 44:03 Deploying with Posit Publisher and Connect Cloud
- 48:32 Customizing themes with RainGlow extension
- 50:36 “Is there an Import Data Set wizard in Positron?”
- 53:01 Conclusion and community resources



Take Positron to Work with Positron Pro
You’ve downloaded Positron on your desktop and are loving it, but you may still have a few questions about using it at work:
- What happens when my analysis requires more memory than my laptop has?
- How can I bring Positron into my company’s secure environment?
- How can I access data in Databricks or Snowflake from Positron?
Nick Rohrbaugh, Senior Product Marketing Manager at Posit, shares how Positron Pro, available exclusively within Posit Workbench, transforms Positron from a powerful desktop editor into a fully governed, enterprise-ready IDE.
In this webinar, you will learn how:
- Data teams get immediate access to scalable, server-side compute and secure, one-click data authorization.
- IT leaders can centralize, secure, and manage Positron alongside RStudio, Jupyter, and VS Code, all from a single platform.
Helpful links:
- Positron: https://posit.co/products/ide/positron/
- Posit Workbench: https://posit.co/products/enterprise/workbench/
Keeping LLMs in Their Lane: Focused AI for Data Science and Research
From R+AI 2025, hosted by R Consortium
Keynote
LLMs are powerful, flexible, easy-to-use… and often wrong. This is a dangerous combination, especially for data analysis and scientific research, where correctness and reproducibility are core requirements. Fortunately, it turns out that by carefully applying LLMs to narrower use cases, we can turn them into surprisingly reliable assistants that accelerate and enhance, rather than undermine, scientific work. This is not just theory—I’ll showcase working examples of seamlessly integrating LLMs into analytic workflows, helping data scientists build interactive, intelligent applications without needing to be web developers. You’ll see firsthand how keeping LLMs focused lets us leverage their “intelligence” in a way that’s practical, rigorous, and reproducible.
Bio
Joe Cheng is the CTO and first employee at Posit, PBC (formerly known as RStudio), where he helped create the RStudio IDE, Shiny web framework, and Databot agent for exploratory data analysis.
R Consortium Resources
- Main R Consortium Site: https://www.r-consortium.org/
- R+AI website: https://rconsortium.github.io/RplusAI_website
- R Consortium webinars: https://r-consortium.org/webinars/webinars.html
- Blog: https://r-consortium.org/blog/
- LinkedIn: https://www.linkedin.com/company/r-consortium/

AI-Powered Data Science in Positron
We’re so excited to introduce Positron, a free, next-generation data science IDE that makes it easy to work in both R and Python. Positron builds on our years of experience developing RStudio and is a fork of VS Code, designed specifically for data work. This means you get a modern coding interface with features tailored for data science, like a built-in data explorer, AI assistance, interpreter management, and more.
This is the second event in our Positron series, focused on AI-powered data science.
Here at Posit, we strive to create products where AI works with you, not against you. To continue this mission, we are excited to introduce agentic AI capabilities in Positron, our new, free code editor for R and Python. These capabilities are designed from the ground up to follow these principles: they automate the tedious parts of the data science workflow, but always keep you, the expert, in control.
Timestamps:
- 00:00 Introduction
- 00:32 What is Positron? (The Next-Gen Data Science IDE)
- 03:43 Introducing Positron Assistant
- 05:10 Bring your own LLM
- 05:47 Your environment as context
- 07:19 Inline code suggestions
- 07:58 Introducing Databot
- 08:22 The WEAR loop
- 10:15 Demo time
- 10:56 Opening Positron Pro in Posit Team
- 14:29 Opening a new folder in Positron
- 15:22 Cloning a repo from GitHub
- 17:24 Positron icons
- 19:15 Positron search bar and command palette
- 20:18 Changing interpreters and opening a Quarto document
- 21:30 Running code and populating the Variables Pane
- 22:41 Using the Data Explorer
- 25:28 Creating a plot and debugging with Positron Assistant
- 30:46 Editing code using inline code suggestions
- 34:42 Sharing a Quarto document
- 36:12 Opening Databot
- 37:02 Exploring data with Databot
- 43:37 Creating a report using Databot findings
- 45:12 Wrap up
Additional resources:
- GitHub Repo for this Example: https://github.com/posit-dev/positron-ai-workshop
- Getting Started with Positron: Quick Tour: https://posit.co/resources/videos/get-started-with-positron-a-quick-tour-and-community-qa/
- Introducing Databot (Blog Post): https://posit.co/blog/introducing-databot/
- Posit AI Newsletter: https://posit.co/blog/2025-09-26-ai-newsletter/
- Positron Assistant: https://positron.posit.co/assistant.html
Air: A blazingly fast R code formatter - Davis Vaughan, Lionel Henry
In Python, Rust, Go, and many other languages, code formatters are widely loved. They run on every save, on every pull request, and in git pre-commit hooks to ensure code consistently looks its best at all times.
In this talk, you’ll learn about Air, a new R code formatter. Air is extremely fast, capable of formatting individual files so fast that you’ll question if it’s even running, and of formatting entire projects in under a second. Air integrates directly with your favorite IDEs, like Positron, RStudio, and VS Code, and is available on the command line, making it easy to standardize on one tool even for teams using various IDEs.
Once you start using Air, you’ll never worry about code style ever again!
https://www.tidyverse.org/blog/2025/02/air/ https://github.com/posit-dev/air


Lessons from a Broad & Varied Data Science Career | Arcenis Rojas | Data Science Hangout
ADD THE DATA SCIENCE HANGOUT TO YOUR CALENDAR HERE: https://pos.it/dsh - All are welcome! We’d love to see you!
We were recently joined by Arcenis Rojas, Data Scientist at Indeed, to chat about econometrics, public vs private sector data science, navigating a varied career trajectory, AI integration in the hiring sphere, and making friends at conferences.
In this Hangout, Arcenis talked about how his career journey has been wide as opposed to vertically narrow. He shared that this breadth of experience has given him confidence that he can quickly figure out any dataset. He feels it also taught him how to communicate effectively about data to people at different levels and across various domains. He also shared his tech stack at Indeed, including RStudio, Positron, AWS, Snowflake, Quarto for reporting, Shiny for apps, and Posit Connect for deploying them.
An attendee asked about the impacts of AI on the job search space, and Arcenis shared the AI at Work Report (linked below) from the Indeed Hiring Lab. He says, based on research, generative AI is expected to assist many people but only replace small segments of the workforce in the coming 5-10 years, and that entry-level knowledge work is predicted to be the most highly impacted area.
Resources mentioned in the video and zoom chat: Indeed Hiring Lab: AI at Work Report 2025 → https://www.hiringlab.org/2025/09/23/ai-at-work-report-2025-how-genai-is-rewiring-the-dna-of-jobs/ To Explain or to Predict? (Galit Shmueli, 2010) → https://arxiv.org/abs/1101.0891 Announcing the 2025 table and plotnine contests → https://posit.co/blog/announcing-the-2025-table-and-plotnine-contests/
If you didn’t join live, one great discussion you missed from the zoom chat was about the wide variety of data types data scientists work with. Attendees shared that their data included genomics, finance/trading, environmental/natural resources, e-commerce products, and medical/clinical data. What kind of data types do you work with?
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.posit.co Hangout: https://pos.it/dsh LinkedIn: https://www.linkedin.com/company/posit-software Bluesky: https://bsky.app/profile/posit.co
Thanks for hanging out with us!
Timestamps 00:00 Introduction 06:16 “What do you like to do for fun?” 08:51 “What are the unique aspects of financial and economic data science?” 15:07 “What are econometrics?” 16:02 “Is the difference that hard sciences stats is trying to explain what happened where econometrics might be what might happen in the future?” 19:39 “Suggestions for making data friends and going to a conference alone.” 23:26 “Do you see any misconceptions about the job market online, specifically the ATS thing?” 29:52 “How has your varied career trajectory been an advantage or a challenge in data science?” 34:08 “How is the recent hype wave of AI integration manifesting in the hiring sphere?” 40:08 “What are the tools that you use in your job for reporting?” 41:42 “How do you know when it is time to pivot and leave your role because your skills are stagnating?” 45:56 “How would you persuade leadership to use R or Python?” 49:32 “Did you find yourself always trying to use more complex models when simpler ones would serve the audience better?”
Max Kuhn - Measuring LLM Effectiveness
For information on upcoming conferences, visit https://www.dataconf.ai.
Measuring LLM Effectiveness by Max Kuhn
Abstract: How can we quantify how accurately LLMs perform? In late 2024, Anthropic released a preprint of a manuscript about statistically analyzing model evaluations. The concepts are on target, but the statistical tactics have narrow applicability. A simpler statistical framework can quantify LLM performance across many more scenarios and experimental designs. We’ll describe these methods and show an example.
Bio: Max Kuhn is a software engineer at Posit PBC (nee RStudio). He is working on improving R’s modeling capabilities and maintaining about 30 packages, including caret. He was a Senior Director of Nonclinical Statistics at Pfizer Global R&D in Connecticut. He has been applying models in the pharmaceutical and diagnostic industries for over 18 years. Max has a Ph.D. in Biostatistics. He and Kjell Johnson wrote the book Applied Predictive Modeling, which won the Ziegel award from the American Statistical Association, which recognizes the best book reviewed in Technometrics in 2015. He has co-written several other books: Feature Engineering and Selection, Tidy Models with R, and Applied Machine Learning for Tabular Data (in process).
Presented at The New York Data Science & AI Conference Presented by Lander Analytics (August 27, 2025)
Hosted by Lander Analytics (https://www.landeranalytics.com)

The Power of Snowflake and Posit Workbench (Jonathan Regenstein, Snowflake) | posit::conf(2025)
The Power of Snowflake and Posit Workbench: Macroeconomic Data Exploration in the Cloud
Speaker(s): Jonathan Regenstein
Abstract:
In this talk, we will utilize the Posit Workbench Native App to demonstrate how macroeconomic research can be run in the Snowflake cloud, powered by R & RStudio.
Starting with data sourced from the Snowflake marketplace, we will import, transform, visualize, and, finally, model data using the Orbital framework to push tidymodels down to the cloud. This is full-stack, R-driven macroeconomic research in the cloud. posit::conf(2025) Subscribe to posit::conf updates: https://posit.co/about/subscription-management/
10 Years of Data Science Tools…and What Happens Next (Jonathan McPherson) | posit::conf(2025)
10 Years of Data Science Tools… and What Happens Next
Speaker(s): Jonathan McPherson
Abstract:
In this talk, I’ll reflect on a decade of work on RStudio and the principles of tool-building that have led it to become the standard data science environment for R. We’ll talk about how those same principles have guided the development of Positron, a new data science environment from Posit, and how you can apply them to your own tool-building work.
Slides - https://github.com/rstudio/rstudio-conf/blob/main/2025/jonathanmcpherson/10%20Years%20of%20Data%20Science%20Tools.key posit::conf(2025) Subscribe to posit::conf updates: https://posit.co/about/subscription-management/
AI Coding Assistants: Hype, Help, or Hindrance? (Rebecca Barter, Arine) | posit::conf(2025)
AI Coding Assistants: Hype, Help, or Hindrance?
Speaker(s): Rebecca Barter
Abstract:
AI coding assistants like ChatGPT, GitHub Copilot, and Codeium promise to revolutionize our coding workflows—but how useful are they in practice? Are they our new overlords here to take our jobs? Or just a passing gimmick? I think the reality lies somewhere in between, and that understanding these tools is key to staying relevant in today’s rapidly evolving data science ecosystem.
In this talk, I’ll show how I’ve used these AI tools in RStudio, Positron, and VS Code to speed up both my advanced R workflows as well as my learning experience as an intermediate Python programmer, providing examples, pitfalls, and best practices.
Materials - https://rlbarter.github.io/posit-conf-2025/#0 posit::conf(2025) Subscribe to posit::conf updates: https://posit.co/about/subscription-management/
Air - A blazingly fast R code formatter (Davis Vaughan & Lionel Henry, Posit) | posit::conf(2025)
Air - A blazingly fast R code formatter
Speaker(s): Davis Vaughan; Lionel Henry
Abstract:
In Python, Rust, Go, and many other languages, code formatters are widely loved. They run on every save, on every pull request, and in git pre-commit hooks to ensure code consistently looks its best at all times.
In this talk, you’ll learn about Air, a new R code formatter. Air is extremely fast, capable of formatting individual files so fast that you’ll question if it’s even running, and of formatting entire projects in under a second. Air integrates directly with your favorite IDEs, like Positron, RStudio, and VS Code, and is available on the command line, making it easy to standardize on one tool even for teams using various IDEs.
Once you start using Air, you’ll never worry about code style ever again! posit::conf(2025) Subscribe to posit::conf updates: https://posit.co/about/subscription-management/


Approaching Positron from RStudio (Mauro Lepore, Recast) | posit::conf(2025)
Approaching Positron from RStudio Speaker(s): Mauro Lepore
Abstract: Many data science teams that traditionally worked with R and RStudio are now attracting developers with experience in Python and VS Code. Positron is a polyglot IDE supporting both R and Python and incorporating tools from both RStudio and VS Code. However, jumping straight from one familiar IDE to an unfamiliar one can be intimidating, slow down productivity, and impair adoption.
In this talk, I’ll show some tools and techniques you can use in RStudio and VS Code to start your transition to Positron today—with minimal friction—while staying productive in your preferred IDE.
posit::conf(2025) Subscribe to posit::conf updates: https://posit.co/about/subscription-management/
Get the Latest on Posit’s Commercial Products | posit::conf(2025)
Get the Latest on Posit’s Commercial Products
Speaker(s): Kelly O’Briant; Tom Mock; Joe Roberts; Kara Woo; Alex Chisholm; Chetan Thapar; Steve Nolen
Abstract:
Join us for an overview of the latest developments across Posit’s commercial product ecosystem. This session will cover Posit Workbench, Package Manager, Connect, Connect Cloud, and our growing portfolio of managed services including Snowflake and beyond. Hear directly from the product managers and engineers who are building these tools, and get insights into what’s coming next.
0:00 Introduction 3:30 Audited jobs, Positron Pro sessions, and GenAI in Workbench 14:20 Auth and integrations with RStudio Pro sessions in Package Manager 21:00 An intro to Chronicle for Posit Team 25:57 Building container images in Connect 39:50 Organization plans in Connect Cloud 49:00 A Snowflake Native App offering for Connect and Workbench 1:02:00 An intro to Posit Team Dedicated
posit::conf(2025)
Subscribe to posit::conf updates: https://posit.co/about/subscription-management/
What we’re doing to make Quarto fast(er) (Carlos Scheidegger, Posit) | posit::conf(2025)
What we’re doing to make Quarto fast(er) Speaker(s): Carlos Scheidegger
Abstract: Quarto is a powerful system, but its performance leaves much to be desired. In this talk, I’ll go through the things that make Quarto slow, and I will describe the journey I’m taking in 2025 to fix the issues. This is going to be a deeper technical talk on performance analysis, profiling, and will include discussing the custom tooling we’ve had to build to measure performance in a system as complex as Quarto.
Quarto markdown repo: https://github.com/quarto-dev/quarto-markdown
posit::conf(2025) Subscribe to posit::conf updates: https://posit.co/about/subscription-management/

A first look at Positron - Julia Silge
Description Positron is a next generation data science IDE built by the creators of RStudio. It has been available for beta testing for a number of months, and R users may have wondered if they should try it or if it will be a good fit for them. This new IDE is an extensible tool built to facilitate exploratory data analysis, reproducible authoring, and publishing data artifacts, and it is an IDE that supports but is not built only for R. How should an R user think about Positron, compared to the other options out there?
In this talk, learn about how and why Positron is designed the way it is, what will feel familiar or new coming from other IDEs such as RStudio, and when (or if) people who use R should consider giving it a try. You’ll hear about different choices when it comes to defaults and ways of working, such as how to think about your projects or folders and how to manage multiple versions of R. You will also learn about new functionality for R users and package developers that we have never had before, like new approaches for managing R package tests and the ability to customize an IDE using extensions. If you are curious about Positron and how it fits into the R ecosystem, you’ll come away from this talk with more details about its capabilities and more clarity about whether it may be a good choice for you.
Additional Material or Paper
- Visit https://positron.posit.co for documentation and installers
- Find us on GitHub at https://github.com/posit-dev/positron
- Positron is currently available on Posit Workbench in preview

Next-gen data science: a conversation with the Ravit Show and Eric Pité
Eric Pité, SVP of Product & Strategy, sat down with the Ravit Show at Databricks Data + AI Summit to have a conversation about the evolution of open source data science, the Posit and Databricks integration, and the place of AI in modern data science. Some highlights from the conversation are below:
– The evolution from RStudio to Posit: It’s more than just a name change; it’s a signal of how seriously they’re embracing Python while staying true to their R roots and open-source mission.
– Open source meets enterprise: Eric shared how Posit is navigating the balance between community contribution and commercial sustainability, and why that mission still matters in an AI-first world.
– Positron + Databricks: Their strategic partnership is one to watch. Positron is making it easier to do high-quality data science (in R and Python) inside the Databricks ecosystem, with an emphasis on collaboration, reproducibility, and performance.
– AI’s place in modern data science: Eric and Ravit chatted about how Posit sees the role of AI in helping data teams be more productive, without losing rigor or transparency.
– The power of storytelling with code: One of our favorite parts is the emphasis on communication. Building models is not enough; data scientists need to share insights clearly, and Posit is building for that.
Learn more about the Databricks and Posit partnership by exploring the partnership page and resources: https://posit.co/use-cases/databricks/
Posit Conf 2025 Keynote Previews | Kieran Healy & Jonathan McPherson | Data Science Hangout
To join future data science hangouts, add it to your calendar here: https://pos.it/dsh - All are welcome! We’d love to see you! Thursdays at 12PM US Eastern
We were recently joined by upcoming Posit Conf 2025 keynote speakers Kieran Healy, Professor of Sociology at Duke University, and Jonathan McPherson, Software Architect at Posit PBC, to chat about how and why open-source IDEs like RStudio and Positron get made, how to do data visualization for discovery and explanation, what their keynotes are going to be about, and what’s next for Posit’s IDE development, including AI integration.
In this Hangout, Kieran talked about trustworthy data visualization. He highlighted that while data visualization is a powerful way to condense and present information, often creating compelling and authoritative artifacts, phrases like “visual storytelling” can be problematic if they encourage presenting a predetermined narrative not fully supported by data. He emphasized that the trustworthiness of visualizations does not come solely from the techniques used or the software, but from a “web of social processes and individual commitments” that cannot be easily automated.
Jonathan talked about the future of Positron and its relationship with RStudio, addressing whether Positron is intended to replace RStudio. He clarified that the long-term goal for Positron is to make it the best Integrated Development Environment (IDE) for working with data in any language. He explained that Positron is built with an extensibility layer, allowing anyone to write plugins for new languages or capabilities, making it a robust and evolving data science workbench. It does not have all of RStudio’s features and makes different design trade-offs. RStudio, having evolved over decades, is highly optimized for specific R-based workflows and remains the best at what it does for those use cases.
Resources mentioned in the video and zoom chat: Posit Conference 2025 Registration → https://posit.co/conference/ Kieran Healy’s Website → https://kieranhealy.org Kieran Healy’s book “The Ordinal Society” → https://theordinalsociety.com/ Kieran Healy’s book “Data Visualization: A Practical Introduction” → https://socviz.co/ Jonathan McPherson’s LinkedIn → https://www.linkedin.com/in/jonathanmcpherson Joe Cheng’s AI Talk on Harnessing LLMs for Data Analysis → https://youtu.be/owDd1CJ17uQ?feature=shared TidyTuesday GitHub → https://github.com/rfordatascience/tidytuesday Positron IDE → https://positron.posit.co/ Will R Chase’s talk on making clear plots → https://www.youtube.com/watch?v=h5cTacaWE6I
If you didn’t join live, one great discussion you missed from the zoom chat was about the ongoing debate and practical tips for moving from presenting tables of numbers to visualizations. Community members shared various strategies, including using color-mapped tables as an intermediate step, providing both tables and visuals, and ensuring accessibility and interpretability for diverse audiences. Are you team tables or team graphs?
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.posit.co Hangout: https://pos.it/dsh LinkedIn: https://www.linkedin.com/company/posit-software Bluesky: https://bsky.app/profile/posit.co
Thanks for hanging out with us!

Easier data and asset sharing across projects and teams with {pins} and Databricks
Led by Edgar Ruiz, Software Engineer at Posit PBC April 30th at 11 am ET / 8 am PT
Sharing data assets can be challenging for many teams. Some may rely on emailed files to keep analyses up to date, making it difficult to keep current or know what version of the data is used. {pins} improves sharing data and other assets across projects and teams. It enables us to publish, or ‘pin’, to a variety of places, such as Amazon S3, Posit Connect and Dropbox.
Given recent customer feedback, the ability to publish, or ‘pin’, to Databricks Volumes has been added to the R version of {pins}. The same capability is also currently in the works for the Python version of {pins}.
This session on April 30th will showcase the acceleration of predictions by distributing a ‘pinned’ model using pins and Spark in Databricks. We’ll walk through integrating {pins} with Databricks in your team’s projects and cover novel uses of pins inside the Databricks ecosystem.
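To make the idea of a versioned ‘pin’ concrete, here is a minimal sketch using only the Python standard library. It is not the {pins} API: `ToyBoard` is a hypothetical class invented for illustration, its method names only loosely echo {pins}, and it stores JSON files in a temporary local folder rather than on S3, Posit Connect, or Databricks Volumes.

```python
import json
import tempfile
from pathlib import Path


class ToyBoard:
    """Hypothetical stand-in for a pin board: one folder per pin, one file per version."""

    def __init__(self, root):
        self.root = Path(root)

    def pin_versions(self, name):
        # Versions are the integer file stems inside the pin's folder.
        return sorted(int(p.stem) for p in (self.root / name).glob("*.json"))

    def pin_write(self, name, data):
        # Every write creates a new version instead of overwriting the old one.
        pin_dir = self.root / name
        pin_dir.mkdir(parents=True, exist_ok=True)
        version = len(self.pin_versions(name)) + 1
        (pin_dir / f"{version}.json").write_text(json.dumps(data))
        return version

    def pin_read(self, name, version=None):
        # Read the latest version unless a specific one is requested.
        versions = self.pin_versions(name)
        if not versions:
            raise KeyError(f"no pin named {name!r}")
        chosen = versions[-1] if version is None else version
        return json.loads((self.root / name / f"{chosen}.json").read_text())


board = ToyBoard(tempfile.mkdtemp())
board.pin_write("sales", {"rows": 10})
board.pin_write("sales", {"rows": 12})

latest = board.pin_read("sales")            # most recent version
first = board.pin_read("sales", version=1)  # an explicitly requested older version
```

Because every write is kept as its own version, a teammate reading the pin can always tell which version of the data an analysis used, which is the problem with emailed files that the description above points to.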
GitHub repo: https://github.com/edgararuiz/talks/tree/main/end-to-end
Here are a few additional resources that you might find interesting:
- Pins for R: https://pins.rstudio.com/
- Pins for Python: https://rstudio.github.io/pins-python/
- More information on how Posit and Databricks work together: https://posit.co/use-cases/databricks/
- Customer Spotlight: Standardizing a safety model with tidymodels, Posit Team & Databricks at Suffolk Construction: https://youtu.be/yavHEWpgrCQ
- Q&A Recording: https://youtube.com/live/HDTDmEaK5zQ?feature=share

Standardizing a safety model with tidymodels, Posit Team & Databricks at Suffolk Construction
If you’ve ever struggled with standardizing machine learning workflows, ensuring secure data access, or scaling insights across your organization, this month’s Posit Team Workflow demo is for you.
Maxwell Patterson, Data Scientist at Suffolk walked us through how their team is:
- Standardizing model workflows using tidymodels, vetiver, Shiny, and Quarto
- Leveraging row-level permissions in Shiny apps to improve data governance
- Using Databricks and Posit to gain insights faster and more securely
A few helpful links for this demo:
- Suffolk Customer Spotlight: https://posit.co/about/customer-stories/suffolk/
- Quarto email customization: https://docs.posit.co/connect/user/quarto/#email-customization
- Vetiver package: https://rstudio.github.io/vetiver-r/reference/vetiver_deploy_rsconnect.html
- Pins package: https://pins.rstudio.com/
- Tidymodels “meta-package”: https://tidymodels.tidymodels.org/
- More information on how Posit and Databricks work together: https://posit.co/use-cases/databricks/
Do you use both Databricks and Posit, but not together yet? You can use this link to chat with our team: https://pos.it/chat-databricks
Q&A Recording: https://youtube.com/live/zU-bBUJMyQ4?feature=share To add future workflow demos on your calendar: https://pos.it/team-demo
^ These demos happen the last Wednesday of every month
Data Science at the Command Line and Polars | Jeroen Janssens | Data Science Hangout
To join future data science hangouts, add it to your calendar here: https://pos.it/dsh - All are welcome! We’d love to see you!
We were recently joined by Jeroen Janssens, Senior Developer Relations Engineer at Posit, to chat about his career journey from machine learning to developer relations, the advantages of using the command line for data science, his books “Data Science at the Command Line” and “Python Polars”, and advice for aspiring DevRel professionals.
In this Hangout, we explore the benefits of working on the command line versus not. Jeroen explained that while the initial command line interface might seem stark, it offers a very different and powerful way to interact with your computer. The Unix command line is ubiquitous across various systems, from Raspberry Pis to supercomputers. Its strength lies in the ability to connect tools together through standard output and input, allowing for quick and iterative solutions by combining specialized tools. This fosters an interactive nature with a short feedback loop and provides closer interaction with the file system, making ad hoc data exploration efficient.
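As a small illustration of connecting tools through standard output and input, the sketch below wires two processes together the way a shell pipe (`producer | consumer`) would. It is not taken from the Hangout: the two "tools" are hypothetical one-line Python programs, launched via `sys.executable` so the example does not depend on any particular Unix utility being installed.

```python
import subprocess
import sys

# Hypothetical "tools": the first prints three words, one per line;
# the second counts the distinct words it receives on standard input.
producer = [sys.executable, "-c", "print('b'); print('a'); print('b')"]
consumer = [
    sys.executable,
    "-c",
    "import sys; print(len(set(sys.stdin.read().split())))",
]


def run_pipeline(first, second):
    """Run `first | second` and return the second command's output as text."""
    p1 = subprocess.Popen(first, stdout=subprocess.PIPE)
    # The second process reads directly from the first process's stdout.
    p2 = subprocess.Popen(second, stdin=p1.stdout, stdout=subprocess.PIPE, text=True)
    p1.stdout.close()  # let the first process see a closed pipe if the second exits early
    output, _ = p2.communicate()
    p1.wait()
    return output.strip()


result = run_pipeline(producer, consumer)
```

Each tool does one small job and knows nothing about the other. Swapping either command for a different one changes the result without touching the rest of the pipeline, which is what keeps the feedback loop short.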
Resources mentioned in the video and zoom chat: Jeroen’s LinkedIn → https://www.linkedin.com/in/jeroenjanssens/ Data Science at the Command Line → https://jeroenjanssens.com/dsatcl/ Python Polars: The Definitive Guide → https://polarsguide.com/ Plotnine → https://plotnine.org/ Winner of the 2024 plotnine Plotting Contest → https://posit.co/blog/winner-of-the-2024-plotnine-plotting-contest/ Talk about plotnine → https://www.youtube.com/watch?v=xdD8r84sqYY R for Data Science → https://r4ds.had.co.nz/ Jeroen’s plotnine translation of R for Data Science → https://jeroenjanssens.com/plotnine/ froggeR package → https://azimuth-project.tech/froggeR/ Reticulate → https://rstudio.github.io/reticulate/ Install Windows Subsystem for Linux (WSL) → https://learn.microsoft.com/en-us/windows/wsl/install UTM for macOS (Virtualization) → https://mac.getutm.app fish shell → https://fishshell.com/ Quartodoc → https://github.com/machow/quartodoc Focusmate (Accountability Partner Tool) → https://www.focusmate.com/ Surface Area of Luck → https://modelthinkers.com/mental-model/surface-area-of-luck CRAN R Extensions Manual → https://cran.r-project.org/doc/manuals/r-release/R-exts.html
If you didn’t join live, one great thing you missed from the zoom chat was people sharing their varied experiences with the command line, with many admitting they primarily use it for basic navigation or only when necessary, and some sharing helpful tools and tips for those less familiar. Let us know below if you’d like to hear more about this topic!
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu Follow Us Here: Website: https://www.posit.co Hangout: https://pos.it/dsh LinkedIn: https://www.linkedin.com/company/posit-software Bluesky: https://bsky.app/profile/posit.co
Thanks for hanging out with us!

Tables in Python with Great Tables
Tables in Python with Great Tables - Rich Iannone, Michael Chow
Resources mentioned in the workshop:
- Workshop GitHub Repository: https://github.com/rich-iannone/great-tables-mini-workshop
- Great Tables https://posit-dev.github.io/great-tables/articles/intro.html
- {reactable-py} https://github.com/machow/reactable-py
- Save a gt table as a file https://gt.rstudio.com/reference/gtsave.html
- {gto} Insert gt tables into Word documents https://gsk-biostatistics.github.io/gto/
- GT.save https://posit-dev.github.io/great-tables/reference/GT.save.html
- define_units https://posit-dev.github.io/great-tables/reference/define_units.html#great_tables.define_units
- Posit Tables Contest 2024 winners: https://posit.co/blog/2024-table-contest-winners/
Editor’s note: During this workshop, several interruptions from an unwanted and disruptive intruder (commonly referred to as a “Zoom bomber”) occurred. We removed those instances from the recording; however, that causes a few of the workshop sections to appear disjointed. We apologize for the inconvenience.
Workshop recorded as part of the 2024 R/Pharma Workshop Series


Company Branding Workflow Demo Live Q&A - February 26th
Hi, there! If you started here first, please refer back to the Demo: https://youtu.be/U48y0_yzEPY
This Q&A Session followed a workflow demo on “how to apply consistent company branding across reports, dashboards, and apps”
Key Links:
- GitHub Repo for Example: https://github.com/skaltman/brand-yml-demo
- brand.yml GitHub repo: https://posit-dev.github.io/brand-yml/
- Follow-along blog post: https://posit.co/blog/unified-branding-across-posit-tools-with-brand-yml/
Additional Resources Mentioned in Q&A:
- Quarto specific page on brand: https://quarto.org/docs/authoring/brand.html
- Typography: https://posit-dev.github.io/brand-yml/brand/typography.html
- brand.yml + pkgdown: https://github.com/rstudio/bslib/tree/main/pkgdown
- LLM brand.yml prompt: https://posit-dev.github.io/brand-yml/articles/llm-brand-yml-prompt/
- Inspiration/gallery: https://posit-dev.github.io/brand-yml/inspiration/
Please note, the main demo will be here: https://youtu.be/U48y0_yzEPY?feature=shared
If you’d like to ask questions anonymously, you can use: https://pos.it/demo-questions
Company-branded reports, apps, and dashboards made easier with brand.yml & Posit
You will learn: How to apply consistent company branding across reports, dashboards, and apps
Key Links:
- GitHub Repo for Example: https://github.com/skaltman/brand-yml-demo
- brand.yml GitHub repo: https://posit-dev.github.io/brand-yml/
- Follow-along blog post: https://posit.co/blog/unified-branding-across-posit-tools-with-brand-yml/
- Q&A after the Demo: https://youtube.com/live/kuEbRfmm4G4?feature=share
Additional Resources Mentioned in Q&A:
- Quarto specific page on brand: https://quarto.org/docs/authoring/brand.html
- Typography: https://posit-dev.github.io/brand-yml/brand/typography.html
- brand.yml + pkgdown: https://github.com/rstudio/bslib/tree/main/pkgdown
- LLM brand.yml prompt: https://posit-dev.github.io/brand-yml/articles/llm-brand-yml-prompt/
- Inspiration/gallery: https://posit-dev.github.io/brand-yml/inspiration/
Why we think this is important: Consistent company branding in your reports and apps (with your logo, colors, and fonts) can help make your work look more professional, but is often tricky to get right.
Common challenges we’ve heard from the community:
- Excessive manual effort: Applying colors, fonts, and logos across reports, apps, and dashboards takes time and is prone to errors.
- Difficult to update: When brand guidelines change, it’s difficult to update all products consistently.
- Team consistency: Ensuring all contributors follow branding guidelines is challenging to manage.
How to join future events: We host workflow demos the last Wednesday of every month. You can add them to your calendar with this link: https://www.addevent.com/event/Eg16505674
Full playlist of workflow demo recordings: https://www.youtube.com/playlist?list=PL9HYL-VRX0oRsUB5AgNMQuKuHPpNDLBVt
Have suggestions? Comment below.
Thank you for joining us!
Shiny community, hackathons, and his AI mindset | Joe Cheng | Data Science Hangout
To join future data science hangouts, add it to your calendar here: https://pos.it/dsh - All are welcome! We’d love to see you!
We were recently joined by Joe Cheng, CTO at Posit, to chat about the Shiny contest, the use of AI in data science, and designing hackathons for learning new technologies. We were joined by several past and present Shiny contest winners who gave great advice on how to get started if you want to participate (and we really hope you do)!
In this Hangout, we explore the evolution of the Shiny contest since its inception, including what made the 2024 submissions unique and the ways the contest encourages community contribution and learning. Joe also shared his personal journey from feeling skepticism about AI to seeing and embracing its potential. We got some amazing questions from the Hangout attendees! We hope you join us live next time to ask some of your own questions.
Resources mentioned in the video and zoom chat:
2024 Shiny Contest Winners → https://posit.co/blog/winners-of-the-2024-shiny-contest/
Joe’s AI Hackathon Slides → https://jcheng5.github.io/llm-quickstart/quickstart.html
Shiny Assistant → https://gallery.shinyapps.io/assistant/
Isabella’s blog post on prototyping with Shiny Assistant → https://posit.co/blog/ai-powered-shiny-app-prototyping/
Posit Conf Workshops → https://reg.rainfocus.com/flow/posit/positconf25/attendee-portal/page/sessioncatalog?tab.day=20250916&search.sessiontype=1675316728702001wr6r
Shiny Conference 2025 → https://www.shinyconf.com/
Call for Speakers Shiny Conf 2025 → https://sessionize.com/shiny-conf-2025/
Shiny Tableau → https://rstudio.github.io/shinytableau/
Echarts4r → https://echarts4r.john-coene.com
ellmer package on GitHub → https://github.com/tidyverse/ellmer
All the Shiny app links mentioned in the video and zoom chat:
Eric Nantz 2021 Shiny Contest Submission → https://forum.posit.co/t/the-hotshots-racing-dashboard-shiny-contest-submission/104925
Eric Nantz’s R/Pharma conference keynote on AI → https://youtu.be/AfMa1CVUdXU?si=ThLsKFyonntxzBUF
Eric Nantz’s Haunted Places app → https://youtu.be/vX09QGMuOfo?si=K5_uPfK5bcfZZ92l
Umair Durrani’s Shiny Storytelling app → https://umair.shinyapps.io/storytimegcp/
Umair’s Bluesky profile → https://bsky.app/profile/transport-talk.bsky.social
Umair’s Shiny meetings project on GitHub → https://github.com/shiny-meetings/shiny-meetings
Abby Stamm’s Shiny Accessibility app → https://github.com/ajstamm/shiny-a11y-app
If you didn’t join live, one great discussion you missed from the zoom chat was about everyone’s favorite interactive plotting tools. Someone asked whether Plotly was the best option, and lots of people said they loved ggiraph, echarts4r, ObservableJS, and others. What about you?! What’s your favorite interactive plotting library?
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.posit.co Hangout: https://pos.it/dsh LinkedIn: https://www.linkedin.com/company/posit-software Bluesky: https://bsky.app/profile/posit.co
Thanks for hanging out with us!

The Power of Snowflake and Posit Workbench: Macroeconomic Data Exploration in the Cloud
In this live event, we will utilize the Posit Workbench Native App to demonstrate that macroeconomic research can be run in the Snowflake cloud but powered by R and RStudio.
Starting with data sourced from the Snowflake marketplace, we will import, transform, visualize, and, finally, model data using the Orbital framework to push tidymodels down to the cloud. This is full-stack, R-driven macroeconomic research in the cloud.
Add the event to your calendar: https://evt.to/eugmedshw Learn more about the Snowflake and Posit partnership: https://posit.co/use-cases/snowflake/
Wes McKinney & Hadley Wickham (on cross-language collaboration, Positron, career beginnings, & more)
Posit PBC hosted a special event with Wes McKinney (Pandas & Apache Arrow) and Hadley Wickham (rstats & tidyverse), where fellow data community members could ask questions, share their thoughts, and exchange insights about cross-language collaboration.
Here’s a preview into what came up in conversation:
- Cross-language collaboration between R and Python
- Positron, a new polyglot data science IDE
- Open source development: how Wes and Hadley got involved in open source, and their experiences building and maintaining open-source projects such as Pandas and the tidyverse
- Documentation for R and Python, especially in the context of teams that use both languages (shoutout to Quarto!)
- The use of LLMs in data science
- The emergence of libraries like Polars and DuckDB
- Challenges of switching between the two languages
- Package development and maintenance for polyglot teams that have internal packages in both languages
- The future of data science
The chat was on fire for this conversation and we’ve gathered most of the links shared among the community below:
Documentation mentioned: Positron, next-generation data science IDE built by Posit: https://positron.posit.co/ Quarto tabset documentation: https://quarto.org/docs/output-formats/html-basics.html#tabset-groups
Packages / Extensions mentioned: Pins: https://pins.rstudio.com/ Vetiver: https://vetiver.posit.co Orbital: https://orbital.tidymodels.org Elmer: https://elmer.tidyverse.org Tabby Extension: https://quarto.thecoatlessprofessor.com/tabby/
Blog posts: AI chat apps with Shiny for Python: https://shiny.posit.co/blog/posts/shiny-python-chatstream/ Using an LLM to enhance a data dashboard written in Shiny: R Sidebot & Python Sidebot Marco Gorelli Data Science Hangout (polars): https://youtu.be/lhAc51QtTHk?feature=shared Emily Riederer’s blog post on Polars: https://www.emilyriederer.com/post/py-rgo-polars/ Jeffrey Sumner’s tabset example: https://rpy.ai/posts/visualizations%20with%20r%20and%20python/r_python_visualizations Emily Riederer’s blog post on Python and R ergonomics: https://www.emilyriederer.com/post/py-rgo/11 Sam Tyner’s blog post on Lessons from “Tidy Data”: https://medium.com/@sctyner90/10-lessons-from-tidy-data-on-its-10th-anniversary-dbe2195a82b7
Other: Hadley Wickham’s cocktails website: https://cocktails.hadley.nz Posit subscription management to find out about new tools, events, etc.: https://posit.co/about/subscription-management/
New to Posit? Posit builds enterprise solutions and open source tools for people who do data science with R and Python. (We are also the company formerly called RStudio.) We’d love to have you join us for future community events!
Every Thursday from 12-1pm ET we host a Data Science Hangout with the community and invite you to join us! You can add that event to your calendar with this link: https://www.addevent.com/event/Qv9211919

Sharpening your axe and the BAU trap | Steph Locke | Data Science Hangout
To join future data science hangouts, add it to your calendar here: https://pos.it/dsh - All are welcome! We’d love to see you!
We were recently joined by Steph Locke, Digital and App Innovation Leader at Microsoft, to chat about how to persuade your manager to give you more time for professional development, why “business as usual” (BAU) work can choke development, and how killing projects is actually a good thing.
In this Hangout, we explore the importance of investing time in skill development and how this can lead to long-term gains in efficiency and quality. Steph shares advice on how to talk to managers about the value of “sharpening your axe,” and why it is more efficient to train in order to do things well initially than to spend time on continuous maintenance of subpar work.
Is a bunch of “business as usual” work bogging down the potential output of development teams? This is where Steph’s concept of investing in high quality work up-front comes in. She talks about the dangers of rushing to release products that aren’t built with high quality and low ongoing maintenance in mind: “…when [people] don’t necessarily have time to invest in doing things in a way that’s going to have high quality and low maintainability requirements and is easy to extend and create new things, when people aren’t doing that, anything they ship is going to then cost them more time to do something to look after whatever they’ve shipped. That then gives them less time to ship the next thing.” If this resonates with you, give Steph a follow on LinkedIn and make sure you read her article.
Resources mentioned in the video and Zoom chat: Steph’s article on how BAU chokes development → https://www.linkedin.com/pulse/developer-velocity-choked-bau-stephanie-locke-fsiqe Posit’s documentation on using Copilot in the RStudio IDE → https://docs.posit.co/ide/user/ide/guide/tools/copilot.html A YouTube video about using Shiny apps for data entry into a backend database → https://www.youtube.com/watch?v=zDJc8sXh2qw
If you didn’t join live, one great discussion you missed from the zoom chat was around the discrepancies in pay and responsibilities between individual contributor (IC) and management tracks in tech companies, and the perception that management roles often have better compensation and advancement opportunities, even when the work may not be more valuable than IC contributions. Let us know below if you’d like to hear more about this topic! Is there a need for more non-people-leader technical leadership roles for highly skilled individual contributors?
Quarto Websites 1: Build your homepage | Charlotte Wickham & Emil Hvitfeldt | Posit
In this video, you’ll get a running start by using a template we’ve designed to be functional and attractive, and that will serve as a foundation for the rest of the video series. You’ll customize the content and look of your homepage, and along the way learn about the two key files in a Quarto website: index.qmd and _quarto.yml. Finally, you’ll learn one way to publish your website so other people can see it.
In this video: 0:21 Use a template to get started 2:33 Preview the template homepage, index.qmd 4:12 Customize the content of your homepage 5:45 “About” pages 7:22 Customize the image on your homepage 9:24 Website configuration, _quarto.yml 10:40 Customize colors with YAML 13:45 Customize fonts with YAML 17:00 Publish your site
Links: About pages: https://quarto.org/docs/websites/website-about.html Appearance options you can set in YAML: https://quarto.org/docs/output-formats/html-themes.html#basic-options
Code: Starter source code: https://github.com/EmilHvitfeldt/website-template Final source code: https://github.com/cwickham/quarto-website-video/tree/v0.1
For more in-depth coverage and slides check out: https://posit-conf-2024.github.io/quarto-websites/
Do you need a professional website to showcase your work? If you’ve used Quarto to produce a document, you’ve already got the technical skills to create a Quarto website. In this video series, you’ll learn everything else you need to build a website and customize its appearance.
This video series is for you if you:
- Have used Quarto to generate documents (e.g. HTML, PDF, MS Word etc.)
- Are comfortable editing plain text documents (e.g. .qmd) in your IDE (e.g. RStudio, Visual Studio Code etc.)
- Want to walk away with your own personal website
Taught by: Charlotte Wickham (https://www.cwick.co.nz/) Emil Hvitfeldt (https://emilhvitfeldt.com/)
Videos in this series:
- Build your homepage [https://youtu.be/l7r24gTEkEY]
- Add pages and navigation [https://youtu.be/k65E-8PXZmA]
- Customize appearance with CSS/SCSS [https://youtu.be/pAN2Hiq0XGs]
- Add lists of content with listings [https://youtu.be/bv_Cw-3HI1Y]


Quarto Websites 2: Add pages and navigation | Charlotte Wickham | Posit
Now you’ve got a homepage, you’ll likely want to add some other pages. In this video, learn how to add pages to your website, and help people find them, by adding them to your website navigation.
In this video: 1:00 Add a page to your website 2:54 Your file structure determines your URL structure 5:49 Add a link to your page in navigation 7:50 Customize navigation item text and icon 9:12 Control where items appear in the navigation bar 10:16 Navigation bar options 11:11 Switch to side navigation 12:22 Other types of navigation 16:30 Wrap Up
Links: List of icons you can use in navigation items: https://icons.getbootstrap.com/ Top navigation bar options: https://quarto.org/docs/websites/website-navigation.html#top-navigation Quarto website navigation: https://quarto.org/docs/websites/website-navigation.html
Code: Starter source code: https://github.com/cwickham/quarto-website-video/tree/v0.1 Final source code: https://github.com/cwickham/quarto-website-video/tree/v0.2
For more in-depth coverage and slides check out: https://posit-conf-2024.github.io/quarto-websites/


Quarto Websites 3: Customize appearance with CSS/SCSS | Emil Hvitfeldt | Posit
You now have a set of content you are happy with on your website, but how do you customize the look and feel of your site beyond options set in YAML? In this video, you’ll start by learning the basics of CSS and SCSS and how to make good design choices. Then, you’ll see how to apply these choices to your Quarto website.
In this video: 0:14 What is HTML? 3:23 CSS Selectors 8:05 CSS Attributes 8:25 Layout attributes 10:24 Reducing repetition with SASS/SCSS? 15:26 Consistent design 16:36 Choosing colors 17:50 Choosing fonts 19:28 Maintaining accessibility 22:13 Apply SCSS to your website 24:16 Change the appearance of headings 25:28 Change the appearance of navigation bar 30:30 Use google fonts
Links: Color contrast checker: https://colourcontrast.cc/ Google fonts: https://fonts.google.com/
Code: Starter source code: https://github.com/cwickham/quarto-website-video/tree/v0.2 Final source code: https://github.com/cwickham/quarto-website-video/tree/v0.3
For more in-depth coverage and slides check out: https://posit-conf-2024.github.io/quarto-websites/


Quarto Websites 4: Add lists of content with listings | Charlotte Wickham | Posit
Adding a listing page to your website is a great way to showcase your projects, talks, publications or blog posts. In this video you’ll learn how to create a listing page in Quarto and see two ways to populate it with content: Quarto documents, or a yaml file.
In this video: 0:50 Use a listing to add a blog 3:36 Listing options 5:47 Why use a listing? 7:22 Use a YAML file to populate a project portfolio 9:50 Customize the display of a listing 12:10 Advanced customization of listings 13:42 Remove pages
Links: Listings: https://quarto.org/docs/websites/website-listings.html Andrew Heiss’ teaching listing: https://www.andrewheiss.com/teaching/
Code: Starter source code: https://github.com/cwickham/quarto-website-video/tree/v0.3 Final source code: https://github.com/cwickham/quarto-website-video/tree/v0.4
For more in-depth coverage and slides check out: https://posit-conf-2024.github.io/quarto-websites/


Running unified attribution at scale | Martin Stein @ Conversion Logix | Data Science Hangout
Join us for a conversation with Martin Stein, Chief Analytics Officer at Conversion Logix, to chat about running unified attribution at scale, the tools they leverage, the languages Conversion Logix uses, and the workflow and operations to make it all happen.
Featured Leader Bio: Martin Stein is a seasoned data scientist, entrepreneur, and Chief Analytics Officer at Conversion Logix. With over 20 years of experience in AI, machine learning, and data science, he brings a wealth of expertise to the multifamily industry, developing cutting-edge MarTech solutions that empower marketing leaders to make data-driven decisions to maximize their investments. Stein is also the co-founder of Bend.ai, an AI/machine learning company acquired by Conversion Logix. His career includes significant roles at Apple, IDG/IDC, G5, RStudio, and startups like FileThis, Defined.ai, and Union.ai. Throughout his career, he has demonstrated a proven ability to build and scale products to over $200 million in revenue.
Martin shares insights on topics like:
- The problem with “last-touch” attribution in marketing and how his team is building solutions to provide a more comprehensive understanding of marketing channel effectiveness.
- Common data science struggles, including data access, data inconsistency, production challenges, and team dynamics.
- The value of community in a data scientist’s journey, particularly for those exploring new areas like AI.
- Tips for effectively communicating data science concepts and building business cases, both internally and with clients.
- Approaches to measuring channel success in marketing and comparing it to copy success, emphasizing the importance of considering factors beyond individual channel performance.
- How tools like Pins and Connect facilitate data management, workflow efficiency, and model deployment.
- Insights into the use of vetiver for model evaluation, highlighting its ability to assess model behavior with known and unknown data and its relevance to MLOps practices.
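To make the “last-touch” problem concrete, here is a minimal Python sketch that compares last-touch credit with a simple linear (equal-split) multi-touch rule. This is not Conversion Logix’s method; the journeys, channel names, and the linear rule are assumptions chosen purely for illustration.

```python
from collections import defaultdict

def last_touch(journeys):
    """Give 100% of each conversion to the final channel in the journey."""
    credit = defaultdict(float)
    for channels in journeys:
        credit[channels[-1]] += 1.0
    return dict(credit)

def linear_touch(journeys):
    """Split each conversion evenly across every touchpoint in the journey."""
    credit = defaultdict(float)
    for channels in journeys:
        share = 1.0 / len(channels)
        for channel in channels:
            credit[channel] += share
    return dict(credit)

# Hypothetical converting journeys (ordered touchpoints), invented for this example.
journeys = [
    ["display", "email", "search"],
    ["social", "search"],
    ["email"],
]

print(last_touch(journeys))    # only the closing channels receive credit
print(linear_touch(journeys))  # every channel that appeared receives a share
```

Under last-touch, search receives two of the three conversions, while display and social receive nothing even though they started two of the journeys. The linear rule spreads the same three conversions across all four channels.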
Don’t miss this insightful episode packed with practical advice and real-world examples for data scientists at all levels! A few people in the chat said it felt like they were listening to a really great mentor speak, and we agree.
Resources mentioned in the episode and chat:
Causal Analysis: Impact Evaluation and Causal Machine Learning with Applications in R by Martin Huber → https://mitpress.mit.edu/9780262545914/causal-analysis/ Causal Inference: The Mixtape by Scott Cunningham → https://mixtape.scunning.com/ Kevin Ushey’s RStudio conference talk on renv → https://www.youtube.com/watch?v=4wRiPG9LM3o Rami Krispin’s VS Code and R setup → https://github.com/RamiKrispin/vscode-r
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.posit.co LinkedIn: https://www.linkedin.com/company/posit-software Twitter: https://x.com/posit_pbc
To join future data science hangouts, add to your calendar here: pos.it/dsh (All are welcome! We’d love to see you!)
Thanks for hanging out with us!
Quarto Dashboards 1: Hello, Dashboards! | Mine Çetinkaya-Rundel | Posit
You already analyze and summarize your data in computational notebooks with R and/or Python. What’s next? You can share your insights, or let others draw their own conclusions, in eye-catching Quarto Dashboards that are straightforward to author, design, and deploy, regardless of the language of your data processing, visualization, analysis, etc. With Quarto Dashboards, you can create elegant and production-ready dashboards using a variety of components, including static graphics (ggplot2, Matplotlib, Seaborn, etc.), interactive widgets (Plotly, Leaflet, Jupyter Widgets, htmlwidgets, etc.), tabular data, value boxes, text annotations, and more. Additionally, with intelligent resizing of components, your Quarto Dashboards look great on devices of all sizes. And importantly, you can author Quarto Dashboards without leaving the comfort of your “home” – in plain text markdown with any text editor (VS Code, RStudio, Neovim, etc.) or any notebook editor (JupyterLab, etc.).
This video takes you through
0:00 - Overview of building dashboards with Quarto 0:15 - Dashboard basics 7:40 - First dashboard in R 10:30 - First dashboard in Python 11:43 - Live coding demo
Slides can be found at https://mine.quarto.pub/quarto-dashboards/1-hello-dashboards/#/title-slide and the starter documents for the accompanying exercises at https://github.com/mine-cetinkaya-rundel/olympicdash
Materials for all parts of the videos can be accessed at https://mine.quarto.pub/quarto-dashboards

Quarto Dashboards 3: Theming and Styling | Mine Çetinkaya-Rundel | Posit
Theming and styling Quarto dashboards built with R and/or Python.
Before watching this video, you might want to watch Parts 1 & 2.
This video takes you through
0:00 - Theming (including Bootswatch themes, light/dark mode, customizing themes with SCSS) 3:55 - Styling 4:55 - Live coding demo
Slides can be found at https://mine.quarto.pub/quarto-dashboards/3-theming-styling and the starter documents for the accompanying exercises at https://github.com/mine-cetinkaya-rundel/olympicdash
Materials for all parts of the videos can be accessed at https://mine.quarto.pub/quarto-dashboards
This workshop will walk you through building an increasingly complex dashboard using various layout options and deploy them as static web pages (with no special server required) as well as with a Shiny Server on the backend for enhanced interactivity.
This course is for you if you:
- do data analysis in computational notebooks
- share your results with your audience in static or interactive dashboards
- want to improve the design, user interface, and experience of your dashboards

Abigail Haddad - GitHub: How To Tell Your Professional Story
GitHub is more than just a version control tool, it’s a way of explaining your professional identity to prospective employers and collaborators – and you can build your profile now, before you’re looking for new opportunities. This talk is about how to think of GitHub as an opportunity, not a chore, and how to represent yourself well without making developing your GitHub profile into a part-time job. I’ll talk about why GitHub adds value beyond a personal website, what kinds of projects are helpful to share, and some good development practices to get in the habit of, regardless of your project specifics.
Talk by Abigail Haddad
Slides: https://github.com/rstudio/rstudio-conf/tree/master/2024/abigailhaddad/haddad_2024_posit_slides.pdf
Brandon Sucher - Beyond the Classroom: Unspoken Realities of a Data Science Career
Embarking on a data science career extends well beyond academic knowledge. In many ways, the learning has just begun. Soft skills have become increasingly valuable, with effective collaboration being essential for success. Additionally, there may be moments when advocating for your own work is crucial, turning data scientists into persuasive salespeople for their own insights and contributions. In this talk, I’ll touch on some of the aspects of a data science job that aren’t talked about as frequently, including onboarding successfully, becoming a subject matter expert, and understanding the end-to-end data workflow.
Talk by Brandon Sucher
Slides: https://github.com/rstudio/rstudio-conf/blob/master/2024/brandonsucher/Posit_Conf_2024_Slides.pdf
Claire Bai - Translating clinical guidance to actionable insights with R
COTA’s team of oncologists and data scientists curate real-world data used by life science companies and healthcare partners to inform drug development and patient care. Over time, we have received many of the same questions from our data users, which indicated a dire need for translating our internal clinical guidance and data model knowledge into a tool for successfully navigating our data. We developed rwnavigator, an R package that helps users easily prepare COTA data for analysis with time-to-event packages. As first-time package developers, we ran into many challenges as we created, tested, and deployed rwnavigator. We hope to share with the greater R community our motivations for developing this package and best practices we learned along the way.
Talk by Claire Bai
Slides: https://github.com/rstudio/rstudio-conf/blob/master/2024/clairebai/rwnavigator_FINAL.pptx
Eric Leung - R Scripts to Databricks: Lessons in Production Workflow
This talk is about how a team at The Walt Disney Company took the past year to take a local R workflow into production. This updated process uses a mix of R, Python, SQL, Databricks, and Tableau dashboard, all of which involves multiple teams and stakeholders. The project started as a manual monthly process to measure the effect of ESPN’s live TV marketing to get consumers to convert to cross-channel platforms related to ESPN. But then we not only needed to automate this process, but also to scale it to measure the effect of marketing. The few lessons shared are: don’t reinvent the wheel; use not only the best tool for the job, but the best available; take time to get used to new tools.
Talk by Eric Leung
Slides: https://github.com/rstudio/rstudio-conf/tree/master/2024/ericleung/Leung_PositConf_Lessons.pptx
Introducing Positron, a new data science IDE - posit conf 2024
Positron is a next-generation data science IDE that is newly available to the community for early beta testing. This new IDE is an extensible tool built to facilitate exploratory data analysis, reproducible authoring, and publishing data artifacts. Positron currently supports these data workflows in either or both Python and R and is designed with a forward-looking architecture that can support other data science languages in the future. In this session, learn from the team building Positron about how and why it is designed the way it is, what will feel familiar or new coming from other IDEs, and whether it might be a good fit for your own work.
Talk by Julia Silge, Isabel Zimmerman, Tom Mock, Jonathan McPherson, Lionel Henry, Davis Vaughan, and Jenny Bryan
Slide deck 1: https://speakerdeck.com/juliasilge/introducing-positron Slide deck 6: https://speakerdeck.com/jennybc/positron-for-r-and-rstudio-users





Parallelize R code using user-defined functions in sparklyr
If you’re an Apache Spark user, you benefit from its speed and scalability for big data processing.
However, you might still want to leverage R’s extensive ecosystem of packages and intuitive syntax. One effective way to do this is by writing user-defined functions (UDFs) with sparklyr.
UDFs enable you to execute R functions within Spark, harnessing Spark’s processing power and combining the strengths of both tools.
In this tutorial, you’ll learn how to:
- Open Posit Workbench as a Databricks user
- Start a Databricks cluster within Posit Workbench
- Connect to a cluster within Posit Workbench
- View Databricks data in RStudio
- Create a prediction function
- Create a user-defined function with sparklyr
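The idea behind a UDF is split-apply-combine: the data is divided into partitions, the same function runs on each partition, and the results are stitched back together. In sparklyr this role is played by `spark_apply()`. The Python sketch below only illustrates the pattern with the standard library. Threads stand in for Spark executors, and the toy model, its coefficients, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def predict_partition(rows):
    """Hypothetical per-partition function: score each row with a toy linear model."""
    return [{"id": row["id"], "prediction": 1.5 + 0.8 * row["x"]} for row in rows]

def split(rows, n_partitions):
    """Deal rows into n_partitions roughly equal chunks."""
    return [rows[i::n_partitions] for i in range(n_partitions)]

def apply_in_parallel(rows, func, n_partitions=4):
    """Run func on each partition concurrently, then concatenate the results."""
    partitions = split(rows, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        results = pool.map(func, partitions)
    return [row for part in results for row in part]

# Invented input data: ten rows with an id and a single feature x.
rows = [{"id": i, "x": float(i)} for i in range(10)]
scored = sorted(apply_in_parallel(rows, predict_partition), key=lambda r: r["id"])
print(scored[:3])
```

The function passed in never sees the whole dataset, only one partition at a time, which is what lets the work be distributed across a cluster.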
Read our most recent blog that covers parallelizing R code using user-defined functions (UDFs) in sparklyr: https://posit.co/blog/databricks-udfs/
Learn more about our Databricks partnership: https://posit.co/solutions/databricks/
Watch other tutorials on using Databricks and RStudio: https://youtube.com/playlist?list=PL9HYL-VRX0oR-3AgWbXtlfdr29626EjRJ&feature=shared
Quarto: Elevating R Markdown for Advanced Publishing | Christophe Dervieux
In the dynamic landscape of data analysis and scientific publishing, R Markdown has been pivotal for the R community, allowing users to seamlessly blend code, narrative, and results in a cohesive document. Now, Quarto emerges as a powerful tool that builds on years of experience but also goes beyond R Markdown, providing more flexibility and power in scientific communication.
This talk aims to present Quarto as the new alternative for scientific publishing. We will delve into how Quarto enhances the user experience for R enthusiasts, maintaining the syntax familiarity of R Markdown while introducing innovative and improved functionalities across multiple formats, similar to R Markdown ones.
Why switch to Quarto from R Markdown? In which cases? How does Quarto integrate with existing workflows? Hopefully everyone will feel inspired to try out Quarto!
https://quarto.org/docs/get-started/
Timestamps: 0:00 Introduction 0:41 Quarto is an open-source, scientific and technical publishing system 1:22 Computational documents and scientific markdown made easy for single source publishing 3:08 How to use Quarto 4:24 Quarto works with VS Code, Positron, Jupyter, & RStudio 5:22 Quarto’s multi-language workflow 7:21 Quarto syntax 8:40 Quarto formats (html, pdf, docx, typst, beamer, pptx, revealjs, etc.) 12:19 HTML Theming 14:10 Typst CSS for nice table output in PDF 16:24 Publishing (Quarto Pub, GitHub Pages, Posit Connect, Posit Cloud, Netlify, Confluence, Hugging Face, etc.) 17:36 Shortcodes 19:10 Quarto Extensions 19:49 Quarto Projects 22:53 Project configuration examples for a website and a book 23:42 Resources to get started!

How to automatically detect data changes for your Shiny Calendar app (ft: Jira, pins, Posit Connect)
Do you manage constantly changing data and need your Shiny app to automatically update?
On August 28th at 11 am ET, Isabella Velásquez demonstrated a streamlined workflow for handling frequently updated datasets in Shiny. You’ll see how to simplify your process for keeping dynamic data current and how to reflect those changes in your app or dashboard.
Github repo to follow along or make it your own! https://github.com/posit-marketing/shiny-calendar
Timestamps: 1:03 - Introduction of the project (end goal: calendar that integrates with Jira to track and visualize a schedule for managing deadlines of content) 2:26 - Pulling data from an API in Python or R 2:56 - Introduction to pins (and scheduling automatic refreshes of it in Posit Connect) 4:30 - Introduction to Shiny for both Python and R (its power lies in reactivity) 5:10 - Enter pin_reactive_read() function 6:12 - Introduction to Posit Team 6:37 - Opening a new session within Posit Workbench and overview of code needed to create the calendar [Github repo: https://github.com/posit-marketing/shiny-calendar] 12:07 - toastui package used for Calendar (ex: adding colors to labels) 12:47 - Writing clean data to Posit Connect board 13:16 - Rendered Quarto doc for pulling Jira data from the board 14:00 - Deploying Quarto to Posit Connect (using push button deployment) and scheduling to run 16:54 - Using the data just pinned in the Shiny app 21:17 - Overview of Shiny Content Calendar application 23:04 - Creating an issue in Jira board and adjusting schedule in Posit Connect to show new item in Shiny calendar. 24:00 - pin_reactive_read automatically detects change and shows it in the Shiny app
During this workflow demo, you will learn:
- How {pins} stores and retrieves ever-changing data with ease
- How to use pin_reactive_read() in Shiny to automatically trigger updates when your data changes
- How Posit Connect can be set up to rerun your {pin} on a schedule, ensuring your app is updated without disruption
- How to deploy an always-up-to-date app for seamless sharing with stakeholders
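The behaviour described for pin_reactive_read() can be pictured as poll-and-compare: check a fingerprint of the data on a timer and re-read only when it changes. The Python sketch below illustrates that general pattern and is not the {pins} implementation. The `PolledValue` class, the in-memory `board`, and the hashing scheme are all hypothetical stand-ins.

```python
import hashlib
import json

class PolledValue:
    """Re-read a data source only when a fingerprint of it changes."""

    def __init__(self, check, read):
        self._check = check        # function returning a fingerprint of the source
        self._read = read          # function returning the data itself
        self._fingerprint = None
        self._value = None
        self.reads = 0             # how many times the data was actually re-read

    def poll(self):
        """Call on a timer; returns True if the data was refreshed."""
        fingerprint = self._check()
        if fingerprint == self._fingerprint:
            return False
        self._fingerprint = fingerprint
        self._value = self._read()
        self.reads += 1
        return True

    @property
    def value(self):
        return self._value

# Hypothetical in-memory "board" standing in for a remote pin.
board = {"calendar": [{"task": "Draft post", "due": "2024-09-01"}]}

def fingerprint():
    # A real system would compare cheap metadata; hashing the content keeps the sketch short.
    payload = json.dumps(board["calendar"], sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

calendar = PolledValue(check=fingerprint, read=lambda: list(board["calendar"]))

calendar.poll()   # first poll loads the data
calendar.poll()   # nothing changed, so no re-read
board["calendar"].append({"task": "Record demo", "due": "2024-09-15"})
calendar.poll()   # change detected, data re-read
print(calendar.reads, len(calendar.value))
```

Three polls trigger only two reads, so an app built this way stays current without re-fetching unchanged data on every tick.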
Other helpful links: pin_reactive_read: https://pins.rstudio.com/reference/pin_reactive_read.html Basic reactivity in Mastering Shiny: https://mastering-shiny.org/basic-reactivity.html#reactive-programming Understanding reactivity on the Shiny site: https://shiny.posit.co/r/articles/build/understanding-reactivity/ Github repo: https://github.com/posit-marketing/shiny-calendar Shiny Calendar: https://pub.demo.posit.team/public/shiny-calendar/ Q&A Recording
If you like these workflow demos, you can join us monthly! They happen the last Wednesday of every month at 11 am ET. Add it to your calendar here: https://pos.it/team-demo
Predicting Lending Rates with Databricks, tidymodels, and Posit Team
Machine learning algorithms are reshaping financial decision-making, changing how the industry manages financial risk.
For our workflow demo today on June 26th at 11 am ET, Garrett Grolemund at Posit will show how to use both Posit and Databricks to apply machine learning methods to the consumer credit market, where accurately predicting lending rates is critical for customer acquisition.
*Please note that while the workflow focuses on a financial example, the general workflow will be useful to those using Databricks and R together across any industry.
During this workflow demo, you will learn how to:
- Connect to historical lending rate data stored in Databricks Delta Lake
- Tune and cross-validate a penalized linear regression (LASSO) that predicts interest rates
- Select variables with the penalized linear regression model (LASSO)
- Build an interactive Shiny app to provide a customer-facing user interface for our model
- Deploy the app to production on Posit Connect, and arrange for the app to access Databricks
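To show why a LASSO penalty doubles as variable selection, here is a small pure-Python sketch of coordinate descent with soft-thresholding. It is not the demo’s code: the six-row dataset, the fixed penalty of 0.1, and the function names are assumptions, and the features are taken to be centered so that no intercept is needed.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero; anything within [-penalty, penalty] becomes 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso(X, y, penalty, n_sweeps=50):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + penalty*||b||_1, no intercept."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                others = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
            rho /= n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            coef[j] = soft_threshold(rho, penalty) / scale
    return coef

# Invented data: y is roughly 2 * (first feature); the second feature is noise.
X = [[-2.5, 1.0], [-1.5, -1.0], [-0.5, 0.0], [0.5, 0.0], [1.5, -1.0], [2.5, 1.0]]
y = [-4.9, -3.2, -0.95, 0.95, 3.1, 5.0]

unpenalized = lasso(X, y, penalty=0.0)
penalized = lasso(X, y, penalty=0.1)
print(unpenalized)  # both coefficients are nonzero
print(penalized)    # the noise feature's coefficient is exactly 0.0
```

With no penalty the noise feature keeps a small coefficient. With the penalty it is set exactly to zero while the informative feature is only slightly shrunk. In practice the penalty would be chosen by tuning and cross-validation, as the demo describes, rather than fixed by hand.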
Resources for the demo: GitHub repo for today’s materials: https://github.com/posit-dev/databricks-finance-app Accompanying Guide: https://pub.demo.posit.team/public/predicting-lending-rates/lending-rate-prediction.html Q&A Recording: https://youtube.com/live/wNI3AhHP7uM
Additional follow-up links:
- GitHub repo: https://github.com/posit-dev/databricks-finance-app
- Accompanying guide: https://pub.demo.posit.team/content/fec42b3d-3aa9-43e1-8312-0ff553d09851/lending-rate-prediction.html
- While this demo uses the odbc package to connect to Databricks, you can also use the sparklyr R package. Learn more about both here: https://docs.posit.co/ide/server-pro/user/rstudio-pro/guide/databricks.html
- Example using sparklyr instead of ODBC: https://posit.co/blog/reporting-on-nyc-taxi-data-with-rstudio-and-databricks/
- Posit Workbench provides additional features for managing Databricks credentials. Learn more here: https://docs.posit.co/ide/server-pro/user/posit-workbench/guide/databricks.html#databricks-with-r
- More on the Posit x Databricks partnership: https://posit.co/solutions/databricks/
- Blog post on Edgar’s workshop on Databricks at conf: https://posit.co/blog/using-databricks-with-r-conf-workshop/
- Solutions article on ODBC and Databricks: https://solutions.posit.co/connections/db/databases/databricks/
Want to chat more with Posit? To talk with Posit about integrating Posit & Databricks: https://posit.co/schedule-a-call/?booking_calendar__c=DatabricksJune2024Demo
Had fun and want to join again? You can add the monthly recurring event to your calendar with this link: https://pos.it/team-demo
R-Ladies Rome (English) - R in Production - Hadley Wickham
In this inspiring talk, dive into the world of R in production with Hadley Wickham, Chief Scientist at Posit PBC (formerly RStudio).
Explore the challenges and best practices for deploying R solutions in real-world production environments, from effective code structuring to ensuring scalability and reliability. Whether you’re a seasoned data scientist or just beginning your journey with R, this event equips you with invaluable insights and actionable tips to drive impactful outcomes in your organization. Don’t miss out on this engaging discussion!
Material:
0:00 Welcome & R-Ladies Introduction by Dorota Rizik (R-Ladies NYC) 6:28 Introduction and Dr. Wickham’s Talk 53:46 Q&A
Have a look at our website for more insights about our events: https://rladiesrome.org

Tom Mock @ Posit PBC | Data Science Hangout
We were recently joined by Tom Mock, Product Manager at Posit PBC to chat about career growth, starting out in a sales role, TidyTuesday, and being so good they can’t ignore you.
Speaker Bio: Tom Mock is a Product Manager at Posit, overseeing the Posit Workbench and RStudio team. He fell in love with R and data science through his graduate research, using R and RStudio to wrangle, analyze, model, and visualize his data. He became passionate about growing the R community, and founded #TidyTuesday to help newcomers and seasoned vets improve their Tidyverse skills.
Links mentioned: TidyTuesday: https://github.com/rfordatascience/tidytuesday Table Contest: https://posit.co/blog/announcing-the-2024-table-contest/ Posit Conference: https://posit.co/conference/ Monthly Workflow Demos: https://www.addevent.com/event/Eg16505674 gt package: https://gt.rstudio.com/ So Good They Can’t Ignore You book recommendation: https://www.goodreads.com/book/show/13525945-so-good-they-can-t-ignore-you Community Builder Quarto Site: https://pos.it/community-builder
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.posit.co LinkedIn: https://www.linkedin.com/company/posit-software
To join future data science hangouts, add to your calendar here: https://pos.it/dsh
We’d love to have you join us in the conversation live!
Thanks for hanging out with us!
Daniel Chen - Hello Community
Hello Community by Daniel Chen
Visit https://rstats.ai for information on upcoming conferences.
Abstract: Special Appearance
Bio: Daniel teaches data science at UBC and works as a data science educator for Posit, working on the RStudio Academy team. Author of Pandas for Everyone.
Twitter: https://twitter.com/chendaniely
Presented at the 2024 New York R Conference (May 17, 2024) Hosted by Lander Analytics (https://landeranalytics.com )
Max Kuhn - SHINYLIVE IS SO EASY
SHINYLIVE IS SO EASY by Max Kuhn
Visit https://rstats.ai for information on upcoming conferences.
Abstract: shinylive is an extension to the Quarto open-source scientific and technical publishing system. It enables shiny applications to run locally, without a shiny server using WebAssembly. I’ll show examples and discuss the limitations of using shinylive.
Bio: Max Kuhn is a software engineer at Posit PBC (nee RStudio). He is working on improving R’s modeling capabilities and maintaining about 30 packages, including caret. He was a Senior Director of Nonclinical Statistics at Pfizer Global R&D in Connecticut. He has been applying models in the pharmaceutical and diagnostic industries for over 18 years. Max has a Ph.D. in Biostatistics. He and Kjell Johnson wrote the book Applied Predictive Modeling, which won the Ziegel award from the American Statistical Association, which recognizes the best book reviewed in Technometrics in 2015. He has co-written several other books: Feature Engineering and Selection, Tidy Models with R, and Applied Machine Learning for Tabular Data (in process).
Twitter: https://twitter.com/topepos
Presented at the 2024 New York R Conference (May 17, 2024) Hosted by Lander Analytics (https://landeranalytics.com )

R-Ladies Gaborone & R-Ladies RTP (English) - Personal R Administration
R-Ladies Gaborone and R-Ladies RTP co-host E. David Aja as he demonstrates tips, tricks, tweaks, and some hacks for building data science dev environments that you won’t be afraid to come back to in a year.
Slides link https://rstats-wtf.github.io/wtf-personal-radmin-slides/#/title-slide What They Forgot to Teach You About R https://rstats.wtf/
Speaker E. David Aja : https://www.linkedin.com/in/edavidaja/ R-Ladies Gaborone: https://www.meetup.com/rladies-gaborone/ R-Ladies RTP: https://www.meetup.com/r-ladies-rtp/
Extras#
Customising your .rprofile https://kanto.rbind.io/blog/customising-your-r-profile/
Locating R and R Adjacent Software and Configuration Files https://www.pipinghotdata.com/posts/2022-06-02-locating-r-and-r-adjacent-software-and-configuration-files/
CRAN-ial Expansion: Taking Your R Package Development to New Frontiers with R-Universe - posit::conf https://www.youtube.com/watch?v=XDiyAvpo2uk
You should be using renv | RStudio (2022) : https://www.youtube.com/watch?v=GwVx_pf2uz4
Featured music https://open.spotify.com/artist/0cmWgDlu9CwTgxPhf403hb
Analyze and explore data stored in Snowflake using R
James Blair, Senior Product Manager, Cloud Integrations at Posit, will demonstrate using the R language to analyze and explore data stored in Snowflake. He will also show you how easy it is to set up an R environment inside Posit Workbench that runs as a native app on Snowpark Container Services.
You will also find out how the dbplyr interface can be used to push computation into Snowflake, giving you access to greater memory and compute power than in a standard R session.
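The idea of pushing computation to the database can be sketched without Snowflake at all. dbplyr does this by translating dplyr verbs into SQL; the example below only mimics the effect in Python, with an in-memory SQLite database standing in for the warehouse. The table and column names are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for Snowflake; the schema is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 10.0), ("east", 30.0), ("west", 5.0)],
)

# Pull-everything approach: every row travels to the client before aggregation.
all_rows = conn.execute("SELECT region, amount FROM orders").fetchall()
client_totals = {}
for region, amount in all_rows:
    client_totals[region] = client_totals.get(region, 0.0) + amount

# Push-down approach: the database aggregates and returns one row per region.
pushed_rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
pushed_totals = dict(pushed_rows)
conn.close()
```

Both approaches give the same totals, but the second moves only the summarized rows to the client, so the memory and compute cost stays on the database side.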
It’s easy to get started. In just a few minutes, you can work in your R session securely inside Snowflake using the RStudio Pro IDE in Posit Workbench. Posit also supports VS Code and Jupyter for data scientists who prefer to work in other languages like Python, so you can continue to use the tools you know and love.
Learn more about the Snowflake and Posit integration: https://posit.co/solutions/snowflake/
Connecting RStudio and Databricks with ODBC
The odbc package, in conjunction with a driver, provides DBI support and an ODBC connection.
With the new odbc::databricks_connect function, you can create an ODBC connection to determine and configure the necessary settings to access your Databricks account. Your Databricks HTTP path is the only argument you need to run databricks_connect(). Provide your HTTP path and you will be able to see your Databricks data in the RStudio Connections Pane. Then, you can analyze your data in RStudio.
Learn more:
- Databricks x Posit: https://posit.co/solutions/databricks/
- Empowering R and Python Developers: Databricks and Posit Announce New Integrations: https://posit.co/blog/databricks-and-posit-announce-new-integrations/
- RStudio IDE and Posit Workbench 2023.12.0: What’s New: https://posit.co/blog/rstudio-2023-12-0-whats-new/
- Posit Professional Drivers 2024.03.0: Support for Apple Silicon: https://posit.co/blog/pro-drivers-2024-03-0/
Contact our sales team to schedule a demo: https://posit.co/schedule-a-call/?booking_calendar__c=Databricks
Connecting RStudio and Databricks with sparklyr
You can connect RStudio and Databricks with the sparklyr package.
First, load the necessary packages. Next, set up the connection with sparklyr::spark_connect(). See your data in the Connections pane. Check which databases you’re connected to. Navigate the data structure by expanding the levels. Navigate to the table you want to explore…and analyze your data in RStudio.
Learn more:
- Databricks x Posit: https://posit.co/solutions/databricks/
- Empowering R and Python Developers: Databricks and Posit Announce New Integrations: https://posit.co/blog/databricks-and-posit-announce-new-integrations/
- Posit Connect: https://posit.co/products/enterprise/connect/
- sparklyr and Databricks Connect v2: https://spark.posit.co/deployment/databricks-connect.html
- Deploying to Posit Connect: https://spark.posit.co/deployment/databricks-posit-connect.html
Contact our sales team to schedule a demo: https://posit.co/schedule-a-call/?booking_calendar__c=Databricks
Databricks Authentication in Posit Workbench
Posit Workbench now has delegated Databricks credentials.
Users can log into a Databricks Workspace when starting an RStudio or VS Code session. Authentication relies on OAuth-backed refresh tokens rather than Personal Access Tokens (PATs). This allows for more granular control over permissions, built-in expiration, and defined scopes. Once logged in, you can interact directly with your Databricks clusters in that Workspace in your preferred environment.
Learn more:
- Databricks x Posit: https://posit.co/solutions/databricks/
- Empowering R and Python Developers: Databricks and Posit Announce New Integrations: https://posit.co/blog/databricks-and-posit-announce-new-integrations/
- RStudio IDE and Posit Workbench 2023.12.0: What’s New: https://posit.co/blog/rstudio-2023-12-0-whats-new/
- Posit Professional Drivers 2024.03.0: Support for Apple Silicon: https://posit.co/blog/pro-drivers-2024-03-0/
Contact our sales team to schedule a demo: https://posit.co/schedule-a-call/?booking_calendar__c=Databricks
Databricks Pane in Posit Workbench
We’ve introduced a Databricks Pane in RStudio Pro for discovering and managing Databricks clusters. Users can stay in RStudio Pro without navigating to the Databricks web interface to see their clusters. Click on the ‘Connect to’ icon to open up a new Databricks Connection pop-up. This forms the initial connection to Databricks. Once you click ‘ok’, you will be connected to Databricks.
Learn more:
- Databricks x Posit: https://posit.co/solutions/databricks/
- Empowering R and Python Developers: Databricks and Posit Announce New Integrations: https://posit.co/blog/databricks-and-posit-announce-new-integrations/
- RStudio IDE and Posit Workbench 2023.12.0: What’s New: https://posit.co/blog/rstudio-2023-12-0-whats-new/
- Posit Professional Drivers 2024.03.0: Support for Apple Silicon: https://posit.co/blog/pro-drivers-2024-03-0/
Contact our sales team to schedule a demo: https://posit.co/schedule-a-call/?booking_calendar__c=Databricks
Databricks Pro Driver in Posit Workbench
We’ve added a Databricks driver to our Professional Drivers. The RStudio Connections Pane allows users to connect to their Databricks clusters from the IDE. Select the Databricks driver from the list of available drivers. Select the Driver to establish the connection. The driver can be used with the new databricks() function from the odbc package to connect to Databricks clusters and SQL warehouses.
Learn more:
- Databricks x Posit: https://posit.co/solutions/databricks/
- Empowering R and Python Developers: Databricks and Posit Announce New Integrations: https://posit.co/blog/databricks-and-posit-announce-new-integrations/
- RStudio IDE and Posit Workbench 2023.12.0: What’s New: https://posit.co/blog/rstudio-2023-12-0-whats-new/
- Posit Professional Drivers 2024.03.0: Support for Apple Silicon: https://posit.co/blog/pro-drivers-2024-03-0/
Contact our sales team to schedule a demo: https://posit.co/schedule-a-call/?booking_calendar__c=Databricks
R-Ladies Rome (English) - Extending the data science workflow: {vetiver} and {pins}
In this video, Isabel Zimmerman goes through the fundamental aspects of machine learning operations (MLOps) tasks, bridging the gap between data analysis and model deployment. While data practitioners excel in data analysis and model development, there’s often a significant gap in understanding tasks beyond the conventional data science workflow.
You’ll explore crucial MLOps concepts, such as deploying models as API endpoints and monitoring model decay, while leveraging the powerful capabilities of the vetiver and pins packages.
Material:
- presentation: https://www.isabelizimm.me/talk-extending-ds-workflow-rladies/
- RStudioConf2022 talk: https://www.isabelizimm.me/talks/rstudioconf2022/
- Vetiver website: https://vetiver.rstudio.com/
0:00 Welcome & R-Ladies Rome Chapter Introduction 0:04:45 Slido Polls 0:10:15 Talk Intro 0:10:56 Isabel’s Talk 0:47:53 Hands-on session 1:02:20 Q&A
Have a look at our website for more insights about our events: https://rladiesrome.quarto.pub/website/talks/

How to build business reports with Quarto
How do you create the report look and feel that your leadership team expects?
Christophe Dervieux at Posit joined us on Wednesday, March 27th to share how to style Quarto docs and send scheduled email updates to required stakeholders.
Helpful resources:
- Getting started with Quarto: https://quarto.org/docs/get-started
- User guide: https://quarto.org/docs/guide
- GitHub repo with this example: https://github.com/quarto-examples/quarto-business-report
- Q&A Recording: https://youtube.com/live/bqk75igHo8M?feature=share
- If you’re interested in learning more about Posit Connect: pos.it/chat-with-us
Timestamps:
02:00 - What is Quarto?
02:40 - How does Quarto work? (.md, .qmd or .ipynb as source files)
03:45 - How to get started with Quarto if you’re new to it?
04:51 - Using Quarto from within RStudio
05:00 - Using Quarto within VSCode with extension & Jupyter Lab extension
05:37 - Visual Editor for Quarto
07:22 - Customer Tracker Report in RStudio IDE (using source code: https://github.com/quarto-examples/quarto-business-report)
10:39 - Making Quarto report downloadable as Excel doc (adding download button)
11:37 - Adding a table of contents to your Quarto report
12:23 - Spread Quarto graphics across page so that they go into margin
13:10 - Customizing theme in Quarto (Bootstrap 5) https://quarto.org/docs/output-formats/html-themes.html
14:45 - Increasing font size in Quarto report
17:10 - Customizing theme rules
21:16 - Publishing Quarto report to Posit Connect
22:35 - Scheduling Quarto report to automatically run
23:35 - Preview of default / non-customized email
23:58 - Customizing your Quarto email
26:52 - Customized email preview that Posit Connect can send
27:56 - Setting access controls for Quarto report on Connect and when you want emails to send
Resources shared in Q&A session:
- Community discussion for ongoing Quarto questions: https://forum.posit.co/tag/quarto
- Quarto document language: https://quarto.org/docs/authoring/language.html
- babelquarto (for multilingual project, book, or website): https://docs.ropensci.org/babelquarto/
- Quarto Manuscripts: https://quarto.org/docs/manuscripts/
- Managing Execution in Quarto: https://quarto.org/docs/projects/code-execution.html
- Quarto Extensions: https://quarto.org/docs/extensions/
- Project Profiles in Quarto: https://quarto.org/docs/projects/profiles.html
- Custom branding deeper dive: https://www.youtube.com/watch?v=V82BBU9ldcM
- Quarto Parameters: https://quarto.org/docs/computations/parameters.html
- Lua Development: https://quarto.org/docs/extensions/lua.html
- Quarto CLI Discussions on GitHub: https://github.com/quarto-dev/quarto-cli/discussions
- Data Science Hangout every Thursday at 12 ET: https://posit.co/data-science-hangout/
- Get connected with others at your org using Posit: pos.it/connect-us
There is no need to register; join us here on YouTube at the time above or you can add to your calendar using the link below:
pos.it/team-demo
We host these Workflow Demos on the last Wednesday of every month, so you can use the link above to add the recurring event as well. If you ever have ideas for topics or questions about them, you can comment below in YouTube!

How to bring modern UI to your Shiny apps
Looking for ways to make your Shiny apps a little…shinier?
Join Garrett Grolemund at Posit on Wednesday, January 31st at 11 am ET to learn how to theme and brand your own apps.
The session will highlight how to:
- Lay out an app with the bslib package (modern UI toolkit with no knowledge of CSS required)
- Add cards, value boxes, and logos
- Customize the theme of the app
- Tweak the theme by swapping out primary colors, secondary colors, and more
- Quickly apply the theme to every plot in the app
- Work with Bootstrap classes
Helpful Resources:
- Code: https://github.com/garrettgman/shiny-styling-demo
- bslib package: https://rstudio.github.io/bslib/
- bsicons package: https://github.com/rstudio/bsicons
- Bootstrap icons: https://icons.getbootstrap.com/
- Bootstrap CSS classes: https://bootstrapshuffle.com/classes
- thematic package: https://rstudio.github.io/thematic/
- gitlink package: https://github.com/colearendt/gitlink
Follow-up links:
- Posit Team: https://posit.co/products/enterprise/team/
- Request evaluation: pos.it/chat-with-us
- Posit Team demo resources: pos.it/demo-resources
LIVE Q&A ROOM for ~11:45 am on January 31st: https://youtube.com/live/1G8ZM6kbt8c?feature=share
There is no need to register; join us here on YouTube at the time above or you can add to your calendar using the link below:
pos.it/team-demo
We host these Workflow Demos on the last Wednesday of every month, so you can use the link above to add the recurring event as well
GitHub Copilot on Posit Cloud
Speed up your coding projects in the RStudio IDE on Posit Cloud with GitHub Copilot, an AI coding assistant.
Learn more in our blog post: https://posit.co/blog/github-copilot-on-posit-cloud/
Posit Cloud: https://posit.cloud/ GitHub Copilot: https://github.com/features/copilot RStudio User Guide: https://docs.posit.co/ide/user/ide/guide/tools/copilot.html
How to collaborate effectively with other data scientists (version control, project sharing, etc.)
’Tis the season for joy and connection, and what better time to extend that collaborative spirit into your data science endeavors?
By popular demand, our upcoming monthly workflow with Ryan Johnson on December 27th is dedicated to enhancing teamwork. It’s recorded, so no worries if you’re out! Or perhaps you’ll add us to the mix of holiday movies and watch from the couch!
This Month’s Focus: All Things Collaborative Working
- Version control
- Git-backed deployment
- Project sharing
Date & Time: Wednesday, December 27th at 11 am ET.
Packages mentioned:
- Shiny: https://shiny.posit.co/
- bslib: https://rstudio.github.io/bslib/
Follow-up links:
- Posit Team: https://posit.co/products/enterprise/ …
- Talk to us directly: pos.it/chat-with-us
- Posit Team demo resources: pos.it/demo-resources
There is no need to register; join us here on YouTube at the time above or you can add to your calendar using the link below:
pos.it/team-demo
We host these Workflow Demos on the last Wednesday of every month, so you can use the link above to add the recurring event as well.
We will use this thread on the Posit Community Forum for follow-up Q&A from this month’s session: https://community.rstudio.com/t/event-on-12-27-collaborative-workflows-w-posit-team-version-control-git-backed-deployment-project-sharing/179181 (shortlink: pos.it/workflow-dec-23)
Happy holidays! Cheers to 2024!
Github Copilot integration with RStudio, it’s finally here! - posit::conf(2023)
Presented by Tom Mock
This talk closes issue #10148, “Github Copilot integration with RStudio”, the most upvoted feature request in RStudio’s history. Code-generating AI tools like GitHub Copilot promise an “AI pair programmer that offers autocomplete-style suggestions as you code”. For the first time, we’ll show a native integration of Copilot into RStudio, helping to build on that promise by providing AI-generated “ghost text” autocompletion with R and other languages. I’ll also provide a comparison of Copilot’s “ghost text” to a chat-style interface in RStudio via the {chattr} package from the Posit MLVerse team.
To make the most of these new features, I’ll walk through some examples of how sharing additional context, comments, code, and other “prompt engineering” can help you go from code-generating AI tools that feel like an annoying backseat driver to an experienced copilot. We’ll close with a robust end-to-end example of how these new RStudio integrations and packages can help you be a more productive developer.
Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.
Talk Track: Data science infrastructure for your org. Session Code: TALK-1117
Large Language Models in RStudio - posit::conf(2023)
Presented by James Wade
Large language models (LLMs), such as ChatGPT, have shown the potential to transform how we code. As an R package developer, I have contributed to the creation of two packages – gptstudio and gpttools – specifically designed to incorporate LLMs into R workflows within the RStudio environment.
The integration of ChatGPT allows users to efficiently add code comments, debug scripts, and address complex coding challenges directly from RStudio. With text embedding and semantic search, we can teach ChatGPT new tricks, resulting in more precise and context-aware responses. This talk will delve into hands-on examples to showcase the practical application of these models, as well as offer my perspective as a recent entrant into public package development.
Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.
Talk Track: I can’t believe it’s not magic: new tools for data science. Session Code: TALK-1154
Succeed in the Life Sciences with R/Python and the Cloud - posit::conf(2023)
Presented by Colby Ford
This talk covers best practices and lessons learned surrounding the use of R and Python by technical teams in the cloud, focusing on Posit Workbench, Azure ML, and Databricks.
In the life sciences, whether it’s pharma, biotech, research, or another type of organization, we are unique in that we blend scientific knowledge with technical skills to extract insights from large, complex datasets. In the cloud, we can architect solutions to help us scale, automate, and collaborate. Interestingly, the use of R and Python by bioinformatics, genomics, biostatistics, and data science teams can be challenging in a cloud-first world where all the data is somewhere other than your laptop (like a data lake).
In this talk, I will share best practices and lessons learned surrounding the use of R and Python by technical teams in the cloud. We’ll focus on the use of Posit Workbench and RStudio on various cloud services such as Azure ML and Databricks.
Tuple, The Cloud Genomics Company: https://tuple.xyz
Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.
Talk Track: Pharma. Session Code: TALK-1069
tidymodels: Adventures in Rewriting a Modeling Pipeline - posit::conf(2023)
Presented by Ryan Timpe
An overview of the benefits unlocked on our data science team by adopting tidymodels.
Data science sure has changed over the past few years! Everyone’s talking about production. RStudio is now Posit. Models are now tidy.
This talk is about embracing that change and updating existing models using the tidymodels framework. I recently completed this change, letting go of our in-production code and revisioning it with tidymodels. My team ended up with a faster, more scalable pipeline that enabled us to better automate our workflow and increase our scale while improving our stakeholders’ experiences.
I’ll share tips and tricks for adopting the tidymodels framework in existing products, best practices for learning and upskilling teams, and advice for using tidymodels packages to build more accessible data science tools.
Materials: https://www.ryantimpe.com/files/tidymodels_adventures_positconf2023.pdf
Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.
Talk Track: Tidy up your models. Session Code: TALK-1082
Towards the Next Generation of Shiny UI
Presented by Carson Sievert
Create awesome looking and feature rich Shiny dashboards using the bslib R package.
Shiny recently celebrated its 10th birthday, and since its birth, has grown tremendously in many areas; however, a hello world Shiny app still looks roughly like it did 10 years ago. The bslib R package helps solve this problem by making it very easy to apply modern and customizable styling to your Shiny apps, R Markdown / Quarto documents, and more. In addition, bslib also provides dashboard-focused UI components like expandable cards, value boxes, sidebar layouts, and more to help you create delightful Shiny dashboards.
Materials:
Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.
Talk Track: Shiny user interfaces. Session Code: TALK-1124

Using Data to Protect Traditional Lifeways - posit::conf(2023)
Presented by Angie Reed
The spirit of Penobscot Nation’s work to protect the health of their relative, the Penobscot River, is embodied in the Penobscot water song which says “Water, we love you, thank you so much water, we respect you.” Because the Penobscot River is not a natural resource - she is a relative, family - this song describes the foundation of our efforts to protect her health and well-being. The identity of Penobscot people cannot be disconnected from the river, and protecting this traditional lifeway is at the heart of our work.
For over a decade we have used R to manage, transform, analyze, and visualize data, and the free, open-source Posit products help us leave a legacy of good data management and the ability to share results with Penobscot Nation citizens. You will learn more about how our use of R has helped us achieve more stringent protections for the Penobscot River and how we engage young people in every step of this work. We are also part of a larger network of tribal environmental professionals, working together to learn R and share data and insights. We will give you information about how you can volunteer to help expand the network of folks providing technical assistance on any R and RStudio related topics.
Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.
Talk Track: End-to-end data science with real-world impact. Session Code: TALK-1144
What an Early 2000s Reality Show Taught Me about File Management - posit::conf(2023)
Presented by Reiko Okamoto
Clutter, whether it’s physical or digital, destroys our ability to focus; home organization ideas can be extended to create a workspace where analysts feel inspired to work with data.
Ideas from home organization shows are surprisingly applicable to file management. Using a room divider to establish dedicated zones for different activities in a studio apartment is analogous to creating self-contained projects in RStudio. Likewise, swapping mismatched hangers with matching ones to tidy a closet resembles the adoption of a file naming convention to make a directory easier to navigate.
In this talk, I will share good practices in file management through the lens of home organization. We all know that clutter, whether it is in our physical space or on our machine, destroys our ability to focus. These practices will help R users of all levels create a serene, relaxing environment where they feel inspired to work with data.
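As an illustration of what a file naming convention can look like, here is a minimal sketch. The talk description does not specify a convention, so the pattern used here (ISO date, underscore, lowercase hyphenated description) and the helper name are assumptions chosen for the example.

```python
import re
from datetime import date


def tidy_file_name(created, description, extension):
    """Build a name like '2023-09-19_model-results.csv' (hypothetical convention)."""
    # Lowercase the description and collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{created.isoformat()}_{slug}.{extension.lstrip('.').lower()}"


names = [
    tidy_file_name(date(2023, 9, 20), "Final Report (v2)", "PDF"),
    tidy_file_name(date(2023, 9, 19), "Model results", ".csv"),
]

# ISO dates sort chronologically when sorted as plain text.
ordered = sorted(names)
```

Because the date comes first and is zero-padded, an alphabetical directory listing is also a chronological one, which is much of what makes such a directory easy to navigate.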
https://reikookamoto.github.io/; https://github.com/reikookamoto/posit-conf-2023-neat
Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.
Talk Track: Getting %$!@ done: productive workflows for data science. Session Code: TALK-1090
What’s New in Quarto?* - posit::conf(2023)
Presented by Charlotte Wickham
It’s been over a year since Quarto 1.0, an open-source scientific and technical publishing system, was announced at rstudio::conf(2022). In this talk, I’ll highlight some of the improvements to Quarto since then. You’ll learn about new formats, options, tools, and ways to supercharge your content. And, if you haven’t used Quarto yet, come to see some reasons to try it out.
Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.
Talk Track: Quarto (1). Session Code: TALK-1072

Edgar Ruiz - GitHub Copilot in RStudio
GitHub Copilot in RStudio - Edgar Ruiz
Presentation slides available at https://colorado.posit.co/rsc/rstudio-copilot/#/TitleSlide
Speaker Bio: Edgar Ruiz is a solutions engineer at Posit with a background in deploying enterprise reporting and business intelligence solutions. He is the author of multiple articles and blog posts sharing analytics insights and server infrastructure for data science. Edgar is the author and administrator of the https://db.rstudio.com web site, and current administrator of the sparklyr web site: https://spark.rstudio.com. He is a co-author of the dbplyr package and creator of the dbplot, tidypredict, and modeldb packages.
Presented at the 2023 R/Pharma Conference (October 26, 2023)

Max Kuhn - Serverless Quarto
Serverless Quarto - Max Kuhn
Resources mentioned in the presentation:
- Slides: https://topepo.github.io/2023-r-pharma
- Example: https://topepo.github.io/shinylive-in-book-test
Bio: Max Kuhn is a software engineer at Posit (née RStudio). He is working on improving R’s modeling capabilities and maintaining about 30 packages, including caret. He was a Senior Director of Nonclinical Statistics at Pfizer and had been applying models in the pharmaceutical and diagnostic industries for over 18 years. Max has a Ph.D. in Biostatistics. He, and Kjell Johnson, wrote the book Applied Predictive Modeling, which won the Ziegel award from the American Statistical Association. Their second book, Feature Engineering and Selection, was published in 2019, and his book Tidy Models with R, was published in 2022.
Presented at the 2023 R/Pharma Conference (October 25, 2023)

J.J. Allaire - Keynote: Dashboards with Jupyter and Quarto | PyData NYC 2023
https://drive.google.com/file/d/1O_ed6OKEXZBIzKn6yyF9W7f6NaNV-L3J/view?usp=drive_link
Keynote by JJ Allaire
J.J. is the Founder and CEO of Posit (which you might only know by its previous name, RStudio). J.J. is now leading the Quarto project, a Jupyter-based scientific and technical publishing system. In this talk, J.J. will introduce Quarto Dashboards, an easy way to create production quality dashboards from Jupyter Notebooks. J.J. will also more broadly discuss Posit’s recent work in the open source PyData ecosystem along with plans for significantly expanding that work in the future.
PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
00:00 Welcome! 00:10 Help us add time stamps or captions to this video! See the description for details.
Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps

Posit Cloud Essentials | Ep. 5: Teaching data courses and workshops
On the last Tuesday of every month, we host an event – Posit Cloud Essentials – where we explore the ins and outs of Posit Cloud, diving into its key features, valuable tips, and real-world use cases. The event is open to all and hosted on YouTube with a live Q&A during each month’s event.
THIS MONTH’S EVENT: You’re an expert in teaching data, so why waste time configuring servers or desktop environments for your students? With Posit Cloud, you can teach data skills from your web browser, with pre-configured projects, simple sharing, and flexible subscription options.
Posit Cloud has helped thousands of data educators by providing easy access to the coding tools relied on by companies today. This demo will highlight Posit Cloud features that support course management and curriculum development centered around RStudio and Jupyter Notebook projects.
WHAT IS POSIT CLOUD? Posit Cloud makes it easy to move your entire workflow into a unified online experience with project management and publishing capabilities. Use your favorite coding languages and environments and share your work seamlessly with others, all from the comfort of your web browser.
EXPLORE MORE ABOUT POSIT CLOUD: Create a free Posit Cloud account → https://posit.cloud/plans?utm_source=youtube&utm_medium=organic_social&utm_campaign=cloud_launch
View the entire library of Posit Cloud Essential events → https://youtube.com/playlist?list=PL9HYL-VRX0oS4CXCA81eno41u8K3ckGVH&feature=shared
GitHub Copilot in RStudio, it’s finally here!
Thomas Mock, PhD, Workbench Product Manager at Posit PBC.
In this webinar, part of a new quarterly R/Med seminar series, Thomas demonstrates how to set up Copilot in RStudio and then provides examples of using it to generate code by providing context and comments. Some key points covered include:
- How Copilot works by predicting the next token based on context.
- Tips for using Copilot effectively, such as breaking problems down into simple, specific steps and using comments.
- Examples of Copilot generating functions, tests, and repeating tasks.
- Using other tools like the {chattr} package to ask questions when stuck.
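The idea of predicting the next token from context can be illustrated with a deliberately tiny sketch. This is not how Copilot is implemented (Copilot relies on a large neural language model); it is a toy bigram counter, and the names `train_bigram` and `predict_next` are made up for this illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the training text."""
    following = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, context_token):
    """Return the most frequent follower of context_token, or None if unseen."""
    counts = following.get(context_token)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Toy "code" corpus, split on whitespace (an assumption made for simplicity).
corpus = "df = read_csv ( path ) ; df = filter ( df ) ; df = read_csv ( path )".split()
model = train_bigram(corpus)
print(predict_next(model, "="))
```

A richer context (more preceding tokens, comments, open files) narrows the set of plausible continuations, which is one way to read the advice about giving Copilot simple, specific prompts.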
Main Sections
00:00 Intro 01:35 What is generative AI? 04:27 What is Copilot? 08:50 Copilot in RStudio 10:15 Get started 13:01 Getting the most out of the generative loop 15:14 Simple and specific 24:29 Getting stuck? 28:53 {chattr} package 31:17 Generative AI tools with Posit Workbench and RStudio 32:43 Examples using Copilot inside RStudio 51:42 Q&A and RStudio User Guide
More Resources
R Medicine Virtual Conference 2023: https://www.youtube.com/playlist?list=PL4IzsxWztPdlpR3NqGzUI01M4_jqzIWqo
R Consortium https://www.r-consortium.org/ Blog: https://www.r-consortium.org/news/blog Join: https://www.r-consortium.org/about/join Twitter: https://twitter.com/Rconsortium LinkedIn: https://www.linkedin.com/company/r-consortium Mastodon: https://fosstodon.org/@RConsortium
Keynote, JJ Allaire: Reproducible Manuscripts with Quarto
J.J. Allaire, CEO at Posit, PBC. J.J. is a software engineer and entrepreneur who builds tools that empower people with technology. J.J. has conceived and designed several industry-leading products by balancing market, customer, and technical considerations, and by maintaining intimate involvement in all aspects of software design and construction. He is currently the founder and CEO of statistical computing company RStudio (now a part of Posit, https://posit.co/). https://github.com/jjallaire https://mobile.twitter.com/fly_upside_down

How to keep data up-to-date with 6 pins workflows (aka avoid data-final.csv & data-final-final.csv)
Ever chase a CSV through a series of emails or had to decide between data-final.csv and data-final-final.csv?
Pins (both for R & Python) is a package that a bunch of people at the Data Science Hangout wish they knew about earlier. It allows you to publish and share objects (data, models, etc.) across projects and with your colleagues.
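To make the idea of publishing and sharing versioned objects concrete, here is a minimal standard-library sketch. It does not use the pins package or its API; `ToyBoard` and its `write` and `read` methods are hypothetical names that only mimic the concept of a named, versioned object stored in a shared location.

```python
import json
import tempfile
from pathlib import Path


class ToyBoard:
    """Hypothetical stand-in for a pin board: a folder of named, versioned JSON files."""

    def __init__(self, root):
        self.root = Path(root)

    def write(self, name, obj):
        """Store obj as a new version of the pin `name`; return the version number."""
        pin_dir = self.root / name
        pin_dir.mkdir(parents=True, exist_ok=True)
        version = len(list(pin_dir.glob("v*.json"))) + 1
        (pin_dir / f"v{version}.json").write_text(json.dumps(obj))
        return version

    def read(self, name, version=None):
        """Load the requested version of a pin, defaulting to the latest one."""
        pin_dir = self.root / name
        if version is None:
            version = len(list(pin_dir.glob("v*.json")))
        return json.loads((pin_dir / f"v{version}.json").read_text())


# A temporary folder stands in for a shared location such as a server.
board = ToyBoard(tempfile.mkdtemp())
board.write("weekly_sales", {"week": 1, "total": 120})
board.write("weekly_sales", {"week": 2, "total": 135})
latest = board.read("weekly_sales")
first = board.read("weekly_sales", version=1)
print(latest, first)
```

Readers always ask for the pin by name, so there is no need to pass around files such as data-final.csv; older versions stay retrievable by number.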
Pins package (R) - https://pins.rstudio.com/ Pins package (Python) - https://pypi.org/project/pins/
Timestamps: 1:15 - Posit Team Overview 2:18 - Introduction to pins (scenarios where you might want to consider using pins) 4:42 - Installing pins 6:24 - Workflow #1: Pinning an R Object to Posit Connect (from RStudio) 10:23 - Workflow #2: Pinning a Python Object to Posit Connect (from JupyterLab) 15:19 - Workflow #3: Reading in a Python pin in an R Session 16:07 - Workflow #4: Reading an R pin into a Python session 17:50 - Workflow #5: Pin versioning 21:50 - Workflow #6: Automating the pin writing process (through job scheduling on Connect)
Helpful resources: Q&A for this session on August 30th: https://youtube.com/live/8hc9ck1ZNLE Blog post on pinning an R dataset to Posit Connect: https://posit.co/blog/pins-posit-connect/
Many people find this useful for:
- Scheduling reports that need to be updated with the newest data each week
- Reusing data across multiple projects or content (Shiny app, Jupyter Notebook, Quarto doc, etc.)
We host these end-to-end workflow demos on the last Wednesday of every month. No registration is required to attend - simply add it to your calendar using this link: pos.it/team-demo
If you ever have ideas for topics or questions about them, please let us know in the comments.
Posit Cloud Essentials | Ep 2: Managing Data Projects with Spaces
On the last Tuesday of every month, we host an event – Posit Cloud Essentials – where we explore the ins and outs of Posit Cloud, diving into its key features, valuable tips, and real-world use cases. The event is open to all and hosted on YouTube with a live Q&A during each month’s event.
This month, Alex Chisholm, Product Manager for Posit Cloud, walks through how to manage your R and Python data projects with Spaces in Posit Cloud.
You can organize code with RStudio and Jupyter Notebooks, host interactive applications, and publish data-driven documents all in one dedicated environment. Then you can invite others to join your space, carefully setting permissions regarding what they can see and do.
This demo will teach you everything you need to know about Posit Cloud spaces through an example of how a data consultant might organize their work.
What is Posit Cloud? Posit Cloud makes it easy to move your entire workflow into a unified online experience, complete with project management and publishing capabilities. Use your favorite coding languages and environments and share your work seamlessly with others, all from the comfort of your own web browser.
No registration is required to join the events. Simply add the event to your calendar using the link below.
Create a free Posit Cloud account → https://posit.cloud/plans?utm_source=youtube&utm_medium=organic_social&utm_campaign=cloud_launch
Add future Posit Cloud Essential events to your calendar → http://evt.to/adahaeuow
View the full library of Posit Cloud Essential events → https://youtube.com/playlist?list=PL9HYL-VRX0oS4CXCA81eno41u8K3ckGVH&feature=shared
Building an MLOps strategy from the ground up - Isabel Zimmerman, RStudio PBC | Crunch 2022
This talk was recorded at Crunch Conference 2022. Isabel from RStudio PBC spoke about building an MLOps strategy from the ground up.
“By the end of this talk, people will understand what the term MLOps entails, different options for deployment, and when different methods work best.”
The event was organized by Crafthub.
You can watch the rest of the conference talks on our channel.
If you are interested in more speakers, tickets and details of the conference, check out our website: https://crunchconf.com/ If you are interested in more events from our company: https://crafthub.events/

Teaching the tidyverse in 2023 | Mine Çetinkaya-Rundel
Recommendations for teaching the tidyverse in 2023, summarizing package updates most relevant for teaching data science with the tidyverse, particularly to new learners.
00:00 Introduction 00:46 Using addins to switch between RStudio themes (See https://github.com/mine-cetinkaya-rundel/addmins for more info) 01:40 Native pipe 03:08 Nine core packages in tidyverse 2.0.0 07:15 Conflict resolution in the tidyverse 11:30 Improved and expanded *_join() functionality 22:05 Per operation grouping 27:41 Quality of life improvements to case_when() and if_else() 31:41 New syntax for separating columns 34:51 New argument for line geoms: linewidth 36:08 Wrap up
See more in the Teaching the tidyverse in 2023 blog post https://www.tidyverse.org/blog/2023/08/teach-tidyverse-23

Daniel Chen - Moving to Quarto from RMarkdown and Python Jupyter Notebooks
Moving to Quarto from RMarkdown and Python Jupyter Notebooks by Daniel Chen
Visit https://rstats.ai/nyr to learn more.
Bio: Daniel teaches data science at UBC and works as a data science educator for RStudio, working on the RStudio Academy team. He just moved to Vancouver by car in a cross-country across-border road trip with his dad.
Twitter: https://twitter.com/chendaniely
Presented at the 2023 New York R Conference (July 13, 2023)
Max Kuhn - The Post-Modeling Model to Fix the Model
The Post-Modeling Model to Fix the Model by Max Kuhn
Visit https://rstats.ai/nyr to learn more.
Abstract: It’s possible to get a model that has good numerical performance but has predictions that are not really consistent with the data. Model calibration is a tool that can fix this. We’ll show some examples of poor predictions and how different calibration tools can re-align them to the data.
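One way to see whether predictions are consistent with the data is a reliability table: group predicted probabilities into bins and compare the average prediction in each bin with the observed event rate. The sketch below is only a diagnostic under simple assumptions (equal-width bins, made-up data); it is not one of the calibration tools from the talk, and `reliability_table` is a name invented for this example.

```python
def reliability_table(probs, outcomes, n_bins=2):
    """Bin predictions into equal-width bins; compare mean prediction with observed rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue  # skip empty bins
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((round(mean_pred, 3), round(observed, 3), len(members)))
    return rows


# Made-up predictions and outcomes for illustration only.
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 0, 1, 0, 1, 0, 1]
rows = reliability_table(probs, outcomes)
for mean_pred, observed, count in rows:
    print(f"mean predicted={mean_pred}, observed rate={observed}, n={count}")
```

In this toy data the lower bin is well aligned (0.25 predicted, 0.25 observed), while the upper bin predicts 0.75 on average but the event occurs only half the time. That gap is the kind of misalignment a calibration step aims to reduce.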
Bio: Max Kuhn is a software engineer at RStudio. He is currently working on improving R’s modeling capabilities. He was a Senior Director of Nonclinical Statistics at Pfizer Global R&D in Connecticut. He was applying models in the pharmaceutical and diagnostic industries for over 18 years. Max has a Ph.D. in Biostatistics. Max is the author of numerous R packages for techniques in machine learning and reproducible research. He, and Kjell Johnson, wrote the book Applied Predictive Modeling, which won the Ziegel award from the American Statistical Association, which recognizes the best book reviewed in Technometrics in 2015. Their second book, Feature Engineering and Selection, was published in 2019.
Twitter: https://twitter.com/topepos
Presented at the 2023 New York R Conference (July 14, 2023)

Why RStudio is now Posit (J.J. Allaire | Posit CEO) - KNN Ep. 158
Today, I had the pleasure of interviewing J.J. Allaire. J.J. is the founder of RStudio and the creator of the RStudio IDE. He is an author of several packages in the R Markdown publishing ecosystem including rmarkdown, flexdashboard, learnr, and distill, and also worked extensively on the R interfaces to Python, Spark, and TensorFlow. J.J. is now leading the Quarto project, which is a new Jupyter-based scientific and technical publishing system. In this episode, we learn about why RStudio has now repositioned itself as Posit, how it maximizes its open-source nature as a B Corp, and how J.J. as an open-source advocate views the private nature of many LLMs. I really enjoyed this conversation, and I hope you will as well!
Posit - https://posit.co/
Podcast Sponsors, Affiliates, and Partners:
- Pathrise - http://pathrise.com/KenJee | Career mentorship for job applicants (Free till you land a job)
- Taro - http://jointaro.com/r/kenj308 (20% discount) | Career mentorship if you already have a job
- 365 Data Science (57% discount) - https://365datascience.pxf.io/P0jbBY | Learn data science today
- Interview Query (10% discount) - https://www.interviewquery.com/?ref=kenjee | Interview prep questions
Listen to Ken’s Nearest Neighbors on all the main podcast platforms! On Apple Podcasts: https://podcasts.apple.com/us/podcast/kens-nearest-neighbors/id1538368692 (Please rate if you enjoy it!) On Spotify: https://open.spotify.com/show/7fJsuxiZl4TS1hqPUmDFbl On Google: https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5idXp6c3Byb3V0LmNvbS8xNDMwMDQxLnJzcw?sa=X&ved=0CAMQ4aUDahcKEwjQ2bGBhfbsAhUAAAAAHQAAAAAQAQ
MORE DATA SCIENCE CONTENT HERE: My Twitter - https://twitter.com/KenJee_DS LinkedIn - https://www.linkedin.com/in/kenjee/ Kaggle - https://www.kaggle.com/kenjee Medium Articles - https://medium.com/@kenneth.b.jee Github - https://github.com/PlayingNumbers My Sports Blog - https://www.playingnumbers.com 66DaysOfData Discord Server - https://discord.com/invite/4p37sy5muZ
[84] Reproducible Publications with Python and Quarto (Thomas Mock)
Join our Meetup group: https://www.meetup.com/data-umbrella
Tom Mock: Reproducible Publications with Python and Quarto
Resources
Full transcript
https://blog.dataumbrella.org/quarto-blog
About the Event
Quarto is an open-source scientific and technical publishing system that builds on standard markdown with features essential for scientific communication. The system has support for reproducible embedded computations, equations, citations, crossrefs, figure panels, callouts, advanced layouts, and more. In this talk we’ll explore the use of Quarto with Python, describing both integration with IPython/Jupyter and the Quarto VS Code extension. Users can author Jupyter notebooks or plain-text markdown documents with code in Python, R, Julia, or Observable. Quarto includes the ability to publish high-quality articles, reports, presentations, websites, blogs, and books in HTML, PDF, MS Word, ePub, Reveal.js and more.
Timestamps
00:00 Data Umbrella introduction 03:41 Introduce the speaker, Thomas Mock 04:14 Thomas begins 05:14 RStudio is now Posit 05:55 What is Quarto? 07:13 Origins of Quarto 08:31 Goal: Computation Document 09:09 Goal: Scientific Markdown 10:03 Goal: Single Source Publishing 10:33 Simple example of what Quarto looks like (YAML, Markup, Markdown, code chunks) 12:29 Simple example: multi-format (output formats: html, pdf, docx, epub, pptx, revealjs) 13:16 List of what is possible with Quarto 14:02 So, what is Quarto: quarto is a language-agnostic command line interface (CLI) 15:27 Basic Quarto workflow 16:43 Difference between “render” and “preview” 17:16 IPython 18:43 Stored/frozen computation and reproducibility 20:36 A *.qmd is a plain text file 21:28 Quarto doesn’t have to be plain text 22:12 Rendering pipeline 22:57 What to do with my existing .ipynb? 24:23 Comfort of your own workspace: JupyterLab, Visual Studio Code, 25:00 Auto-completion in RStudio + VSCode 26:01 Quarto Extensions and Visual / Live Editor 27:19 Quarto, unified document layout 29:54 Quarto, unified syntax across Markdown and code 31:11 Built-in vs Custom 33:01 Extending Quarto with Extensions 33:51 Interactivity, Jupyter Widgets (with plots, matplotlib, etc) 34:15 Interactivity, Observable 35:01 Interactivity, on the fly Observable “widgets” 36:24 Parameters - one source, many outputs 37:36 Rendering with parameters 38:27 Quarto Publish 38:57 Quarto, crafted with love and care (the team) 39:30 Quarto Resources (installation) 39:44 Quarto resources: video tutorials 40:13 Q: Can Quarto documents be shared like Overleaf docs and can users import article templates for specific journals into Quarto? 41:39 new! Manuscript option to bundle an entire project together (bundle can be shipped to a journal) 42:48 Q: Is Quarto git friendly? 43:28 Q: Has Quarto already been used in published scientific work? 44:14 publishing books with Quarto 44:22 Q: Any general suggestions for outputting to docx (Word)? 
45:20 Q: Any tips on how Quarto can help conda users? 46:14 Q: Can you use GitHub Actions with Quarto? 47:18 Q: Can you have individual environments for each blog post? 49:50 Download CLI (command line interface) for Quarto 51:10 Example Gallery 51:44 nbdev project 53:14 Quarto blog, Shinylive extension 55:12 Q: How can I use Quarto to write scientific papers?
About the Speaker: Tom Mock
- Twitter: https://twitter.com/thomas_mock
- GitHub: https://github.com/jthomasmock
#python #quarto #rstats
How to schedule a Quarto document on Posit Connect
Episode 3: Scheduling a Quarto Doc (with custom branding) on Posit Connect Led by: Ryan Johnson, Data Science Advisor
Live Q&A recording: https://youtu.be/JUgChPCa3vs
Follow-up links:
- Posit Team: https://posit.co/products/enterprise/team/
- Talk to us directly: https://posit.co/schedule-a-call/?booking_calendar__c=RST_YT_Demo
- Follow-along blog post: https://posit.co/blog/scheduling-a-quarto-doc-on-posit-connect/
- Source code for example: https://github.com/ryjohnson09/quarto-job-scheduling
- Posit Team demo resources: pos.it/demo-resources
Timestamps: 1:45 - What is Posit Team? 3:31 - The data we are analyzing: R package download data from within Posit Package Manager (via experimental API) 4:44 - What is Quarto? (we are creating two Quarto docs today) 8:06 - Create a new session in Posit Workbench 8:56 - Create a new project within the RStudio IDE 10:13 - Create the first Quarto document (ETL: extract, transform, load workflow) 13:55 - Publish the first Quarto doc to Posit Connect 16:50 - Take the package download results and pin it to Posit Connect (overview of pins) 18:30 - Schedule the first Quarto doc to run every day at 7am on Posit Connect 20:49 - Create the second Quarto document (report for stakeholders) 23:41 - View the first “boring report” 24:54 - Using a custom Posit format for the report 27:39 - First look of the themed report without modifications 29:00 - Adding Posit themed colors to the gt table 31:35 - Apply code chunk options to hide code from output 32:44 - Publish the second Quarto document to Posit Connect 34:28 - View finished custom branded Quarto document 34:44 - Define specific users who have access to the Quarto doc on Posit Connect 35:10 - Schedule the second Quarto doc to read in the pinned data from the first Quarto doc 36:30 - Example emailed report from the scheduled Quarto report
On the last Wednesday of every month, we host a Posit Team demo and Q&A session that is open to all. You can use this to add the event to your own calendar.
Who are these monthly demos for? Everyone is welcome to join us - regardless of industry, background, or experience!
We will discuss topics that will speak to:
- Data scientists and administrators who are new to Posit Team or are looking to grow their understanding of our toolchain,
- Teams searching for a new analytic platform built to support open-source data science,
- And, those that are just curious about Posit Team!
What you can expect from the monthly Posit Team demo:
During the session, we will walk through an end-to-end data science workflow and demo the core functionality of Posit Team while highlighting some of our latest features!
While each session’s content will vary slightly, here are a few core topics we will address each month:
- Open Source Analytics: The future of data science is open source. We’ll discuss methods for leveraging open-source tools and packages in a secure and scalable way!
- Deployment: How to share the amazing data science assets your Team has built, including web applications, machine learning models, APIs, and more!
- Data Access: Data comes in various forms and is stored in various ways. We’ll discuss best practices for accessing, reading, and writing data!
- Job Scheduling: Do you have recurring data science jobs? We’ll show you how to automate these processes using Posit Connect.
What is Posit Team?
Posit Team is a bundle of our popular professional software (Posit Workbench, Posit Connect, and Posit Package Manager) for developing data science projects, publishing data products, and managing packages.
Registration is not required. The event will be streamed through YouTube Premiere.
RStudio + Amazon SageMaker | Build Beyond Your Laptop
Did you know that you can use RStudio, the best IDE for R and Python users, with Amazon SageMaker?
RStudio on Amazon SageMaker lets R users quickly get started coding in RStudio on AWS from their browser, with no server setup required, through a new integration with Posit Workbench.
In this webinar, Posit team members will show you how to get started with RStudio on Amazon SageMaker to analyze your organization’s data in S3 and train ML models.
As a fully managed offering on Amazon SageMaker, this release makes it easy for DevOps teams and IT Admins to administer, secure, and scale their organization’s centralized data science infrastructure with familiar AWS tools and frameworks.
Learn more at: https://posit.co/products/cloud/sagemaker/ Talk to us about using RStudio and SageMaker: https://posit.co/schedule-a-call/?booking_calendar__c=Sagemaker
How to deploy a Shiny application using clinical trial data to Posit Connect
Episode 2: Publishing a Shiny application in R to Posit Connect - Using Clinical Trial Data Led by: Ryan Johnson, Data Science Advisor
Follow-up links:
- Posit Team: https://posit.co/products/enterprise/team/
- Talk to us directly: https://posit.co/schedule-a-call/?booking_calendar__c=RST_YT_Demo
- Follow-along blog post: https://posit.co/blog/publishing-a-shiny-app-in-r-with-clinical-trial-data-to-posit-connect/
- Source code for example: https://github.com/ryjohnson09/adam_analysis
- Posit Team demo resources: pos.it/demo-resources
Timestamps: 1:35 - High-level overview of Posit Team 3:30 - Overview of clinical trial data used 5:31 - Opening up RStudio session on Posit Workbench 7:51 - Creating a new directory in RStudio 9:16 - Upload the ADaM dataset to Posit Workbench 10:17 - Using packages from a validated repository on Posit Package Manager 12:37 - Install packages for your Shiny application 13:49 - Pasting the code for the Shiny application (https://github.com/ryjohnson09/adam_analysis ) 16:16 - Publishing your Shiny application to Posit Connect 18:36 - Changing access controls to published Shiny application 20:25 - Using renv to record your R environment
Get started with Quarto | Mine Çetinkaya-Rundel
This video walks you through creating documents, presentations, and websites and publishing with Quarto. The video features authoring Quarto documents with executable R code chunks using the RStudio Visual Editor (https://quarto.org/docs/visual-editor/).
00:00 Introduction 00:34 Authoring a document with Quarto 01:13 Using the RStudio visual editor 04:13 Code chunks and chunk options 06:31 Inserting cross references to figures and tables (https://quarto.org/docs/authoring/cross-references.html ) 08:56 Adding a citation from a DOI (https://quarto.org/docs/visual-editor/technical.html#citations ) 10:10 Seamlessly switching between output formats 10:58 Creating Quarto presentations (https://quarto.org/docs/presentations/ ) 14:36 Customizing the output location of code in presentations (https://quarto.org/docs/presentations/revealjs/#output-location ) 16:09 Creating a website from scratch (https://quarto.org/docs/websites/ ) 19:19 Creating multi-format documents (https://quarto.org/docs/output-formats/html-multi-format.html ) 20:22 Publishing the website to QuartoPub (https://quarto.org/docs/publishing/quarto-pub.html )

[79] Create a Python Web App Using Shiny (Gordon Shotwell)
Join our Meetup group for more events! https://www.meetup.com/data-umbrella
Resources
- website for presentation: https://shiny.rstudio.com/py/
- https://shiny.rstudio.com/py/docs/reactive-mutable.html
- https://www.shinyapps.io/
- https://huggingface.co/new-space
- https://shiny.rstudio.com/py/docs/shinylive.html
- https://shiny.rstudio.com/py/api/reactive.poll.html#shiny.reactive.poll
About the Event
Shiny makes it easy to build interactive web applications with the power of Python’s data and scientific stack. If you want to develop a Python web application you usually need to choose between simple, limited frameworks like Streamlit and more extensible frameworks like Dash. This can cause a lot of problems if you get started with a simple framework but then discover that you need to refactor your application to accommodate the next user request. Shiny for Python differs from other frameworks because it has tremendous range. You can build a small application in a few minutes with the confidence that the framework can handle much more complex problems. In this workshop we will go through the core limitations of Streamlit, and build a Shiny app which avoids those limitations.
Timestamps
00:00 Welcome
00:23 Reshama introduces Data Umbrella
03:45 Reshama introduces Gordon Shotwell
04:21 Gordon Shotwell begins
04:29 The motivation to develop Shiny for Python
06:05 The main strength of both the R and Python library
06:56 What Gordon Shotwell will build during his presentation
07:25 Shiny documentation website
08:01 QuickStart for R users showing differences between the R and Python libraries
08:44 All the function reference in Shiny
09:08 Demo starts
09:50 Virtual environment
10:36 How to start shiny app in the terminal
11:15 Install shiny extension in VS Code which makes it easier to preview the web app
11:36 How the output function works on the preview app to execute
12:22 Penguin dataset description for the demo
12:45 Modules/submodules shiny app is built on
13:04 How to add a sidebar layout (sidebar, panel sidebar and panel main)
13:43 How to read in the data and the output functions
14:31 How to define some server logic
14:59 The conventional shiny rule
16:30 Use of slider input
17:50 Where the reactive magic comes in
19:30 Important note on what can really slow down your shiny app
20:14 Importance of Python data copy method when using external dataset
21:01 Important note to avoid dependency inside the render function
21:30 Q&A
29:35 Adding a plot to the output: The UI sides
30:12 Adding a plot to the output: The render sides
32:16 The core principle of reactivity in which you do not want to repeat yourself
33:26 Reactive calculation concept which allows you to store intermediate values in one place
37:24 Q&A
38:53 Reactive calculations and rendering functions
39:30 Side-effects or user effect. Another class of interactions
41:18 How to tell reactive effect what it should respond to or what events to watch before executing
41:53 How to update the data filter in the side-effect function
42:22 The second important pattern for shiny
43:00 One of the important things to pay attention to once you start learning/using shiny
44:45 Series of Q&A until the end of the video. Some responses include live demos
01:01:03 Gordon Shotwell ends his presentation
01:01:17 Reshama closes the session
About the Speaker
Gordon Shotwell is a Software Engineer at Posit. He’s been using Shiny to solve business problems for the past ten years.
- LinkedIn: https://www.linkedin.com/in/gshotwell/
Key Links
- Transcript: https://github.com/data-umbrella/event-transcripts/blob/main/2023/78-gordon-shiny.md
- Meetup Event: https://www.meetup.com/data-umbrella/events/292848290/
- Video: https://youtu.be/pXidQWYY14w
#python #deployment
posit::conf(2023) Workshop: Advanced Quarto with R + RStudio
Register now: http://pos.it/conf Instructor: Andrew Bray Workshop Duration: 1-Day Workshop
This course is for you if you:
- have a basic knowledge of how to use the RStudio IDE
- have experience working with single R Markdown and/or Quarto files
- are excited to author multi-document projects like books, websites, and blogs
Participants who are new to computational documents will benefit from taking Intro to Quarto with R and RStudio: Documents and Presentations before joining this workshop.
This workshop will prepare you to author a rich array of documents in Quarto, the next generation of R Markdown. Quarto is an open-source scientific and technical publishing system that offers multilingual programming language support to create dynamic and static documents, books, presentations, blogs, and other online resources.
The focus for this workshop will be on projects that weave together multiple documents and allow you to write books and build websites. You will also learn various ways to deploy and publish your Quarto projects on the web.
posit::conf(2023) Workshop: Fundamentals of Package Development
Register now: http://pos.it/conf Instructor: Andy Teucher Workshop Duration: 1-Day Workshop
This workshop is for you if:
- You have written several R scripts and find yourself wondering how to reuse or share the code you’ve written
- You know how to write functions in R
- You are looking for a way to take the next step in your R programming journey
We will be demonstrating some workflows using Git and GitHub. Knowledge of these tools is not required, and you will absolutely be able to complete the workshop without them, but some of the lessons will be more rewarding to you if you are prepared to try them out. If you are looking to get started with Git and GitHub, we recommend you register for the “What they forgot to teach you about R” workshop on Day 1, and join us for this workshop on Day 2.
We are often faced with the need to share our code with others, or find ourselves writing similar code over and over again across different projects. In R, the fundamental unit of reusable code is a package, containing helpful functions, documentation, and sometimes sample data. This workshop will teach you the fundamentals of package development in R, using tools and principles developed and used extensively by the tidyverse team - specifically the ‘devtools’ family of packages including usethis, testthat, and roxygen2. These packages and workflows help you focus on the contents of your package rather than the minutiae of package structure.
You will learn the structure of a package, how to organize your code, and workflows to help you develop your package iteratively. You will learn how to write good documentation so that users can learn how to use your package, and how to use automated testing to ensure it is functioning the way you expect it to, now and into the future. You will also learn how to check your package for common problems, and how to distribute your package for others to use.
This will be an interactive 1-day workshop, and we will be using the RStudio IDE to work through the materials, as it has been designed to work well with the development practices we will be featuring.
posit::conf(2023) Workshop: Introduction to Quarto with R + RStudio
Register now: http://pos.it/conf Instructor: Andrew Bray Workshop Duration: 1-Day Workshop
This course is for you if you: • have a basic knowledge of how to use the RStudio IDE • have some familiarity with markdown, or • are excited to author flexible single documents like technical reports and slide presentations
Seasoned users of R Markdown will get more out of the Advanced Quarto with R and RStudio: Projects, Websites, Books, and More workshop, which is focused on projects, a distinct strength of Quarto in authoring work that spans multiple documents.
This workshop will prepare you to author a rich array of documents in Quarto, the next generation of R Markdown. Quarto is an open-source scientific and technical publishing system that offers multilingual programming language support to create dynamic and static documents, books, presentations, blogs, and other online resources.
The focus for this workshop will be on single documents. You will learn to create static documents, to add interactivity to them with Shiny and htmlwidgets, or steer them in the direction of sophisticated scientific documents. In the afternoon you’ll take the same authoring approaches to create slide presentations in various formats such as reveal.js, beamer, and pptx.
posit::conf(2023) Workshop: Steal like an Rtist: Creative Coding in R
Register now: http://pos.it/conf Instructors: Ijeamaka Anyene Fumagalli & Sharla Gelfand Workshop Duration: 1-Day Workshop
This workshop is for you if you: • are comfortable with R and RStudio and have experience with tidyverse and ggplot2 • are interested in applying data visualization skills more creatively, but may not know where to start or how to develop style/inspiration • are an artist interested in exploring code as another medium for creating your work
R is a tool for data analysis but also can be used for self-expression. This workshop will be an introduction to creative coding in R in order to make visual art. We will take an inspiration-first approach, using compelling pieces to discuss and learn the techniques that shape the work. This workshop takes guidance from its namesake, the book “Steal Like An Artist” by Austin Kleon - once we have identified and learned to recreate existing works, we will cover how to take this inspiration and transform, remix, or reinterpret it in the pursuit of developing our own work and artistic styles.
This workshop is hands-on and will cover color theory and manipulation, a reintroduction of the data frame as the foundation for creating art (instead of just for analyzing data!), using ggplot2 as an artistic canvas, creating basic and specialized shapes, tiling and pattern making, developing your own functions and using iteration. We will also discuss how to use controlled randomness to convert a standalone piece into a generative art system that can produce many distinct outputs. Creative coding may seem a world apart from data analysis, but we see a large overlap and intersection of the skills used in both, not to mention the creative muscles that are already used in data visualization.
posit::conf(2023) Workshop: Teaching Data Science Masterclass
Register now: http://pos.it/conf Instructor: Dr. Mine Çetinkaya-Rundel Workshop Duration: 1-Day Workshop
This course is for you if: • you want to learn / discuss curriculum, pedagogy, and computing infrastructure design for teaching data science with R and RStudio using the tidyverse and Quarto • you are interested in setting up your class in Posit Cloud • you want to integrate version control with git into your teaching and learn about tools and best practices for running your course on GitHub
This masterclass is aimed primarily at participants teaching data science in an academic setting in semester-long courses; however, much of the information and tooling we introduce is applicable for shorter teaching experiences like workshops and bootcamps as well. Basic knowledge of R is assumed and familiarity with the tidyverse and Git is preferred.
There has been significant innovation in introductory statistics and data science courses to equip students with the statistical, computing, and communication skills needed for modern data analysis. Success in data science and statistics is dependent on the development of both analytical and computational skills, and the demand for educators who are proficient at teaching both these skills is growing. The goal of this masterclass is to equip educators with concrete information on content, workflows, and infrastructure for painlessly introducing modern computation with R and RStudio within a data science curriculum. In a nutshell, the day you’ll spend in this workshop will save you endless hours of solo work designing and setting up your course.
Topics will cover teaching the tidyverse in 2023, highlighting updates to R for Data Science (2nd ed) and Data Science in a Box as well as present tooling options and workflows for reproducible authoring, computing infrastructure, version control, and collaboration.
The workshop will consist of four modules: • Teaching data science with the tidyverse and Quarto • Teaching data science with Git and GitHub • Organizing, publishing, and sharing of course materials • Computing infrastructure for teaching data science
Throughout each module we’ll shift between the student perspective and the instructor perspective. The activities and demos will be hands-on; attendees will also have the opportunity to exchange ideas and ask questions throughout the session.
In addition to gaining technical knowledge, participants will engage in discussion around the decisions that go into developing a data science curriculum and choosing workflows and infrastructure that best support the curriculum and allow for scalability. We will also discuss best practices for configuring and deploying classroom infrastructures to support these tools.

posit::conf(2023) Workshop: What They Forgot to Teach You About R
Register now: http://pos.it/conf Instructors: Shannon Pileggi and David Aja Workshop Duration: 1-Day Workshop
This course is for you if you answer yes to these questions: • Have you been using R for a while and feel there might be better ways to organize your R life, but don’t know what they are? • Do you want to put programming on pause and learn about actionable programming-adjacent workflows for streamlining analysis in R? • Are you willing to feel a bit of (git) pain to leverage the benefits of version control for collaboration and time travel?
This 1-day What They Forgot (WTF) To Teach You About R workshop is for experienced R and RStudio users who want to (re)design their R lifestyle via project-oriented workflows and version control for data science (Git/GitHub). At the conclusion of the workshop, you will have strategies for organizing data science projects and workflows, employing robust file paths, constructing human and machine-readable file names, and facilitating collaboration with yourself or others via version control.
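As a rough illustration of the file-path and file-naming ideas listed above (not taken from the workshop materials, which use R), the Python sketch below locates a project root instead of hard-coding an absolute path, and composes a file name that sorts chronologically and splits cleanly on delimiters. The helper names `find_project_root`, `slugify`, and `make_filename`, and the choice of `.git` as the root marker, are hypothetical.

```python
from datetime import date
from pathlib import Path
import re


def find_project_root(start: Path, marker: str = ".git") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"no {marker!r} found at or above {start}")


def slugify(text: str) -> str:
    """Lowercase the text and replace runs of non-alphanumerics with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def make_filename(day: date, *parts: str, ext: str = "csv") -> str:
    """ISO date first (sorts chronologically), '_' between fields, '-' within a field."""
    fields = [day.isoformat(), *(slugify(p) for p in parts)]
    return "_".join(fields) + "." + ext
```

Paths built as `find_project_root(Path.cwd()) / "data" / make_filename(...)` keep working when the project folder is moved or cloned to another machine, which is the point of a project-oriented workflow.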
Launch different development environments and manage cluster options with Posit Workbench
Posit Workbench: https://posit.co/products/enterprise/workbench/
Data scientists should be able to use the language and development environment they prefer.
Jupyter Notebook, JupyterLab, VS Code, and RStudio are all available development environments within Posit Workbench.
Workbench is also exceptional for managing compute resources. Use Kubernetes and Slurm and adjust the CPU and memory to match the job you’re trying to run.
RStudio on Amazon SageMaker
Working with analysis or a data set that exceeds the capabilities of your local workstation? One simple option for scaling up is RStudio on Amazon SageMaker.
See how you can get started quickly.
Link to learn more: https://docs.aws.amazon.com/sagemaker/latest/dg/rstudio.html
Securely store and share database credentials across projects with Data Connections | Posit Cloud
This video introduces the new Data Connections feature in Posit Cloud, which lets users quickly and securely store and share database credentials across multiple data projects.
Led by Alex Chisholm - Product Manager, Posit Cloud
Posit Cloud (formerly RStudio Cloud) lets you access Posit’s powerful set of data science tools right in your browser – no installation or complex configuration required.
Create a free account on Posit Cloud: https://posit.cloud/ Find the publicly available database credentials used in this tutorial: https://rnacentral.org/help/public-database Read more on the Posit Blog: https://posit.co/blog/
Leveraging Pins with Posit Connect
Leveraging Pins with Posit Connect Ryan Johnson, Data Science Advisor at Posit
You might find this helpful if:
- You have reports that need to be regularly updated so you want to schedule them to run with the newest data each week
- You reuse data across multiple projects or pieces of content (Shiny app, Jupyter Notebooks, Quarto doc, etc.)
- You’ve chased a CSV through a series of email exchanges, or had to decide between data-final.csv and data-final-final.csv
- You haven’t heard of pins yet!
For some workflows, a CSV file is the best choice for storing data. However, for the majority of cases, the data would do better if stored somewhere centrally accessible by multiple people where the latest version is always available. This is particularly true if that data is reused across multiple projects or pieces of content. With the pins package it’s easier than ever to have repeatable data.
Timestamps: 0:17 - install the pins package and load into your environment 0:32 - register a board 0:59 - connecting to your Posit Connect instance from Posit Workbench or RStudio IDE 1:43 - define the Connect instance as your board 2:01 - pin the mtcars dataset to your Connect instance 2:38 - a pinned dataset on Posit Connect 2:50 - reading a pinned dataset
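The register, pin, and read steps in the timestamps can be pictured with a small stand-in. The sketch below is plain Python using only the standard library; it is not the pins API, and `TinyBoard` with its `write`, `versions`, and `read` methods is a hypothetical name set chosen for illustration. It assumes a board is simply a shared folder in which each write of a named dataset is kept as a new version and a read returns the latest one.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List, Optional


class TinyBoard:
    """Hypothetical stand-in for a shared board: one folder per pin, one CSV per version."""

    def __init__(self, folder: Path) -> None:
        self.folder = Path(folder)

    def write(self, name: str, rows: List[Dict[str, str]]) -> int:
        """Store `rows` as the next version of `name` and return the version number."""
        pin_dir = self.folder / name
        pin_dir.mkdir(parents=True, exist_ok=True)
        version = len(list(pin_dir.glob("v*.csv"))) + 1
        with open(pin_dir / f"v{version:04d}.csv", "w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(rows[0]))
            writer.writeheader()
            writer.writerows(rows)
        return version

    def versions(self, name: str) -> List[int]:
        """List the stored version numbers for `name`, oldest first."""
        return sorted(int(p.stem[1:]) for p in (self.folder / name).glob("v*.csv"))

    def read(self, name: str, version: Optional[int] = None) -> List[Dict[str, str]]:
        """Return the requested version, or the latest one when `version` is None."""
        if version is None:
            version = self.versions(name)[-1]
        with open(self.folder / name / f"v{version:04d}.csv", newline="") as fh:
            return list(csv.DictReader(fh))


# A temporary folder plays the role of the shared location.
board = TinyBoard(Path(tempfile.mkdtemp()))
board.write("cars", [{"model": "Mazda RX4", "mpg": "21.0"}])
board.write("cars", [{"model": "Mazda RX4", "mpg": "21.0"},
                     {"model": "Datsun 710", "mpg": "22.8"}])
latest = board.read("cars")  # the newest version, with no file-name guessing
```

In the workflow shown in the video, the Posit Connect instance plays the role of the folder, so everyone reading the pin gets the same latest version.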
Additional resources:
- Example workflow that involves Quarto, pins, plumber API, vetiver and shiny: machine-learning-pipeline-with-vetiver-and-quarto/
- Connect User Guide - Pins for R: https://docs.posit.co/connect/user/pins/
- Connect User Guide - Pins for Python: https://docs.posit.co/connect/user/python-pins/
- 9 ways to use Posit Connect that you shouldn’t miss: https://posit.co/blog/9-ways-posit-connect/
Learn more: If you haven’t had a chance to try Posit Connect before or you’d like to learn how your team can better leverage pins, schedule a demo with our team to learn more! https://posit.co/schedule-a-call/?booking_calendar__c=RSC_Demo
On the last Wednesday of every month, we host a Posit Team demo and Q&A session that is open to all. You can use this to add the event to your own calendar: pos.it/team-demo
Dan Negrey @ MarketBridge | Creating a framework for consistent measurement | Data Science Hangout
We were joined by Dan Negrey, Director, Analytics at MarketBridge.
At (15:11) we asked Dan about a tip for impacting the business with data science.
So I think every business is going to have KPIs (Key Performance Indicators), and there’s going to be other metrics besides KPIs, things that lead into that. As crazy as it sounds, some organizations struggle to measure those and to do so in a consistent and repeatable way.
Maybe they measure something that just comes from one person sitting at a desk, and they’ve done it for six years, and they leave. All of a sudden, who knows how they do that?
Creating a framework for consistent measurement is huge for an organization.
The measurement is consistent and the outcomes are measured consistently. Then taking action to improve those outcomes can be thought of as more reliable because the measurement process is consistent.
So that would be one thing for sure. Another – on that note, is decision making. Every company makes decisions. A lot of us are here because we like to do this kind of work, but most of our companies exist because they like to make money, and they like to grow. So we find a balance between doing what we do to help our company to achieve their goals.
Find ways to help your company optimize cost, reduce waste and increase growth.
All of that is through measuring and looking at decisions that have been made in the past and thinking about how they could have been made differently. This could be through historical analysis or building models to help make those decisions more effectively.
That’s a huge win for any organization.
There was also a lot of love for repeatable data with the pins package at this Data Science Hangout.
Dan Negrey shared: “Pins has been a huge package that we’ve started using a year or so ago…if you’ve never used pins, it’s definitely worth checking out.”
Helpful resources on pins: Pins for R: http://pins.rstudio.com/ Pins for Python: https://lnkd.in/ghmxiEHV Great repo that uses pins: https://lnkd.in/ezvBkav Workflow that involves Quarto, pins, plumber API, vetiver and shiny: https://lnkd.in/e6gnMXfD Link to Ryan’s video & stepping stones: https://lnkd.in/erR-Mjr9
Other resources: MarketBridge career page (with open data science roles): https://marketbridge.applytojob.com/
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.posit.co LinkedIn: https://www.linkedin.com/company/posit-software Twitter: https://twitter.com/posit_pbc
To join future data science hangouts, add to your calendar here: pos.it/dsh (All are welcome! We’d love to see you!)
David Granjon & Bo Wang @ Novartis | User-friendly, self-serve tools | Data Science Hangout
We were recently joined by David Granjon and Bo Wang, Senior Data Science Experts at Novartis during the Appsilon Shiny Conference.
We learned how their team designs production ready apps for clinical trials, from wireframing activities to automated deployments.
How do you get feedback from users when you don’t hear from them directly?
To know whether the applications are properly used, they use a tracker within Posit Connect. This is basically an application David developed using the Connect API.
With the Connect API, he can see “who is accessing the application, which gives you some insights like which team is using it and how to target support.” If you develop an application for several teams, you can gain some insight into which team uses the application the most. If you see one team is using the application 20 times more than another one, maybe you want to invest more people to provide two paths here.
He supplemented this information by developing an R package, shinyHeatmap, to record in-app usage. “Each click is recorded to identify dead zones to refactor design,” he says. “If you have an app no one visits, maybe it’s poorly designed.”
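To make the dead-zone idea concrete, here is a minimal Python sketch. It is not how shinyHeatmap works internally, and the function names, page size, grid size, and click coordinates are all invented for illustration. It bins recorded click positions into a coarse grid and reports the cells that never received a click.

```python
from collections import Counter


def bin_clicks(clicks, width, height, cols, rows):
    """Count clicks per grid cell; `clicks` is an iterable of (x, y) pixel positions."""
    counts = Counter()
    for x, y in clicks:
        col = min(int(x / width * cols), cols - 1)   # clamp clicks on the right edge
        row = min(int(y / height * rows), rows - 1)  # clamp clicks on the bottom edge
        counts[(row, col)] += 1
    return counts


def dead_zones(counts, cols, rows):
    """Return the grid cells (row, col) that received no clicks."""
    return [(r, c) for r in range(rows) for c in range(cols) if counts[(r, c)] == 0]


# Made-up clicks on a hypothetical 800x600 page, binned into a 2x2 grid.
clicks = [(10, 20), (30, 40), (790, 10), (799, 599), (420, 580)]
counts = bin_clicks(clicks, width=800, height=600, cols=2, rows=2)
unused = dead_zones(counts, cols=2, rows=2)
```

Cells that stay empty over many sessions are candidates for the kind of design refactoring described in the quote.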
For more information on that Connect API: https://pkgs.rstudio.com/connectapi/
Keynote: Hadley Wickham - Embracing multi-lingual data science | PyData Global 2022
RStudio recently changed its name to Posit to reflect the fact that we’re already a company that does more than just R. Come along to this talk to hear a few of the reasons that we love R, and to learn about some of the open source tools we’re working on for python.
PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.

Data Science Hangout | JJ Allaire, Posit PBC | Making data science more open and collaborative
We were recently joined by J.J. Allaire, Founder and CEO of Posit PBC.
What made you so interested in open source initially?
It was really that the first big software projects I worked on were proprietary software, and I found it disappointing that proprietary software products kind of get very tied up with the company that sponsors them – the fate of that company, the other products that the company might offer, or who that company gets acquired by really end up affecting the fate of the projects.
I also found that while some proprietary software is either cheap or low cost, there’s a certain impediment to adoption associated with the price. As a creative person, I want my work to be available as broadly as possible.
I like to make the work available to as many people as possible, but also there’s a dimension of the durability of the work. Is the work going to be around in 30 years or 40 years? What are the things that would make it? So that’s what made open source appealing to me.
What are you most excited about at Posit in the next year ahead?
It’s going to be related to Quarto, of course, because that’s what I’m spending all my time on. I would say that the thing I’m excited about is that we recognized Quarto is really powerful and flexible, and easy to use for a certain subset of users who are very technical and very motivated, but we actually want to make Quarto available more broadly. So working on tooling that lets both technical and non-technical people collaborate over documents, and also lets some less technical people participate in using Quarto. (You can sort of see some of this work in the visual editor that’s in RStudio.) Those are the kind of things that I’m focused on for the next year, and I’m excited to see those getting realized.
In hindsight, what is one of the best decisions that you ever made for your career or your education?
I think this actually has a lot of commonality with talking to a lot of people about their careers. I think when I was about in my late 30s, I had been involved in both developing software and also starting companies. I really learned about myself, that the company part of things – the management and entrepreneurship – I really did not like it at all. I didn’t enjoy it.
So I kind of said, I’m happy to be involved in starting companies but I’m absolutely not going to do that part, and I’m going to have to find a partner or other people who are excited about doing that part. What I really want to do is focus on engineering and product development.
It’s very easy as a company founder to get pulled into all kinds of other things and that’s what happened to me the first 2 times. Just being super clear about that, and saying I won’t even do it unless I can satisfy this condition.
I think that’s pretty broadly applicable in that a lot of us accumulate a lot of responsibility in our careers. Some of it is necessary and important for a given role or company, but then eventually being clear about, what do I really like to do and really want to do? What do I feel like I should do or what is put upon me?
I know a lot of people now who are in their 40s who have actually managed dozens of people and are like, yeah, I don’t really want to do that. I don’t ever want to have anyone work for me anymore. So I’m going to be really clear about that. I’m going to walk in, and people are going to try to give me people and try to make me a manager, and I’m not going to do it.
I think it applies pretty broadly, but just knowing yourself and setting pretty rigid constraints about what you’re willing to do in the workplace and not. Everybody’s different, and both can be really rewarding. I know a lot of people who find management very rewarding, but I know a lot of people who find it really alienating.

Alex Gold - Avoid App Failures Through Code Promotion
Avoid App Failures Through Code Promotion by Alex Gold
Visit https://rstats.ai/gov/ to learn more.
Abstract: It’s all too easy to write an app or report or add an update and suddenly, cold bead of sweat running down your back, realize everything is broken. In this talk you’ll learn how to think about avoiding this moment with good R promotion practices. You’ll learn about a general framework for code promotion, as well as specific tools you can use to make deployments risk-free and easy.
Bio: Alex leads the Solutions Engineering team at Posit (formerly RStudio), where he helps organizations use R and Python in their enterprise environments. Alex loves all things #rstats and was a data science manager, data scientist, and economics researcher before coming to RStudio. He lives just outside Washington DC with his wife and their puppy and enjoys cooking, tai chi, and landscaping in his free time.
Twitter: https://twitter.com/alexkgold
Presented at the 2022 Government & Public Sector R Conference (December 1, 2022)
Rich Iannone | What’s new and exciting in gt 0.8.0 | Posit
With the gt package, anyone can make wonderful-looking tables using the R programming language. Rich Iannone, maintainer of gt, shows what’s new and improved in gt 0.8.0!
00:00 Introduction 00:42 Find/Replace values with sub_values() 02:46 Find values and style them with tab_style_body() 05:00 Place a cell in your Quarto/RMarkdown doc with extract_cells() 07:13 Make numbers more readable with cols_align_decimal() 08:54 See column id info with tab_info() 11:03 Date and time formatting improvements
For more details: • Demo script in this video: https://pos.it/gt8 • Read the blog post on gt 0.8.0: https://posit.co/blog/new-features-upgrades-in-gt-0-8-0/ • Learn more at https://gt.rstudio.com/ • See a full list of new features and improvements at https://gt.rstudio.com/news/index.html#gt-080

December 2022 Webinar: The R Workflow – Dr Ryan Johnson from Posit
The R Workflow Wednesday 21st December.
Ryan Johnson of Posit discusses the following (using a Shiny app as the end product):
- R Markdown Job Scheduling
- Pins - https://pins.rstudio.com/
- Plumber APIs
This session is a show-and-tell getting you familiar with various open source tools (Pins, Plumber) and how they can be used in combination with pro tools (Posit Connect) to improve workflows.
Resource links can be found on the NHS-R Community website: https://nhsrcommunity.com/events/december-2022-webinar-the-r-workflow-dr-ryan-johnson-from-posit/
R at AstraZeneca: upskilling our workforce through education, experience, and exposure
We were joined on November 29th at 12PM EST by Gabriella Rustici & Guillaume Desachy, who shared their experience about the R journey AstraZeneca is currently on.
Resources: ⬢ R @ AZ: Building a Community in the Pharmaceutical Industry Blog Post: https://www.rstudio.com/blog/building-a-community-in-the-pharmaceutical-industry/ ⬢ R in Pharma YouTube videos: https://www.youtube.com/c/RinPharma ⬢ Posit Pharma Site: https://posit.co/solutions/pharma/
Timestamps: 4:53 - Start of session 5:41 - Paradigm shift in the pharmaceutical industry (many people are multilingual) 6:40 - Profile of R users at AstraZeneca (varied across data science, clinicians, medical director) 8:28 - Meet the R&D Learning & Development Team 10:03 - 3E Framework: Education, Exposure, Experience 12:11 - Bridging the science community and data science audience 13:11 - We all learn differently (solutions that suit different needs & styles) 17:31 - Index of learning (self-led index, synchronicity index) 12:20 - Experiential Learning 20:47 - The community of R users at AstraZeneca 21:05 - The early days (April 2021) 21:52 - azTidyTuesday: a playground to hone data viz skills 24:15 - internal R conference 25:38 - R function of the month 27:04 - Lunch & LeaRn 28:20 - R @ AZ 10:1 29:44 - Communication expanded from internal social media to R @ AZ Monthly Newsletter 31:57 - Workshops with Posit 32:11 - AZRHotdesk - come with your questions and someone will help you solve it 35:02 - Wish list for 2023 38:38 - Start of Q&A section
Abstract: The use of R continues to become more and more important at AstraZeneca. It is a true paradigm shift that we have embarked on! This shift has required upskilling our workforce to make them proficient R users.
To do so, we are leveraging the 3Es of learning: education, experience and exposure.
Learn more about their team’s Data Science Educational Program and how the team at AstraZeneca has built their own strong community of R users - where learning takes place through experience and exposure.
Speaker bios: Gabriella is Data Science Learning Senior Director in AstraZeneca’s R&D Data Science & AI where she is responsible for developing a strategy for, and creating a centralised approach to, data science learning for R&D. Gabriella completed her PhD at the Wellcome Sanger Institute and previously ran bioinformatics training programs at the University of Cambridge and the European Bioinformatics Institute, in the UK. She is passionate about designing, implementing and evaluating effective and scalable solutions to educate scientists and data science practitioners at all career stages.
Guillaume is passionate about helping bring new medicines to patients by leveraging the power of statistics and precision medicine. Since October 2020, he has been doing so at AstraZeneca where he works as a Statistical Science Director. In addition, since March 2022, he has been leading a team of 15 collaborators focusing on building the community of R users at AstraZeneca, called R @ AZ.
During the event you can ask questions anonymously through slido here as well: rstd.io/meetup-questions
Blog post on R @ AZ, Building a Community in the Pharmaceutical Industry: https://www.rstudio.com/blog/building-a-community-in-the-pharmaceutical-industry/
Please note the recording of this session will be shared at the same YouTube Live link
Data Science in People Analytics | Led by Elizabeth Esarove, AT&T
People are the face, heart, and hands of a company. In people analytics, we analyze data to reveal actionable insights that provide evidence for decisions regarding employees, work, and business objectives. This talk will cover the use of data science for people analytics projects such as workforce planning, improving employee engagement, and retaining talent.
Speaker bio: Elizabeth Esarove is a data scientist in People Analytics at AT&T. In her role, Elizabeth is part of a larger team focused on embedding data and analytics into the root of decision-making and transforming insights into actionable solutions that improve employee outcomes and drive business value.
Timestamps: *Q&A timestamps listed further below 3:42 - Start of session 5:14 - What is People Analytics 6:26 - Opportunities for Data Science in People Analytics 7:10 - Using Predictive Models to Reduce Attrition 11:10 - Segmenting Your Population 18:55 - Communicating with Leaders 20:11 - Time Series Forecasting for Workforce Changes 24:41 - Analyzing Employee Survey Comments
Helpful Resources Below: *more follow-up to come with a Q&A blog post in the works
People Analytics Books Mentioned today: Handbook of Regression Modeling in People Analytics: with examples in R, Python and Julia by Keith McNulty https://lnkd.in/eBFgniFG Excellence in People Analytics: How to Use Workforce Data to Create Business Value by Jonathan Ferrar and David Green https://a.co/d/bJrMRuW
People analytics books shared in a previous data science hangout: Predictive HR Analytics: Mastering the HR Metric: https://a.co/d/5Hx05mw Inclusalytics - How Diversity, Equity and Inclusion Leaders Use Data to Drive Their Work: https://lnkd.in/g48tdrMu
Other links shared by Liz: Time Series Models: Forecasting: Principles and Practice by Rob Hyndman and George Athanasopoulos https://otexts.com/fpp3/ Text Analytics: Text Mining with R by Julia Silge & David Robinson https://lnkd.in/emawveZd
Additional resources shared: R Gov Conference: https://lnkd.in/ePfN7jru (David Meza is presenting on the RStudio (Posit) Ecosystem as a Critical Part of NASA Analytics Capabilities) People analytics for getting to the moon | Data Science Hangout with David Meza, NASA: https://lnkd.in/eDirbgCF For LATAM and Spanish Speaking people, Sergio Garcia Mora shared the R4HR community which has developed lots of free access content: https://data-4hr.com/ John Kelly IV shared the Human Resources Science LinkedIn Group: https://lnkd.in/eEMpYAfk Adrian M. Pérez shared the People Analytics Handbook: https://lnkd.in/ecsWy-dA Data Science Hangout: pos.it/dsh All upcoming #Posit community events: pos.it/community-events
Q&A Timestamps (the following timestamps are approximate):
16:00 - What are the most important people analytics KPIs at AT&T? Can you share how your team/HR acts on these predictions (for optimal policy) both experimentally and ethically? Do you implement new policy in smaller groups?
23:00 - How have you validated the predictive models? Looking backwards, how precise were they?
25:00 - Do you work with your HRBPs to segment your population?
25:00 - What languages are you using to build your predictive models?
31:00 - Do you include demographic information (gender, race, age) in your models?
31:00 - Are your surveys anonymous?
32:00 - How would you get the ROI from HR attrition modeling?
34:00 - Are most data scientists from a Psychometrics background?
35:00 - Is there a kind of “critical mass” to apply People Analytics? (just for big companies?)
36:00 - Looking at positive / negative comments, do you quote verbatim comments in your reports? (e.g. “here is one of the very positive / very negative comments we received”)
37:00 - Do you use something like Snowflake to store and model your data? And do you deploy these models automatically or manually update them?
38:00 - R user here. How do you balance between people-ops focused analytics tools from outside vendors (often very expensive, but helpful) with custom in-house analytics (often time-consuming)?
41:00 - How much of your work is driven by HR leadership, by HR business leaders, or by the HR analytics team pushing modeling and insights to those groups?
42:00 - What was your journey into learning data science and getting into people analytics?
44:00 - Do you have a role in educating business units (to improve their questions, etc.)?
45:00 - What is the HR tech stack at AT&T? Does your team have a data engineer solely for people data since they’re more sensitive?
47:00 - How do you present your results (an application, report, PowerPoint), and how important is it to learn other languages (JavaScript, CSS, SQL)?
If you were to start a people analytics team in a company (+1000), how do you start?
50:00 - Do you use an internal tool for surveys? Do you use thresholds to maintain anonymity?
53:00 - Does AT&T have remote workers? If so, does people analytics segment on remote vs hybrid vs on-site?

Open Source Chat - {gt} with Rich Iannone
Join Rich Iannone, maintainer of the {gt} package, as he takes questions from the community about the latest in {gt} v0.7.0, and building great looking data display tables with R.
Key Resources: ⬡ Get started with {gt} - https://gt.rstudio.com
Reach out: 38:48 - How do I ask Rich about {gt}, feature requests, bug reports, how to solve a problem via {gt}? Rich and the {gt} team would love to hear from you. ⬡ Feature requests & bug reports with GitHub Issues, https://github.com/rstudio/gt/issues ⬡ GitHub Discussions, https://github.com/rstudio/gt/discussions ⬡ Ask the community a question, https://community.rstudio.com/tag/gt ⬡ Follow {gt} on Twitter, feel free to reach out and ask questions, https://twitter.com/gt_package
Timestamps
Rich Iannone Introduction.
03:52 - Why {gt}? - What does {gt} bring to the table? Why so much effort into static, data display tables?
05:50 - Why open source? Why is {gt} open source and why have you dedicated your career to develop open source software?
08:30 - {gt} v0.7.0: Tell us about the new vector formatting functions in {gt}. Why did you include them? Could you show us some examples?
{gt}’s vector formatting functions help you customize the styling and the look and feel of your values. Taking the output values R gives you and making them look exactly the way you want can be tricky. A lot of work was put into {gt} to give nice value formatting options. You can now access all of these outside of a gt table, e.g. in text or in a plot.
22:35 - Could you provide an example or two with the new styling function called opt_stylize()? What kinds of tables can you make with that? Can you extend that with your own tweaks?
28:15 - Can you make your own themes and share them? “How do I create my own custom theme for my table? A theme I can share with the rest of my organization?”
31:58 - What is the distinction between tab_options and the opt_* functions? Why would a function be in opt_* and not tab_options?
34:00 - sub_values() function, to find and replace certain values in your table.
36:50 - What is the current support for LaTeX in {gt}? “Personally, I much prefer HTML, but for scientific publications, we are asked to provide a LaTeX file.”
42:50 - “In my work, I often produce A4 output in PDF, mainly with ggplot2 content. It would be nice to be able to combine ggplot + gt tables in a similar way {patchwork} works. Having the plot and the table next to it is very useful sometimes.”
44:30 - Interactive Tables with {gt}?
47:45 - “Any plans to make applying of same style to several columns easier? Unless I’m mistaken, the locations argument of tab_style requires one to specify an individual column. See here: https://gt.rstudio.com/reference/tab_style.html#examples.”
Yes, supply a vector of columns or use tidyselect functions.
49:15 - “Excel output with {gt}? Would be a huge improvement. I often have to produce tabular output that can be easily reused. Usually it means Excel tables. So far I have mainly done this with Python and openpyxl or PyWin32 (through COM). A simple solution in R would be great.”
50:20 - Support for additional output formats with {gt}? Excel, PowerPoint, etc.?
50:25 - {pointblank}, a package to methodically validate your data, whether in the form of data frames or as database tables: https://rich-iannone.github.io/pointblank/
Check out the workshop materials at https://github.com/rich-iannone/pointblank-workshop
55:50 - “Are there ways to have grouped rows? I mean when repeated rows have same characters can we merge them to one?”
58:00 - “Is there an ability to add ‘battleship coordinates’ (e.g. column letters & row numbers) to a gt object? This is a standard for table across my org and I’ve been trying to figure out how to implement it.”
59:59 - “Do you have suggestions or examples of building out & applying corporate formatting to gt tables (e.g. adding a company logo, company colors, etc.)?”
01:04:30 - “With PDF/LaTeX output for wide tables, it does not shrink the table.”

Data Science Hangout | Melissa Perry, Peloton | Design Thinking with Data
We were joined by Melissa Perry, Senior Manager, eCommerce Analytics at Peloton Interactive, Inc. Melissa is a value-driven data science leader with a passion for growing the next generation of data experts and moving beyond reporting, into optimized decision-making.
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstudio Twitter: https://twitter.com/rstudio
To join future data science hangouts, add to your calendar here: rstd.io/datasciencehangout (All are welcome! We’d love to see you!)
Data Science Hangout | Patrick Tennant, MMHPI | Welcoming People into Conversations to Make Change
We were joined by Patrick Tennant, Director of Evaluation and Analytics at Meadows Mental Health Policy Institute.
Follow Us Here: Website: https://www.posit.com LinkedIn: https://www.linkedin.com/company/rstudio Twitter: https://twitter.com/rstudio
Data Science Hangout | Unity Health Toronto | Deploying & Monitoring Models Across a Hospital
We were joined by three leaders from Unity Health Toronto: Derek Beaton, Jamie Beverly, and Sebnem Sahin Kuzulugil (surprise special guest! we will be updating the hangout image!)
Follow Us Here: Website: https://www.posit.com LinkedIn: https://www.linkedin.com/company/rstudio Twitter: https://twitter.com/posit
First Steps in Learning the Use of Git & GitHub in RStudio
Led by Mouna Belaid
Slides for presentation: https://mounabelaid.github.io/First-Steps-in-Learning-the-Use-of-Git-and-GitHub-in-RStudio/#1
Abstract: Collaborating on code development and tracking changes in code across versions are daily issues for big teams.
This hands-on workshop will be a gentle introduction to managing and keeping track of your source code history by walking through basic Git and GitHub fundamentals. Mouna will also share her experience in creating the {git-github} cheatsheet, which presents the essential git commands in a good visual design.
Requirements: R & RStudio installed; a free GitHub account (sign up at https://github.com/); Git downloaded and installed from https://git-scm.com/downloads
The accidental analytics engineer
There’s a good chance you’re an analytics engineer who just sort of landed in an analytics engineering career. Or made a murky transition from data science/data engineering/software engineering to full-time analytics person. When did you realize you fell into the wild world of analytics engineering?
In this session, Michael Chow (RStudio) draws upon his experience building open source data science tools and working with the data science community to discuss the early signs of a budding analytics engineer, and the small steps these folks can take to keep the best parts of Python and R, all while moving towards engineering best practices.
Check the slides here: https://docs.google.com/presentation/d/1H2fVa-I4D8ibanlqLutIrwPOVypIlXVzEITDUNzzPpU/edit?usp=sharing
Coalesce 2023 is coming! Register for free at https://coalesce.getdbt.com/

Aaron Chafetz | Digging a Pit of Success for Your Organization | RStudio (2022)
How does a US federal agency analyze tens of millions of records across 30,000 sites in over 50 countries efficiently and effectively? Five years ago, our team ventured beyond the confines of (largely) Excel and towards leveraging R to be more efficient in our analysis and workflows, since data is instrumental to the mission of ending the global HIV epidemic. We have created our own ‘pit of success’, providing analysts the infrastructure and support needed to ease the learning of and working with R in our specific context. We will be sharing our experiences in digging this ‘pit of success’ so that other organizations can benefit from them.
Talk materials are available at https://speakerdeck.com/achafetz/digging-a-pit-of-success-for-your-organization
Session: Cat herding: solving big problems by bringing people together
Alex Farach | Let’s start at the beginning - bits to character encoding in R | RStudio (2022)
Attendees will receive a broad overview of the encoding and decoding process in the human-to-computer loop, how bits are used, and the math that gets us to common bit values. A brief history of ASCII, Latin-1, and UTF-8 will be provided as well.
Attendees will also be exposed to how character encoding works in R and in the tidyverse.
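As a rough illustration of the bits-to-characters idea described above (not taken from the talk, and written in Python rather than R only to keep the sketch self-contained), the snippet below shows how one character maps to different byte sequences under ASCII, Latin-1, and UTF-8. The helper name `to_bits` is hypothetical.

```python
def to_bits(data: bytes) -> str:
    """Render each byte as an 8-bit binary string, separated by spaces."""
    return " ".join(f"{byte:08b}" for byte in data)


# "A" is part of ASCII, so all three encodings agree on a single byte.
ascii_a = "A".encode("ascii")
latin1_a = "A".encode("latin-1")
utf8_a = "A".encode("utf-8")

# U+00E9 (e with acute accent) is outside ASCII:
# Latin-1 stores it in one byte, UTF-8 needs two.
e_acute = "\u00e9"
latin1_e = e_acute.encode("latin-1")
utf8_e = e_acute.encode("utf-8")

print(to_bits(ascii_a))   # 01000001
print(to_bits(latin1_e))  # 11101001
print(to_bits(utf8_e))    # 11000011 10101001

# Decoding UTF-8 bytes as Latin-1 raises no error; it silently yields
# two wrong characters (U+00C3 and U+00A9), the classic "mojibake".
mojibake = utf8_e.decode("latin-1")
print(len(e_acute), len(mojibake))  # 1 2
```

The last two lines show why a mismatched encoding is easy to miss: the decode succeeds, but one character turns into two.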
Talk materials are available at https://github.com/rstudio/rstudio-conf/blob/master/2022/alexfarach/bits_to_character_in_R_RSTUDIO%20-%20Alex%20F.pdf
Session: Lightning Talks
Alice Walsh | Becoming Creative: How I Designed a Quilt with R | RStudio (2022)
When someone asks about essential skills for data careers, I often hear responses like R, Python, and machine learning. However, I argue that creativity is an underrated skill that you can and should practice. In this talk, I want to tell you a story about a project I did to stretch my creative brain and use my favorite tool, R. I designed a quilt in R using generative art ideas. Then I created individual blocks that make up the larger design. I used foundation paper piecing, a method that allows for intricate designs but has geometrical constraints. I hope my talk will entertain and inspire folks to exercise their creative muscles to improve their performance and enjoyment of their day jobs.
Talk materials are available at https://github.com/awalsh17/quiltR .
Session: Eye candy: surprising and delightful uses of R
Barret Schloerke | {shinytest2}: Unit testing for Shiny applications | RStudio (2022)
Manually testing Shiny applications is often laborious, inconsistent, and doesn’t scale well. Whether you are developing new features, fixing bugs, or simply upgrading dependencies, it is critical to know when regressions are introduced. The new {shinytest2} R package provides a toolkit for unit testing Shiny apps and seamlessly integrates with {testthat}. Under the hood, it uses the new {chromote} R package to render apps in a headless Chrome browser with features such as live preview and built-in debugging tools. In this talk, you’ll learn how to test Shiny apps by simply recording your actions as code and extending it to test more particular aspects of your app, resulting in fewer bugs and more confidence in future development.
Talk materials are available at https://bit.ly/shinytest2-conf22
Session: I like big apps: Shiny apps that scale

Beatriz Milz | Making Awesome Automations with GitHub Actions | Posit (2022)
This talk is an introduction to GitHub Actions (GHA), which is a feature from GitHub that allows us to automate several tasks in R. In this presentation, I aim to answer these questions: “What is GitHub Actions? How can I run R Scripts with it?”. I will list supplementary materials that are helpful to learn how to start automating tasks in R projects and packages.
Talk materials are available at https://beamilz.com/talks/en/2022-rstudio-conf/
Session: Lightning Talks
Bryan Shalloway | From summarizing projects to setting tags, uses of parsing R files | RStudio
I’ll walk through a few potential uses of parsing out the functions and packages in projects.
Creating a reference table: With so many #rstats learning materials out there, it’s often helpful to parse out the functions from a project and create a lookup table that complements your notes.
Analyzing files: A network visualization of the packages may provide insights as to which files or projects are most related to one another as well as which packages are most central to a body of work.
Setting tags: Picking good consistent tags for your blogdown website is hard. It’s easier to just parse out the packages in each post and use those to organize your website.
Examples will use helpers from the new {funspotr} package: https://github.com/brshallo/funspotr/
Talk materials are available at https://github.com/brshallo/funspotr-rstudioconf2022
Session: Just typing R code: advanced R programming
Caro Buck | The Benefit of Talking to the "Non-Datas" | RStudio (2022)
Data literacy is a tool to build understanding of the world and ourselves. Data, AI and tech are sometimes portrayed as scary and unknowable; however, data can be for everyone. Data, and decisions based on data, have enormous implications in our daily lives. We (data practitioners) likely have some baseline understanding of numbers and how to read a chart. But others, whether our friends, family members or coworkers, might not have the same level of understanding.
This talk will address how to talk to these seemingly “non-data” people, the benefits of talking data with them, and (hopefully) encourage more curiosity and wonder at the creativity of data. We will also briefly cover what data literacy is and why we ought to care about it.
Talk materials are available at https://rconf-2022-caro-buck.netlify.app/#/section
Session: Working with people is hard
Colin Gillespie | Comparing Package Versions with Diffify | Posit (2022)
Even when we run the simplest of R scripts, we are using dozens of R packages. We use packages for data cleaning, writing reports, graphics and modelling. One of the strengths of R is the depth of its packages.
Unfortunately, packages change and break our code. Not all R packages have a NEWS file, and even when one exists, it might not be complete.
The diffify service aims to make comparing between package versions easier. For example, is there a new Import? Or perhaps a package has been removed from Suggests? Maybe the arguments of a function have changed? Or a function is no longer exported. Diffify can help.
NB: We have completed the back-end infrastructure, and are currently working on the front-end. Expected launch: ~May 1st
Talk materials are available at https://github.com/rstudio/rstudio-conf/blob/master/2022/colingillespie/2022-07-27_rstudio-conf%20-%20Colin%20Gillespie.pdf
Session: Lightning Talks
Danielle Dempsey | Save an ocean of time: streamline data wrangling with R | RStudio (2022)
My organization currently has over 250 oceanographic sensors deployed around the coast of Nova Scotia, Canada. Together, these generate around 4 million rows of data every year. I was shocked when I discovered my colleagues manually compiled, formatted, and analyzed these data using hundreds of Excel spreadsheets. This was highly time consuming, error prone, and lacked traceability. To improve this workflow, I developed an R package that reduced processing time by 95%. The package has since become integral to our data pipeline, including quality control, analysis, visualization, and report generation in R Markdown. The resulting datasets have already proven invaluable to industry leaders looking to invest in Nova Scotia’s coastal resources.
Talk materials are available at https://github.com/dempsey-CMAR/2022_rstudio_conf
Session: Cat herding: solving big problems by bringing people together
David Keyes | What they forgot to teach you about starting a business with R | RStudio (2022)
Lots of people I meet want to start their own business. “I know how to use R,” they figure, “so I should be able to go out on my own, find clients, and work for myself.”
The reality for many people is very different. They spend weeks on a business plan, website, and social media strategy. Then they sit down at their desk, waiting for the flood of clients. But the clients never come.
Since starting R for the Rest of Us in 2019, I’ve learned a lot of lessons about how (and how not) to run a business using R. In this talk, I’ll share some of these lessons.
Whether you have dreams to start a huge business or want to freelance using R, this talk will help you get started with the next chapter of your career.
Talk materials are available at https://dgkeyes.com/rbusiness
Session: What they forgot to teach you about your career
David Smith | Zero-setup R workshops with GitHub Codespaces | RStudio (2022)
If you’ve ever tried to run a workshop using R, you’ll be aware of the challenges of getting everyone’s laptop set up to be able to run your R scripts, R Markdown documents, or Jupyter Notebooks without errors.
What if you could host a workshop using R that required no setup from the participants at all? With GitHub Codespaces, a GitHub repository becomes a cloud-based engine for running R in a container with a single click. Every participant, regardless of the power, configuration or operating system of their laptop will have the same experience, all with NO setup in advance.
In this talk, I’ll describe the process and share tips for setting up a GitHub repository for an R-based workshop to take advantage of GitHub Codespaces.
Talk materials are available at https://github.com/rstudio/rstudio-conf/blob/master/2022/davidsmith/Zero%20Setup%20Workshops%20RStudioConf%202022%20-%20David%20Smith.pdf
Session: Lightning Talks
Davis Vaughan | It’s about time | RStudio (2022)
Dealing with date-times is hard. Dealing with date-times without the proper tooling is even harder! clock is an R package that aims to provide comprehensive and safe handling of date-times. It goes beyond the date and date-time types that base R provides, implementing new types for year-month, year-quarter, ISO year-week, and many other date-like formats, all with up to nanosecond precision. In this talk, you’ll see how clock emphasizes “safety first” when manipulating date-times, and how these new date-time types can be used in your own work.
Talk materials are available at https://speakerdeck.com/davisvaughan/2022-rstudio-conf-its-about-time
Session: Lightning Talks

Dewey Dunnington | Accelerating geospatial computing using Apache Arrow | RStudio (2022)
The ‘arrow’ R package and wider Apache Arrow ecosystem provide an end-to-end solution for querying and computing on in-memory and bigger-than-memory data sets using the Apache Arrow C++ library. In this talk we introduce the ‘geoarrow’ package, which extends Arrow to provide efficient columnar storage for spatial types and functions to support spatial queries in the Arrow compute engine. We focus on a workflow where (1) data are stored in multiple files that can be hosted remotely (e.g., on S3-compatible storage), (2) queries are processed batchwise and in parallel, allowing for efficient processing of bigger-than-memory geospatial data, and (3) results can be passed without copying to Rust, Python, or other R packages for further analysis.
Talk materials are available at https://github.com/rstudio/rstudio-conf/blob/master/2022/deweydunnington/Accelerating%20geospatial%20computing%20using%20Apache%20Arrow%20-%20Dewey%20Dunnington.pdf
Session: Lightning Talks
George Stagg | WebR: R compiled for WebAssembly and running in the browser | RStudio (2022)
In this talk I introduce webR, a port of R to WebAssembly using Emscripten. WebR brings a full R environment to the browser, enabling R code execution, numerical analysis, loading packages and more. No local or cloud-based R servers are required as all computation is performed within the browser. I give a brief overview of our build process for webR, describing the toolchain and some of the issues we encountered. A publicly available web-based R session is demonstrated, with package and plotting support.
Talk materials are available at https://github.com/rstudio/rstudio-conf/blob/master/2022/georgestagg/webr%20-%20George%20Stagg.pdf
Session: Lightning Talks

Hamel Husain | Literate Programming With Jupyter Notebooks and Quarto | RStudio (2022)
Jupyter Notebooks play a critical role in the workflow of many users. Notebooks are used to document existing code, to quickly prototype and iterate on ideas, and as a medium of technical communication. However, package developers typically use an entirely separate set of more traditional development tools, and the context switching between these tools and notebooks can be frustrating. Not only do you lose the ability to iterate fast, but you lose the ability to document and test your code in situ, requiring you to create documentation and tests separately from source code.
Nbdev is a literate programming framework that allows you to develop Python libraries within Jupyter Notebooks. In this talk, Hamel will describe the integration between Nbdev and Quarto, which enables library developers to author their documentation right alongside their code, and automatically produce a Quarto website for their package. The result is a seamless workflow for developing, documenting, and testing software packages all within Jupyter Notebooks, with no context-switching required.
Nbdev: https://github.com/fastai/nbdev
Session: Quarto deep dive
Hannah Frick | Censored - Survival Analysis in Tidymodels | Posit (2022)
tidymodels is extending support for survival analysis and censored is a new parsnip extension package for survival models. It offers various types of models: parametric models, semi-parametric models like the Cox model, and tree-based models like decision trees, boosted trees, and random forests. They all come with the consistent parsnip interface so that you can focus on the modelling instead of details of the syntax. Happy modelling!
Talk materials are available at https://hfrick.github.io/rstudio-conf-2022
Session: Updates from the tidymodels team

Jacqueline Nolis | I made an entire e-commerce platform on Shiny | RStudio (2022)
E-commerce requires passing data between many components like managing a shopping cart, taking payment, fulfilling orders, and sending emails. I’ve successfully created a full e-commerce platform entirely in R for a quirky side project. The R package ggirl lets users order ggplot2 plots as postcards and more via R functions. Those R functions pass data to a separate Shiny app, which then passes data to other services like Stripe payment APIs and printing APIs. In this talk I will walk through how to use packages like httr, callr, and brochure to have your Shiny apps call external services and do many tasks in parallel. You’ll leave the talk with more ways to use Shiny than dashboards plus the knowledge to monetize your existing dashboards!
Talk materials are available at https://link.jnolis.com/rstudio22-slides
Session: Unexpected uses of R
Jamie Ralph | Developing internal tools for multi-lingual teams | RStudio (2022)
Internal packages are great for boosting productivity and promoting good practice, but what kinds of challenges do we face when designing solutions for multi-lingual teams? Here I will advocate for a design approach we are using at Bumble to build Python and R packages with the same foundations. I will discuss the benefits of this approach for the developer and the wider organisation.
Talk materials are available at https://github.com/jamie-ralph/rstudio-conf-2022
Session: Some of my best friends use Python
Josiah Parry | Exploratory Spatial Data Analysis in the tidyverse | RStudio (2022)
R has come quite a long way to enable spatial analysis over the past few years. Packages such as sf have made spatial analysis and mapping easier for many. However, adoption of R for spatial statistics and econometrics has been limited. Many spatial analysts, researchers, and practitioners lean on Python libraries such as pysal.
In this talk I briefly discuss my journey through spatial analysis and introduce a new package, sfdep, which provides a tidy interface to spatial statistics and notably exploratory spatial data analysis. sfdep is an interface to the spdep package and also implements other common exploratory spatial statistics.
Talk materials are available at https://github.com/rstudio/rstudio-conf/blob/master/2022/josiahparry/rstudio__conf(2022L)%20-%20Josiah%20Parry.pdf
Session: Lightning Talks
June Choe | Cracking open ggplot internals with {ggtrace} | RStudio (2022)
The inner workings of {ggplot2} are difficult to grasp even for experienced users because its internal object-oriented (ggproto) system is hidden from user-facing functions, by design. This is exacerbated by the foreignness of ggproto itself, which remains the largest hurdle in the user-to-developer transition. However, this need not be the case: ggplot internals have clear parallels to data wrangling, where data is passed between methods that take inputs and return outputs. Capitalizing on this connection, the {ggtrace} package exposes the familiar functional programming logic of ggplot with functions that inspect, capture, or modify steps in a ggplot object’s execution pipeline, enabling users to learn the internals through trial and error.
Talk materials are available at https://github.com/yjunechoe/ggtrace-rstudioconf2022
Session: Just typing R code: advanced R programming
Kelly O’Briant | Remote Content Execution with RStudio Connect and Kubernetes | RStudio (2022)
This summer the Posit Connect team will announce a feature which has been over two years in the making: “Remote” off-host content execution with launcher in Kubernetes.
We have been quietly beta testing the Launcher feature with select partners and customers for several months while we prepare for the public announcement.
This talk will highlight why someone might want to use this new execution mode with Connect, show just how seamless it is to get everything configured in a fresh environment on EKS, and finally set some critical context for what publishers and administrators should expect by addressing the anticipated FAQs.
Talk materials are available at https://kelly.quarto.pub/rstudioconf-talk-2022/
Session: Data science in production
Kirill Müller | dm: Analyze, build and deploy relational data models | RStudio (2022)
dm bridges the gap in the data pipeline between standalone data frames and relational databases. Implementing a “grammar of joined tables”, it provides a consistent set of verbs for consuming, creating, and deploying relational data models. In this talk I present a short overview of how dm can help your data analysis and ETL processes, and highlight recent developments.
Talk materials are available at https://github.com/rstudio/rstudio-conf/blob/master/2022/kirillmuller/dm-rstudioconf2022.pdf/
Session: Databases
Lewis Kirvan | Sometimes you just need words | RStudio (2022)
This talk will trace the evolution of a report from a mostly text-free dashboard into a text-heavy R Markdown report with dynamic text blocks. The report in question is provided to the largest financial institutions in the U.S., but the audience for the data is largely composed of compliance experts and lawyers.
The interface between data products and the people who make decisions is often the most difficult piece in a project. Frequently, what your audience really needs is words! This talk will help you recognize when you need more narrative and will provide some helpful technical advice to get you there, including how to use existing Word templates and how to use whisker:: and glue:: to help you dynamically generate text.
Talk materials are available at https://github.com/lmkirvan/presentation
Session: RMarkdown and Quarto
Lightning Talk | Andreas Hofheinz | leafdown: Interactive Multi-layer maps in Shiny apps
Interactive maps are indispensable tools for exploring spatial datasets because of their real-world context and intuitiveness. For a comprehensive understanding of the data, it is often necessary to switch between several map layers (such as states and counties) and to analyze multiple variables simultaneously - both of which are challenging. In this talk, I will show how we can overcome these challenges using the leafdown package, which allows us to create multi-layer maps embedded in Shiny apps.
Talk materials are available at https://github.com/rstudio/rstudio-conf/blob/master/2022/andreashofheinz/leafdown_presentation%20-%20Andreas%20H.pdf
Session: Lightning Talks
Liz Roten | Oddly Satisfying - Find delight in the mundane | RStudio (2022)
It happens to us all - a request to “just re-run the code” turns into a project nightmare. The materials left to you are poorly documented and scattered across Word, Excel, ArcGIS, and PDF reports. In this talk, I show you how to turn any project into a point of pride. Using a worked example, I provide guidance on how to complete a project intake, find your opportunity to shine, and how to work efficiently and reproducibly through thoughtful documentation. Finally, I cover how to set up the project for future success. Learn how to take the messy project you dread and make it inexplicably satisfying.
Talk materials are available at https://lizroten.com/oddly
Session: Take a sad process and make it better: project and process makeovers
Mark Rieke | Intro to Workboots: Make Prediction Intervals from Tidymodel Workflows | Posit (2022)
Sometimes, we want a model that generates a range of possible outcomes around each prediction. Other times, we just care about point predictions and may opt to use a fancy model like XGBoost. But what if we want the best of both worlds: getting a range of predictions while still using a fancy model? That’s where bootstrapping comes to the rescue! By using bootstrap resampling, we can create many models that produce a prediction distribution – regardless of the model type! In this talk, I’ll give an overview of bootstrap resampling for prediction, the pros/cons of this method, and how to implement it as a part of a tidymodel workflow with the workboots package.
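The bootstrap idea in this abstract can be sketched in a few lines. The example below is a minimal illustration in plain Python, not the workboots implementation or its API: the toy data, the straight-line model, and the names `fit_line` and `bootstrap_interval` are all assumptions made for the sketch. Each resample refits the model, predicts at the new point, and adds a resampled residual, so the collected draws approximate a prediction distribution.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_interval(xs, ys, x_new, n_boot=2000, level=0.95, seed=1):
    """Return (lower, median, upper) of bootstrap predictions at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    draws = []
    while len(draws) < n_boot:
        # Resample rows with replacement and refit the model.
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # a line cannot be fitted to a single distinct x
        a, b = fit_line(bx, by)
        # Add a resampled residual so the draw reflects noise in new
        # observations, not only uncertainty in the fitted line.
        residuals = [y - (a + b * x) for x, y in zip(bx, by)]
        draws.append(a + b * x_new + rng.choice(residuals))
    draws.sort()
    tail = (1 - level) / 2
    lower = draws[round(tail * n_boot)]
    upper = draws[round((1 - tail) * n_boot) - 1]
    return lower, statistics.median(draws), upper


# Made-up toy data: y = 1 + 2x plus a fixed, repeating noise pattern.
noise = [0.5, -0.5, 1.0, -1.0]
xs = list(range(1, 21))
ys = [1 + 2 * x + noise[i % 4] for i, x in enumerate(xs)]

lower, middle, upper = bootstrap_interval(xs, ys, x_new=10)
print(f"95% interval at x=10: [{lower:.2f}, {upper:.2f}], median {middle:.2f}")
```

Because only the refit step depends on the model, swapping the straight line for any other fitting routine leaves the rest of the procedure unchanged, which is the "regardless of the model type" point the abstract makes.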
Talk materials are available at https://github.com/markjrieke/rstudio-conf-2022
Session: Machine learning
Matthew Kay | Visualizing distributions and uncertainty using ggdist | RStudio (2022)
I propose a talk on visualizing distributions and uncertainty using {ggdist}. I will describe how to think systematically about distributional visualization as mappings of PDFs, CDFs, and quantile functions onto aesthetics, and how support for this enables creative and easy exploration of the space of possible uncertainty visualizations. I will highlight features like true gradient support in R 4.1, support for distribution vector datatypes, and the automatic binwidth-selecting geom_dots(). I expect to leave the audience with: (1) a systematic way to think about visualizing distributions and uncertainty in the grammar of graphics and (2) an understanding of how to actually do it using ggdist.
Talk materials are available at https://www.mjskay.com/presentations/rstudio-conf-2022-talk.pdf
Session: Lightning Talks
Meghan Hall | Cultivating Your Own R Ecosystem as a Solo Contributor | RStudio (2022)
It can be daunting to start using R when no one else in your office is! Using a case study from an administrative higher education office, learn how you can begin to build your own R ecosystem, step by step, to increase the efficiency and impact of your work, even as a solo contributor. Start from scratch and get small wins by replacing common Excel tasks with reproducible code, and then continue to develop iteratively, incorporating more of R’s capabilities into your workflow until you’re humming along with internal packages and parameterized reporting. We’ll discuss how R can ease the burden of documentation and how to handle common challenges like when you can’t control how you get your data or in which tool it is ultimately presented.
Talk materials are available at https://meghan.rbind.io/talk/rstudioconf/
Session: Working with code is hard
Melissa Van Bussel | Achieving a seamless workflow between R, Python and SAS from within RStudio
Some of my best friends use Python…and all of my coworkers use SAS.
Statistics Canada is the official statistical agency of Canada and employs over 6,000 people. Statistics Canada has a legal obligation to ensure that personal information collected for statistical purposes is kept strictly confidential. An internal system which prevents the release of confidential information is only implemented in SAS. As such, many analysts and data scientists at Statistics Canada must use the SAS programming language as part of their workflow. It is therefore imperative to find ways to work with open source programming languages and SAS seamlessly. I will present a method for achieving a harmonious workflow between R, Python, and SAS, all within RStudio.
Talk materials are available at https://github.com/melissavanbussel/rstudio-conf-2022
Session: Some of my best friends use Python
Mike Cheng | The Polygons of Another World - realtime interactive rendering in R | RStudio (2022)
In this talk I want to explore R’s capabilities for fast, interactive graphical applications. This exploration is driven by my ongoing port of the 1991 action adventure game “Another World”, but these capabilities also open up possibilities for new visualisations and applications in R.
The porting of this game is a ‘moonshot’ project as I try to discover the techniques and tools needed for fast (greater than 20fps) interactive (keyboard + mouse) rendering to R graphics devices. A further constraint is that I want all this to be done in plain R, avoiding C or JavaScript as much as possible.
I will discuss three of the key challenges faced: graphics device speed, fast double-buffered rendering and event-driven programming for interactivity.
I will showcase the capability of R to render 5000 moving sprites using the nara package, an interactive drum machine with the eventloop package, and my progress with the ‘Another World’ game engine with animation, keyboard control and synchronised sound.
Talk materials are available at https://github.com/coolbutuseless/RStudioConf-2022
Session: Eye candy: surprising and delightful uses of R
Nicholas Tierney | The Future of Missing Data | Posit (2022)
If you do data analysis, you encounter missing data. Missing data upsets data analysis workflow because you have to make decisions on how to deal with it - do you impute the values? Remove them? These each have consequences! The data we often encounter does not always arrive with a research question in mind, so how do you understand why you have missing values? When I first encountered missing data I was incredibly frustrated at how hard it was to understand and explore it. This frustration led me to create two R packages to explore missing data, {naniar} and {visdat}. In this talk I will showcase how to use these tools to explore missing data, as well as new features that have not been presented, and planned advances.
Talk materials are available at https://github.com/rstudio/rstudio-conf/blob/master/2022/nicholastierney/The%20Future%20of%20NA%20Data.pdf
Session: Lightning Talks
Nicola Rennie | Say Hello! to Multilingual Shiny Apps | Posit (2022)
Multilingual Shiny apps are not straightforward to build. Translation affects almost every single aspect of an app. Although there are a few packages designed to automate the translation process, they tend to only work for the most widely spoken languages.
Using a bilingual English-Welsh Shiny app we developed to present public health data as a case study, this talk will discuss:
• how we built a multilingual Shiny app;
• how translation affected design decisions;
• how we overcame the main issues faced;
• and most importantly, what we’d do differently next time.
By the end of this talk, you will have a better understanding of how to translate your Shiny app to help you to share your app with a much wider audience.
Talk materials are available at https://nrennie.rbind.io/talks/2022-july-rstudio-conf/
Session: Lightning Talks
Peter Gandenberger | Dashboard-Builder: Building Shiny Apps without writing any code | RStudio
I would like to create (more) Shiny Dashboards, but…
• I don’t know how
• I can’t write R code
• it’s too complex
• I don’t have enough time (even though I know how to build them)
If this sounds familiar, this talk is for you. We present our latest project, the dashboard-builder that allows users to create full Shiny dashboards without writing a single line of code. You can find a demonstration video at https://youtu.be/oOKJLMAkEiw
This drag-and-drop dashboard-builder allows you to interactively create native Shiny dashboards, lowering the entry barrier for new users starting their data science journey. They can begin to visualize their datasets without prior knowledge of R. More experienced users can use the dashboard-builder to quickly sketch out their ideas and export them to act as a foundation for more complex dashboards.
Session: Pour some glitter on it: Polishing the design of your shiny apps
Rebecca Hadi | Exploring Query Optimization: How a few lines of code can save hours of time
If you find yourself waiting hours for your queries to run, this talk is for you. Learn from my query mistakes and avoid crashing your database. In this talk, you’ll learn about minor code changes that can dramatically improve query run time.
Talk materials are available at https://github.com/bhadi26/rstudio-conf-2022-slides/blob/main/rebecca-hadi-r-studio-presentation-2022.pptx
Session: Databases
Tanya Cashorali | Cross-Industry Anomaly Detection Solutions with R and Shiny | RStudio (2022)
This session highlights two anomaly detection use cases in production: identification of problematic life sciences manufacturing units and identification of significant newsworthy events. With both solutions, Shiny is integrated with live data to provide early detection for proactive intervention. Shiny’s intuitive user interface also allows for interaction with the data behind anomalies to uncover potential causes and paths to action or resolution.
The session also briefly highlights a rapid prototyping development approach with Shiny. This technique allows for collaborative refinement of the underlying anomaly detection model in R, quickly incorporating user feedback, where end users may not have in-depth machine learning knowledge.
Talk materials are available at https://docs.google.com/presentation/d/e/2PACX-1vTE7Ee2QIUGDUmfEKmF8l_WTQPVgnGaLJLGuuMquio57bXojeeb5YYSjuzO-xzYxMHxuX2cm_QNC2y-/pub?start=false&loop=false&delayms=60000&slide=id.gbb68c6dbe2_1_44
Session: Working with people is hard
Tiger Tang | Saving 1,000 hours with RStudio: selling R in your workplace | RStudio (2022)
There are many benefits to using R and no lack of packages that help you solve technical difficulties, but you may still get stuck at selling it to decision-makers or implementing it at work. Tiger’s recommendation is to start a project that focuses on automating work with R and gets everyone involved. Once the value of R has been established, selecting RStudio Workbench and RStudio Connect for streamlining tasks would not be a difficult choice.
Several years ago, Tiger’s organization moved away from SAS in favor of R for modeling projects, but there wasn’t much initiative taken company-wide to move everything to a new tool. To help change that, he started a work automation project using R that has saved 12K+ hours of manual work.
In this talk, he will share the key parts of the project, lessons learned, and a structure you can follow if you would like to do something similar in your organization.
Talk materials are available at https://tigertang.org/rst_conf_2022_talk/
Session: Take a sad process and make it better: project and process makeovers
Tom Mock | Quarto for the Curious | RStudio (2022)
Are you curious about Quarto? Maybe you saw it on Twitter or the RStudio::conf agenda. Perhaps this raised questions like: What exactly is Quarto? What about RMarkdown? (don’t worry it’s not going away!) What features does Quarto add? What should I do with my existing Rmd/ipynb files?
This talk will answer all of those questions and more! I’ll present Quarto as a next-gen version of RMarkdown, compare the similarities, and then discuss the new features in Quarto for publishing documents, presentations, blog posts, lab notebooks and more! Lastly, I’ll cover what this means for our customers using RStudio Team, and the exciting new world for Python users.
Talk materials are available at https://thomasmock.quarto.pub/quarto-curious/
Session: RMarkdown and Quarto
Weihuang Wong & Kiegan Rice | How to be a pollinatoR | RStudio (2022)
R users are part of data ecosystems comprising both statistical and non-statistical applications. We may work with SAS or Stata datafiles; non-R users may help run R scripts; or we may need to generate outputs in Word or Excel. Just as pollinators support biodiversity, we believe R users can be constructive members of diverse data ecosystems. Our talk will: (1) outline what it means to be constructive, (2) highlight tools that can help R users contribute to their ecosystems, and (3) describe practices that can improve workflows involving diverse groups of staff and software. We hope our talk will inspire R users to think creatively and empathetically about how R can be a force for good in diverse data ecosystems.
Talk materials are available at https://rsconnect.norc.org/rstudioconf-pollinator
Session: Working with people is hard
Zhian N. Kamvar | Building Accessible Lessons with R and Friends | RStudio (2022)
The Carpentries is a global community of volunteers who collaboratively develop and deliver lessons to build capacity in data and coding skills to researchers worldwide. In the recent redesign of our lesson infrastructure (serving more than 100 lessons, used daily by more than 5K learners), we replaced embedded Jekyll templates with a workbench of modular and accessible packages using R and Pandoc. By leveraging renv and knitr for R-based lessons, we provide a seamless and collaborative lesson development experience that maximizes reproducibility and minimizes frustration so authors can focus on the contents, not the tooling. We demonstrate how anyone can use our infrastructure to build customised and accessible sites for their own lessons or tutorials.
Talk materials are available at https://github.com/zkamvar/rstudio-conf-2022
Session: It takes a village: building communities of practice
What’s New in {gt} 0.7.0?
gt 0.7.0 was just released. Rich Iannone, maintainer of gt, dives into the 7 new features added.
For more details, ⬢ Read the blog post on gt 0.7 https://www.rstudio.com/blog/all-new-things-in-gt-0-7-0/ . ⬢ Learn more about gt at https://gt.rstudio.com/ . ⬢ Follow the gt twitter account, https://twitter.com/gt_package .
00:07 The new Word table output format, .docx output.
00:34 A whole new family of vector formatting functions (vec_fmt_*()) has been added.
01:03 Table presets/themes styling with the new opt_stylize() function.
01:50 The new tab_stub_indent() for superfine control over row label indentation (in the stub).
02:26 The new fmt_duration() function for formatting of time duration values.
03:32 An upgraded gtsave() that uses {webshot2}, .png output looks better.
04:14 Accessibility enhancements for HTML table outputs.

Open Source Environmental Monitoring with Shiny! | Wayne Jones, Shell
What are the critical factors for successful uptake of an application?
On Tuesday, October 18th at 12 ET, we were joined by Wayne Jones, Principal Data Scientist at Shell to learn about the Shiny application that has become a globally adopted industry standard tool.
GWSDAT (www.gwsdat.net) is an open source, user-friendly, Shiny application for the visualisation and interpretation of groundwater monitoring data. In this meetup, Wayne will tell the story behind GWSDAT since its first release 8 years ago and will explain the critical factors for successful uptake in the environmental monitoring industry.
Resources shared: ⬢ gwsdat.net ⬢ Github: https://github.com/WayneGitShell/GWSDAT/tree/master/inst/extdata ⬢ GWSDAT LinkedIn Group: https://www.linkedin.com/groups/8715423/ ⬢ Shinyapps.io version of the app: https://stats-glasgow.shinyapps.io/GWSDAT/ ⬢ RStudio Connect: https://www.rstudio.com/products/connect/ ⬢ CRAN Task View for Reproducible Research: https://cran.r-project.org/web/views/ReproducibleResearch.html
RStudio Pro Product Lightning Series Meetup ⚡️
Recording from our October 11th meetup: a lightning series with our RStudio Product Managers to hear what’s new, ask questions, and provide feedback.
Lightning talks:
3:16 - Sharing Internal Packages with RStudio Package Manager
18:07 - Running RStudio workloads in the Cloud with Amazon SageMaker
40:02 - Content execution in Kubernetes with RStudio Connect
Resources and links shared during the meetup: A Package Manager demo tutorial on GitHub: https://github.com/rstudio/package-manager-demo Remote API Quickstart: https://docs.rstudio.com/rspm/admin/getting-started/configuration/#quickstart-remote-cli Differences between RStudio Workbench and RStudio Workbench on SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/rstudio.html#rstudio-differences RStudio Workbench release notes: https://www.rstudio.com/products/rstudio/release-notes/ Remote Content Execution with RStudio Connect and Kubernetes Conference talk: https://www.rstudio.com/conference/2022/talks/remote-content-execution-rstudio-connect/
Product Links: Package Manager, control and distribute packages throughout your organization: https://www.rstudio.com/products/package-manager/ Workbench, premiere development environment for data science professionals: https://www.rstudio.com/products/workbench/ Connect, easily share your insights: https://www.rstudio.com/products/connect/
Timestamps: 22:43 - Demo of RStudio Workbench on Amazon SageMaker
Sharing Internal Packages with RStudio Package Manager | Presented by Joe Roberts
Many know that RSPM can be used to mirror public packages from CRAN or PyPI, but it can also be used to share your private, internally developed packages. We’ll explore the latest features to make internal packages easier to deploy within your organization.
Running RStudio workloads in the Cloud with Amazon SageMaker | Presented by James Blair
RStudio Workbench on SageMaker enables users to “right-size” their environment for any given analysis. We’ll showcase how this flexibility enables users to effectively meet the workload demands of various analyses.
Content execution in Kubernetes with RStudio Connect | Presented by Kelly O’Briant
New and interesting ways to configure RStudio Connect: A quick introduction to off-host content execution in Kubernetes
We’re looking forward to learning from you and hearing your feedback as well!
Meetup recordings are always shared here: https://www.youtube.com/playlist?list=PL9HYL-VRX0oRKK9ByULWulAOO5jN70eXv
Data Science Hangout | Matthew Montero, Gen Re | Bringing a Vision to Life
We were joined at the Data Science Hangout by Matthew Montero, Head of Enterprise Data Services at Gen Re.
At 4:23, Matthew started us off by sharing the moment when focusing on a vision, and defining the strategy around it, really clicked for him:
“When I worked in pharmaceuticals, I had the pleasure of having a manager who was very focused on vision. So that’s when I really got a feel for once you have a vision and you can sell that vision, then you have the ability to build as big a team as you need. You have to be able to back that up with either ways to improve the business or ways to even improve the organization as a whole.
With the knowledge I gained through the experience of dealing with getting consultants, onboarding new hires, and defining the strategy for that vision, I moved on to my current company, Gen Re. It fostered from there, let me take this vision and further define the way I feel it should be at this company and build a full strategy around that.
This was about figuring out, from day one when you get the data in, all the way to building applications that use that data, to the self-service route of other people wanting to use that data for their analytical purposes. Let me build that strategy around there.
Being able to both build out strategy and communicate that to executives, to different business units, the heads of different business units, moved me up relatively fast in a sense. It’s being very confident in that vision and in the strategy and even the principles around that as well.
It’s really laying out the whole vision of how everything is going to be. You don’t go into the technicals yet of how you actually accomplish that, it’s just how you view what it all looks like. And that’s the first thing you got to sell because at least if you have a vision, you have something you can communicate to everyone.
And then that’s when you break down now what’s that strategy to accomplish that vision because you can’t do all that in one day. It’s going to take some time. So you have to have individual steps or a strategy to accomplish that.
When I was earlier in my career, my boss asked - “You know my vision. How are you going to make that vision happen?”
Once I was given that, I sat down and actually really thought through that. That’s the moment it clicked in my head. Like, this is my next step. I want to be able to both define that vision, define the high level strategies to it, and then work with teams to actually make that happen. That was a couple years back and the time it really clicked for me.”
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstudio Twitter: https://twitter.com/rstudio
To join future data science hangouts, add to your calendar here: rstd.io/datasciencehangout (All are welcome! We’d love to see you!)
Model Monitors and Alerting at Scale with RStudio Connect | Adam Austin, Socure
Deploying a predictive model signals the end of one long journey and the start of another. Monitoring model performance is crucial for ensuring the long-term quality of your work.
A monitor should provide insights into model inputs and output scores, and should send alerts when something goes wrong. However, as the number of deployed models increases or your customer base grows, the maintenance of your monitor portfolio becomes daunting. In this talk we’ll explore a solution for orchestrating monitor deployment and maintenance using RStudio Connect. I will show how applications of R Markdown, Shiny, and Plumber can unburden data scientists of time-consuming report upkeep by empowering end-users to deploy, update, and own their monitors.
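As a rough illustration of what "a monitor that sends alerts" can mean, here is a minimal sketch in plain Python. It is not the speaker's implementation and does not use RStudio Connect, Shiny, or Plumber: the function name `check_scores`, the `max_shift` threshold, and the sample scores are all hypothetical. It compares recent model output scores against a baseline and returns alert messages that a real system would then deliver, for example by email.

```python
import statistics


def check_scores(baseline, recent, max_shift=0.1):
    """Return alert messages if recent output scores look wrong.

    baseline  -- scores from a period when the model behaved as expected
    recent    -- scores from the window being monitored
    max_shift -- assumed tolerance for the change in mean score
    """
    alerts = []
    if not recent:
        # Missing data is itself worth an alert: the model may not be scoring.
        alerts.append("no recent scores received")
        return alerts
    shift = abs(statistics.fmean(recent) - statistics.fmean(baseline))
    if shift > max_shift:
        alerts.append(f"mean score shifted by {shift:.2f}")
    return alerts


# Hypothetical scores: the recent window has drifted upward.
baseline_scores = [0.20, 0.30, 0.25, 0.25]
recent_scores = [0.60, 0.70, 0.65]

for message in check_scores(baseline_scores, recent_scores):
    print("ALERT:", message)
```

Keeping the check as a small function with explicit thresholds is one way to let non-authors adjust a monitor without touching the reporting code around it.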
Timestamps:
2:01 - Start of presentation
3:43 - About Socure
6:12 - Model performance matters, deployment isn’t the end of the story
7:26 - What is a monitor?
8:55 - What is an alert?
11:00 - Monitor example
18:09 - Firing an alert from RStudio Connect
19:00 - Why monitor from RStudio Connect?
24:33 - How monitoring drives success at Socure
30:00 - Git-backed deployment in RStudio Connect
36:00 - Shiny app that their account managers see
46:00 - Architecture of a monitoring system
56:00 - Connect hot tip: System-wide packages
57:00 - Why did we try Connect for monitoring
59:02 - Why do we keep using it for that :)
Resources shared: Blastula package: https://github.com/rstudio/blastula connectapi package: https://github.com/rstudio/connectapi rsconnect package: https://rstudio.github.io/rsconnect/ Intro to APIs blog post: https://www.rstudio.com/blog/creating-apis-for-data-science-with-plumber/
Speaker Bio: Adam Austin is a senior data scientist and RStudio administrator at Socure, a leading provider of identity verification and fraud prevention services. His work focuses on data science enablement through tools, automation, and reporting.
Beautiful reports and presentations with Quarto | Led by Tom Mock, RStudio
Quarto is a powerful tool for authoring reproducible computational documents in R, Python or Julia. Quarto can also help with sharing your results to business stakeholders across your company. This talk will provide an overview of Quarto’s implementation of revealjs for interactive presentations and HTML/PDF documents for static reports.
Content website: rstd.io/quarto-reports
Timestamps:
2:55 - Start of session
4:10 - Visual editor in RStudio
6:20 - Parameters to create different variations of a report
15:30 - Unified syntax across different output formats
18:01 - Pandoc fenced divs
20:10 - Tabsets
22:22 - Pandoc bracketed spans
24:30 - Footnotes
26:30 - Layout image inline with paragraphs / image into “gutter” column margin
29:23 - Hide all code
29:50 - Code tools (Fold code, source code)
34:12 - Code highlighting
37:03 - HTML Appearance
38:00 - Bootswatch themes
38:43 - PDF Articles
42:05 - Presentations (revealjs (HTML), PowerPoint (MS Office), beamer (LaTeX, PDF))
45:06 - Creating slides
47:53 - Multiple columns
48:28 - Secret Tip (Alt + Click to Zoom in to a section)
49:24 - Absolute Position
51:04 - Presentation themes
52:44 - Footer/Logo
54:01 - Slide Background
57:01 - Custom classes
58:35 - End slide with helpful links (all shared here: rstd.io/quarto-reports)
This meetup is Part 3 in our Quarto series: Part 1: Welcome to Quarto Workshop: https://www.youtube.com/watch?v=yvi5uXQMvu4 Part 2: Building a Blog with Quarto: https://youtu.be/CVcvXfRyfE0 For more about Quarto: quarto.org
Resources discussed: Visual editor: https://quarto.org/docs/visual-editor/ Parameters: https://quarto.org/docs/computations/parameters.html Tabsets: https://quarto.org/docs/interactive/layout.html#tabset-panel Fenced Divs and Bracketed spans: https://quarto.org/docs/authoring/markdown-basics.html#divs-and-spans Footnotes: https://quarto.org/docs/authoring/footnotes-and-citations.html Figures and figure layouts: https://quarto.org/docs/authoring/figures.html#complex-layouts Code execution options: https://quarto.org/docs/computations/execution-options.html Code chunk format options: https://quarto.org/docs/reference/formats/html.html#code Code appearance: https://quarto.org/docs/output-formats/html-code.html#appearance Code highlighting light/dark: https://quarto.org/docs/output-formats/html-code.html#appearance Function links in code chunks with downlit: https://quarto.org/docs/output-formats/html-code.html#code-linking HTML Themes: https://quarto.org/docs/output-formats/html-themes.html PDF formatting options: https://quarto.org/docs/reference/formats/pdf.html#title-author PDF journal templates: https://quarto.org/docs/journals/templates.html Presentations: https://quarto.org/docs/presentations/index.html Revealjs Options: https://quarto.org/docs/presentations/revealjs/ Advanced Revealjs (absolute positioning, layout helpers like r-stack): https://quarto.org/docs/presentations/revealjs/advanced.html Revealjs themes and customizing: https://quarto.org/docs/presentations/revealjs/themes.html Revealjs footer & logo: https://quarto.org/docs/presentations/revealjs/index.html#footer-logo Inline span text formatting: Emil Hvitfeldt’s Slidecraft 101: Colors and Fonts, https://www.emilhvitfeldt.com/post/slidecraft-colors-fonts/ Meghan Hall’s Quarto Slides, https://meghan.rbind.io/blog/quarto-slides/ Andrew Heiss’ Quarto slides on APIs and webscraping with R, https://github.com/andrewheiss/2022-seacen
Speaker bio: Thomas is the Customer Enablement Lead at RStudio, helping RStudio’s customers be as successful as possible. He is deeply involved in the global data science community, sharing tips on RStats Twitter (find him at @thomas_mock), as co-founder of TidyTuesday, a weekly Data Science learning challenge, and presenting on various Data Science topics on YouTube or at conferences.
For upcoming meetups: rstd.io/community-events

Data Science Hangout | Mythili Krishnaraj, AXA XL | Platform Governance With a Shared Vision
We were joined by Mythili Krishnaraj, Global Delivery Lead - Pricing and Analytics Platform at AXA XL.
Mythili provided a really helpful view into the IT/technology perspective and shared the steps to platform governance at AXA XL.
At (24:14) Mythili shares some guidance on how their team manages 200 users:
With onboarding of users, they are put into different AD Groups. Some users might only use Workbench and are not publishing content to Connect so they are put in different AD Groups.
Something that was initially lacking for us is that we didn’t know who was successfully onboarding or when someone left and we needed to revoke access to reallocate the license.
We approached governance by asking: what can we do at each step?
First, with the license: 1️⃣ How do you get them onboard? 2️⃣ How do you keep an audit trail of what they are doing? 3️⃣ How can we tell when they are leaving?
One other consideration was - what if someone is not able to use the license for 5-6 months? We started putting a timeline around it. If they don’t use the platform for 6 months, we get in touch with them and find out whether they need the license. Otherwise, we take it back.
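The six-month inactivity rule described above could be checked with something like the following. This is a minimal sketch in plain Python, not AXA XL's actual process: the user names, dates, the `licenses_to_review` helper, and the choice of 182 days as "six months" are all assumptions for illustration. In practice the last-activity dates would come from the platform's own audit or usage logs.

```python
from datetime import date, timedelta

# Assumed threshold: roughly six months of inactivity.
INACTIVITY_LIMIT = timedelta(days=182)


def licenses_to_review(last_active_by_user, today):
    """Return users whose last activity is older than the limit, sorted by name."""
    return sorted(
        user
        for user, last_active in last_active_by_user.items()
        if today - last_active > INACTIVITY_LIMIT
    )


# Hypothetical last-activity dates per licensed user.
last_active = {
    "analyst_a": date(2022, 9, 30),
    "analyst_b": date(2022, 2, 1),
    "analyst_c": date(2021, 12, 15),
}

flagged = licenses_to_review(last_active, today=date(2022, 10, 15))
print("Contact before reclaiming license:", flagged)
```

The function only produces a list of people to contact; as in the hangout, the decision to take a license back still follows a conversation with the user.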
The team has also started their own R training to support new users. We have helped them with a sort of Wiki to put together all of our R resources so that they know who to go to for a request, how to connect to a data source, etc. That sort of knowledge sharing was a requirement.
In terms of governance, we have also segregated the environments now: we have separate development and production. In development, we also have staging and non-prod.
When they are ready to publish a model, they come through a ticket. We go through it to see what they are deploying, which are the data sources, have they used any hard coded passwords, etc.
There is also a validation framework which we are using. Are they using licensed libraries? Are they using proper packages?
A few tips also shared at (17:06) for IT and data science teams to successfully work together:
Help the IT team understand the value of what you’re doing. Maybe your eventual output takes a week, a month, or even six months. It’s very important for them to be able to understand that value of what you’re going to bring in.
Explain to IT the flexibility that you need. Analytics and data science is all about the speed that comes with any platform.
It’s always a two-way street. Let them know that you’re here to support the governance, security and stability as well. Say, “I’m going to be really supportive of your guardrails but at the same time I’m expecting XYZ from your team.”
Collaboration is required. Each team knows how we work now and there are sometimes tradeoffs for both sides, but we are able to deal with that because we want to do what’s best for the business. This objective is the same both for technologists and the data science world.
Where the conflict comes in is when there is no shared vision. To get that shared vision, both teams need to talk.
This conversation also reminded me of the meetup with Gordon Shotwell at Socure on Creating Secure Systems for Growth: https://www.youtube.com/watch?v=UnLpB4IDpZU
Follow-up blog post on that meetup above as well, “How Data Scientists and Security Teams Can Effectively Work Together” https://www.rstudio.com/blog/how-data-scientists-and-security-teams-can-work-together/
Where to find more?
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu ► Data Science Hangout site: rstudio.com/data-science-hangout ► Add the weekly Data Science Hangout to your calendar: rstd.io/datasciencehangout
Getting Started with Shiny for Python - in the browser! || Winston Chang || Posit
Shiny makes it easy to build interactive web applications with the power of Python’s data and scientific stack. You can try out Shiny for Python without installing a single thing… All in the browser.
Learn more about Shiny for Python: https://shiny.rstudio.com/py/ Check out our interactive Shiny for Python examples: https://shinylive.io/py/examples/
Content: Winston Chang (@winston_chang)

Multiple Inputs in Shiny for Python || Winston Chang || RStudio
Shiny makes it easy to build interactive web applications with the power of Python’s data and scientific stack. You can try out Shiny for Python without installing a single thing… All in the browser.
Learn more about Shiny for Python: https://shiny.rstudio.com/py/ Check out our interactive Shiny for Python examples: https://shinylive.io/py/examples/
Content: Winston Chang (@winston_chang)

Plot Outputs in Shiny for Python || Winston Chang || RStudio
Shiny makes it easy to build interactive web applications with the power of Python’s data and scientific stack. You can try out Shiny for Python without installing a single thing… All in the browser.
Learn more about Shiny for Python: https://shiny.rstudio.com/py/ Check out our interactive Shiny for Python examples: https://shinylive.io/py/examples/
Content: Winston Chang (@winston_chang)

Create & Publish a Quarto Blog on Quarto Pub in 100 Seconds | Quarto Pub
Thomas Mock, Quarto Product Manager, walks you through how to build a simple blog with Quarto and share it with the world on quarto.pub, all in less than two minutes.
Quarto is a multi-language publishing system. It lets you include executable code blocks so that R, Python, Julia, or Observable JS output appears in your blog posts (and many other formats).
Quarto websites and blogs are particularly excellent ways to develop your technical skills and share your learnings with the world.
Resources, ⬡ Creating a Quarto Blog, https://quarto.org/docs/websites/website-blog.html ⬡ Publishing to Quarto Pub, https://quarto.org/docs/publishing/quarto-pub.html ⬡ Customize your Quarto blog or Website. This example creates and deploys a simple Quarto blog template, but there are ways to customize and style your content. Isabella Velásquez walks through this in detail at the Sept 2022 meetup, https://youtu.be/CVcvXfRyfE0 ⬡ Learn more about Quarto at quarto.org.
Requirements,
- To publish from the RStudio IDE, you’ll need to be working on a recent version of RStudio, v2022.07.1 or later.
- You may also work from JupyterLab, VS Code, or a notebook integrated with the Quarto CLI.
MLOps with vetiver in Python and R | Led by Julia Silge & Isabel Zimmerman
Many data scientists understand what goes into training a machine learning model, but creating a strategy to deploy and maintain that model can be daunting. In this meetup, learn what MLOps is, what principles can be used to create a practical MLOps strategy, and what kinds of tasks and components are involved. See how to get started with vetiver, a framework for MLOps tasks in R and Python that provides fluent tooling to version, deploy, and monitor your models.
Blog Post with Q&A: https://www.rstudio.com/blog/vetiver-answering-your-questions/
For folks interested in seeing what data artifacts look like on Connect, we have these for R: ⬢ Versioned model object: https://colorado.rstudio.com/rsc/seattle-housing-pin/ ⬢ Deployed API: https://colorado.rstudio.com/rsc/seattle-housing/ ⬢ Monitoring dashboard: https://colorado.rstudio.com/rsc/seattle-housing-dashboard/ ⬢ Create a custom yardstick metric: https://juliasilge.com/blog/nyc-airbnb/ ⬢ End point used in the demo: https://colorado.rstudio.com/rsc/scooby
Our team’s reading list (mentioned in the meetup)
Books: ⬢ Designing Machine Learning Systems by Chip Huyen: https://www.oreilly.com/library/view/designing-machine-learning/9781098107956/
Articles: ⬢ “Machine Learning Operations (MLOps): Overview, Definition, and Architecture” by Kreuzberger et al: https://arxiv.org/abs/2205.02302 ⬢ “From Concept Drift to Model Degradation: An Overview on Performance-Aware Drift Detectors” by Bayram et al: https://arxiv.org/abs/2203.11070 ⬢ “Towards Observability for Production Machine Learning Pipelines” by Shankar et al: https://arxiv.org/pdf/2108.13557.pdf ⬢ “The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction” by Breck et al: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/aad9f93b86b7addfea4c419b9100c6cdd26cacea.pdf
Web content: ⬢ How ML Breaks: A Decade of Outages for One Large ML Pipeline by Papasian and Underwood: https://www.youtube.com/watch?v=hBMHohkRgAA ⬢ MLOps Principles by INNOQ: https://ml-ops.org/content/mlops-principles ⬢ Google’s Practitioners Guide to MLOps by Salama et al: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf ⬢ Gently Down the Stream by Mitch Seymour: https://www.gentlydownthe.stream/
Speaker bios: Julia Silge is a software engineer at RStudio focusing on open source MLOps tools, as well as an author and international keynote speaker. Julia loves making beautiful charts, Jane Austen, and her two cats.
Isabel Zimmerman is also a software engineer on the open source team at RStudio, where she works on building MLOps frameworks. When she’s not geeking out over new data science techniques, she can be found hanging out with her dog or watching Marvel movies.


Submitting Your Work to the Table Contest | 2022 Table Contest
Rich Iannone walks through how to submit a table to the 2022 Table Contest. He explains considerations for each field, and how to update & edit your entry afterwards.
Learn more about the 2022 Table Contest at https://www.rstudio.com/blog/rstudio-table-contest-2022/

Data Science Hangout | JD Long, RenaissanceRe | Empathy When Integrating with Other Tools
We were recently joined by JD Long, VP - Risk Management at RenaissanceRe. JD is a financial risk manager, analytical polyglot, and purveyor of colorful metaphors.
As people have recently asked more about building communities, I’ll share this snippet here from 45:29 in his hangout. 🤝
I started the Chicago useR Group 13-14 years ago and the design pattern for that group was, I think, a really good one:
1️⃣ We had about four or five people who were going to get together, drink beer, and talk about R anyway, and what we effectively said was, “let’s get in the types of presentations the four or five of us would like to hear, and then invite other people.”
And if they don’t show up, we don’t really care because we were going to drink beer and listen to this anyway because we’re interested in this.
And that community grew wildly with me running it like as a benevolent dictator just getting the presentations I wanted to hear.
2️⃣ And then one of the things I observed is we were getting lots of new users of R. So I decided once a quarter we would have a beginner’s meetup–
And at the beginner’s night, we would make sure we had two things:
- Topics appropriate for beginners: like grouping and summing, and getting environments set up or updating packages. All the normal stuff people have friction with. Focus on that.
- Then we would make sure we had–I would call it now like a learner salon or something. We had a handful of people who were more experienced who said, I will sit at a table and anybody that brings their computer with questions, I will answer.
We would have really senior people with lots of experience there doing Q&A, and people could bring their own problems, literally eat pizza, drink beer, and look at code. Those beginner nights were super helpful.
So anyway, when people ask about building communities, I recommend those two things.
Resources shared in the chat: ► Esquisse package: https://github.com/dreamRs/esquisse ► Shinyuieditor: https://rstudio.github.io/shinyuieditor/ ► Shiny Developer Series episode 7 (talking about esquisse): https://shinydevseries.com/interview/ep007/ ► Another “bridge” tool for business analytics is radiant: https://radiant-rstats.github.io/docs/ ► Norm Conf: https://normconf.com/
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstudio Twitter: https://twitter.com/rstudio
To join future data science hangouts, add to your calendar here: rstd.io/datasciencehangout (All are welcome! We’d love to see you!)
Data Science Hangout | Tiger Tang, CARFAX | Quantifying the Hours Saved
We were joined by Tiger Tang, Manager, Data Science at CARFAX. Tiger (Chongtai) Tang is dedicated to building the Data Science team specializing in NLP and forecasting. He is a big fan of Shiny and he has a passion for the Data Science community.
We recommend checking out Tiger’s 2022 RStudio Conference Talk as well, “Saving 1000 hours with RStudio:” https://www.rstudio.com/conference/2022/talks/saving-1000-hours-rstudio/
How do you sell RStudio in your workplace? Build a work automation process
How do you build an automation process? 1️⃣ Look at a typical report & identify all the manual steps 2️⃣ These manual steps can usually be tied into 3 portions: getting the data, wrangling/analysis/visualization, and communication 3️⃣ You can replace these portions with R code, with the help of various R packages
What are the three types of automation? 1️⃣ Attended automation - reports that still need human involvement: use R code in RStudio 2️⃣ Unattended automation - don’t need human input, but need to happen at the same time: use RStudio + RStudio Connect 3️⃣ Hybrid - combination of the previous 2, human input will come from the stakeholder & they can kick off processes to get answers: use Shiny + RStudio Connect
Ok, time to sell it to decision makers
What are the benefits? 1️⃣ Reproducibility 2️⃣ Less human error 3️⃣ Cost benefit (hours saved)
Why weren’t they interested when you shared these?
The benefits listed above are great for selling to an R user who is concerned about the day to day workflow, but not decision makers who are more concerned about ROI.
Update your strategy in highlighting the benefits: 1️⃣ Cost benefit (hours saved) - If we go through with this we might be able to save 1,000 hours per year 2️⃣ Less human error 3️⃣ Reproducibility (as a free add-on travel insurance)
It’s still the same benefits, just in a slightly different order. Start with one that does not require too much context to understand.
What do you need to do the actual automation?
Understand the current process and document it. This includes understanding: ⬢ the business reasons ⬢ the occurrence (daily, weekly, ad-hoc) ⬢ how you will communicate ⬢ how often to update the process so it will not become obsolete ⬢ you are not always the original report owner, so you need to know when to stop and call for additional help.
From this you will understand the complexity, impact, and stability of each process, which can also help you decide which project to start with.
Top 3 recommendations for automation: 1️⃣ Always start with components: For example, if you have a process that involves SQL, Excel, and Outlook, code them individually because the same team will need this again and you can reuse the code.
2️⃣ Test, Test, Test: Capture all the scenarios possible.
3️⃣ Be practical and stay on target: Not everything needs to be fully automated. It’s not about building something cool with R but building something impactful with R.
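To make the "start with components" and "test, test, test" advice concrete, here is a minimal sketch of a report split into the three portions mentioned above (getting the data, wrangling/analysis, and communication). It is written in Python purely for illustration; the talk itself is about R, and the function names, the sample CSV, and the region/sales columns are all hypothetical stand-ins rather than anything from CARFAX's process.

```python
import csv
import io
from statistics import mean


def get_data(csv_text):
    """Component 1: get the data (parses CSV text here instead of querying SQL)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def summarize(rows):
    """Component 2: wrangle and analyze; average sales per region."""
    by_region = {}
    for row in rows:
        by_region.setdefault(row["region"], []).append(float(row["sales"]))
    return {region: mean(values) for region, values in sorted(by_region.items())}


def render_message(summary):
    """Component 3: communicate; build the text an email step would send."""
    lines = [f"{region}: {avg:.1f}" for region, avg in summary.items()]
    return "Average sales by region\n" + "\n".join(lines)


# Hypothetical sample input; a real report would pull this from its own source.
SAMPLE = "region,sales\nEast,10\nWest,30\nEast,20\n"

report = render_message(summarize(get_data(SAMPLE)))
print(report)
```

Because each portion is its own function, each one can be tested on its own and reused in the next report that needs the same source or the same delivery step.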
Structure for you:
1️⃣ Identify tasks in your workplace 2️⃣ Build a proposal with the benefits that matter to decision makers in your workplace 3️⃣ Build a requirement doc that identifies the right task to start with 4️⃣ Code by component and do plenty of tests, while staying on target 5️⃣ Share the progress from time to time
Where to find more?
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu ► Data Science Hangout site: rstudio.com/data-science-hangout ► Add the Data Science Hangout to your calendar: rstd.io/datasciencehangout
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstudio Twitter: https://twitter.com/rstudio
RStudio Sports Analytics Meetup: NFL Big Data Bowl 2022 Winners discuss the Math behind the Path
Led by Robyn Ritchie, Brendan Kumagai, Ryker Moreau, Elijah Cavan
Helpful Links: Kaggle: https://www.kaggle.com/code/robynritchie/punt-returns-using-the-math-to-find-the-path Github: https://github.com/ritchi12/punt_returns_using_the_math_to_find_the_path Optimal Path Generator Shiny app: https://rstd.io/meetup-shiny-app Related blog post, Tips for Getting Started With the NFL Big Data Bowl: https://www.rstudio.com/blog/tips-for-getting-started-with-the-nfl-big-data-bowl/
Timestamps: 3:02 - Start of presentation 4:07 - Meet the team 5:25 - Quick overview of 2022 NFL Big Data Bowl subject 7:00 - Finding the best path to the end zone 9:12 - Penalized Expected Arrival Time (PEAT) 13:10 - Take the PEAT to find the safest route (A-star algorithm) 15:11 - Comparing the optimal path to the observed path 18:30 - Key R functions for this project 23:37 - Optimal Path Generator Shiny app demo 33:38 - Comparing optimal paths (risk/reward) 35:44 - “Aiming for the side won’t provide” 37:54 - The precision of a good decision 38:38 - How do you have a good submission for the big data bowl 40:40 - Elements of a good big data bowl submission
Talk Abstract: Simon Fraser University has been a force in the NFL’s Big Data Bowl for years. Focusing on punt returns, this team used tracking data to develop an algorithm to find the optimal path for a successful return, quantify player formations on the field and predict yards remaining on any frame of the return. This year’s winning team will discuss their entry, the R code behind the optimal path and advice for this year’s entrants.
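For readers who have not met the A-star algorithm mentioned in the timestamps, here is a minimal sketch of A-star search on a small grid. It is not the team's code (their project is in R and linked above); it is a Python illustration in which the `risk` grid is a made-up stand-in for a per-cell penalty such as PEAT, and movement is limited to four directions for simplicity.

```python
import heapq


def a_star(cost, start, goal):
    """Return (total_cost, path) for the cheapest 4-connected path on a grid.

    cost[r][c] is the non-negative price of stepping onto cell (r, c);
    the start cell itself is not charged.
    """
    rows, cols = len(cost), len(cost[0])
    min_step = min(min(row) for row in cost)

    def heuristic(cell):
        # Manhattan distance times the cheapest step never overestimates.
        return min_step * (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1]))

    frontier = [(heuristic(start), 0, start)]  # (priority, cost so far, cell)
    best = {start: 0}
    came_from = {}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return g, path[::-1]
        if g > best[cell]:
            continue  # stale queue entry; a cheaper route was already found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols:
                new_g = g + cost[nr][nc]
                if new_g < best.get(nxt, float("inf")):
                    best[nxt] = new_g
                    came_from[nxt] = cell
                    heapq.heappush(frontier, (new_g + heuristic(nxt), new_g, nxt))
    return None  # goal unreachable


# Hypothetical penalty grid: the 9s mark a risky middle of the field.
risk = [
    [1, 1, 1, 1],
    [1, 9, 9, 1],
    [1, 9, 9, 1],
    [1, 1, 1, 1],
]
total, path = a_star(risk, (0, 0), (3, 3))
print(total, path)
```

The search goes around the high-penalty cells rather than straight through them, which is the same risk/reward trade-off the talk explores when comparing optimal paths.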
Speaker Bio: Robyn, Brendan, Ryker and Elijah are all graduate students in statistics at Simon Fraser University in Vancouver, Canada. They each have a passion for sports analytics and enjoy applying their stats knowledge to the game.
Q&A Timestamps: *rough estimate :) 15:00 - Do you think being in a general Statistics graduate program was beneficial to your work? As opposed to more specific like “Data Science” or “Sports Analytics”? 19:00 - Did you calculate PEAT at just a single point in time (at catch), or at multiple points during the return? 30:00 - Is it possible to share the code for future reference? 43:00 - What were the biggest challenges when going through the data? 45:00 - In terms of all your data inputs, do you see this extending to performance data like expected speed, fatigue etc., perhaps from practice etc.? 46:00 - At any point during the project, did you feel like you had hit a cul-de-sac or impasse? 48:20 - How do you get started working with player tracking data? 56:00 - Sounds like you were running the code locally? Were there any specific roadblocks to offloading that to the cloud, or was that just not really necessary?
Upcoming community events: rstd.io/community-events Feedback & Suggestions: rstd.io/meetup-feedback
Data Science Hangout | Jay Sewell, Harry Rosen | Prioritizing work with a centralized data team
We were joined by Jay Sewell, Director of Analytics at Harry Rosen. Jay is focused on plugging data into decisions and actions.
A snippet from the hangout last week:
We did an audit and we had 1,000 different reports that we had to migrate.
Rather than migrating all 1,000 reports where we couldn’t see who was using them - we developed this technique where we ask people:
Ok, you use this report…show me how you make a business decision from the data in this report.
A good question to ask is:
Imagine you pulled this up, and I said “Oh no, I just found out all this data is wrong and I gave you all new numbers that you’ve never seen before. How does your week change? How does your month change?”
Are you doing something different now that I’ve shown you new numbers?
If you can’t reasonably convince me that your week or month has changed because the report was wrong, it’s a useless report and we should not be looking at it.
We’re trying to get at plugging information into actions and decisions and stripping down reports that aren’t doing that. If you’re just looking at a report to pass on some information to someone else who is going to pass that on, it’s time to go.
To join future data science hangouts, add to your calendar here: rstd.io/datasciencehangout
Communicating the value of data science | Led by Merav Yuravlivker
Communicating the value of data science Led by Merav Yuravlivker, CEO & Co-Founder of Data Society
Four years ago, Merav sent out a small survey to better understand the challenges of data professionals. It quickly spread and hundreds of responses poured in from Frustrated Data Scientists, who didn’t feel understood or have the support they needed. From that, Data Society released the Data Science Communicator Toolkit to help alleviate their pain points.
Now, they’ve relaunched the survey to see how they’re doing today - what’s better, what’s worse, and what’s next. Merav will release the preliminary results of the updated survey and talk through how data professionals today can bridge the communication gap to foster better insights and increase their impact.
Survey link: https://bit.ly/3pxrejT Data Science Communicator Toolkit: https://www.rstudio.com/champion/business-case (under the section that says build your case with a presentation)
Speaker bio: Merav Yuravlivker is the CEO and co-founder of Data Society. She and her team train up thousands of professionals every year across Fortune 500 companies, large organizations, and federal agencies to help them use data better and solve tough challenges. Merav is passionate about delivering the impactful combination of education and data to transform how industries run today.
What is a Champion Chat? The champion chat is an informal meetup where we can ask questions to the community and share best practices with each other for advocating for data science at your organization. During these calls, we may also feature a short presentation from a community member who has championed data science and has a few tips to share with us all.
*Please note, any presentation portion of this meetup would be recorded, but the open discussion/ Q&A will not.
Who? All are welcome - no matter your role/industry/experience ⬢ No need to register for anything ⬢ It’s always okay to join for part of a session ⬢ You can just listen in if you want ⬢ You can ask anonymous questions too!
Helpful info on advocating for data science: rstudio.com/champion
Building a Blog with Quarto | Led by Isabella Velásquez, RStudio
Led by Isabella Velásquez
A few helpful links upfront: Quarto documentation: https://quarto.org/ Meetup presentation: https://rstd.io/build-quarto-blog Blog exercise GitHub repo: https://rstd.io/quarto-blog-exercise-repo Blog exercise Cloud project: https://rstd.io/quarto-blog-exercise-cloud Upcoming events: rstd.io/community-events Welcome to Quarto Workshop: https://youtu.be/yvi5uXQMvu4
For the examples above, please ensure you are running Quarto Version: 1.0 or higher.
Timestamps: (to be updated) 42:03 - Add a blog post
Abstract: A blog is a fantastic opportunity to record your data stories, gain exposure for your expertise, and support others in their data science journey. In this talk, I will discuss building a blog with Quarto. Quarto is the multi-language, next-generation publishing system from RStudio, with many new features and capabilities. Quarto websites include integrated support for blogging. You can quickly get up and running with a blog and focus on customization and style. Quarto also allows you to publish executable code blocks to include R, Python, Julia, or Observable JS output in your blog posts.
Speaker Bio: I am a content strategist, data enthusiast, and author. My main goal is to drive engagement around all the awesome things happening at RStudio.
Data Science Hangout | Ivonne Carrillo Domínguez, Bixal | Transitioning to data engineering
We were joined by Ivonne Carrillo Domínguez, Data Engineering Manager at Bixal. Ivonne is passionate about storytelling and empowering data professionals to jump to the cloud.
Here’s a snippet (55:00) including a few thoughts that both Ivonne and Brittany shared at the hangout with regards to some people being overlooked for data engineering roles based on their experience:
When I am hiring people for a data engineer position - I focus more on the programming skills. If you are a good engineer, you will learn Spark or MapReduce in time. You will end up using certain tools, but in my opinion I wouldn’t say no to someone because of a specific technology. If you find a good candidate, you can teach them or mentor them in specific frameworks.
When I started my first job as a data engineer, I didn’t know about big data or Spark. I had a lot of experience with Java and that’s why they hired me. I think knowing how to program is more important - and that’s the philosophy I use.
I think what they are looking for when they ask about certain tools is if you have experience with concurrency or parallel jobs. Sometimes you need to think a bit differently with distributed computing frameworks, but if you start playing around and reading about it, it will also help give you an opportunity to land a job.
Sometimes when you’re working through a third party like a hiring recruiter, they may not know all of the things that are needed for the role. They’re given the job description and they may not know “I really need someone who has this skill, but if they don’t have this particular one - I’m willing to train on that.” The recruiter might not know that information and this may even change depending on how long the role is open for.
What’s best for the candidate in that case is to get as close to the actual hiring manager as possible. Give them your resume and let them know your experience because sometimes they’ll look at your resume and would absolutely love to give you an interview, where the recruiter might say, “oh, you don’t meet X percent of the job description, therefore I’m not going to pass it on.” This is one of the disconnects between having a third party do that interfacing.
Data Science Hangout | Adam Bly, System | Decentralizing decision making
We were joined by Adam Bly, Founder & CEO at System. Adam asks, why haven’t we seen it all connected yet?
Primitives for leaders: (12:12)
Adam shared how he learned from Daniel Ek, the founder of Spotify - the importance of decentralizing as much of the decision making as possible to increase speed, autonomy, and ownership regardless of the size of a team.
This puts a great deal of emphasis on the primitives that you feel are important for every leader in the company and then for their teams to really embrace that, understand and make sure people really get it. This empowers teams to be agile, to be squads, and go off and do their thing.
We put a great deal of emphasis at System on a set of first principles that govern how we think about the product and technology, and then tensions that shape any major decision.
We put a great deal of emphasis on our values and talk a lot about those and use those as primitives when we’re making product decisions or technology decisions.
We also put a great deal of emphasis on the use of data broadly in the company to inform our decision making.
By really equipping everybody with those three things that we talk about from day one onward through professional development, 1:1s, and everything at System, it allows for a high degree of ownership, autonomy, and distributed decision making around the company.
Posit Meetup | Afshin Mashadi-Hossein, Bristol Myers Squibb | Framework for Data Collaboration
Led by Afshin Mashadi-Hossein, Sr Principal Scientist at Bristol Myers Squibb
Github link: https://github.com/amashadihossein/daapr Other pharma use cases: rstudio.com/champion/life-science RStudio for Clinical Reporting: rstudio.com/solutions/pharma Chat with RStudio: rstd.io/chat-with-rstudio
Abstract: For data science teams, data preparation takes substantial investment of time, data science expertise and subject matter proficiency. However, as the name implies, data preparation is typically viewed merely as a means to an end, encouraging creation of expensive but often single-use and fragile elements in data analysis workflows.
Rather than seeing data preparation as an obstacle to be removed, we propose a framework that recognizes the time and expertise invested in data preparation and seeks to maximize the value that can be derived from it.
Viewing analysis-ready data as a multi-purpose, modularly built product that should lend itself to collaborative development and maintenance, the framework of Data-as-a-Product (DaaP) aims to remove barriers to version tracking and collaborative data development and maintenance. Specifically, the framework, which is entirely implemented in R, enables joint code and data versioning based on git, standardizes metadata capture, tracks R packages used, and encourages best practices such as adherence to functional programming and use of data testing.
Collectively, the patterns established by the DaaP framework can help data science teams transition from developing expensive, single-use “wrangled” datasets to building maintainable, version-controlled, and extendable data products that could serve as reliable components of their data analyses workflows.
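As a rough illustration of what "joint code and data versioning" with standardized metadata can look like, here is a minimal sketch. It is not the daapr API (the framework is implemented in R and linked above); it is a Python illustration, and the function names, the record fields, the commit string, and the sample rows are all hypothetical.

```python
import hashlib
import json


def fingerprint(rows):
    """Hash a list of dict records so any change to the data changes the hash."""
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_record(name, rows, code_commit, packages):
    """Bundle the data hash with the code version and the packages used."""
    return {
        "name": name,
        "data_sha256": fingerprint(rows),
        "n_rows": len(rows),
        "code_commit": code_commit,  # hypothetical git commit id
        "packages": dict(sorted(packages.items())),
    }


# Two versions of a made-up data product built from the same code.
v1 = build_record("trial_summary", [{"id": 1, "dose": 10}], "abc1234", {"dplyr": "1.0.9"})
v2 = build_record("trial_summary", [{"id": 1, "dose": 20}], "abc1234", {"dplyr": "1.0.9"})

print(json.dumps(v1, indent=2))
print("data changed:", v1["data_sha256"] != v2["data_sha256"])
```

Storing a record like this next to the code commit means a later reader can tell whether a result changed because the data changed, the code changed, or a package was upgraded.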
Bio: Afshin is a data scientist who is passionate about putting engineering and computational tools to work to realize the potential of biomedical data in service to human health.
BioC 2022 - Hello, Quarto!
Mine Çetinkaya-Rundel, PhD, Professor of the Practice at Duke University and Data Scientist and Professional Educator at RStudio, Inc., gives her keynote presentation at the Bioconductor Conference 2022. Dr. Çetinkaya-Rundel presents Quarto, an open-source scientific and technical publishing system built on Pandoc.
Dr. Çetinkaya-Rundel began her presentation by introducing Quarto and her personal experience using it and teaching it to others. She then gave an overview of the R Markdown ecosystem, which includes packages like xaringan, distill, blogdown, rmarkdown, and more, and explained the use of Quarto with these packages along with some other Quarto highlights. Afterward, she gave her first demo of Quarto, which covered setup and some handy features, followed by a demonstration of publishing with Quarto and an overview of its collaboration features. A second demonstration covered collaborating and teaching with Quarto. To wrap up, she spoke about reimagining open source and the work done by Openscapes, whose mission is to use open practices to accelerate data-driven solutions and to increase diversity, equity, inclusion, and belonging in science. The presentation ended with a question-and-answer session with the audience.
Main Sections
0:00 Introduction 4:29 Quarto! 6:12 Share 10:57 The R Markdown ecosystem 11:59 Quarto highlights 13:40 Demo of Quarto 21:40 Demo Quarto Publishing 22:58 Quarto rundown 25:23 Collaborate 28:43 Demo 2 34:07 Teaching with Quarto 40:39 Reimagine 44:03 Q&A and Resources
More Resources
Bioconductor Conference Site: https://bioc2022.bioconductor.org/ BioC2022 Github: https://github.com/Bioconductor/BioC2022
Main Site: https://www.r-consortium.org/ News: https://www.r-consortium.org/news Twitter: https://twitter.com/Rconsortium LinkedIn: https://www.linkedin.com/company/r-consortium

Intro to Functional Data Analysis - Part 2 | Matthew Malloure, Dow Chemical
RStudio Meetup: Functional Data Analysis (Part 2) Led by Matthew Malloure, Dow Chemical
Link to slides: https://github.com/MatthewMalloure/RStudioMeetup_FDA
Intro to Functional Data Analysis (Part 1) https://youtu.be/nA9fVOCD8yM
Timestamps: 3:00 - Start of presentation 7:50 - Recap, what is functional data analysis (FDA)? 12:00 - Why do we need FDA? 16:40 - Initial step in FDA applications - smoothing 22:21 - What made you first interested in FDA? 25:38 - What is your decision framework for which basis function to choose? 27:50 - Screening additives compared to control 41:27 - Specific FDA problems can pop up at work and I’m not sure which analysis is right. Are there recommended resources for selecting approaches? 44:40 - Can you define “functional” as it is used in the FDA context? 49:30 - Functional regression 58:50 - Alternative modeling approaches
Abstract: The primary purpose of this presentation is to continue the introduction of functional data analysis methods that was kicked off during the March RStudio Energy Meetup by Santiago Rodriguez. During that session’s Q&A, two specific methods were frequently mentioned: Functional Principal Components Analysis (FPCA) and Functional Regression. In this talk, both methods are introduced in more detail, applied to a simulated example from the chemical industry, and compared to their univariate/multivariate analogues. Though no code will be shown in the presentation, commented R code used to produce all data, analyses, and figures will be provided.
Speaker Bio: Matt is an associate research data scientist supporting new product development within the Packaging and Specialty Plastics business at The Dow Chemical Company. His specialty areas include functional data analysis, Bayesian hypothesis testing, computational statistics, and experimental design. Prior to joining Dow he earned a BS in Statistics and MS in Biostatistics at Grand Valley State University and a PhD in Statistics from Texas A&M University.
Data Science Hangout | Lindsey Dietz, Federal Reserve Bank | Focus on the impact of the output
We were joined by Lindsey Dietz, Stress Testing Production Function Lead at the Federal Reserve Bank of Minneapolis. Lindsey leads a team of quants & data scientists implementing and analyzing stress testing models in the Federal Reserve System. She loves diving into the nitty gritty of data and using R code to better understand the statistical dynamics we observe in the banking sector.
Loved kicking off the session with tips for communicating technical results to the business.
⬢ It’s key to think of pretty much every audience as a non-technical audience, even when you have people you know are technical. It’s best not to assume that your audience ever knows the things you know because it’s unlikely that anyone will be as deeply immersed in a problem as you are.
⬢ Focus on the impact of the output first. I like to tell my team that what you might think is the smallest piece of work, someone else is going to think that is magic. I was actually just telling someone this earlier today. They were talking about the coding they had done and I said, “you know, the thing that is going to really impress people is that you generated this automated piece of documentation from your code basically for free.”
⬢ I remember someone at the RStudio Conference talking about “minimizing time to magic” and that is really the impact driver for most people.
⬢ Focus on the impact in your regular words and remove the jargon.
⬢ Trust yourself that you’ve done all the checking, the assumptions, and the technical details. Write that down for yourself and have a good appendix or technical paper to go with your work.
⬢ Non-technical audiences may ask you questions that you didn’t even think about that are really helpful in being proactive for future analysis.
⬢ Practice your translation skills, even if you know they’re not going to understand your world and you’re not necessarily going to understand theirs. For decision makers - bottom lines are important, right? How does this save us time, how does it save us money? These are key concepts that you should keep in mind for non-technical spaces.
Resources shared: Stress Testing Publications: https://lnkd.in/egXfajBF Info on Stress Testing data submitted by banks: https://lnkd.in/eqh5AckU Where to download some useful public bank data: https://lnkd.in/eRiceXaU Upcoming R Ladies meetups around the world! https://lnkd.in/dzcpRT-D Upcoming R user groups around the world as well! https://lnkd.in/esNku4Ei
Welcome to Quarto Workshop! | Led by Tom Mock, RStudio
Welcome to Quarto 2-hour Workshop | Led by Tom Mock, RStudio
Content website: https://jthomasmock.github.io/quarto-2hr-webinar/ FULL Workshop Materials (this was from a 2-day workshop): rstd.io/get-started-quarto Other upcoming live events: rstd.io/community-events
Double-check: Are you on the latest version of RStudio i.e. v2022.07.1 or later?
Packages used: tidyverse, gt, gtExtras, reactable, ggiraph, here, quarto, rmarkdown, gtsummary, palmerpenguins, fs, skimr
Pre-built RStudio Cloud with workshop materials already installed: https://rstudio.cloud/content/4332583
For follow-up questions, please use: community.rstudio.com/tag/quarto
Timestamps: 7:16 - What is Quarto? 8:28 - How does R Markdown work? 9:40 - Quarto, more than just knitr 13:56 - Quarto can support htmlwidgets in R and Jupyter widgets for Python/Julia 14:18 - Native support for Observable JavaScript 19:28 - Quarto in your own workspace (JupyterLab, VS Code, RStudio) 20:26 - RStudio Visual Editor mode 23:30 - VS Code YAML 26:02 - Quarto for collaboration 26:55 - How do you publish Quarto? (Quarto Pub, GitHub Pages, RStudio Connect, Netlify) 28:44 - What about Data Science at Work? 29:59 - Formats baked into Quarto (basic formats, beamer, ppt, html slides, advanced layout, cross references, websites, blogs, books, interactivity) 32:13 - What to do with my existing .Rmd or .ipynb? 33:16 - Why Quarto, instead of R Markdown? 40:50 - Text Formatting 41:30 - Headings 41:51 - Code (also merging R and Python in one document) 43:29 - What about the CLI? 44:55 - Navigating in the terminal 57:56 - PART 2: Authoring Quarto 1:00:22 - Output options 1:04:46 - Quarto workflow 1:12:06 - Quarto YAML intelligence 1:13:20 - Divs and Spans 1:22:13 - Figure layout 1:34:40 - Code chunk options 1:41:00 - Quarto and R Markdown (converting R Markdown to Quarto)
This 2-hour virtual session is designed for those who have no or little prior experience with R Markdown and who want to learn Quarto.
Want to get started with Quarto?
- Install RStudio v2022.07.1 from https://www.rstudio.com/products/rstudio/download/#download - this will come with a working version of Quarto!
- Webinar materials/slides: https://jthomasmock.github.io/quarto-2hr-webinar/
- Workshop materials on RStudio Cloud: https://rstudio.cloud/content/4332583
What is Quarto?
Quarto is the next generation of R Markdown for publishing, including dynamic and static documents and multi-lingual programming language support. With Quarto you can create documents, books, presentations, blogs or other online resources.
Should I take this?
As with all the community meetups, everyone is welcome. This will be especially interesting to you if you have experience programming in R and want to learn how to take advantage of Quarto for literate data science programming in academia, science, and industry.
This workshop will be appropriate for attendees who answer yes to these questions:
Have you programmed in R and want to better encapsulate your code, documentation, and outputs in a cohesive “data product”? Do you want to learn about the next generation of R Markdown for data science? Do you want to have a better interactive experience when writing technical or scientific documents with literate programming?
For more info on Quarto: quarto.org
A Beginner’s Guide to Shiny for Python || Winston Chang || Posit
Shiny makes it easy to build interactive web applications with the power of Python’s data and scientific stack.
Learn more about Shiny for Python: https://shiny.rstudio.com/py/ Check out our interactive Shiny for Python examples: https://shinylive.io/py/examples/
Content: Winston Chang (@winston_chang) Producer: Jesse Mostipak (@kierisi) Editing and Motion Design: Tony Pelleriti (@TonyPelleriti)

An Interview with Winston Chang: Building a Wordle App with Shiny for Python || RStudio
Shiny makes it easy to build interactive web applications with the power of Python’s data and scientific stack.
Learn more about Shiny for Python: https://shiny.rstudio.com/py/ Check out our interactive Shiny for Python examples: https://shinylive.io/py/examples/
Content: Winston Chang (@winston_chang) + Jesse Mostipak (@kierisi) Producer: Jesse Mostipak (@kierisi) Editing and Motion Design: Tony Pelleriti (@TonyPelleriti)

Data visualization and plotting with Shiny for Python || Carson Sievert || RStudio
Shiny makes it easy to build interactive web applications with the power of Python’s data and scientific stack.
Learn more about Shiny for Python: https://shiny.rstudio.com/py/ Check out our interactive Shiny for Python examples: https://shinylive.io/py/examples/
Content: Carson Sievert (@cpsievert) Producer: Jesse Mostipak (@kierisi) Editing and Motion Design: Tony Pelleriti (@TonyPelleriti)

Getting Started with {shinytest2} Part 2 || Exporting values || RStudio
00:00 Introduction 00:29 Exporting reactives 03:28 Using exportTestValues()
Part 1 - Getting started: https://youtu.be/SS1Na3c8lhk Part 3 - Using shiny.testmode: https://youtu.be/xDxa_mDwN04
Manually testing Shiny applications is often laborious, inconsistent, and doesn’t scale well. Whether you are developing new features, fixing bug(s), or simply upgrading dependencies on a serious app where mistakes have real consequences, it is critical to know when regressions are introduced. shinytest2 provides a streamlined toolkit for unit testing Shiny applications and seamlessly integrates with the popular testthat framework for unit testing R code.
shinytest2 uses chromote to render applications in a headless Chrome browser. chromote allows for a live preview, better debugging tools, and/or simply using modern JavaScript/CSS.
By simply recording your actions as code and extending them to test the more particular aspects of your application, it will result in fewer bugs and more confidence in future Shiny application development.
Read up on shinytest2 here: https://rstudio.github.io/shinytest2/
Learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Barret Schloerke (@schloerke) Motion design and editing: Jesse Mostipak (@kierisi)
Theme song: Brad PKL by Blue Dot Sessions (https://app.sessions.blue/browse/track/113507 )

Getting Started with {shinytest2} Part 3 || Using shiny.testmode in {shinytest2} || RStudio
00:00 Introduction 00:15 Testing production apps
Part 1 - Getting started: https://youtu.be/SS1Na3c8lhk Part 2 - Exporting values: https://youtu.be/7KLv6HdIxvU
Manually testing Shiny applications is often laborious, inconsistent, and doesn’t scale well. Whether you are developing new features, fixing bug(s), or simply upgrading dependencies on a serious app where mistakes have real consequences, it is critical to know when regressions are introduced. shinytest2 provides a streamlined toolkit for unit testing Shiny applications and seamlessly integrates with the popular testthat framework for unit testing R code.
shinytest2 uses chromote to render applications in a headless Chrome browser. chromote allows for a live preview, better debugging tools, and/or simply using modern JavaScript/CSS.
By simply recording your actions as code and extending them to test the more particular aspects of your application, it will result in fewer bugs and more confidence in future Shiny application development.
Read up on shinytest2 here: https://rstudio.github.io/shinytest2/
Learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Barret Schloerke (@schloerke) Motion design and editing: Jesse Mostipak (@kierisi)
Theme song: Brad PKL by Blue Dot Sessions (https://app.sessions.blue/browse/track/113507 )

Getting Started with {shinytest2} Part I || Example + basics || RStudio
00:00 Introduction 00:48 Overview of the demo Shiny app 03:00 Running record_test() 04:44 Results from record_test() 07:18 A note on .png files created during testing 08:52 Debugging with shinytest2 09:32 Using app$view() to open a visual representation of a headless browser
Part 2 - Exporting values: https://youtu.be/7KLv6HdIxvU Part 3 - Using shiny.testmode: https://youtu.be/xDxa_mDwN04
Manually testing Shiny applications is often laborious, inconsistent, and doesn’t scale well. Whether you are developing new features, fixing bug(s), or simply upgrading dependencies on a serious app where mistakes have real consequences, it is critical to know when regressions are introduced. shinytest2 provides a streamlined toolkit for unit testing Shiny applications and seamlessly integrates with the popular testthat framework for unit testing R code.
shinytest2 uses chromote to render applications in a headless Chrome browser. chromote allows for a live preview, better debugging tools, and/or simply using modern JavaScript/CSS.
By simply recording your actions as code and extending them to test the more particular aspects of your application, it will result in fewer bugs and more confidence in future Shiny application development.
Read up on shinytest2 here: https://rstudio.github.io/shinytest2/
Learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Barret Schloerke (@schloerke) Motion design and editing: Jesse Mostipak (@kierisi)
Theme song: Brad PKL by Blue Dot Sessions (https://app.sessions.blue/browse/track/113507 )

Hello, World! A Quick Tour of Shiny for Python || Carson Sievert || Posit
Shiny makes it easy to build interactive web applications with the power of Python’s data and scientific stack.
Learn more about Shiny for Python: https://shiny.rstudio.com/py/ Check out our interactive Shiny for Python examples: https://shinylive.io/py/examples/
Content: Carson Sievert (@cpsievert) Producer: Jesse Mostipak (@kierisi) Editing and Motion Design: Tony Pelleriti (@TonyPelleriti)

Hey Shiny Team, what are some of your biggest learnings from 2022? || Shiny Developers || RStudio
BIG THINGS happened on the Shiny team in 2022! Our team built out a new Shiny UI Editor, Shiny for Python, and Shiny for Python in the browser using WebAssembly. So we asked some of our Developers what their biggest learnings have been from building these products!
Learn more about Shiny for Python: https://shiny.rstudio.com/py/
Content: Winston Chang (@winston_chang), Carson Sievert (@cpsievert), Nick Strayer (@NicholasStrayer), Michael Chow (@chowthedog) Producer: Jesse Mostipak (@kierisi) Video editing + motion design: Tony Pelleriti (@TonyPelleriti)





Shiny Programming Practices || Joe Cheng || Posit
Have you ever wanted to sit down and talk with Joe Cheng, the creator of Shiny and CTO of Posit (RStudio) and ask him how he approaches programming? Look no further - we’ve got that conversation for you right here!
Shiny makes it easy to build interactive web apps straight from R or Python. You can host standalone apps on a webpage or embed them in Markdown-style documents or build dashboards. You can also extend your Shiny apps with CSS themes, htmlwidgets, and JavaScript actions.
Learn more about Shiny: https://shiny.rstudio.com/ Check out Shiny for Python: https://shiny.rstudio.com/py/ Explore our interactive Shiny for Python examples: https://shinylive.io/py/examples/
Content: Joe Cheng (@jcheng) Producer: Jesse Mostipak (@kierisi) Editing and Motion Design: Tony Pelleriti (@TonyPelleriti)

Shiny UI Editor Feature Tour || Nick Strayer || Posit (RStudio)
The Shiny UI Editor is a dynamic drag-and-drop interface to help you design beautiful Shiny apps. The Shiny UI Editor is a visual tool for building the UI portion of a Shiny application that generates clean and human-readable code.
The goal of the Shiny UI Editor is to allow people to build the broad-level UI for their Shiny app without writing code. The editor is intended for those who may not be comfortable with the HTML-style code of Shiny’s UI functions or who simply don’t want to fiddle with sizes to get things laid out correctly.
Learn more about the Shiny UI Editor here: https://rstudio.github.io/shinyuieditor/ And read up on GridLayout here: https://rstudio.github.io/gridlayout
Content: Nick Strayer (@NicholasStrayer) Producer: Jesse Mostipak (@kierisi) Editing and Motion Design: Tony Pelleriti (@TonyPelleriti)

Shiny UI Editor Project Walkthrough || Nick Strayer || RStudio
The Shiny UI Editor is a dynamic drag-and-drop interface to help you design beautiful Shiny apps. The Shiny UI Editor is a visual tool for building the UI portion of a Shiny application that generates clean and human-readable code.
The goal of the Shiny UI Editor is to allow people to build the broad-level UI for their Shiny app without writing code. The editor is intended for those who may not be comfortable with the HTML-style code of Shiny’s UI functions or who simply don’t want to fiddle with sizes to get things laid out correctly.
Learn more about the Shiny UI Editor here: https://rstudio.github.io/shinyuieditor/ And read up on GridLayout here: https://rstudio.github.io/gridlayout
Content: Nick Strayer (@NicholasStrayer) Producer: Jesse Mostipak (@kierisi) Editing and Motion Design: Tony Pelleriti (@TonyPelleriti)

Shinywidgets - An Overview || Carson Sievert || RStudio
Shiny makes it easy to build interactive web applications with the power of Python’s data and scientific stack.
Shinywidgets lets you use ipywidgets in Shiny for Python applications. We called it ipyShiny during development, but we’re launching as Shinywidgets! Learn more about how to integrate them into your Shiny for Python apps.
Learn more about Shiny for Python: https://shiny.rstudio.com/py/ Check out our interactive Shiny for Python examples: https://shinylive.io/py/examples/
Content: Carson Sievert (@cpsievert) Producer: Jesse Mostipak (@kierisi) Editing and Motion Design: Tony Pelleriti (@TonyPelleriti)

Wrangling data for a Shiny app in Python || Michael Chow || Posit
Shiny makes it easy to build interactive web applications with the power of Python’s data and scientific stack.
Learn more about Shiny for Python: https://shiny.rstudio.com/py/ Check out our interactive Shiny for Python examples: https://shinylive.io/py/examples/
Content: Michael Chow (@chowthedog) Producer: Jesse Mostipak (@kierisi) Editing and Motion Design: Tony Pelleriti (@TonyPelleriti)

Data Science Hangout | Rebecca Hadi, Lyn Health | Transparent & Visible Work
We were joined by Rebecca Hadi, Head of Data Science at Lyn Health. Rebecca is passionate about improving healthcare using data science and machine learning.
At 19:11, Rebecca shared her approach to communication with transparent and visible work.
Here are 4 things they do on their team to create transparent & visible work:
- Starting off with an open Data Science channel (different from their team channel) that’s open to the company.
For example, if we do an analysis for the sales team - we will post it in that channel and tag the sales team. That way anyone in the company can see the work. We keep conversations in the channel, rather than DMs. A lot of times someone else may be interested in that work.
- We also have a Google sheet that we call our insight repository. We have that pinned to our channel so when we post analysis, we also put them in our insight repository with one row per analysis.
For example: in-patient mortality model, a headline, the link to the PDF, and the GitHub link. The insight repository has been very well received as a place that someone can start when they have a question.
- I also host a weekly data science prioritization meeting and we use Monday as our project management tool.
This has been really helpful because in the past I’ve had some challenges with balancing requests from our clinical team versus sales team and understanding the priority. It can be hard to be that person in the middle. It’s a lot more productive when we can all have that conversation live. The sales team might say, “I need this for XYZ meeting” and you might have to say, “Ok, you know that’s going to push off these additions to the clinical dashboard.” Then the clinical team can weigh in, “We’re okay with that because of XYZ reason”
- I’m also a really big advocate of a monthly all-employee call and I try to get air time there pretty frequently.
Initially it was level setting on “this is what data science is and how my team operates.” That has been successful for setting this idea of, “you are going to ask me something and I’m going to ask you the why behind it.” This helped set the precedent ahead of time where I’m not challenging an idea, but it will help my team produce better work if we understand why you care about this and how you are going to use it because that gives us context in decisions we need to make throughout the process.
A few resources shared too:
Javier shared: A resource that I just came across last week (def not something I do daily). If you want to collect or manipulate JSON data with R, Tom Mock has a super-easy-to-follow writeup with several different approaches: https://lnkd.in/gwGxJpcW
Luke shared: Data is Plural newsletter is a decent off-the-beaten-path dredger of data sources: https://lnkd.in/grRGqQRA
Rebecca shared this book on management, The Hands-Off Manager: https://lnkd.in/gg2TU2mB
Where to find more?
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu ► Data Science Hangout site: rstudio.com/data-science-hangout ► Add the Data Science Hangout to your calendar: rstd.io/datasciencehangout
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstudio Twitter: https://twitter.com/rstudio
Data Science Hangout | Alec Campanini, Walmart | Using Shiny to make business decisions
We were joined by Alec Campanini, Senior Manager, Merchandising Operations Omnichannel Business Analytics at Walmart. Alec loves to standardize and scale data in a targeted and meaningful way without sacrificing speed of development.
(5:12) - Alec shared an awesome example of the way their team is using Shiny at Walmart:
- Their team’s main scope is “rest of market” - trying to figure out every item that exists in the rest of the market, no matter if it’s in the store, online, through TikTok trends, social listening, whatever it may be.
- We have a lot of market share data, sales information, pricing over time, and bestseller trends that are popping up. We do a lot to find out where merchants would be lacking inside of a brand.
- Today I own a Shiny app (which has been alive for about a year and a half now) that is pushed to all merchandising. We have about 12,000 users on it today across all of the merchants and merch-ops.
What does it do?
- Starting off it was insights on 28-30 different retailers in the e-commerce space. With so many items, there are a lot of different contracted data sets coming in. The biggest thing is normalizing all of those to the Walmart hierarchy. We need to have some way to tell a merchant that something from the rest of the market on a different website makes sense for them to look at.
- Our Shiny app and workflow allow us to quickly identify what percentage we have of a certain brand and drill down into things like:
- What are the top items being sold?
- Do I want stable items or trending items? Do I want something that I’ve never seen before?
- Do I already have the item but need to increase what I’m giving the customer?
- Do we need to lower the price?
- Is it shipping too slow?
- Maybe Amazon has 10 images for the item and we only have 5?
An example of this Shiny app helping make business decisions:
- We released a line of new golf tech products for Father’s Day that came from the Shiny app
- We had a lot of the sports & fitness merchants find out that we sell a lot more golf technology rather than golf balls online. These are merchants that used to be in the store, but they got into the e-commerce space as a new merchant.
- They think the trends are going to be the same in-store as online, but they found a different trend inside of our app and were able to launch a whole slew of things this past week, which was really cool.
Other timestamps:
- When making the case for code-first data visualizations over BI tools (25:58)
Packages shared in the chat:
- bs4Dash: https://lnkd.in/gPUBDW72
- rhandsontable: https://lnkd.in/gtYedxzY
- bigrquery: https://lnkd.in/gtPM6Pn2
Data Science Hangout | David Meza, NASA | People analytics for getting to the moon
We were joined by David Meza, AIML R&D Lead, People Analytics at NASA.
We had a great chat about people analytics for getting to the moon and beyond.
Here are a few snippets:
What areas do you focus on in people analytics? (5:19)
A lot of our focus is around natural language processing, graph algorithms, and graph databases. We create a knowledge graph & apply various algorithms to be able to deliver answers.
An example of this is around our skills and competency capabilities. Trying to get back to the moon and onto Mars, we need to identify the various skills and competencies that we have in our individuals. From a human perspective, that’s very labor intensive. We’re trying to use some of the concepts I’ve mentioned to be able to extract NASA-specific competencies from individuals or people applying for jobs to see how they align to our workforce.
We’re starting to develop pipelines and methods to say:
- We’ve got gaps here
- Or here’s your career path
- Here’s your training path
- Here is the number of individuals we have that meet 70% of our skills and we need to upskill them on the remaining 30%
People analytics can be broken into so many different areas:
- Attrition
- Time to hire
- Recruitment capabilities
- Learning/development
With regards to skills for people analytics:
- NLP is a good tool to have because a lot of the data we have is unstructured. We need to be able to pull from that.
- Understanding various classification models, how you’re grouping individuals, and the various metrics within the HR area.
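The "70% of our skills" idea above can be illustrated with a small sketch. This is only a set comparison under made-up assumptions; the skill names, the people, and the `coverage` helper are hypothetical, and it does not reflect the NLP and knowledge-graph pipeline David describes for extracting competencies.

```python
# Hypothetical required competencies for a role (illustrative only).
REQUIRED_SKILLS = {"nlp", "graph_databases", "classification", "hr_metrics", "python"}

# Hypothetical people and the skills already extracted for them.
PEOPLE = {
    "person_a": {"nlp", "classification", "hr_metrics", "python"},
    "person_b": {"python", "hr_metrics"},
}

def coverage(skills, required):
    """Fraction of the required skills that a person already has."""
    return len(skills & required) / len(required)

def upskill_candidates(people, required, threshold=0.7):
    """Map each person at or above the threshold to the skills they still lack."""
    return {
        name: sorted(required - skills)
        for name, skills in people.items()
        if coverage(skills, required) >= threshold
    }

candidates = upskill_candidates(PEOPLE, REQUIRED_SKILLS)
print(candidates)
```

Returning the missing skills alongside each name keeps the "gap" and the "training path" in one structure, which is one possible way to feed a follow-up report.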
How much of people analytics is data science and predictive modeling versus reports and business intelligence?
Right now everybody does a little bit of everything.
When I first got here, they were primarily focused on reporting and creating some type of visualization.
Over the last couple of years, we’ve been trying to break into a group that focuses on reports in general and a group looking more at the data analytics & modeling, creating the various algorithms that we can utilize for extracting skills and competencies or looking at our attrition models, for example.
Ideally, I would break it down to 5-6 visualizers, 10 data scientists and 4-5 data engineers or architects.
Resources shared:
- Joe shared an example of Shiny at NASA: https://lnkd.in/gRJjFX6B
- David shared Predictive HR Analytics: https://lnkd.in/gPPjjzPZ
- David shared Inclusalytics: https://lnkd.in/g48tdrMu
- Rachael shared the rstudio::conf workshop for admins: https://lnkd.in/eB9FZt-c
- Rachael shared the RStudio workshop for people analytics: https://lnkd.in/e4DdhFAV
- George shared an article on writing a resume for AI: https://lnkd.in/g3cF_vtD
RStudio Sports Analytics Meetup: SportsDataverse Initiative
Led by Saiem Gilani, Director of Data Science and Engineering for the Houston Rockets & founder of the SportsDataverse
Slides: https://slides.sportsdataverse.org Repo: https://github.com/sportsdataverse/sportsdataverse-rstudio/
Save the next sports meetup to your calendar: rstd.io/sports-meetup
If you’d like to ask questions anonymously during the meetup, you can also use: rstd.io/meetup-questions
Abstract: The SportsDataverse is a set of open-source sports data packages that work in harmony because they share common data representations and API design. Many of the packages are backed by associated data repositories updated nightly which allow for easy loading of pre-scraped datasets, particularly play-by-play, team and player box scores and season schedule datasets. In this presentation, we will discuss the updated wehoop package, which now serves as a full WNBA Stats API wrapper (100+ functions added) and if time permits, one of the 10+ other packages in the sportsdataverse.
Bio: Saiem Gilani is the Director of Data Science and Engineering for the Houston Rockets and founder of the SportsDataverse, a community of developers building easy-to-use sports analytics packages and open-source data repositories. He holds an MS in Analytics from Georgia Tech and a BS in Mathematics from Florida State.
Data Science Hangout | Tanya Cashorali, TCB Analytics | Saving millions with a Shiny app
We were joined by Tanya Cashorali, CEO & Founder at TCB Analytics. Tanya is passionate about playing with data and using it for good causes whenever possible. She is energized by working with ambitious people on difficult problems, is a huge advocate of rapid prototyping data products using Shiny, and can talk about it for hours!
What was your biggest win for a client? (23:30)
We’ve had a lot of wins moving someone from Excel into a reproducible R pipeline and then displaying all the results in Shiny.
One example is a pharmaceutical company that was analyzing clinical trial data super manually, generating reports in Excel and putting together a PowerPoint weekly.
It probably took about 20 hours of multiple people’s time.
We made a Shiny app and that turned this into pretty much no time at all. They sent us the data and it was updated monthly.
They were able to then take that Shiny app to senior management and clinical trial managers to make decisions based on data very quickly, helping them understand which clinical trial sites were having problems.
Another similar one at another pharmaceutical company was focused on drug manufacturing, where there are a lot of things that can go wrong.
When there’s a contaminant in a batch, it can basically cause millions and millions of dollars in company loss because they have to shut down manufacturing completely until they identify the problem.
This involves going up and downstream of these different drug products to identify the issues. It was taking a team of 5-10 people sometimes 6 months to identify the problem. Meaning for 6 months, you’re not able to manufacture drugs.
We built a Shiny app that built out a D3 directed graph. This enabled one person to go and type in the drug compound and see everything up and downstream. It now takes that one person maybe a week or several days to identify the problem.
That’s an instance where you could say this is literally millions of savings from a Shiny app.
To me that just shows the power of R and the ability to really streamline manual processes.
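The "see everything up and downstream" step can be sketched as a traversal over a directed graph. The example below is a minimal illustration in plain Python, not the app Tanya describes (which was built with Shiny and D3); the material names and the `build_graph` and `reachable` helpers are hypothetical.

```python
from collections import defaultdict, deque

def build_graph(edges):
    """Return (downstream, upstream) adjacency maps for directed edges."""
    downstream = defaultdict(set)
    upstream = defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return downstream, upstream

def reachable(start, adjacency):
    """Breadth-first search: every node reachable from start, excluding start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen and neighbor != start:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

# Hypothetical material flow: raw materials -> intermediate -> compound -> products.
EDGES = [
    ("raw_A", "intermediate_1"),
    ("raw_B", "intermediate_1"),
    ("intermediate_1", "compound_X"),
    ("compound_X", "product_1"),
    ("compound_X", "product_2"),
]

downstream, upstream = build_graph(EDGES)
print(sorted(reachable("compound_X", upstream)))    # everything feeding compound_X
print(sorted(reachable("compound_X", downstream)))  # everything compound_X feeds
```

Keeping two adjacency maps means the same search function answers both the upstream and the downstream question, which is the lookup a single investigator would need when tracing a contaminated batch.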
Resources shared:
- Rachael shared Libby’s officerDemo: https://github.com/LibbyHeeren/OfficerDemo
- Daren shared the xaringan package for producing super impressive presentations: https://github.com/yihui/xaringan
- Tanya shared Jesse’s Twitter as a great resource for learning Shiny: https://twitter.com/kierisi
- Javier shared xaringanthemer: https://pkg.garrickadenbuie.com/xaringanthemer/ and xaringanExtra: https://github.com/gadenbuie/xaringanExtra
- Tanya shared Hugo’s podcast: https://hugobowne.github.io/
- Tanya’s Twitter: https://twitter.com/tanyacash21
- Tanya’s presentation on “How to Hire and Test for Data Science Skills”: https://tcbanalytics.com/2017/11/07/tanya-cashorali-presents-strata-2017-nyc-hire-test-data-skills/
Posit Meetup | Jake Riley, Children’s Hospital of Philadelphia | Translating Facts to Insights
RStudio Healthcare Meetup:
Translating facts into insights at Children’s Hospital of Philadelphia Led by Jake Riley, data analyst at The Children’s Hospital of Philadelphia
Abstract: {headliner} is a new R package to add dynamic, insightful text to plots and reports. {headliner} generates useful talking points that users can string together using {glue} syntax. This makes it easy to write informative sentences without adding a lot of technical debt to a project. Learn how to get started with {headliner} and ways we have used it at The Children’s Hospital of Philadelphia.
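To make the idea of "talking points strung together with a template" concrete, here is a rough Python analog. It is not the {headliner} API: `compare_values` and `headline` are hypothetical names, and Python's `str.format` stands in for {glue} syntax.

```python
def compare_values(current, reference):
    """Return talking points describing how current differs from reference."""
    delta = current - reference
    if delta > 0:
        trend = "up"
    elif delta < 0:
        trend = "down"
    else:
        trend = "unchanged"
    return {
        "current": current,
        "reference": reference,
        "delta": abs(delta),
        "trend": trend,
    }

def headline(current, reference, template):
    """Fill a template with the talking points for two values."""
    return template.format(**compare_values(current, reference))

TEMPLATE = "Visits are {trend} by {delta} versus last week ({reference} to {current})."
print(headline(120, 100, TEMPLATE))
# prints: Visits are up by 20 versus last week (100 to 120).
```

Because the wording lives in the template rather than in the calculation, the same talking points can be reused for a chart title, a report sentence, or a dashboard caption.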
Speaker Bio: Jake Riley is a data analyst at The Children’s Hospital of Philadelphia. He is the author of several R packages related to data visualization and automated exploratory analysis. You can find his published work [simplecolors] and [shinyobjects] on CRAN with more packages on the way.
Timestamps: 0:49 - Start of talk 1:25 - Dashboards focused on facts vs. insights 2:56 - What’s a good title for a chart? 5:09 - Intro to headliner package 7:41 - using glue() under the hood 14:04 - helpers for working with data frames: compare_conditions() 18:41 - using ggtext 21:27 - example using pixar_films 23:40 - how they’ve used it at CHOP 28:05 - Next steps for headliner package 29:32 - Start of Q&A session
Questions: 29:32 - Can you use any package you want in your organization? 31:13 - How do you load previous datasets to compare to current datasets? 32:48 - When you mentioned a front page on RStudio Connect (with the headlines), what is that? 33:25 - Is anyone using this for manuscripts at CHOP now? 36:24 - What has the adoption of R or Python been within the hospital analytics team? 37:28 - My manager is very leery of R because of technical depth. Any suggestions for convincing her of R’s value? 42:22 - How does CHOP use R for non-clinical analysis? 43:36 - How do you train new people to use R? 46:28 - How do you compare last week’s analysis to this week’s? 49:37 - Were there any major challenges in creating the hospital’s internal package?
Resources/links shared: Jake’s LinkedIn: https://www.linkedin.com/in/jake-riley-70736a3/ headliner package: https://github.com/rjake/headliner waldo package: https://www.tidyverse.org/blog/2020/10/waldo/ Examples of R in Life Science & Healthcare: https://www.rstudio.com/champion/life-science Chris Bumgardner’s talk on building an R-based analytic practice at Children’s Wisconsin: https://youtu.be/pHZ8dsc0PhY simplecolors package to generate hex codes using uniformly named colors: https://rjake.github.io/simplecolors/ R Packages book by Hadley Wickham & Jenny Bryan: https://r-pkgs.org/
Meetup Links: Future events: rstd.io/community-events-calendar If anyone’s interested in speaking at a future meetup, we’d love to hear from you too! rstd.io/meetup-speaker-form


Programming Games with Shiny || Roll the Dice: with Quosures! || RStudio
00:00 Introduction
03:44 The pain of copy + paste
07:28 Going on a helper function adventure!
18:09 Ready for rlang
28:17 !! + enquo()
37:57 Benefits of the rlang approach
38:46 Embracing the embrace operator
41:33 Visualizing what’s happening using reactlog
You’ve most likely used Shiny to build a web app that displays data, but you can also use Shiny to build games! In this video series, Jesse and Barret pair program simple games in Shiny as a way to uncover and explore new features.
Read up on the embrace operator here: https://rlang.r-lib.org/reference/embrace-operator.html
Learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Barret Schloerke (@schloerke) and Jesse Mostipak (@kierisi) Animation, motion design, and editing: Jesse Mostipak (@kierisi)
Theme song: Hakodate Line by Blue Dot Sessions (https://app.sessions.blue/browse/track/111291)

Enabling Citizen Data Scientists at Dow Chemical with Posit Academy
Led by James Wade, Associate Research Scientist at Dow Chemical
Timestamps: 2:46 - Start of presentation 5:25 - Goal: “apply science and engineering technical expertise along with data science tooling to innovate in the materials science arena.” 6:36 - What does citizen data science mean? 8:05 - Data science as an interdisciplinary endeavor - looking to build a community of innovators 9:30 - Translating data to decisions 11:03 - Guidelines for success (data organizations, data access, data analysis, value preservation) 13:30 - Welcoming new users in an approachable, collaborative, and secure workspace with RStudio Team 14:25 - Making sure you can rapidly deploy your insights to others 16:25 - What is RStudio Academy? 20:55 - What do you need for academy? (Academy learners: 5-7 per cohort, cohort mentors from RStudio & your group, and a project - the closer to your work the better) 22:15 - Who is a good candidate? 23:55 - Who might not be the best candidate? 26:00 - What makes a good cohort? (similar work group, time zone, and skill level) 27:27 - Feedback (Are they still using the content they learned? 16 out of the 17 survey respondents were still writing code 6 months after) 31:42 - Community building (want to have a landing zone for people to continue to learn) 32:31 - RStudio Academy success story at Dow 35:30 - Start of Q&A portion
Questions: 36:00 - How do you help someone who knows coding would be useful but can’t motivate to take 5 steps back to take 10 steps forward? 37:55 - How can more advanced users participate in developing curriculums? 39:44 - Does Academy also teach good coding style and version control? 41:00 - If you’re trying to “sell” Academy to the individual who would fill the group mentor role, what level of commitment and bandwidth do they need to have? 42:13 - Is the type of data you work with relevant to the work you do at Dow? or random / set datasets regardless of which company you’re with? 43:00 - What other ways of teaching R have you tried (or considered) at Dow? How does Academy compare? 44:55 - What is the duration of RStudio Academy? 46:15 - Can you have multiple cohorts go through at the same time? What if we want to up-skill hundreds of people? 48:20 - How did you find out who might be interested and get the word out? 50:08 - Advertising that you help learners up-skill in coding seems like a good way to set your company apart from others, are you hiring? 51:25 - After the RStudio Academy 10 week training is the Academy team still available for questions, support or consult? 53:01 - Is Academy only for R? 53:38 - How can a data science student participate in RStudio Academy? (https://www.rstudio.com/conference/2022/workshops/intro-to-tidyverse/ ) 54:44 - How do you collaborate with others outside of Dow? 57:20 - How does RStudio Academy handle sensitive data? 1:00:20 - Do you have statistics on how many graduates are still using R?
Abstract: In chemistry and materials science research, data is messy, unstructured, and scattered. Solving this problem requires researchers to deeply embed within data generation and analysis workflows.
We are on a multi-year journey to equip scientists and engineers with guidance and tools to extract insights from their data. To this end, we have developed a set of 15 guidelines designed to move our organization toward a collaborative, reproducible work process in a dynamic data-diverse environment.
In this talk, I will share our lessons from this journey learned through teaching, community building, and collaboration with a particular focus on the integration of language agnostic RStudio tools, products, and programs. I will especially be focusing on our experience with RStudio Academy.
Speaker Bio: James is a research scientist working in the chemicals manufacturing industry as part of a research and development team. James applies materials characterization and data science with a special interest in sustainable materials design to develop new capabilities for research. His current focus is on augmenting materials characterization innovations with statistical analysis, machine learning, and data visualization.
For more information on RStudio Academy: rstudio.com/academy Link to speak with RStudio: rstd.io/chat-with-rstudio
Rich Iannone || New features in {gt} 0.6.0! || RStudio
00:00 Introduction 00:18 sub_missing() 03:51 Markdown formatting in sub_missing() 04:51 sub_zero() 07:34 sub_small_vals() 13:08 sub_large_vals() 16:25 final thoughts
A new version of the R package {gt} has been released! We are now at version 0.6.0 and there are now even more features that’ll make your display/summary tables look and work much, much better. Let’s run through some of the bigger changes and see the benefits they can bring!
New functions for substituting cell data
We now have four new functions that allow you to make precise substitutions of cell values with perhaps something more meaningful. They all begin with sub_ and that’s short for substitution!
sub_missing() (formerly known as fmt_missing())
Here’s something that’s both old and new. The sub_missing() function (for replacing NAs with… something) is new, but it’s essentially replacing a function that is old (fmt_missing()).
The missing_text replacement of “—” is actually an em dash (the longest of the dash family). This can be downgraded to an en dash with “–” or we can go further with “-”, giving us a hyphen replacement. Or, you can use another piece of text.
If you’re using and loving fmt_missing(), it’s okay! You’ll probably receive a warning about it when you upgrade to {gt} 0.6.0 though. Best to just substitute fmt_missing() with sub_missing() anyway!
sub_zero()
The sub_zero() function allows for substituting zero values in the table body.
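As a rough sketch of the two functions above, the following builds a small table and substitutes its NA and zero values. The data frame, its column names, and the replacement strings are made up for illustration; the argument names (columns, missing_text, zero_text) are the ones documented for gt 0.6.0.

```r
library(gt)

# Hypothetical data (not from the video): NA and zero values in two columns
sales <- data.frame(
  item  = c("apples", "pears", "plums"),
  units = c(12, NA, 0),
  price = c(NA, 1.5, 0)
)

tbl <- gt(sales)

# Replace NAs in `units` with custom text
tbl <- sub_missing(tbl, columns = units, missing_text = "MISSING")

# Replace NAs in `price` with the default missing_text ("---"),
# which gt renders as an em dash; "--" would request an en dash
tbl <- sub_missing(tbl, columns = price)

# Replace zeros in both columns with custom text
# (the documented default for zero_text is "nil")
tbl <- sub_zero(tbl, columns = c(units, price), zero_text = "none")

tbl
```

Printing tbl in an interactive session should display the table with the substitutions applied. Restricting each call to specific columns keeps the replacements from touching columns where an NA or a zero is meaningful.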
sub_small_vals()
Next up is the sub_small_vals() function. Ever have really, really small values and really just want to say they are small?
With sub_small_vals() we can reformat smaller numbers using the default threshold of 0.01.
Small and negative values can also be handled but they are handled specially by the sign parameter. Setting that to “-” will format only the small, negative values.
You don’t have to settle for the default threshold value or the default replacement pattern (in small_pattern). Both can be changed, and the “{x}” placeholder in small_pattern (which stands for the threshold value) can even be omitted.
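The three behaviors described for sub_small_vals() can be sketched as follows. The data frame and the replacement word "trace" are illustrative assumptions; threshold, small_pattern, and sign are the documented argument names, and the placeholder for the threshold is written {x} inside small_pattern.

```r
library(gt)

# Hypothetical data (not from the video)
measurements <- data.frame(
  label = c("tiny positive", "tiny negative", "ordinary"),
  value = c(0.004, -0.004, 0.25)
)

# Default behavior: small positive values (below the 0.01 threshold)
# are replaced using the default pattern, which includes the threshold
tbl_default <- sub_small_vals(gt(measurements), columns = value)

# sign = "-" targets only the small negative values instead
tbl_negative <- sub_small_vals(gt(measurements), columns = value, sign = "-")

# A custom threshold and a pattern that omits {x} entirely
tbl_custom <- sub_small_vals(
  gt(measurements),
  columns = value,
  threshold = 0.005,
  small_pattern = "trace"
)

tbl_custom
```

To cover both small positive and small negative values in one table, the two calls can be applied one after the other to the same gt object, one with each sign.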
sub_large_vals()
Okay, there’s one more substitution function to cover, and this one’s for all the large values in your table: sub_large_vals(). With this you can substitute what you might consider as too large values in the table body.
Large negative values can also be handled, but they are handled specially by the sign parameter. Setting that to “-” will format only the large values that are negative. You don’t have to settle for the default threshold value or the default replacement pattern (in large_pattern). Both can be changed, and the “{x}” placeholder in large_pattern (which stands for the threshold value) can even be omitted.
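A matching sketch for sub_large_vals(), again with made-up data and an arbitrary replacement phrase. The documented default threshold is 1E12, so the example lowers it to something that ordinary data might reach.

```r
library(gt)

# Hypothetical data (not from the video)
totals <- data.frame(
  label = c("huge positive", "huge negative", "ordinary"),
  value = c(5e12, -5e12, 1000)
)

# Default behavior: positive values at or above the default threshold (1E12)
# are replaced using the default pattern, which includes the threshold
tbl_default <- sub_large_vals(gt(totals), columns = value)

# sign = "-" targets only the large negative values
tbl_negative <- sub_large_vals(gt(totals), columns = value, sign = "-")

# A lower threshold and a pattern that omits {x}
tbl_custom <- sub_large_vals(
  gt(totals),
  columns = value,
  threshold = 1e6,
  large_pattern = "too big"
)

tbl_custom
```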
Final thoughts
We are always trying to improve the gt package with a mix of big features (some examples: improving rendering, adding new families of functions) and numerous tiny features (like improving existing functions, clarifying documentation, etc.). It’s hoped that the things delivered in gt 0.6.0 lead to improvements in how you create and present summary tables in R. If there are features you really want, always feel free to:
File an issue: https://github.com/rstudio/gt/issues
Talk about your ideas on the Discussions page: https://github.com/rstudio/gt/discussions
Learn more about the gt package here: https://gt.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Rich Iannone (@riannone) Motion Design & editing: Jesse Mostipak Music: Nu Fornacis by Blue Dot Sessions https://app.sessions.blue/browse/track/98983

Data Science Hangout | Travis Gerke, PCCTC | Tips for Very Remote Work
We were joined by Travis Gerke, Director of Data Science at The Prostate Cancer Clinical Trials Consortium. Travis is an enthusiastic proponent of the use of R for data science in the clinical trials space and an advocate for productive remote work environments.
Tips for very remote work
- Ok, first things first - how do you ensure you have internet?
- I use a 5G Netgear hotspot and have redundant phone plans with T-Mobile, AT&T, and Verizon. It depends where we are and what kind of service is going to be best, but I’ve never had a challenge streaming videos or with Zoom.
- If you do get a hotspot, $20 boosters can be tremendously helpful too and help funnel the signal into your hotspot a bit better.
- If you want to geek out on this, the best resource is rvmobileinternet.com.
- Even in the remotest places we’ve been, I was still able to maintain Zoom meetings. On a week where I am furthest from civilization, I will plan ahead and focus more on learning and heads-down work. It’s actually kind of good: it gives you a little bit of space from the usual grind of meetings.
- How do you balance the time you can be available for conversations or when you’re traveling?
- It’s something that one has to be very intentional about and communicate broadly. I do work odd hours, because I’m West Coast now and most of my colleagues are East Coast. I try to wake up early and stick to their schedule.
- It really comes down to communication. I will let people know if I’m traveling and block time on my calendar.
- I think async is the future of most work environments for data science. I’d recommend checking out content from Chris Herd on Twitter too: https://twitter.com/chris_herd
- If I send an email at a time like 2am, I make sure to communicate that it doesn’t mean I expect anyone to see or respond at that time.
- A few people mentioned putting something in an email signature to let people know. Here’s an example: “I work on a flexible work schedule and across a number of time zones so I’m sending this message now because it works for me. Feel free to read, act on or respond at a time that works for you.”
- What tools are there for async work to work through something complicated without meetings?
A few ideas and tips shared from the group:
- We lean into GitHub whenever we can. The process of writing down where you’re stuck and/or how you solved something is good.
- When working with non-data scientists, sometimes you just have to have the meeting and that’s fine. It can be a lot more efficient that way but the drawback is that you don’t end up documenting. Writing things down reduces those institutional knowledge silos.
- People mentioned using: Slack, video messages on Slack, Teams, Loom for sharing video screen shares, Snagit, Discord, Fuze, Zoom whiteboard
Where to find more?
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu ► Data Science Hangout site: rstudio.com/data-science-hangout ► Add the Data Science Hangout to your calendar: rstd.io/datasciencehangout
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstudio Twitter: https://twitter.com/rstudio
Tom Schenk & Bejan Sadeghian | Making Microservices Part of Your Data Team
Making microservices a part of your data science team
Led by Tom Schenk & Bejan Sadeghian at KPMG
Timestamps: 1:09 - Start of presentation 4:00 - Challenges and trade-offs of a growing team (how I stopped worrying about hiring) 8:14 - What are microservices? (help separate out the different layers of an app) 9:36 - Hosting other web technologies on RStudio Connect (ex. React) 12:25 - Simple Hello World example of microservices 16:00 - Reason to separate out logging 17:15 - How to design & plan microservices (moving from a monolithic Shiny app) 17:51 - Challenges to getting started with microservices 21:02 - How do you address getting started? (domain driven design) 23:17 - Applying cloud design patterns 25:37 - Separation of development duties 27:22 - Addressing any risks that come with microservices 29:17 - Considering costs and benefits 30:11 - Microservices in action: demo of KODA app (making changes to the organization) 36:21 - PowerBI interacting with the same microservice from Connect 38:30 - Growing teams face a trade-off of complexity and simplicity (KPMG’s path)
Questions: 43:28 - Can you use a Shiny front end together with a microservice backend? 44:09 - Do you hire separately for back-end data science development and front-end Shiny UI development? 46:00 - Are all microservices managed by a centralized unit? 47:02 - Who can access RStudio Connect in your organization? 48:29 - When you decided to go the microservice route, what was your first step? 50:47 - What roles are you hiring for? 52:20 - Might you suggest some web service servers that host R-based or Python services? 53:33 - Are apps built with microservices as responsive as those that adopt a monolithic architecture or do microservices introduce a lag? 55:09 - Can you show the back-end response data through developer tools? 56:45 - Can you speak more about the logging microservice? Did you build it ground-up or did you adopt an off-the-shelf package or app?
Abstract: Whether or not you’ve heard of microservices architecture, you may want to know how microservices can help you scale R-based applications across an enterprise.
As data science teams, and their applications, grow larger, teams can experience growing pains that make applications complex, difficult to customize, or challenging to collaborate on across large teams. This meetup will discuss what microservices are, how they compare to Shiny, how they can help a data science team, and how you can deploy microservices using your RStudio Connect environment.
This meetup will help you understand several key items:
• The basic concept of microservices and their benefits, such as making your code modular, enabling domain-driven design, reducing the complexity of application development, and facilitating larger development teams.
• How to use the Plumber package to deploy APIs as part of a microservices architecture.
• How you can work with front-end development teams using their preferred framework (e.g., React, Angular, Vue) using RStudio Connect.
We will show a widely-used application built using a microservices architecture and hosted in RStudio, including before-and-after comparisons that show how a microservices framework leads to a better-looking and better-functioning application. Our team will discuss the journey and growth to arrive at the new approach to make development easier within a quickly growing group.
Speaker Bios: Tom Schenk Jr. is a researcher and author on applying technology, data, and analytics to make better decisions. He’s currently a managing director at KPMG. He has previously served as Chief Data Officer for City of Chicago.
Bejan Sadeghian is a director of analytics at KPMG and leads data science development, which spans from advanced analytics to machine learning engineering.
For upcoming events: rstd.io/community-events-calendar Info on RStudio Connect: https://www.rstudio.com/products/connect/ To chat with RStudio: rstd.io/chat-with-rstudio
Data Science Hangout | Alice Walsh, Pathos | Improving an Interview Experience
We were joined by Alice Walsh, PhD, VP of Translational Research at Pathos. Alice works in drug development, where she is excited about the potential of computational research to yield breakthroughs for patients.
Loved that Alice also asked this question back to the audience:
How do I make an interview a good experience for a candidate? Or have you had any nightmares that’d be helpful to share?
A bunch of thoughts shared from the group:
⬢ I’ve had way more success not giving a technical interview, and having the technical interview be more of a discussion where I’m not even asking them to whiteboard anything or it’s just talking.
⬢ If I asked them, “how do you develop a shiny app”, I’d much rather someone tell me I’ve never developed a shiny app in my life but I use R Markdown every day. That tells me a lot about their ability to actually jump in and learn something new and their transparency.
⬢ I’m much more interested in the process. How do people approach a problem and solve challenges that they encounter versus the specific project they worked on because they’re not going to work on that project ever again with me. It’s going to be a new project so they will need to learn something anyway.
⬢ I’ve had success hiring from meetups or hackathons. Seeing people here and the way they problem solve gives you a lot of insight about these individuals.
⬢ My company actually does do a technical interview and we give candidates a data set while they’re on site or in a Teams meeting. We give them an hour to see what sorts of insights they find with a few very specifically directed questions. What we’re often looking for is not someone to have perfect answers to those questions - it’s really about understanding how they looked at the data set, what other information they want, and what do you wish you had more time to do. You get to see how people work through something and it’s okay if they don’t have a perfectly polished presentation.
⬢ I’ve had a nightmare interview that became a pop quiz on R stuff: “What are all the packages in the tidyverse?” (And at the time I didn’t use the tidyverse; I was using base R.)
⬢ A nightmare one that sticks in my head was, please describe in detail the differences between Python 3 and Python 2.
⬢ I think, “this is something I would Google” is a valid answer sometimes because even if I don’t know this, I know where to find it and am really confident in my ability to Google this.
⬢ Honestly, if I asked somebody a question and they said this is something that I know I could find the answer to, that would be a perfect answer to me. Not knowing but knowing where to find the information is great.
⬢ I went on 3 interviews and they each had a technical part where for every single one of them the answer was: dynamic programming. They must have gone somewhere and decided that was the algorithm to ask about. I found that a bit ridiculous because it wasn’t relevant to what they were working on and it was off-putting. Now, when I interview people I try to make it more of a conversation around the data and what we might actually be doing.
⬢ I have an optional take home: “here’s a data set, take an hour and tell me something. Use whatever tools you want: Excel, R, Python, an abacus.” The key thing I want to see is a written output of what you did. I’m still on the fence, though, because I know a lot of people are anti-technical component. I’m still trying to figure out if it is really helping us make the best hiring decisions.
Also wanted to share a link to a job posting on Alice’s team! She has simplified to just one posting, but they have a couple of openings. They are hiring both for more experienced scientists and for folks transitioning from academic research with no drug development experience:
{gt} Table Battles || Eurovision || RStudio
00:00 Introduction 00:07 Jesse’s gt table, with a focus on flag emoji and interactivity via a Shiny app 09:50 Rich’s gt table, with a focus on CSS and embedded animations
Code: https://github.com/kierisi/rstudio_videos/tree/main/gt/table-battles
Learn more about the gt package here: https://gt.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Rich Iannone (@riannone) & Jesse Mostipak (@kierisi) Motion Design & editing: Jesse Mostipak Music: Gemeni City by Blue Dot Sessions https://app.sessions.blue/browse/track/113567

Data Science Hangout | Lindsey Clark, Healthcare Bluebook | Measuring success of data science
We were joined by Lindsey Clark, Director of Data Science at Healthcare Bluebook, to talk about evolving and leading a data science/AI-first organization.
One of the topics of discussion (33:46) was measuring the success of a data science team and calculating ROI.
How do you measure success for a data science team?
Lindsey shared a few tips from her experience at two different organizations:
One measure of success was the conversion rate of projects. How many projects come to us and what percent are actually converted to an actual deployment? Our target was 5% which may sound low, but the idea was that we want to fail fast. It can be like the drug discovery mentality - if things aren’t going to work, we want to know early on.
We also had targets around the number of specific models we wanted, which we called our data science portfolio library. These didn’t need to be really fancy machine learning; a lot of them were business-logic ranking models.
Another measure was product value and financial targets around that. Could we pinpoint that our model went into the data product and that it resulted in more clients contracting with us, or in greater revenue because we were able to upsell a data product due to another feature we added?
In my position now, we have recently been acquired, so we are working on the success measures and what that looks like. We do utilize the Objectives and Key Results (OKR) strategy from the Measure What Matters book.
We plan OKRs quarterly and try to plan projects in increments of no longer than three months in order to keep things going and satisfy the business.
Data Science ROI can of course be a difficult thing to measure. Data science can be a capital expense at times and sometimes you have to look at this team that’s going to do high risk projects as research. Things are going to fail and that’s okay. We’re willing to invest the money for our team and absorb that cost for what we potentially get out of the team in terms of deployed solutions, specific insights, and research that are valuable to the business.
Using Python with RStudio Team
Led by David Aja, Solutions Engineer at RStudio
00:30 - Overview of RStudio Team - RStudio Workbench, RStudio Connect, RStudio Package Manager 2:38 - Use case of RStudio Package Manager with Python packages 4:29 - RStudio Workbench (ability to use RStudio, Jupyter Notebook, JupyterLab, VS Code) 5:48 - VS Code running on RStudio Workbench 11:50 - Deploying a Streamlit app to RStudio Connect 14:00 - Pause for questions 18:55 - Jupyter on RStudio Workbench 19:41 - Extension in Jupyter that gives you push-button publishing from RStudio Workbench 20:35 - Publishing with source code vs. publishing only the finished document 21:10 - Hiding the code that is generated when publishing Jupyter to RStudio Connect 23:30 - Creating a custom url in RStudio Connect for your content 24:30 - Adding viewers / collaborators to your content in RStudio Connect 25:18 - Scheduling your Notebooks to run repeatedly 25:54 - Stepping back to describe the differences between RStudio open-source IDE and RStudio Workbench
*please note that this meetup will cover our enterprise product, RStudio Team but all who are interested in joining are welcome!
Many Data Science teams today are bilingual, leveraging both R and Python in their work. While both languages have unique strengths, teams frequently struggle to use them together:
⬢ Data Scientists constantly need to switch contexts among multiple environments.
⬢ Data Science Leaders wrestle with how to share results consistently and deliver value to the larger organization, while providing tools for collaboration between R and Python users on their team.
⬢ DevOps engineers and IT Admins spend time and resources attempting to maintain, manage and scale separate environments for R and Python in a cost-effective way.
Join David Aja in this meetup, which will highlight how other data science teams are able to solve these problems with RStudio Team by:
⬢ Combining R and Python in a single data science project.
⬢ Launching and managing RStudio, Jupyter Notebooks, JupyterLab, and VS Code from the RStudio Workbench environment.
⬢ Sharing Jupyter Notebooks, Python APIs via Flask, Dash, Streamlit, Bokeh, FastAPI, Shiny, R Markdown, etc. with the business through RStudio Connect.
⬢ Controlling and distributing Python and R packages with RStudio Package Manager.
A few helpful links shared and mentioned during the call:
⬢ Examples David used: https://lnkd.in/g8bdbj7Q ⬢ Example usage patterns of Using Python with RStudio: https://lnkd.in/gek3BhgW ⬢ Helpful place for questions about using Python in RStudio: https://lnkd.in/gWRV2rbG ⬢ Model Management with Python example: https://lnkd.in/gyX2YVvi
Product / Conference Questions: ⬢ More info on RStudio Team: https://lnkd.in/gv4YQj2G ⬢ If you’re starting to champion data science at your company: rstudio.com/champion ⬢ If you’d like to chat about RStudio Team: rstd.io/chat-with-rstudio ⬢ RStudio Conference: https://lnkd.in/dcrz79y ⬢ System Admin Workshop at RStudio Conference: https://lnkd.in/eB9FZt-c
Programming Games with Shiny || Roll the Dice || RStudio
00:00 Introduction 01:40 Rolling with eventReactive( ) 06:26 Reducing eventReactive( ) to reactive( ) + isolate( ) 16:23 Combining reactive( ) and bindEvent( ) 20:11 Reviewing our reactives 21:23 Writing a function to de-duplicate dice rolls
You’ve most likely used Shiny to build a web app that displays data, but you can also use Shiny to build games! In this video series, Jesse and Barret pair program simple games in Shiny as a way to uncover and explore new features.
Read up on tabset panels here: https://shiny.rstudio.com/reference/shiny/0.14/tabsetPanel.html
Learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Barret Schloerke (@schloerke) and Jesse Mostipak (@kierisi) Animation, motion design, and editing: Jesse Mostipak (@kierisi)
Theme song: Hakodate Line by Blue Dot Sessions (https://app.sessions.blue/browse/track/111291)

Veerle van Leemput | Analytic Health | Optimizing Shiny for enterprise-grade apps
Can you use Shiny in production? A: Yes, you definitely can.
Link to slides: https://github.com/RStudioEnterpriseMeetup/Presentations/blob/main/VeerlevanLeemput-OptimizingShiny-20220525.pdf
Packages mentioned:
⬢ shiny: https://shiny.rstudio.com/
⬢ pins: https://pins.rstudio.com/
⬢ plumber: https://www.rplumber.io/
⬢ blastula: https://github.com/rstudio/blastula
⬢ callR: https://github.com/r-lib/callr
⬢ shinyloadtest: https://rstudio.github.io/shinyloadtest/
⬢ shinycannon: https://github.com/rstudio/shinycannon
⬢ shinytest2: https://rstudio.github.io/shinytest2/
⬢ feather: https://github.com/wesm/feather
⬢ shinipsum: https://github.com/ThinkR-open/shinipsum
⬢ bs4Dash: https://rinterface.github.io/bs4Dash/index.html
Timestamps: 2:44 - Start of presentation 5:41 - What qualifies as an enterprise-grade app? 10:46 - UI first / user experience / prototyping 13:20 - Separating code into separate scripts and creating code that’s easy to test 17:15 - Golem 19:28 - Functionize your code 20:50 - Rhino package, framework for developing enterprise-grade apps at speed 22:33 - Infrastructure, how do you bring this to your users? (lots of ways to do this. They do this with R, pins, plumber, rmd, blastula, and Posit Connect on Azure) 31:17 - Optimizing Shiny (process configuration, cache, callR, API, feather) 47:35 - Testing your app (shinyloadtest and shinycannon) 50:23 - Testing for outcomes (shinytest2) 52:15 - Monitor app performance & usage (blastula, shinycannon, usage metrics with Shiny app)
Questions: 57:38 - What’s the benefit of using pins rather than pulling the data from your database? 59:30 - Are there package license considerations you had to think about when monetizing shiny applications? 1:00:45 - Do you use promises to scale the application? (they use CallR) 1:01:49 - For beginners, golem or rhino? 1:02:50 - The myth is that only Python can be used for production apps, what made you choose to use R? 1:05:12 - Is feather strictly better than using JSON? 1:06:38 - Where do you see the line between BI (business intelligence) and Shiny for your applications? 1:08:36 - Any tips for enterprise-grade UI development? Making beautiful apps (bs4Dash app) 1:10:25 - Have you found an upper limit for users? 1:12:19 - Any tips for more dynamic data? (optimizing database helps here) 1:13:50 - Where do you install shinycannon? (on our development Linux server) 1:15:00 - Can you share other resources or examples of code? (Slides here with resources: https://github.com/RStudioEnterpriseMeetup/Presentations/blob/main/VeerlevanLeemput-OptimizingShiny-20220525.pdf )
For upcoming events: rstd.io/community-events-calendar Info on Posit Connect: https://www.rstudio.com/products/connect/ To chat with Posit: rstd.io/chat-with-rstudio
{gt} Table Battles || Crosswords || RStudio
00:00 Introduction 00:34 Rich’s gt table, with a focus on creating audio within a table 07:28 Jesse’s gt table, with a focus on sentiment analysis
Learn more about the gt package here: https://gt.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Rich Iannone (@riannone) & Jesse Mostipak (@kierisi) Motion Design & editing: Jesse Mostipak Music: Nu Fornacis by Blue Dot Sessions https://app.sessions.blue/browse/track/98983

Data Science Hangout | Wayne Jones, Shell | Thinking Empathetically & Using Your Initiative
We were joined by Wayne Jones, Principal Data Scientist at Shell. Wayne has been using R since its inception and is passionate about open source solutions.
During the hangout with Wayne, he brought up the point of thinking empathetically when communicating. Here are a few other tips shared among the group:
1️⃣ If you spot someone who you think is really good at communicating and you’re impressed by them - buddy up. Tell them you like the way they did that, can you give me a few tips and hints?
2️⃣ With data science, you can focus on a very small area sometimes and get obsessed by the details. Step back, step back, and step back some more so that when you’re presenting a problem you give the bigger picture and narrow into the data science details.
3️⃣ If you think more empathetically, this will also go a long way. Instead of thinking about it from a data science standpoint, put yourself in the customer’s or client’s shoes. With more empathy, you make yourself a better consultant.
4️⃣ Try using your team meetings (scrum meetings for example) to practice presenting in front of people, even something for 5 minutes where you need to learn to be succinct about it.
5️⃣ Think about a child bringing something in for show and tell. When children get to bring something they really care about, they are thinking more about the toy that they brought in than about presenting.
6️⃣ There are lots of R meetup groups that are always looking for volunteers to speak at events! Lots of opportunities to practice there!
7️⃣ Learn about everyone else’s role so that you can talk their language - “BE RELEVANT” to your teams
Posit Package Manager || Link Your Package Manager Repo to Your RStudio IDE || Posit
In this video Jeremey Allen walks through connecting Posit Package Manager to Workbench for fast and secure access to the organization’s R and Python libraries.
Posit Package Manager is a repository management server to organize and centralize packages across your team, department, or entire organization. Get offline access to CRAN, PyPI, and Bioconductor, share local packages, restrict package access, find packages across repositories, and more. Experience reliable and consistent package management, optimized for data science.
Learn more about Posit Package Manager here: https://www.rstudio.com/products/package-manager/
Got questions? Check out the Posit Package Manager Frequently Asked Questions page: https://docs.rstudio.com/rspm/admin/getting-started/faq/
Katie Masiello || Build a Codenames app using {pins} and Shiny! || RStudio
00:00 Introduction 00:05 Project outline 03:56 Create a codename generator (using RMarkdown) 09:35 Publish to RStudio Connect 10:38 Create a Shiny app 18:15 A little bit of troubleshooting 18:18 Ta-da!
Learn more about the pins package here: https://pins.rstudio.com/ Learn more about Shiny here: https://shiny.rstudio.com/ And learn more about RStudio Connect here: https://www.rstudio.com/products/connect/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Katie Masiello (@katieontheridge) Animation, motion design, and editing: Jesse Mostipak (@kierisi)
Theme song: Contrarian by Blue Dot Sessions (https://app.sessions.blue/browse/track/64281 )
Posit Pharma Meetup: R for Clinical Study Reports & Submission | Yilong Zhang
Presenter: Yilong Zhang, PhD
During the meetup you can ask questions here: rstd.io/pharma-meetup-questions (anonymously or with your name)
Abstract: The use of open-source R is evolving in drug discovery, research, and development for study design, data analysis, visualization, and report generation in the pharmaceutical industry. It is critical to be able to produce tables, listings, and figures (TLFs) and submit results to regulatory agencies using R. We developed R packages (r2rtf, pkglite) and a reference book (https://r4csr.org/) to simplify the workflow for an organization to complete those tasks. Based on the proposed workflow, the first pilot project has been successfully submitted to the FDA by the R Consortium R Submissions Working Group (https://rconsortium.github.io/submissions-wg/).
Bio: Yilong Zhang is a biostatistician at Meta (and was previously at Merck). He has worked with a group of statisticians and programmers to demonstrate the capability of using R for regulatory work. His other research interests include statistical methods in study design, missing data, and survival analysis, with more than 25 papers published in peer-reviewed journals. Before joining Merck, he earned a Ph.D. in Biostatistics at New York University.
Helpful links: Use cases & insights from pharmaceutical industry leaders: https://www.rstudio.com/champion/life-science R/Pharma Conference: https://rinpharma.com/ R Validation Hub: https://www.pharmar.org/ Link to speak with RStudio: rstd.io/chat-with-rstudio Link to slides: rstd.io/pharma-meetup-slides
RStudio’s {pins} package: what it is, how it works, and what it can do for you! || RStudio
00:00 Introduction
00:09 What is the pins package?
01:49 pins - not just for RStudio Connect!
02:31 pins use cases
04:47 How to use pins instead of final_final_01_noreallyfinal.xls
06:37 How do pin boards work?
08:55 Getting started with pins
10:42 Versioning with pins at the board or pin level
11:47 pins and caching
12:13 Things you shouldn’t pin
14:00 Major functions in the pins package
17:21 Using pin_upload() and pin_download()
19:52 pins and Google Cloud
21:27 pins and modelops with the vetiver package
Learn more about the pins package here: https://pins.rstudio.com/
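As a rough illustration of the functions named in the chapter list, the sketch below pins an object to a temporary board, reads it back, lists its versions, and round-trips a file with pin_upload() and pin_download(). It is not code from the video: the pin names and the use of board_temp() are assumptions made for the example, and a real project would more likely use a folder, RStudio Connect, or cloud board.

```r
library(pins)

# A temporary, versioned board for demonstration only.
board <- board_temp(versioned = TRUE)

# Pin an R object. The pin name "mtcars_head" is made up for this sketch.
board |> pin_write(head(mtcars), name = "mtcars_head", type = "rds")

# Read it back; pin_read() returns the most recent version by default.
cars_back <- board |> pin_read("mtcars_head")

# Versions stored for this pin (one row per version).
versions <- board |> pin_versions("mtcars_head")

# pin_upload() and pin_download() work with files rather than R objects.
csv_path <- file.path(tempdir(), "mtcars_head.csv")
write.csv(head(mtcars), csv_path)
board |> pin_upload(csv_path, name = "mtcars_head_csv")
local_paths <- board |> pin_download("mtcars_head_csv")
```

pin_download() returns the path or paths of the cached files, so the result can be passed straight to a reader such as read.csv().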
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Katie Masiello (@katieontheridge) and Jesse Mostipak (@kierisi) Animation, motion design, and editing: Jesse Mostipak (@kierisi)
Theme song: Contrarian by Blue Dot Sessions (https://app.sessions.blue/browse/track/64281)
Data Science Hangout | Michael Chow, Posit | Exploring Team Structure w/ Data Scientists & Engineers
We were joined by Michael Chow, Data Scientist and Software Engineer at RStudio. Michael also previously led a team at the California Integrated Travel Project.
On this week’s hangout there were a lot of thoughts shared on structuring a data science team from both Michael and the broader group:
⬢ Jacqueline Nolis also shared thoughts on this at an earlier Data Science Hangout: there are virtues to the different models, but she ended up sold on the decentralized model, where data scientists are embedded in teams: https://youtu.be/CcPE29bYGVo?t=325
⬢ Michael agreed that data scientists and analysts should be sitting with the teams that they’re pushing out reports for. Otherwise, you end up trying to send people into those teams to figure out their priorities.
⬢ A data scientist should work with a Project Manager or whoever’s leading the team to push up metrics but also help change the roadmap.
⬢ It leaves a tricky question of where data engineers should be and how they should interact with the team. Today data engineers are often doing more tooling empowerment, so it can be okay to have them a bit more centralized and connect to the data scientists to enforce best practices or enable new pieces for them.
⬢ I think a nice model is for data scientists/analysts to live in the teams and data engineers to be like spokes of a wheel, where the data scientists connect with them and work closely to enforce best practices and enable important new things.
⬢ Tatsu shared that in thinking of the structure, it’s also important to find your translators and to use the power of feedback. Reach out to those people to start to put that feedback into action.
⬢ George shared that insurance companies have come from a really traditional landscape where they have lots of actuaries working on lots of Excel spreadsheets, and there can be a lack of knowledge sharing and tool sharing. This is where the data science element comes in. To me, within the organization, you need to have this team as a mini-spoke, if you will, because they are central to the actuarial team. If they are too far removed and they’re back with the IT team, you end up with the old problems because the business concept may not get communicated back. It’s all about getting enough skills so they can get stuff done, especially proofs of concept. Maybe after that you can take a step back and then start to look at the centralized model again.
⬢ A central team can help converge to what they see as best practice, but if you’re pushing out something new, exploring a new line of work or area it can be important to set the data engineer there to actually do whatever they need to. Make sure that the converging doesn’t stifle creativity or prevent a team from doing the right thing.
⬢ Manny jumped in to share the perspective of data science sitting with IT as well. Data science is a new field for their company (in real estate), and there’s a question of identity: where does data science fall? The IT team is fantastic and they’re very structured. Data science is so fluid, creative, and unstructured at the moment, so you have to look at where it actually should fall.
Please note that some of the points above are summarized and are not exact quotes.
Resources shared:
⬢ Tatsu shared in the chat a few projects that Michael is working on: vetiver: https://vetiver.tidymodels.org/articles/vetiver.html, siuba: https://github.com/machow/siuba
⬢ Libby shared a helpful tip on creating a 2-minute YouTube video to go with a cover letter, to get the attention of a hiring manager
⬢ Javier shared an example Shiny app used in an interview: https://javierorraca.shinyapps.io/Bloomreach_Shiny_App/
⬢ Michael mentioned David Robinson’s screencasts: https://www.youtube.com/channel/UCeiiqmVK07qhY-wvg3IZiZQ
⬢ Michael mentioned an article on “What data scientists really do according to 35 data scientists”: https://hbr.org/2018/08/what-data-scientists-really-do-according-to-35-data-scientists
⬢ Rachael shared a blog post link where Jacqueline Nolis talked about team structure as well: https://www.rstudio.com/blog/building-effective-data-science-team-answering-your-questions/#Structure
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu ► Add the Data Science Hangout to your calendar: rstd.io/datasciencehangout ► View the Data Science Hangout site here: rstudio.com/data-science-hangout
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstudio-pbc Twitter: https://twitter.com/rstudio

Programming Games with Shiny || Guess the Number || RStudio
00:00 Introduction
00:35 Setting up our app UI
06:19 Using observeEvent()
10:02 Writing an if else statement as part of our feedback mechanism
13:40 Testing our app and deciding which bugs to fix first
14:20 Check yourself before you req() yourself
17:10 Using tabset panels to control what the user sees
You’ve most likely used Shiny to build a web app that displays data, but you can also use Shiny to build games! In this video series, Jesse and Barret pair program simple games in Shiny as a way to uncover and explore new features.
Read up on tabset panels here: https://shiny.rstudio.com/reference/shiny/1.5.0/tabsetPanel.html
Learn more about Shiny here: https://shiny.rstudio.com/
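For readers who want a feel for how the pieces in the chapter list fit together, here is a minimal sketch of a guess-the-number app. It is not the code written in the video: the helper guess_feedback(), the input and tab IDs, and the 1 to 100 range are all assumptions made for illustration. It uses observeEvent() to react to a button, req() to ignore empty guesses, an if/else chain for feedback, and a hidden tabset panel to switch what the user sees.

```r
library(shiny)

# Hypothetical helper: compare a guess with the target and return feedback text.
guess_feedback <- function(guess, target) {
  if (guess < target) {
    "Too low, try again."
  } else if (guess > target) {
    "Too high, try again."
  } else {
    "You got it!"
  }
}

ui <- fluidPage(
  # type = "hidden" removes the tab headers, so only the server switches panels.
  tabsetPanel(
    id = "tabs",
    type = "hidden",
    tabPanelBody(
      "play",
      numericInput("guess", "Guess a number from 1 to 100", value = NA, min = 1, max = 100),
      actionButton("submit", "Submit guess"),
      textOutput("feedback")
    ),
    tabPanelBody("won", h3("You guessed the number!"))
  )
)

server <- function(input, output, session) {
  target <- sample(1:100, 1)   # one secret number per session
  feedback <- reactiveVal("")  # latest feedback message

  output$feedback <- renderText(feedback())

  observeEvent(input$submit, {
    req(input$guess)  # stop quietly if the input is empty (NA)
    feedback(guess_feedback(input$guess, target))
    if (input$guess == target) {
      updateTabsetPanel(session, "tabs", selected = "won")
    }
  })
}

# Build the app object; call runApp(app) to launch it.
app <- shinyApp(ui, server)
```

Keeping the comparison in a plain function makes the feedback logic easy to check outside of a running app.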
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Barret Schloerke (@schloerke) and Jesse Mostipak (@kierisi) Animation, motion design, and editing: Jesse Mostipak (@kierisi)
Theme song: Hakodate Line by Blue Dot Sessions (https://app.sessions.blue/browse/track/91291)

Kelly O’Briant | Building a business case for data science & advocating for analytic infrastructure
3:10 - Start of Kelly’s presentation
5:05 - Identifying where you are in your organizational culture when asking for change
9:30 - What is analytic infrastructure: the how, where, and with what that goes into your daily data science work
10:55 - Production looks different at different organizations
12:20 - Don’t get caught up in the hype - there is no perfect deployment pipeline
14:00 - Tactical metrics for communicating devops - code deployment lead time
15:55 - Your analytic infrastructure is what enables you or your teams to deliver value by decreasing that code deployment lead time
20:49 - What is an analytic administrator / R Admin? (A data scientist who onboards new tools, deploys solutions, and supports existing standards. They work closely with IT to maintain and scale analytic environments. They influence others in the organization to be more effective, and are passionate about making data science a legitimate analytic standard)
24:04 - Building a business case (introducing rstudio.com/champion)
Join us for future champion chats the 4th Monday of every month (not the date I said in the recording) rstd.io/champion-chats
Let’s admit it. Getting to use the tools you want sometimes needs a little convincing.
We’ve had the opportunity to meet so many wonderful people from the community. People who organize meetups in their own communities, spend time after hours teaching their co-workers, solve business problems at work with Shiny, write blog posts to help others, and so much more.
People are doing amazing things with data science, yet so many are still in a position where they are not able to make the case for it at their own organizations.
Hear from Kelly O’Briant on how you can start to navigate the internal process and advocate for great analytic infrastructure as a data scientist.
Whether you’re getting pushback about using open-source, being told to use a BI tool instead, or just unable to find the other data scientists at your company - we want to help make this process less frustrating. Come hang out and chat with us!
*Please note, only the presentation portion of this meetup was recorded, the open discussion and Q&A was not.
Resources shared this week:
⬢ rstudio.com/champion
⬢ John shared a typology of organisational cultures: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1765804/pdf/v013p0ii22.pdf
⬢ Kelly shared The Phoenix Project book: https://www.amazon.com/Phoenix-Project-DevOps-Helping-Business-ebook/dp/B078Y98RG8 and The Unicorn Project book: https://www.amazon.com/gp/product/B07QT9QR41
⬢ Kelly shared Benjy’s talk on conveying data science findings to non-experts: https://www.youtube.com/watch?v=UUIda4-CSXQ
⬢ Nick shared: The pkgdown package can be used to build informational web pages about R packages, which can themselves be deployed to RStudio Connect: https://pkgdown.r-lib.org/
⬢ Kelly shared Nathan Stephens’ RViews blog post on Analytics Administration for R: https://rviews.rstudio.com/2017/06/21/analytics-administration-for-r/
{gt} Table Battles || Digital Publications || RStudio
00:00 Introduction
00:32 Jesse’s gt table, with a focus on changing background cell color
07:11 Rich’s gt table, which uses three different tables to create a fixed-size scrollable gt table
You can find the code for each table here: https://github.com/kierisi/rstudio_videos/tree/main/gt/table-battles/01_round-01_digital-publications
Learn more about the gt package here: https://gt.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Rich Iannone (@riannone) & Jesse Mostipak (@kierisi) Motion Design & editing: Jesse Mostipak Music: Nu Fornacis by Blue Dot Sessions https://app.sessions.blue/browse/track/98983

Data Science Hangout | Daren Eiri, Arrowhead General Insurance | Building a DS Playbook
We were joined by Daren Eiri, Director of Data Science at Arrowhead General Insurance. Daren is passionate about building creative solutions within the insurance industry using data, statistical modeling, and cloud technology.
What do you do to ensure your ideas or solutions are well received by internal customers?
Daren shared that they have a data science playbook for building out models. This generally follows the path of: initial kickoff, understanding what their business problem is, and if we do build out a solution - what is it going to look like?
Frequently meet with the business stakeholders.
Whenever you’re making major decisions or have questions about something, make sure that you reach out to them about that.
Make sure that it’s part of their business workflow. If they’re adding this new feature, how is the business going to use it and continue using it so it doesn’t get left in the dark.
In terms of making sure that they’re onboard, we talk about their data and what it looks like.
This is a service to the team because they may not have looked at the data at that large a scale. They may be more used to looking at it quarter by quarter or year over year. We look at the data and provide the perspective that we’re seeing.
We always ask for their hypothesis and what they think will lead to what we’re trying to predict. This helps us understand what their problem is and include their own ideas into the project as well. That results in some ownership on their side too because they are partnered with us.
When we have the model built out, we evaluate it and see the accuracy. We provide them with several examples and the variation of that data to show the limitations of what that final product will look like.
When you spend the time walking through the process and they see it working, it helps get them on board.
Resources shared: ⬢ Javier shared this article on packages that help when working with business stakeholders who are more comfortable with Excel: https://lnkd.in/eqXTP82c ⬢ Daren shared The Making of a Manager by Julie Zhuo: https://lnkd.in/epjdHGgc
Data Science Hangout | Tegan Bunsu Ashby, Brooklyn Nets | Showing the Difference You’re Making
We were joined by Tegan Bunsu Ashby, Senior Software Developer, Brooklyn Nets. Tegan works at the intersection of data and design, building applications and visualizations to empower front office decision making and help win basketball games.
Towards the end of the hangout, there was a great question from Trevor: How do you show that you are making a difference?
One of the joys and terrors of being in a data intensive role for a team is that you can’t point to oh, we did X or we recommended Y and now we have an NBA championship.
It has to be - does analytics and investment in data at a league level indicate that you’re going to have sustained success? The investment in analytics departments and the success of teams that have been able to do so is indicative of how teams are evaluating the value of that data.
It’s really nice to say, but we also need to hold ourselves accountable. When you’re using all this time and resources - you have to sit back and think, does this contribute to winning? Does it help us make better decisions?
At certain pressure points, did we create an environment where it was easier to negotiate a trade or pull the trigger on drafting a certain player over another? Were we able to support that in a way that was actionable, returnable and implemented? You have to communicate that.
Resources shared:
⬢ Tegan mentioned The MVP Machine book: https://lnkd.in/e5FAkxsm ⬢ Women in Sports Data Symposium + Data Hackathon: https://lnkd.in/eG8M54gH
Data Science Hangout | Jennifer Listman, Statespace | Culture that Helps Avoid Burnout
We were joined by Jennifer Listman, Director of Research at Statespace. Jenny likes mining multidimensional datasets to find valuable unknown unknowns.
One of the questions that had a lot of people sharing in the chat as well was:
“How do you ensure a work-life balance to avoid burnout?”
A few thoughts:
The culture of the team matters. Make it okay to take personal time in the middle of the week to go to doctors, take children somewhere, etc. If it’s a family member’s birthday, let people do what they need to help them celebrate that person. Having that freedom is important.
As a manager, take time off yourself to set a good example.
Managers have to be really careful and stay on top of this with their employees. Understand who has to take responsibility for certain things and whether they are at risk of experiencing burnout.
A good manager recognizes when the team needs a break. Know when they’ve had a crazy week and let them allocate mental health days accordingly.
Especially for companies that have nebulous working hours and lack of specific vacation time, ensure that people are taking vacation.
HR may keep tabs on whether people are taking enough time off, and may make people take time off. Try quarterly check-ins to make sure that people are on track to take enough time off.
Some people choose not to have any work apps on their phone at all, or delete those apps from their phone on the weekend.
Links shared about burnout ⬢ https://lnkd.in/g5GssAck ⬢ https://lnkd.in/g5DUVj8r
Speaking of culture, Statespace is hiring! ⬢ Statespace Data Scientist (Remote) - Data & Analytics team: https://lnkd.in/ddxpp3Ep ⬢ Statespace Manager of Marketing Analytics and Data Science (Remote) - Data & Analytics team: https://lnkd.in/da-9J9aw ⬢ Statespace Biostatistician / Data Analyst, Digital Health (Remote) - Digital Health team: https://lnkd.in/d2ni-7vu
Other resources shared:
⬢ Jenny shared a paper Statespace recently published (all of my analyses were done in R!): https://lnkd.in/dWFE-Cqx
⬢ Rachael shared Resume Review Club form: https://lnkd.in/gjtQDS9D
⬢ Antti shared a thread on R-related podcasts: https://lnkd.in/gTNtZhdg
⬢ Book on importance of sleep: https://lnkd.in/gUtWb8YF
⬢ Bala shared a link on keeping your mind young: https://lnkd.in/g9XiGmmb
⬢ Antti shared this post on LinkedInHardMode (content challenge): https://lnkd.in/giKWzeYh
⬢ Upcoming Shiny conference (April 27-29): https://shinyconf.com/
⬢ Bala shared podcasts he enjoys: Datacast (https://lnkd.in/gJtSURsh), AI Podcast (https://lnkd.in/gVddYDT2) and Data Science Imposters (https://lnkd.in/gMEUMUWN)
Data Science Hangout | Joseph Korszun, ProCogia | Encouraging People to Learn to Code
We were recently joined by Joseph Korszun, Manager of Data Science at ProCogia.
There was a great discussion on mentorship and helping teach people to code.
🤔 How do you encourage people to learn to code? (15:27)
⬢ At some point in your career, you’ve written a formula in Excel, so building on that and translating it into more usable scripts is helpful.
⬢ It starts with identifying what their determined career path is from their own perspective. Some people want to be more statistics oriented, or more ML engineering oriented. You have to understand what their focal point is.
⬢ Help identify someone’s pain point that they have right now and with that - what’s the quickest way to up-skill them in that?
⬢ Have people do some hands-on labs. This can be more beneficial than watching videos or reading textbooks and allows you to get into the code. You’re going to have errors along the way but that’s ok!
⬢ Let new coders know that it’s okay to google and copy/paste code. This can unblock new coders, so that you don’t have to code everything from scratch. Adapt existing code to your problem.
⬢ You can pair someone with somebody else who has that skillset to help mentor them.
⬢ Remember that junior colleagues often have a lot to teach to senior team members as well. Things are changing so quickly, and there are so many ways to do things. Interns may teach you something new technically and senior colleagues may then mentor them in the soft-skills.
Resources shared: ⬢ Working with IT: https://lnkd.in/dzU-uzzW ⬢ Security courses here give practical tips for how to follow good security practices as an R programmer: https://lnkd.in/dmb2nmGV ⬢ Manager tools podcast: https://lnkd.in/dXY6hK7V ⬢ Breaking Math podcast: https://lnkd.in/dEJNu64F
Alan Carlson | Robust, modular dashboards that minimize tech debt | RStudio
Robust, modular dashboards that minimize tech debt Presented by Alan Carlson, Snap Finance
Abstract Dashboards can be complex but building them shouldn’t be! We’ve built a wrapper for developing production level dashboards that streamlines onboarding new developers and standardizes the initial infrastructure to mitigate tech debt. Now you and your team can spend more time developing insights and less time trying to spin up shiny code with {graveler}.
Speaker Bio As the Tech Lead for the BI (Business Intelligence) team, Alan’s primary focus at Snap is researching, creating, and maintaining methods that help the rest of Snap’s BI Team in their work. From dashboards to visualizations to R code in general, he has built multiple packages and bookdowns that make BI easier to train and to use within the RStudio environment.
Helpful Links: Blog Post: https://www.rstudio.com/blog/make-robust-modular-dashboards-with-golem-and-graveler/ Graveler package: https://github.com/ghcarlalan/graveler Environment variables: https://docs.rstudio.com/connect/user/content-settings/#content-vars Git-backed publishing: https://docs.rstudio.com/connect/user/git-backed/ If you’d like to join events live: colorado.rstudio.com/rsc/community-events
Question about style guides: Tidyverse Style Guide: https://style.tidyverse.org/ Efficient R Programming book that Colin Gillespie wrote: https://csgillespie.github.io/efficientR/
Questions about RStudio Team: ⬢ RStudio Connect: https://www.rstudio.com/products/connect/ ⬢ Chat with RStudio about RStudio Team: rstd.io/chat-with-rstudio
Data Science Hangout | Mike Smith, Pfizer | Building an R Center of Excellence
We were joined by Mike Smith, Senior Director, Pfizer R&D UK Ltd at the Data Science Hangout - a weekly, free-to-join open conversation for the data science community. If you’d like to join us live, you can add it to your calendar here: rstd.io/datasciencehangout
Mike shared with us all that they are building up a Center of Excellence at Pfizer to help teams across the business build reproducible workflows and use analytics tools effectively & efficiently.
What led to the creation of the CoE within Pfizer and how could we do something similar?
Mike: ⬢ Last year before R/Pharma, we did a poll & found that 1,500+ colleagues had downloaded R. I wanted to service & build up that community to find out what other people are doing and share that. (2:45)
⬢ We’re a very decentralized disparate team, so there are subject matter experts (SMEs) throughout the organization. The Center of Excellence is focused on building connections between SMEs and helping the teams where there isn’t an SME available.
⬢ What we saw was that it’s sometimes hard to get an effective strategy across people in such a big company. We also saw that there were other places within the organization that wanted data science work but didn’t have an R subject matter expert there. We want to be able to help them solve their problems and set them up with a proof of concept that they can tweak.
33:52 - OK, so how do you do this?
⬢ Find out how many people are using the tools and who you could help.
⬢ Be the translator between the business people who need solutions and the technical side - the folks who are building things.
Communicate the value:
⬢ We may have a bunch of people trying to write the same function or access the same data. We could solve this problem once and then make that into a package and serve that out to everybody and streamline their workflow for the future.
⬢ There’s a benefit in being able to solve problems strategically. We’re trying to build the lego pieces so that the next time we see a problem like this, we can use that. We can also offer this as a package or via something that allows other people to solve that problem for themselves.
Talk to someone who has experience in this, such as other community builders:
⬢ Doug Robinson helped start this at Pfizer because he had set up something like this at Novartis before. Talking with someone who has done this before is really helpful because they know who you need to tell, what you need to tell them, what your purpose for being is, and who you have to speak to and convince. That has to be ready to go.
Find a champion in leadership:
⬢ We went to the head of Statistical programming and said we’d like to do something like this. Fortunately, she was 110% supportive here.
How did they phrase this CoE at Pfizer?
⬢ Check out this description from the job post: https://lnkd.in/g776nYVF
Resources shared:
Ethan shared: I saw on the RStudio blog the other day the {sassy} system for SAS programmers transitioning to R: https://sassy.r-sassy.org/index.html
Tatsu shared: For folks that have RStudio Connect and Tableau, there’s now a supported integration: https://www.rstudio.com/blog/dynamic-r-and-python-models-in-tableau-using-plumbertableau/
Tatsu shared the Working with IT section of the champion site: https://www.rstudio.com/champion/working-with-it
Mike’s Bandcamp: https://mikeksmith.bandcamp.com/
R Consortium Pharma Working Groups: https://www.r-consortium.org/projects/isc-working-groups
R in Pharma Conference: https://rinpharma.com/
Upcoming Pharma meetup with Merck: https://youtu.be/RBVqKi3FV30
Question about style guides:
Jesus shared: Tidyverse Style Guide: https://style.tidyverse.org/
Jesus shared: One overall guide to writing cleaner R code is the CONTRIBUTING.md of the ggplot2 package: https://github.com/tidyverse/ggplot2/blob/main/CONTRIBUTING.md
Sam shared: Efficient R Programming book that Colin wrote: https://csgillespie.github.io/efficientR/
Data Science Hangout | Erin Pierson, Charles Schwab | Advocating for the Role you Want
We were joined by Erin Pierson, Sr Manager of Trading Operations at Charles Schwab.
There was a great conversation about networking within your organization on the hangout with Erin and also tips for the question: How do you have the conversation with your manager that…this isn’t the job that I was hired to do? 🤔
11:57 -
⬢ Talk to people and let them know what you’re capable of doing and what you’re interested in doing. Hopefully you have someone who’s in your corner that will help you get to where you want to be.
⬢ Start talking about what your skills are and what you had discussed your position being. “You have technically hired me for this job and I’m not doing that job.”
⬢ You have to have a little finesse about it and can’t be super negative but you can say, “You know, I’m really interested in doing X, Y, Z. I appreciate this job. I’m doing this and I’m learning this but I could also be doing more. I’m interested in doing more.”
⬢ Try phrasing it as what can I do for your leadership, even if it’s really more for you.
⬢ Make your manager aware of where you are and what you want to do. You need to have a good manager to have this happen but in some cases they may even be thinking about hiring someone for that role you actually want to do.
⬢ The bottom line is, if you’re not happy in your job you’re not going to stay in your job. They don’t want you to be unhappy because they don’t want to lose you. If you’re not happy, you’re not going to be doing good work. Be honest and let them know how you’re feeling about it.
Data Science Hangout | Joe Gibson, de Beaumont Foundation | Collaboration Across a Team
We were joined by Joe Gibson, Senior Project Director at de Beaumont Foundation.
At the hangout, there was a great conversation on building a code archive database and using snippets.
38:16 - Sharing a bit of the conversation below:
Stephanie asked: How do you build a code archive database? I’ve worked places with Word documents and need something more user-friendly.
Joe: Each time that we create a work product, we log it in our database - it creates a new ID for that project, and we add some basic information and then create a folder for it. We then store it in the structure we have on our network to make it easier for people to find things. In addition to having the library of all our code, we have some folders that have handy code snippets.
Steve: We’re developing an internal package for getting data and doing common tasks in a more standard way, which is nice. Nothing too fancy, but it streamlines things.
Ethan: If you’re using GitHub, Gist is also a great way to share snippets of code. I have a snippet that sets up the header/documentation structure for a script. One of the first bits is library(tidyverse)
Mike asked: Has anyone developed R snippets and distributed them across a team? I don’t know if people are familiar with the snippets within RStudio, but they are cool because you can use template frameworks and it jumps you to the next thing you need to tailor for your own situation. It’s essentially a function.
Tatsu shared: https://lnkd.in/gwWCB3T2
Javier: These are super helpful, Mike. I recently learned about them and was shocked. I have all kinds of snippets for myself and my team now.
Jordan: I’ve saved snippets inside a core package and have a function that updates your RStudio snippets. Saved a snippet update gist here: https://lnkd.in/gEcmNViN It requires a snippets folder in your package’s inst directory. You can have this run in .onLoad() if you’re feeling lucky
Resources shared:
◘ Rachael shared the new champion site: https://lnkd.in/gaHt_8Br
◘ Jen & Michelle shared the National Syndromic Surveillance Program Community of Practice: https://lnkd.in/gHE3C94s
◘ Joe shared Harold’s GitHub for NSSP projects: https://lnkd.in/gFRsezTM
◘ Joe mentioned the Mozilla Foundation, which works to ensure the internet remains a public resource that is open and accessible to us all: https://lnkd.in/gbvdQXwH
◘ Rachael shared AI Inclusive: https://lnkd.in/gt8cQUUX
◘ Cris shared Fairlearn, to improve fairness of AI systems: https://fairlearn.org/
◘ Angela shared Openscapes, helping teams develop collaborative practices that are more reproducible, transparent, inclusive, and kind: https://lnkd.in/gs-6_-ZA
Rich Iannone || {gt} Intendo Game Data Project Walkthrough || RStudio
00:00 Introduction
00:11 Setting up our environment
01:21 Importing data
01:56 Data preparation using the tidyverse
14:12 Basic gt table
16:25 Specifying row order with row_group_order()
17:20 Formatting currency with fmt_currency()
18:10 Formatting missing values with fmt_missing()
18:55 Creating row groups with tab_options()
19:50 Relabel column names with cols_label()
20:41 Creating tab spanners with tab_spanner()
23:00 Creating a table title and subtitle with tab_header()
24:40 Aligning table title and subtitle with opt_align_table_header()
25:16 Creating a stubhead label with tab_stubhead()
26:00 Format all table cell text using tab_style()
27:25 Automatically format data color based on value using data_color()
30:45 Creating Markdown-friendly source notes using tab_source_note()
32:45 Creating Markdown-friendly footnotes using tab_footnote()
39:28 Adjust table column width using cols_width()
40:55 Adjust cell padding using opt_horizontal_padding() and opt_vertical_padding()
42:22 Change row group headers using tab_style()
43:40 Convert all table text to small caps using opt_all_caps()
43:58 Change all table text font using opt_table_font()
44:28 Changing table, table heading, footnotes, and source notes background color using tab_options()
46:41 Add a table “cap” at the top and bottom using the table.border.top.width and table.border.bottom.width options of tab_options()
47:23 Use multiline formatting with footnotes using the footnotes.multiline option
47:34 Change line style using the table_body.hlines.style option
47:55 Change table title and subtitle font sizes using the heading.title.font.size and heading.subtitle.font.size options
48:11 Checking out our final table!
Code to recreate the table from the video: https://github.com/kierisi/rstudio_videos/blob/main/gt/rich-intendo-project-walkthrough/intendo-30032022.R
Learn more about the gt package here: https://gt.rstudio.com/
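As a rough companion to the chapter list above, here is a minimal sketch that chains a few of the gt functions named in the video. The data frame and its column names are invented for illustration and are not the Intendo data from the walkthrough. It also uses sub_missing(), which newer gt releases provide in place of fmt_missing().

```r
library(gt)

# Made-up data standing in for the game revenue data used in the video.
sales <- data.frame(
  region  = c("East", "East", "West", "West"),
  product = c("Gold pack", "Gem pack", "Gold pack", "Gem pack"),
  revenue = c(1250.5, 830, NA, 412.25)
)

tbl <- gt(sales, rowname_col = "product", groupname_col = "region") |>
  # Title and subtitle; md() lets us use Markdown in the text
  tab_header(
    title = md("**Revenue by product**"),
    subtitle = "Illustrative numbers only"
  ) |>
  tab_stubhead(label = "Product") |>
  cols_label(revenue = "Revenue") |>
  fmt_currency(columns = revenue, currency = "USD") |>
  # Replace NA cells with readable text
  sub_missing(columns = revenue, missing_text = "n/a") |>
  # Show the West group before the East group
  row_group_order(groups = c("West", "East")) |>
  tab_source_note(source_note = md("Source: *invented for this example*")) |>
  opt_all_caps()

tbl
```

Printing tbl in the RStudio IDE renders the table in the Viewer pane.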
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Rich Iannone (@riannone) Motion design and editing: Jesse Mostipak (@kierisi) Music: Nu Fornacis by Blue Dot Sessions https://app.sessions.blue/browse/track/98983

Santiago Rodriguez | Intro to functional data analysis | RStudio Meetup
Energy Meetup - Intro to functional data analysis Presented by Santiago Rodriguez
2:49 - Start of talk 12:15 - First Q&A point 23:23 - Second Q&A point 31:56 - Third Q&A point 40:44 - Final Q&A point
Abstract: The focus of this talk is to introduce functional data analysis (FDA) and to showcase some of its applications in the utility space. A primary source of data in a utility are meter reads. This data is periodic and appears discrete, but energy consumption is continuous, which makes meter reads a perfect use case to apply FDA. The talk will highlight two applications: load profiles and segmentation. The talk will be non-technical - no math or code - because the goal of the talk is to persuade you to investigate FDA on your own.
Speaker Bio: Santiago is a data scientist in the marketing department at Consumers Energy, a Michigan-based, public utility. He focuses on data engineering, data science, and MLOps. Santiago has about a decade of experience working in the analytics space across energy, aviation, automotive, and contact centers. He has a bachelor’s in finance from Florida State University and a master’s in statistics from Texas A&M.
Resources shared in the chat: FDA descriptive statistics blog post: https://lnkd.in/gZjauTdt Intro to FDA Blog Post: https://lnkd.in/gM5NCbHe Refund package is introduced in ‘Introduction to Functional Data’ by Kokoszka and Reimherr: https://lnkd.in/gd5Bt3tE Gaussian Process Regression Analysis for Functional Data by Shi & Choi: https://lnkd.in/gAig-AZa GAMs in R, free course - Noam Ross: https://lnkd.in/gput2QbK Jiguo Cao - Intro to FDA on YouTube: https://lnkd.in/ga4qfGDr Feedback: rstd.io/meetup-feedback Talk submission: https://lnkd.in/gJ7EUSCk If you’d like to find out about upcoming events you can also add this calendar: rstd.io/community-events
Packages shared in the chat: Task view https://lnkd.in/gBqtuV2b FDA: https://lnkd.in/ge85i6UE Refund package: https://lnkd.in/gXGcT79t FDA.USC: https://lnkd.in/ggHdUaEe
Brad Lindblad | Professional Financial Reports with {rmarkdown} | Posit
GitHub: https://github.com/bradlindblad/pro_reports_talk
Abstract: With finance there will always be a need for reports, and as long as there’s a need for reports, there will be R users who want to create them as lazily as possible.
R Markdown lets us create incredibly customized and branded reports that can run automatically each month or day or whatever, and it all starts with the wonderful parameterizing features of R Markdown.
In this lightning talk, we will work through a practical example of creating an income statement for a group of theoretical office branches. You will learn how to make a parameterized R Markdown report, organize your R Markdown files and even create a custom cover letter, all in R.
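To make the idea of a parameterized report concrete, here is a minimal sketch and not the code from the talk. The branch names, file names, and report text are hypothetical. It writes a tiny .Rmd with a params field in its YAML header and renders it once per branch with rmarkdown::render().

```r
library(rmarkdown)

# A tiny parameterized report, written to a temporary file for illustration.
report <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Income statement\"",
  "output: html_document",
  "params:",
  "  branch: \"Fargo\"",
  "---",
  "",
  "This statement covers the **`r params$branch`** branch."
), report)

branches <- c("Fargo", "Bismarck")  # hypothetical branch names
out_dir <- tempdir()

# Rendering needs pandoc, which ships with the RStudio IDE.
if (pandoc_available()) {
  for (b in branches) {
    render(
      report,
      params = list(branch = b),  # overrides the default in the YAML header
      output_file = paste0("income-", tolower(b), ".html"),
      output_dir = out_dir,
      quiet = TRUE
    )
  }
}
```

Each pass through the loop produces one HTML file per branch, which is the piece a scheduler can then run every month or day.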
Bio: Brad Lindblad is a data scientist located in Fargo, North Dakota. He is author of the tidyUSDA and schrute R packages, and specializes in geospatial data science and risk modeling. Brad is a frequent contributor to data science publications and loves creating new R users.
This is a meetup recording from December 2020. For more information on how to join meetups live: rstd.io/community-events
Links shared in the chat: Brad’s material/slides: https://github.com/bradlindblad/pro_reports_talk For anyone who’s new to R Markdown, this is a great reference guide and overview: https://bookdown.org/yihui/rmarkdown/ Pagedown package: https://github.com/rstudio/pagedown ETL example: https://solutions.rstudio.com/r/apps/twitter-etl/ More information on RStudio Connect: https://www.rstudio.com/products/connect/ To chat with RStudio about Connect: rstd.io/chat-with-rstudio
Programming Games with Shiny || Dragon Realm || RStudio
00:00 Introduction 00:05 Fun dragon facts 00:35 Describing the Dragon Realm game 01:20 Outlining our approach 04:38 Coding the basics of our app 10:15 Programming our action buttons 14:10 A note on coding objects “outside” of Shiny 15:27 Programming cave choice logic 20:29 Connecting action buttons to our consequences function 29:40 Creating separate pages using tabsetPanel() 39:35 Conclusion
You’ve most likely used Shiny to build a web app that displays data, but you can also use Shiny to build games! In this video series, Jesse and Barret pair program simple games in Shiny as a way to uncover and explore new features.
And because we know you’ll ask, Jesse is using the Woodland theme from the base16 palette. You can get it - and other themes - from the {rsthemes} package: https://github.com/gadenbuie/rsthemes
Read up on tabset panels here: https://shiny.rstudio.com/reference/shiny/1.5.0/tabsetPanel.html
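The pattern described in the chapters (action buttons, a consequences function defined outside of Shiny, and separate pages via tabsetPanel()) could look roughly like the sketch below. This is not the code from the video: the input IDs, the consequence() helper, and the outcome text are all made up for illustration.

```r
library(shiny)

# Plain R function defined outside the app: decides what happens in a cave.
consequence <- function(chosen, friendly) {
  if (chosen == friendly) "The dragon shares its treasure!" else "You were eaten."
}

ui <- fluidPage(
  tabsetPanel(
    id = "pages",
    tabPanel("Choose a cave", value = "choose",
             actionButton("cave1", "Cave 1"),
             actionButton("cave2", "Cave 2")),
    tabPanel("Outcome", value = "outcome", textOutput("result"))
  )
)

server <- function(input, output, session) {
  outcome <- reactiveVal("")

  # Shared handler: pick the friendly cave at random, store the result,
  # then switch to the outcome page.
  choose <- function(cave) {
    friendly <- sample(1:2, 1)
    outcome(consequence(cave, friendly))
    updateTabsetPanel(session, "pages", selected = "outcome")
  }

  observeEvent(input$cave1, choose(1))
  observeEvent(input$cave2, choose(2))
  output$result <- renderText(outcome())
}

app <- shinyApp(ui, server)
# Launch with: runApp(app)
```

Keeping consequence() as an ordinary function makes the game logic easy to test without starting the app.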
Learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Barret Schloerke (@schloerke) and Jesse Mostipak (@kierisi) Animation, motion design, and editing: Jesse Mostipak (@kierisi)
Intro music: RGift by Blue Dot Sessions (https://app.sessions.blue/browse/track/91282 ) Theme song: Hakodate Line by Blue Dot Sessions (https://app.sessions.blue/browse/track/91291 )

Data Science Hangout | Kristi Angel, Stitch Fix | How to Break into Data Science
On March 10th, we were joined by Kristi Angel, Data Scientist - Experimentation @ Stitch Fix.
This week featured a lively discussion on breaking into data science:
Try to get around the applicant tracking systems by networking - we can call it something different though. Utilize LinkedIn as a professional social media. Find people who are generating content that you find interesting. Engage and people will start to know who you are by your comments and feedback. This will help you build a reputation for yourself.
Try to connect directly with the hiring manager. There are a lot of hiring managers advertising for roles on LinkedIn right now.
Take a look at your resume and see if you can get some reviews from peers you trust and see if there’s a way that might be more effective. Your resume needs to tell your story in less than a 5 minute glance. A lot of us mentioned we would be willing to help review here - let’s think of ways to connect people through this group too :)
For some people hitting HR roadblocks, where large businesses have decided that there are minimum requirements for education and experience and there is absolutely no budging from it - looking at smaller companies without those rules has helped lots of people I know.
On the topic of interviewing and hiring people with degrees vs no degrees or not a relevant degree
A lot of us shared in the chat that we have different degrees than the job we’re doing today :)
Looking past the degree is a really good idea in many cases because you need diversity of thought and background. If you want the company as a whole to have empathy for customers, it’s important to have a really varied set of life experience backgrounds.
We still acknowledge, it can be hard because the degree and experience gives a little bit of a certification, especially for junior people. In startups especially, the time spent hiring can be a lot of drain on the organization so degrees are often used to quickly filter.
If you’re on the side of the company interviewing, something that can also help is to make sure that all interview loops have a diverse representation.
Links shared:
Frank shared: Kahneman’s book Noise has some great frameworks for interviewing and how we make decisions on candidates. One example: stack rank your candidates instead of assessing each individually, because people are typically bad at that: https://readnoise.com/
Melody’s team is hiring several Data Scientist positions and a Data Architect position. Here are a few links: https://lnkd.in/gf9nqqWV https://lnkd.in/gKwEcqC2 Data Architect position: https://lnkd.in/g3dRxKMG
Liz shared: Manager Tools has a great podcast about resumes https://lnkd.in/gsRDKrqF
Data Science Hangout | Stephen Bailey, Whatnot | From Academia to Industry
We were joined by Stephen Bailey, Data Engineer at Whatnot and former Director, Data & Analytics at Immuta at the Data Science Hangout.
A few topics from the discussion:
- being the first data scientist
- working with real-time data
- bringing data science skills to data engineering
- getting people to feel ownership of data
49:43 - Towards the end of the hangout, there were a lot of thoughts on moving from academia to industry.
Tatsu asked: As a fellow recovering academic, I’d love to hear your views of where we are in terms of the process for PhDs to go into industry in data-related roles. In psychology, there was minimal visibility into how to apply my skill set.
A few thoughts from the discussion:
‣ With a PhD program, it’s almost like you have a relationship with the field and you’re very invested in the research. You’re part of the scientific community. One of the things that’s very jarring about leaving, is that it feels like a divorce - severing that relationship and that community. It is almost two different worlds.
‣ If anyone’s thinking about transitioning or doing something similar, talking to people and building relationships before that transition can really help.
‣ Intentionally think about branding yourself.
‣ The biggest thing in industry is that you can’t fall in love with a problem. It makes finding a job much harder. You have to fall in love with your position in some way, how you are solving problems. Apply the mindset of a scientist into this business context.
‣ The skillsets that everyone’s learning in graduate school - no matter what the subject area is, are very helpful. My biggest advice would be don’t be afraid to reach out to people (whether on LinkedIn or in community events.) Most people are very willing to talk to you about what their journey was no matter where they’re at in their career. Once you realize that, you will see that your room is much bigger.
‣Finding a mentor outside academia is really helpful.
‣ Certain departments within universities do a really nice job of connecting with industry and the community and have great collaboration. How can that carry over to other institutions? How can companies better connect with colleges to help them understand how they can help more students?
Other resources shared: Santiago shared: I watched this recently. It explains zero knowledge proofs: https://lnkd.in/g6iS-97i (source = Wired) Ian shared: PAWS is a very useful package for interacting with AWS: https://lnkd.in/g_Fh9t6X
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu ► Data Science Hangout site: rstudio.com/data-science-hangout ► Add the Data Science Hangout to your calendar: https://www.addevent.com/event/Qv9211919
Follow Us Here: Website: https://www.rstudio.com LinkedIn:https://www.linkedin.com/company/rstu… Twitter: https://twitter.com/rstudio
Data Science Hangout | Mike Miller, Engine | Adjusting for Stakeholder Tendencies
We were recently joined by Mike Miller, Vice President and Data Science Team Leader at Engine.
We dove deep into a conversation towards the end of the discussion regarding surveys (whether internal or to clients):
Responder fatigue is a huge concern and something we deal with on a daily basis.
Keep it short, keep it focused. Make sure you’re asking people what you want to get out of it. It’s easy to go down a rabbit hole with data you’d like to see. Every time you add a question make sure it’s focused on your main objective.
Five questions, maybe 10 max. More than that you should be paying your respondents.
Double-barreled questions are bad - check for these. (These are questions that touch on two or more separate issues or topics but allow only one answer.)
Write your question well. Think about the people answering this. Make sure the question is asking what you want people to respond to and that there’s no way for other people to interpret it differently.
Resources mentioned as well:
Bryan’s meetup talk on Survey Design: https://lnkd.in/dCK_Ga8x
Mentorship program Maisie shared: https://lnkd.in/gwPyH-Mr
George Mount | R for Excel Users - First Steps | RStudio Meetup
Abstract: Excel’s built-in programming language has served as an entry point to coding for many. If you’re a data analyst steeped in Excel, chances are you could also benefit from learning R for projects of increased scope and complexity.
This presentation serves as a hands-on introduction to R for Excel users:
- How R differs from Excel as an open source software tool
- How to translate common Excel concepts such as cells, ranges, and tables to R equivalents
- Example use cases that you can take and apply to your own work
- How to enhance Excel and Power BI with R
By the end of this presentation, you will have a clear path forward for building repeatable processes, compelling visualizations, and robust data analyses in R.
Speaker Bio: George Mount is the founder of Stringfest Analytics, a consulting firm specializing in analytics education and upskilling. He has worked with leading bootcamps, learning platforms and practice organizations to help individuals excel at analytics. George regularly blogs and speaks on data analysis, data education and workforce development. He is the author of Advancing into Analytics: From Excel to Python and R (O’Reilly).
Link to George’s white paper “Five things Excel users should know about R”: https://stringfestanalytics.com/five-things-r-excel/
Working group sign-up for those interested!
Within many organizations Microsoft Excel is a preferred tool for working with data for non data analytics users. In order to build a data driven organization, source data and analytical models must be accessible to all data users (technical and non-technical) within their preferred tool. Let’s rally the R community to welcome Excel users into our data driven culture by building an Excel add-on to access data and models available within RStudio. If you’re interested in continuing this conversation and joining a working group, let us know! rstd.io/excel-r-community
Links shared at the meetup! George’s GitHub/ Presentation Resources: https://github.com/stringfestdata/rstudio-mar-2022
Packages? Where to find them & recommendations:
CRAN Task Views: https://cran.r-project.org/web/views/
Mark shared: for folks who primarily use excel to present formatted tables, the gt package is a great way to start doing this programmatically in R: https://gt.rstudio.com/
Ivan shared: In addition to regular Google, I’d recommend https://rseek.org/, given that the character ‘R’ is sometimes not search friendly :)
Jeff shared: Fpp2 is great for forecasting and time series analysis - https://otexts.com/fpp2/
Floris shared: https://otexts.com/fpp3/
Ivan shared: If you’re into tidyverse, there’s an equivalent for time-series: https://tidyverts.org/
George shared: https://dplyr.tidyverse.org/
Ryan shared: This can be a helpful package for dynamically editing tables, like in excel https://github.com/DillonHammill/DataEditR
Ryan shared: This is a great package for making and learning ggplot visualizations: https://cran.r-project.org/web/packages/esquisse/vignettes/get-started.html
Other resources:
Monaly shared: There is an R help group: r-help@r-project.org
George shared: Helpful book/site on statistics: https://moderndive.com/
Ryan shared: Harvard has a good online source (free options) that has a number of classes, the following for stats: https://www.edx.org/professional-certificate/harvardx-data-science
George shared: R for Data Science free book: https://r4ds.had.co.nz/
Fernando shared: big book of R https://www.bigbookofr.com/index.html
Floris shared: Advanced R Book: https://adv-r.hadley.nz/
Pedro shared: The R for Data Science Slack channel is a great learning resource! r4ds.io/join (we just made a channel there called #chat-excel_to_r)
Ivan shared: For teams who are deeply entrenched in Excel (like my old team), this tool may be useful - https://bert-toolkit.com/ . It allows running R code in .xls, so you can learn R while doing .xls :)
Re: Glossary of terms: Ivan shared: inner_join() is like VLOOKUP in .xls. Dan shared: Here’s one cheat sheet (glossary of Excel to R) that I just found; https://paulvanderlaken.com/2018/07/31/transitioning-from-excel-to-r-dictionary-of-common-functions/
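To illustrate the VLOOKUP comparison, here is a small sketch using dplyr with invented order and price tables. One nuance worth seeing: inner_join() drops rows without a match, while left_join() keeps every row of the first table and fills unmatched lookups with NA, which is closer to how a VLOOKUP column behaves when a key is missing.

```r
library(dplyr)

# Made-up tables: orders to price, and a lookup table of prices by SKU.
orders <- data.frame(order_id = 1:4, sku = c("A", "B", "C", "A"))
prices <- data.frame(sku = c("A", "B"), price = c(2.5, 4))

# Keeps only orders whose SKU exists in the lookup table.
matched <- inner_join(orders, prices, by = "sku")

# Keeps all orders; SKU "C" has no price, so its price is NA.
looked_up <- left_join(orders, prices, by = "sku")

matched
looked_up
```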
Extra Meetup Links Feedback: rstd.io/meetup-feedback Talk submission: rstd.io/meetup-speaker-form If you’d like to find out about upcoming events you can also add this calendar: rstd.io/community-events RStudio conference/submit a talk: https://www.rstudio.com/conference/ Recordings of all meetups: https://www.youtube.com/playlist?list=PL9HYL-VRX0oRKK9ByULWulAOO5jN70eXv
Creating Features for Machine Learning from Text – Julia Silge, March 2022
Julia Silge is a software engineer at RStudio PBC where she works on open source modeling tools. She holds a PhD in astrophysics and has worked as a data scientist in tech and the nonprofit sector, as well as a technical advisory committee member for the US Bureau of Labor Statistics. She is an author, an international keynote speaker, and a real-world practitioner focusing on data analysis and machine learning. Julia loves text analysis, making beautiful charts, and communicating about technical topics with diverse audiences.
Natural language that we as speakers and writers use must be dramatically transformed to new representations for analysis, whether we are just starting off with exploratory data analysis or are ready to train machine learning algorithms such as predictive models. We can explore typical text preprocessing steps from the ground up, from tokenization to building word embeddings, and consider the effects of these steps. When are these preprocessing steps helpful, and when are they not? In this talk, learn about the process of text preprocessing for ML models in the real world, how and when practitioners use different preprocessing choices, and considerations for text ML tooling.
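As a small illustration of the first preprocessing steps mentioned in the abstract, the sketch below tokenizes two made-up sentences, removes a toy stop word list, and builds a document-term count matrix. It uses only base R and is not taken from the talk, which covers dedicated text modeling tooling.

```r
# Two invented documents.
docs <- c(
  d1 = "The dragon guards the cave.",
  d2 = "A friendly dragon shares treasure!"
)

# Lowercase, replace anything that is not a letter or space, split on spaces.
tokenize <- function(x) {
  x <- gsub("[^a-z ]", " ", tolower(x))
  tokens <- strsplit(x, " +")[[1]]
  tokens[nzchar(tokens)]
}

stop_words <- c("the", "a")  # toy stop word list for illustration

tokens <- lapply(docs, function(d) {
  t <- tokenize(d)
  t[!t %in% stop_words]
})

# Document-term matrix: one row per document, one column per vocabulary term.
vocab <- sort(unique(unlist(tokens)))
dtm <- t(sapply(tokens, function(t) table(factor(t, levels = vocab))))
dtm
```

Whether removing stop words helps depends on the model, which is one of the trade-offs the talk examines.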
#rstats #nlp #juliasilge #coding #machinelearning https://rug-at-hdsi.org/ https://twitter.com/RUGatHDSI

Mara Averick & Maya Gans | Data Visualization Accessibility | RStudio Meetup
2:55 - A11Y in R: Adapting Sarah L. Fossheim’s 10 dos and don’ts to keep in mind when designing accessible data visualizations | Maya Gans 30:11 - Adventures with {highcharter} and the Highcharts accessibility module | Mara Averick 58:07 - Q&A
A11Y in R: Adapting Sarah L. Fossheim’s 10 dos and don’ts to keep in mind when designing accessible data visualizations | Presented by Maya Gans
Abstract: This talk will use R based visualizations and walk through examples of accessibility considerations when making plots and applications.
Speaker Bio: Maya Gans is a Data Visualization Engineer at Atorus Research where she develops custom applications using R and JavaScript. As an RStudio intern she designed TidyBlocks, a visual block based programming language. Maya also co-wrote JavaScript for Data Science. Maya uses ggplot2 and d3.js to create music related infographics for JamBase.com. When Maya’s not coding, she’s climbing mountains.
Adventures with {highcharter} and the Highcharts accessibility module | Presented by Mara Averick
Abstract: Lessons learned about accessibility in data visualization through using the {highcharter} R package and the Highcharts visualization library’s accessibility module.
Speaker bio: Mara is a developer advocate at RStudio. She is the author of neither the highcharter package nor the Highcharts charting library, but enjoys using both to make interactive, accessible data visualizations.
So many amazing resources shared yesterday on data visualization accessibility:
Sarah L Fossheim’s blog - intro to designing accessible data viz: https://lnkd.in/dAAXfE35 Coblis - Color Blindness Simulator: https://lnkd.in/dJT-hJE4 Web Content Accessibility Guidelines (WCAG): https://lnkd.in/dJKmvFTq Color Contrast Accessibility Validator: https://color.a11y.com/ Google lighthouse (automated tool for improving quality of web pages): https://lnkd.in/d8xjSN5i A11y Project Checklist: https://lnkd.in/diFM_TBd Chartability (questions) for ensuring data visualizations, systems, and interfaces are accessible: https://lnkd.in/d_7wk3zx Accessible {highcharter} GitHub repo (Mara’s charts, and source .Rmds): https://lnkd.in/dhvBwQ-f Mara’s blog post series: https://lnkd.in/d9xz6VZ6 10 Guidelines for DataViz Accessibility by Øystein Moseng: https://lnkd.in/dS-XsKxw Accessible visualization via natural language descriptions by Alan Lundgard and Arvind Satyanarayan: https://lnkd.in/dpdN4skK DataViz Accessibility Advocacy and Advisory Group: https://lnkd.in/d336ACn3 Alt-texts: The Ultimate Guide by Daniel Göransson: https://lnkd.in/dsHcvPs2 JooYoung Seo’s Talk on non visual interactions with R packages: https://lnkd.in/dcid56BT Accessible Data Science for the Blind Using R: https://lnkd.in/dTWZbau8 Maya’s blog on skip links: https://lnkd.in/dFTYFxTk Twitter alt text: https://lnkd.in/d9bKqiPU Twitter account to follow: @alttextreminder Silvia Canelón’s blog posts: https://lnkd.in/drwbE2Rf
Packages shared: ggpattern: https://lnkd.in/dyTBvvz4 gglabeler: https://lnkd.in/dumA8Um8 gghighlight: https://lnkd.in/d_m25j7x sonify: https://lnkd.in/dyPwHimP tuneR: https://lnkd.in/dWi2WZH8 brailleR package - https://lnkd.in/d_75cdnQ
Caleb brought up a great point that visualizing data isn’t new, so it can be helpful to look at adjacent disciplines to see ways people have solved these problems before as well. For example, cartographers have had really creative ways to make things visible.
Sarah Bell, a cartographer who makes fonts / typography really legible on maps: sarahbellmaps.com/belltopo-sans-font-by-sarah-bell/
Cynthia Brewer’s work on color palettes: https://colorbrewer2.org/#
Cartographers who use 3D printing for tactile maps: https://touch-mapper.org/en/
Shiny Usage Tracking in Posit Connect
🤔 Who actually used my Shiny app and for how long? ↗️ Is viewership increasing? Did my CEO use it?!
Did you know? Posit Connect can record event-style usage information which is intended to answer questions like this and is accessed via the Posit Connect Server API.
Please note: during the meetup you can ask questions here anonymously as well: rstd.io/connect-meetup-questions
We will walk through several examples for getting started with the Posit Connect usage data designed to help you answer questions like:
Who is visiting my content? What reports are most common? Has viewership increased over time?
During/after the meetup, you can use the GitHub repo to create the dashboard in your own Connect environment as well and schedule the report to be distributed through email (with inline graphics). The goal for this code is that it is generic enough so that you can copy/paste it into your own R session and run it successfully.
In order for the code to work in your environment, you need two pieces of information unique to your enterprise:
- Posit Connect’s server path
- A Posit Connect API key
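A minimal sketch of how those two pieces of information might be used with the connectapi R client is shown below. It assumes the server path and API key are stored in the CONNECT_SERVER and CONNECT_API_KEY environment variables, and it assumes the usage records carry content_guid, started, and ended columns. The summarise_sessions() helper is hypothetical and is not part of connectapi or the meetup repository.

```r
# Hypothetical helper: sessions and total minutes per content item.
# Expects `started` and `ended` to be date-time (POSIXct) columns.
summarise_sessions <- function(usage) {
  minutes <- as.numeric(difftime(usage$ended, usage$started, units = "mins"))
  total <- tapply(minutes, usage$content_guid, sum, na.rm = TRUE)
  n <- tapply(minutes, usage$content_guid, length)
  data.frame(
    content_guid = names(total),
    sessions = as.integer(n),
    minutes = as.numeric(total),
    row.names = NULL
  )
}

server <- Sys.getenv("CONNECT_SERVER")
api_key <- Sys.getenv("CONNECT_API_KEY")

# Only talk to the server when both values are set and connectapi is installed.
if (nzchar(server) && nzchar(api_key) &&
    requireNamespace("connectapi", quietly = TRUE)) {
  client <- connectapi::connect(server = server, api_key = api_key)
  usage <- connectapi::get_usage_shiny(client, limit = 500)
  print(summarise_sessions(usage))
}
```

Keeping the credentials in environment variables means the same script can be copied between environments without editing the code.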
We’d love to hear your feedback and learn how you have taken this API and created dashboards for your own organizations as well.
Helpful Links: Follow-up thread on community.rstudio.com: https://community.rstudio.com/t/rstudio-connect-usage-data-thread-to-discuss-ideas-improvements/130581 Contribute to open-source examples: https://github.com/sol-eng/connect-usage Posit Connect Server API: https://docs.rstudio.com/connect/api/ Cookbook: https://docs.rstudio.com/connect/cookbook/ R Client: https://pkgs.rstudio.com/connectapi/ Python Client (mostly deployment): https://github.com/rstudio/rsconnect-python/ Cole’s slides: https://github.com/RStudioEnterpriseMeetup/Presentations
If you ever have questions about Posit Connect or any of our professional products, you can use this link to chat with our team: rstd.io/chat-with-rstudio
Data Science Hangout | Matthias Mueller, Campaign Monitor | Understanding Customer Actions
We were recently joined by Matthias Mueller, Senior Director of Analytics at Campaign Monitor.
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu ► Data Science Hangout site: rstudio.com/data-science-hangout ► Add the Data Science Hangout to your calendar: https://www.addevent.com/event/Qv9211919
Follow Us Here: Website: https://www.posit.co LinkedIn: https://www.linkedin.com/company/posit Twitter: https://twitter.com/posit
Carson Sievert || Using tagQuery() from {htmltools} to modify HTML snippets in R || RStudio
00:00 Introduction 00:45 Motivating example - enabling front-facing camera as an input for fileInput() 01:55 Breaking down the return value of fileInput() 04:16 Design philosophy of fileInput() 07:27 tagAppendAttributes() overview 11:05 tagQuery() basics 12:00 Quick overview of the htmltools package 13:18 How tagQuery() is used to append attributes 20:54 How tagQuery() is used to append children 23:45 Using tagQuery() on an actionButton()
Learn more about tagQuery here: https://rstudio.github.io/htmltools/articles/tagQuery.html
Read up on tagAppendAttributes() here: https://shiny.rstudio.com/reference/shiny/latest/tagAppendAttributes.html
And learn more about the htmltools package here: https://rstudio.github.io/htmltools/index.html
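For a feel of the two operations covered in the chapters (appending attributes and appending children), here is a minimal sketch on a hand-built tag rather than on fileInput(). The tag structure, IDs, and attribute values are invented for illustration.

```r
library(htmltools)

# A small hand-built form field: a label and a text input inside a div.
field <- div(
  class = "form-group",
  tags$label(`for` = "name", "Name"),
  tags$input(id = "name", type = "text")
)

# Find the nested <input>, add an attribute and a class, then return the
# whole (modified) tag tree with allTags().
with_attrs <- tagQuery(field)$
  find("input")$
  addAttrs(placeholder = "Ada Lovelace")$
  addClass("form-control")$
  allTags()

# Append a child element to the outer <div>.
with_child <- tagQuery(field)$
  append(tags$small("Required"))$
  allTags()

html <- as.character(with_attrs)
cat(html)
```

The same chain works on tags returned by Shiny input functions, since they are ordinary htmltools tag objects.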
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Developer (@winston_chang) Animation, design, and editing: Jesse Mostipak (@kierisi)

James Blair || Getting Started with {plumbertableau} || RStudio
00:00 Introduction 01:19 Setting up the problem - capitalizing text with a custom function 02:18 Using Plumber to create an API for our function 04:08 Using Run API + Swagger from the RStudio IDE 05:44 Giving Tableau access to the function with PlumberTableau 09:16 Reviewing what we’ve done so far 09:47 Comparing results between Plumber and PlumberTableau 10:12 Overview of what PlumberTableau does 14:27 Centralized hosting with RStudio Connect 15:17 Looking at our API in RStudio Connect 18:14 How to access the deployed API from Tableau 21:03 Overview of RStudio Connect, Tableau, and PlumberTableau process 21:52 More in-depth example using sample sales data 22:36 Example with the Python equivalent of PlumberTableau, FastAPITableau 25:15 Overview of how these Tableau extension packages work 27:21 Setting up a connection between Tableau and RStudio Connect
Read more about the plumbertableau package here: https://rstudio.github.io/plumbertableau/
And learn about the fastapitableau package here: https://rstudio.github.io/fastapitableau/
If you’re unfamiliar with Plumber, this Quickstart guide gives a good overview of the package: https://www.rplumber.io/articles/quickstart.html And you can learn more about Shiny here: https://shiny.rstudio.com/
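The starting point of the video (wrapping a text-capitalizing function in a Plumber API) can be sketched as below. This is not the code from the video: the /capitalize path and the text parameter name are assumptions, and the plumbertableau annotations that expose the endpoint to Tableau are left out.

```r
# The kind of function used in the video's example: capitalize incoming text.
capitalize <- function(text) {
  toupper(text)
}

if (requireNamespace("plumber", quietly = TRUE)) {
  # Build a router programmatically and register a POST endpoint.
  api <- plumber::pr() |>
    plumber::pr_post("/capitalize", function(text = "") {
      list(result = capitalize(text))
    })

  # Uncomment to serve locally, then POST to http://localhost:8000/capitalize
  # plumber::pr_run(api, port = 8000)
}
```

Defining capitalize() separately from the endpoint keeps the logic usable from both the API and ordinary R code.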
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: James Blair (@Blair09M) Design and editing: Jesse Mostipak (@kierisi)
Music: Borough by Blue Dot Sessions https://app.sessions.blue/browse/track/89821
Carson Sievert || Customizing Navigation Items in Shiny using {bslib} || RStudio
00:00 Introduction 00:15 Linking inside navbarPage 01:19 Replacing tabPanel with navbarPage, and navbarMenu 02:32 nav_spacer() 03:41 Adding header and/or footer content 04:07 Replacing tabsetPanel with navs_tab and navs_pill 04:32 navs_tab_card() and navs_pill_card() variants 04:40 Demo of all of the nav_*() functions
The bslib R package provides tools for customizing Bootstrap themes directly from R, making it much easier to customize the appearance of Shiny apps & R Markdown documents. bslib’s primary goals are:
- To make custom theming as easy as possible.
- Custom themes may even be created interactively in real-time.
- Also provide easy access to pre-packaged Bootswatch themes.
- Make upgrading from Bootstrap 3 to 4 (and beyond) as seamless as possible. (Shiny and R Markdown default to Bootstrap 3 and will continue to do so to avoid breaking legacy code.)
- Serve as a general foundation for Shiny and R Markdown extension packages. (Extensions such as flexdashboard, pkgdown, and bookdown already fully support bslib’s custom theming capabilities.)
You can read more about bslib here: https://rstudio.github.io/bslib/articles/bslib.html And you can learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Carson Sievert (@cpsievert) Design & editing: Jesse Mostipak (@kierisi)

Alok Pattani - Google | Sports Analytics Meetup | RStudio
Abstract: The increasing volume, variety, and velocity of sports data provides both great opportunities and challenges for data scientists working in sports. Using R with Google Cloud data science tools like BigQuery can help practitioners scale their analysis and impact in this “new era” of sports analytics. This presentation will include a demonstration of using R and Google Cloud together with an NCAA basketball data example, as well as a discussion of the application of such metrics and tools in the sports media and technology industries.
Speaker Bio: Alok is a Data Science Developer Advocate at Google, where he shows how to use Google Cloud tools for data science, in sports and otherwise. He is a sports analytics expert and a long-time user of R and RStudio. Before joining Google, Alok spent 8 years at ESPN, where he was a founding member of their Sports Analytics team and contributed significantly to the use of analytical content across all media platforms. Alok is originally from Cheshire, CT and earned a BA/MA in statistics from Boston University.
Alok’s Slides: https://lnkd.in/gxQUf8nV
Packages shared: sportsdataverse: https://lnkd.in/g8UKAJgc wehoop: https://lnkd.in/grBgmc33 hoopr: https://lnkd.in/gRHNWV4j glmnet: https://lnkd.in/g4cuxYzs googleAuthRverse: https://lnkd.in/gf5fRgcC All of Mark Edmondson’s Google packages: https://lnkd.in/gmTpvMsY
Other resources shared: Sports data: https://www.spotrac.com/ Analyzing NCAA Basketball with GCP: https://lnkd.in/gT-4vWwa 2020 Google Cloud March Madness Insights: https://lnkd.in/gEWd9xtu Alok’s presentation on Innovating the MLB Fan Experience through Data: https://lnkd.in/g9XmFKsy NFL Player Tracking Data Meetup recording: Analyzing Soccer Data with BigQuery: https://lnkd.in/gbbCGKaK Sports channel on the R for Data Science Online Learning Community Slack: r4ds.io/join # chat-sports_analytics
Carson Sievert || Developing Shiny Custom Themes in Real Time Using {bslib}| RStudio
00:00 Introduction 00:09 The magic of bs_theme_preview() 01:43 The interactive widget provided by bs_theme_preview() 02:12 Using Bootswatch themes 02:57 Using the interactive widget to adjust your theme in real time 03:38 Integration with Google Fonts 04:22 Thematic is enabled in bs_theme_preview() 04:45 DT tables is enabled in bs_theme_preview() 05:30 Going from the interactive widget to your R code 07:03 Using interactive theming on your own Shiny app 09:01 Interactive theming with R Markdown documents
The bs_theme_preview() function launches an example shiny app via run_with_themer() and bs_theme_dependencies(). This is useful for getting a quick preview of the current theme setting as well as an interactive GUI for tweaking some of the main theme settings. Link to docs: https://rstudio.github.io/bslib/reference/bs_theme_preview.html
You can read more about the bslib package here: https://rstudio.github.io/bslib/ And you can learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Carson Sievert (@cpsievert) Design and editing: Jesse Mostipak (@kierisi)

Carson Sievert || Custom Theming with {bslib} in Shiny and R Markdown using bs_theme() || RStudio
00:00 Introduction 01:15 Jumping right in with the theme argument in Shiny 01:31 Shiny’s classic navbarPage using bs_theme() 01:46 Specifying your Bootstrap version 02:31 Using the theme argument in R Markdown 03:17 Custom theming in R Markdown using bs_theme() 04:10 bslib templates provided by RStudio 05:33 Review of common arguments for the theme parameter 08:47 Tips for working with dark themes 10:34 Introduction to the thematic package, which styles plots 12:04 How thematic handles fonts 13:09 Using fonts with bslib in R Markdown 14:36 Moving a theme from an R Markdown document into a Shiny app 16:51 Setting warnings for contrast ratios 18:42 A quick tour of Bootstrap 4 and Bootstrap 5 Sass variables 20:35 A quick overview of writing custom HTML in Shiny 22:15 How bslib automatically handles color contrast ratios
bs_theme() allows you to create a Bootstrap theme object, where you can choose a (major) Bootstrap version, choose a Bootswatch theme (optional), customize main colors and fonts via explicitly named arguments (e.g., bg, fg, primary, etc.), and customize other, lower-level, Bootstrap Sass variable defaults via the ... argument.
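As a minimal sketch of the arguments described above, a theme object might be built and handed to a Shiny page as follows. The Bootswatch name, colors, font, and page content are illustrative assumptions, not values taken from the video.

```r
library(shiny)
library(bslib)

# Illustrative values only: choose a Bootstrap major version and an optional
# Bootswatch theme, then override the main colors and fonts by name.
theme <- bs_theme(
  version = 5,
  bootswatch = "minty",
  bg = "#202123",
  fg = "#B8BCC2",
  primary = "#EA80FC",
  base_font = font_google("Inter"),
  # Lower-level Bootstrap Sass variables are passed through `...`
  "font-size-base" = "1.1rem"
)

# Hypothetical page: the theme is supplied through the `theme` argument.
ui <- navbarPage(
  title = "Theme demo",
  theme = theme,
  tabPanel("Home", p("Hello, themed world."))
)

# Opens the demo app with the real-time theming widget (interactive use only).
if (interactive()) {
  bs_theme_preview(theme)
}
```

Keeping the theme in its own object makes it easier to reuse the same settings between an R Markdown document and a Shiny app.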
You can read more about the bslib package here: https://rstudio.github.io/bslib/ And you can learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Carson Sievert (@cpsievert) Design and editing: Jesse Mostipak (@kierisi)

Barret Schloerke || Maximize computing resources using future_promise() || RStudio
00:00 Introduction 01:45 Setting up a multisession using the future package 02:05 Simulation using two workers 04:14 Simulation using 10 workers 05:20 What happens when we run out of workers? 05:35 How Shiny handles future processes like promises 07:16 Introduction to future_promise() 07:45 Demo of the promises package 09:21 Setting the number of workers 10:40 Demo of processing without future_promise() 14:11 Wrapping a slow calculation in a future() 14:53 Demo of processing using Plumber 16:25 Considerations on the number of cores to use 17:21 What happens if we run out of workers? 19:44 Decrease in execution times using future_promise()
In an ideal situation, the number of available future workers (future::nbrOfFreeWorkers()) is always more than the number of future::future() jobs. However, if a future job is attempted when the number of free workers is 0, then future will block the current R session until one becomes available.
The advantage of using future_promise() over future::future() is that even if there aren’t future workers available, the future is scheduled to be done when workers become available via promises. In other words, future_promise() ensures the main R thread isn’t blocked when a future job is requested and can’t immediately perform the work (i.e., the number of jobs exceeds the number of workers).
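The scheduling behaviour described above can be sketched in a plain R script. This is an illustration under stated assumptions: `slow_square()` is a hypothetical stand-in for a slow calculation, and the manual `later::run_now()` loop is only needed outside Shiny or Plumber, where the event loop is not already running.

```r
library(future)
library(promises)

# Two background workers, but four jobs: more jobs than workers.
plan(multisession, workers = 2)

# Hypothetical slow calculation.
slow_square <- function(x) {
  Sys.sleep(0.5)
  x^2
}

results <- list()

for (i in 1:4) {
  local({
    idx <- i
    # future_promise() queues the job without blocking the main R session,
    # even when no worker is free at the moment it is requested.
    p <- future_promise(slow_square(idx))
    then(p, onFulfilled = function(value) {
      results[[as.character(idx)]] <<- value
    })
  })
}

# Drive the event loop by hand until all four promises have resolved
# (bounded by a deadline so the script cannot hang forever).
deadline <- Sys.time() + 60
while (length(results) < 4 && Sys.time() < deadline) {
  later::run_now(timeoutSecs = 0.1)
}

# Shut the workers down again.
plan(sequential)
```

With plain future::future(), the third call in the loop would block until one of the two workers became free; here the loop returns immediately and the queued jobs start as workers become available.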
You can read more about the promises package here: https://rstudio.github.io/promises/articles/shiny.html And you can learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Barret Schloerke (@schloerke) Design and editing: Jesse Mostipak (@kierisi)

Nick Strayer || Part IV: Styling a Shiny Wordle App with CSS || RStudio
00:00 Introduction 00:44 Switching from verbatimTextOutput to uiOutput 01:42 Switching from renderText to HTML DOM elements 03:17 In-line styling with divs 07:30 Converting individual letters from block elements to adjacent grids with CSS grid 08:56 Adding CSS at the head of the UI variable in Shiny with tags$head (and wrapping with HTML!) 10:36 CSS targeting of the background color 12:24 Link: Complete Guide to CSS Grid 14:05 Moving text position within each individual div using CSS classes 16:48 Creating a gap between grid elements 17:13 Rounding border edges for letter grids 19:00 Formatting letter grid background color to indicate result “correctness” 21:30 Increasing font size 23:37 Updating the legend to use color, not text indicators 26:40 Adjusting padding to improve app aesthetic 28:08 Formatting the app UI with justified centering 31:56 Adjusting the text input and Go button 34:07 Why Flexbox is the right tool for this task 35:09 Exploring Flexbox Dev Tools in Chrome 39:14 Adjusting the colors of letter grids using Inspect Element 40:40 Making text bold with font-weight 41:04 Hint on how to approach formatting the keyboard
In the final installment of this four-part series, RStudio’s Nick Strayer walks through using CSS to stylize our Shiny Wordle app.
Code + word list: https://github.com/wch/shiny-wordle Check out the full Shiny app here: https://winston.shinyapps.io/wordle/ You can learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Nick Strayer (@NicholasStrayer) Animation, design, and editing: Jesse Mostipak (@kierisi) Music: Lakal by Blue Dot Sessions
Wordle: https://www.powerlanguage.co.uk/wordle/

Barret Schloerke || {reactlog} Rundown || RStudio
00:00 Introduction to Reactlog 00:44 Viewing Reactlog using an Old Faithful Shiny app 02:07 The Reactlog interface 04:31 Walking through a reactive graph with Reactlog 05:14 Downstream dependency invalidation in Shiny 06:43 How Shiny “grabs” data 09:41 How the Reactlog timeline works 10:46 Switching between idle states in Reactlog 11:58 Reactlog interactivity - clicking a single item 13:21 Reactlog with the Pythagoras Theorem app 15:45 Adding a UI and server value to add Reactlog to your Shiny app 18:05 Walking through the reactive graph using the Pythagorean Theorem app 21:07 Append-only behavior of Reactlog 21:18 Marking a time point in Reactlog 23:17 Using Reactlog to debug reactivity 26:55 Resetting our app and testing logic changes 28:01 Reactlog with a large Shiny app, CRANwhales 34:10 Freezing reactive values 36:19 Calculating click count in a Shiny app 37:10 Click the button, render the plot is bad - see why
Shiny is an R package from RStudio that makes it incredibly easy to build interactive web applications with R. Behind the scenes, Shiny builds a reactive graph that can quickly become intertwined and difficult to debug. reactlog provides a visual insight into that black box of Shiny reactivity.
After logging the reactive interactions of a Shiny application, reactlog constructs a directed dependency graph of Shiny’s reactive state at any time point in the record. The reactlog dependency graph lets users see whether reactive elements are:
- Not utilized (never retrieved)
- Over utilized (called independently many times)
- Interacting with unexpected elements
- Invalidating all expected dependencies
- Freezing (and thawing), preventing triggering of future reactivity
There are many subtle features hidden throughout reactlog. Here is a short list quickly describing what is possible within reactlog:
- Display the reactivity dependency graph of your Shiny applications
- Navigate throughout your reactive history to replay element interactions
- Highlight reactive family trees
- Filter on reactive family trees
- Search for reactive elements
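A minimal sketch of how reactlog might be switched on for an app is shown below. The small hypotenuse app is a hypothetical stand-in for the Pythagorean Theorem app in the video, and its input and output names are assumptions.

```r
library(shiny)
library(reactlog)

# Ask Shiny to record reactive activity; call this before the app runs.
reactlog_enable()

ui <- fluidPage(
  numericInput("a", "Side a", value = 3),
  numericInput("b", "Side b", value = 4),
  textOutput("hypotenuse"),
  # Embeds the reactlog display inside the app itself.
  reactlog_module_ui()
)

server <- function(input, output, session) {
  squares <- reactive(input$a^2 + input$b^2)
  output$hypotenuse <- renderText(sqrt(squares()))
  reactlog_module_server()
}

app <- shinyApp(ui, server)

if (interactive()) {
  runApp(app)
  # After the app stops, open the recorded dependency graph.
  shiny::reactlogShow()
}
```

While the app is running, pressing Ctrl+F3 (Cmd+F3 on a Mac) in the browser also opens the reactive graph for the current session.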
You can read more about reactlog here: https://rstudio.github.io/reactlog/articles/reactlog.html And you can learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Barret Schloerke (@schloerke) Design & editing: Jesse Mostipak (@kierisi)

Sjoerd Wierenga & Job Spijker | Public Health | Shiny in Production | Posit
R in Public Sector: Organizational & Technical Aspects of Shiny in Production with the Dutch National Institute for Public Health and the Environment.
00:00 - Introductions 2:47 - Organizational aspects of Shiny in production 32:52 - Technical aspects of Shiny in production 52:33 - Ask us everything / Open Discussion
Questions: 29:00 - When you first introduced Shiny, what other tools were you comparing it to? How did you explain the difference to your leaders? 30:00 - What were the most important aspects of your prototype app to create buy-in? 52:33 - As Clusterbuster began to be used by more people, did you face any performance issues? How did you adjust your app to deal with more concurrent users? 56:10 - Can you say anything about the update frequency of the data? 57:15 - Which model was used to define the clusters? 58:23 - Did you ever consider not using a database? 1:01:50 - What’s the communication with the data engineering team? 1:03:51 - How often do you collect feedback from users and update your app? 1:05:10 - Was your data loaded into Docker in a form of some aggregates? How did you create them? 1:06:26 - What is the main advantage of keeping it all in R with Shiny? Did you feel at any point you were sacrificing simplicity? 1:08:14 - Did you use any specific methods to increase the performance of your app? Did you scope your data, or load it all in the global file? 1:12:03 - How did you make sure regions and users felt comfortable using your app? 1:13:25 - What types of businesses are hotbeds for covid clusters? Has this info informed policy changes? 1:14:50 - How did the data quality issues improve over the rollout? 1:16:47 - Did you use CI/CD? 1:17:38 - Did you have any functionality within your apps to send individual-level data to municipalities? 1:19:47 - For huge amounts of data, have you tested out different file types to store your data set within your containers? 1:20:54 - For people just starting to use Shiny, what is one piece of advice you would give them?
Proof of Concept with fictitious data: https://rivm.shinyapps.io/clusterbuster/ Blog post from the team as well! https://www.rstudio.com/blog/how-the-clusterbuster-shiny-app-helps-battle-covid-19-in-the-netherlands/ Code-first blog post mentioned: https://www.rstudio.com/blog/code-first-data-science-for-the-enterprise2/
How the “Clusterbuster” app provides actionable information to 300 health professionals Presented by: Sjoerd Wierenga
In this talk we want to give an overview of what it took to create the Clusterbuster from an organizational perspective. We will go into detail on how we got from an abstract question to an application that is user-friendly, safe, and valuable. Furthermore, we will offer a glimpse of what is yet to come, and where we see possibilities to turbocharge a more data-driven public policy approach.
How to build a production shiny app within the context of public health governance. Presented by: Job Spijker
This presentation goes into the more technical details about the production environment of the Clusterbuster application. We will show how we deployed the application, how we ensured security and mitigated the risks in case of a security breach, and how we organized our code for maintainability and refactoring.
Presenter Biographies:
Sjoerd Wierenga: As the son of two healthcare professionals, with a background in Public Administration, and a passion for technology, it is no surprise that Sjoerd Wierenga now works at the National Institute for Public Health and the Environment leading a team of highly skilled Data Scientists that created an application to support the battle against COVID-19. After having worked as a healthcare manager for several years, he decided he wanted to learn how to program, which he has been doing since 2016 in different capacities.
Job Spijker: Job Spijker is a senior research and data scientist at the Dutch National Institute of Public Health and the Environment. He has a PhD in Earth Sciences with a focus on computational and statistical methods of spatial data. He is currently involved in projects about how the institute’s environmental and health data can be leveraged to create insightful, actionable information to assist policy makers at the local, regional, and national levels.
Data Science Hangout | Ian Anderson, Philadelphia Flyers | Moving into Leadership & Managing a Team
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
posit.co/data-science-hangout
We were recently joined by Ian Anderson, Director of Hockey Analytics at the Philadelphia Flyers, to discuss the most important things going on in data science leadership.
One of the topics that there were a lot of thoughts on during this hangout was balancing that shift into leadership and maintaining technical skills.
Question: 𝐀𝐬 𝐲𝐨𝐮 𝐦𝐨𝐯𝐞 𝐮𝐩 𝐭𝐡𝐞 𝐥𝐚𝐝𝐝𝐞𝐫 𝐢𝐧𝐭𝐨 𝐥𝐞𝐚𝐝𝐞𝐫𝐬𝐡𝐢𝐩 𝐚𝐧𝐝 𝐦𝐚𝐧𝐚𝐠𝐢𝐧𝐠 𝐚 𝐭𝐞𝐚𝐦, 𝐚𝐫𝐞 𝐲𝐨𝐮 𝐰𝐨𝐫𝐫𝐢𝐞𝐝 𝐚𝐛𝐨𝐮𝐭 𝐥𝐨𝐬𝐢𝐧𝐠 𝐲𝐨𝐮𝐫 𝐭𝐞𝐜𝐡𝐧𝐢𝐜𝐚𝐥 𝐜𝐨𝐝𝐢𝐧𝐠 𝐬𝐤𝐢𝐥𝐥𝐬?
Here are a few of the thoughts shared live from Ian and the community:
- You make time for things that are important to you. If exercise and health is important to you, you make time for it. If coding is something that you want to continue to maintain, you can find the time for that.
- Think about blocking designated focus time to do your individual work.
- Be curious. Office hours as a manager to answer questions from your group can also be an opportunity to learn new things/packages/methods together.
- Organize opportunities for the team to share any topics that they would like (coding, techniques, etc.). In sharing topics, you can also try limiting the time to ~4min to share. Two benefits of this fast moving, high energy meeting: 1. People will inspire each other with passion and creativity 2. The team gets to practice communicating their ideas with precision.
- Becoming more comfortable with what it means to lead people, it’s about enabling them to be really successful. If you measure your success by other people’s success, it can also help rationalize that. If you can chop down barriers for them, then they will be more successful so that means the technical time I sacrifice enables them further.
- It’s still very relevant to keep your skills at some type of level just for street cred in your team. Bringing in new people, if you can talk that language it just really helps your team and you can have those conversations.
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstu … Twitter: https://twitter.com/rstudio
Data Science Hangout | Prabha Thanikasalam, Flex | Calculating ROI with the Business
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
rstudio.com/data-science-hangout
We were recently joined by Prabha Thanikasalam, Senior Director, Analytics and Supply Chain Solutions at Flex, to discuss the most important things going on in data science leadership.
There was a great conversation live and in the chat that followed this question.
𝐇𝐨𝐰 𝐝𝐨 𝐲𝐨𝐮 𝐬𝐭𝐚𝐲-𝐟𝐨𝐜𝐮𝐬𝐞𝐝 𝐰𝐡𝐞𝐧 𝐰𝐨𝐫𝐤𝐢𝐧𝐠 𝐨𝐧 𝐝𝐚𝐭𝐚 𝐩𝐫𝐨𝐣𝐞𝐜𝐭𝐬? (From two sides: when learning data science and also when working on projects for the business)
Think about it as a marathon, not a sprint, you won’t be an expert in 6 months or a year. Build a good foundation over time by spending 45 minutes a day learning something.
Don’t keep your email inbox open all the time. This won’t work all the time, but it helps!
Try to separate the time you devote to learning new skills rather than “learning while doing” all the time. This can allow you to work faster with imperfect solutions and set aside learning a better way of doing it in dedicated time. Although it’s important to keep delivering quality analysis and product, it helps me move away from perfectionist tendencies.
Getting “side-tracked” isn’t necessarily bad - sometimes we’re asked to solve the wrong question and find that out half way through
Give yourself a deadline, or if you’re working with the business meet with them to discuss the success criteria and deadlines
Encourage teams to explore rabbit holes and not be mindless “task completers.” Some of the best insights come out of curveballs we explore.
Curiosity is also key to being a great data person
Try using a pomodoro timer to keep focused time
Check out Jacqueline Nolis & Emily Robinson’s book/companion podcast, “Build a Career in Data Science”, which talks about using projects as a motivator for learning new skills. Sometimes, it’s better to start with a problem you want to solve rather than start with a skill you want to learn!
When starting out, try to use code as frequently as possible, and also find a project you can get really excited about. Passion can help with the late nights and tying in some of the side track thoughts to one topic or outcome.
It’s helpful to realize there are still plenty of skills that require dedicated time to do more focused learning to exclusively understand the math, methodology, and technology. For example, with Natural Language Processing, starting with the technology and then moving onto the documentation can be an order that keeps you motivated.
𝐑𝐞𝐬𝐨𝐮𝐫𝐜𝐞 𝐬𝐡𝐚𝐫𝐞𝐝 𝐚𝐬 𝐰𝐞𝐥𝐥: Prediction Machines: https://lnkd.in/gSft3XRK & Measure what Matters: https://lnkd.in/g6xdSNci Building a Career in Data Science: https://jnolis.com/book/ Lessons learned from the smartest Software Engineer I’ve met: https://lnkd.in/gDScE7gH Inside Intel: https://lnkd.in/gKyTWZj4
Daniel Petzold || RStudio Team: Building and Sharing Jupyter Notebooks || RStudio
Learn more about RStudio Team here. https://www.rstudio.com/products/team/
Find the code for this example here. https://github.com/danielpetzold/space-tracker Read our blog post here. https://www.rstudio.com/blog/build-and-share-jupyter-notebooks-on-rstudio-team/
Timecodes 0:00 - Intro 0:07 - Build Jupyter Notebooks to analyze and visualize data 2:47 - Publish directly from RStudio Workbench to your content hub 5:13 - Share With Your Stakeholders on RStudio Connect
Jupyter Notebooks are interactive documents for code, outputs, and text. However, they’re often stuck in data scientists’ local computing environments. Collaborating can be difficult and sharing can be tedious. To live up to their fullest potential, data science teams need a way to scale their development securely and efficiently — while providing stakeholders easy access to their output and visualizations.
RStudio Team, made up of RStudio Workbench, RStudio Connect, and RStudio Package Manager, brings everything together to help data scientists create, reproduce, and share insights from their Jupyter Notebooks.
Let’s dive into a real-life example by exploring data from NASA’s Center for Near-Earth Objects (NEOs). Daniel Petzold walks us through his data analysis and reporting. Want to explore the report yourself? Check out the published report on RStudio Connect here. https://colorado.rstudio.com/rsc/space-tracker/space_tracker.html
On RStudio Workbench, you have a choice of editors: the RStudio IDE, JupyterLab, Jupyter Notebook, or VS Code. Choose your preference. From here, you can explore your dataset, embed HTML directly in your document, create visualizations, and more.
Once you’ve run your analyses and created insightful visualizations, you want to be able to share them with your team. RStudio Workbench allows you to publish to RStudio Connect, the content platform from RStudio.
You have multiple options: push-button deployment from Jupyter Notebook or using terminal commands from JupyterLab.
It’s not enough to publish your work. Once on RStudio Connect, you can share with end-users. Make your analysis accessible to specific users or more generally with different authentication measures. In addition, you can schedule the document to run at a certain time and send out an email with refreshed data.
Click the links below to learn more about these offerings.
RStudio Workbench: https://www.rstudio.com/products/workbench/
RStudio Connect: https://www.rstudio.com/products/connect/
Winston Chang || Part III: Adding a Keyboard to a Wordle Shiny App || RStudio
00:00 Introduction 00:25 Setting up a keyboard 00:54 Using an HTML p tag to print out letter indicators 01:56 Back to our keyboard! 03:44 Setting up a search and replace 06:32 Removing letters using regular expressions 08:43 Making guesses a reactiveVal() 11:00 Avoiding an infinite loop with reactiveVal()
In Part III of this four-part series, Winston walks through how to build a keyboard in a Shiny Wordle app.
Code + word list: https://github.com/wch/shiny-wordle Check out the full Shiny app here: https://winston.shinyapps.io/wordle/ You can learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Winston Chang (@winston_chang) Animation, design, and editing: Jesse Mostipak (@kierisi)
Wordle: https://www.powerlanguage.co.uk/wordle/

Winston Chang || Part II: Handling Duplicate Letters in a Shiny Wordle App || RStudio
00:00 Introduction 00:52 Setting up the problem with duplicate letters 02:08 Coding the first pass for exact matches in the correct position 06:29 Re-evaluating how to approach the problem 12:28 Removing only one instance of a letter 13:56 Testing our code 14:54 Setting up the second pass 19:08 Scoping with a double arrow 19:52 Debugging with a browser() statement 21:28 Checking our code
In Part II of this four-part series, Winston walks through how to handle duplicate letters when building your Shiny Wordle app.
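The two-pass idea from the timestamps above (exact matches first, then removing only one instance of a letter per partial match) can be sketched in base R. The function name `check_guess()` and the result labels are hypothetical, and this is not the code from the linked repository.

```r
# Compare a guess with the target word, letter by letter.
# Returns one label per letter: "correct", "in-word", or "not-in-word".
check_guess <- function(guess, target) {
  guess_chars <- strsplit(guess, "")[[1]]
  target_chars <- strsplit(target, "")[[1]]
  stopifnot(length(guess_chars) == length(target_chars))

  result <- rep("not-in-word", length(guess_chars))

  # Pass 1: exact matches in the correct position.
  exact <- guess_chars == target_chars
  result[exact] <- "correct"

  # Target letters that are still available for partial matches.
  remaining <- target_chars[!exact]

  # Pass 2: right letter, wrong position. Each match uses up exactly one
  # instance of the letter, so duplicates are not over-counted.
  for (i in which(!exact)) {
    pos <- match(guess_chars[i], remaining)
    if (!is.na(pos)) {
      result[i] <- "in-word"
      remaining <- remaining[-pos]
    }
  }

  result
}

check_guess("speed", "abide")
```

In this call the target contains a single "e", so only the first "e" of the guess is marked "in-word" and the second is marked "not-in-word".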
Code + word list: https://github.com/wch/shiny-wordle Check out the full Shiny app here: https://winston.shinyapps.io/wordle/ You can learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Winston Chang (@winston_chang) Animation, design, and editing: Jesse Mostipak (@kierisi)
Wordle: https://www.powerlanguage.co.uk/wordle/

Capacity Planning for Microsoft Azure Data Centers | Using R & RStudio Connect
Capacity Planning for Microsoft Azure Data Centers | An Explainable Data Science Workflow using R & RStudio Connect | Presented by Paul Chang
2:12 - Start of presentation 47:43 - Start of Q&A session
Thank you for watching! Here are a few helpful links:
- Link to Paul’s slides: https://lnkd.in/gh-hGScE
- More information on RStudio Connect: https://www.rstudio.com/products/connect/
- How to open an Azure account: https://azure.microsoft.com/en-us/
- Getting started with SAML authentication on RStudio Connect: https://support.rstudio.com/hc/en-us/articles/360022321494-Getting-Started-with-SAML-in-RStudio-Connect
- pins package: https://pins.rstudio.com/
- plumber package: https://www.rplumber.io/
- Upcoming events: rstd.io/community-events
- Chat with our team to start an RStudio Connect evaluation: rstd.io/chat-with-rstudio
Abstract: The Long Range Capacity Planning team at Microsoft is responsible for producing plans for expanding Microsoft Azure Data Centers around the world. These are multi-billion dollar plans that enable the full suite of IaaS and PaaS cloud offerings for our customers, over a 5+ year time horizon. In this talk, we will present the data science software stack that we have built using RStudio Connect and Azure, for producing these data center capacity plans. We will discuss how RStudio Connect has empowered our data scientists to connect more directly with internal stakeholders and decision makers, and how RStudio Connect has enabled us to streamline our data science and business processes.
Speaker Bio: Paul Chang, Senior Data & Applied Scientist, Microsoft
Paul Chang is the Systems Architect of the Long Range Capacity Planning team for Microsoft Azure Data Centers. He received his Applied Math PhD from Simon Fraser University and has worked in a variety of fields including Applied Functional Analysis, Hydrogen Fuel Cell modeling, and A.I. Applications in Vehicular Traffic Engineering. He was also a software engineer in SQL Azure for a couple of years.
Thank you for joining us!
- If you ever have suggestions or general feedback, please let us know! Here’s an anonymous google form: rstd.io/meetup-feedback
- We’d love to hear from you too! Here’s a talk submission form as well: rstd.io/meetup-speaker-form
- If you’d like to learn more about RStudio Connect: https://www.rstudio.com/products/connect/
- If you’re just starting to advocate for data science in general or RStudio tools: rstudio.com/champion
Winston Chang || Part I: Build a Basic Wordle App with Shiny || RStudio
00:00 Introduction 00:12 What is Wordle? 00:36 The Wordle app we’ll build by the end of this four-part series 01:08 How to approach the problem 01:38 Word list (link to file below) 01:52 UI function with fluidPage() 02:24 Print out what player guesses using verbatimTextOutput() 03:36 Run app in Viewer Panel 04:04 Adding an action button with actionButton() 04:29 Using bindEvent() with actionButton() 06:02 Limiting guesses to words with five characters 07:40 Using req() and cancelOutput() 08:54 Incorporating the word list 10:13 Matching player guess to word list 11:06 Matching player guess to target word 13:50 Writing a function to match guess to target word with feedback 18:15 Checking word length between guess and target 23:02 Why we’re using intermediary functions 28:51 Printing formatted letter information
In Part I of this four-part series, Winston walks through how to build a basic Wordle app using Shiny!
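The building blocks listed in the timestamps (fluidPage(), verbatimTextOutput(), actionButton(), bindEvent(), and req() with cancelOutput) could be combined roughly as follows. This is a sketch under assumptions, not the code from the repository: the short word list, the fixed target word, and the input IDs are all made up for illustration.

```r
library(shiny)

# Stand-in word list and target; the series loads a much longer list.
words <- c("abide", "crane", "speed", "there", "moist")
target <- "crane"

ui <- fluidPage(
  textInput("guess", "Your guess"),
  actionButton("go", "Go"),
  verbatimTextOutput("result")
)

server <- function(input, output, session) {
  output$result <- renderText({
    guess <- tolower(input$guess)
    # Ignore guesses that are not five characters long, keeping whatever
    # was displayed before instead of clearing the output.
    req(nchar(guess) == 5, cancelOutput = TRUE)
    if (!guess %in% words) {
      return("Not in word list")
    }
    if (guess == target) "Correct!" else "Try again"
  }) |>
    # Re-render only when the button is clicked, not on every keystroke.
    bindEvent(input$go)
}

app <- shinyApp(ui, server)

if (interactive()) {
  runApp(app)
}
```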
Code + word list: https://github.com/wch/shiny-wordle Check out the full Shiny app here: https://winston.shinyapps.io/wordle/ You can learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Winston Chang (@winston_chang) Animation, design, and editing: Jesse Mostipak (@kierisi)
Wordle: https://www.powerlanguage.co.uk/wordle/

Rich Iannone || Making Beautiful Tables with {gt} || RStudio
00:00 Introduction 00:37 Adding a title with tab_header() (using Markdown!) 01:47 Adding a subtitle 02:48 Aligning table headers with opt_align_table_header() 03:48 Using {dplyr} with {gt} 06:03 Create a table stub with the rowname_col argument 07:35 Customizing column labels with cols_label() 09:45 Formatting table numbers with fmt_number() 12:10 Adjusting column width with cols_width() 15:39 Adding source notes with tab_source_note() 16:55 Adding footnotes with tab_footnote() 18:55 Customizing footnote marks with opt_footnote_marks() 19:10 Demo of how easy managing multiple footnotes is with {gt} 23:41 Customizing cell styles with tab_style() 27:07 Adding label text to the stubhead with tab_stubhead() 28:15 Changing table font with opt_table_font() 29:25 Automatically scaling cell color based on value using data_color()
With the gt package, anyone can make wonderful-looking tables using the R programming language. The gt philosophy: we can construct a wide variety of useful tables with a cohesive set of table parts. These include the table header, the stub, the column labels and spanner column labels, the table body, and the table footer.
It all begins with table data (be it a tibble or a data frame). You then decide how to compose your gt table with the elements and formatting you need for the task at hand. Finally, the table is rendered by printing it at the console, including it in an R Markdown document, or exporting to a file using gtsave(). Currently, gt supports the HTML, LaTeX, and RTF output formats.
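To illustrate that workflow (table data, then table parts and formatting, then rendering or export), here is a small sketch using the built-in mtcars dataset. The choice of columns, labels, and notes is an assumption made for the example and does not come from the video.

```r
library(gt)

# Start from table data: the first five rows of three mtcars columns.
cars_tbl <- head(mtcars[, c("mpg", "hp", "wt")], 5) |>
  # Move the row names (car models) into the table stub.
  gt(rownames_to_stub = TRUE) |>
  tab_stubhead(label = "Model") |>
  tab_header(
    title = md("**Motor Trend** road tests"),
    subtitle = "First five cars in the mtcars dataset"
  ) |>
  cols_label(mpg = "MPG", hp = "Horsepower", wt = "Weight") |>
  fmt_number(columns = c(mpg, wt), decimals = 1) |>
  tab_source_note(source_note = "Source: the mtcars dataset shipped with R.") |>
  tab_footnote(
    footnote = "In units of 1,000 lbs.",
    locations = cells_column_labels(columns = wt)
  )

# Export to a file; printing `cars_tbl` at the console renders it instead.
out_file <- tempfile(fileext = ".html")
gtsave(cars_tbl, out_file)
```

Each function adds or formats one table part, so the pipeline reads in the same order as the parts listed above: stub, header, column labels, body formatting, and footer.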
The gt package is designed to be both straightforward and powerful. The emphasis is on simple functions for everyday display table needs.
You can read more about gt here: https://gt.rstudio.com/articles/intro-creating-gt-tables.html And you can learn more about Shiny here: https://shiny.rstudio.com/
Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/
Content: Rich Iannone (@riannone) Design & editing: Jesse Mostipak (@kierisi)

Isabella Velásquez | Building a Blog with R | RStudio
Building a Blog With R Presented by: Isabella Velásquez
Here are a bunch of resources Isabella shared ⤵️
Slides from the presentation: https://lnkd.in/gqGFmHMf Internal Blog Example: https://lnkd.in/gaFPxN5F Other resources from the talk: https://lnkd.in/gjXxeMaa
Distill Resources: 1️⃣ Distill for R Markdown: https://lnkd.in/gWsEBXfN 2️⃣ Building a blog with distill by Tom Mock: https://lnkd.in/gQiE8PC2 3️⃣ (Re-)introducing Distill for R Markdown: https://lnkd.in/gzidDpV2 4️⃣ The distillery: https://lnkd.in/gwDAg_7G 5️⃣ Postcards package: https://lnkd.in/geT6uB9t
Blogdown Resources:
1️⃣ blogdown: Creating Websites with R Markdown: https://lnkd.in/gGQ-fCWw 2️⃣ Hugo Themes: https://themes.gohugo.io/ 3️⃣Hugo Apéro: https://lnkd.in/g8U9tfvq 4️⃣ A Blogdown New Post Workflow with Github and Netlify: https://lnkd.in/gYNwsKTm
The R programming language is known for its applications to data science, but one of its best assets is the inviting community. Folks from around the world share their lessons learned, best practices, and code to support and inspire others. One tool that helps contribute to the thriving community is the blog.
A blog is a wonderful opportunity to record your data stories, gain exposure for your expertise, and support others in their R journey. Thanks to the advancement of tools like R Markdown, you can quickly get up and running with a blog and focus on customization and style.
In this talk, we will discuss possible reasons for creating a blog, the pros and cons of a blog, and how to decide on topics. We will then explore tools for creating your blog that make it easy to showcase your R skills, such as blogdown and distill.
At RStudio, we are always looking for stories of how you are using R for your work, community, or for fun. If this talk inspires you to start writing, we would love for you to contribute to the RStudio blog: https://www.rstudio.com/blog/
Speaker Bio: Isabella Velásquez: Isabella is a content strategist, author, and active member of the R community. Currently, she works at RStudio as a Sr. Product Marketing Manager with the goal of driving engagement around all the awesome things happening at RStudio. In her previous role, she conducted data analysis and research, developed infrastructure to support use of data, and created resources to engage technical and non-technical audiences. She channels these experiences to illuminate what is possible with great products.
Ralph Asher & Laura Darby Rose | R in Supply Chain Management | RStudio
Two talks! 3:24 - Start of meetup 3:24 - Intro to Supply Chain Design - Ralph Asher 33:38 - Forecasting Demand with R - Laura Darby Rose
Intro to Supply Chain Design - Ralph Asher
Abstract: COVID-19 has moved supply chain management from the back office to front-page news. And along with it, the discipline of supply chain design – the strategic evaluation of deciding where to locate manufacturing sites, warehouses, and other supply chain facilities – has gone from a little-known niche to a C-suite priority.
In this talk, I will introduce the field of supply chain design to the R community. Drawing upon my decade of experience in supply chain design and R, I will give a short example of how to design a warehouse network to support future customer need. This example will be drawn directly from my experience in the corporate world and my consulting business.
Speaker Bio: Ralph Asher is the founder of Data Driven Supply Chain LLC, a Minnesota-based consulting firm that uses data science and AI to evaluate, design, and optimize supply chains (www.datadrivensupplychain.com). Prior to founding Data Driven Supply Chain, Ralph worked as an Operations Research Scientist in corporate supply chain functions for 8 years at Target, designing e-commerce supply chain networks, and at General Mills, designing warehousing networks. Ralph has used R for supply chain analytics for a decade and can be reached at ralph@datadrivensupplychain.com or via LinkedIn.
Forecasting Demand with R - Laura Darby Rose
Abstract: Over the course of about a year, Mallinckrodt Pharmaceuticals’ Specialty Generics division replaced an expensive SaaS (Software as a Service) with an R script and alternate forecasting process. Using an RStudio Enterprise solution, they have found a more flexible and cost-effective way to forecast demand and analyze data with R. This discussion will detail the process of replacing a SaaS with R, as well as challenges and next steps in the project.
Speaker Bio: Laura Darby Rose is Manager of Demand at Mallinckrodt Pharmaceuticals, where she is responsible for statistical forecasting, forecast visualization, and forecast accuracy measurement for the Specialty Generics division. She has a M.A. in Economics from the University of Missouri-St. Louis, and enjoys using R and SQL for time-series analysis, creating Shiny apps, and data wrangling/cleaning.
Are you a data scientist or data analyst working in supply chain management? Are you interested in joining a group of fellow practitioners and taking a leadership role in developing and promoting open source solutions in your field? Join us!
At this meetup, we also proposed the formation of a community working group focused on developing and popularizing open source solutions for data scientists and analysts working in supply chain management. We’d love to start by creating a home – initially a website – which hosts resources for supply chain data scientists. More to come. Form to be part of the working group: rstd.io/supply-chain-community-org
RStudio Cloud Demo with Dr. Mine Çetinkaya-Rundel
Much has been written in the statistics and data science education literature about pedagogical tools and approaches to provide a practical computational foundation for students. However, a common friction point for getting students (and faculty) started with computing is installation and setup. Circumventing the installation and setup steps early in the course by having students access R and RStudio in the cloud can minimize frustration and improve buy-in. RStudio Cloud is a lightweight solution to this problem that is easy to set up and use. In this talk we will discuss pedagogical reasons for teaching computing with R on the cloud as well as share best practices and tips for setting up your learners for success on RStudio Cloud. We will also provide an opportunity for the audience to experience computing in RStudio Cloud firsthand, demo its newest features, and highlight a suite of ready-to-use resources for teaching R to new learners.
Read more in the follow-up blog post: https://www.rstudio.com/blog/teaching-data-science-in-the-cloud/

An inclusive solution for teaching and learning R during the COVID pandemic
The COVID pandemic has shaken our teaching and learning approaches in many different ways all over the world.
Nonetheless, it has also provided opportunities for bringing creativity into the classroom.
In this talk, I will discuss how I have used RStudio Cloud in my teaching during the pandemic and how I capitalized on the opportunities that RStudio Cloud offers to deal with the crucial issues of software installation.
Introducing RStudio Cloud in the units has allowed me to work effectively in an online environment to engage, motivate and empower students through their learning process while removing the hurdles of software installation, which is particularly challenging for first-year cohorts without prior coding experience.
I used RStudio Cloud in a data science introductory unit at Monash University and as a tool to present the usage of R and RStudio for reproducible reporting in another unit on Reproducible and Collaborative Practises.
In the latter, I introduced RStudio Cloud during the first few weeks to get the students up to speed before transitioning to using R and RStudio Cloud on their own local machines while using the command line interface, Git, and GitHub as a version control tool for reproducible reporting.
I will also discuss how I organized and managed the unit’s RStudio Cloud account so that my research associates were also an integral part of the unit delivery to ensure the success of the units.
Read Dr. Menéndez’s guest post on the RStudio Blog: https://www.rstudio.com/blog/rstudio-cloud-an-inclusive-solution-for-learning-r/
Read more in the follow-up blog post: https://www.rstudio.com/blog/teaching-data-science-in-the-cloud/
Data Science Hangout | Ryan Garnett, Green Shield Canada | Getting People Excited about Open Roles
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Ryan Garnett, Manager Data Management Insights & Analytics at Green Shield Canada.
Here are a few snippets from our conversation: 1:31 - Start of session 12:00 - Tackling Challenges with 5 Questions 14:00 - Benefits of being vulnerable 16:00 - Making the case for hardware 17:45 - Breaking Down Problems to Prioritize Work 18:20 - When to make something a function 38:15 - Collaboration around code-review (what happens if one person gets tasked with this?) 48:13 - How do you get people excited about potential opportunities? 1:04:12 - Recruiting in the public sector
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu ► Add the Data Science Hangout to your calendar: https://www.addevent.com/event/Qv9211919
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstu… Twitter: https://twitter.com/rstudio
To Our Community: Thank You | RStudio Open Source (2021)
This year, among many challenges, we have been so grateful for our community who have continued to show up in so many different ways. As the year comes to a close, we wanted to say thank you. We know folks have adjusted where they spend their time and energy, and we just want to say thank you for continuing to be here and learn together, wherever you are right now.
Thank you to our maintainers. Our package maintainers not only contribute code, but also steward projects, welcome new contributors, and answer many questions. Thank you for continuing to set a positive tone and fostering a positive community for our contributors and users!
Thank you to our contributors. Package contributors contribute code, ideas, conversation, documentation, and tests. You are the ones trying things out, figuring out what’s wrong, and sharing ideas and fixes. Thank you for pushing our code to grow and evolve and to help ever more people!
Thank you to our educators. Educators work in classrooms, with friends and colleagues, and respond to email lists from someone around the world. Thank you for sharing your enthusiasm for data science with others and expanding the group of people who make sense of data with code!
Thank you to everyone who’s used R to solve a problem, create aRt, write a blog post, share insights or any of the thousands of ways we can create with R. Thank you to all of you who have contributed in some way, no matter how big or small. Without you, there would be no point to writing open source code. Thank you for all the work you do in the world, using data to improve our lives and our knowledge of the world in such incredibly diverse ways.
Music: Basketliner by Bitters, published on Blue Dot Sessions - https://app.sessions.blue/browse/track/81258
Editing & motion design: Jesse Mostipak
Data Science Hangout | Aliyah Wakil, Texas DSHS | Increasing Data Literacy & Moving to Leadership
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Aliyah Wakil, Epidemiology Team Lead at TX Department of State Health Services.
A few key snippets from our conversation: 01:34 - Start of session 3:51 - Strategies for bringing data science to everyone 6:55 - Creating communities for knowledge sharing 16:18 - Dealing with patient data across multiple data systems 22:20 - Increasing data literacy 26:54 - Tips for moving into leadership 29:28 - Transitioning into leadership from an individual contributor role 34:00 - People management vs. technical leadership discussion 57:22 - How to govern such a wide variety of tools and manage the code
Data Science Hangout | Jarus Singh, Pandora | Human in the Loop
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Jarus Singh, Director, Quantitative Analytics at Pandora.
A few key snippets from our conversation: 01:28 - Start of session 6:33 - Human in the loop 14:14 - Working with stakeholders and teaching communication 25:47 - What does your tech environment look like? 28:46 - Presenting work; pretty vs. impactful 33:18 - Skills for data science leadership 43:01 - Having data science rolling up to the CFO 51:41 - Getting motivated by personal projects 49:00 - Applying our hobbies to work: cooking 1:01:25 - Going with your passion
Data Science Hangout | Nate Kratzer, Brown-Forman | Focusing Tools on Adoption, BI Tools & Shiny
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Nate Kratzer, Data Science Manager at Brown-Forman.
A few key snippets from our conversation: 01:31 - Start of session 11:11 - The very first Shiny app we deployed 11:57 - How do you calculate the ROI of data science? 13:52 - What a data science tech stack in the liquor industry looks like 21:20 - Marketing data science to your colleagues vs them coming to you with projects 23:43 - How we think about Shiny and BI Tools 27:26 - Example that would need to be a data science visualization instead of BI dashboard 44:50 - Supporting both Python and R 39:58 - Specialized skills vs general skills
Data Science Hangout | Chase Carpenter, Chicago Cubs | Advice for Getting your First Job in Sports
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Chase Carpenter, Director of Strategy & Analytics at the Chicago Cubs.
The Cubs are also hiring for a Database Marketing Analyst: https://my1060wd.wd5.myworkdayjobs.com/en-US/Chicago_Cubs_FO/job/Chicago-Illinois/Analyst--Database-Marketing_R000445
A few key snippets from our conversation: 01:33 - Start of session 13:52 - Defining areas of analysis (scoping projects) 1. What’s the size of the prize? What’s the size of the problem? 2. Do we actually have data that can help understand or improve the problem? 3. When does that work need to be done? What’s the timing? 22:32 - How are models evaluated and over what time period 28:32 - What does the last mile look like? Delivering results back to the business 31:40 - Techniques for approaching ambiguity 36:28 - Finding quick wins to build relationships 41:49 - Perspective on getting your first job in sports analytics 50:15 - Adjusting sports models in a world after the pandemic 59:06 - Advice for aspiring data science leaders
Daniela Garcia & Julieta Nieva | R in Public Administration & R Markdown | RStudio
R in Public Administration & psychometric technique reports with R Markdown | RStudio
Thank you to Sergio Garcia Mora for hosting this meetup.
R in Public Administration: When Big Data Pushes You Out of Your Comfort Zone - Daniela Garcia
Abstract: We often lack the motivation to program because Excel solves everything easily. However, when my workplace asked us for detailed, up-to-date information on the positions and labor costs of every school in our province (more than 2,000), it was learn or die trying.
In this presentation, we will see how we used R to automate a report for government decision-making with great precision and little effort.
Bio: Daniela García holds a degree in Labor Relations and is a Data Analyst at the Dirección General de Personal - Ministerio de Hacienda y Finanzas, Corrientes, Argentina. Scrum Master at Starter - Consultora en Tecnología e Innovación. Active contributor to the Club de R para Recursos Humanos. https://www.linkedin.com/in/claudiadanielagarcia/
R Markdown Applied to Psychometric Technique Reports - Julieta Nieva
Abstract: Psychologists who conduct research can face limitations when giving personalized feedback to the volunteer participants in our projects. Being able to analyze large amounts of data, process them quickly, and save hours of manual work made R Markdown our ally. The presentation will explain how we use these tools at the International Cognitive Research Consortium and what our results were.
Bio: Julieta Nieva holds a degree in Psychology. UX Researcher at Banco Galicia. Researcher at ICRC. https://www.linkedin.com/in/julieta-nieva/
Leveraging the Cloud for Analytics Instruction at Scale: Challenges and Opportunities
Data science and programming languages like R and Python are some of the most in-demand skills in the world.
Students interested in analytics and professors facilitating curriculums deserve to use industry-leading tools to acquire these skills.
However, it’s challenging to enable this experience in an educational setting, especially at scale.
The traditional tools to facilitate learning analytics simply aren’t great. Students and professors often spend way too much time troubleshooting systems and software, things that are a complete waste of time and detract from the learning experience. Additionally, there are seemingly endless IT hurdles and requirements.
That’s part of the reason we created RStudio Cloud, a brilliantly simple but powerful solution for teaching and learning analytics, especially at scale. RStudio Cloud solves many of the technical and financial challenges associated with teaching analytics. It’s also a joy to use for professors, students, and IT administrators.
In this presentation, Dr. Brian Anderson will discuss the challenges and opportunities associated with leveraging the cloud to deliver analytics instruction at the undergraduate and graduate levels at scale.
Our hope is that you walk away inspired to think about ways you can leverage RStudio and the Cloud to enhance your students’ experiences with learning analytics.
Read more in the follow-up blog post: https://www.rstudio.com/blog/teaching-data-science-in-the-cloud/
Data Science Hangout | Merav Yuravlivker, Data Society | Getting People Invested in Data Science
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Merav Yuravlivker, Co-Founder and CEO of Data Society.
Data Society is a leading provider of customized data science training programs and AI/ML solutions for enterprise and government agencies (datasociety.com)
A few key snippets from our conversation: 01:13 - Start of session 7:33 - Ways to get people invested in data science internally 13:00 - Measuring and defining success with data science projects 15:50 - Growing engagement at internal community events / lunch & learns 32:38 - Building to serving predictive models 47:47 - Calming fears around the company “being behind” in using data 58:50 - Doing both the A & B of A/B Testing
Priyanka Gagneja | Exploratory Data Analysis | RStudio
Exploratory Data Analysis with: {DataExplorer}: https://github.com/boxuancui/DataExplorer {skimr}: https://github.com/ropensci/skimr {rpivotTable}: https://github.com/smartinsightsfromdata/rpivotTable {esquisse}: https://github.com/dreamRs/esquisse {chronicle}: https://github.com/pheymanss/chronicle and more.
With a plethora of packages that can help us with exploratory data analysis (EDA), it can be difficult to know where to start! Some packages are specific for a given data type, while others are more general.
In this talk, Priyanka Gagneja (github.com/priyankagagneja) will walk us through her EDA workflow and share a few of her favorite packages in doing so.
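As a rough illustration of what a first EDA pass with these packages can look like (a minimal sketch, not the workflow from the talk): the helper `eda_overview()` is a hypothetical base R function, and the built-in `airquality` data set is an assumption chosen only because it ships with R. The calls to `skimr::skim()` and `DataExplorer::introduce()` run only if those packages are installed.

```r
# Hypothetical helper: a base R overview of a data frame's shape,
# missing values per column, and column types.
eda_overview <- function(df) {
  list(
    dims    = dim(df),
    missing = colSums(is.na(df)),
    types   = vapply(df, function(col) class(col)[1], character(1))
  )
}

# `airquality` ships with R and contains missing values.
overview <- eda_overview(airquality)
print(overview)

# Richer summaries from two of the packages listed above, if available.
if (requireNamespace("skimr", quietly = TRUE)) {
  print(skimr::skim(airquality))
}
if (requireNamespace("DataExplorer", quietly = TRUE)) {
  print(DataExplorer::introduce(airquality))
}
```

Starting with a general-purpose summary like this makes it easier to decide which of the more specialized packages is worth reaching for next.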
Speaker Bio: Priyanka Gagneja is a Data Scientist/Freelance Analytics Professional. Priyanka graduated from Boston College with a Masters in Applied Economics and holds an MBA and Bachelor’s in Computer Science from India. She has experience providing supply chain-related solutions to retail clients.
Priyanka’s slides: https://github.com/priyankagagneja/Talks/blob/main/EDA_Workflow/EDA_workflow_Boston_useR_talk.pdf
James Blair | Using RStudio on Amazon SageMaker | RStudio
Using RStudio on Amazon SageMaker Presented by James Blair
*Please note that this meetup will share a new update about our professional product, RStudio Workbench. All are welcome; we just wanted to share that upfront :) https://www.rstudio.com/products/workbench/
Agenda: Presentation & Demo - 40 minutes Q&A with both RStudio & AWS SageMaker team - 20 minutes
Amazon SageMaker helps data scientists and developers to build, train, and deploy machine learning models quickly by bringing together a broad set of capabilities purpose-built for machine learning. RStudio recently announced the release of RStudio on Amazon SageMaker, developed in collaboration with the SageMaker team. This brings a fully-managed RStudio Workbench IDE into the powerful SageMaker environment.
With this functionality, Data Scientists can quickly get to work, spinning up their favorite development environment on SageMaker, choosing from a wide array of instance types as needed. They can get access to their organization’s data stored on AWS, as well as all of SageMaker’s deep learning capabilities.
As a fully managed offering on Amazon SageMaker, this release makes it easy for DevOps teams and IT Admins to administer, secure, and scale their organization’s centralized data science infrastructure, using familiar AWS tools and frameworks.
Here are James’ slides as well: https://github.com/blairj09-talks/rstudio-sagemaker-webinar Answers to the Q&A are shared in this blog post: https://www.rstudio.com/blog/using-rstudio-on-amazon-sagemaker-faq/
RStudio Team Deep Dive | In A Hosted Environment
You probably know that RStudio makes a free, open-source development environment for data scientists. It’s made with love and used by millions of people around the world.
What you might not know is that we also make a professional platform, called RStudio Team.
In this Live Session, Tom will walk you through our RStudio Team trial, where you can learn how to best test drive:
- Scaling your data science work
- Seamlessly managing open-source data science environments
- Automating repetitive tasks
- Rapidly sharing key insights and data science products securely with your entire organization
- Optionally, integrating some of your favorite open-source packages into the trial experience
Leading organizations like NASA, Janssen Pharmaceuticals, The World Health Organization, financial institutions, government agencies and insurance organizations around the globe use RStudio’s professional products to tackle world-changing problems and we’re inviting you to learn how. You’ll learn how RStudio Team gives professional data science teams superpowers, with all of the bells and whistles that enterprises need.
If you don’t have your own trial instance of RStudio Team to follow along (not required), feel free to request yours here: https://www.rstudio.com/products/team/evaluation3/
Additional resources here: https://docs.google.com/document/d/1HGt7LSohhyxpCvETvVEFHugrdaSnTcZaXbI0jV5g9ok/edit?usp=sharing
Leveraging R & Python in Tableau with RStudio Connect | James Blair | RStudio
Leveraging R & Python in Tableau with RStudio Connect Overview Demo / Q&A with James Blair
Tableau combines the ease of drag-and-drop visual analytics with an open, extensible platform. RStudio develops free and open tools for data science, including the world’s most popular IDE for R. RStudio also develops an enterprise-ready, modular data science platform to help data science teams using R and Python scale and share their work.
Now, with new functionality in RStudio Connect, users can have the best of both worlds. Tableau users can call R and Python APIs from Tableau calculated fields, getting access to all the power and analytic depth of these open-source data science ecosystems in real time.
For Tableau users, this makes it easy to add dynamic, advanced analytic features from R and Python to a Tableau dashboard, such as scoring predictive models on Tableau data. They can leverage all the great work done by their organization’s data science team and even call both R and Python APIs from a single dashboard.
Data science teams can continue to use the code-first development and deployment tools from RStudio that they know and love. Using these tools, they can build and share R APIs (using the plumber package) and Python APIs (using the FastAPI framework).
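To make the idea of an R API for model scoring concrete, here is a minimal sketch of a generic plumber endpoint. It is not the Tableau-specific integration, which the linked documentation covers. The model, the function names `score_mpg()` and `build_api()`, the `/predict` path, and the port are all assumptions made for illustration.

```r
# A small linear model standing in for a team's real model (assumption).
model <- lm(mpg ~ wt, data = mtcars)

# Score new observations; `wt` is car weight in 1000 lbs.
# Query parameters arrive as strings, so coerce to numeric first.
score_mpg <- function(wt) {
  wt <- as.numeric(wt)
  unname(predict(model, newdata = data.frame(wt = wt)))
}

# Wrap the scoring function in a plumber router with one GET endpoint.
build_api <- function() {
  stopifnot(requireNamespace("plumber", quietly = TRUE))
  plumber::pr_get(
    plumber::pr(),
    "/predict",
    function(wt) list(mpg = score_mpg(wt))
  )
}

# To serve locally (this blocks the R session until interrupted):
# plumber::pr_run(build_api(), port = 8000)
```

With the server running, a request such as `/predict?wt=3` would return the predicted value as JSON. Keeping the scoring logic in a plain function separate from the router makes it easy to test without starting a server.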
Speaker Bio: James is a Solutions Engineer at RStudio, where he focuses on helping RStudio commercial customers successfully manage RStudio products. He is passionate about connecting R to other toolchains through tools like ODBC and APIs. He has a background in statistics and data science and finds any excuse he can to write R code.
A few other helpful links: Tableau Integration Documentation: https://docs.rstudio.com/rsc/integration/tableau/
Tableau / RStudio Connect Blog Post: https://blog.rstudio.com/2021/10/12/rstudio-connect-2021-09-0-tableau-analytics-extensions/
Embedding Shiny Apps in Tableau using shinytableau blog: https://blog.rstudio.com/2021/10/21/embedding-shiny-apps-in-tableau-dashboards-using-shinytableau/
James’ slides: https://github.com/blairj09-talks/rstudio-tableau-webinar
Building R packages with devtools and usethis | RStudio
Package building doesn’t have to be scary! The tidyverse team has made it easy to get started with RStudio and the devtools/usethis packages. This hour-long presentation will walk you through the basics of R package building, and hopefully leave you prepared to go out and build your own package!
Slides: https://colorado.rstudio.com/rsc/pkg-building/ Source Code: https://github.com/jthomasmock/pkg-building
devtools: https://devtools.r-lib.org/ usethis: https://usethis.r-lib.org/ R Packages book: https://r-pkgs.org/index.html
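For readers who want a feel for the workflow before watching, the following is a minimal sketch of a first package-building session, assuming devtools and usethis are installed. The package name `greeter`, the function `greet()`, and the wrapper `scaffold_package()` are hypothetical and are not taken from the presentation.

```r
# Sketch: scaffold a tiny package in a temporary directory.
scaffold_package <- function(parent_dir = tempdir()) {
  stopifnot(
    requireNamespace("usethis", quietly = TRUE),
    requireNamespace("devtools", quietly = TRUE)
  )
  pkg_path <- file.path(parent_dir, "greeter")

  # Create the skeleton (DESCRIPTION, NAMESPACE, R/) without
  # opening a new RStudio session.
  usethis::create_package(pkg_path, open = FALSE)

  # Write one exported function with roxygen2 documentation comments.
  writeLines(c(
    "#' Greet someone by name",
    "#' @param name A character string.",
    "#' @return A greeting string.",
    "#' @export",
    "greet <- function(name) paste0(\"Hello, \", name, \"!\")"
  ), file.path(pkg_path, "R", "greet.R"))

  devtools::document(pkg_path)  # build man/ pages and NAMESPACE from roxygen comments
  devtools::load_all(pkg_path)  # load the package for interactive use
  invisible(pkg_path)
}

# Usage (creates files under tempdir()):
# pkg <- scaffold_package()
# greet("R")
# devtools::check(pkg)  # run R CMD check on the new package
```

In an interactive session the same steps are usually run one at a time from inside the package project, with helpers such as `usethis::use_r()` creating the source files.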
Data Science Hangout | Bryan Butler, Eastern Bank | Using the Best Tool for the Job
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Bryan Butler, VP Business Insights & Analytics at Eastern Bank.
Here are a few snippets from our conversation: 1:17 - Start of session 9:18 - Using the best tool for the job - don’t box yourself in with one tool 10:38 - NLP Use Case - Solving somebody’s problem, not just using the newest tools 14:44 - Implement incremental change, not a revolution 16:32 - What is the timeline for a data science team to implement change? 30:00 - The leadership traits not talked about 32:51 - How to improve leadership skills 1:12:00 - Get leadership to pay attention with good storytelling
Data Science Hangout | Jacqueline Nolis, Saturn Cloud | Structuring Teams to Empower the Business
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Jacqueline Nolis, Head of Data Science at Saturn Cloud.
Here are a few snippets from our conversation: 1:17 - Start of session 5:22 - Centralized vs. decentralized Data Science team 12:21 - Using one language or many 25:04 - How to structure internal documentation or communication channels to empower the business 32:36 - Talking to executives 41:45 - Difference between data scientist and data analyst 43:33 - What do you recommend as far as the types of roles in a data science team? 49:06 - How do you answer, why should companies use data science?
Data Science Hangout | Joel Pepera, GEICO | Fundamentals of Data Strategy & Data Science Maturity
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Joel Pepera, Director of Data Science, GEICO.
Here are a few snippets from our conversation:
1:20 - Start of session 18:00 - How the pandemic impacted data science 24:05 - Considering how our models are going to be used 30:45 - What should we focus on first for building out a data science team? (Fundamentals) 37:20 - How do you know your machine learning model is working? 51:30 - Capturing when predictions are used to make decisions 58:32 - Advice to aspiring data science leaders (pay attention to the business) 1:01:25 - Good data science leaders understand how to make good decisions
Nathan Stephens | Scaling Spreadsheets with R | RStudio
Scaling Spreadsheets with R Presented by Nathan Stephens
Excel (and spreadsheets in general) is a powerful tool for analyzing and visualizing data, especially when your data is small and the complexity of your work is manageable.
What happens when your workbook grows larger and more complex?
What do we mean by “too large”? • Too many rows - 1,048,576 • Too many columns - 16,384 • Too many workbooks - 255 • Some combination thereof • Workbooks are difficult to share because the file size is so large
What do we mean by “more complex”? • The workbook comes from many data files or data sources • The workbook requires functions that are not built-in • The workbook contains a lot of VBA code • The workbook contains interactive elements • The workbook requires many user inputs • Many collaborators are working on the same workbook
In this webinar, we will discuss the benefits of using the R programming language when Excel gets overloaded. We will demonstrate how to use R in two situations where traditional Excel workbooks often get overloaded:
(1) Reading and analyzing data files that contain millions of records (2) Automating interactive dashboards that are built on multiple data files
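The first situation can be sketched in base R as follows. This is an illustrative example rather than the webinar's own code: the simulated `sales` data, the helper `fits_in_worksheet()`, and the column names are assumptions, while the row and column limits are the ones quoted above.

```r
# Worksheet limits quoted in the description above.
excel_max_rows <- 1048576
excel_max_cols <- 16384

# Hypothetical helper: would a table of this size fit in one worksheet?
fits_in_worksheet <- function(n_rows, n_cols) {
  n_rows <= excel_max_rows && n_cols <= excel_max_cols
}

# Simulate a data file with more records than a worksheet can hold.
set.seed(1)
n <- 1200000
sales <- data.frame(
  region = sample(c("North", "South", "East", "West"), n, replace = TRUE),
  amount = round(runif(n, 1, 100), 2)
)
csv_path <- tempfile(fileext = ".csv")
write.csv(sales, csv_path, row.names = FALSE)

# R reads the whole file into memory and summarizes it in one step.
records <- read.csv(csv_path)
fits_in_worksheet(nrow(records), ncol(records))  # FALSE: too many rows
totals <- aggregate(amount ~ region, data = records, FUN = sum)
print(totals)
```

For files much larger than this, faster readers such as `data.table::fread()` or `readr::read_csv()` are common choices, at the cost of an extra package dependency.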
Nathan’s slides and examples here: https://github.com/sol-eng/babynamesByState/tree/master/presentation
If you’d like to join future meetups, join this group here: https://www.meetup.com/RStudio-Enterprise-Community-Meetup/
Michelle Brandão | R in Sports Analytics - Intro to GitHub Actions | RStudio
Introduction to GitHub Actions Presentation by Michelle Brandao, Data Analyst at FanDuel
GitHub Actions is a popular tool to automate software tasks, helping you integrate your analytics and ML workflows with CI/CD principles.
Michelle is a Data Analyst at FanDuel. Prior to that, she played basketball in college at Old Dominion University, then professionally in Europe and represented Portugal in international competitions. Following her professional career, she pursued a master’s degree in Sports Management with a concentration in Business Analytics and joined the basketball coaching staff as a graduate assistant.
For future meetups, check out this group: https://www.meetup.com/RStudio-Enterprise-Community-Meetup
Data Science Hangout | Sandy Steiger, J.M. Smucker Co. | Setting Expectations for Your Team
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Sandy Steiger, Director, Integrated Analytics, Pet at The J.M. Smucker Co.
Here are a few snippets from our conversation: 1:06 - Start of session 5:29 - Transitioning to the mindset, “Data science leader for the people” 10:48 - Setting expectations effectively on a data science team (Turning point as a people leader) 20:11 - Developing data science talent profiles 28:21 - Getting data scientists more exposure across the company 30:00 - How to encourage data scientists to be more comfortable sharing their work 35:05 - Taking up your space, being confident 57:39 - Create better team environments to challenge each other
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstudio-pbc Twitter: https://twitter.com/rstudio
Community Conversation: Hiring Great Data Science Teams
At RStudio, we believe open source, code-first data science frameworks based on R and Python, centralized on-prem or in the cloud for ease of collaboration, sharing and administration, help data science teams be successful. However, finding and retaining the right people is just as key for building a productive, effective, and collaborative data science team.
In this panel webinar, you will hear from leaders at RStudio, Boeing, Pandora, Beam Dental, & Amwell, about what it takes to hire a successful data science team:
- Their experiences and perspective on hiring to build great data science teams
- How their teams prioritize skills and experience in new positions
- How they think about skill assessment
- Their advice to data scientists just getting started
As a company that champions data science, we like to make data-driven decisions, specifically around community events like this. If you could take just a few minutes to provide your feedback and what you’d like to see more of, we would really appreciate it: https://forms.gle/zzYMQy4CjUiEedVaA
If you enjoyed this, you might want to attend our Data Science Hangouts. Every Thursday at 12:00 noon ET, current and aspiring data science leaders come together on an open zoom call to casually discuss data science, no registration required. Add the Data Science Hangout to my weekly calendar: https://www.addevent.com/event/Qv9211919
Data Science Hangout | Óli Páll Geirsson, City of Reykjavik | Data Science is More About People
We want to help data science leaders become better.
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Óli Páll Geirsson, Chief Data Officer at the City of Reykjavik.
9:44 - Providing value to stakeholders in data science
12:35 - Why data science is more about the people
15:06 - Communication with stakeholders = Crucial
17:21 - The value of building up your data science team
18:08 - Why you need a diverse data science team
20:15 - More efficient data science teams by breaking down goals
37:46 - Active listening to better identify needs for your data science projects
43:55 - Prioritizing the projects your data science team works on
49:37 - The best way to approach key stakeholders
1:07:06 - The importance of visualizing data products
RStudio Team Demo | Build & Share Data Products Like The World’s Leading Companies
You probably know that RStudio makes a free, open-source development environment for data scientists. It’s made with love and used by millions of people around the world.
What you might not know is that we also make a professional platform, called RStudio Team.
Learn How RStudio Team Can…
- Help you scale your data science work
- Seamlessly manage open-source data science environments
- Automate repetitive tasks
- And, rapidly share key insights and data science products securely to your entire organization.
Timecodes
0:00 - Intro
4:18 - Hard truth of data science
10:22 - Serious Data Science
16:46 - Model management with R and Python
18:48 - Live Demo / RStudio Workbench
23:09 - RStudio support for Jupyter Notebooks
24:40 - Live Demo / RStudio Connect
28:01 - RStudio support for VS Code
30:05 - R and Python within RStudio
32:33 - Scale and share data science results
36:55 - Sharing previous versions of presentations
38:16 - Data Science team knowledge sharing
40:36 - Scheduling and emailing data science content
43:55 - Live demo / RStudio Package Manager
48:09 - Data Science stories
49:37 - RStudio Team
52:59 - What makes RStudio different?
55:12 - Q/A - Learn More
Leading organizations like NASA, Janssen Pharmaceuticals, The World Health Organization, financial institutions, government agencies and insurance organizations around the globe use RStudio’s professional products to tackle world-changing problems and we’re inviting you to learn how. You’ll learn how RStudio Team gives professional data science teams superpowers, with all of the bells and whistles that enterprises need.
You can try RStudio Team free here: https://www.rstudio.com/products/team/evaluation2/
If you’d like to access presentation slides, sign up for future events, provide feedback and/or ask additional questions we’ve bundled everything together for you here: https://docs.google.com/document/d/1HGt7LSohhyxpCvETvVEFHugrdaSnTcZaXbI0jV5g9ok/edit?usp=sharing
Christophe Dervieux | Business Reports with R Markdown | RStudio
Business Reports with R Markdown Presentation by Christophe Dervieux
As you share your analysis with business stakeholders across the company, how do you create the look and feel that they expect? This talk will cover theming of reports to fit with graphical guidelines: what can be done already, how that works with PowerPoint and Word, pagedown as another way to generate well-designed PDFs, and also how we plan to improve.
Christophe is a software engineer at RStudio. With an engineering background in economics and energy, he discovered R as an analyst and then became a passionate R admin, supporting all R users and usage at his previous company. A longtime contributor to open-source R packages, he now works with Yihui to improve the publication and reproducibility ecosystem around R Markdown.
Christophe’s repo is here: https://github.com/cderv/meetup-2021-rmd-business-report Slides are here: https://meetup-rmd-style-business-report.meetup-rmd-style-business-report.netlify.app/

Data Science Hangout | Sep Dadsetan at ConcertAI | Infrastructure that Encourages Reproducibility
We want to help data science leaders become better.
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Sep Dadsetan, Executive Director, RWE Analytics at ConcertAI.
01:35 - Start of session
08:20 - Gaps in the data science space
12:13 - How to show key stakeholders the value of data science
23:23 - Setting up the infrastructure to encourage reproducibility
29:38 - The importance of architecture and infrastructure to data quality
35:34 - Adopting formalized processes over individual processes
Gordon Shotwell | Socure | Creating Secure Systems for Growth | Posit
Setting Up Secure Systems for Growth Presentation by Gordon Shotwell, Lead Data Scientist at Socure
Abstract: One of the main challenges of doing data science is getting access to the data in the first place. Data scientists need to be able to look at data in detail to do their work effectively, but that necessarily creates data security and data governance problems for the organization and its clients. In this presentation, we go through the social and technical processes that can create a secure data analytics environment and set your team up for success.
Speaker Bio: Gordon Shotwell is a Lead Data Scientist at Socure where he helps develop tools for data scientists to securely and efficiently work on sensitive data.
They’re hiring too! https://www.socure.com/about/careers
Timestamps:
09:45 - Make friends with the security team in your org
19:40 - Set up child-proof [data science environments]
22:11 - Developer experience is a security problem - people will do the easy thing, make the right thing easy
35:34 - Buy [tools] don’t build them - economic lesson in understanding cost centers
Read the related blog post here: https://blog.rstudio.com/2021/10/26/how-data-scientists-and-security-teams-can-work-together/
GA Tech || “Communicating with 8 Million People through Shiny” || Posit
Georgia Institute of Technology faculty, scientists, GIS specialists, and graduate students launched a tool that provided real-time, localized information on the estimated risk of COVID-19 exposure by attending an event.
In talking with their team, it was clear that their empathetic view of the audience and their focus on communication helped them share their insights with event planners, policy makers, news outlets, and individuals, ultimately reaching over 8 million unique users around the world.
Lessons learned from the GA Tech team about their experience building the COVID-19 Event Risk Assessment Planning Tool can apply to visualizations across many different industries and use cases - whether you are communicating to a handful of executives at your company or out to the world:
- As a Shiny developer, make sure that you have a specific question in mind. What problem is your app helping your audience solve? Think about who your audience is going to be and what they would use the app for. For the COVID-19 Event Risk Assessment Planning Tool, this question was “what is the risk level of attending an event, given the event size and location?”
- View your audience through a lens of empathy. Think about metrics that people can really get a grip on and visualize. For example, communicate the risk of attending a local event with 100 people in your own town rather than cases per 100,000 people. If you want to communicate something that’s critical to the public, put it in the right terms.
- Balance the straightforwardness of your visualization. You don’t have to anticipate every single question. With every feature or piece of information included, ask yourself whether it supports your overall point.
- Keep the lines of communication open with your users. If you share a visualization, make sure that people have a clear way to contact you (email, Twitter, LinkedIn) with questions or feedback. Their team made an intentional effort to be available, particularly for local news. They were responsive to the kinds of decisions people were making and adjusted the app to match their needs, for example with event sizes.
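To make the second lesson concrete, here is a minimal Python sketch of how a rate such as "cases per 100,000 people" can be turned into the more relatable "chance that at least one attendee at an event of a given size is infected". It is an illustration under stated assumptions, not the GA Tech team's implementation: the function name `event_risk` is hypothetical, attendees are treated as independent draws from the local population, and the sketch ignores adjustments a real tool might make, such as correcting for under-reported cases.

```python
def event_risk(cases_per_100k: float, event_size: int) -> float:
    """Probability that at least one attendee is infected.

    Assumes (for illustration only) that each attendee is independently
    infected with probability p = cases_per_100k / 100_000.
    """
    if not 0 <= cases_per_100k <= 100_000:
        raise ValueError("cases_per_100k must be between 0 and 100,000")
    if event_size < 0:
        raise ValueError("event_size must be non-negative")

    p = cases_per_100k / 100_000
    # Chance that nobody is infected is (1 - p) ** n; take the complement.
    return 1 - (1 - p) ** event_size


if __name__ == "__main__":
    # Hypothetical inputs: 500 active cases per 100,000 people, 100 attendees.
    risk = event_risk(cases_per_100k=500, event_size=100)
    print(f"Chance at least one attendee is infected: {risk:.1%}")
```

With the hypothetical inputs above, a rate of 500 per 100,000 (0.5%) becomes a risk of roughly 39% for a 100-person event, which is the kind of figure an event planner can act on more easily than the raw rate.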
Check out their Shiny application here: https://covid19risk.biosci.gatech.edu/ Read the related blog post here: https://blog.rstudio.com/2021/09/14/how-do-you-use-shiny-to-communicate-to-8-million-people/
R Markdown Advanced Tips to Become a Better Data Scientist & RStudio Connect | With Tom Mock
R Markdown is an incredible tool for being a more effective data scientist. It lets you share insights in ways that delight end users.
In this presentation, Tom Mock will teach you some advanced tips that will let you get the most out of R Markdown. Additionally, RStudio Connect will be highlighted, specifically how it works wonderfully with tools like R Markdown.
Please provide feedback: https://docs.google.com/forms/d/e/1FAIpQLSdOwz3yJluPR2fEqE0hBt92NtKZzzNACR8KJhHUt9rhFj3HqA/viewform?usp=sf_link
More resources if you’re interested: https://docs.google.com/document/d/1VKGs1G9GcQcv4pCYFbK68_LDh72ODiZsIxXLN0z-zD4/edit
04:15 Literate Programming
09:00 - RStudio Visual Editor Demo
15:44 - R and Python in same document via {reticulate}
18:10 - Q&A: Options for collaborative editing (version control, shared drive etc.)
19:30 - Q&A: Multi-pane support in RStudio
20:46 Data Product (reports, presentations, dashboards, websites etc.)
24:15 - Distill article
26:27 - Xaringan presentation (add three dashes --- for new slide)
28:58 - Flexdashboard (with shiny)
30:30 - Crosstalk (talk between different html widgets instead of {shiny} server)
35:03 - Q&A: Jobs panel – parallelise render jobs in background
36:50 - Q&A: various data product packages, formats
39:35 Control Document (modularise data science tasks, control code flow)
39:58 - Knit with Parameters (YAML params: option)
41:20 - Reference named chunks from .R files (knitr::read_chunk())
43:00 - Child Documents (reuse content, conditional inclusion, {blastula} email)
47:07 Templating (don’t repeat yourself)
47:38 - rmarkdown::render() with params, looping through different param combinations
49:30 - Loop templates within a single document
50:40 - 04-templating/ live code demo
54:37 - {whisker} vs {glue} – {{logic-less}} vs {logic templating}
55:30 - {whisker} for generating markdown files that you can continue editing
57:49 RMarkdown + RStudio Connect
1:00:41 Follow-up Reading and resources
1:04:49 Q&A - {shiny} apps, {webshot2} for screenshots of html, reading in multiple .R files, best practice for producing MS Office files, {blastula}
Data Science Hangout | Moody Hadi at S&P Global | Unlocking Business Value with Data Science
We want to help data science leaders become better.
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Moody Hadi, Manager of New Product Development and Financial Engineering at S&P Intelligence.
How to Prioritize Projects | Data Science Hangout Highlights
RStudio is joined by Moody Hadi, Manager of New Product Development and Financial Engineering at S&P Intelligence, to discuss how data scientists can become leaders within their organizations.
Dr. Julia Silge | RStudio Voices | RStudio
Julia Silge recently sat down with Michael Demsko Jr for an interview, the first in a new Voices of RStudio PBC series.
In this excerpt, Julia discusses where she sees the most value created in the data science lifecycle–and it’s not advanced machine learning models.
Read the full interview at https://blog.rstudio.com/tags/rstudio-voices/

The Importance of Understanding Your Business Users | Data Science Hangout Highlights
RStudio is joined by Tori Oblad, Data Officer at WaFd bank, to discuss how data scientists can become leaders within their organizations.
Measuring the Impact of Data Science | Data Science Hangout Highlights
RStudio is joined by Frank Corrigan, Director of Decision Intelligence, to discuss how data scientists can become leaders within their organizations.
Watch the full recording: https://www.youtube.com/watch?v=KBs4b3Q2n8Y
Kelly O’Briant | Build Your Ideal Showcase of Data Products | RStudio Connect 1.9.0
Posit Connect 1.9.0 - Content Curation Tools Overview Demo / Q&A with Kelly O’Briant
As publishers add more content to Posit Connect, content organization, distribution, and discovery can become a challenge. Distributing individual links to all your most important content is tiresome, and the default Connect dashboard contains more information than end-users often want or need.
The new 1.9.0 release of Posit Connect introduces tools for addressing these common content curation concerns:
How do you make sure your audience finds what they need on Posit Connect without paging through the dashboard, remembering the right search terms, or bookmarking every content item you share?
After deploying many pieces of related content, how do you share them as a cohesive project?
You might be interested in these content curation tools if you’ve ever wanted to create:
- A summary/reference page for a complex project.
- A content hub or knowledge repository for work belonging to a team or objective.
- A customized entry point into Posit Connect for stakeholders.
- A presentation layer for any curated list of notable content items.
For more information on Posit Connect: https://www.rstudio.com/products/connect/ If you’d like to do an evaluation: https://www.rstudio.com/products/connect/evaluation/ Blog Post: https://blog.rstudio.com/2021/07/29/rstudio-connect-1-9-0/
Make Sure You Communicate Value | Data Science Hangout Highlights
RStudio is joined by Tori Oblad, Data Officer at WaFd bank, to discuss how data scientists can become leaders within their organizations.
Teaching and learning with RStudio Cloud | RStudio
Learn about RStudio Cloud and its most recent developments, particularly with respect to teaching with it.
Slides are posted at https://rstd.io/tl-rscloud .
ABOUT RSTUDIO CLOUD: RStudio Cloud is a lightweight, cloud-based solution that allows anyone to do, share, teach and learn data science online.
Analyze your data using the RStudio IDE, directly from your browser. Share projects with your team, class, workshop or the world. Teach data science with R to your students or colleagues. Learn data science in an instructor-led environment or with interactive tutorials.
There is nothing to configure and no dedicated hardware, installation or annual purchase contract required. Individual users, instructors and students only need a browser to do, share, teach and learn data science.
We will always offer a free plan for casual, individual use, and we now offer paid premium plans for professionals, instructors, researchers, and organizations.
RSTUDIO CLOUD RESOURCES: RStudio Cloud https://rstudio.cloud RStudio Cloud Pricing plans https://rstudio.cloud/plans/instructor RStudio Cloud guide https://rstudio.cloud/learn/guide {rscloud} https://github.com/rstudio/rscloud
ABOUT RSTUDIO: RStudio’s mission is to create free and open-source software for data science, scientific research, and technical communication to enhance the production and consumption of knowledge by everyone, regardless of economic means, and to facilitate collaboration and reproducible research, both of which are critical to the integrity and efficacy of work across industries.
RStudio also produces RStudio Team, a modular platform of commercial software products that give organizations the confidence to adopt R, Python and other open-source data science software at scale, along with online services to make it easier to learn and use them over the web.
Together, RStudio’s open-source software and commercial software form a virtuous cycle: the adoption of open-source data science software at scale in organizations creates demand for RStudio’s commercial software; and the revenue from commercial software, in turn, enables deeper investment in the open-source software that benefits everyone. Check out www.rstudio.com
Follow us on Twitter: https://twitter.com/rstudio
Facebook: https://www.facebook.com/rstudiopbc/
And LinkedIn: https://www.linkedin.com/company/rstudio-pbc/
Data Science Hangout | Frank Corrigan, Target | Understanding the Impact of Data Science
We want to help data science leaders become better.
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Frank Corrigan, Director of Decision Intelligence at Target.
6:38 - Problem formation - trying to find the unknown unknowns and bringing them to the business
8:20 - Integrating your data science team into your company’s business objectives (ex. newsletter)
10:50 - What is the divide between a business analyst and a data scientist, in your eyes?
15:32 - What is the biggest mistake you’ve made in your role and what did you learn from this mistake?
19:15 - Onboarding new team members effectively
25:00 - The importance of motivating non-data scientists
26:42 - Resources for data scientists
32:30 - Challenges when using different tools across a data science team
49:28 - Analytical thinking vs critical thinking skills
57:30 - Embracing the 80/20 Rule & the importance of Focus Time
1:02:48 - Two frameworks to be more effective with stakeholders
1:06:47 - Rebranding to “decision intelligence”
1:08:35 - Quantitatively measuring impact from data science insights
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu ► Join the Data Science Hangout Live every Thursday from 12-1 ET: https://www.addevent.com/event/Qv9211919
Data Scientists vs. Business Analysts | Data Science Hangout Highlights
RStudio is joined by Frank Corrigan, Director of Decision Intelligence, to discuss how data scientists can become leaders within their organizations.
Posit Cloud | Creating a Shared Space | Instructor View
A shared space is an area where a group of people can collaborate. Only the members of a shared space can access the space and its contents.
To create a shared space, go to the navigation sidebar (click the menu icon at the upper left if needed) and choose New Space, then follow the on-screen instructions.
VIDEO CREDITS: Monitor icon made by xnimrodx from flaticon.com Cloud icon made by Freepik from flaticon.com Tiny Putty Music from Blue Dot Sessions: https://app.sessions.blue/browse/track/52046
RStudio Cloud | {rscloud} Package | Instructor View
You can access RStudio Cloud’s API to manage space members programmatically using the rscloud package.
You will need to create client credentials to use the package. To do so, click on your icon/name in the header to reveal the User panel, then click on Credentials. This will take you to the Credentials page of RStudio User Settings, where you can create and manage your client credentials.
{rscloud} package repo: https://github.com/rstudio/rscloud
RStudio Cloud | Creating an Assignment | Instructor View
A good way to create assignments for your students is to make a project for each assignment, and populate it with the files and packages you would like each student to have when they begin the assignment. When you are ready to reveal an assignment to your students, open the project and do the following:
- Click on the Project Settings button (the gear in the upper right)
- Select the Access panel
- Set the project access so it can be viewed by everyone in the space
- Check “Make this project an assignment”
When the student opens an assignment project (either via the projects listing, or via a direct link to the project), RStudio Cloud will automatically make a copy for them. To convey this special behavior, assignments are displayed a bit differently, both for you and for your students.
You can see the students’ copies of the assignment by clicking on the “View n derived projects” link that will appear with your original project.
Note that changes you make to the original assignment will not be applied to any student copies already created. The changes will only apply to future copies of the assignment.
RStudio Cloud | Exporting Projects | Instructor View
The contents of a project can be downloaded without opening the project.
Press the Export action to the right of the project that you wish to download.
A dialog box will appear showing the progress of your export. The process can take anywhere from a few seconds to a couple of minutes, depending on the size of the project, and how recently it was opened.
Once the export is complete, press the Download button in the dialog box to download your project.
ABOUT RSTUDIO CLOUD: RStudio Cloud is a lightweight, cloud-based solution that allows anyone to do, share, teach and learn data science online.
Analyze your data using the RStudio IDE, directly from your browser. Share projects with your team, class, workshop or the world. Teach data science with R to your students or colleagues. Learn data science in an instructor-led environment or with interactive tutorials.
There is nothing to configure and no dedicated hardware, installation or annual purchase contract required. Individual users, instructors and students only need a browser to do, share, teach and learn data science.
We will always offer a free plan for casual, individual use, and we now offer paid premium plans for professionals, instructors, researchers, and organizations.
RSTUDIO CLOUD RESOURCES:
- RStudio Cloud: https://rstudio.cloud
- RStudio Cloud Pricing plans: https://rstudio.cloud/plans/instructor
- RStudio Cloud guide: https://rstudio.cloud/learn/guide
- {rscloud}: https://github.com/rstudio/rscloud
VIDEO CREDITS:
- Monitor icon made by xnimrodx from flaticon.com
- Cloud icon made by Freepik from flaticon.com
- Tiny Putty Music from Blue Dot Sessions: https://app.sessions.blue/browse/track/52046
RStudio Cloud | Invite Learners | Instructor View
As you did to invite your co-instructors, go to the space’s Members area to invite your students to your course space. The easiest approach is to enable access via a sharing link.
- Click on the Sharing link option in the Access section
- Set the Initial Role to Contributor
- Click on the Copy Sharing Link action
You can then distribute the sharing link to all your students via a course web page or email. Once all your students have joined the space, you can either reset the sharing link (which will disable access via the previous sharing link), or set Access to “Invitation Required” to ensure that nobody joins the class later without your permission. See the Members section in Shared Spaces above for more details and alternate methods for adding students to your space.
Note that each student must have their own RStudio Cloud account. When they attempt to access your course space for the first time, they will be prompted to log in, or to sign up for an account if they don’t already have one. If your organization is using SSO, account creation will happen automatically when they log in the first time.
At this point, you and your students should be all set.
RStudio Cloud | Inviting Co-instructors and Teaching Assistants | Instructor View
After creating a space, go to the Members area to manage its membership. You can add members to your space in three ways:
- To add members one by one, choose the Invitation Required option and send an invitation to each person you’d like to add via the Add Member button. For security reasons, an invitation is valid for 7 days from the time it is sent.
- To allow many people to join the space, choose the Sharing Link option, copy the sharing link, and share it with everyone you’d like to join, either by email or by posting it on a web page. Keep in mind that anyone with the link will be able to join your space. To disable a link you previously shared, choose Reset Sharing Link; this prevents any additional people from joining the space using that link.
- Space membership can also be managed programmatically. See API Access from R in the Advanced Topics section below for details.
Note that you can switch the access option at any time. A common approach is to set the access option to Sharing Link, post the sharing link, and then, once everyone you initially want in the space has joined, switch access to Invitation Required. The original sharing link will no longer allow additional people to join the space, but you can still add new members individually.
RStudio Cloud | Overview | Instructor View
RStudio Cloud | Project Types | Instructor View
A project is the fundamental unit of work on RStudio Cloud. It encapsulates your R code, packages and data files and provides isolation from other analyses. If you are familiar with projects in the desktop RStudio IDE, an RStudio Cloud project is the same thing, plus some additional metadata for access and sharing.
To create a new project from scratch, simply press the New Project button from the Projects area. Your new project will open in the RStudio IDE.
To create a new project from an existing git repository, press the down arrow on the right side of the New Project button, and choose ‘New Project from Git Repo’ from the menu that appears. Note that your git credentials need to be entered each time you create a new project and are only cached for 15 minutes by default.
To create a Jupyter Project, press the down arrow on the right side of the New Project button, and choose ‘New Jupyter Project’ from the menu that appears.
A new Jupyter project will be created and deployed. Once deployed, you will see the Jupyter hub tree view with a welcome.ipynb notebook that contains information about getting started with Jupyter.
RStudio Cloud | Reusing a Workspace | Instructor View
One approach to reusing a space is to remove the current cohort of students from the space, and reuse the same space with the next cohort. If you are an Admin of a space, you can remove any member from that space.
Go to the Members area of the space. For the member you would like to remove, press the Delete icon. You will be prompted to confirm that you would like to remove them from the space, and to choose whether to leave their projects behind in the space, or move them to their personal space.
You can also remove members programmatically via the Cloud API using the rscloud package.
Another option, if you’d like to re-use materials from a space (e.g. to teach a new cohort of students), is to create a new space from your current course space using the Copy Space action. All projects shared with everyone in the original space will be copied over to the new space. Members of the original space are not copied over; you will be the only initial member of the new space.
RStudio Cloud | Roles | Instructor View
Each member of a space is assigned a role to determine what they are able to do within the space. The available roles are:
Admin: can manage users, view, edit and manage all projects and view space usage data.
Moderator: can view, edit and manage all projects, and view space usage data.
Contributor: can create, edit and manage their own projects. This is the default.
Viewer: can view projects shared with everyone in the space. A viewer cannot create or save copies of projects.
Members are assigned an initial role when they are invited to or join a space, but roles can be changed by an Admin at any time. When you invite an individual member to an Invitation Required space, you set the initial role in the Add Member dialog box. When someone joins via a sharing link, their initial role is set to the current Initial Role setting in the members options panel. Admins can update a user’s role via the role selector in the members list.
Changing permissions lets you fine-tune the Contributor and Viewer roles. The permissions are:
Contributors can see the members list: enables Contributors to see who can access the space.
Contributors can make their projects visible to all members: enables Contributors to share their projects with everyone in the space.
Contributors can change project resources: enables Contributors to change RAM and CPU settings on their projects.
Viewers can see the members list: enables Viewers to see who can access the space.
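The role descriptions above can be read as a simple role-to-capability mapping. The following Python sketch is only an illustration of that reading: the capability labels (such as `manage_users`) and the `can` helper are hypothetical names chosen here, not part of RStudio Cloud or its API, and the optional Contributor and Viewer permissions are left out.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "Admin"
    MODERATOR = "Moderator"
    CONTRIBUTOR = "Contributor"  # the default role
    VIEWER = "Viewer"


# Capabilities taken from the role descriptions above.
# The label strings themselves are hypothetical, for illustration only.
ROLE_CAPABILITIES = {
    Role.ADMIN: {"manage_users", "manage_all_projects", "view_usage_data"},
    Role.MODERATOR: {"manage_all_projects", "view_usage_data"},
    Role.CONTRIBUTOR: {"manage_own_projects"},
    Role.VIEWER: {"view_shared_projects"},
}


def can(role: Role, capability: str) -> bool:
    """Return True if the given role includes the given capability."""
    return capability in ROLE_CAPABILITIES[role]


print(can(Role.ADMIN, "manage_users"))      # True
print(can(Role.MODERATOR, "manage_users"))  # False
```

Seen this way, a Moderator has the same capabilities as an Admin except for managing users.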
RStudio Cloud | Setting Up a Base Project | Instructor View
You can make all projects in a space begin with a default set of files and packages. You do this by defining a Base Project for the space.
Create a new project and add any packages or files you want all projects created in the space to start with. Set the project’s access so that everyone in the space can view the project.
Go to the Space Settings page and select the project as the Base Project. Once you select a project as your Base Project, it will no longer be included in the projects listing for the space. To access it, choose the Edit command from the Space Settings page.
Changes to the Base Project are not retroactive. Changes will not be applied to any projects already created - the changes will only apply to future projects created via the New Project action.
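The non-retroactive behavior described above resembles copy-at-creation semantics. The Python sketch below is a toy model of that idea, not a description of how RStudio Cloud is implemented: the `Space` class, its fields, and the example package names are all hypothetical.

```python
import copy


class Space:
    """Toy model: each new project starts as a snapshot of the base project."""

    def __init__(self):
        self.base_project = {"files": [], "packages": []}
        self.projects = []

    def new_project(self):
        # Copy the base project as it exists right now; later changes
        # to the base project do not affect this copy.
        project = copy.deepcopy(self.base_project)
        self.projects.append(project)
        return project


space = Space()
space.base_project["packages"].append("tidyverse")
first = space.new_project()

# Updating the base project afterwards only affects future projects.
space.base_project["packages"].append("palmerpenguins")
second = space.new_project()

print(first["packages"])   # ['tidyverse']
print(second["packages"])  # ['tidyverse', 'palmerpenguins']
```

In this model, the first project keeps the package list it started with, while the second picks up the updated base project.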
RStudio Cloud | Troubleshooting | Instructor View
Relaunch Project: If you are working on a project, and it fails to load or becomes completely unresponsive, you can relaunch the project. Depending on the underlying problem, relaunching the project may fix it and let you continue.
Allocate Additional Memory: Running out of memory while working on a project may be the cause of a variety of errors.
The project memory gauge will give you an idea of what percentage of available memory your project is currently using. The gauge updates roughly every ten seconds, so it may not show the exact usage at the current moment. Note that reclaiming unused memory is controlled by R and the operating system - you may see uncanny fluctuations in the gauge as the system manages memory.
RStudio Cloud | Tuning Resources | Instructor View
By default, each project is allocated 1 GB of RAM and 1 CPU, and can execute in the background for up to 1 hour. If your plan allows it, you can increase the memory, CPU or background execution time allocated to a project. To do so, open Project settings, go to the Resources panel and adjust the allocation.
Note that copies of a project will inherit its resource settings. However, the effective allocation for the copy will be limited by the maximum allowed by the account that “owns” the new copy.
The usage hours consumed by a project in a given amount of time depend on the resources allocated to the project, according to the following formula: (RAM + CPUs allocated) / 2 x hours.
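As a worked illustration of the formula above, the following Python sketch computes usage hours for two allocations. It assumes RAM is measured in GB, which the formula does not state explicitly; the function name and the second set of sample values are hypothetical.

```python
def usage_hours(ram_gb: float, cpus: float, hours: float) -> float:
    """Usage hours consumed: (RAM + CPUs allocated) / 2 x hours.

    Assumption: RAM is expressed in GB.
    """
    return (ram_gb + cpus) / 2 * hours


# Default allocation (1 GB RAM, 1 CPU) running for 1 hour.
print(usage_hours(1, 1, 1))  # 1.0

# Hypothetical larger allocation: 4 GB RAM, 2 CPUs, running for 3 hours.
print(usage_hours(4, 2, 3))  # 9.0
```

With the default allocation, one hour of running time consumes exactly one usage hour; larger allocations consume usage hours proportionally faster.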
On disk, each project is allocated up to 20GB for its files, data, and packages.
The maximum size of a file that can be uploaded to a project is 500MB.
RStudio Cloud | Viewing Learning Work | Instructor View
As an Admin (or Moderator) of your course space, you and your fellow instructors have access to all projects in your space. You can open student projects from any projects listing, or to see all the projects of a given student, go to their profile page by clicking on their name from the Members page.
Note that if you do open a student’s project while they also have it open, they will be temporarily disconnected from the project.
You can also view how much time your students have spent using Cloud. Simply visit the Usage area of your space, where you can see aggregate usage data for the entire class, or for an individual student and their projects.
Data Science Hangout | Tori Oblad, WaFd Bank | Getting Executives to Support Data Science
We want to help data science leaders become better.
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Tori Oblad, Enterprise Data & Analytics Officer at WaFd Bank.
Here are a few snippets from our conversation:
1:14 - Start of session
3:00 - How to build an internal data science community
11:40 - Showing the art of the possible
14:00 - How do you get others to lead topics and foster engagement?
26:17 - Writing starter scripts for new users
35:55 - When to use R or Python versus BI
36:38 - Building toy models in Excel to explain it to people / to build relationships with business
38:33 - Avoiding vendor lock-in, being technology agnostic
43:35 - How to build confidence with IT and compliance
49:15 - Working with business users and creating business value
53:21 - Getting business and executive support
1:22:30 - What data scientists should focus on when communicating with stakeholders: value
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstudio-pbc Twitter: https://twitter.com/rstudio
How to Improve Your Communication Skills | Data Science Hangout Highlights
RStudio is joined by Elaine McVey, VP of Data Science at The Looma Project, to discuss how data scientists can become leaders within their organizations.
Watch the full recording here: https://www.youtube.com/watch?v=IkqItgPSPro&feature=youtu.be
How to Communicate Value | Data Science Hangout Highlights
RStudio is joined by Jonathan Regenstein, Head of Data and Quantamental Research at Truist Securities, to discuss how data scientists can become leaders within their organizations.
Watch the full recording here: https://www.youtube.com/watch?v=pNTENrov020
Ralph Asher | Intro to Monte Carlo Simulation | RStudio
Introduction to Monte Carlo simulation, using Shiny Presentation by Ralph Asher
Monte Carlo Simulation is a powerful methodology to model and understand the impact of uncertainty upon real life. In this talk, I will introduce Monte Carlo simulation through a simple example: I’m meeting my neighbor after work for dinner in our neighborhood. Given the uncertain length of our commutes, will we make it in time for our reservation? I’ll talk through the scenario, then walk through a simple Shiny app that explores the power of Monte Carlo Simulation to recommend decisions under uncertainty.
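The dinner-reservation scenario can be sketched in a few lines of code. This is a minimal illustration in Python rather than the Shiny app from the talk; the commute-time distributions, the 45-minute window, and the function name `simulate_on_time_probability` are all assumptions made up for this example.

```python
import random


def simulate_on_time_probability(n_trials=10_000,
                                 minutes_until_reservation=45.0,
                                 seed=42):
    """Estimate the chance that both diners arrive before the reservation."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    on_time = 0
    for _ in range(n_trials):
        # Hypothetical commute times in minutes, drawn as triangular(low, high, mode).
        my_commute = rng.triangular(25, 60, 35)
        neighbor_commute = rng.triangular(20, 70, 40)
        # Dinner starts on time only if the later arrival beats the reservation.
        if max(my_commute, neighbor_commute) <= minutes_until_reservation:
            on_time += 1
    return on_time / n_trials


if __name__ == "__main__":
    p = simulate_on_time_probability()
    print(f"Estimated probability of making the reservation: {p:.1%}")
```

Rerunning with a different `minutes_until_reservation` shows how leaving earlier or booking a later table changes the estimated probability, which is the kind of decision question the talk describes.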
Bio: I am Ralph Asher, and I am the founder of Data Driven Supply Chain LLC, a Minnesota-based consultancy that helps organizations apply data science and AI methods, including simulation, to design and improve their supply chain. Prior to founding Data Driven Supply Chain, I worked as an Operations Research Scientist at Target, designing e-commerce supply chain networks, and at General Mills, designing warehousing networks. I have used R for supply chain analytics for over eight years at these companies. I live in the Minneapolis, MN area and love running in the (usually cool) Minnesota air. I can be reached at ralph@datadrivensupplychain.com
Data Science Hangout | Jonathan Regenstein, Truist | Relationships with IT and Non-Data Scientists
We want to help data science leaders become better.
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Jonathan Regenstein, Head of Data and Quantamental Research at Truist Securities.
Working with IT and building relationships was a focus in our conversation with Jonathan and he included a few tips for building relationships with non-data scientist colleagues.
Find a partner within the IT organization and talk to that person at least once a week. IT can help you communicate value proposition along the way as well.
“It sounds crazy to say this in the world of data science, but relationship building was critical to what we did, especially at a bank. Thousands of requests for new technology. There’s no way to avoid going through all the security scans and check marks that we have to go through. We want to make sure we have a good partner who is going to help us do that.”
0:48 - Start of session
10:52 - How should data science leaders work with IT?
46:20 - How far out Data Science Leaders should be planning projects with IT
48:20 - How do you become a champion of data science within your organization?
1:02:11 - Your responsibility as a data science leader is to work cross functionally
1:04:17 - Data Science Leaders: Your business cares about the value, not how you got there.
How to Speak to Executives | Data Science Hangout Highlights
RStudio is joined by Elaine McVey, VP of Data Science at The Looma Project, to discuss how data scientists can become leaders within their organizations.
Watch the full recording here: https://www.youtube.com/watch?v=IkqItgPSPro
Matthias Mueller - Campaign Monitor | Marketing Meetup | RStudio
R in Marketing: Serving Bespoke Insights Through Automation Presentation by Matthias Mueller
Blog Post: https://www.rstudio.com/blog/r-in-marketing-meetup/
Abstract: As a global MarTech company, Campaign Monitor and its family of brands generate a myriad of data points. We have countless different platforms, reports, and dashboards that track business and individual performance. However, with this overabundance of data come challenges that could hamper growth and productivity. This talk walks through how Matthias’ team, which functions as a de facto analytics agency to Campaign Monitor’s marketing organization, used R, Slack, and RStudio Connect to build a system that serves custom, individualized insights directly to stakeholders, at the right time, where work already happens.
Bio: Matthias is the Director of Marketing Analytics at Campaign Monitor, a global MarTech company in the email marketing and personalization space. At CM, he oversees a team of data scientists, engineers, and analysts tasked with optimizing the marketing mix, building customer lifetime value models, and enhancing the prospect’s buying journey. Prior to this assignment, Matthias held positions managing the global digital analytics strategy for SVP Worldwide and led a data-driven UX team at a full-service digital marketing agency.
For future meetups: https://www.meetup.com/RStudio-Enterprise-Community-Meetup
Data Science Hangout | Elaine McVey at the Looma Project | Communicating the Value of Data Science
The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders.
An accomplished leader in the space will join us each week and answer whatever questions the audience may have.
We were recently joined by Elaine McVey, VP of Data Science at the Looma Project.
21:30 - How to approach experimentation and running tests
43:30 - How do you communicate the value of data science to executives
52:40 - Ways to improve your communication skills as a data scientist
57:15 - How to package insights to executives
1:01:03 - What data scientists get wrong when communicating insights
To join future data science hangouts, more info here: rstd.io/datasciencehangout
Chris Bumgardner, Children’s Wisconsin || Healthcare Meetup || Posit
Cultivating an R-based Analytic Practice in Healthcare
Supporting the advanced analytic needs of an active academic healthcare organization requires tools and practices that enhance the application of statistical and algorithmic approaches. To positively impact care, system operations, or even well-being at the community level, these tools need to support solutions that can be rapidly deployed and communicated as well as reproduced when studying longitudinal trends.
At Children’s Wisconsin, we use R and Posit’s suite of tools to enable forecasting, modeling, and data mining among other data science activities. We communicate the results of our efforts using interactive applications built with Shiny as well as reports and push analytics created using RMarkdown. This talk will discuss how we have developed this capability and provide a few examples of the applications that have been created to support our vision that the kids of Wisconsin will be the healthiest in the nation.
Agenda
- Children’s Wisconsin Introduction
- Data Science Tools and Supporting Infrastructure
- Example R-based Projects [Community: Missing Youth, System-wide: COVID-19 Response, Operational: Patient Placement Planning and Optimization]
- Challenges and Future Plans
Speaker Bio: Chris Bumgardner leads the data science efforts at Children’s Wisconsin and works with teams across the health system to improve decision-making. He is focused on applying statistical methods to data sets large and small to discover and visualize insights that will help ensure Wisconsin’s kids are healthy, happy, and safe. Chris can often be found awake far too early thanks to an insubordinate rescue dog named Dutch.
R in Healthcare Slack Group: https://join.slack.com/t/rinhealthcare/shared_invite/zt-sc7lc4k6-K9zb~kX826dOXMcaj~Wt~w
RStudio Enterprise Community Meetup for future events: https://www.meetup.com/RStudio-Enterprise-Community-Meetup
Art Steinmetz | Open Source Data Science in the Enterprise | RStudio
Art Steinmetz is the former Chairman, CEO and President of Oppenheimer Funds. In this interview, Art gives his unique perspective on the value and suitability of open source, code-first data science for the enterprise.
Timecodes
0:00 Intro
0:10 What are some of the advantages to adopting open source software within an organization?
2:23 Is open source software appropriate for enterprise-level data science?
4:29 How do you build support for open source software within an organization?
5:56 Do executives need to know how to code in order to manage data science teams?
7:56 How and why did you get started with R?
9:55 Any other advice for an aspiring Data Science Leader?
For a more in-depth view of Art’s perspective on Open Source Data Science in Investment Management, see this blog post https://blog.rstudio.com/2020/10/13/open-source-data-science-in-investment-management/
To learn more about RStudio’s professional products, and how they can help you scale, secure and operationalize your open source data science, see www.rstudio.com
Nicolas Nguyen - ZEISS | Supply Chain Management Meetup | RStudio
0 - 50:20 Presentation
50:20 - 56:52 Q&A
Presentation by Nicolas Nguyen, Digital Supply Chain and Global S&OP Leader for Carl Zeiss Meditec
Abstract: In demand & supply planning, we often need to calculate projected inventories and replenishment plans - sometimes for hundreds or thousands of SKUs, and through different levels of the distribution network.
In the sales & operations planning (S&OP) process, we might need to run some scenarios to balance demand and supply to support sales: changing inventories plans, sales plans, delivery lead times, production plans, product mix, etc.
Using Shiny, we can design simple, powerful, scalable, and reproducible apps for demand & supply planning as well as the S&OP process. In this talk, you will learn about real-life examples of web applications, which can be deployed in a few minutes to perform some popular demand & supply planning operations.
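As a rough illustration of the projected-inventory and replenishment calculations mentioned in the abstract, here is a minimal sketch in Python. It is not the presenter's Shiny implementation; the function names, the simple rule "closing stock = previous stock + supply - demand", and the order-up-to-safety-stock policy (with no lead time) are assumptions chosen for the example.

```python
def project_inventory(opening_stock, demand, supply):
    """Return projected closing inventory for each period.

    closing[t] = closing[t-1] + supply[t] - demand[t]
    """
    if len(demand) != len(supply):
        raise ValueError("demand and supply must cover the same periods")
    projected = []
    stock = opening_stock
    for d, s in zip(demand, supply):
        stock = stock + s - d
        projected.append(stock)
    return projected


def replenishment_plan(opening_stock, demand, safety_stock):
    """Order just enough each period to end at or above the safety stock.

    Assumes orders arrive in the same period (no lead time), a simplification.
    """
    plan = []
    stock = opening_stock
    for d in demand:
        order = max(0, safety_stock + d - stock)
        stock = stock + order - d
        plan.append(order)
    return plan


if __name__ == "__main__":
    demand = [40, 50, 30, 60]  # hypothetical demand for one SKU
    supply = [0, 80, 0, 50]    # hypothetical planned receipts
    print(project_inventory(100, demand, supply))  # [60, 90, 60, 50]
    print(replenishment_plan(100, demand, 20))     # [0, 10, 30, 60]
```

In an S&OP scenario run, the same functions would be called once per SKU with alternative demand or supply plans, and the resulting projections compared.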
__
Thank you to Nicolas for an awesome presentation on how he is using Shiny today. With all the interest from the community in this topic, we’d love to continue the discussion and connect us all through:
Future meetups: https://www.meetup.com/RStudio-Enterprise-Community-Meetup/
R for Data Science Online Learning Community (“chat-supply_chain”): r4ds.io/join
RStudio Community: community.rstudio.com
Katherine Kopp | COVID vaccine distribution Shiny app walkthrough (mock data) | RStudio
Learn more:
Data Driven West Virginia: https://business.wvu.edu/research-outreach/data-driven-wv
DDWV PPE forecasting: https://wvutoday.wvu.edu/stories/2020/04/27/wvu-business-experts-partner-with-the-national-guard-to-forecast-ppe-needs
DDWV inventory management system: https://wvutoday.wvu.edu/stories/2021/03/22/a-different-kind-of-science-wvu-chambers-college-data-scientists-propel-west-virginia-s-acclaimed-vaccine-strategy-with-digital-inventory-management-system
West Virginia National Guard: https://www.wv.ng.mil/
Shiny: https://shiny.rstudio.com/
West Virginia leading nation at start of vaccine rollout: https://www.vox.com/first-person/2021/3/4/22313540/covid-19-vaccine-west-virginia
To understand just how hard it is to get vaccines to the population, it helps to understand where it can go wrong. This starts with how vaccines are packed into containers.
To fill up a container, Pfizer places 195 vials into a tray, and up to 5 trays into a single container. Moderna puts 10 vials into a small box, and then combines a minimum of 10 small boxes into a single container. In most states Pfizer and Moderna ship directly to the organization that will be administering the vaccine to the population. This could be a hospital, a pharmacy, or any place where trained professionals will be putting shots into arms.
But what happens when a pharmacy receives a full container from Pfizer, 975 vials, but only needs 600?
West Virginia has removed this complication by shipping directly to five hubs strategically located throughout the state. Within each of these hubs, containers of vaccine vials are broken down into smaller components and then either picked up or shipped directly to the hospital, pharmacy, or organization that will be administering the vaccine.
These hubs are managed by the Joint Interagency Task Force (JIATF), a team of teams composed of public, private, and governmental organizations as well as the National Guard. The Joint Interagency Task Force is responsible for drawing up a weekly distribution plan for each hub, in alignment with CDC allocations, and matching vaccine supply with demand.
By using a statewide system managed by a central organization, there’s a level of agility and fluidity that allows each hub to adjust to a variety of changes in order to maximize the number of vaccines that are being administered to the population each week.
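The packing arithmetic described above can be sketched as follows. The tray and container sizes come from the description; the idea that a hub can split a Pfizer container at tray level, and the function names, are assumptions for illustration only and do not describe the actual inventory management system.

```python
import math

# Packing figures quoted in the description above.
PFIZER_VIALS_PER_TRAY = 195
PFIZER_MAX_TRAYS_PER_CONTAINER = 5
MODERNA_VIALS_PER_BOX = 10
MODERNA_MIN_BOXES_PER_CONTAINER = 10


def pfizer_full_container_vials():
    """Vials in a full Pfizer container: 195 vials x 5 trays."""
    return PFIZER_VIALS_PER_TRAY * PFIZER_MAX_TRAYS_PER_CONTAINER


def surplus_if_shipped_direct(vials_needed):
    """Extra vials when a site can only receive whole, full containers."""
    full = pfizer_full_container_vials()
    containers = math.ceil(vials_needed / full)
    return containers * full - vials_needed


def surplus_if_split_by_tray(vials_needed):
    """Extra vials when a hub can hand out individual trays (an assumption)."""
    trays = math.ceil(vials_needed / PFIZER_VIALS_PER_TRAY)
    return trays * PFIZER_VIALS_PER_TRAY - vials_needed


if __name__ == "__main__":
    print(pfizer_full_container_vials())    # 975
    print(surplus_if_shipped_direct(600))   # 375
    print(surplus_if_split_by_tray(600))    # 180
    # Smallest Moderna container: 10 vials x 10 boxes
    print(MODERNA_VIALS_PER_BOX * MODERNA_MIN_BOXES_PER_CONTAINER)  # 100
```

For the pharmacy that needs 600 vials, a direct shipment leaves 375 vials unallocated, while breaking the container down at a hub reduces that surplus under the tray-level assumption.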
Managing COVID vaccine distribution in West Virginia | RStudio
With a little help from open source software
Learn more: Data Driven West Virginia: https://business.wvu.edu/research-outreach/data-driven-wv
DDWV PPE forecasting: https://wvutoday.wvu.edu/stories/2020/04/27/wvu-business-experts-partner-with-the-national-guard-to-forecast-ppe-needs
DDWV inventory management system: https://wvutoday.wvu.edu/stories/2021/03/22/a-different-kind-of-science-wvu-chambers-college-data-scientists-propel-west-virginia-s-acclaimed-vaccine-strategy-with-digital-inventory-management-system
West Virginia National Guard: https://www.wv.ng.mil/
Shiny: https://shiny.rstudio.com/
West Virginia leading nation at start of vaccine rollout: https://www.vox.com/first-person/2021/3/4/22313540/covid-19-vaccine-west-virginia
In the United States, approximately 2.5 million doses of COVID vaccines are being delivered each day, and how these doses go from the manufacturer to a shot in someone’s arm varies by state, often with mixed results.
But early on in the vaccine distribution process, one state led the pack in terms of using the majority of vaccine doses it had been allotted. That state? West Virginia.
Part of what has made West Virginia successful is the creation of an inventory management system using Shiny, an open source framework for building interactive web applications. The system was built by Data Driven West Virginia, part of the John Chambers College of Business and Economics at West Virginia University, in collaboration with the West Virginia Army National Guard.
Using Shiny has provided visibility into each component of the vaccine supply chain, leading to the creation of distribution plans that are able to quickly and efficiently match supply with demand, getting vaccines to the right people in the right location at the right time.
Julia Silge | Monitoring Model Performance | RStudio
0:00 Project introduction
1:50 Overview of the setup code chunk
3:05 Getting new data
4:05 Getting model from RStudio Connect using httr and jsonlite
6:20 Bringing in metrics
9:45 Using the pins package
10:50 Using boards on RStudio Connect
13:30 Benefits of using pins
14:00 Visualizations using ggplot and plotly
17:00 Knitting the flexdashboard
18:10 Project takeaways
You can read Julia’s blogpost, Model Monitoring with R Markdown, pins, and RStudio Connect, here: https://blog.rstudio.com/2021/04/08/model-monitoring-with-r-markdown/
Modelops playground GitHub repo: https://github.com/juliasilge/modelops-playground
pins package documentation: https://pins.rstudio.com/
flexdashboard documentation: https://rmarkdown.rstudio.com/flexdashboard/
tidymodels documentation: https://www.tidymodels.org/

Tom Mock | RStudio Connect in Production
https://rstudio.com/resources/webinars/rstudio-connect-in-production/
In part 2 of this 3 part series, Tom covers: Communicating results can be the most challenging part of Data Science: many insights never leave the laptops where they are discovered. In this webinar, we will show you how to use RStudio Connect to deploy your results in a production environment. You’ll learn how to automate publishing, schedule updates, and provide consumers with self-service access to your work. RStudio Connect is a revolutionary new way to host executable Data Science content.
About Tom: Thomas is involved in the local and global data science community. He serves as Outreach Coordinator for the Dallas R User Group, mentors in the R for Data Science Online Learning Community, co-founded #TidyTuesday, attends various Data Science and R-related conferences and meetups, and participated in Startup Weekend Fort Worth as a data scientist/entrepreneur.
Kelly O’Briant | Interactivity in Production | RStudio (2019)
https://rstudio.com/resources/webinars/interactivity-in-production/
In part 3 of this 3 part series, Kelly covers: Interactive products take your data science to a new level, but they require new coding decisions. This webinar will give you clear guidelines on when and how to add interactivity to your work. Here you’ll learn: when to use off-the-shelf interactive products like parameterized R Markdown and htmlwidgets, when to create bespoke interactivity with Shiny, how to make your Shiny apps as fast as possible, how to support interactivity in production, and much more.
About Kelly: Kelly is a Solutions Engineer at RStudio and an organizer of the Washington DC chapter of R-Ladies Global, an R users group for lady-folk and friends.
Garrett Grolemund | Reproducibility in Production | RStudio (2019)
https://rstudio.com/resources/webinars/reproducibility-in-production/
In part 1 of this 3 part series, Garrett covers the following:
Computational documents offer limitless opportunities for your business. With them, your consumers can rerun your report with new parameters, apply your analysis to new data, or schedule future, automatic updates to your work—all with the click of a button. This is the first in a three part webinar series that will describe this new form of reproducibility. Here, we begin by showing you how to write executable R Markdown documents for a production environment.
About Garrett: Garrett is the author of Hands-On Programming with R and co-author of R for Data Science and R Markdown: The Definitive Guide. He is a Data Scientist at RStudio and holds a Ph.D. in Statistics, but specializes in teaching. He’s taught people how to use R at over 50 government agencies, small businesses, and multi-billion dollar global companies, and he’s designed RStudio’s training materials for R, Shiny, R Markdown and more. Garrett wrote the popular lubridate package for dates and times in R and creates the RStudio cheat sheets.
Webinar Summary | Avoid Dashboard Fatigue | RStudio (2020)
0:00 Introduction
0:07 The Problem
1:05 The Solution
3:20 Real Life Success Stories
5:27 Demo (with code)
Don’t have an hour to watch a webinar? We’ve made a summary video that covers the main points of our “Avoid Dashboard Fatigue” webinar from Sean Lopp and Rich Iannone.
The full webinar covered: Data science teams face a challenging task. Not only do they have to gain insight from data, they also have to persuade others to make decisions based on those insights. To close this gap, teams rely on tools like dashboards, apps, and APIs. But unfortunately data organizations can suffer from their own success - how many of those dashboards are viewed once and forgotten? Is a dashboard of dashboards really the right solution? And what about that pesky, precisely formatted Excel spreadsheet finance still wants every week?
In this webinar, we’ll show you an easy way teams can solve these problems using proactive email notifications through the blastula and gt packages, and how RStudio pro products can be used to scale out those solutions for enterprise applications. Dynamic emails are a powerful way to meet decision makers where they live - their inbox - while displaying exactly the results needed to influence decision-making. Best of all, these notifications are crafted with code, ensuring your work is still reproducible, durable, and credible.
We’ll demonstrate how this approach provides solutions for data quality monitoring, detecting and alerting on anomalies, and can even automate routine (but precisely formatted) KPI reporting.
Webinar materials: https://rstudio.com/resources/webinars/avoid-dashboard-fatigue/
About Sean: Sean has a degree in mathematics and statistics and worked as an analyst at the National Renewable Energy Lab before making the switch to customer success at RStudio. In his spare time he skis and mountain bikes and is a proud Colorado native.
About Rich: My background is in programming, data analysis, and data visualization. Much of my current work involves a combination of data acquisition, statistical programming, tools development, and visualizing the results. I love creating software that helps people accomplish things. I regularly update several R package projects (all available on GitHub). One such package is called DiagrammeR and it’s great for creating network graphs and performing analyses on the graphs. One of the big draws for open-source development is the collaboration that comes with the process. I encourage anyone interested to ask questions, make recommendations, or even help out if so inclined!

Tom Mock & Shannon Haggerty | Theming Shiny and RMarkdown with {thematic} & {bslib} | RStudio
From rstudio::global(2021) Shiny X-Sessions, sponsored by Appsilon: this presentation covers the basics of how the thematic and bslib packages can be used to consistently style all the components of a shiny app at once.
About Tom Mock: Thomas is involved in the local and global data science community. He serves as Outreach Coordinator for the Dallas R User Group, mentors in the R for Data Science Online Learning Community, co-founded #TidyTuesday, attends various Data Science and R-related conferences and meetups, and participated in Startup Weekend Fort Worth as a data scientist/entrepreneur.
About Shannon Haggerty: Shannon is on RStudio’s Customer Success team working with teams across the Life Sciences and Healthcare. In her free time, she likes to bake, hang out with her dogs, and explore new hobbies.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Pedro Silva | Styling Shiny with CSS & SASS and Speeding Up Shiny Apps | Posit
From rstudio::global(2021) Shiny X-Sessions, sponsored by Appsilon: in the first part of this talk I will discuss how to use CSS to give your application a fresh and unique look, while keeping your codebase clean and organized with SASS. During the second half I will discuss how to leverage Shiny update functions, proxy objects and JavaScript messages to speed up your dashboards.
About Pedro Silva: Pedro has nearly a decade of experience combining frontend and backend technologies, and is an expert on augmenting R Shiny dashboards with CSS and JavaScript.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Jenny Bryan | Help me help you: creating reproducible examples | RStudio (2018)
What is a reprex? It’s a reproducible example. Making a great reprex is both an art and a science and this webinar will cover both aspects. A reprex makes a conversation about code more efficient and pleasant for all. This comes up whenever you ask someone for help, report a bug in software, or propose a new feature. The reprex package (https://reprex.Tidyverse.org) makes it especially easy to prepare R code as a reprex, in order to share on sites such as https://community.rstudio.com, https://github.com, or https://stackoverflow.com. The habit of making little, rigorous, self-contained examples also has the great side effect of making you think more clearly about your programming problems.
Webinar materials: https://rstudio.com/resources/webinars/help-me-help-you-creating-reproducible-examples/
About Jenny: Jenny is a software engineer on the tidyverse team. She is a recovering biostatistician who takes special delight in eliminating the small agonies of data analysis. Jenny is known for smoothing the interfaces between R and spreadsheets, web APIs, and Git/GitHub. She’s been working in R/S for over 20 years and is a member of the R Foundation. She also serves in the leadership of rOpenSci and Forwards and is an adjunct professor at the University of British Columbia.

Damian Rodziewicz | Scaling Shiny to Thousands of Users | RStudio
From rstudio::global(2021) Shiny X-Sessions, sponsored by Appsilon: in this talk I will discuss how to scale Shiny dashboards to thousands of users.
About Damian Rodziewicz: Damian is one of the four co-founders of Appsilon. Before founding Appsilon he worked at Accenture, UBS, Microsoft and Domino Data Lab.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Dominik Krzemiński | Appsilon’s Guide to Working With Open Source Shiny | RStudio
From rstudio::global(2021) Shiny X-Sessions, sponsored by Appsilon: There is no need to praise Shiny for its influence on interactive data visualisation. As with many other technology stacks, Shiny could benefit from community contributions for the further development of the package itself and the growth of independent packages that add new features. In this talk, I present some of the most popular Shiny extensions and explain how you can help develop Shiny-related tools.
About Dominik Krzemiński: Dominik is the Open Source Tech Lead at Appsilon where he enjoys contributing to open source tools, mainly in R and Python. He created shiny.i18n, shiny.semantic, and the TODOr package for R. He also participated in the Google Summer of Code, where he developed tools supporting neuroscience analyses. He’s also a fan of all kinds of board sports and capoeira.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Olga Mierzwa-Sulima | Best Practices for Developing Shiny Apps | RStudio
From rstudio::global(2021) Shiny X-Sessions, sponsored by Appsilon: this presentation covers best practices for developing Shiny apps, including organizing an app’s code with modules and R6 classes, setting up a development environment, and testing.
About Olga Mierzwa-Sulima: Olga is experienced in production applications of analytical solutions, especially for FMCG companies. Recently she developed a price elasticity model for Unilever.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Filip Stachura & Marek Rogala | Empowering Data Scientists to Build Spectacular Shiny Apps | RStudio
From rstudio::global(2021) Shiny X-Sessions, sponsored by Appsilon: in this talk, Appsilon’s CEO and CTO show their vision of challenges facing Shiny app authors and what is crucial to achieving success. They announce 3 key initiatives that Appsilon undertakes to empower data scientists to build spectacular Shiny Apps, including the {shiny.fluent} package.
About Filip Stachura: Filip is the CEO and a co-founder of Appsilon. He holds a double degree in Applied Mathematics and Computer Science from the University of Warsaw. He started his professional career at Microsoft in California. He is passionate about data analysis, elegant visualisations, and tackling hard algorithmic and analytical problems.
About Marek Rogala: Marek Rogala is the CTO at Appsilon, where he drives innovation in R and Shiny as well as Machine Learning. He previously did software engineering at Google and at Domino Data Lab, where he worked on enabling data scientists to experiment and collaborate effectively.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Nathan Stephens | Best Practices for Administering RStudio in Production | RStudio (2019)
Most organizations are unfamiliar with the R programming language. As a result they often struggle to onboard and manage R in production. In this webinar we introduce the RStudio Quickstart which makes it easy to try RStudio professional products on your desktop for free. We also outline best practices for using R and RStudio in production.
Webinar materials: https://rstudio.com/resources/webinars/best-practices-for-administering-rstudio-in-production/
About Nathan: Nathan has a background in analytic solutions and consulting. He has experience building data science teams, architecting analytic infrastructure, and delivering innovative data products. He is a longtime user of R.
Tom Mock | A Gentle Introduction to Tidy Statistics in R | RStudio (2019)
R is a fantastic language for statistical programming, but making the jump from point and click interfaces to code can be intimidating for individuals new to R. In this webinar I will gently cover how to get started quickly with the basics of research statistics in R, providing an emphasis on reading data into R, exploratory data analysis with the Tidyverse, statistical testing with ANOVAs, and finally producing a publication-ready plot in ggplot2.
Use the code presented instantly on RStudio Cloud!
RStudio Cloud: rstudio.cloud
Webinar materials: https://rstudio.com/resources/webinars/a-gentle-introduction-to-tidy-statistics-in-r/
About Thomas: Thomas is involved in the local and global data science community, serving as Outreach Coordinator for the Dallas R User Group, as a mentor for the R for Data Science Online Learning Community, and as co-founder of #TidyTuesday. He attends various data science and R-related conferences and meetups, and participated in Startup Weekend Fort Worth as a data scientist/entrepreneur.
Sean Lopp | Posit Investments in Pharma | Posit
From rstudio::global(2021) Pharma X-Sessions, sponsored by ProCogia: R/Pharma is an organization of R enthusiasts who work in the pharma and biotech industries. This presentation summarizes the group and presents some goals for 2021.
More about Sean Lopp: Sean has a degree in mathematics and statistics and worked as an analyst at the National Renewable Energy Lab before making the switch to customer success at RStudio. In his spare time he skis and mountain bikes and is a proud Colorado native.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Marly Gotti | Risk Assessment Tools: R Validation Hub Initiatives | Posit
From rstudio::global(2021) Pharma X-Sessions, sponsored by ProCogia: we will present some of the resources and tools the R Validation Hub has been working on to aid the biopharmaceutical industry in the process of using R in a regulatory setting. In the talk, you will learn about the {riskmetric} R package, which measures the risk of using R packages, and you will also see a demo of the Risk Assessment Shiny application, which is an advanced user interface for {riskmetric}.
About Marly Gotti: Marly Gotti is a Senior Data Scientist at Biogen and a former RStudio intern. She is also an executive committee member of the R Validation Hub, where she advocates for the use of R within a biopharmaceutical regulatory setting.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Harvey Lieberman | R/Pharma | Posit
From rstudio::global(2021) Pharma X-Sessions, sponsored by ProCogia: R/Pharma is an organization of R enthusiasts who work in the pharma and biotech industries. This presentation summarizes the group and presents some goals for 2021.
About Harvey Lieberman: Harvey Lieberman works at Novartis and has been a member of R/Pharma since 2017.
For more on R/Pharma: https://www.pharmar.org/
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Ellis Hughes | R Package Validation Framework | Posit
From rstudio::global(2021) Pharma X-Sessions, sponsored by ProCogia: in this talk I discuss the process developed for validating internally generated R packages at SCHARP (Statistical Center for HIV/AIDS Research and Prevention) - the R Package Validation Framework. I cover the elements of the framework and basics of applying it with some examples. By using tools native to the R package building infrastructure, validation can become an integrated part of your package development, improving the quality of both the package and validation.
About Ellis Hughes: I am a Statistical Programmer at Fred Hutch Cancer Research Center where I work on a team that evaluates potential HIV vaccine candidates. Having graduated from Washington State University with a degree in Bioengineering, I found a passion for programming in R. I now organize the Seattle UseR group, and enjoy building packages to automate my workflows.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Edgar Ruiz | Programación con R | RStudio (2019)
Sometimes, when we are working on an analysis in R, we need to split our data into groups and then run the same operation on each group. For example, our data may contain several countries, and we need to build a model for each country. Another case is running multiple operations on the same data. These cases require knowing how to iterate in R. This webinar focuses on how to use the purrr package to help solve this kind of problem.
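The split-then-model pattern described here can be sketched with purrr as follows. This is a minimal illustration, not the webinar's own code: it assumes the purrr package is installed and uses R's built-in mtcars data, grouped by cylinder count, as a stand-in for data grouped by country.

```r
library(purrr)

# Split the built-in mtcars data into one data frame per cylinder count
# (a stand-in for "one data frame per country").
by_cyl <- split(mtcars, mtcars$cyl)

# Run the same operation on each group: fit a linear model of mpg on weight.
models <- map(by_cyl, function(df) lm(mpg ~ wt, data = df))

# Pull one number out of each model: the estimated slope for weight.
slopes <- map_dbl(models, function(fit) coef(fit)[["wt"]])
print(slopes)
```

map() returns a list with one fitted model per group, while map_dbl() returns a named numeric vector, which is convenient when each group yields a single number.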
Download materials: https://rstudio.com/resources/webinars/programacio-n-con-r/
About Edgar: Edgar Ruiz is a Solutions Engineer at RStudio. He is the administrator of the official sparklyr and R for databases websites. He is also the author of the R packages dbplot, tidypredict, and modeldb, and co-author of the dbplyr package.

Mike Garcia | R in Pharma: Intro to Shiny | Posit
Slides: https://garciamikep.github.io/rstudioglobal-2021-shiny-slides/slides.html#1
From rstudio::global(2021) Pharma X-Sessions, sponsored by ProCogia: in this introduction to Shiny app development, we begin with a quick review of visualization with {ggplot2} and then cover core concepts in app structure and reactive programming. After building several Shiny apps of increasing complexity, we wrap up with a demonstration of how to include your Shiny app in a dashboard using the {flexdashboard} package.
About Mike Garcia: Mike is a Data Science Consultant with ProCogia, with a background in Biostatistics and experience in clinical trial design and public health research. If not geeking out on data with a cup of coffee and spreading his passion for R, he’s probably out enjoying the outdoors.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
To hear more about how other major pharmaceutical companies are transitioning to open source data science you can watch talks from this year’s R in Pharma conference: https://www.youtube.com/@RinPharma/playlists
At Posit, we have a dedicated Pharma team to help organizations migrate and utilize open source for drug development. To learn more about our support for life sciences, please see our dedicated Pharma page where you can book a call with our team. (https://posit.co/solutions/pharma )
Volha Tryputsen | R in Janssen Drug Discovery Statistics | Posit
From rstudio::global(2021) Pharma X-Sessions, sponsored by ProCogia: this talk discusses how R is utilized in the Janssen drug discovery statistics workflow.
About Volha: Volha is the Principal Statistician in the Translational Medicine and Early Development Statistics (TMEDS) group in the Quantitative Sciences Department of Janssen R&D.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Lou Bajuk & Kevin Bolger | Why Data Science in the Cloud? | RStudio (2020)
As business and organizational needs expand, a centralized ecosystem such as the cloud is needed to securely store and access data, conduct analyses, and share results.
We’ll share some examples of what it means to do data science in the cloud and discuss some problems that users may face along the way, and the solutions that RStudio products can provide. We’ll also discuss best practices for migrating to a cloud environment.
What you’ll learn:
- What are the benefits of working in a cloud environment?
- What are the different cloud environments available?
- How do I learn which is the best fit for my organization?
- What should I consider when migrating my data science infrastructure to the cloud?
Webinar materials: https://rstudio.com/resources/webinars/why-data-science-in-the-cloud/
About Lou: Lou is a passionate advocate for data science software, and has had many years of experience in a variety of leadership roles in large and small software companies, including product marketing, product management, engineering and customer success. In his spare time, his interests include books, cycling, science advocacy, great food and theater.
About Kevin: Kevin’s passion for data science was cemented after he finished his education at the University of Limerick, Ireland. Focusing primarily on data analytics and modelling, he went on to spend the first years of his career working at a biopharmaceutical company, where he led the data team on multiple products. Since moving to Seattle with his Washington native wife, Kevin has spent his spare time enjoying the beautiful PNW and playing hurling, an ancient Gaelic field sport, with the Seattle Gaels. He now leads the Data Science team at ProCogia as the Director of Data Solutions, where he works with clients from biotech to telecom.
Hadley Wickham | testthat 3.0.0 | RStudio (2020)
In this webinar, I’ll introduce some of the major changes coming in testthat 3.0.0. The biggest new idea in testthat 3.0.0 is the idea of an edition. You must deliberately choose to use the 3rd edition, which allows us to make breaking changes without breaking old packages. testthat 3e deprecates a number of older functions that we no longer believe are a good idea, and tweaks the behaviour of expect_equal() and expect_identical() to give considerably more informative output (using the new waldo package).
testthat 3e also introduces the idea of snapshot tests, which record expected values in external files rather than in code. This makes them particularly well suited to testing user output and complex objects. I’ll show off the main advantages of snapshot testing, and why it’s better than our previous approaches of verify_output() and expect_known_output().
Finally, I’ll go over a bunch of smaller quality-of-life improvements, including tweaks to test reporting and improvements to expect_error(), expect_warning() and expect_message().
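As a rough sketch of what opting in to the 3rd edition looks like, the test below uses local_edition(3) to enable it for a single test. The test itself is made up for illustration and assumes testthat 3.0.0 or later is installed. A package would more typically opt in for all of its tests by adding `Config/testthat/edition: 3` to its DESCRIPTION file.

```r
library(testthat)

test_that("3rd edition expect_equal() allows tiny floating point error", {
  # Opt in to the 3rd edition for this test only.
  local_edition(3)

  almost_two <- sqrt(2)^2

  # Not bit-for-bit identical to 2 because of floating point rounding ...
  expect_false(identical(almost_two, 2))
  # ... but equal within expect_equal()'s default tolerance.
  expect_equal(almost_two, 2)
})
```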
Webinar materials: https://rstudio.com/resources/webinars/testthat-3/
About Hadley: Hadley Wickham is the Chief Scientist at RStudio, a member of the R Foundation, and Adjunct Professor at Stanford University and the University of Auckland. He builds tools (both computational and cognitive) to make data science easier, faster, and more fun. You may be familiar with his packages for data science (the tidyverse: including ggplot2, dplyr, tidyr, purrr, and readr) and principled software development (roxygen2, testthat, devtools, pkgdown). Much of the material for the course is drawn from two of his existing books, Advanced R and R Packages, but the course also includes a lot of new material that will eventually become a book called “Tidy tools”.

A quick tour of RStudio 1.4 | RStudio
HD version here: https://youtu.be/oCR_LB3H73M
0:00 Introduction
0:20 R Markdown Visual Editor
0:46 Insert citations in R Markdown
1:09 Python support in Environment pane
2:05 Python environment selection
2:25 Rainbow parentheses
2:43 Monospace font support
2:54 Support for multiple source columns
3:10 Command palette
3:27 Customize data and configuration storage (users and servers)
3:55 RStudio Pro edition features
4:08 Authenticate RStudio Server Pro using SAML
4:25 Project sharing with Launcher
4:48 Request a GPU with SLURM
5:00 Run Visual Studio Code sessions (beta)
What’s new with RStudio 1.4:
- A visual markdown editor that provides improved productivity for composing longer-form articles and analyses with R Markdown.
- New Python capabilities, including display of Python objects in the Environment pane, viewing of Python data frames, and tools for configuring Python versions and conda/virtual environments.
- The ability to add source columns to the IDE workspace for side-by-side text editing.
- A new command palette (accessible via Ctrl+Shift+P) that provides easy keyboard access to all RStudio commands, add-ins, and options.
- Support for rainbow parentheses in the source editor (enabled via Options, then Code, then Display).
- New citation support that allows you to include document citations from your document bibliography, personal or group libraries, and several other sources.
- Integration with a host of new RStudio Server Pro features including project sharing when using Launcher, Microsoft Visual Studio Code support (currently in beta), SAML authentication, and local launcher load-balancing.
Read more on our blog: https://blog.rstudio.com/2021/01/19/announcing-rstudio-1-4/
Mine Çetinkaya-Rundel | Teaching R online with RStudio Cloud | RStudio (2020)
RStudio Cloud is a lightweight solution for teaching R online, in the browser, that is easy to set up and use. In this webinar we will walk you through the steps of setting up your course on RStudio Cloud, highlighting the various functionalities for teachers and students. We will also discuss best practices and provide an opportunity for the audience to experience the setup first hand. Additionally, we highlight a suite of ready to use resources for teaching an introduction to data science and statistical thinking course using R.
Webinar materials: https://rstudio.com/resources/webinars/teaching-r-online-with-rstudio-cloud/
About Mine: Mine Çetinkaya-Rundel is Professional Educator and Data Scientist at RStudio as well as Senior Lecturer in the School of Mathematics at University of Edinburgh (on leave from Department of Statistical Science at Duke University). Mine’s work focuses on innovation in statistics and data science pedagogy, with an emphasis on computing, reproducible research, student-centered learning, and open-source education as well as pedagogical approaches for enhancing retention of women and under-represented minorities in STEM. Mine works on integrating computation into the undergraduate statistics curriculum, using reproducible research methodologies and analysis of real and complex datasets. She also organizes ASA DataFest and works on the OpenIntro project. She is also the creator and maintainer of datasciencebox.org and she teaches the popular Statistics with R MOOC on Coursera

Greg Wilson | Teaching Online at Short Notice | RStudio (2020)
So here you are: you planned to teach your class or deliver your workshop in person, and now you have to do it online or not at all.
Nobody is giving you time or money to make the change, and a hundred other things also need your attention. Where should you start, and what can you realistically hope to achieve? This one-hour webinar will present answers from people who have found themselves in this situation before, and will recommend a handful of techniques that you can apply right away.
Webinar materials: https://rstudio.com/resources/webinars/teaching-online-at-short-notice/
About Greg: Dr. Greg Wilson has worked for 35 years in both industry and academia, and is the author or editor of several books on computing and two for children. He is best known as the co-founder of Software Carpentry, a non-profit organization that teaches basic computing skills to researchers, and is now part of the education team at RStudio.
James Blaire & Barret Schloerke | Integrating R with Plumber APIs | RStudio (2020)
Full title: Expanding R Horizons: Integrating R with Plumber APIs
In this webinar we will focus on using the Plumber package as a tool for integrating R with other frameworks and technologies. Plumber is a package that converts your existing R code to a web API using unique one-line comments. Example use cases will be used to demonstrate the power of APIs in data science and to highlight new features of the Plumber package. Finally, we will look at methods for deploying Plumber APIs to make them widely accessible.
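To give a sense of the "unique one-line comments" mentioned above, here is a minimal sketch of a Plumber file. The file name plumber.R, the /sum route, and the add_numbers() helper are illustrative choices for this example, not taken from the webinar.

```r
# plumber.R (hypothetical file name)

# An ordinary R function; nothing here is Plumber-specific.
add_numbers <- function(a, b) {
  # Query-string values typically arrive as text, so convert them first.
  as.numeric(a) + as.numeric(b)
}

# The "#*" comments below are what Plumber reads to turn the function that
# follows them into an HTTP endpoint.

#* Add two numbers
#* @param a The first number
#* @param b The second number
#* @get /sum
function(a, b) {
  list(result = add_numbers(a, b))
}
```

Assuming the plumber package is installed, running `plumber::pr_run(plumber::pr("plumber.R"), port = 8000)` should serve the file locally, so that a GET request to /sum?a=2&b=3 returns the sum as JSON.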
Webinar materials: https://rstudio.com/resources/webinars/expanding-r-horizons-integrating-r-with-plumber-apis/
About James: James is a Solutions Engineer at RStudio, where he focusses on helping RStudio commercial customers successfully manage RStudio products. He is passionate about connecting R to other toolchains through tools like ODBC and APIs. He has a background in statistics and data science and finds any excuse he can to write R code.
About Barret: I specialize in Large Data Visualization where I utilize the interactivity of a web browser, the fast iterations of the R programming language, and large data storage capacity of Hadoop.

Sean Lopp & Lou Bajuk | R & Python: A Data Science Love Story | RStudio (2020)
Many Data Science teams today leverage both R and Python in their work, but struggle to use them together. Data Science leaders and their business partners find it difficult to make key data science content easily discoverable and available for decision-making, while IT Admins and DevOps engineers grapple with how to efficiently support these teams without duplicating infrastructure. Even experienced data scientists familiar with both languages often struggle to combine them without painful context switching and manual translations.
In this webinar, you will learn how RStudio helps organizations tackle these challenges, with a focus on some of the recent additions to our products that have helped deepen the happy relationship between R and Python:
- Easily combine R and Python in a single Data Science project using a single IDE.
- Leverage a single infrastructure to launch and manage Jupyter Notebooks, JupyterLab, VSCode and the RStudio IDE, while giving your team easy access to Kubernetes and other resources.
- Share and manage access to R- and Python-based interactive applications, dashboards, and APIs, all in a single place.
Webinar materials: https://rstudio.com/resources/webinars/r-python-a-data-science-love-story/
About Lou: Lou is a passionate advocate for data science software, and has had many years of experience in a variety of leadership roles in large and small software companies, including product marketing, product management, engineering and customer success. In his spare time, his interests include books, cycling, science advocacy, great food and theater.
About Sean: Sean has a degree in mathematics and statistics and worked as an analyst at the National Renewable Energy Lab before making the switch to customer success at RStudio. In his spare time he skis and mountain bikes and is a proud Colorado native.
Sean Lopp & Rich Iannone | Avoid Dashboard Fatigue | RStudio (2020)
Data science teams face a challenging task. Not only do they have to gain insight from data, they also have to persuade others to make decisions based on those insights. To close this gap, teams rely on tools like dashboards, apps, and APIs. But unfortunately data organizations can suffer from their own success - how many of those dashboards are viewed once and forgotten? Is a dashboard of dashboards really the right solution? And what about that pesky, precisely formatted Excel spreadsheet finance still wants every week?
In this webinar, we’ll show you an easy way teams can solve these problems using proactive email notifications through the blastula and gt packages, and how RStudio pro products can be used to scale out those solutions for enterprise applications. Dynamic emails are a powerful way to meet decision makers where they live - their inbox - while displaying exactly the results needed to influence decision-making. Best of all, these notifications are crafted with code, ensuring your work is still reproducible, durable, and credible.
We’ll demonstrate how this approach provides solutions for data quality monitoring, detecting and alerting on anomalies, and can even automate routine (but precisely formatted) KPI reporting.
Webinar materials: https://rstudio.com/resources/webinars/avoid-dashboard-fatigue/
About Sean: Sean has a degree in mathematics and statistics and worked as an analyst at the National Renewable Energy Lab before making the switch to customer success at RStudio. In his spare time he skis and mountain bikes and is a proud Colorado native.
About Rich: My background is in programming, data analysis, and data visualization. Much of my current work involves a combination of data acquisition, statistical programming, tools development, and visualizing the results. I love creating software that helps people accomplish things. I regularly update several R package projects (all available on GitHub). One such package is called DiagrammeR and it’s great for creating network graphs and performing analyses on the graphs. One of the big draws for open-source development is the collaboration that comes with the process. I encourage anyone interested to ask questions, make recommendations, or even help out if so inclined!

Marie Vendettuoli | Lessons learned developing a library of validated packages | RStudio
Full title: Towards an integrated {verse}: lessons learned developing a library of validated packages
Developing R packages as a unified {verse} – a set of packages that work well together but with each focusing on individual tasks – is an efficient strategy to structure support for complex workflows. The ongoing challenge becomes managing the growth of related packages in a holistic manner. This is especially problematic in industries with a heavy emphasis on stability, for example if packages need to be validated prior to use in production. In this talk, I will discuss a paradigm for developing and maintaining validated R packages, emphasizing the following areas:
- Strategies for organizing packages to prevent excessive re-work
- Facilitating responsive, iterative development and
- Empathy for developer and user experiences
About Marie: Marie Vendettuoli is a Senior Statistical Programmer at Statistical Center for HIV/AIDS Research and Prevention (SCHARP - https://www.fredhutch.org/en/research/divisions/vaccine-infectious-disease-division/research/biostatistics-bioinformatics-and-epidemiology/statistical-center-for-hiv-aids-research-and-prevention.html ) at Fred Hutch. She holds a PhD from Iowa State University in Human Computer Interaction and started developing R packages for use within regulatory frameworks while working as a Data Scientist at USDA Center for Veterinary Biologics (https://www.aphis.usda.gov/aphis/ourfocus/animalhealth/veterinary-biologics/sa_about_vb/ct_vb_about). Before discovering R, Marie worked in a CBER (https://www.fda.gov/about-fda/fda-organization/center-biologics-evaluation-and-research-cber)-regulated laboratory. Her main interest is developing analytical infrastructure to facilitate scientific analysis for fellow data scientists working in a regulatory environment.
Matt Thomas & Mike Page | How the Tidyverse helped the British Red Cross respond to COVID | RStudio
Full title: Cognitive speed: How the Tidyverse helped the British Red Cross respond quickly to COVID-19
We will discuss the importance of cognitive speed, defined here as the rate at which an idea can be translated into code, and why the Tidyverse excels in this domain. We will demonstrate this idea in relation to a suite of tools we were required to rapidly develop at the British Red Cross in order to respond effectively to the COVID-19 pandemic. To do this, we will exhibit how elements of the unifying design principles outlined in the ‘tidyverse design guide - Tidyverse team’ relate to the notion of cognitive speed, giving specific examples for various design considerations. We believe this talk will encourage reflection on better design practices for future R developers, using the design principles of the tidyverse as the guiding beacon.
About Matt: Dr. Matt Thomas is Head of Strategic Insight and Foresight at the British Red Cross. Matt’s team aims to help the Red Cross become more anticipatory and proactive by producing insights and tools including the Vulnerability Index (https://britishredcrosssociety.github.io/covid-19-vulnerability/) and Resilience Index (https://britishredcross.shinyapps.io/resilience-index/). He holds a PhD in Evolutionary Anthropology and, prior to joining the British Red Cross, was researching topics including reindeer herders in the Arctic, hunter-gatherers in the Philippines, and witches in China. Outside of work, Matt writes a column for an anthropology magazine (https://www.sapiens.org/column/machinations/) as well as fiction.
About Mike: Mike Page is a data scientist on the Strategic Insight and Foresight team at the British Red Cross. Here, he helps to develop a suite of open source tools and dashboards including the Vulnerability Index (https://britishredcrosssociety.github.io/covid-19-vulnerability/) and Resilience Index (https://britishredcross.shinyapps.io/resilience-index/). Mike is also the author of several R packages including mortyr and newsrivr. In his spare time you can find him rock climbing around the Alps.
Megan Beckett | Aesthetically automated figure production | RStudio
Automation, reproducibility, data driven. These are not normally concepts one would associate with the traditional publishing industry, where designers normally manually produce every artefact in proprietary software. And, when you have 1000s of figures to produce and update for a single textbook, this becomes an insurmountable task, meaning our textbooks quickly become outdated, especially in our rapidly advancing world.
With R and the tidyverse in our back pocket, we rose to the challenge to revolutionize this workflow. I will explain how we collaborated with a publishing group to develop a system to aesthetically automate the production of figures for a textbook including translations into several languages.
I think you’ll find this talk interesting as it shows how we applied tools that are familiar to us, but in an unconventional way to fundamentally transform a conventional process.
About Megan: Megan Beckett is a Data Scientist at Exegetic Analytics, where she consults, develops and leads several analytical projects across a wide range of fields and industries. “Scientifically creative; creatively scientific.” This aptly describes her philosophy and approach in her work and life. Megan helped co-found and organises the Cape Town R-Ladies chapter and is a co-organiser of the satRday events in South Africa. She loves to paint, with her most recent work exploring the biodiversity of southern Africa, and running is her passion, whether on the road or the trail.
Nicole Kramer | A New Paradigm for Multifigure Coordinate-Based Plotting in R | RStudio
R is unparalleled in its ability to transform raw data into a wide array of beautiful graphics, all within the same environment. However, when it comes to complex, multi-paneled plots, users rely on 3rd party graphic design software to arrange plots. Here I present the new world of programmatic, coordinate-based multi-figure plotting in R. Employing grid Graphics and drawing from the paradigms of base plotting and ggplot2, I am developing a package that will revolutionize the way plots are laid out in R. Not only will individual plots be aesthetically customizable and tailored for speed, users will also be offered exquisite control over all aspects of page layout, plot placement, and arrangements. Come join me in changing how we plot in R!
About Nicole: Nicole Kramer is a third-year Bioinformatics and Computational Biology graduate student at the University of North Carolina at Chapel Hill. She works in the lab of Dr. Doug Phanstiel, where she and her colleagues use experimental and computational techniques to study human genomics. Prior to grad school, Nicole received her B.S. in Biological Engineering from MIT in 2018. When not doing science, you can find Nicole petting dogs, admiring giraffes, or knitting tiny animals!
Shelmith Kariuki | rKenyaCensus Package | RStudio
The rKenyaCensus package contains the results of the 2019 Kenya Population Census. The census exercise was carried out in August 2019, and the results were released in February 2020. Kenya leveraged technology to capture data during cartographic mapping, enumeration and data transmission, making the 2019 Census the first paperless census to be conducted in Kenya. The data was published in four different PDF files (Volume 1 - Volume 4), which can be found on the Kenya National Bureau of Statistics website. The data in its current form was open and accessible, but not usable, and so there was a need to convert it into a machine-readable format. This data can be used by the government, non-governmental organizations and any other entities for data-driven policy making and development. During the talk, I will explain the reasons behind the development of the package, take you through the steps I took during the process and finally showcase analysis of certain aspects of the data.
About Shelmith: Shelmith Kariuki is a Senior Data Analyst based in Nairobi, Kenya. She is an RStudio Certified Tidyverse trainer (https://education.rstudio.com/trainers/), currently working as a Data Analytics consultant with UN DESA. She previously worked as a Research Manager at Geopoll, and as a Data Analyst at Busara Center for Behavioral Economics. She also worked as an assistant lecturer in various Kenyan universities, teaching units in Statistics and Actuarial Science. She has extensive experience in data analysis using R. She co-organizes a community of R users in Nairobi (https://www.linkedin.com/feed/hashtag/nairobir/) and in Africa (https://twitter.com/AfricaRUsers). One of the missions of her community work is to increase the number of R adopters in Africa. She is very passionate about training and using data analytics to drive development projects in Africa.
Vicki Boykis | Your public garden | RStudio
Vicki will discuss how, as people who can write code and analyze data, we have a lot of input and power over what our digital and work worlds look like, and therefore can act as agents of change and repair.
About Vicki: Vicki Boykis is a machine learning engineer at Automattic, the company behind Wordpress.com. She works mostly in Python, R, Spark, and SQL, and really enjoys building end-to-end data products. Outside of work she publishes the Normcore Tech newsletter (https://vicki.substack.com) and blogs at https://veekaybee.github.io/. In her “spare time”, she blogs, reads, and writes terrible joke tweets about data.
Winston Chang | Making Shiny apps faster with caching | RStudio
Shiny 1.6 has a new function, bindCache(), which makes it easy to dramatically speed up reactive expressions and output rendering functions. This allows many applications to scale up to serve several times more users without an increase in server resources.
Note: Shiny 1.6.0 isn’t yet on CRAN, but will be in the next few days. In the meantime, you can install it with:
remotes::install_github("rstudio/shiny@rc-v1.6.0")
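As a rough illustration of the idea (not code from the talk), the sketch below caches a deliberately slow reactive on the value of one input. The input name `n`, the artificial `Sys.sleep()` delay, and the app layout are assumptions made for the example; only `bindCache()` itself comes from the description above.

```r
library(shiny)

ui <- fluidPage(
  sliderInput("n", "Sample size", min = 10, max = 1000, value = 100),
  plotOutput("hist")
)

server <- function(input, output, session) {
  # Hypothetical slow computation, standing in for an expensive query or model.
  sampled <- bindCache(
    reactive({
      Sys.sleep(1)          # artificial delay, for illustration only
      set.seed(42)
      rnorm(input$n)
    }),
    input$n                 # cache key: results are reused for a repeated n
  )

  output$hist <- renderPlot(hist(sampled(), main = "Cached sample"))
}

# Build the app object; run it interactively with runApp(app).
app <- shinyApp(ui, server)
```

With this setup, moving the slider back to a value that has already been seen should return the stored result instead of repeating the slow step.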
About Winston: Winston is a software engineer at RStudio. He holds a Ph.D. in psychology from Northwestern University and is the author of R Graphics Cookbook, published by O’Reilly Media

Ahmadou Dicko | Humanitarian Data Science with R | RStudio
Humanitarian actors are increasingly using data to drive their decisions. Since the Haiti 2010 earthquake, the volume of data collected and used by humanitarians has been growing exponentially and organizations are now relying on data specialists to turn all this data into life-saving data products.
These data products are created by teams using proprietary point and click software. The process from the raw data to the final data product involves a lot of clicking, copying and pasting and is usually not reproducible.
Another approach to humanitarian data science is possible using R. In this talk, I will show how to seamlessly develop reproducible, reusable humanitarian data products using the tidyverse, rmarkdown and some domain-focused R packages.
About Ahmadou: Ahmadou Dicko is a statistics and data analysis officer at the United Nations High Commissioner for Refugees (UNHCR), where he uses statistics and data science to help safeguard the rights and well-being of refugees in West and Central Africa. He has extensive experience in the use of statistics and data science in development and humanitarian projects. Ahmadou was the lead of the OCHA Center for Humanitarian Data team for West and Central Africa and has worked with several humanitarian and development organizations such as IFRC, FAO, IAEA, OCHA. Ahmadou is an RStudio trainer (https://education.rstudio.com/trainers/) and he is passionate about the R community. He is currently co-organizing the Dakar R User Group (https://www.meetup.com/DakaR-R-User-Group/) and co-leading the AfricaR initiative (https://africa-r.org/).
Sean Lopp | Announcing Posit Package Manager | RStudio (2019)
Posit Package Manager is the newest professional product that helps teams, departments, and entire enterprises organize and centralize package management. If you’ve ever struggled with IT to get access to a new (any?) R package, reproduce an old result, or share your code with others, Posit Package Manager can help! We’ll introduce the new product, discuss how R repositories can be used to solve problems and take a sneak peek at what is coming in 2019.
VIEW MATERIALS https://github.com/slopp/rspm-rstudioconf
About the Author Sean Lopp Sean has a degree in mathematics and statistics and worked as an analyst at the National Renewable Energy Lab before making the switch to customer success at RStudio. In his spare time he skis and mountain bikes and is a proud Colorado native.
*Posit Package Manager, formerly known as RStudio Package Manager
Tracy Teal | Teaching R using inclusive pedagogy: Carpentries workshops lessons learned | RStudio
Talk from rstudio::conf(2019)
The Carpentries is an open, global community teaching researchers the skills to turn data into knowledge. Since 2012 we have taught 700+ R workshops & trained 1600+ volunteer instructors. Our workshops use evidence-based teaching, focus on foundational and relevant skills and create an inclusive environment. Teaching the Tidyverse allows learners to start working with data quickly, and keeps them motivated to begin and sustain their learning. Our assessments show that these approaches have been successful in attracting diverse learners, building confidence & increasing coding usage. Through our train-the-trainer model and open, collaborative lessons, this approach scales globally to reach more learners and further democratize data.
About Tracy Teal: Executive Director of The Carpentries (https://carpentries.org) and co-founder of Data Carpentry (http://www.datacarpentry.org), a non-profit organization that develops and runs workshops training researchers in effective data analysis and visualization to enable data-driven discovery. Manages projects, operations and finances. Leads lesson development and volunteer coordination and is responsible for strategic and business planning.
Jake Thompson | Branding and Packaging Reports with R Markdown | RStudio (2020)
The creation of research reports and manuscripts is a critical aspect of the work conducted by organizations and individual researchers. Most often, this process involves copying and pasting output from many different analyses into a separate document. Especially in organizations that produce annual reports for repeated analyses, this process can also involve applying incremental updates to annual reports. It is important to ensure that all relevant tables, figures, and numbers within the text are updated appropriately. Done manually, these processes are often error-prone and inefficient. R Markdown is ideally suited to support these tasks. With R Markdown, users are able to conduct analyses directly in the document or read in output from a separate analysis pipeline. Tables, figures, and in-line results can then be dynamically populated and automatically numbered to ensure that everything is correctly updated when new data is provided. Additionally, the appearance of documents rendered with R Markdown can be customized to meet specific branding and formatting requirements of organizations and journals. In this presentation, we will present one implementation of customized R Markdown reports used for Accessible Teaching, Learning, and Assessment Systems (ATLAS) at the University of Kansas. A publicly available R package, ratlas, provides both Microsoft Word and LaTeX templates for different types of projects at ATLAS with their own unique formatting requirements. We will discuss how to create brand-specific templates, as well as how to incorporate the templates into an R package that can be used to unify report creation across an organization. We will also describe other components of branding reports beyond R Markdown templates, including customized ggplot2 themes, which can also be wrapped into the R package. Finally, we will share lessons learned from incorporating the R package workflow into an existing reporting pipeline.
https://rstudio.com/resources/rstudioconf-2020/branding-and-packaging-reports-with-r-markdown/
Mine Çetinkaya-Rundel | Making the Shiny Contest | RStudio (2020)
In January 2019 RStudio launched the first-ever Shiny contest to recognize outstanding Shiny applications and to share them with the community. We received 136 submissions for the contest and reviewing them was incredibly inspiring and humbling. In this talk, we shine a spotlight on the backstage: the inspiration behind the contest, the process of evaluation, what we learned about Shiny developers and how we can better support them, and what we learned about running contests and how we hope to improve the Shiny Contest experience. We also highlight some of the winning apps as well as the newly revamped Shiny Gallery, which features many noteworthy contest submissions. Finally, we introduce the new process for submitting your apps to the Shiny Gallery and, of course, to Shiny Contest 2020! https://rstudio.com/resources/rstudioconf-2020/making-the-shiny-contest/

Javier Luraschi | Updates on Spark, MLflow, and the broader ML ecosystem | RStudio (2020)
Originally posted at https://rstudio.com/resources/rstudioconf-2020/updates-on-spark-mlflow-and-the-broader-ml-ecosystem/
Paige Bailey | Deep Learning with R | RStudio (2020)
Originally posted to https://rstudio.com/resources/rstudioconf-2020/deep-learning-with-r/
Paige Bailey is the product manager for TensorFlow core as well as Swift for TensorFlow. Prior to her role as a PM in Google’s Research and Machine Intelligence org, Paige was developer advocate for TensorFlow core; a senior software engineer and machine learning engineer in the office of the Microsoft Azure CTO; and a data scientist at Chevron. Her academic research was focused on lunar ultraviolet, at the Laboratory for Atmospheric and Space Physics (LASP) in Boulder, CO, as well as Southwest Research Institute (SwRI) in San Antonio, TX
Yihui Xie | One R Markdown Document, Fourteen Demos | RStudio (2020)
R Markdown is a document format based on the R language and Markdown to intermingle computing with narratives in the same document. With this simple format, you can actually do a lot of things. For example, you can generate reports dynamically (no need to cut-and-paste any results because all results can be dynamically generated from R), write papers and books, create websites, and make presentations. In this talk, I’ll use a single R Markdown document to give demos of the R package rmarkdown and of:
- bookdown for authoring books (https://bookdown.org),
- blogdown for creating websites (https://github.com/rstudio/blogdown),
- rticles for writing journal papers (https://github.com/rstudio/rticles),
- xaringan for making slides (https://github.com/yihui/xaringan),
- flexdashboard for generating dashboards (https://github.com/rstudio/flexdashboard),
- learnr for tutorials (https://github.com/rstudio/learnr),
- rolldown for storytelling (https://github.com/yihui/rolldown),
- and the integration between Shiny and R Markdown.
To make the best use of your time during the presentation, I recommend that you take a look at the rmarkdown website in advance: https://rmarkdown.rstudio.com
Miriah Meyer | Effective Visualizations | RStudio (2020)
Originally posted to https://rstudio.com/resources/rstudioconf-2020/effective-visualizations/
Riva Quiroga | The development of “datos” package for the R4DS Spanish translation | RStudio (2020)
Originally posted at https://rstudio.com/resources/rstudioconf-2020/the-development-of-datos-package-for-the-r4ds-spanish-translation/
Jeff Leek | Data science education as a public health intervention in E. Baltimore | RStudio (2020)
Originally posted at https://rstudio.com/resources/rstudioconf-2020/data-science-education-as-an-economic-and-public-health-intervention-in-east-baltimore/
Rebecca Barter | Becoming an R blogger | RStudio (2020)
Blogging is an excellent way to learn, improve your communication skills, and gain exposure in the R and data science communities. In this talk, I will discuss how and why I started blogging, and why you should too. I will guide you through choosing topics, writing your blog using RStudio and blogdown, hosting it on netlify, and sharing your blog with the world. This talk is for you if you’ve wanted to start a blog on R, data science, or to showcase your data analyses, but don’t know where to start.
Materials: github https://github.com/rlbarter/rstudio-conf-2020-blogger-slides slides (pdf) https://github.com/rlbarter/rstudio-conf-2020-blogger-slides/blob/master/Becoming%20an%20R%20blogger
Kara Woo | Boxplots: a case study in debugging and perseverance | RStudio (2019)
Come on a journey through pull request #2196. What started as a seemingly simple fix for a bug in ggplot2’s box plots developed into an entirely new placement algorithm for ggplot2 geoms. This talk will cover tips and techniques for debugging, testing, and not smashing your computer when dealing with tricky bugs.
VIEW MATERIALS https://github.com/karawoo/2019-01-17-rstudioconf
About the Author Kara Woo Kara is a research scientist in data curation at Sage Bionetworks, where she helps other researchers document and share their data. She has previously worked as an information manager at Washington State University and at the National Center for Ecological Analysis and Synthesis (NCEAS), where she combined data management with fieldwork at a remote Siberian lake. Kara is an enthusiastic R programmer, and collects data visualizations gone beautifully wrong on a blog called accidental aRt
Edgar Ruiz | Databases using R: The latest | RStudio (2019)
Learn about the latest packages and techniques that can help you access and analyze data found inside databases using R. Many of the techniques we will cover are based on our personal and the community’s experiences of implementing concepts introduced last year, such as offloading most of the data wrangling to the database using dplyr, and using the RStudio IDE to preview the database’s layout and data. Also, learn more about the most recent improvements to the RStudio products that are geared to aid developers in using R with databases effectively.
VIEW MATERIALS https://github.com/edgararuiz/databases-w-r
About the Author Edgar Ruiz Edgar is the author and administrator of the https://db.rstudio.com web site, and current administrator of the sparklyr web site: https://spark.rstudio.com. Author of the Data Science in Spark with sparklyr cheatsheet. Co-author of the dbplyr package and creator of the dbplot package.

Jonathan McPherson | New language features in RStudio | RStudio (2019)
RStudio 1.2 dramatically improves support for many languages frequently used alongside R in data science projects, including SQL, D3, Stan, and Python. In this talk, you’ll learn how to use RStudio 1.2’s new language features to work more efficiently and fluidly in multi-lingual projects.
VIEW MATERIALS https://github.com/rstudio/rstudio-conf/tree/master/2019/RStudio_1.2_Language_Features--Jonathan_McPherson
About the Author Jonathan McPherson Jonathan is a software engineer at RStudio working on the IDE. In the past, he’s written Web applications at a nuclear site in the desert, exploratory information visualization systems at UC Davis, and features for flagship Office products and modern web applications at Microsoft
Wes McKinney | Ursa Labs and Apache Arrow in 2019 | RStudio (2019)
Learn more about what’s happening at Ursa Labs at https://wesmckinney.com/archives.html
Amelia McNamara | Working with categorical data in R without losing your mind | RStudio (2019)
Categorical data, called “factor” data in R, presents unique challenges in data wrangling. R users often look down on tools like Excel for automatically coercing variables to incorrect datatypes, but factor data in R can produce very similar issues. The stringsAsFactors=HELLNO movement and standard tidyverse defaults have moved us away from the use of factors, but they are sometimes still necessary for analysis. This talk will outline common problems arising from categorical variable transformations in R, and show strategies to avoid them, using both base R and the tidyverse (particularly, dplyr and forcats functions).
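To make the kind of problem described above concrete, here is a small base R sketch of one well-known factor pitfall and its usual fix. It is an illustration chosen for this listing, not an example taken from the talk, and the variable names are made up.

```r
# A numeric-looking column that ended up stored as a factor.
years <- factor(c("2010", "2015", "2020", "2015"))

# Pitfall: as.numeric() returns the internal level codes, not the values.
codes <- as.numeric(years)

# Usual fix: convert to character first, then to numeric.
values <- as.numeric(as.character(years))

# Levels are sorted alphabetically unless given explicitly,
# so ordered categories should spell out their level order.
sizes <- factor(c("small", "large", "medium"),
                levels = c("small", "medium", "large"))

print(codes)
print(values)
print(levels(sizes))
```

The first conversion silently yields the codes 1, 2, 3, 2 rather than the years, which is the same style of quiet type coercion the abstract compares to spreadsheet tools.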
VIEW MATERIALS http://www.amelia.mn/WranglingCats.pdf
(related paper from the DSS collection) http://bitly.com/WranglingCats https://peerj.com/collections/50-practicaldatascistats/
About the Author Amelia McNamara My work is focused on creating better tools for novices to use for data analysis. I have a theory about what the future of statistical programming should look like, and am working on next steps toward those tools. For more on that, see my dissertation. My research interests include statistics education, statistical computing, data visualization, and spatial statistics. At the moment, I am very interested in the effects of parameter choices on data analysis, particularly data visualizations. My collaborator Aran Lunzer and I have produced an interactive essay on histograms, and an initial foray into the effects of spatial aggregation. I talked more about spatial aggregation in my 2017 OpenVisConf talk, How Spatial Polygons Shape Our World
Tyler Morgan-Wall | 3D mapping, plotting, and printing with rayshader | RStudio (2019)
Long form discussion: https://www.tylermw.com/3d-printing-rayshader/
Jeroen Ooms | A preview of Rtools 4.0 | RStudio (2019)
Rtools is getting a major upgrade. In addition to the latest gcc, it now includes a full build system and package manager to build, install, and distribute external C/C++/Fortran libraries needed by R packages. Thereby it bridges the long-standing gap between Windows and macOS/Linux with respect to the availability of high-quality, up-to-date system libraries. In this talk, we will show how to build and install system libraries with Rtools, and manage your Rtools build environment. It should be interesting both for Windows users and for non-Windows package authors who are interested in reducing the pain of making things work on Windows.
VIEW MATERIALS https://resources.rstudio.com/rstudio-conf-2019/a-preview-of-rtools-4-0
About the Author Jeroen Ooms Postdoc hacker for @ropensci at UC Berkeley

Emily Robinson | Building an AB testing analytics system with R and Shiny | RStudio (2019)
Online experimentation, or A/B Testing, is the gold standard for measuring the effectiveness of changes to a website. While A/B testing is used at tens of thousands of companies, it can seem difficult to parse without resorting to expensive end-to-end commercial options. Using DataCamp’s system as an example, I’ll illustrate how R is actually a great language for building powerful analytical and visualization A/B testing tools. We’ll first dive into our open-source funneljoin package, which allows you to quickly analyze sequential actions using different types of behavioral funnels. We’ll then cover the importance of setting up health checks for every experiment. Finally, we’ll see how Shiny dashboards can help people monitor and quickly analyze multiple A/B tests each week.
VIEW MATERIALS http://bit.ly/rstudio19
About the Author Emily Robinson I work at DataCamp as a Data Scientist on the growth team. Previously, I was a Data Analyst at Etsy working with their search team to design, implement, and analyze experiments on the ranking algorithm, UI changes, and new features. In summer 2016, I completed Metis’s three-month, full-time Data Science Bootcamp, where I did several data science projects, ranging from using random forests to predict successful projects on DonorsChoose.org to building an application in R Shiny that helps data science freelancers find their best-fit jobs. Before Metis, I graduated from INSEAD with a Master’s degree in Management (specialization in Organizational Behavior). I also earned my bachelor’s degree from Rice University in Decision Sciences, an interdisciplinary major I designed that focused on understanding how people behave and make decisions
Karthik Ram | A guide to modern reproducible data science with R | RStudio (2019)
Resources: https://github.com/karthik/rstudio2019
Hilary Parker | Cultivating creativity in data work | RStudio (2019)
Traditionally, statistical training has focused primarily on mathematical derivations, proofs of statistical tests, and the general correctness of what methods to use for certain applications. However, this is only one dimension of the practice of doing analysis. Other dimensions include the technical mastery of a language and tooling system, and most importantly the construction of a convincing narrative tailored to a specific audience, with the ultimate goal of them accepting the analysis. These “softer” aspects of analysis are difficult to teach, perhaps more so when the field is framed as mathematics and often housed in mathematics departments. In this talk, I discuss an alternative framework for viewing the field, borrowing upon the past work in other fields such as design. Looking forward, we as a field can borrow from these fields to cultivate and hone the creative lens so necessary to the success of applied work.
VIEW MATERIALS https://www.slideshare.net/mobile/hilaryparker/rstudioconf2019l
About the Author Hilary Parker I’m a Senior Data Analyst at Etsy, where I help product teams with data-driven development, via experimentation, opportunity sizing and impact analysis. I got my Ph.D. from the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health, working with Jeff Leek. I studied genomics and built tools to help researchers use genomic technologies in personalized medicine applications. I graduated from Pomona College in 2008, where I double-majored in Mathematics and Molecular Biology. True to my liberal arts upbringing, I’m a passionate teacher. Most notably, I taught an introductory Biostatistics class at the American University of Armenia (and kept a pretty cool travel blog along the way)
Angela Bassa | Data science as a team sport | RStudio (2019)
How do you data science as a team sport? Oftentimes a data scientific initiative starts with just a single, lonesome data scientist. But when that germ of a team is successful and starts expanding, should the team be embedded in other disciplines or should it be centralized into its own function? Where should it live in the organizational structure? Should you focus on recruiting senior data scientists or is there a benefit to attracting junior talent as well? And in terms of capabilities, should you hold out for unicorns or hire several specialists to get all jobs done? Data scientists need to work on almost every aspect of a business, so how should a team composition set the data science discipline up for success? Great data scientists have career options and won’t abide bad managers for very long: if you want to retain them, you’ll need to care about their work, connect it to the business, and design a diverse, resilient, high-performing team.
Materials: https://github.com/angelabassa/rstudioconf-2019
James Blair | Democratizing R with Plumber APIs | RStudio (2019)
The Plumber package provides an approachable framework for exposing R functions as HTTP API endpoints. This allows R developers to create code that can be consumed by downstream frameworks, which may be R agnostic. In this talk, we’ll take an existing Shiny application that uses an R model and turn that model into an API endpoint so it can be used in applications that don’t speak R.
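A minimal sketch of the pattern described above, assuming plumber 1.0 or later with its programmatic `pr()` and `pr_get()` interface. The model (a linear fit on the built-in `mtcars` data), the `/predict` path, and the handler name `predict_mpg` are hypothetical choices for illustration, not the application used in the talk.

```r
library(plumber)

# Hypothetical model: predict fuel efficiency from car weight.
fit <- lm(mpg ~ wt, data = mtcars)

# Handler: query parameters arrive as strings, so convert before predicting.
predict_mpg <- function(wt) {
  new_data <- data.frame(wt = as.numeric(wt))
  list(mpg = unname(predict(fit, new_data)))
}

# Build the API object and register the handler on GET /predict.
api <- pr_get(pr(), "/predict", predict_mpg)

# To serve it interactively (blocks the session):
# pr_run(api, port = 8000)
```

Once running, any HTTP client could request `/predict?wt=3` and receive a JSON response, with no R installed on the calling side.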
VIEW MATERIALS https://bit.ly/2TXfFR5
About the Author James Blair James holds a master’s degree in data science from the University of the Pacific and works as a solutions engineer. He works to integrate RStudio products in enterprise environments and support the continued adoption of R in the enterprise. His past consulting work centered around helping businesses derive insight from data assets by leveraging R. Outside of R and data science, James’s interests include spending time with his wife and daughters, cooking, camping, cycling, racquetball, and exquisite food. Also, he never turns down a funnel cake
Eric Nantz | Effective use of Shiny modules in application development | RStudio (2019)
As a Shiny application grows in scale, organizing code into reusable and streamlined components becomes vital to manage future enhancements and avoid unnecessary duplication. Shiny modules are R functions that can be reused multiple times within an application without namespace collisions, and that help organize the code base. Like R functions, modules can be simple utilities or elaborate pieces with multiple inputs and outputs. While the process of creating a module is uncomplicated, application developers can quickly encounter challenges including communication among modules, defining logical compositions, and avoiding hidden state modifications. In this talk, we will introduce practical principles and techniques developers can leverage to address these issues head-on, such as documenting modules, passing parameters and return values effectively between modules, and nesting modules to enable dynamic user interfaces with minimal overhead.
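The namespacing idea can be sketched with a small counter module used twice in one app. This is an illustrative example rather than material from the talk: the names `counterUI` and `counterServer` are hypothetical, and it uses `moduleServer()`, which was added in Shiny 1.5.0 after this talk was given.

```r
library(shiny)

# Module UI: NS(id) prefixes every input/output ID with the module's id.
counterUI <- function(id, label = "Count") {
  ns <- NS(id)
  tagList(
    actionButton(ns("inc"), label),
    textOutput(ns("value"))
  )
}

# Module server: sees only its own namespaced inputs and outputs.
counterServer <- function(id) {
  moduleServer(id, function(input, output, session) {
    count <- reactiveVal(0)
    observeEvent(input$inc, count(count() + 1))
    output$value <- renderText(count())
    count  # return the reactive so the caller can use the value
  })
}

# The same module is reused twice without ID collisions.
ui <- fluidPage(
  counterUI("a", "Counter A"),
  counterUI("b", "Counter B")
)

server <- function(input, output, session) {
  a <- counterServer("a")
  b <- counterServer("b")
}

app <- shinyApp(ui, server)
```

Returning the reactive from the module server is one simple way to pass values back to the calling code, which is the kind of communication between modules the abstract refers to.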
VIEW MATERIALS https://rpodcast.github.io/rsconf-2019
About the Author Eric Nantz I have a broad background in statistics, computer science, and system administration which gives me a unique set of skills for using state-of-the-art technology and techniques to accomplish important and innovative data analyses. In my professional role as a statistician, I support the design and analyses of clinical trials evaluating treatments for auto-immune disorders. I also perform statistical analyses of specialized biomarkers utilizing cutting-edge statistical software such as R and high-performance computing infrastructures. I am also the creator, producer, and host of the R-Podcast. The R-Podcast is dedicated to helping those who are new to statistical computing develop their skills and confidence in using the free and open-source statistical computing package called R to get their data analyses done
Hao Zhu | Empowering a data team with RStudio addins | RStudio (2019)
RStudio addins provide a mechanism to extend RStudio in various ways. Addins can interact with the RStudio IDE through the RStudio API. They can also provide users with a graphical interface built on Shiny. In practice, we found them very useful for enhancing or streamlining interaction with data and computing infrastructure. In this talk, we will demonstrate how our team develops and uses RStudio addins to empower our work. You will see some internal tools created to help us manage database connections, and an addin which helps us access external cloud computing resources. We will also show an example of using the addins in rcrossref and citr to download and manage citation and literature databases during rmarkdown document development.
VIEW MATERIALS https://github.com/hebrewseniorlife/addin_demo
About the Author Hao Zhu Hao is a data analyst and software developer working at the Hinda and Arthur Marcus Institute for Aging Research. He completed his training at Boston University School of Medicine in the program on Clinical Investigation. His interests include research reproducibility, data visualization and machine learning. At the Marcus Institute, he works with different teams on various topics, ranging from smartphone motion sensors to MRI images, and helps researchers understand their data by creating analytical reports and web applications. At the same time, Hao leads the development of R packages in the Biostatistics Core. He has contributed multiple R packages to the open source R community, such as kableExtra and memor. He also has a passion for teaching and has mentored several students at the Marcus Institute
Amanda Gadrow | Getting it right: Writing reliable and maintainable R code | RStudio (2019)
How can you tell that your scripts, applications, and package functions are working as expected? Are you sure that when you make changes in one part of the code, it won’t break something in another part? Have you thought deeply about how the consumers of your code (including Future You) will use it, maintain it, fix it, and improve it? Code quality is essential not only for reliable results but also for your script’s maintainability and your users’ satisfaction. Quality can be measured in part with targeted testing, and fortunately, there are several effective and easy-to-use code testing tools available in R. This talk will discuss some of the most useful testing packages, covering both concepts and examples.
VIEW MATERIALS https://github.com/rstudio/rstudio-conf/tree/master/2019/Testing_R_Code--Amanda_Gadrow
About the Author Amanda Gadrow Amanda is a software engineer with many years’ experience writing automated test frameworks for enterprise software. She started learning R when she joined RStudio in 2016, and has been basking in its glory ever since. Amanda leads the QA and Support teams, and spends a significant amount of time analyzing customer data to improve the products and optimize support. She is a co-organizer of R-Ladies Columbus, and an avid musician on the side
Thomas Lin Pedersen | gganimate live cookbook | RStudio (2019)
Animation of data visualisation is becoming increasingly popular both as an attention grabber on social media and as a way to tell small data stories. gganimate is a package that extends ggplot2 for making animations and provides a grammar of animation on top of the grammar of graphics. This talk will quickly introduce gganimate, and then dive into a series of different animations and show how they were made and how they could be changed or expanded.
Slides: https://data-imaginist.com/slides/rstudioconf2019 Resources: https://resources.rstudio.com/rstudio-conf-2019/gganimate-live-cookbook Discussion: https://community.rstudio.com/t/gganimate-live-cookbook-thomas-lin-pedersen-rstudio-conf-2019l-video/24852

Rich Iannone | Introducing the gt package | RStudio (2019)
With the gt package, anyone can make great-looking display tables. Though the package is still early in development, you can do some really great things with it right now! I’ll walk through a few examples that touch upon the more common table-making use cases. These will include features like adding table parts, integrating footnotes, styling/transforming table cells, using tables in R Markdown documents, and even including gt tables in email messages.
VIEW MATERIALS https://github.com/rich-iannone/presentations/tree/master/2019_01-19-rstudio_conf_gt
About the Author Rich Iannone My background is in programming, data analysis, and data visualization. Much of my current work involves a combination of data acquisition, statistical programming, tools development, and visualizing the results. I love creating software that helps people accomplish things. I regularly update several R package projects (all available on GitHub). One such package is called DiagrammeR and it’s great for creating network graphs and performing analyses on the graphs. One of the big draws for open-source development is the collaboration that comes with the process. I encourage anyone interested to ask questions, make recommendations, or even help out if so inclined!

Jim Hester | It depends: A dialog about dependencies | RStudio (2019)
Software dependencies can often be a double-edged sword. On one hand, they let you take advantage of others’ work, giving your software marvelous new features and reducing bugs. On the other hand, they can change, causing your software to break unexpectedly and increasing your maintenance burden. These problems occur everywhere, in R scripts, R packages, Shiny applications and deployed ML pipelines. So when should you take a dependency and when should you avoid them? Well, it depends! This talk will show ways to weigh the pros and cons of a given dependency and provide tools for calculating the weights for your project. It will also provide strategies for dealing with dependency changes, and if needed, removing them. We will demonstrate these techniques with some real-life cases from packages in the tidyverse and r-lib.
VIEW MATERIALS https://speakerdeck.com/jimhester/it-depends
About the Author Jim Hester Jim is a software engineer at RStudio working with Hadley to build better tools for data science. He is the author of a number of R packages including lintr and covr, tools to provide code linting and test coverage for R
Jenny Bryan | Lazy evaluation | RStudio (2019)
The “tidy eval” framework is implemented in the rlang package and is rolling out in packages across the tidyverse and beyond. There is a lively conversation these days, as people come to terms with tidy eval and share their struggles and successes with the community. Why is this such a big deal? For starters, never before have so many people engaged with R’s lazy evaluation model and been encouraged and/or required to manipulate it. I’ll cover some background fundamentals that provide the rationale for tidy eval and that equip you to get the most from other talks.
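The lazy evaluation model mentioned above can be seen in a few lines of base R. This is only a sketch of the underlying mechanism, not the tidy eval API in rlang; the helper names `f()`, `show_expr()`, and `eval_in()` are hypothetical.

```r
# A function argument is a promise: it is evaluated only when first used.
f <- function(x, y) {
  x * 2  # y is never touched, so its expression is never evaluated
}
f(21, stop("this error is never raised"))

# substitute() captures the unevaluated expression behind an argument.
show_expr <- function(arg) {
  deparse(substitute(arg))
}
show_expr(mean(height) + 1)

# A captured expression can be evaluated later, with a data frame as context.
eval_in <- function(data, expr) {
  eval(substitute(expr), data, parent.frame())
}
eval_in(mtcars, mean(mpg) > 20)
```

The last helper is the pattern that lets a user write bare column names such as `mpg`; tidy eval builds a more robust toolkit around the same idea.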
VIEW MATERIALS https://github.com/jennybc/tidy-eval-context#readme
About the Author Jenny Bryan Jenny is a recovering biostatistician who takes special delight in eliminating the small agonies of data analysis. She’s part of Hadley’s team, working on R packages and integrating them into fluid workflows. She’s been working in R/S for over 20 years, serves in the leadership of rOpenSci and Forwards, and is an Ordinary Member of the R Foundation. Jenny is an Associate Professor of Statistics (on leave) at the University of British Columbia, where she created the course STAT 545

Jesse Sadler | Learning and using the tidyverse for historical research | RStudio (2019)
My talk will discuss how R, the tidyverse, and the community around R helped me to learn to code and create my first R package. My positive experiences with the resources for learning R and the community itself led me to create a blog detailing my experiences with R as a way to pass along the knowledge that I gained. The next step was to develop my first package. The debkeepr package integrates non-decimal monetary systems of pounds, shillings, and pence into R, making it possible to accurately analyze and visualize historical account books. It is my hope that debkeepr can help bring to light crucial and interesting social interactions that are buried in economic manuscripts, making these stories accessible to a wider audience.
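To illustrate why non-decimal currency needs special handling, here is a small base R sketch of carrying between units. It is not debkeepr's API: `normalize_lsd()` is a hypothetical helper, and it assumes the common ratios of 20 shillings to the pound and 12 pence to the shilling.

```r
# Hypothetical helper (not part of debkeepr): normalize pounds, shillings, pence.
# Assumes 1 pound = 20 shillings and 1 shilling = 12 pence.
normalize_lsd <- function(l, s, d) {
  total_d <- (l * 20 + s) * 12 + d  # convert everything to pence
  c(
    l = total_d %/% 240,            # 240 pence per pound
    s = (total_d %% 240) %/% 12,    # leftover whole shillings
    d = total_d %% 12               # leftover pence
  )
}

# Add 5l. 15s. 9d. and 3l. 8s. 6d. by summing each unit, then normalizing.
normalize_lsd(5 + 3, 15 + 8, 9 + 6)
# l = 9, s = 4, d = 3
```

Summing the units separately gives 8l. 23s. 15d., which only becomes a readable amount once the pence and shillings are carried over.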
VIEW MATERIALS https://github.com/jessesadler/rstudioconf-2019-slides
About the Author Jesse Sadler I am an early modern historian interested in the social and familial basis of politics, religion, and trade. I received a Ph.D. in European History from UCLA in 2015 and have taught courses on cultural and intellectual history of early modern Europe and the Atlantic. My research investigates the familial basis of the early modern capitalism through archival research on two mercantile families from Antwerp at the end of the sixteenth and beginning of the seventeenth century. I am currently working on a manuscript that argues for the significance of sibling relationships and inheritance in the development of early modern trade. My manuscript places concepts such as patriarchy, emotion, exile, and friendship at the heart of the efficacy of long-distance trade networks and the growth of capitalism
Miles McBain | Our colour of magic The open sourcery of fantastic R packages | RStudio (2019)
What does it mean to say software is, to quote one Twitter user, ‘so f***ing magical!’? In the context of our popular community hobby of rating and sharing R packages, the term ‘magic’ seems reserved for our most powerful expressions of visceral approval. Why is this? And what does it say about how we value software? Can this magical quality be quantified? We will consider these questions in examination of magical specimens, and in the process reveal the surprising depths at which notions of magic are embedded in the R zeitgeist.
VIEW MATERIALS https://github.com/MilesMcBain/rstudioconf_talk
About the Author Miles McBain As an Applied Statistician Miles combines theoretical statistical knowledge and computing expertise to help organizations understand their core business and their customers. Miles is a hacker at heart, which he channels into regular contributions to the open source and open science communities. In addition to commercial projects Miles is always interested in small data/statistics consulting jobs for start-ups and not-for-profits that enable him to expand his applied experience in areas such as A/B testing, experimental design, and statistical power analysis. He does this mainly for the thrill of learning new domains and the opportunity to meet fascinating people.
Max Kuhn | parsnip A tidy model interface | RStudio (2019)
parsnip is a new tidymodels package that generalizes model interfaces across packages. The idea is to have a single function interface for each specific type of model (e.g. logistic regression) that lets the user choose the computational engine for training. For example, logistic regression could be fit with several R packages, Spark, Stan, and TensorFlow. parsnip also standardizes the return objects and sets up some new features for some upcoming packages.
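A minimal sketch of that interface, assuming the parsnip package is installed: one model specification, `logistic_reg()`, is paired with an engine through `set_engine()`. The data set and the formula `am ~ wt` are chosen only for illustration.

```r
library(parsnip)

# The outcome must be a factor for a classification model.
car_data <- transform(mtcars, am = factor(am, labels = c("automatic", "manual")))

# One model specification ...
spec <- logistic_reg()

# ... paired with a chosen computational engine (here base R's glm()).
glm_spec <- set_engine(spec, "glm")
glm_fit <- fit(glm_spec, am ~ wt, data = car_data)

# Predictions come back as a tibble with standardized column names.
probs <- predict(glm_fit, new_data = head(car_data), type = "prob")
probs
```

Switching to another engine would mean changing only the `set_engine()` call, provided the package behind that engine is installed.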
VIEW MATERIALS https://github.com/rstudio/rstudio-conf/tree/master/2019/Parsnip--Max_Kuhn
About the Author Max Kuhn Dr. Max Kuhn is a Software Engineer at RStudio. He is the author or maintainer of several R packages for predictive modeling including caret, Cubist, C50 and others. He routinely teaches classes in predictive modeling at rstudio::conf, Predictive Analytics World, and UseR! and his publications include work on neuroscience biomarkers, drug discovery, molecular diagnostics and response surface methodology. He and Kjell Johnson wrote the award-winning book Applied Predictive Modeling in 2013

Mark Sellors | R in production | RStudio (2019)
With the increase in people using R for data science comes an associated increase in the number of people and organisations wanting to put models or other analytic code into “production”. We often hear it said that R isn’t suitable for production workloads, but is that true? In this talk, Mark will look at some of the misinformation around the idea of what “putting something into production” actually means, as well as provide tips on overcoming the obstacles put in your path.
VIEW MATERIALS https://rinprod.com/
About the Author Mark Sellors Mark is the Head of Data Engineering at Mango Solutions as well as the author of the ‘Field Guide to the R Ecosystem’. He has more than a decade’s experience working with analytical computing environments, DevOps and Unix/Linux. He uses his experience to help Mango’s customers transform their analytic capabilities to ensure they can make the most of their data. Mark and his team are at the forefront of the data engineering field, deploying high performance analytical environments using a wide range of tools, such as R, Python, Spark, and cloud computing. He is experienced in the complete product life-cycle from initial ideas and proofs of concept through to development, test, release and production
Garrett Grolemund | R Markdown The bigger picture | RStudio (2019)
Statistics has made science resemble math, so much so that we’ve begun to conflate p-values with mathematical proofs. We need to return to evaluating a scientific discovery by its reproducibility, which will require a change in how we report scientific results. This change will be a windfall to commercial data scientists because reproducible means repeatable, automatable, parameterizable, and schedulable.
VIEW MATERIALS https://github.com/garrettgman/rmarkdown-the-bigger-picture
About the Author Garrett Grolemund Garrett is a data scientist and master instructor for RStudio. He excels at teaching, statistics, and teaching statistics. He wrote the popular lubridate package and is the author of Hands On Programming with R and the upcoming book, Data Science with R, from O’Reilly Media. He holds a PhD in Statistics and specializes in Data Visualization
Karl Broman | R qtl2 Rewrite of a very old R package | RStudio (2019)
For nearly 20 years, I’ve been developing, maintaining, and supporting an R package, R/qtl, for mapping quantitative trait loci (genetic loci that contribute to variation in quantitative traits, such as blood pressure) in experimental crosses (such as in mice). It’s a rather large package, with 39k lines of R code, 24k lines of C code, and nearly 300 user-accessible functions. In the past several years, I’ve been working on rewriting the package, to better handle high-dimensional data and more complex experimental crosses. This has been a good opportunity to take advantage of many new tools, including Rcpp, Roxygen2, and testthat. I’ll describe my efforts to avoid repeating the mistakes I made the first time around.
VIEW MATERIALS https://bit.ly/rstudio2019
About the Author Karl Broman Karl Broman is Professor in the Department of Biostatistics & Medical Informatics at the University of Wisconsin–Madison. His research is in statistical genetics, and he is the developer of R/qtl (for R). Karl received a BS in mathematics in 1991, from the University of Wisconsin–Milwaukee, and a PhD in statistics in 1997, from the University of California, Berkeley; his PhD advisor was Terry Speed. He was a postdoctoral fellow with James Weber at the Marshfield Clinic Research Foundation, 1997–1999. He was a faculty member in the Department of Biostatistics at Johns Hopkins University, 1999–2007. In 2007, he moved to the University of Wisconsin–Madison, where he is now Professor. Karl is a Senior Editor for Genetics, Academic Editor for PeerJ, and a member of the BMC Biology Editorial Board. Karl is an applied statistician focusing on problems in genetics and genomics – particularly the analysis of meiotic recombination and the genetic dissection of complex traits in experimental organisms. The latter is often called “QTL mapping.” A QTL is a quantitative trait locus – a genetic locus that influences a quantitative trait. Recently he has been focusing on the development of interactive data visualizations for high-dimensional genetic data; see his R/qtlcharts package and his D3 examples.
Barret Schloerke | Reactlog 2.0 Debugging the state of Shiny | RStudio (2019)
The revamped reactlog provides an updated visual display to traverse through the reactive behavior within your shiny application. Using live shiny applications, we will use reactlog’s directed dependency graph to find missing reactive dependencies in “working” applications and address suboptimal reactive coding patterns. Correcting these coding patterns will reduce the amount of calculations done by shiny and keep reactive objects from being created unnecessarily.
VIEW MATERIALS http://github.com/schloerke/presentation-2019-01-18-reactlog
About the Author Barret Schloerke I specialize in Large Data Visualization where I utilize the interactivity of a web browser, the fast iterations of the R programming language, and large data storage capacity of Hadoop

Javier Luraschi | Scaling R with Spark | RStudio (2019)
This talk introduces new features in sparklyr that enable real-time data processing, brand new modeling extensions and significant performance improvements. The sparklyr package provides an interface to Apache Spark to enable data analysis and modeling on large datasets through familiar packages like dplyr and broom.
VIEW MATERIALS https://github.com/rstudio/rstudio-conf/tree/master/2019/Scaling%20R%20with%20Spark%20-%20Javier%20Luraschi
About the Author Javier Luraschi Javier is a Software Engineer with experience in technologies ranging from desktop, web, mobile and backend; to augmented reality and deep learning applications. He previously worked for Microsoft Research and SAP and holds a double degree in Mathematics and Software Engineering
Alex Hayes | Solving the model representation problem with broom | RStudio (2019)
The R objects used to represent model fits are notoriously inconsistent, making data analysis inconvenient and frustrating. The broom package resolves this issue by defining a consistent way to represent model fits. By summarizing essential information about fits in tidy tibbles, broom makes it easy to programmatically work with model objects. Combining broom with list-columns results in an especially powerful way to work with many model fits at once. This talk will feature several case studies demonstrating how broom resolves common problems in data analysis.
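A short sketch of the consistent representation described above, assuming the broom package is installed. The model and data are illustrative, and a plain list of fits stands in for the list-columns mentioned in the abstract to keep the example small.

```r
library(broom)

fit <- lm(mpg ~ wt + hp, data = mtcars)

coef_table <- tidy(fit)  # one row per coefficient: term, estimate, std.error, ...
glance(fit)              # one-row summary of the whole model
augment(fit)             # original data plus columns such as .fitted and .resid

# Many fits at once: one model per cylinder group, each tidied the same way.
fits <- lapply(split(mtcars, mtcars$cyl), function(d) lm(mpg ~ wt, data = d))
per_group <- lapply(fits, tidy)
per_group
```

Because every tidied result has the same columns, the per-group tables can be stacked or compared without model-specific extraction code.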
VIEW MATERIALS https://buff.ly/2FGKFkj
About the Author Alex Hayes Alex is interested in how statistics can help people make better decisions. He’s active in the R and data science communities, particularly interested in improving interfaces to modeling software. In his free time, he tries to get outside to climb and bike.
Edzer Pebesma | Spatial data science in the Tidyverse | RStudio (2019)
Package sf (simple feature) and ggplot2::geom_sf have caused a fast uptake of tidy spatial data analysis by data scientists. Important spatial data science challenges are not handled by them, including raster and vector data cubes (e.g. socio-economic time series, satellite imagery, weather forecast or climate predictions data), and out-of-memory datasets. Powerful methods to analyse such datasets have been developed in packages stars (spatiotemporal tidy arrays) and tidync (tidy analysis of NetCDF files). This talk discusses how the simple feature and tidy data frameworks are extended to handle these challenging data types, and shows how R can be used for out-of-memory spatial and spatiotemporal datasets using tidy concepts.
VIEW MATERIALS https://edzer.github.io/rstudio_conf/2019/index.html
About the Author Edzer Pebesma I lead the spatio-temporal modelling laboratory at the institute for geoinformatics. I hold a PhD in geosciences, and am interested in spatial statistics, environmental modelling, geoinformatics and GI Science, semantic technology for spatial analysis, optimizing environmental monitoring, but also in e-Science and reproducible research. I am an ordinary member of the R Foundation. I am one of the authors of Applied Spatial Data Analysis with R (second edition), am Co-Editor-in-Chief for the Journal of Statistical Software, and associate editor for Spatial Statistics. I believe that research is useful in particular when it helps solve real-world problems.
Irene Steves | Teaching data science with puzzles | RStudio (2019)
Of the many coding puzzles on the web, few focus on the programming skills needed for handling untidy data. During my summer internship at RStudio, I worked with Jenny Bryan to develop a series of data science puzzles known as the “Tidies of March.” These puzzles isolate data wrangling tasks into bite-sized pieces to nurture core data science skills such as importing, reshaping, and summarizing data. We also provide access to puzzles and puzzle data directly in R through an accompanying Tidies of March package. I will show how this package models best practices for both data wrangling and project management.
VIEW MATERIALS https://github.com/isteves/ds-puzzles
About the Author Irene Steves This summer I was an intern at RStudio, where I worked with Jenny Bryan to develop a series of coding challenges to cultivate and reward the mastery of R and the tidyverse. I was previously a Data Science Fellow at the National Center for Ecological Analysis and Synthesis (NCEAS), where I reviewed data submissions to a national repository for completion, clarity, and data management best practices. As a fellow, I also collaborated on a number of open science projects to improve access to Ecological Metadata Language (EML) and datasets in the DataONE network (see metajam, dataspice)

Nic Crane | The future’s Shiny: Pioneering genomic medicine in R | Posit (2019)
Shiny’s expanding capabilities are rapidly transforming how it is used in an enterprise. This talk details the creation of a large-scale application, supporting hundreds of concurrent users, making use of the future and promises packages. The 100,000 genomes project is an ambitious exercise that follows on from the Human Genome Project - aiming to put the UK at the forefront of genomic medicine, with the NHS as the first health service in the world to offer precision medicine to patients with rare diseases and cancer. Data is at the heart of this project; not only the outputs of the genomic sequencing, but vast amounts of metadata used to track progress against the 100,000 genome target and the status and path of each case through the sample tracking pipeline. In order to make this data readily available to stakeholders, Shiny was used to create an application containing multiple interactive dashboards. A scaled-up version of the app is being rolled out in early 2019 to a much larger audience to support the National Genomics Informatics Service, with the challenge of creating a complex app capable of supporting so many users without grinding to a halt. In this talk, I will explain why Shiny was the obvious technology choice for this task, and discuss the design decisions which enabled this project’s success.
VIEW MATERIALS https://github.com/thisisnic/rstudio-conf-2019
About the Author Nic Crane Nic Crane is a Data Scientist at Elucidata, and has formerly worked for Mango Solutions and IBM. She is passionate about learning and teaching all things data science
Carl Howe | The next million R users | RStudio (2019)
Many students believe that R is obscure, complex, and difficult to write. However, data from a new large-scale survey of R users conducted by RStudio shows that new R users are taking dramatically different learning paths from those who learned R as recently as 2 years ago, and these new learning paths are changing its perception. In this talk, we’ll present this new survey data, describe how new tools and techniques for teaching R can satisfy the demands of today’s R learners, and outline a vision for adding millions of new R users to our community.
VIEW MATERIALS https://github.com/rstudio/learning-r-survey/blob/master/slides/Next-Million-R-Users.pdf
David Robinson | The unreasonable effectiveness of public work | RStudio (2019)
In this talk, I’ll lay out the reasons that blogging, open source contribution, and other forms of public work are a critical part of a data science career. For beginners, a blog is a great accompaniment to data science coursework and tutorials, since it gives you experience applying practical data science skills to real problems. For data scientists at any stage of their careers, open source development offers practice in collaboration, documentation, and interface design that complement other kinds of software development. And for data scientists more advanced in their careers, writing a book is a great way to crystallize your expertise and ensure others can build on it. All of these practices build skills in communication and collaboration that form an essential component of data science work. Each also lets you build a public portfolio of your skills, get feedback from your peers, and network with the larger data science community.
VIEW MATERIALS https://bit.ly/drob-rstudio-2019
About the Author David Robinson David is the Chief Data Scientist at DataCamp, an education company for teaching data science through interactive online courses. His interests include statistics, data analysis, education, and programming in R. David is co-author with Julia Silge of the tidytext package and the O’Reilly book Text Mining with R. He is also the author of the broom, gganimate, and fuzzyjoin packages, and of the e-book Introduction to Empirical Bayes. David previously worked as a data scientist at Stack Overflow, and received a PhD in Quantitative and Computational Biology from Princeton University.

Earo Wang | Melt the clock Tidy time series analysis | RStudio (2019)
Time series can be frustrating to work with, particularly when processing raw data into model-ready data. This work presents two new packages that address a gap in existing methodology for time series analysis (raised in rstudio::conf 2018). The tsibble package supports organizing and manipulating modern time series, leveraging tidy data principles along with contextual semantics: index and key. The tsibble data structure seamlessly flows into forecasting routines. The fable package is a tidy renovation of the forecast package. It promotes transparent forecasting practices and concise model representations, to empower analysts tackling a broad domain of forecasting problems. This collection of packages forms the tidyverts, which facilitates a fluent and fluid workflow for analyzing time series.
VIEW MATERIALS https://slides.earo.me/rstudioconf19
About the Author Earo Wang I’m currently doing my Ph.D. on statistical visualisation of temporal-context data at Monash University, supervised by Professor Di Cook and Professor Rob J Hyndman. I enjoy developing open-source tools with R, and am the (co)author of some widely-used R packages including anomalous, hts, sugrrants, rwalkr and tsibble. My research areas involve data visualisation, time series analysis, and computational statistics.
Yihui Xie | pagedown Creating beautiful PDFs with R Markdown and CSS | RStudio (2019)
The traditional way to beautiful PDFs is often through LaTeX or Word, but have you ever thought of printing a web page to PDF? Web technologies (HTML/CSS/JavaScript) are becoming more and more amazing. It is entirely possible to create high-quality PDFs through Google Chrome or Chromium now. Web pages are usually single-page documents, but they can be paginated thanks to the JavaScript library Paged.js, so that you can have elements like headers, footers, and page margins for printing. In this talk, we introduce a new R package, pagedown (https://github.com/rstudio/pagedown), to create PDF documents based on R Markdown and Paged.js. Applications of pagedown include, but are not limited to, books, articles, posters, resumes, letters, and business cards. With the power of CSS and JavaScript, you can typeset your documents with amazing elegance (e.g., a single line of CSS, “tr:nth-child(even) { background: #eee; }”, will give you a striped table, and “border-radius: 50%;” gives you a circular element) and power (e.g., HTML Widgets).
VIEW MATERIALS https://bit.ly/pagedown
Claus Wilke | Visualizing uncertainty with hypothetical outcomes plots | RStudio (2019)
Uncertainty is a key component of statistical inference. However, uncertainty is not easy to convey effectively in data visualizations. For example, viewers have a tendency to interpret visualizations of the most likely outcome as the only possible one. Viewers may also misjudge the likelihood of different possible outcomes or the extent to which moderately rare outcomes may deviate from the expectation. One way in which we can help the viewer grasp the amount of uncertainty present in a dataset is by showing a variety of different possible modeling outcomes at once. For example, in a linear regression, we could plot a number of different regression lines with slopes and intercepts drawn from the range of likely values, as determined by the variation in the data. Such visualizations are called Hypothetical Outcomes Plots (HOPs). HOPs can be made in static form, showing the various hypothetical outcomes all at once, or preferably in an animated form, where the display cycles between the different hypothetical outcomes. With recent progress in ggplot2-based animation, via gganimate, as well as packages such as tidybayes that make it easy to generate hypothetical outcomes, we can easily produce animated HOPs in a few lines of R code. This presentation will cover the key concepts, packages, and techniques to generate such visualizations.
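The linear regression example above can be sketched as a static HOP in base R. This is an illustration under stated assumptions, not the method shown in the talk: bootstrap resampling is used here as one simple way to generate plausible slopes and intercepts, and the data set and number of outcomes are arbitrary.

```r
set.seed(1)  # make the resampling reproducible

n_outcomes <- 20

# Each hypothetical outcome: refit the line on a bootstrap resample of the data.
boot_coefs <- t(replicate(n_outcomes, {
  idx <- sample(nrow(mtcars), replace = TRUE)
  coef(lm(mpg ~ wt, data = mtcars[idx, ]))
}))
colnames(boot_coefs) <- c("intercept", "slope")

# Static HOP: draw every hypothetical regression line over the data at once.
plot(mpg ~ wt, data = mtcars, pch = 19)
for (i in seq_len(n_outcomes)) {
  abline(a = boot_coefs[i, "intercept"], b = boot_coefs[i, "slope"],
         col = "grey40")
}
```

An animated version would show one of these lines per frame instead of overlaying them, which is where packages such as gganimate come in.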
VIEW MATERIALS: https://docs.google.com/presentation/d/1zMuBSADaxdFnosOPWJNA10DaxGEheW6gDxqEPYAuado/edit?usp=sharing
Sigrid Keydana | Why TensorFlow eager execution matters | RStudio (2019)
In current deep learning with Keras and TensorFlow, when you’ve mastered the basics and are ready to dive into more involved applications (such as generative networks, sequence-to-sequence or attention mechanisms), you may find that surprisingly, the learning curve doesn’t get much flatter. This is largely due to restrictions imposed by TensorFlow’s traditional static graph paradigm. With TensorFlow Eager Execution, available since summer and announced to be the default mode in the upcoming major release, model architectures become more flexible, readable, composable, and, last but not least, debuggable. In this session, we’ll see how with Eager, we can code sophisticated architectures like Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) in a straightforward way.
VIEW MATERIALS https://github.com/skeydan/rstudio_conf_2019_eager_execution
Lionel Henry | Working with names and expressions in your tidy eval code | RStudio (2019)
In practice there are two main flavors of tidy eval functions: functions that select columns, such as dplyr::select(), and functions that operate on columns, such as dplyr::mutate(). While sharing a common tidy eval foundation, these functions have distinct properties, good practices, and available tooling. In this talk, you’ll learn your way around selecting and doing tidy eval style.
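The two flavors can be sketched with small wrapper functions, assuming dplyr is installed. The wrappers `pick_cols()` and `add_ratio()` are hypothetical, and the `{{ }}` operator used to forward arguments comes from a later rlang release than this talk, so the talk itself may show different tooling.

```r
library(dplyr)

# Selecting: the argument names columns, so it is forwarded to select().
pick_cols <- function(data, cols) {
  select(data, {{ cols }})
}

# Doing: the argument is an expression computed on columns, forwarded to mutate().
add_ratio <- function(data, num, den) {
  mutate(data, ratio = {{ num }} / {{ den }})
}

selected <- pick_cols(mtcars, c(mpg, starts_with("d")))
with_ratio <- add_ratio(mtcars, hp, wt)

names(selected)
head(with_ratio$ratio)
```

The selecting wrapper accepts helpers such as `starts_with()`, while the doing wrapper accepts any expression that evaluates to a vector of column values.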
Materials: https://speakerdeck.com/lionelhenry/selecting-and-doing-with-tidy-eval

Joe Cheng | Shiny in production: Principles, practices, and tools | RStudio (2019)
Shiny is a web framework for R, a language not traditionally known for web frameworks, to say the least. As such, Shiny has always faced questions about whether it can or should be used “in production”. In this talk we’ll explore what “production” even means, review some of the historical obstacles and objections to using Shiny for production purposes, and discuss practices and tools that can help your Shiny apps flourish.
About the Author: Joe Cheng is the Chief Technology Officer at RStudio. Joe was the original creator of Shiny, and leads the team responsible for Shiny and Shiny Server. GitHub: https://github.com/jcheng5
Materials: https://speakerdeck.com/jcheng5/shiny-in-production

Jeff Allen | RStudio Connect Past, present, and future | RStudio (2019)
RStudio Connect is a publishing platform that helps to operationalize the data science work you’re doing. We’ll review the current state of RStudio Connect, including its ability to host Shiny applications and Plumber APIs, schedule and render R Markdown documents, and manage access. Then we’ll unveil some exciting new features that we’ve been working on, and give you a sneak peek at what’s coming up next.
Materials: http://rstd.io/rsc170
RStudio Connect Deployments with GitHub Webhooks and Jenkins
This video complements the article at https://medium.com/@kelly.obriant/rstudio-connect-deployments-with-github-webhooks-and-jenkins-c0dd8a82b986 TL;DR: New content management Connect server APIs are easy to integrate with programmatic deployment workflows.
Content Management API Resources:
- API Reference: https://docs.rstudio.com/connect/api/
- User Guide: https://docs.rstudio.com/connect/user/cookbook.html#recipes (Deploying Content)
- Example deployment scripts: https://github.com/rstudio/connect-api-deploy-shiny
Intermediate Shiny 2-Day-Workshop - rstudio::conf(2019L)
What is the 2-day Intermediate Shiny Workshop? That’s a great question, I’m glad you asked….
Register at https://rstd.io/conf Learn more at https://rstd.io/conf-agenda
Intermediate Shiny Workshop - 2 Days
This two-day workshop is designed by Shiny author Joe Cheng for the experienced Shiny developer. By taking this workshop, you’ll improve your understanding of shiny’s foundations and learn how to make the most of reactive programming, techniques for extending and improving UI, techniques for debugging and tools for modularizing applications. By the end of the two days, you’ll be able to push the envelope of what you and your organizations can do with Shiny.
You should take this workshop if you are already familiar with the basics of shiny and you have built your own simple applications.
This course is led by friend of RStudio and Education Practice Lead at RStudio Certified Partner Mango Solutions, Aimee Gott. Winston Chang, RStudio Data Scientist, Developer, and author of the R Graphics Cookbook will join Aimee to provide hands-on advice and answers to the Shiny application development questions that stump you.
Speakers: Aimee Gott (Mango Solutions), with Winston Chang (RStudio)


Reproducible Examples with the reprex package
Reproducible Examples and the reprex package.
https://speakerdeck.com/jennybc/reprex-reproducible-examples-with-r
Jump to:
0:08 Intro
0:40 Basic usage of reprex
3:35 Motivation, why use reprex? “Help me help you”
4:08 Define reprex?
Three common ways to use the term:
- noun, a reproducible example
- the reprex package, a tool to build R reprexes
- reprex::reprex(), a function in reprex to make a reprex
5:26 When should you use a reprex?
6:14 reprex installation and setup. How do you actually get reprex on your machine?
7:59 Advanced setup and discussion.
9:45 Please use advanced features responsibly.
11:02 Why does the reprex package exist? Anyone who has helped teach R or dealt with GitHub issues, Twitter, Stack Overflow, and RStudio Community questions knows that helping people diagnose their coding problems can be hard. This tool comes from hard-won experience. Its aim is to help people ask well-formed questions and increase the chances of getting well-formed answers quickly.
12:52 philosophy behind reprex
- code that I can run
- code that I don’t need to run
- code that I can easily run
13:52 code that I can run.
17:25 Tips on writing good reprexes. Dos and don’ts.
18:52 How do I get my data into my reprex?
Getting small data and CSV type data into your reprex is easy.
You may think, “I have a big hairy data object and I can only show my problem by using it”, but that’s not always the case.
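Two base R ways to put small data directly into a reprex are sketched below. They are offered as common approaches under the assumption that a small slice reproduces the problem; the video may demonstrate others.

```r
# A small slice of a larger object is often enough to show the problem.
small <- head(mtcars[, c("mpg", "cyl")], 3)

# dput() prints R code that rebuilds the object; paste that code into the reprex.
dput(small)

# CSV-like data can be read inline from a string, with no external file needed.
df <- read.csv(text = "name,score
a,1
b,2")
df
```

Either way, the reader can run the example without access to your files.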
21:02 code that I don’t need to run. reprex gives your reader the code and reveals the output being produced by that code. For experienced coders, that might be enough to help you.
22:44 code that I can easily run. Don’t copy and paste from the R console. This is usually annoying for your reader. Worse than console copy-pasta is the screenshot. (Many people think screenshots of code are downright offensive.)
25:03 reprex_clean. If you copy someone else’s reprex into your console, it may include their output, making your new reprex untidy. Here are tips for taking someone else’s reprex code and output and creating a clean reprex reply.
25:54 shock and awe. More interesting features of the reprex package.
- 26:29 What about figures and plots in your reprex? So happy you asked about that. reprex will automatically upload your images to imgur.com.
- 28:23 Create a reprex by explicitly providing your code in the reprex call.
- 29:00 when you need your reprex to work in the current working directory.
- 30:45 Differently flavored markdown. Optimize your reprex markdown output for github, stack overflow, or the RStudio community.
- 30:31 Make your reprex create an R script, with your reprex outputs as comments. This is handy for pasting into an email or slack-type-app.
- 32:25 Rich text format, rtf output. (currently experimental feature as of this video)
- 33:06 Suppress the reprex ad at the bottom of your reprex.
- 33:19 Include session info.
- 33:54 Auto styling of your code. Good if you’re dealing with poorly formatted code.
- 34:25 Change your comments string.
- 34:32 Silence Tidyverse startup messages.
- 35:00 Capture a reprex that sends messages to standard output and standard error (e.g. package installation compilation messages).
36:13 Set up personal defaults for your reprex usage.
36:54 reprex RStudio addins; render reprex and reprex selection. These accelerate your use of reprex.
39:01 The human side of reproducible examples. How to ask questions in ways that are most likely to get answered. Sorry for the tough love, but this is important. Why are you always asked to give a reprex?
- Experts try to use reproducible examples to ensure their advice works.
- Making a good reprex is hard. But, you are asking them to solve a problem for you, so meet them halfway.
- Creating reprexes is good coding practice.
- Making a good reprex is often a good way to debug your issue in the embarrassment-free privacy of your own home.
- reprexes lead to discussions more likely to help people in the future.
44:34 Behind the scenes of reprex
44:44 Thanks to those who helped make reprex possible.
Questions and Answers
- 46:05 can reprex capture variables and objects in the current environment? (not yet, maybe in development)
- 47:25 does reprex actually check that the code is self contained? (self contained)
- 48:08 does readr::read_csv support the text argument? (yep, just read the help manual for readr)
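The 18:52 point about getting small data into a reprex can be sketched as follows. This example is not taken from the talk: the data values and the object names scores and group_means are made up for illustration. It defines a small data frame inline with base R's read.csv(text = ...), so a reader can run the code without any external file.

```r
# Hypothetical example data, defined inline so the snippet is self-contained.
scores <- read.csv(text = "name,group,score
ann,a,10
bob,a,14
cy,b,9")

# The behaviour being asked about: mean score per group.
group_means <- tapply(scores$score, scores$group, mean)
print(group_means)

# To turn this into a reprex, copy the code above to the clipboard and call
# reprex::reprex(), or pass code directly as an expression, for example:
# reprex::reprex({ x <- 1:3; mean(x) })
```

Because the data travels with the code, whoever answers the question does not need access to your files or your workspace.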
Shiny Train-the-Trainer Workshop - rstudio::conf(2019L)
What is the 2-day Shiny Train-the-Trainer Workshop? That’s a great question, I’m glad you asked.
Register at https://rstd.io/conf Learn more at https://rstd.io/conf-agenda
Shiny Train-the-Trainer Certification Workshop - 2 Day
- Day 1 of the course will be co-taught by Mine Cetinkaya-Rundel and Garrett Grolemund, RStudio Data Scientists and Professional Educators.
- On Day 2, Mine will teach the Shiny track and Garrett will teach the Tidyverse track.
This two-day workshop will equip you to teach R effectively. We will draw on RStudio’s experience teaching R to recommend tips for designing, teaching, and supporting short R courses.
On Day 1 of the course, you will learn practical activities that you can use immediately to improve your presentation style, learning outcomes, and student engagement. You will leave the class with a cognitive model of learning that you can use to develop your own effective workshops or courses within your organization. The course will also cover how to use RStudio Cloud and its curriculum of tutorials to jump-start your own lessons.
On Day 2 of the course, participants will have the option to choose one of two tracks: Teaching the Tidyverse or Teaching Shiny.
- Teaching Shiny: Classroom examples will focus on teaching Shiny at the beginner and intermediate levels. The course materials will build on RStudio’s Mastering Shiny workshop as well as the upcoming book from the author of the Shiny package, Joe Cheng, and they will cover the entire lifecycle of a Shiny app: build, improve, share. Participants will receive the course materials for teaching Mastering Shiny. You should take this workshop if you work as a training partner and want to qualify as an RStudio Certified Shiny Instructor or if you are an advocate for R in your organization. You should be proficient in Shiny already and be prepared to submit examples of your work. Prior teaching experience is helpful, but not required. Please bring a laptop and a device that has video recording capabilities (such as a laptop or cell phone).
Instructors: Garrett Grolemund, Mine Çetinkaya-Rundel


Tidyverse Train-the-Trainer Certification Workshop - rstudio::conf(2019L)
What is the 2-day Tidyverse Train-the-Trainer Workshop? That’s a great question, I’m glad you asked.
Register at https://rstd.io/conf Learn more at https://rstd.io/conf-agenda
Tidyverse Train-the-Trainer Certification Workshop - 2 Days
- Day 1 of the course will be co-taught by Mine Cetinkaya-Rundel and Garrett Grolemund, RStudio Data Scientists and Professional Educators.
- On Day 2, Mine will teach the Shiny track and Garrett will teach the Tidyverse track.
This two-day workshop will equip you to teach R effectively. We will draw on RStudio’s experience teaching R to recommend tips for designing, teaching, and supporting short R courses.
On Day 1 of the course, you will learn practical activities that you can use immediately to improve your presentation style, learning outcomes, and student engagement. You will leave the class with a cognitive model of learning that you can use to develop your own effective workshops or courses within your organization. The course will also cover how to use RStudio Cloud and its curriculum of tutorials to jump-start your own lessons.
On Day 2 of the course, participants will have the option to choose one of two tracks: Teaching the Tidyverse or Teaching Shiny.
- Teaching the Tidyverse: Classroom examples will focus on how to teach students to do data analysis with the Tidyverse. We will use Master the Tidyverse, which is an award-winning two-day workshop developed by RStudio, as an example. Participants will receive the course materials for teaching Master the Tidyverse. You should take this workshop if you work for a training partner and want to qualify as an RStudio Certified Tidyverse Instructor or if you are an advocate for R in your organization. You should be proficient in the Tidyverse already and be prepared to submit examples of your work. Prior teaching experience is helpful, but not required. Please bring a laptop and a device that has video recording capabilities (such as a laptop or cell phone).
Instructors: Garrett Grolemund, Mine Çetinkaya-Rundel

Shiny and R to Build Dynamic Dashboards
In a static report, you answer known questions. With a dynamic report, you give the reader the tools to answer their own questions. Unleash the full flexibility of analytic app development with Shiny.
In this talk, Winston Chang will discuss how to use R and Shiny to quickly create data dashboards. You’ll also get a glimpse of some new features in Shiny for presenting and interacting with data. He will also demonstrate how you can easily deploy apps to the web via RStudio’s hosting service shinyapps.io.
Related blog post: https://blog.rstudio.com/2017/05/18/shinydashboard-0-6-0/

Data Manipulation Tools: dplyr – Pt 3 Intro to the Grammar of Data Manipulation with R
Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier. Keep your code clean and clear and reduce the cognitive load required for common but often complex data science tasks.
dplyr docs: dplyr.tidyverse.org/reference/
- http://dplyr.tidyverse.org/reference/union.html
- http://dplyr.tidyverse.org/reference/intersect.html
- http://dplyr.tidyverse.org/reference/set_diff.htm
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup https://youtu.be/jOd65mR1zfw
- /01:44 Intro and what’s covered Ground Rules
- /02:40 What’s a tibble
- /04:50 Use View
- /05:25 The Pipe operator:
- /07:20 What do I mean by data wrangling?
Pt. 2: Tidy Data and tidyr https://youtu.be/1ELALQlO-yM
- /00:48 Goal 1 Making your data suitable for R
- /01:40 tidyr “Tidy” Data introduced and motivated
- /08:10 tidyr::gather
- /12:30 tidyr::spread
- /15:23 tidyr::unite
- /15:23 tidyr::separate
Pt. 3: Data manipulation tools: dplyr https://youtu.be/Zc_ufg4uW4U
- 00:40 setup
- 02:00 dplyr::select
- 03:40 dplyr::filter
- 05:05 dplyr::mutate
- 07:05 dplyr::summarise
- 08:30 dplyr::arrange
- 09:55 Combining these tools with the pipe (Setup for the Grammar of Data Manipulation)
- 11:45 dplyr::group_by
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins https://youtu.be/AuBgYDCg1Cg Combining two datasets together
- /00:42 dplyr::bind_cols
- /01:27 dplyr::bind_rows
- /01:42 Set operations: dplyr::union, dplyr::intersect, dplyr::setdiff
- /02:15 joining data: dplyr::left_join, dplyr::inner_join, dplyr::right_join, dplyr::full_join
Cheatsheets: https://www.rstudio.com/resources/cheatsheets/
Documentation:
tidyr docs: tidyr.tidyverse.org/reference/
tidyr vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
dplyr docs: http://dplyr.tidyverse.org/reference/
dplyr one-table vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
dplyr two-table (join operations) vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html
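As a rough illustration of the dplyr verbs listed for Pt. 3, the sketch below chains them with the pipe on a small made-up data frame. The data, the column names, and the object names orders and summary_tbl are hypothetical and are not taken from the video; the code assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical data: one row per order.
orders <- data.frame(
  customer = c("a", "a", "b", "b", "c"),
  price    = c(10, 20, 5, 15, 30),
  qty      = c(1, 2, 4, 1, 1),
  note     = c("x", "y", "z", "x", "y"),
  stringsAsFactors = FALSE
)

summary_tbl <- orders %>%
  select(customer, price, qty) %>%   # keep only the columns we need
  filter(price >= 10) %>%            # keep rows that meet a condition
  mutate(total = price * qty) %>%    # add a derived column
  group_by(customer) %>%             # one group per customer
  summarise(spend = sum(total)) %>%  # collapse each group to one row
  arrange(desc(spend))               # largest spender first

print(summary_tbl)
```

Each verb takes a data frame as its first argument and returns a data frame, which is what makes the steps composable with the pipe.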
Tidy Data and tidyr – Pt 2 Intro to Data Wrangling with R and the Tidyverse
Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier. Keep your code clean and clear and reduce the cognitive load required for common but often complex data science tasks.
http://tidyr.tidyverse.org/reference/
- http://tidyr.tidyverse.org/reference/gather
- http://tidyr.tidyverse.org/reference/spread
- http://tidyr.tidyverse.org/reference/unite
- http://tidyr.tidyverse.org/reference/separate
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup https://youtu.be/jOd65mR1zfw
- /01:44 Intro and what’s covered Ground Rules
- /02:40 What’s a tibble
- /04:50 Use View
- /05:25 The Pipe operator:
- /07:20 What do I mean by data wrangling?
Pt. 2: Tidy Data and tidyr https://youtu.be/1ELALQlO-yM
- 00:48 Goal 1 Making your data suitable for R
- 01:40 tidyr “Tidy” Data introduced and motivated
- 08:10 tidyr::gather
- 12:30 tidyr::spread
- 15:23 tidyr::unite
- 15:23 tidyr::separate
Pt. 3: Data manipulation tools: dplyr https://youtu.be/Zc_ufg4uW4U
- 00:40 setup
- /02:00 dplyr::select
- /03:40 dplyr::filter
- /05:05 dplyr::mutate
- /07:05 dplyr::summarise
- /08:30 dplyr::arrange
- /09:55 Combining these tools with the pipe (Setup for the Grammar of Data Manipulation)
- /11:45 dplyr::group_by
- /15:00 dplyr::group_by
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins https://youtu.be/AuBgYDCg1Cg Combining two datasets together
- /00:42 dplyr::bind_cols
- /01:27 dplyr::bind_rows
- /01:42 Set operations: dplyr::union, dplyr::intersect, dplyr::setdiff
- /02:15 joining data: dplyr::left_join, dplyr::inner_join, dplyr::right_join, dplyr::full_join
Cheatsheets: https://www.rstudio.com/resources/cheatsheets/
Documentation:
tidyr docs: tidyr.tidyverse.org/reference/
tidyr vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
dplyr docs: http://dplyr.tidyverse.org/reference/
dplyr one-table vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
dplyr two-table (join operations) vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html
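The four tidyr functions named in the Pt. 2 outline can be sketched on a tiny made-up table. The data and the object names (wide, long, united, and so on) are hypothetical and not from the video; the code assumes the tidyr package is installed. Note that newer tidyr releases mark gather() and spread() as superseded by pivot_longer() and pivot_wider(), although both still work.

```r
library(tidyr)

# Hypothetical wide data: one column per year.
wide <- data.frame(
  country = c("A", "B"),
  `2016`  = c(1, 3),
  `2017`  = c(2, 4),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# gather(): wide -> long, one row per country-year pair.
long <- gather(wide, key = "year", value = "cases", `2016`, `2017`)

# spread(): long -> wide again.
wide_again <- spread(long, key = year, value = cases)

# unite(): paste two columns into one; separate(): split it back apart.
united    <- unite(long, col = "id", country, year, sep = "_")
separated <- separate(united, col = id, into = c("country", "year"), sep = "_")

print(long)
print(separated)
```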
What is data wrangling? Intro, Motivation, Outline, Setup – Pt. 1 Data Wrangling Introduction
Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier. These videos introduce you to these tools. Keep your R code clean and clear and reduce the cognitive load required for common but often complex data science tasks.
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup https://youtu.be/jOd65mR1zfw
- 01:44 Intro and what’s covered Ground Rules
- 02:40 What’s a tibble
- 04:50 Use View
- 05:25 The Pipe operator:
- 07:20 What do I mean by data wrangling?
Pt. 2: Tidy Data and tidyr https://youtu.be/1ELALQlO-yM
- /00:48 Goal 1 Making your data suitable for R
- /01:40 tidyr “Tidy” Data introduced and motivated
- /08:15 tidyr::gather
- /12:38 tidyr::spread
- /15:30 tidyr::unite
- /15:30 tidyr::separate
Pt. 3: Data manipulation tools: dplyr https://youtu.be/Zc_ufg4uW4U
- 00:40 setup
- /02:00 dplyr::select
- /03:40 dplyr::filter
- /05:05 dplyr::mutate
- /07:05 dplyr::summarise
- /08:30 dplyr::arrange
- /09:55 Combining these tools with the pipe (Setup for the Grammar of Data Manipulation)
- /11:45 dplyr::group_by
- /15:00 dplyr::group_by
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins https://youtu.be/AuBgYDCg1Cg Combining two datasets together
- /00:42 dplyr::bind_cols
- /01:27 dplyr::bind_rows
- /01:42 Set operations: dplyr::union, dplyr::intersect, dplyr::setdiff
- /02:15 joining data: dplyr::left_join, dplyr::inner_join, dplyr::right_join, dplyr::full_join
Cheatsheets: https://www.rstudio.com/resources/cheatsheets/
Documentation:
tidyr docs: tidyr.tidyverse.org/reference/
tidyr vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
dplyr docs: http://dplyr.tidyverse.org/reference/
dplyr one-table vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
dplyr two-table (join operations) vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html
New York Times, “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights”, by Steve Lohr, Aug. 17, 2014 https://www.nytimes.com/2014/08/18/technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html
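Two of the Pt. 1 topics, the tibble and the pipe operator, can be illustrated with a minimal sketch. The data and the names df, piped, and nested are made up for illustration, and the pipe shown is %>% as re-exported by dplyr, which is assumed to be installed.

```r
library(dplyr)

# A tibble is a data frame with stricter, tidier printing and subsetting.
df <- tibble(x = 1:3, y = c("a", "b", "c"))

# The pipe passes the left-hand side as the first argument of the next call,
# so these two expressions compute the same thing.
piped  <- df %>% filter(x > 1) %>% nrow()
nested <- nrow(filter(df, x > 1))

print(piped)
print(nested)
```

The piped form reads left to right in the order the steps happen, which is the main reason it is used throughout the series.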
Working with Two Datasets: Binds, Set Operations, and Joins – Pt 4 Intro to Data Manipulation
Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier. Keep your R code clean and clear and reduce the cognitive load required for common but often complex data science tasks.
dplyr docs: dplyr.tidyverse.org/reference/
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup https://youtu.be/jOd65mR1zfw
- /01:44 Intro and what’s covered Ground Rules:
- /02:40 What’s a tibble
- /04:50 Use View
- /05:25 The Pipe operator:
- /07:20 What do I mean by data wrangling?
Pt. 2: Tidy Data and tidyr https://youtu.be/1ELALQlO-yM
- /00:48 Goal 1 Making your data suitable for R
- /01:40 tidyr “Tidy” Data introduced and motivated
- /08:10 tidyr::gather
- /12:30 tidyr::spread
- /15:23 tidyr::unite
- /15:23 tidyr::separate
Pt. 3: Data manipulation tools: dplyr https://youtu.be/Zc_ufg4uW4U
- /00:40 setup
- /02:00 dplyr::select
- /03:40 dplyr::filter
- /05:05 dplyr::mutate
- /07:05 dplyr::summarise
- /08:30 dplyr::arrange
- /09:55 Combining these tools with the pipe (Setup for the Grammar of Data Manipulation)
- /11:45 dplyr::group_by
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins https://youtu.be/AuBgYDCg1Cg Combining two datasets together
- 00:42 dplyr::bind_cols
- 01:27 dplyr::bind_rows
- 01:42 Set operations: dplyr::union, dplyr::intersect, dplyr::setdiff
- 02:15 joining data: dplyr::left_join, dplyr::inner_join, dplyr::right_join, dplyr::full_join
Cheatsheets: https://www.rstudio.com/resources/cheatsheets/
Documentation:
tidyr docs: tidyr.tidyverse.org/reference/
tidyr vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
dplyr docs: http://dplyr.tidyverse.org/reference/
dplyr one-table vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
dplyr two-table (join operations) vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html
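The binds, set operations, and joins covered in Pt. 4 can be sketched on two tiny made-up tables. All data and object names here are hypothetical and not taken from the video; the code assumes the dplyr package is installed. The set-difference function exported by dplyr is setdiff().

```r
library(dplyr)

# Hypothetical tables sharing a key column `id`.
left  <- data.frame(id = c(1, 2, 3), x = c("a", "b", "c"), stringsAsFactors = FALSE)
right <- data.frame(id = c(2, 3, 4), y = c(TRUE, FALSE, TRUE))

# Binds: stack rows, or glue columns side by side (no key matching).
stacked      <- bind_rows(left, left)
side_by_side <- bind_cols(left, right["y"])

# Set operations compare whole rows of tables with the same columns.
ids_a  <- data.frame(id = c(1, 2, 3))
ids_b  <- data.frame(id = c(2, 3, 4))
both   <- intersect(ids_a, ids_b)  # rows present in both
either <- union(ids_a, ids_b)      # rows present in either, duplicates removed
only_a <- setdiff(ids_a, ids_b)    # rows in ids_a that are not in ids_b

# Joins match rows by key.
lj <- left_join(left, right, by = "id")   # every row of left, NA where no match
ij <- inner_join(left, right, by = "id")  # only ids found in both tables
fj <- full_join(left, right, by = "id")   # every id from either table
```

The difference between bind_cols() and a join is worth noting: bind_cols() pairs rows purely by position, while joins pair them by the value of the key column.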
Machine Learning with R and TensorFlow
J.J. Allaire’s keynote at rstudio::conf 2018 on the R interface to TensorFlow (https://tensorflow.rstudio.com), a suite of packages that provide high-level interfaces to deep learning models (Keras) and standard regression and classification models (Estimators), as well as tools for cloud training, experiment management, and production deployment. The talk also discusses deep learning more broadly (what it is, how it works, and where it might be relevant to users of R in the years ahead).
Slides: https://beta.rstudioconnect.com/ml-with-tensorflow-and-r/
JJ Allaire: https://github.com/jjallaire
Twitter: @fly_upside_down https://twitter.com/fly_upside_down
Related blog post: https://blog.rstudio.com/2018/02/06/tensorflow-for-r/

Admin SSP | RStudio Webinar - 2016
This is a recording of an RStudio webinar. You can subscribe to receive invitations to future webinars at https://www.rstudio.com/resources/web … . We try to host a couple each month with the goal of furthering the R community’s understanding of R and RStudio’s capabilities.
We are always interested in receiving feedback, so please don’t hesitate to comment or reach out with a personal message.
Advanced Features of Sparklyr | RStudio Webinar - 2017
This is a recording of an RStudio webinar. You can subscribe to receive invitations to future webinars at https://www.rstudio.com/resources/webinars/ . We try to host a couple each month with the goal of furthering the R community’s understanding of R and RStudio’s capabilities.
We are always interested in receiving feedback, so please don’t hesitate to comment or reach out with a personal message.
Beyond Static Reports With R Markdown | RStudio Webinar - 2017
Connect Basics | RStudio Webinar - 2017
Connect Production | RStudio Webinar - 2017
Covr Test Coverage | RStudio Webinar - 2016
Data Wrangling R | RStudio Webinar - 2016
Extending Spark Using Sparklyr | RStudio Webinar - 2017
Extracting Data From the Web: Part 1 | RStudio Webinar - 2016
Extracting Data From the Web: Part 2 | RStudio Webinar - 2016
Importing Data Into R | RStudio Webinar - 2016
Interactive Graphics with Shiny | RStudio Webinar - 2016
Introducing an R interface for Apache Spark | RStudio Webinar - 2017
Introducing Bookmarks | RStudio Webinar - 2016
Introducing Flex Dashboards | RStudio Webinar - 2016
Introducing Notebooks | RStudio Webinar - 2016
Introduction to Blogdown (R Package) | RStudio Webinar - 2017
Read more on our blog: https://blog.rstudio.com/2017/09/11/announcing-blogdown/
Introduction to Bookdown (R Package) | RStudio Webinar - 2016
Read more on our blog: https://blog.rstudio.com/2016/12/02/announcing-bookdown/
Introduction to Recipes (R Package) | RStudio Webinar - 2017
Profvis - Profiling tools for Faster R code | RStudio Webinar - 2016
RcppParallel Overview | RStudio Webinar - 2016
Reproducible Reporting with R & RStudio | RStudio Webinar - 2016
RStudio - Shiny Server Pro Architecture | RStudio Webinar - 2016
Shiny Gadgets with R | RStudio Webinar - 2016
Sparklyr: Using Spark with RMarkdown | RStudio Webinar - 2016
The Tidyverse and RStudio Connect | RStudio Webinar - 2017
Understand Modules with R & RStudio | RStudio Webinar - 2016
Understanding Sparklyr Deployment Modes | RStudio Webinar - 2017
Web API Updates for R | RStudio Webinar - 2017
What’s New in Dplyr (0.7.0) | RStudio Webinar - 2017
What’s New With Readxl | RStudio Webinar - 2017



