ggplot2
An implementation of the Grammar of Graphics in R
ggplot2 is an R package for creating graphics using a declarative system based on The Grammar of Graphics. You provide data and specify how variables map to visual properties, then add layers like points or histograms to build complete visualizations.
The package handles low-level plotting details automatically, letting you focus on the structure of your visualization. It supports layered graphics construction through composable components (geometries, scales, facets, coordinate systems). The package is mature and stable, with a large ecosystem of extensions for specialized plot types and customizations.
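As a rough illustration of the layered approach described above, the sketch below builds one plot from data, aesthetic mappings, two geometry layers, a facet specification, and a scale. It uses the `mpg` dataset bundled with ggplot2; the choice of variables, the linear trend line, and the axis labels are assumptions made for this example only.

```r
library(ggplot2)

# mpg is an example dataset shipped with ggplot2 (fuel economy of 234 cars).
p <- ggplot(mpg, aes(x = displ, y = hwy)) +                  # data + aesthetic mappings
  geom_point(aes(colour = class)) +                          # layer 1: points coloured by car class
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # layer 2: linear trend line
  facet_wrap(~ drv) +                                        # facets: one panel per drive type
  scale_x_continuous(breaks = 2:7) +                         # scale: explicit x-axis breaks
  labs(
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    colour = "Class"
  )

print(p)  # draw the plot on the active graphics device
```

Each `+` adds one component, so a layer, facet, or scale can be removed or swapped without touching the rest of the specification.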
Resources featuring ggplot2
Who are ‘the ggplot2 extenders’ and how to become one (Gina Reynolds) | posit::conf(2025)
Speaker(s): Evangeline ‘Gina’ Reynolds
Abstract:
The ggplot2 extension ecosystem is large and robust. Still, even for the most competent ggplot2 users, jumping into writing extensions may not feel straightforward. But the extension-interested should know, it is a great time to get into extension. New efforts exist to support and connect extenders! This talk will discuss some new getting-started resources for extenders and will highlight the ggplot2 extenders meetup and discussions.
YouTube Playlist - https://www.youtube.com/playlist?list=PLpUeWjs9wDGc4C_Db_u4T7g_rDmw9ZkXE
Join Form - https://docs.google.com/forms/d/e/1FAIpQLSe3M1KwUPrmTfEGuuQp0fZ0J7dZkk_82gb310JCvdouMTa_7Q/viewform
Website - https://ggplot2-extenders.github.io/ggplot-extension-club/
posit::conf(2025)
Subscribe to posit::conf updates: https://posit.co/about/subscription-management/
Building Multilingual Data Science Teams (Michael Thomas, Ketchbrook Analytics) | posit::conf(2025)
Speaker(s): Michael Thomas
Abstract:
For much of my career, I have seen data science teams face the critical decision of whether to be an “R shop” or a “Python shop”. Doing both seemed impossible. I argue that this has changed drastically: we have built an effective multilingual data science team at Ketchbrook, thanks to polars/dplyr, gt/great-tables, ggplot2/plotnine, arrow, duckdb, Quarto, etc. I will walk through our journey to developing a multilingual data science team, lessons learned, and best practices.
posit::conf(2025)
Same Data, Different Tools: Visualizing with R and Python (Olivia Hebner, Summit)
Speaker(s): Olivia Hebner
Abstract:
In 2024, our team participated in a data challenge to recreate a visualization from W.E.B. Du Bois’s 1900 Paris Exposition using modern tools. We split into two groups—one using R and the other Python—to compare their strengths and limitations. Both teams used census and geographic data to map county-level populations for 1870 and 1880. Team R used ggplot2 and grid for precise layout control, while Team Python used matplotlib’s subplot system for structuring. This challenge pushed us beyond more traditional data science visualizations, requiring creative approaches to mimic Du Bois’s design. Attendees will gain insights into data wrangling, visualization techniques, and layout design to guide their own projects.
Materials - https://github.com/summitllc/Du-Bois-Challenge-2024
posit::conf(2025)
Exploratory Data Analysis with R in Positron
Learn exploratory data analysis (EDA) in R with this tutorial by Mine Çetinkaya-Rundel. Using Positron, Mine guides you through a real-world project, ’exploring deadlines,’ to analyze the impact of homework deadlines on student performance and stress levels. Discover how to effectively clean, filter, and visualize data using ggplot2 for insightful comparisons. This tutorial emphasizes best practices for data organization and clear data presentation while highlighting Positron’s features that streamline your data analysis workflow. Perfect for anyone looking to master data visualization in R and enhance their data science skills in this new IDE.
0:00 Introduction
0:25 Opening a new Positron project
1:48 Loading and exploring data
3:44 Creating a new R file
4:05 Running exploratory data analysis
16:37 Formatting code with Air
19:22 Copying a plot
Positron documentation: https://positron.posit.co/
Download Positron: https://positron.posit.co/download.html
Read the blog post: https://posit.co/blog/eda-in-positron
Air documentation: https://posit-dev.github.io/air/

Open Source in Pharma | Harvey Lieberman | Data Science Hangout
To join future data science hangouts, add it to your calendar here: https://pos.it/dsh - All are welcome! We’d love to see you!
We were recently joined by Harvey Lieberman, Associate Director of Data Science at Novartis, to chat about R/Pharma, automating processes, career advice, and data science in drug discovery vs. development.
In this Hangout, Harvey talks about a lot of things, like the power of automating processes. He shares examples of how automating mundane tasks can save significant time and identify errors that humans might miss (we all know human error is a thing!). For instance, he automated the analysis of data from 48 Excel sheets that had previously taken a colleague about three months to process by hand; Harvey completed the automated analysis in one hour over lunch and found copying and pasting errors in the original manual process! Automating processes not only increases efficiency but can also help move people into more data-focused roles. Harvey suggests demonstrating that automation speeds things up and, most importantly, removes errors, which is when people start to pay attention and get interested.
Resources mentioned in the video and Zoom chat:
- R/Pharma website → https://rinpharma.com/
- Cecilia Baldoni’s scrollytelling project (on shrews!) → https://cecibaldoni.github.io/projects.html
- Advent of Code → https://adventofcode.com/
- Pharmaverse.org (pharmaceutical R packages) → https://pharmaverse.org
- GSK’s Journey to R → https://www.youtube.com/watch?v=xDrt6txplek
- Roche’s Journey to R → https://www.youtube.com/watch?v=BlJNILSoZlM
- R/Pharma March 2025 newsletter (LinkedIn) → https://www.linkedin.com/pulse/rpharma-march-2025-newsletter-open-source-in-pharma-wmf5c/
- ggplot2 extenders club → https://ggplot2-extenders.github.io/ggplot-extension-club/
- Coursera: Making Data Science Work for Clinical Reporting Course → https://www.coursera.org/learn/making-data-science-work-for-clinical-reporting
- hiring.cafe (for finding R jobs) → https://hiring.cafe/
- Posit’s PydyTuesday GitHub → https://github.com/posit-dev/python-tidytuesday
- Joy’s Law (management concept) Wikipedia → https://en.wikipedia.org/wiki/Joy%27s_law_(management)
If you didn’t join live, one great discussion you missed from the Zoom chat was about the diverse backgrounds of attendees. Many participants shared that they came to data science “sideways,” holding degrees in fields such as sociology, psychology, mathematics, atmospheric science, education, history, chemistry, and various engineering disciplines, rather than traditional statistics or computational degrees. So many data scientists have non-traditional paths into the field! But we’re all better together.
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here:
Website: https://www.posit.co
Hangout: https://pos.it/dsh
LinkedIn: https://www.linkedin.com/company/posit-software
Bluesky: https://bsky.app/profile/posit.co
Thanks for hanging out with us!
Janssens, Chow & Nieuwdorp - Turning DataFrames into Pretty Pictures with Plotnine | PyData NYC 2024
Learn how Plotnine, a Python package inspired by R’s ggplot2, enables the creation of sophisticated and effective data visualizations with minimal effort. This tutorial will explain how Plotnine’s grammar of graphics approach provides a flexible, intuitive way to visualize data, either as ad-hoc plots or fine-tuned graphs suited for communication.
Quick links:
Presentation: https://bit.ly/plotnine-tutorial
GitHub repository: https://bit.ly/plotnine-repo
Slideshow about what to expect: https://bit.ly/expect-plotnine
PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
Quarto Dashboards 1: Hello, Dashboards! | Mine Çetinkaya-Rundel | Posit
You already analyze and summarize your data in computational notebooks with R and/or Python. What’s next? You can share your insights, or allow others to draw their own conclusions, in eye-catching Quarto Dashboards that are straightforward to author, design, and deploy, regardless of the language of your data processing, visualization, analysis, etc. With Quarto Dashboards, you can create elegant and production-ready dashboards using a variety of components, including static graphics (ggplot2, Matplotlib, Seaborn, etc.), interactive widgets (Plotly, Leaflet, Jupyter Widgets, htmlwidgets, etc.), tabular data, value boxes, text annotations, and more. Additionally, with intelligent resizing of components, your Quarto Dashboards look great on devices of all sizes. And importantly, you can author Quarto Dashboards without leaving the comfort of your “home”: plain text markdown with any text editor (VS Code, RStudio, Neovim, etc.) or any notebook editor (JupyterLab, etc.).
This video takes you through:
0:00 - Overview of building dashboards with Quarto
0:15 - Dashboard basics
7:40 - First dashboard in R
10:30 - First dashboard in Python
11:43 - Live coding demo
Slides can be found at https://mine.quarto.pub/quarto-dashboards/1-hello-dashboards/#/title-slide and the starter documents for the accompanying exercises at https://github.com/mine-cetinkaya-rundel/olympicdash .
Materials for all parts of the videos can be accessed at https://mine.quarto.pub/quarto-dashboards

Quarto Dashboards 3: Theming and Styling | Mine Çetinkaya-Rundel | Posit
Theming and styling Quarto dashboards built with R and/or Python.
Before watching this video, you might want to watch Parts 1 & 2.
This video takes you through:
0:00 - Theming (including Bootswatch themes, light/dark mode, customizing themes with SCSS)
3:55 - Styling
4:55 - Live coding demo
Slides can be found at https://mine.quarto.pub/quarto-dashboards/3-theming-styling and the starter documents for the accompanying exercises at https://github.com/mine-cetinkaya-rundel/olympicdash .
Materials for all parts of the videos can be accessed at https://mine.quarto.pub/quarto-dashboards .
You already analyze and summarize your data in computational notebooks with R and/or Python. What’s next? You can share your insights, or allow others to draw their own conclusions, in eye-catching Quarto Dashboards that are straightforward to author, design, and deploy, regardless of the language of your data processing, visualization, analysis, etc. With Quarto Dashboards, you can create elegant and production-ready dashboards using a variety of components, including static graphics (ggplot2, Matplotlib, Seaborn, etc.), interactive widgets (Plotly, Leaflet, Jupyter Widgets, htmlwidgets, etc.), tabular data, value boxes, text annotations, and more. Additionally, with intelligent resizing of components, your Quarto Dashboards look great on devices of all sizes. And importantly, you can author Quarto Dashboards without leaving the comfort of your “home”: plain text markdown with any text editor (VS Code, RStudio, Neovim, etc.) or any notebook editor (JupyterLab, etc.).
This workshop will walk you through building increasingly complex dashboards using various layout options and deploying them as static web pages (with no special server required) as well as with a Shiny Server on the backend for enhanced interactivity.
This course is for you if you:
- do data analysis in computational notebooks
- share your results with your audience in static or interactive dashboards
- want to improve the design, user interface, and experience of your dashboards

Georgios Karamanis - From idea to code to image: Creative data visualizations in R
In this talk, we will walk through the process of converting an idea into a creative visualization in R and ggplot2, from finding inspiration to writing the code. We’ll look at handy tips to make the creative and coding process smoother, how to create more personal plots, as well as the importance (and fun!) of sharing your work with a great community.
Talk by Georgios Karamanis
Slides: https://github.com/gkaramanis/posit_conf_2024/blob/main/From%20idea%20to%20code%20to%20image%20-%20creative%20data%20visualizations%20in%20R%20-%20Georgios%20Karamanis.pdf
GitHub Repo: https://github.com/gkaramanis/posit_conf_2024
Ask Hadley Anything
A unique opportunity to gain insights directly from a leading expert in open source data science and a driving force behind many popular R packages like ggplot2 and dplyr.
Links from the Q&A:
- gh-action webscraping demo: https://github.com/hadley/cran-deadlines
- tidyverse devday 2024: https://www.tidyverse.org/blog/2024/04/tdd-2024/
For the 3 questions on moving from SAS to R in Pharma: Posit and Atorus have partnered on a Posit Academy training: https://posit.co/blog/upskill-to-r-programming-with-posit-and-atorus-research/
At least 3 pharma companies have shared resources to help people with the transition from statistical programming in SAS to data science in R:
- Pfizer exercises: https://github.com/pfizer-opensource/pharma-hands-on-exercises
- Bayer SAS to R: https://bayer-group.github.io/sas2r/
- Roche Coursera course: https://www.coursera.org/learn/making-data-science-work-for-clinical-reporting
Hadley Wickham - R in Production
R in Production by Hadley Wickham
Visit https://rstats.ai for information on upcoming conferences.
Abstract: In this talk, we delve into the strategic deployment of R in production environments, guided by three core principles to elevate your work from individual exploration to scalable, collaborative data science. The essence of putting R into production lies not just in executing code but in crafting solutions that are robust, repeatable, and collaborative, guided by three key principles:
- Not just once: Successful data science projects are not a one-off, but will be run repeatedly for months or years. I’ll discuss some of the challenges for creating R scripts and applications that run repeatedly, handle new data seamlessly, and adapt to evolving analytical requirements without constant manual intervention. This principle ensures your analyses are enduring assets, not throwaway toys.
- Not just my computer: The transition from development on your laptop (usually Windows or Mac) to a production environment (usually Linux) introduces a number of challenges. Here, I’ll discuss some strategies for making R code portable, how you can minimise pain when something inevitably goes wrong, and a few unresolved auth challenges that we’re currently working on.
- Not just me: R is not just a tool for individual analysts but a platform for collaboration. I’ll cover some of the best practices for writing readable, understandable code, and how you might go about sharing that code with your colleagues. This principle underscores the importance of building R projects that are accessible, editable, and usable by others, fostering a culture of collaboration and knowledge sharing.
By adhering to these principles, we pave the way for R to be a powerful tool not just for individual analyses but as a cornerstone of enterprise-level data science solutions. Join me to explore how to harness the full potential of R in production, creating workflows that are robust, portable, and collaborative.
Bio: Hadley is Chief Scientist at Posit PBC, winner of the 2019 COPSS award, and a member of the R Foundation. He builds tools (both computational and cognitive) to make data science easier, faster, and more fun. His work includes packages for data science (like the tidyverse, which includes ggplot2, dplyr, and tidyr) and principled software development (e.g. roxygen2, testthat, and pkgdown). He is also a writer, educator, and speaker promoting the use of R for data science. Learn more on his website, http://hadley.nz .
Mastodon: https://fosstodon.org/@hadleywickham
Presented at the 2024 New York R Conference (May 17, 2024) Hosted by Lander Analytics (https://landeranalytics.com )

Hadley Wickham on R vs Python
Learn about tidyverse, ggplot2, and the secret to a tech company’s longevity as Hadley Wickham joins @JonKrohnLearns in this episode. He talks about Posit’s rebrand, why tidyverse needs to be in every data scientist’s toolkit, and why getting your hands dirty with open-source projects can be so lucrative for your career.
Watch the full interview “779: The Tidyverse of Essential R Libraries and their Python Analogues — with Dr. Hadley Wickham” here: https://www.superdatascience.com/779

779: The Tidyverse of Essential R Libraries and their Python Analogues — with Dr. Hadley Wickham
#Tidyverse #RProgramming #RLibraries
Tidyverse, ggplot2, and the secret to a tech company’s longevity: Hadley Wickham talks to @JonKrohnLearns about Posit’s rebrand, Tidyverse and why it needs to be in every data scientist’s toolkit, and why getting your hands dirty with open-source projects can be so lucrative for your career.
This episode is brought to you by Intel and HPE Ezmeral Software (https://bit.ly/hpeintel) . Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
- [00:00:00] Introduction
- [00:02:55] All about the Tidyverse
- [00:15:19] Hadley’s favorite R libraries
- [00:28:39] The goal of Posit
- [00:34:12] On bringing multiple programming languages together
- [00:50:19] The principles for a long-lasting tech company
- [00:53:34] How Hadley developed ggplot2
- [01:03:52] How to contribute to the open-source community
Additional materials: https://www.superdatascience.com/779

Adding a Touch of glitr: Developing a Package of Themes on Top of ggplot - posit::conf(2023)
Presented by Aaron Chafetz and Karishma Srikanth
Please note, a power issue cut off the first five minutes of the talk.
Explore how our team at the US Agency for International Development (USAID) created our own data viz branding R package on top of ggplot2 and how you can too.
How do you create brand cohesion across your large team when it comes to data viz? Inspired by the BBC’s bbplot, our team at the US Agency for International Development (USAID) developed a package on top of ggplot2 to create a common look and feel for our team’s products. This effort improved not just the cohesiveness of our work, but also trustworthiness. By creating this package, we reduced the reliance on using defaults and the time spent on each project customizing numerous graphic elements. More importantly, this package provided an easier on-ramp for new teammates to adopt R. We share our journey within a federal agency developing a style guide and aim to guide and inspire other organizations who could benefit from developing their own branding package and guidance.
Materials:
- https://speakerdeck.com/achafetz/adding-a-touch-of-glitr
- https://usaid-oha-si.github.io/glitr/
- https://issuu.com/achafetz/docs/oha_styleguide
Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.
Talk Track: Compelling design for apps and reports. Session Code: TALK-1103
Grammar of Graphics in Python with Plotnine - posit::conf(2023)
Presented by Hassan Kibirige
{plotnine} brings the elegance of {ggplot2} to the Python programming language. Learn about The Grammar of Graphics and get a feel for why it is an effective way to create statistical graphics.
ggplot2 is one of the most loved visualisation libraries. It implements a Grammar of Graphics system, which requires one to think about data in terms of columns of variables and how to transform them into geometric objects. It is elegant and powerful. This is a talk about plotnine, which brings the elegance of ggplot2 to the Python programming language. It is an invitation to learn about the Grammar of Graphics system and to appreciate it. It will include some tips on how to avoid common frustrations as you learn the system.
Materials:
- Website: https://plotnine.org
- Source Code: https://github.com/has2k1/plotnine
- Slides for this talk: https://github.com/has2k1/my-talks
Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.
Talk Track: Data science with Python. Session Code: TALK-1137

Tree maps are easy to make in R with ggplot2
See the code here! https://colorado.posit.co/rsc/tay-swift-tour/r.html#in-the-limelight #positshorts
Hadley Wickham @ Posit | Giving benefit to people using what you build | Data Science Hangout
We were recently joined by Hadley Wickham, Chief Scientist at Posit PBC. Listen in to hear our chat about building tools (like the tidyverse) to make data science easier, faster, and more fun.
36:57 - While I’m bought into developing open source packages to help deliver better processes, any advice to those of us doing that development in getting their company bought in?
You have to give some benefit to the people using (what you’re building)
You’ve got to either remove pain or add pleasure in some way because if you can’t do that and you’re not someone’s direct supervisor, it’s hard to get people to change.
The way I think about the tidyverse is, how do we give people some sort of quick wins so they can be motivated to do the things that are slower where they’re gonna have to learn some new ideas or some new tools. You kind of build up some equity with that person.
They build trust that you’ve helped them in the past and now they’re willing to invest a little bit more time before they see the payoff. But in the early days, it’s all about delivering payoffs as quickly as possible.
And I think if you’re doing, like, you know “my company’s first R package” - the easy pain points are: make themes for your company corporate style guide, make a ggplot2 theme, make an R Markdown, a Quarto theme. Make a Shiny theme that people can just use to get, you know, something that’s reasonably close to whatever your corporate style guide dictates.
That just feels like an easy win for people because it makes them look good inside the corporation and because you’ve put in all the hard work, it’s like three seconds for them to type the right function name to get the right theme.
I think the other bit is making it easier to get access to data. Set up some wrappers around DBI connections to the most important data sources. Provide some conventions around authentication so that stuff just works so that they’re not struggling with “What packages do I need to install? What’s the password? Where’s the path I need?” Just give them some, like, a list of the top ten most common data sources and people will love you by and large.
Follow-up question: Once you identify the things that you think would be useful for people - do you have a philosophy or a way in which you approach putting things together?
When you’re in an environment of scarcity when you’ve only got so much time that you can take out of your everyday job to invest in writing a package, it’s really tough to balance. Like, how do I add new stuff versus making sure the old stuff continues to work?
I think, again, some of it’s about building up trust. So, give people some wins so that when you inevitably break stuff, you’ve got some kind of cushion so people aren’t going to be really angry with you right away. They’re gonna be like, ok, well there’s a little bit of suffering now, but this person saved me so much time.
But yeah, it’s really hard. And particularly as you’re starting out, like, you’re going to make mistakes. That’s inevitable.
You’re going to do things that when you look back a year later, you’re like, why on earth did I do it that way? You’ll want to rip out the whole thing and write it from scratch. And I think that if it feels horrible, you have to remember, that’s great. It means you’ve grown immensely as a programmer.
Certainly if you have my kind of mindset, you have to resist the temptation to rip things out and redo them as much as possible and just focus on making the next generation better rather than breaking the stuff people already have.
So I don’t have any great answers here, but I think you just have to think about those tensions of “how do I keep my forward velocity up while getting better as a programmer and evolving over time, but also thinking about how do you make the things you did a long time ago better?”
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here:
Website: https://www.posit.co
LinkedIn: https://www.linkedin.com/company/posit-software
Twitter: https://twitter.com/posit_pbc
To join future data science hangouts, add to your calendar here: pos.it/dsh (All are welcome! We’d love to see you!)
Come hang out with us!

posit::conf(2023) Workshop: Engaging and Beautiful Data Visualizations with ggplot2
Register now: http://pos.it/conf
Instructor: Cédric Scherer
Workshop Duration: 1-Day Workshop
This course will be appropriate for you if you:
- already know how to create basic graphics with the ggplot2 package
- aim to improve the design of your ggplot outputs
- want to learn how to create more complex charts which feature multiple layers, annotations, text styling, custom themes, and more
Creating effective and easily accessible data visualizations of high quality in an efficient and preferably reproducible way is an essential skill for everyone working in a data-related field. Luckily, by leveraging the functionality of ggplot2, the most famous package for data visualization with R, and related extension packages, one can create highly customized data visualizations without the need for post-processing.
This workshop provides everything one needs to know to create and customize numerous chart types with ggplot2. Participants will learn the most important steps and helpful tips to create visually appealing and informative graphics with a code-only approach. The power of ggplot2 and related extension packages will be illustrated with advanced real-life examples that help to understand useful coding tricks and the process of creating engaging and effective visualizations. The workshop will particularly focus on more advanced tasks with ggplot2 such as styling labels and titles, customizing themes and visual aesthetics, and using less-common chart types.
posit::conf(2023) Workshop: Introduction to Data Science with R and Tidyverse
Register now: http://pos.it/conf
Instructors: Posit Academy Instructors
Workshop Duration: 2-Day Workshop
This course is ideal for:
- those new to R or the Tidyverse
- anyone who has dabbled in R, but now wants a rigorous foundation in up-to-date data science best practices
- SAS and Excel users looking to switch their workflows to R
This is not a standard workshop, but a six-week online apprenticeship that culminates in two in-person days at posit::conf(2023). Begins August 7th, 2023. No knowledge of R required. Visit posit.co/academy to learn more about this uniquely effective learning format.
Here, you will learn the foundations of R and the Tidyverse under the guidance of a Posit Academy mentor and in the company of a close group of fellow learners. You will be expected to complete a weekly curriculum of interactive tutorials, and to attend a weekly presentation meeting with your mentor and fellow students. Topics will include the basics of R, importing data, visualizing data with ggplot2, wrangling data with dplyr and tidyr, working with strings, factors, and date-times, modelling data with base R, and reporting reproducibly with Quarto.
posit::conf(2023) Workshop: Steal like an Rtist: Creative Coding in R
Register now: http://pos.it/conf
Instructors: Ijeamaka Anyene Fumagalli & Sharla Gelfand
Workshop Duration: 1-Day Workshop
This workshop is for you if you:
- are comfortable with R and RStudio and have experience with the tidyverse and ggplot2
- are interested in applying data visualization skills more creatively, but may not know where to start or how to develop style/inspiration
- are an artist interested in exploring code as another medium for creating your work
R is a tool for data analysis but also can be used for self-expression. This workshop will be an introduction to creative coding in R in order to make visual art. We will take an inspiration-first approach, using compelling pieces to discuss and learn the techniques that shape the work. This workshop takes guidance from its namesake, the book “Steal Like An Artist” by Austin Kleon - once we have identified and learned to recreate existing works, we will cover how to take this inspiration and transform, remix, or reinterpret it in the pursuit of developing our own work and artistic styles.
This workshop is hands-on and will cover color theory and manipulation, a reintroduction of the data frame as the foundation for creating art (instead of just for analyzing data!), using ggplot2 as an artistic canvas, creating basic and specialized shapes, tiling and pattern making, developing your own functions and using iteration. We will also discuss how to use controlled randomness to convert a standalone piece into a generative art system that can produce many distinct outputs. Creative coding may seem a world apart from data analysis, but we see a large overlap and intersection of the skills used in both, not to mention the creative muscles that are already used in data visualization.
posit::conf(2023) Workshop: Tidy time series and forecasting in R
Register now: http://pos.it/conf
Instructor: Rob J Hyndman
Workshop Duration: 2-Day Workshop
This course is for you if you:
- already use the tidyverse packages in R such as dplyr, tidyr, tibble and ggplot2
- need to analyze large collections of related time series
- would like to learn how to use some tidy tools for time series analysis including visualization, decomposition and forecasting
It is common for organizations to collect huge amounts of data over time, and existing time series analysis tools are not always suitable to handle the scale, frequency and structure of the data collected. In this workshop, we will look at some packages and methods that have been developed to handle the analysis of large collections of time series.
On day 1, we will look at the tsibble data structure for flexibly managing collections of related time series. We will look at how to do data wrangling, data visualizations and exploratory data analysis. We will explore feature-based methods to explore time series data in high dimensions. A similar feature-based approach can be used to identify anomalous time series within a collection of time series, or to cluster or classify time series. Primary packages for day 1 will be tsibble, lubridate and feasts (along with the tidyverse of course).
Day 2 will be about forecasting. We will look at some classical time series models and how they are automated in the fable package, and we will explore the creation of ensemble forecasts and hybrid forecasts. Best practices for evaluating forecast accuracy will also be covered. Finally, we will look at forecast reconciliation, allowing millions of time series to be forecast in a relatively short time while accounting for constraints on how the series are related.
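As a rough illustration of the workflow described for the two days (a tsibble holding the series, then models fitted and forecast with fable), here is a minimal sketch. It is not taken from the workshop materials: the dataset (`USAccDeaths`, which ships with R), the choice of models and the 12-month horizon are all assumptions made for the example.

```r
library(tsibble)
library(fable)

# USAccDeaths is a monthly ts object shipped with R; as_tsibble()
# converts it to a tsibble with the columns `index` and `value`.
deaths <- as_tsibble(USAccDeaths)

# Fit two models to the same series in one call:
# an automatically selected ETS model and a seasonal naive benchmark.
fits <- deaths |>
  model(
    ets = ETS(value),
    snaive = SNAIVE(value)
  )

# Forecast 12 months ahead with each model (one row per model and month).
fc <- forecast(fits, h = 12)
fc
```

The same `model()` and `forecast()` calls apply unchanged when the tsibble holds many keyed series, which is the situation the workshop targets.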
R-Ladies Rome (English) - What’s new in the tidyverse - Isabella Velasquez
Welcome to R-Ladies Rome Chapter!
What’s new in the tidyverse - Speaker: Isabella Velasquez
In this video, Isabella will tell you about What’s new in the tidyverse, a suite of packages that’s revolutionized data wrangling, visualization, and analysis. Recently, Tidyverse has undergone some changes and updates to make it even more user-friendly and powerful. The changes to Tidyverse include new packages, updates to existing ones, and improvements in performance and functionality. Some of the most notable updates include enhancements to package dependencies, performance improvements for specific functions such as group_by(), and the addition of new packages such as ggplot2, readr and dplyr.
You can find the latest news here: https://bit.ly/3z9BcMR To follow Isabella Velásquez: Twitter: twitter.com/ivelasq3 LinkedIn: linkedin.com/in/ivelasq/
Materials: GitHub repo: https://bit.ly/3LHVSmS Website: https://bit.ly/3M5gE03 The tidyverse blog: https://www.tidyverse.org/blog/
Open Source Chat - {gt} with Rich Iannone
Join Rich Iannone, maintainer of the {gt} package, as he takes questions from the community about the latest in {gt} v0.7.0, and building great looking data display tables with R.
Key Resources: ⬡ Get started with {gt} - https://gt.rstudio.com
Reach out: 38:48 - How do I ask Rich about {gt}, feature requests, bug reports, how to solve a problem via {gt}? Rich and the {gt} team would love to hear from you. ⬡ Feature requests & bug reports with GitHub Issues, https://github.com/rstudio/gt/issues ⬡ GitHub Discussions, https://github.com/rstudio/gt/discussions ⬡ Ask the community a question, https://community.rstudio.com/tag/gt ⬡ Follow {gt} on Twitter, feel free to reach out and ask questions, https://twitter.com/gt_package
Timestamps
Rich Iannone Introduction.
03:52 - Why {gt}? - What does {gt} bring to the table? Why so much effort into static, data display tables?
05:50 - Why open source? Why is {gt} open source and why have you dedicated your career to develop open source software?
08:30 - {gt} v0.7.0, Tell us about those new vector formatting functions in {gt}. Why did you include them? Could you show us some examples?
{gt}’s vector formatting functions help you customize the styling, look and feel of your values. Taking the values R gives you and making them look exactly the way you want can be tricky. A lot of work was put into {gt} to give nice value formatting options. You can now access all of these outside of a gt table, e.g. in text or in a plot.
22:35 - Could you provide an example or two with the new styling function called opt_stylize()? What kinds of tables can you make with that? Can you extend that with your own tweaks?
28:15 - Can you make your own themes and share them? “How do I create my own custom theme for my table? A theme I can share with the rest of my organization?”
31:58 - What is the distinction between tab_options and the opt_* functions? Why would a function be in opt_* and not tab_options?
34:00 - sub_values() function, to find and replace certain values in your table.
36:50 - What is the current support for latex in {gt} at the moment? “Personally, I much prefer HTML, but for scientific publications, we are asked to provide a LaTeX file.”
42:50 - “In my work, I often produce A4 output in PDF, mainly with ggplot2 content. It would be nice to be able to combine ggplot + gt tables in a similar way {patchwork} works. Having the plot and the table next to it is very useful sometimes.”
44:30 - Interactive Tables with {gt}?
47:45 - “Any plans to make applying of same style to several columns easier? Unless I’m mistaken, the locations argument of tab_style requires one to specify an individual column. See here: https://gt.rstudio.com/reference/tab_style.html#examples.”
Yes, supply a vector of columns or use tidyselect functions.
49:15 - “Excel output with {gt}? Would be a huge improvement. I often have to produce tabular output that can be easily reused. Usually it means Excel tables. So far I have mainly done this with Python and openpyxl or PyWin32 (through COM). A simple solution in R would be great.”
50:20 - Support for additional output formats with {gt}? Excel, PowerPoint, etc.?
50:25 - {pointblank}, a package to methodically validate your data, whether in the form of data frames or database tables: https://rich-iannone.github.io/pointblank/
Check out the workshop materials at https://github.com/rich-iannone/pointblank-workshop
55:50 - “Are there ways to have grouped rows? I mean when repeated rows have same characters can we merge them to one?”
58:00 - “Is there an ability to add ‘battleship coordinates’ (e.g. column letters & row numbers) to a gt object? This is a standard for table across my org and I’ve been trying to figure out how to implement it.”
59:59 - “Do you have suggestions or examples of building out & applying corporate formatting to gt tables (e.g. adding a company logo, company colors, etc.)?”
01:04:30 - “With PDF/LaTeX output for wide tables, it does not shrink the table.”
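Two of the answers in the timestamps above lend themselves to a short sketch: the vector formatting functions (08:30) and styling several columns at once by passing a vector of columns or a tidyselect helper (47:45). The data (`mtcars`) and the particular styles are assumptions chosen only for illustration, not examples from the chat.

```r
library(gt)

# Vector formatting functions (gt >= 0.7.0) work on plain vectors,
# outside of any gt table, and return character vectors.
prices <- c(1234.5, 0.5, 20)
price_labels <- vec_fmt_currency(prices, currency = "USD")

shares <- c(0.125, 0.5)
share_labels <- vec_fmt_percent(shares, decimals = 1)

# tab_style() locations accept several columns at once:
# either a vector of column names or a tidyselect helper.
tbl <- gt(head(mtcars[, c("mpg", "cyl", "disp", "hp")])) |>
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_body(columns = c(mpg, hp))
  ) |>
  tab_style(
    style = cell_fill(color = "lightyellow"),
    locations = cells_body(columns = starts_with("d"))
  )
```

The formatted labels can then be used anywhere a character vector is accepted, for example as axis labels in a ggplot2 plot.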

Jacqueline Nolis | I made an entire e-commerce platform on Shiny | RStudio (2022)
E-commerce requires passing data between many components like managing a shopping cart, taking payment, fulfilling orders, and sending emails. I’ve successfully created a full e-commerce platform entirely in R for a quirky side project. The R package ggirl lets users order ggplot2 plots as postcards and more via R functions. Those R functions pass data to a separate Shiny app, which then passes data to other services like Stripe payment APIs and printing APIs. In this talk I will walk through how to use packages like httr, callr, and brochure to have your Shiny apps call external services and do many tasks in parallel. You’ll leave the talk with more ways to use Shiny than dashboards plus the knowledge to monetize your existing dashboards!
Talk materials are available at https://link.jnolis.com/rstudio22-slides
Session: Unexpected uses of R
June Choe | Cracking open ggplot internals with {ggtrace} | RStudio (2022)
The inner workings of {ggplot2} are difficult to grasp even for experienced users because its internal object-oriented (ggproto) system is hidden from user-facing functions, by design. This is exacerbated by the foreignness of ggproto itself, which remains the largest hurdle in the user-to-developer transition. However, this need not be the case: ggplot internals have clear parallels to data wrangling, where data is passed between methods that take inputs and return outputs. Capitalizing on this connection, package {ggtrace} exposes the familiar functional programming logic of ggplot with functions that inspect, capture, or modify steps in a ggplot object’s execution pipeline, enabling users to learn the internals through trial-and-error.
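The idea that a plot is a pipeline of data transformations can be glimpsed without {ggtrace}. The sketch below does not use any {ggtrace} function; it relies only on ggplot2's own exported `layer_data()`, and the plot itself is an assumption made for the example.

```r
library(ggplot2)

# A bar chart: the user supplies raw rows, the stat computes the counts.
p <- ggplot(mpg, aes(class)) + geom_bar()

# layer_data() builds the plot and returns the data frame that the
# first layer's geom actually receives, after the stat has run.
computed <- layer_data(p, 1)

# One row per vehicle class, with a computed `count` column that did not
# exist in the input data.
head(computed[, c("x", "count")])
```

Comparing `mpg` (the input) with `computed` (the output) shows one input-to-output step of the kind the abstract describes.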
Talk materials are available at https://github.com/yjunechoe/ggtrace-rstudioconf2022
Session: Just typing R code: advanced R programming
Data Science Hangout | Mike Smith, Pfizer | Building an R Center of Excellence
We were joined by Mike Smith, Senior Director, Pfizer R&D UK Ltd at the Data Science Hangout - a weekly, free-to-join open conversation for the data science community. If you’d like to join us live, you can add it to your calendar here: rstd.io/datasciencehangout
Mike shared with us all that they are building up a Center of Excellence at Pfizer to help teams across the business build reproducible workflows and use analytics tools effectively & efficiently.
What led to the creation of the CoE within Pfizer and how could we do something similar?
Mike: ⬢ Last year before R/Pharma, we did a poll & found that 1,500+ colleagues had downloaded R. I wanted to service & build up that community to find out what other people are doing and share that. (2:45)
⬢ We’re a very decentralized disparate team, so there are subject matter experts (SMEs) throughout the organization. The Center of Excellence is focused on building connections between SMEs and helping the teams where there isn’t an SME available.
⬢ What we saw was that it’s sometimes hard to get an effective strategy across people in such a big company. We also saw that there were other places within the organization that wanted data science work but they didn’t have an R subject matter expert there. We want to be able to help them solve their problems and set them up with a proof of concept that they can tweak.
33:52 - OK, so how do we do this?
⬢ Find out how many people are using the tools and who you could help.
⬢ Be that translator role between the business people who need solutions with the technical side - folks who are building things.
Communicate the value:
⬢ We may have a bunch of people trying to write the same function or access the same data. We could solve this problem once and then make that into a package and serve that out to everybody and streamline their workflow for the future.
⬢ There’s a benefit in being able to solve problems strategically. We’re trying to build the lego pieces so that the next time we see a problem like this, we can use that. We can also offer this as a package or via something that allows other people to solve that problem for themselves.
Talk to someone who has experience in this, other community builders
⬢ Doug Robinson helped start this at Pfizer because he had set up something like this at Novartis before as well. Talking with someone who has done this before is really helpful because they have the experience of: who do we need to tell, what do we need to tell them, what’s our purpose for being, who do you have to speak to and convince. That has to be ready to go.
Find a champion in leadership:
⬢ We went to the head of Statistical programming and said we’d like to do something like this. Fortunately, she was 110% supportive here.
How did they phrase this CoE at Pfizer?
⬢ Check out this description from the job post: https://lnkd.in/g776nYVF
Resources shared:
Ethan shared: I saw on the RStudio blog the other day the {sassy} system for SAS programmers transitioning to R: https://sassy.r-sassy.org/index.html
Tatsu shared: For folks that have RStudio Connect and Tableau, there’s now a supported integration: https://www.rstudio.com/blog/dynamic-r-and-python-models-in-tableau-using-plumbertableau/
Tatsu shared the Working with IT section of the champion site: https://www.rstudio.com/champion/working-with-it
Mike’s Bandcamp: https://mikeksmith.bandcamp.com/
R Consortium Pharma Working Groups: https://www.r-consortium.org/projects/isc-working-groups
R in Pharma Conference: https://rinpharma.com/
Upcoming Pharma meetup with Merck: https://youtu.be/RBVqKi3FV30
Question about style guides:
Jesus shared: Tidyverse Style Guide: https://style.tidyverse.org/
Jesus shared: One overall guide to writing cleaner R code is the CONTRIBUTING.md of the ggplot2 package: https://github.com/tidyverse/ggplot2/blob/main/CONTRIBUTING.md
Sam shared: Efficient R Programming book that Colin wrote: https://csgillespie.github.io/efficientR/
Where to find more?
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu ► Data Science Hangout site: rstudio.com/data-science-hangout ► Add the Data Science Hangout to your calendar: rstd.io/datasciencehangout
Follow Us Here: Website: https://www.rstudio.com LinkedIn: https://www.linkedin.com/company/rstudio Twitter: https://twitter.com/rstudio
Mara Averick & Maya Gans | Data Visualization Accessibility | RStudio Meetup
2:55 - A11Y in R: Adapting Sarah L. Fossheim’s 10 dos and don’ts to keep in mind when designing accessible data visualizations | Maya Gans
30:11 - Adventures with {highcharter} and the Highcharts accessibility module | Mara Averick
58:07 - Q&A
A11Y in R: Adapting Sarah L. Fossheim’s 10 dos and don’ts to keep in mind when designing accessible data visualizations | Presented by Maya Gans
Abstract: This talk will use R based visualizations and walk through examples of accessibility considerations when making plots and applications.
Speaker Bio: Maya Gans is a Data Visualization Engineer at Atorus Research where she develops custom applications using R and JavaScript. As an RStudio intern she designed TidyBlocks, a visual block based programming language. Maya also co-wrote JavaScript for Data Science. Maya uses ggplot2 and d3.js to create music related infographics for JamBase.com. When Maya’s not coding, she’s climbing mountains.
Adventures with {highcharter} and the Highcharts accessibility module | Presented by Mara Averick
Abstract: Lessons learned about accessibility in data visualization through using the {highcharter} R package and the Highcharts visualization library’s accessibility module.
Speaker bio: Mara is a developer advocate at RStudio. She is the author of neither the highcharter package nor the Highcharts charting library, but enjoys using both to make interactive, accessible data visualizations.
So many amazing resources shared yesterday on data visualization accessibility:
Sarah L Fossheim’s blog - intro to designing accessible data viz: https://lnkd.in/dAAXfE35 Coblis - Color Blindness Simulator: https://lnkd.in/dJT-hJE4 Web Content Accessibility Guidelines (WCAG): https://lnkd.in/dJKmvFTq Color Contrast Accessibility Validator: https://color.a11y.com/ Google lighthouse (automated tool for improving quality of web pages): https://lnkd.in/d8xjSN5i A11y Project Checklist: https://lnkd.in/diFM_TBd Chartability (questions) for ensuring data visualizations, systems, and interfaces are accessible: https://lnkd.in/d_7wk3zx Accessible {highcharter} GitHub repo (Mara’s charts, and source .Rmds): https://lnkd.in/dhvBwQ-f Mara’s blog post series: https://lnkd.in/d9xz6VZ6 10 Guidelines for DataViz Accessibility by Øystein Moseng: https://lnkd.in/dS-XsKxw Accessible visualization via natural language descriptions by Alan Lundgard and Arvind Satyanarayan: https://lnkd.in/dpdN4skK DataViz Accessibility Advocacy and Advisory Group: https://lnkd.in/d336ACn3 Alt-texts: The Ultimate Guide by Daniel Göransson: https://lnkd.in/dsHcvPs2 JooYoung Seo’s Talk on non visual interactions with R packages: https://lnkd.in/dcid56BT Accessible Data Science for the Blind Using R: https://lnkd.in/dTWZbau8 Maya’s blog on skip links: https://lnkd.in/dFTYFxTk Twitter alt text: https://lnkd.in/d9bKqiPU Twitter account to follow: @alttextreminder Silvia Canelón’s blog posts: https://lnkd.in/drwbE2Rf
Packages shared: ggpattern: https://lnkd.in/dyTBvvz4 gglabeler: https://lnkd.in/dumA8Um8 gghighlight: https://lnkd.in/d_m25j7x sonify: https://lnkd.in/dyPwHimP tuneR: https://lnkd.in/dWi2WZH8 brailleR: https://lnkd.in/d_75cdnQ
Caleb brought up a great point that visualizing data isn’t new, so it can be helpful to look at adjacent disciplines to see how people have solved these problems before. For example, cartographers have had really creative ways to make things visible.
Sarah Bell, a cartographer who makes fonts and typography really legible on maps: sarahbellmaps.com/belltopo-sans-font-by-sarah-bell/
Cynthia Brewer’s work on color palettes: https://colorbrewer2.org/#
Cartographers who use 3D printing for tactile maps: https://touch-mapper.org/en/
Tom Mock | A Gentle Introduction to Tidy Statistics in R | RStudio (2019)
R is a fantastic language for statistical programming, but making the jump from point and click interfaces to code can be intimidating for individuals new to R. In this webinar I will gently cover how to get started quickly with the basics of research statistics in R, providing an emphasis on reading data into R, exploratory data analysis with the Tidyverse, statistical testing with ANOVAs, and finally producing a publication-ready plot in ggplot2.
Use the code presented instantly on RStudio Cloud!
RStudio Cloud: rstudio.cloud Webinar materials: https://rstudio.com/resources/webinars/a-gentle-introduction-to-tidy-statistics-in-r/
About Thomas: Thomas is involved in the local and global data science community, serving as Outreach Coordinator for the Dallas R User Group, as a mentor for the R for Data Science Online Learning Community, and as co-founder of #TidyTuesday. He attends various Data Science and R-related conferences and meetups, and participated in Startup Weekend Fort Worth as a data scientist/entrepreneur.
Mike Garcia | R in Pharma: Intro to Shiny | Posit
Slides: https://garciamikep.github.io/rstudioglobal-2021-shiny-slides/slides.html#1
From rstudio::global(2021) Pharma X-Sessions, sponsored by ProCogia: in this introduction to Shiny app development, we begin with a quick review of visualization with {ggplot2} and then cover core concepts in app structure and reactive programming. After building several Shiny apps of increasing complexity, we wrap up with a demonstration of how to include your Shiny app in a dashboard using the {flexdashboard} package.
About Mike Garcia: Mike is a Data Science Consultant with ProCogia, with a background in Biostatistics and experience in clinical trial design and public health research. If not geeking out on data with a cup of coffee and spreading his passion for R, he’s probably out enjoying the outdoors.
Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
To hear more about how other major pharmaceutical companies are transitioning to open source data science you can watch talks from this year’s R in Pharma conference: https://www.youtube.com/@RinPharma/playlists
At Posit, we have a dedicated Pharma team to help organizations migrate and utilize open source for drug development. To learn more about our support for life sciences, please see our dedicated Pharma page where you can book a call with our team. (https://posit.co/solutions/pharma)
Hadley Wickham | testthat 3.0.0 | RStudio (2020)
In this webinar, I’ll introduce some of the major changes coming in testthat 3.0.0. The biggest new idea in testthat 3.0.0 is the idea of an edition. You must deliberately choose to use the 3rd edition, which allows us to make breaking changes without breaking old packages. testthat 3e deprecates a number of older functions that we no longer believe are a good idea, and tweaks the behaviour of expect_equal() and expect_identical() to give considerably more informative output (using the new waldo package).
testthat 3e also introduces the idea of snapshot tests which record expected value in external files, rather than in code. This makes them particularly well suited to testing user output and complex objects. I’ll show off the main advantages of snapshot testing, and why it’s better than our previous approaches of verify_output() and expect_known_output().
Finally, I’ll go over a bunch of smaller quality-of-life improvements, including tweaks to test reporting and improvements to expect_error(), expect_warning() and expect_message().
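To make the edition and snapshot ideas above concrete, here is a minimal sketch of a 3rd-edition snapshot test. The function `describe_range()` is hypothetical, invented for this example. In a package you would normally opt in to the edition once, with `Config/testthat/edition: 3` in DESCRIPTION, rather than calling `local_edition()` in each test.

```r
library(testthat)

# Hypothetical function whose printed output we want to pin down.
describe_range <- function(x) {
  cat("Range:", min(x), "to", max(x), "\n")
  invisible(x)
}

test_that("describe_range() output is stable", {
  # Opt in to the 3rd edition for this test only.
  local_edition(3)
  # Records the code and its printed output in an external snapshot file
  # the first time it runs, and compares against that file afterwards.
  expect_snapshot(describe_range(c(3, 1, 2)))
})
```

This block is meant to live in a file under `tests/testthat/`. When run interactively, outside a test run, testthat prints the snapshot but does not compare it against a stored reference.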
Webinar materials: https://rstudio.com/resources/webinars/testthat-3/
About Hadley: Hadley Wickham is the Chief Scientist at RStudio, a member of the R Foundation, and Adjunct Professor at Stanford University and the University of Auckland. He builds tools (both computational and cognitive) to make data science easier, faster, and more fun. You may be familiar with his packages for data science (the tidyverse: including ggplot2, dplyr, tidyr, purrr, and readr) and principled software development (roxygen2, testthat, devtools, pkgdown). Much of the material for the course is drawn from two of his existing books, Advanced R and R Packages, but the course also includes a lot of new material that will eventually become a book called “Tidy tools”.

Nicole Kramer | A New Paradigm for Multifigure Coordinate-Based Plotting in R | RStudio
R is unparalleled in its ability to transform raw data into a wide array of beautiful graphics, all within the same environment. However, when it comes to complex, multi-paneled plots, users rely on 3rd party graphic design software to arrange plots. Here I present the new world of programmatic, coordinate-based multi-figure plotting in R. Employing grid Graphics and drawing from the paradigms of base plotting and ggplot2, I am developing a package that will revolutionize the way plots are laid out in R. Not only will individual plots be aesthetically customizable and tailored for speed, users will also be offered exquisite control over all aspects of page layout, plot placement, and arrangements. Come join me in changing how we plot in R!
About Nicole: Nicole Kramer is a third year Bioinformatics and Computational Biology graduate student at the University of North Carolina at Chapel Hill. She works in the lab of Dr. Doug Phanstiel, where she and her colleagues use experimental and computational techniques to study human genomics. Prior to grad school, Nicole received her B.S. in Biological Engineering from MIT in 2018. When not doing science, you can find Nicole petting dogs, admiring giraffes, or knitting tiny animals!
Jake Thompson | Branding and Packaging Reports with R Markdown | RStudio (2020)
The creation of research reports and manuscripts is a critical aspect of the work conducted by organizations and individual researchers. Most often, this process involves copying and pasting output from many different analyses into a separate document. Especially in organizations that produce annual reports for repeated analyses, this process can also involve applying incremental updates to annual reports. It is important to ensure that all relevant tables, figures, and numbers within the text are updated appropriately. Done manually, these processes are often error prone and inefficient. R Markdown is ideally suited to support these tasks. With R Markdown, users are able to conduct analyses directly in the document or read in output from a separate analysis pipeline. Tables, figures, and in-line results can then be dynamically populated and automatically numbered to ensure that everything is correctly updated when new data is provided. Additionally, the appearance of documents rendered with R Markdown can be customized to meet specific branding and formatting requirements of organizations and journals. In this presentation, we will present one implementation of customized R Markdown reports used for Accessible Teaching, Learning, and Assessment Systems (ATLAS) at the University of Kansas. A publicly available R package, ratlas, provides both Microsoft Word and LaTeX templates for different types of projects at ATLAS with their own unique formatting requirements. We will discuss how to create brand-specific templates, as well as how to incorporate the templates into an R package that can be used to unify report creation across an organization. We will also describe other components of branding reports beyond R Markdown templates, including customized ggplot2 themes, which can also be wrapped into the R package. Finally, we will share lessons learned from incorporating the R package workflow into an existing reporting pipeline.
https://rstudio.com/resources/rstudioconf-2020/branding-and-packaging-reports-with-r-markdown/
Kara Woo | Boxplots: a case study in debugging and perseverance | RStudio (2019)
Come on a journey through pull request #2196. What started as a seemingly simple fix for a bug in ggplot2’s box plots developed into an entirely new placement algorithm for ggplot2 geoms. This talk will cover tips and techniques for debugging, testing, and not smashing your computer when dealing with tricky bugs.
VIEW MATERIALS https://github.com/karawoo/2019-01-17-rstudioconf
About the Author: Kara Woo
Kara is a research scientist in data curation at Sage Bionetworks, where she helps other researchers document and share their data. She has previously worked as an information manager at Washington State University and at the National Center for Ecological Analysis and Synthesis (NCEAS), where she combined data management with fieldwork at a remote Siberian lake. Kara is an enthusiastic R programmer, and collects data visualizations gone beautifully wrong on a blog called accidental aRt.
Thomas Lin Pedersen | gganimate live cookbook | RStudio (2019)
Animation of data visualisation is becoming increasingly popular both as an attention grabber on social media and as a way to tell small data stories. gganimate is a package that extends ggplot2 for making animations and provides a grammar of animation on top of the grammar of graphics. This talk will quickly introduce gganimate, and then dive into a series of different animations and show how they were made and how they could be changed or expanded.
Slides: https://data-imaginist.com/slides/rstudioconf2019
Resources: https://resources.rstudio.com/rstudio-conf-2019/gganimate-live-cookbook
Discussion: https://community.rstudio.com/t/gganimate-live-cookbook-thomas-lin-pedersen-rstudio-conf-2019l-video/24852
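For readers who have not seen the grammar of animation, here is a minimal sketch of how gganimate layers a transition on top of an ordinary ggplot. It is not one of the recipes from the talk; the dataset (`mtcars`) and the choice of transition are assumptions for illustration.

```r
library(ggplot2)
library(gganimate)

# An ordinary static plot...
anim <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # ...plus a transition that steps through the values of `cyl`,
  transition_states(cyl, transition_length = 2, state_length = 1) +
  # an easing function controlling how points move between states,
  ease_aes("cubic-in-out") +
  # and a title that reports the state currently shown.
  labs(title = "Cylinders: {closest_state}")

# animate(anim) renders the frames; this needs a renderer such as the
# gifski package to be installed, so it is left commented out here.
# animate(anim)
```

Swapping `transition_states()` for another transition changes how the data is split into frames while the rest of the plot specification stays the same.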

Edzer Pebesma | Spatial data science in the Tidyverse | RStudio (2019)
Package sf (simple feature) and ggplot2::geom_sf have caused a fast uptake of tidy spatial data analysis by data scientists. Important spatial data science challenges are not handled by them, including raster and vector data cubes (e.g. socio-economic time series, satellite imagery, weather forecast or climate predictions data), and out-of-memory datasets. Powerful methods to analyse such datasets have been developed in packages stars (spatiotemporal tidy arrays) and tidync (tidy analysis of NetCDF files). This talk discusses how the simple feature and tidy data frameworks are extended to handle these challenging data types, and shows how R can be used for out-of-memory spatial and spatiotemporal datasets using tidy concepts.
VIEW MATERIALS https://edzer.github.io/rstudio_conf/2019/index.html
About the Author: Edzer Pebesma
I lead the spatio-temporal modelling laboratory at the institute for geoinformatics. I hold a PhD in geosciences, and am interested in spatial statistics, environmental modelling, geoinformatics and GI Science, semantic technology for spatial analysis, optimizing environmental monitoring, but also in e-Science and reproducible research. I am an ordinary member of the R Foundation. I am one of the authors of Applied Spatial Data Analysis with R (second edition), am Co-Editor-in-Chief for the Journal of Statistical Software, and associate editor for Spatial Statistics. I believe that research is useful in particular when it helps solve real-world problems.
Claus Wilke | Visualizing uncertainty with hypothetical outcomes plots | RStudio (2019)
Uncertainty is a key component of statistical inference. However, uncertainty is not easy to convey effectively in data visualizations. For example, viewers have a tendency to interpret visualizations of the most likely outcome as the only possible one. Viewers may also misjudge the likelihood of different possible outcomes or the extent to which moderately rare outcomes may deviate from the expectation. One way in which we can help the viewer grasp the amount of uncertainty present in a dataset is by showing a variety of different possible modeling outcomes at once. For example, in a linear regression, we could plot a number of different regression lines with slopes and intercepts drawn from the range of likely values, as determined by the variation in the data. Such visualizations are called Hypothetical Outcomes Plots (HOPs). HOPs can be made in static form, showing the various hypothetical outcomes all at once, or preferably in an animated form, where the display cycles between the different hypothetical outcomes. With recent progress in ggplot2-based animation, via gganimate, as well as packages such as tidybayes that make it easy to generate hypothetical outcomes, we can easily produce animated HOPs in a few lines of R code. This presentation will cover the key concepts, packages, and techniques to generate such visualizations.
VIEW MATERIALS: https://docs.google.com/presentation/d/1zMuBSADaxdFnosOPWJNA10DaxGEheW6gDxqEPYAuado/edit?usp=sharing
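The linear regression example from the abstract can be sketched as a static HOP in a few lines. This is not the speaker's code. It assumes a normal approximation to the sampling distribution of the coefficients (drawn with `MASS::mvrnorm()`) in place of the tidybayes workflow mentioned in the talk, and the dataset and number of draws are arbitrary choices.

```r
library(ggplot2)

set.seed(42)
fit <- lm(mpg ~ wt, data = mtcars)

# Assumption of this sketch: draw plausible (intercept, slope) pairs from a
# multivariate normal centred on the estimates, using their covariance matrix.
n_draws <- 20
draws <- MASS::mvrnorm(n_draws, mu = coef(fit), Sigma = vcov(fit))
lines_df <- data.frame(
  draw = seq_len(n_draws),
  intercept = draws[, 1],
  slope = draws[, 2]
)

# Static HOP: all hypothetical regression lines shown at once.
hop_static <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_abline(
    data = lines_df,
    aes(intercept = intercept, slope = slope),
    alpha = 0.4
  )
hop_static
```

For the animated form the abstract prefers, one option would be to add a gganimate transition over the `draw` column so that the display cycles through one line at a time.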


