ggplot2

Open Source in Pharma | Harvey Lieberman | Data Science Hangout

To join future data science hangouts, add it to your calendar here: https://pos.it/dsh - All are welcome! We’d love to see you!

We were recently joined by Harvey Lieberman, Associate Director of Data Science at Novartis, to chat about R/Pharma, automating processes, career advice, and data science in drug discovery vs. development.

In this Hangout, Harvey talks about a lot of things, like the power of automating processes. He shares examples of how automating mundane tasks can save significant time and identify errors that humans might miss (we all know human error is a thing!). For instance, he automated the analysis of data from 48 Excel sheets that had previously taken a colleague about three months to process by hand; Harvey completed the automated analysis in one hour over lunch and found copying and pasting errors in the original manual process! Automating processes not only increases efficiency but can also help move people into more data-focused roles. Harvey suggests demonstrating that automation speeds things up and, most importantly, removes errors, which is when people start to pay attention and get interested.
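The kind of automation described above can be sketched in a few lines of R. This is not Harvey's code: the workbook, its sheet names, and its columns are hypothetical, and the readxl and writexl packages are an assumption about tooling. The sketch reads every sheet the same way, stacks the rows, and flags rows that repeat within a sheet, which is one way copy-and-paste errors show up.

```r
library(readxl)
library(writexl)

# Hypothetical workbook: three sheets of readings, one with a pasted duplicate row.
sheets <- list(
  plate_01 = data.frame(well = c("A1", "A2", "A3"), signal = c(0.42, 0.57, 0.61)),
  plate_02 = data.frame(well = c("A1", "A2", "A3"), signal = c(0.40, 0.55, 0.66)),
  plate_03 = data.frame(well = c("A1", "A2", "A2"), signal = c(0.39, 0.58, 0.58))
)
path <- tempfile(fileext = ".xlsx")
write_xlsx(sheets, path)

# Read one sheet and tag each row with the sheet it came from.
read_one <- function(sheet) {
  df <- as.data.frame(read_excel(path, sheet = sheet))
  df$sheet <- sheet
  df
}

# Apply the same reader to every sheet in the workbook and stack the results.
all_rows <- do.call(rbind, lapply(excel_sheets(path), read_one))

# Flag rows that repeat an earlier row within the same sheet.
all_rows$duplicate <- duplicated(all_rows[, c("sheet", "well", "signal")])

# One summary value per sheet.
summary_by_sheet <- aggregate(signal ~ sheet, data = all_rows, FUN = mean)

print(all_rows[all_rows$duplicate, ])
print(summary_by_sheet)
```

Because the same function runs on every sheet, adding a 49th sheet costs nothing, and the duplicate check runs every time rather than relying on someone spotting the repeat by eye.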

Resources mentioned in the video and Zoom chat:
R/Pharma website → https://rinpharma.com/
Cecilia Baldoni’s scrollytelling project (on shrews!) → https://cecibaldoni.github.io/projects.html
Advent of Code → https://adventofcode.com/
Pharmaverse.org (pharmaceutical R packages) → https://pharmaverse.org
GSK’s Journey to R → https://www.youtube.com/watch?v=xDrt6txplek
Roche’s Journey to R → https://www.youtube.com/watch?v=BlJNILSoZlM
R/Pharma March 2025 newsletter (LinkedIn) → https://www.linkedin.com/pulse/rpharma-march-2025-newsletter-open-source-in-pharma-wmf5c/
ggplot2 extenders club → https://ggplot2-extenders.github.io/ggplot-extension-club/
Coursera: Making Data Science Work for Clinical Reporting Course → https://www.coursera.org/learn/making-data-science-work-for-clinical-reporting
hiring.cafe (for finding R jobs) → https://hiring.cafe/
Posit’s PydyTuesday GitHub → https://github.com/posit-dev/python-tidytuesday
Joy’s Law (management concept) Wikipedia → https://en.wikipedia.org/wiki/Joy%27s_law_(management)

If you didn’t join live, one great discussion you missed from the Zoom chat was about the diverse backgrounds of attendees. Many participants shared that they came to data science “sideways,” holding degrees in fields such as sociology, psychology, mathematics, atmospheric science, education, history, chemistry, and various engineering disciplines, rather than traditional statistics or computational degrees. So many data scientists have non-traditional paths into the field, and we’re all better together.


Thanks for hanging out with us!

Janssens, Chow & Nieuwdorp - Turning DataFrames into Pretty Pictures with Plotnine | PyData NYC 2024

www.pydata.org

Learn how Plotnine, a Python package inspired by R’s ggplot2, enables the creation of sophisticated and effective data visualizations with minimal effort. This tutorial will explain how Plotnine’s grammar of graphics approach provides a flexible, intuitive way to visualize data, either as ad-hoc plots or fine-tuned graphs suited for communication.

Quick links
Presentation: https://bit.ly/plotnine-tutorial
GitHub repository: https://bit.ly/plotnine-repo
Slideshow about what to expect: https://bit.ly/expect-plotnine

PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.


Quarto Dashboards 3: Theming and Styling | Mine Çetinkaya-Rundel | Posit

Theming and styling Quarto dashboards built with R and/or Python.

Before watching this video, you might want to watch Parts 1 & 2.

This video takes you through:

0:00 - Theming (including Bootswatch themes, light/dark mode, customizing themes with SCSS)
3:55 - Styling
4:55 - Live coding demo

Slides can be found at https://mine.quarto.pub/quarto-dashboards/3-theming-styling and the starter documents for the accompanying exercises at https://github.com/mine-cetinkaya-rundel/olympicdash .

Materials for all parts of the videos can be accessed at https://mine.quarto.pub/quarto-dashboards .

You already analyze and summarize your data in computational notebooks with R and/or Python. What’s next? You can share your insights, or let others draw their own conclusions, in eye-catching Quarto Dashboards that are straightforward to author, design, and deploy, regardless of the language of your data processing, visualization, or analysis. With Quarto Dashboards, you can create elegant and production-ready dashboards using a variety of components, including static graphics (ggplot2, Matplotlib, Seaborn, etc.), interactive widgets (Plotly, Leaflet, Jupyter Widgets, htmlwidgets, etc.), tabular data, value boxes, text annotations, and more. Additionally, with intelligent resizing of components, your Quarto Dashboards look great on devices of all sizes. And importantly, you can author Quarto Dashboards without leaving the comfort of your “home”: plain text markdown in any text editor (VS Code, RStudio, Neovim, etc.) or any notebook editor (JupyterLab, etc.).

This workshop will walk you through building an increasingly complex dashboard using various layout options and deploying it as a static web page (with no special server required) as well as with a Shiny Server on the backend for enhanced interactivity.

This course is for you if you:

do data analysis in computational notebooks
share your results with your audience in static or interactive dashboards
want to improve the design, user interface, and experience of your dashboards

Mine Çetinkaya-Rundel

ggplot2 leaflet Quarto rstudio Shiny Rstudio Data Science Machine Learning Python Stats Tidyverse Data Visualization Data Viz Ggplot Technology Coding Connect Server Pro Shiny RMarkdown Package Manager CRAN Interoperability Serious Data Science Dplyr Forcats Ggplot2 Tibble Readr Stringr Tidyr Purrr Github Data Wrangling Tidy Data Odbc Rayshader Plumber Blogdown Gt Lazy Evaluation Tidymodels Statistics Debugging Programming Education Rstats Open Source OSS Reticulate

Hadley Wickham - R in Production

R in Production by Hadley Wickham

Visit https://rstats.ai for information on upcoming conferences.

Abstract: In this talk, we delve into the strategic deployment of R in production environments, guided by three core principles to elevate your work from individual exploration to scalable, collaborative data science. The essence of putting R into production lies not just in executing code but in crafting solutions that are robust, repeatable, and collaborative, guided by three key principles:

Not just once: Successful data science projects are not a one-off, but will be run repeatedly for months or years. I’ll discuss some of the challenges for creating R scripts and applications that run repeatedly, handle new data seamlessly, and adapt to evolving analytical requirements without constant manual intervention. This principle ensures your analyses are enduring assets, not throwaway toys.
Not just my computer: The transition from development on your laptop (usually Windows or Mac) to a production environment (usually Linux) introduces a number of challenges. Here, I’ll discuss some strategies for making R code portable, how you can minimise pain when something inevitably goes wrong, and a few unresolved auth challenges that we’re currently working on.
Not just me: R is not just a tool for individual analysts but a platform for collaboration. I’ll cover some of the best practices for writing readable, understandable code, and how you might go about sharing that code with your colleagues. This principle underscores the importance of building R projects that are accessible, editable, and usable by others, fostering a culture of collaboration and knowledge sharing.

By adhering to these principles, we pave the way for R to be a powerful tool not just for individual analyses but as a cornerstone of enterprise-level data science solutions. Join me to explore how to harness the full potential of R in production, creating workflows that are robust, portable, and collaborative.

Bio: Hadley is Chief Scientist at Posit PBC, winner of the 2019 COPSS award, and a member of the R Foundation. He builds tools (both computational and cognitive) to make data science easier, faster, and more fun. His work includes packages for data science (like the tidyverse, which includes ggplot2, dplyr, and tidyr) and principled software development (e.g. roxygen2, testthat, and pkgdown). He is also a writer, educator, and speaker promoting the use of R for data science. Learn more on his website, http://hadley.nz .

Mastodon: https://fosstodon.org/@hadleywickham

Presented at the 2024 New York R Conference (May 17, 2024) Hosted by Lander Analytics (https://landeranalytics.com )

Hadley Wickham

posit::conf(2023) Workshop: Engaging and Beautiful Data Visualizations with ggplot2

Register now: http://pos.it/conf Instructor: Cédric Scherer Workshop Duration: 1-Day Workshop

This course will be appropriate for you if you: • already know how to create basic graphics with the ggplot2 package • aim to improve the design of your ggplot outputs • want to learn how to create more complex charts which feature multiple layers, annotations, text styling, custom themes, and more

Creating effective, accessible, high-quality data visualizations in an efficient and preferably reproducible way is an essential skill for everyone working in a data-related field. Luckily, by leveraging the functionality of ggplot2, the most famous package for data visualization with R, and related extension packages, one can create highly customized data visualizations without the need for post-processing.

This workshop provides everything one needs to know to create and customize numerous chart types with ggplot2. Participants will learn the most important steps and helpful tips to create visually appealing and informative graphics with a code-only approach. The power of ggplot2 and related extension packages will be illustrated with advanced real-life examples that help to understand useful coding tricks and the process of creating engaging and effective visualizations. The workshop will particularly focus on more advanced tasks with ggplot2 such as styling labels and titles, customizing themes and visual aesthetics, and using less-common chart types.


posit::conf(2023) Workshop: Introduction to Data Science with R and Tidyverse

Register now: http://pos.it/conf Instructors: Posit Academy Instructors Workshop Duration: 2-Day Workshop

This course is ideal for: • those new to R or the Tidyverse • anyone who has dabbled in R, but now wants a rigorous foundation in up-to-date data science best practices • SAS and Excel users looking to switch their workflows to R

This is not a standard workshop, but a six-week online apprenticeship that culminates in two in-person days at posit::conf(2023). Begins August 7th, 2023. No knowledge of R required. Visit posit.co/academy to learn more about this uniquely effective learning format.

Here, you will learn the foundations of R and the Tidyverse under the guidance of a Posit Academy mentor and in the company of a close group of fellow learners. You will be expected to complete a weekly curriculum of interactive tutorials, and to attend a weekly presentation meeting with your mentor and fellow students. Topics will include the basics of R, importing data, visualizing data with ggplot2, wrangling data with dplyr and tidyr, working with strings, factors, and date-times, modelling data with base R, and reporting reproducibly with Quarto.


posit::conf(2023) Workshop: Steal like an Rtist: Creative Coding in R

Register now: http://pos.it/conf Instructors: Ijeamaka Anyene Fumagalli & Sharla Gelfand Workshop Duration: 1-Day Workshop

This workshop is for you if you: • are comfortable with R and RStudio and have experience with the tidyverse and ggplot2 • are interested in applying data visualization skills more creatively, but may not know where to start or how to develop style/inspiration • are an artist interested in exploring code as another medium for creating your work

R is a tool for data analysis but also can be used for self-expression. This workshop will be an introduction to creative coding in R in order to make visual art. We will take an inspiration-first approach, using compelling pieces to discuss and learn the techniques that shape the work. This workshop takes guidance from its namesake, the book “Steal Like An Artist” by Austin Kleon - once we have identified and learned to recreate existing works, we will cover how to take this inspiration and transform, remix, or reinterpret it in the pursuit of developing our own work and artistic styles.

This workshop is hands-on and will cover color theory and manipulation, a reintroduction of the data frame as the foundation for creating art (instead of just for analyzing data!), using ggplot2 as an artistic canvas, creating basic and specialized shapes, tiling and pattern making, developing your own functions, and using iteration. We will also discuss how to use controlled randomness to convert a standalone piece into a generative art system that can produce many distinct outputs. Creative coding may seem a world apart from data analysis, but we see a large overlap and intersection of the skills used in both, not to mention the creative muscles that are already used in data visualization.
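As a rough illustration of "controlled randomness", here is a minimal sketch in which a data frame of randomly placed tiles is drawn with ggplot2 and the random seed is the only input. It is not taken from the workshop materials: `make_piece()`, the grid size, and the palette are all hypothetical choices.

```r
library(ggplot2)

# make_piece() is a hypothetical helper: one seed in, one plot out.
make_piece <- function(seed, n = 60) {
  set.seed(seed)  # controlled randomness: the same seed reproduces the same piece
  tiles <- data.frame(
    x = sample(1:10, n, replace = TRUE),
    y = sample(1:10, n, replace = TRUE),
    size = runif(n, 0.3, 1),
    shade = sample(c("#264653", "#2A9D8F", "#E9C46A", "#E76F51"), n, replace = TRUE)
  )
  # The data frame is the artwork; ggplot2 is only the canvas.
  ggplot(tiles, aes(x, y, width = size, height = size, fill = shade)) +
    geom_tile(alpha = 0.8) +
    scale_fill_identity() +
    coord_equal() +
    theme_void()
}

piece_a <- make_piece(seed = 1)
piece_b <- make_piece(seed = 2)
```

Changing the seed gives a different piece from the same system, while reusing a seed reproduces a piece exactly.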


posit::conf(2023) Workshop: Tidy time series and forecasting in R

Register now: http://pos.it/conf Instructor: Rob J Hyndman Workshop Duration: 2-Day Workshop

This course is for you if you: • already use the tidyverse packages in R such as dplyr, tidyr, tibble and ggplot2 • need to analyze large collections of related time series • would like to learn how to use some tidy tools for time series analysis including visualization, decomposition and forecasting

It is common for organizations to collect huge amounts of data over time, and existing time series analysis tools are not always suitable to handle the scale, frequency and structure of the data collected. In this workshop, we will look at some packages and methods that have been developed to handle the analysis of large collections of time series.

On day 1, we will look at the tsibble data structure for flexibly managing collections of related time series. We will look at how to do data wrangling, data visualizations and exploratory data analysis. We will explore feature-based methods to explore time series data in high dimensions. A similar feature-based approach can be used to identify anomalous time series within a collection of time series, or to cluster or classify time series. Primary packages for day 1 will be tsibble, lubridate and feasts (along with the tidyverse of course).

Day 2 will be about forecasting. We will look at some classical time series models and how they are automated in the fable package, and we will explore the creation of ensemble forecasts and hybrid forecasts. Best practices for evaluating forecast accuracy will also be covered. Finally, we will look at forecast reconciliation, allowing millions of time series to be forecast in a relatively short time while accounting for constraints on how the series are related.
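For orientation, the tsibble-to-fable workflow described above might look roughly like the sketch below. It is not from the workshop materials: the built-in `USAccDeaths` series and the two models are illustrative assumptions, and ensembles and reconciliation are left out.

```r
library(tsibble)
library(fable)

# Convert a built-in monthly ts object into a tsibble (columns: index, value).
deaths <- as_tsibble(USAccDeaths)

# Fit two classical models; fable selects the ETS specification automatically.
fit <- model(
  deaths,
  ets = ETS(value),
  snaive = SNAIVE(value)
)

# Forecast twelve months ahead for each model.
fc <- forecast(fit, h = "1 year")

# Training-set accuracy measures for both models.
accuracy(fit)
```

The same calls apply unchanged to a tsibble holding many keyed series, which is what makes the approach suit large collections.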


Open Source Chat - {gt} with Rich Iannone

Join Rich Iannone, maintainer of the {gt} package, as he takes questions from the community about the latest in {gt} v0.7.0, and building great looking data display tables with R.

Key Resources: ⬡ Get started with {gt} - https://gt.rstudio.com

Reach out:
38:48 - How do I ask Rich about {gt}, feature requests, bug reports, how to solve a problem via {gt}? Rich and the {gt} team would love to hear from you.
⬡ Feature requests & bug reports with GitHub Issues, https://github.com/rstudio/gt/issues
⬡ GitHub Discussions, https://github.com/rstudio/gt/discussions
⬡ Ask the community a question, https://community.rstudio.com/tag/gt
⬡ Follow {gt} on Twitter, feel free to reach out and ask questions, https://twitter.com/gt_package

Timestamps
Rich Iannone Introduction.
03:52 - Why {gt}? What does {gt} bring to the table? Why so much effort into static, data display tables?
05:50 - Why open source? Why is {gt} open source and why have you dedicated your career to developing open source software?
08:30 - {gt} v0.7.0: Tell us about those new vector formatting functions in {gt}. Why did you include them? Could you show us some examples? {gt}’s vector formatting functions help you customize the styling, look and feel of your values. Converting the output values R gives you, and making them look exactly the way you want them to, can be tricky. A lot of work was put into {gt} to give nice value formatting options. You can now access all these outside of a gt table, e.g. in text, in a plot, etc.
22:35 - Could you provide an example or two with the new styling function called opt_stylize()? What kinds of tables can you make with that? Can you extend that with your own tweaks?
28:15 - Can you make your own themes and share them? “How do I create my own custom theme for my table? A theme I can share with the rest of my organization?”
31:58 - What is the distinction between tab_options and the opt_* functions? Why would a function be in opt_* and not tab_options?
34:00 - sub_values() function, to find and replace certain values in your table.
36:50 - What is the current support for LaTeX in {gt} at the moment? “Personally, I much prefer HTML, but for scientific publications, we are asked to provide a LaTeX file.”
42:50 - “In my work, I often produce A4 output in PDF, mainly with ggplot2 content. It would be nice to be able to combine ggplot + gt tables in a similar way {patchwork} works. Having the plot and the table next to it is very useful sometimes.”
44:30 - Interactive Tables with {gt}?
47:45 - “Any plans to make applying of same style to several columns easier? Unless I’m mistaken, the locations argument of tab_style requires one to specify an individual column. See here: https://gt.rstudio.com/reference/tab_style.html#examples.” Yes, supply a vector of columns or use tidyselect functions.
49:15 - “Excel output with {gt}? Would be a huge improvement. I often have to produce tabular output that can be easily reused. Usually it means Excel tables. So far I have mainly done this with Python and openpyxl or PyWin32 (through COM). A simple solution in R would be great.”
50:20 - Support for additional output formats with {gt}? Excel, PowerPoint, etc.?
50:25 - {pointblank}, a package to methodically validate your data whether in the form of data frames or as database tables, https://rich-iannone.github.io/pointblank/ . Check out the workshop materials at https://github.com/rich-iannone/pointblank-workshop
55:50 - “Are there ways to have grouped rows? I mean when repeated rows have same characters can we merge them to one?”
58:00 - “Is there an ability to add ‘battleship coordinates’ (e.g. column letters & row numbers) to a gt object? This is a standard for table across my org and I’ve been trying to figure out how to implement it.”
59:59 - “Do you have suggestions or examples of building out & applying corporate formatting to gt tables (e.g. adding a company logo, company colors, etc.)?”
01:04:30 - “With PDF/LaTeX output for wide tables, it does not shrink the table.”
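Several of the questions above (vector formatting, opt_stylize(), styling several columns at once, sub_values()) can be tried in a few lines. The sketch below is not from the video: it assumes gt v0.7.0 or later, and the data and styling choices are arbitrary.

```r
library(gt)

# Vector formatters work outside a table, e.g. for plot labels or inline text.
num_label <- vec_fmt_number(c(1234.567, 0.5), decimals = 1, output = "plain")
pct_label <- vec_fmt_percent(0.256, decimals = 1, output = "plain")
usd_label <- vec_fmt_currency(1999.5, currency = "USD", output = "plain")

tbl <- head(mtcars[, c("mpg", "cyl", "disp", "hp")]) |>
  gt(rownames_to_stub = TRUE) |>
  # One of the built-in table looks added alongside the vector formatters.
  opt_stylize(style = 6, color = "cyan") |>
  # One style applied to several columns by supplying a vector of columns.
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_body(columns = c(mpg, hp))
  ) |>
  # Or select the columns with a tidyselect helper.
  tab_style(
    style = cell_fill(color = "lightyellow"),
    locations = cells_body(columns = starts_with("d"))
  ) |>
  # Find and replace specific values in the body.
  sub_values(columns = cyl, values = 4, replacement = "four")
```

Printing `tbl` in an interactive session renders the table. The `vec_fmt_*()` results are plain character vectors, so they can go straight into plot labels or inline text.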

Rich Iannone


Mike Garcia | R in Pharma: Intro to Shiny | Posit

Slides: https://garciamikep.github.io/rstudioglobal-2021-shiny-slides/slides.html#1

From rstudio::global(2021) Pharma X-Sessions, sponsored by ProCogia: in this introduction to Shiny app development, we begin with a quick review of visualization with {ggplot2} and then cover core concepts in app structure and reactive programming. After building several Shiny apps of increasing complexity, we wrap up with a demonstration of how to include your Shiny app in a dashboard using the {flexdashboard} package.
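To make "app structure and reactive programming" concrete, here is a minimal sketch of a Shiny app with a UI, a server function, and one reactive expression feeding a ggplot2 plot. It is not taken from Mike's slides; the input names and the mtcars data are illustrative assumptions.

```r
library(shiny)
library(ggplot2)

# UI: what the user sees (one input, one output).
ui <- fluidPage(
  titlePanel("mtcars explorer"),
  sidebarLayout(
    sidebarPanel(
      selectInput("xvar", "X variable", choices = c("wt", "hp", "disp"))
    ),
    mainPanel(plotOutput("scatter"))
  )
)

# Server: how outputs are computed from inputs.
server <- function(input, output, session) {
  # Reactive expression: re-evaluated only when input$xvar changes.
  plot_data <- reactive({
    data.frame(x = mtcars[[input$xvar]], mpg = mtcars$mpg)
  })

  output$scatter <- renderPlot({
    ggplot(plot_data(), aes(x, mpg)) +
      geom_point() +
      labs(x = input$xvar)
  })
}

app <- shinyApp(ui, server)
# In an interactive session, launch it with: runApp(app)
```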

About Mike Garcia: Mike is a Data Science Consultant with ProCogia, with a background in Biostatistics and experience in clinical trial design and public health research. If not geeking out on data with a cup of coffee and spreading his passion for R, he’s probably out enjoying the outdoors.

Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/

To hear more about how other major pharmaceutical companies are transitioning to open source data science you can watch talks from this year’s R in Pharma conference: https://www.youtube.com/@RinPharma/playlists

At Posit, we have a dedicated Pharma team to help organizations migrate and utilize open source for drug development. To learn more about our support for life sciences, please see our dedicated Pharma page where you can book a call with our team. (https://posit.co/solutions/pharma )


Kara Woo | Always look on the bright side of plots | RStudio

Everyone who creates visualizations in R is bound to make mistakes that prevent their plots from looking as they should. Sometimes, these mistakes create beautiful “accidental aRt”, though other times they’re just plain frustrating. Either way, however, there’s something to be learned. This talk will draw on years of watching both the ggplot2 issue tracker and the @accidental__aRt twitter account to highlight some common plot foibles and explain what they can teach us about how ggplot2 works.

About Kara: Kara Woo is a research scientist in data curation at Sage Bionetworks, where she builds tools to help researchers document and share their data. Kara is a core developer of ggplot2 and collects data visualizations gone beautifully wrong on a blog called accidental aRt.


Hadley Wickham | Maintaining the house the tidyverse built | RStudio

Hadley will talk about how the tidyverse has evolved since its creation (just five years ago!). You’ll learn about our greatest successes, learn from our biggest failures, and get some hints of what’s coming down the pipeline for the future.

About Hadley: Hadley Wickham is the Chief Scientist at RStudio, a member of the R Foundation, and Adjunct Professor at Stanford University and the University of Auckland. He builds tools (both computational and cognitive) to make data science easier, faster, and more fun. You may be familiar with his packages for data science (the tidyverse: including ggplot2, dplyr, tidyr, purrr, and readr) and principled software development (roxygen2, testthat, devtools, pkgdown). Much of the material for the course is drawn from two of his existing books, Advanced R and R Packages, but the course also includes a lot of new material that will eventually become a book called “Tidy tools”.

Hadley Wickham


Jake Thompson | Branding and Packaging Reports with R Markdown | RStudio (2020)

The creation of research reports and manuscripts is a critical aspect of the work conducted by organizations and individual researchers. Most often, this process involves copying and pasting output from many different analyses into a separate document. Especially in organizations that produce annual reports for repeated analyses, this process can also involve applying incremental updates to annual reports. It is important to ensure that all relevant tables, figures, and numbers within the text are updated appropriately. Done manually, these processes are often error-prone and inefficient. R Markdown is ideally suited to support these tasks. With R Markdown, users are able to conduct analyses directly in the document or read in output from a separate analysis pipeline. Tables, figures, and in-line results can then be dynamically populated and automatically numbered to ensure that everything is correctly updated when new data is provided. Additionally, the appearance of documents rendered with R Markdown can be customized to meet specific branding and formatting requirements of organizations and journals. In this presentation, we will present one implementation of customized R Markdown reports used for Accessible Teaching, Learning, and Assessment Systems (ATLAS) at the University of Kansas. A publicly available R package, ratlas, provides both Microsoft Word and LaTeX templates for different types of projects at ATLAS with their own unique formatting requirements. We will discuss how to create brand-specific templates, as well as how to incorporate the templates into an R package that can be used to unify report creation across an organization. We will also describe other components of branding reports beyond R Markdown templates, including customized ggplot2 themes, which can also be wrapped into the R package. Finally, we will share lessons learned from incorporating the R package workflow into an existing reporting pipeline.
https://rstudio.com/resources/rstudioconf-2020/branding-and-packaging-reports-with-r-markdown/
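A customized ggplot2 theme of the kind mentioned in the abstract is usually just a function that returns a theme object, which makes it easy to export from a package. The sketch below is an assumption about the general pattern, not the ratlas implementation: `theme_brand()` and both colour values are hypothetical.

```r
library(ggplot2)

# Hypothetical brand colours; a real package would take these from the style guide.
brand_primary <- "#1F4E79"
brand_muted <- "#7A7A7A"

# A theme function that could be exported from an organization's R package.
theme_brand <- function(base_size = 11) {
  theme_minimal(base_size = base_size) +
    theme(
      plot.title = element_text(face = "bold", colour = brand_primary),
      axis.title = element_text(colour = brand_muted),
      panel.grid.minor = element_blank(),
      legend.position = "bottom"
    )
}

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(colour = brand_primary) +
  labs(title = "Fuel economy by weight") +
  theme_brand()
```

Keeping the colours and the theme in one package means a branding change is made once and picked up by every report the next time it is rendered.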


Dewey Dunnington | Best practices for programming with ggplot2 | RStudio (2020)

The ggplot2 package is widely acknowledged as a powerful, dynamic, and easy-to-learn graphics framework when used in an interactive environment


Ian Lyttle | Small Team, Big Value: Using R to Design Visualizations | RStudio (2020)

Many R users can feel isolated due to the prevalence of Python or Tableau at their institutions. This talk will focus on how we use R to develop reference implementations of visualizations (using ggplot2), and to develop corporate-themed color maps (using the colorspace package) to bring value to the entire institution. Color maps can be translated into a variety of formats, for Tableau, Qlik Sense, d3, etc., and deployed independently from R. For visualizations, our goal is to translate ggplot2 objects to Vega-Lite specifications, using a package we are developing: ggvega. Vega-Lite visualizations are web-native, and are rendered independently from R. Specifications can be designed to be extensible to new data, allowing them to serve as templates, to be deployed and updated for use outside of R. Of course, despite isolation within an institution, our work with the larger R open-source communities provides a foundation on which to build; in fact, we have a lot of company and are having a lot of fun.


Thomas Lin Pedersen | Extending your ability to extend ggplot2 | RStudio (2020)

The ggplot2 package continues to be one of the most used frameworks for producing graphics in R. While extremely flexible, the package itself can be constrained by the types of graphic elements and statistical transformations available. Instead of continuing to add new features, development in recent years has focused on making ggplot2 extensible by other packages, thus distributing development and maintenance. Despite the best of intentions, ggplot2 can feel daunting to extend, due to unusual idiosyncrasies, a foreign object system, and a partly obscured rendering model. This talk intends to remove the mystery of extending ggplot2 by describing the basic ways that it can be extended and showcasing a couple of simple extensions that can be built with very little code. Lastly, it will include discussions of some best practices and gotchas that may come in handy when you start out.
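As an example of how little code a simple extension needs, the sketch below defines a new stat that reduces each group to its convex hull. It follows the well-known convex hull example from ggplot2's own extension documentation and is not necessarily one of the extensions shown in the talk.

```r
library(ggplot2)

# A ggproto object inheriting from Stat: keep only the convex hull of each group.
StatChull <- ggproto("StatChull", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales) {
    data[chull(data$x, data$y), , drop = FALSE]
  }
)

# The user-facing constructor wraps the ggproto object in a layer.
stat_chull <- function(mapping = NULL, data = NULL, geom = "polygon",
                       position = "identity", na.rm = FALSE,
                       show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    stat = StatChull, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend,
    inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...)
  )
}

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  stat_chull(fill = NA, colour = "black")
```

The "foreign object system" the abstract mentions is ggproto. Here it amounts to one object with a single overridden method, while ggplot2 handles grouping, scales, and rendering.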

Thomas Lin Pedersen


Claus Wilke | Spruce up your ggplot2 visualizations with formatted text | RStudio (2020)

The ggtext package provides various functions to add formatted text to ggplot2 figures, both in the form of plot or axis labels and in the form of text labels or text boxes inside the plot panel. Text formatting can be achieved through a small subset of markdown, HTML, and CSS directives. Features currently supported include italics, bold, superscript and subscript, as well as changing font size, font family, and color. Basic support for adding images to formatted text is also available.
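A minimal sketch of formatted labels with ggtext follows, assuming the package is installed. The data, wording, and colour are arbitrary: the title mixes markdown bold with an HTML span, and the y-axis label uses a superscript.

```r
library(ggplot2)
library(ggtext)

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  labs(
    # Markdown bold plus an HTML span with a CSS colour.
    title = "Fuel economy falls as **weight** <span style='color:#D55E00'>rises</span>",
    x = "Weight (1000 lb)",
    # HTML superscript in an axis label.
    y = "Efficiency (miles gal<sup>-1</sup>)"
  ) +
  theme(
    # element_markdown() tells ggplot2 to render these labels as formatted text.
    plot.title = element_markdown(),
    axis.title.y = element_markdown()
  )
```

Only theme elements switched to `element_markdown()` are parsed, so the x-axis label here is still drawn as plain text.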