Software

tidypredict

Run predictions inside the database

tidypredict.tidymodels.org

263 stars

33 forks

The tidypredict package converts R model objects into formulas that can be executed inside databases via SQL. It parses fitted models (such as lm, glm, randomForest, and xgboost) and extracts the coefficients and structure needed to generate predictions without requiring the original model object or an R environment.
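As a rough illustration of the conversion described above, the following sketch fits a small linear model and turns it into both an R expression and a SQL expression. It assumes the tidypredict and dbplyr packages are installed; the choice of `mtcars` and of the predictors is purely illustrative, and `dbplyr::simulate_dbi()` stands in for a real database connection.

```r
# A minimal sketch, assuming tidypredict and dbplyr are installed.
library(tidypredict)

# Fit an ordinary linear model on a built-in dataset (illustrative choice).
model <- lm(mpg ~ wt + cyl, data = mtcars)

# Convert the fitted model into an unevaluated R expression built from
# its coefficients.
fit_expr <- tidypredict_fit(model)
print(fit_expr)

# Translate the same model into SQL. simulate_dbi() is a stand-in
# connection, so no database is needed for this sketch.
sql_expr <- tidypredict_sql(model, dbplyr::simulate_dbi())
print(sql_expr)

# Sanity check: evaluating the expression against the data should agree
# with predict() on the original model object.
manual <- eval(fit_expr, mtcars)
agrees <- all.equal(unname(manual), unname(predict(model, mtcars)))
print(agrees)
```

With a real backend, the simulated connection would be replaced by an actual DBI connection so that the generated SQL matches that database's dialect.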

This package solves the problem of scoring models at scale by pushing predictions into the database layer rather than pulling data into R. It works through dplyr’s database interface to support multiple SQL backends, eliminating the need to save model objects as .rds files or use PMML for deployment. The package also provides a parsed model specification format that can be stored as a simple data structure and works with parsnip-fitted models from the tidymodels ecosystem.
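The parsed model specification and the dplyr-based scoring mentioned above might look like the sketch below. It assumes tidypredict is installed and uses a local data frame in place of a remote database table, so it only approximates the in-database workflow; the model and dataset are illustrative.

```r
# A minimal sketch, assuming tidypredict is installed.
library(tidypredict)

# Illustrative model; any supported model type could be used instead.
model <- lm(mpg ~ wt + cyl, data = mtcars)

# Parse the model into a plain list-based specification that no longer
# depends on the original model object.
parsed <- parse_model(model)
str(parsed, max.level = 1)

# The specification alone is enough to rebuild the prediction formula.
spec_expr <- tidypredict_fit(parsed)
print(spec_expr)

# Add predictions as a new column (named "fit" by default). A local data
# frame is used here; a remote table from a database connection would be
# passed in the same position.
scored <- tidypredict_to_column(mtcars, model)
head(scored[, c("mpg", "fit")])
```

Because the specification is an ordinary list, it can be written to a text format and reloaded later, which is what makes it an alternative to shipping the fitted model object itself.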

Contributors

Edgar Ruiz

Senior Software Engineer

Emil Hvitfeldt

Senior Software Engineer

Max Kuhn

Principal Software Engineer

Simon Couch

Senior Software Engineer

Julia Silge

Engineering Manager

Hannah Frick

Senior Software Engineer

Davis Vaughan

Principal Software Engineer

Resources featuring tidypredict

video

Emil Hvitfeldt - Tidypredict with recipes, turn workflow to SQL, spark, duckdb and beyond

tidypredict is one of my favorite packages. Being able to turn a fitted model object into an equation is very powerful! However, in tidymodels we increasingly use recipes for preprocessing. Until now, tidypredict had no support for recipes, which severely limited its usefulness. This talk is about how I fixed that. After spending a couple of years thinking about the problem, I finally found a way! Being able to turn a tidymodels workflow into a series of equations for prediction is super powerful. For some use cases, running a model's predictions inside SQL, Spark, or DuckDB makes certain problems easier to handle.

Talk by Emil Hvitfeldt

Slides: https://emilhvitfeldt.github.io/talk-orbital-positconf/

GitHub Repo: https://github.com/EmilHvitfeldt/talk-orbital-positconf/tree/main

Emil Hvitfeldt

Oct 31, 2024

20 min

504 views

tidymodels tidypredict

video

Edgar Ruiz - GitHub Copilot in RStudio

Presentation slides available at https://colorado.posit.co/rsc/rstudio-copilot/#/TitleSlide

Speaker Bio: Edgar Ruiz is a solutions engineer at Posit with a background in deploying enterprise reporting and business intelligence solutions. He is the author of multiple articles and blog posts sharing analytics insights and server infrastructure for data science. Edgar is the author and administrator of the https://db.rstudio.com web site and the current administrator of the sparklyr web site, https://spark.rstudio.com. He is a co-author of the dbplyr package and the creator of the dbplot, tidypredict, and modeldb packages.

Presented at the 2023 R/Pharma Conference (October 26, 2023)

Edgar Ruiz

Dec 11, 2023

9 min

219 views

dbplyr modeldb rstudio tidypredict

video

Edgar Ruiz | Programación con R | RStudio (2019)

Sometimes, when working on an analysis in R, we need to split our data into groups and then run the same operation on each group. For example, the data may contain several countries, and we need to build one model per country. Another case is running multiple operations on the same data. These cases require knowing how to iterate in R. This webinar focuses on how to use the purrr package to help solve this kind of problem.

Download materials: https://rstudio.com/resources/webinars/programacio-n-con-r/

About Edgar: Edgar Ruiz is a Solutions Engineer at RStudio. He administers the official sparklyr and R for databases web sites. He is also the author of the R packages dbplot, tidypredict, and modeldb, and a co-author of the dbplyr package.

Edgar Ruiz

Mar 15, 2021

58 min

891 views

dbplyr modeldb purrr rstudio tidypredict webinars Rstudio Data Science Machine Learning Python Stats Tidyverse Data Visualization Data Viz Ggplot Technology Coding Connect Server Pro Shiny RMarkdown Package Manager CRAN Interoperability Serious Data Science Dplyr Ggplot2 Tibble Readr Stringr Tidyr Purrr Github Data Wrangling Tidy Data Odbc Rayshader Plumber Blogdown Gt Lazy Evaluation Tidymodels Statistics Debugging Programming Education Forcats Rstats Open Source OSS Reticulate Edgar Ruiz

Posts about tidypredict

blog post

tidypredict 1.0.0

tidypredict 1.0.0 brings faster computations for tree-based models, more efficient tree representations, glmnet model support, and a change in how random forests are handled

Emil Hvitfeldt

Dec 10, 2025

tidyverse tidymodels tidypredict orbital Machine Learning Tidyverse Packages

blog post

tidymodels updates

The latest updates to the tidymodels packages

Max Kuhn, Edgar Ruiz, Davis Vaughan

Sep 5, 2019

tidyverse tidymodels recipes embed rsample parsnip corrr tidypredict yardstick Machine Learning Tidyverse Packages