tidypredict
Run predictions inside the database
The tidypredict package converts R model objects into formulas that can be executed inside databases via SQL. It parses fitted models (like lm, glm, randomForest, xgboost, and others) and extracts the coefficients and structure needed to generate predictions without requiring the original model object or R environment.
This package solves the problem of scoring models at scale by pushing predictions into the database layer rather than pulling data into R. It works through dplyr’s database interface to support multiple SQL backends, eliminating the need to save model objects as .rds files or use PMML for deployment. The package also provides a parsed model specification format that can be stored as a simple data structure and works with parsnip-fitted models from the tidymodels ecosystem.
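To make the idea concrete, here is a minimal sketch in Python of what "turning a model into a formula that runs in the database" means for a linear model. It is not tidypredict's code or API: the function `linear_model_to_sql`, the coefficient values, and the `cars` table are all hypothetical, chosen only for illustration. The sketch builds a SQL arithmetic expression from a set of coefficients and lets an in-memory SQLite database compute the predictions.

```python
import sqlite3

# Hypothetical coefficients, standing in for what a fitted linear model exposes.
coefs = {"(Intercept)": 37.0, "wt": -3.0, "cyl": -1.5}


def linear_model_to_sql(coefs, intercept_name="(Intercept)"):
    """Build a SQL arithmetic expression from a mapping of term -> coefficient."""
    terms = [repr(float(coefs[intercept_name]))]
    for name, value in coefs.items():
        if name != intercept_name:
            # Quote the column name and multiply it by its coefficient.
            terms.append(f'("{name}" * {float(value)!r})')
    return " + ".join(terms)


sql_expr = linear_model_to_sql(coefs)

# Score rows inside the database: only the expression is sent, not a model object.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (wt REAL, cyl REAL)")
conn.executemany("INSERT INTO cars VALUES (?, ?)", [(2.5, 4), (3.0, 6)])
rows = conn.execute(
    f"SELECT wt, cyl, {sql_expr} AS fit FROM cars ORDER BY wt"
).fetchall()

print(sql_expr)  # 37.0 + ("wt" * -3.0) + ("cyl" * -1.5)
print(rows)      # [(2.5, 4.0, 23.5), (3.0, 6.0, 19.0)]
```

Because the prediction is plain arithmetic over column names, the same expression can in principle be handed to any SQL backend; the data never has to leave the database.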
Resources featuring tidypredict
Emil Hvitfeldt - Tidypredict with recipes, turn workflow to SQL, spark, duckdb and beyond
Tidypredict is one of my favorite packages. Being able to turn a fitted model object into an equation is very powerful! However, in tidymodels, we increasingly use recipes for preprocessing. Until now, tidypredict had no support for recipes, which severely limited its uses. This talk is about how I fixed that issue. After spending a couple of years thinking about this problem, I finally found a way! Being able to turn a tidymodels workflow into a series of equations for prediction is super powerful. In some cases, being able to run a model's predictions inside SQL, Spark, or DuckDB makes certain problems much easier to handle.
Talk by Emil Hvitfeldt
Slides: https://emilhvitfeldt.github.io/talk-orbital-positconf/
GitHub Repo: https://github.com/EmilHvitfeldt/talk-orbital-positconf/tree/main

Edgar Ruiz - GitHub Copilot in RStudio
Presentation slides available at https://colorado.posit.co/rsc/rstudio-copilot/#/TitleSlide
Speaker Bio: Edgar Ruiz is a solutions engineer at Posit with a background in deploying enterprise reporting and business intelligence solutions. He is the author of multiple articles and blog posts sharing analytics insights and server infrastructure for data science. Edgar is the author and administrator of the https://db.rstudio.com web site and the current administrator of the sparklyr web site, https://spark.rstudio.com. He is a co-author of the dbplyr package and the creator of the dbplot, tidypredict, and modeldb packages.
Presented at the 2023 R/Pharma Conference (October 26, 2023)

Edgar Ruiz | Programación con R (Programming with R) | RStudio (2019)
Sometimes, when working on an analysis in R, we need to split our data into groups and then run the same operation on each group. For example, the data may contain several countries, and we need to fit one model per country. Another case is running multiple operations on the same data. These cases require knowing how to iterate in R. This webinar focuses on how to use the purrr package to help solve this kind of problem.
Download materials: https://rstudio.com/resources/webinars/programacio-n-con-r/
About Edgar: Edgar Ruiz is a Solutions Engineer at RStudio. He administers the official sparklyr and R for databases web sites. He is also the author of the R packages dbplot, tidypredict, and modeldb, and a co-author of the dbplyr package.


