DBI
A database interface (DBI) definition for communication between R and RDBMSs
DBI is a database interface package for R that provides a standardized front-end API for connecting to and working with database management systems. It defines a common set of methods that work across different databases through backend-specific packages like RPostgres, RMariaDB, RSQLite, and odbc.
The package solves the problem of database-specific code by providing a unified interface inspired by similar systems in other languages (Perl’s DBI, Java’s JDBC, Python’s DB-API). It supports essential database operations including connections, query execution, result extraction, transaction management, and metadata retrieval. This means you can write R code that works with multiple database systems by only changing the backend driver, not your application logic.
Contributors#
Resources featuring DBI#
Workflow Demo Live Q&A - September 25th!
On September 25th, we hosted a Workflow Demo on data-level permissions using Posit Connect (with Databricks, Snowflake, OAuth: https://youtu.be/ivEoeyWJzVY?feature=shared )
Links mentioned in the Q&A: Release Blurb: https://docs.posit.co/connect/news/#posit-connect-2024.08.0 Security: https://docs.posit.co/connect/admin/integrations/oauth-integrations/security.html Publishing Quarto: https://docs.posit.co/connect/how-to/basic/publish-databricks-quarto-notebook/ sparklyr: https://github.com/sparklyr/sparklyr?tab=readme-ov-file#connecting-through-databricks-connect-v2 odbc: https://github.com/r-dbi/odbc?tab=readme-ov-file#odbc-
Helpful resources for this workflow: Full examples to get you started: https://github.com/posit-dev/posit-sdk-py/tree/main/examples/connect Admins will likely be most interested in starting here: https://docs.posit.co/connect/admin/integrations/oauth-integrations/databricks/ End users will be most interested here: https://docs.posit.co/connect/user/oauth-integrations/ Databricks Integrations with Python Cookbook: https://docs.posit.co/connect/cookbook/content/integrations/databricks/python/ Databricks Integrations with R Cookbook: https://docs.posit.co/connect/cookbook/content/integrations/databricks/r/ Snowflake Integrations with Python Cookbook: https://docs.posit.co/connect/cookbook/content/integrations/snowflake/python/
Data-level permissions using Posit Connect (with Databricks, Snowflake, OAuth)
Should one viewer of your app be able to see more (or different) data than another? Maybe colleagues in California should only see data relevant to them? Or managers should only have access to their own employee data?
The Connect team joined us for a demo on inheriting data-level permissions using Posit Connect and Databricks Unity Catalog. While this workflow uses Databricks to illustrate federated data access controls, this same methodology can also be applied to Snowflake or any external data source that supports OAuth.
During this workflow demo, you will learn:
- How to define row-level access controls in Databricks Unity Catalog
- How to create a Databricks OAuth integration in Posit Connect
- How to write interactive applications that utilize the viewer’s Databricks credentials when reading data from Databricks Unity Catalog, providing the viewer with a personalized experience depending on their level of data access
- How to deploy this application to Posit Connect and share it within your organization
If you’d like to talk further with our team 1:1 about doing this, you can find a time to chat here: https://posit.co/schedule-a-call/?booking_calendar__c=WorkflowDemo
Ps. To enable OAuth integrations, your team will need to upgrade to Posit Connect 2024.08.0. This feature is available in Enhanced and Advanced product tiers.
Helpful resources for this workflow: Full examples to get you started: https://github.com/posit-dev/posit-sdk-py/tree/main/examples/connect Admins will likely be most interested in starting here: https://docs.posit.co/connect/admin/integrations/oauth-integrations/databricks/ End users will be most interested here: https://docs.posit.co/connect/user/oauth-integrations/ Q&A Link: https://youtube.com/live/TZQY6rm6hU4?feature=share
Additional resources shared: Release Blurb: https://docs.posit.co/connect/news/#posit-connect-2024.08.0 Security: https://docs.posit.co/connect/admin/integrations/oauth-integrations/security.html Publishing Quarto: https://docs.posit.co/connect/how-to/basic/publish-databricks-quarto-notebook/ sparklyr: https://github.com/sparklyr/sparklyr?tab=readme-ov-file#connecting-through-databricks-connect-v2 odbc: https://github.com/r-dbi/odbc?tab=readme-ov-file#odbc-
Connecting RStudio and Databricks with ODBC
The odbc package, in conjunction with a driver, provides DBI support and an ODBC connection.
With the new odbc::databricks_connect function, you can create an ODBC connection to determine and configure the necessary settings to access your Databricks account. Your Databricks HTTP path is the only argument you need to run databricks_connect(). Provide your HTTP path and you will be able to see your Databricks data in the RStudio Connections Pane. Then, you can analyze your data in RStudio.
Learn more:
- Databricks x Posit: https://posit.co/solutions/databricks/
- Empowering R and Python Developers: Databricks and Posit Announce New Integrations: https://posit.co/blog/databricks-and-posit-announce-new-integrations/
- RStudio IDE and Posit Workbench 2023.12.0: What’s New: https://posit.co/blog/rstudio-2023-12-0-whats-new/
- Posit Professional Drivers 2024.03.0: Support for Apple Silicon: https://posit.co/blog/pro-drivers-2024-03-0/
Contact our sales team to schedule a demo: https://posit.co/schedule-a-call/?booking_calendar__c=Databricks
Hadley Wickham @ Posit | Giving benefit to people using what you build | Data Science Hangout
We were recently joined by Hadley Wickham, Chief Scientist at Posit PBC. Listen in to hear our chat about building tools (like the tidyverse) to make data science easier, faster, and more fun.
36:57 - While I’m bought into developing open source packages to help deliver better processes, any advice to those of us doing that development in getting their company bought in?
You have to give some benefit to the people using (what you’re building)
You’ve got to either remove pain or add pleasure in some way because if you can’t do that and you’re not someone’s direct supervisor, it’s hard to get people to change.
The way I think about the tidyverse is, how do we give people some sort of quick wins so they can be motivated to do the things that are slower where they’re gonna have to learn some new ideas or some new tools. You kind of build up some equity with that person.
They build trust that you’ve helped them in the past and now they’re willing to invest a little bit more time before they see the payoff. But in the early days, it’s all about delivering payoffs as quickly as possible.
And I think if you’re doing, like, you know “my company’s first R package” - the easy pain points are: make themes for your company corporate style guide, make a ggplot2 theme, make an R Markdown, a Quarto theme. Make a Shiny theme that people can just use to get, you know, something that’s reasonably close to whatever your corporate style guide dictates.
That just feels like an easy win for people because it makes them look good inside the corporation and because you’ve put in all the hard work, it’s like three seconds for them to type the right function name to get the right theme.
I think the other bit is making it easier to get access to data. Set up some wrappers around DBI connections to the most important data sources. Provide some conventions around authentication so that stuff just works so that they’re not struggling with “What packages do I need to install? What’s the password? Where’s the path I need?” Just give them some, like, a list of the top ten most common data sources and people will love you by and large.
Follow-up question: Once you identify the things that you think would be useful for people - do you have a philosophy or a way in which you approach putting things together?
When you’re in an environment of scarcity when you’ve only got so much time that you can take out of your everyday job to invest in writing a package, it’s really tough to balance. Like, how do I add new stuff versus making sure the old stuff continues to work?
I think, again, some of it’s about building up trust. So, give people some wins so that when you inevitably break stuff, you’ve got some kind of cushion so people aren’t going to be really angry with you right away. They’re gonna be like, ok, well there’s a little bit of suffering now, but this person saved me so much time.
But yeah, it’s really hard. And particularly as you’re starting out, like, you’re going to make mistakes. That’s inevitable.
You’re going to do things that when you look back a year later, you’re like, why on earth did I do it that way? You’ll want to rip out the whole thing and ride it from scratch. And I think that if it feels horrible, you have to remember, that’s great. It means you’ve grown immensely as a programmer.
Certainly if you have my kind of mindset, you have to resist the temptation to rip things out and redo them as much as possible and just focus on making the next generation better rather than breaking what stuff people already have.
So I don’t have any great answers here, but I think you just have to think about those tensions of “how do I keep my forward velocity up while getting better as a programmer and evolving over time, but also thinking about how do you make the things you did a long time ago better?”
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here: Website: https://www.posit.co LinkedIn: https://www.linkedin.com/company/posit-software Twitter: https://twitter.com/posit_pbc
To join future data science hangouts, add to your calendar here: pos.it/dsh (All are welcome! We’d love to see you!)
Come hangout with us!
