Grammar of Data Manipulation
Resources tagged Grammar of Data Manipulation
Working with Two Datasets: Binds, Set Operations, and Joins – Pt 4 Intro to Data Manipulation
Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier. Keep your R code clean and clear and reduce the cognitive load required for common but often complex data science tasks.
dplyr docs: dplyr.tidyverse.org/reference/
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup https://youtu.be/jOd65mR1zfw
- 01:44 Intro and what’s covered; ground rules
- 02:40 What’s a tibble
- 04:50 Use View
- 05:25 The pipe operator
- 07:20 What do I mean by data wrangling?
Pt. 2: Tidy Data and tidyr https://youtu.be/1ELALQlO-yM
- 00:48 Goal 1: Making your data suitable for R
- 01:40 tidyr: “Tidy” data introduced and motivated
- 08:10 tidyr::gather
- 12:30 tidyr::spread
- 15:23 tidyr::unite
- 15:23 tidyr::separate
Pt. 3: Data manipulation tools: dplyr https://youtu.be/Zc_ufg4uW4U
- 00:40 Setup
- 02:00 dplyr::select
- 03:40 dplyr::filter
- 05:05 dplyr::mutate
- 07:05 dplyr::summarise
- 08:30 dplyr::arrange
- 09:55 Combining these tools with the pipe (setup for the Grammar of Data Manipulation)
- 11:45 dplyr::group_by
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins https://youtu.be/AuBgYDCg1Cg
Combining two datasets together:
- 00:42 dplyr::bind_cols
- 01:27 dplyr::bind_rows
- 01:42 Set operations: dplyr::union, dplyr::intersect, dplyr::setdiff
- 02:15 Joining data: dplyr::left_join, dplyr::inner_join, dplyr::right_join, dplyr::full_join
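The three families of two-table operations covered in Pt. 4 can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the tables `a` and `b` and their columns are made up for the example. Note that dplyr's set-difference function is spelled `setdiff`.

```r
library(dplyr)

# Hypothetical tables (made up for illustration, not from the video)
a <- tibble(id = c(1, 2, 3), x = c("a1", "a2", "a3"))
b <- tibble(id = c(2, 3, 4), y = c("b2", "b3", "b4"))

# Binds: stack rows or glue columns side by side (no key matching)
stacked <- bind_rows(a, a)             # rows of a, twice
side    <- bind_cols(a, select(b, y))  # pairs rows by position only

# Set operations expect tables with the same columns
ids_a  <- select(a, id)
ids_b  <- select(b, id)
both   <- intersect(ids_a, ids_b)      # ids present in both tables
either <- union(ids_a, ids_b)          # ids in either table, duplicates dropped
only_a <- setdiff(ids_a, ids_b)        # ids in a but not in b

# Joins match rows on a shared key column
left  <- left_join(a, b, by = "id")    # keeps every row of a
inner <- inner_join(a, b, by = "id")   # keeps only ids found in both
right <- right_join(a, b, by = "id")   # keeps every row of b
full  <- full_join(a, b, by = "id")    # keeps every row of a and b
```

Binds pair rows or columns purely by position, so `bind_cols` is only safe when both tables are already in the same row order; joins are usually the safer choice when a key column exists.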
Cheatsheets: https://www.rstudio.com/resources/cheatsheets/
Documentation:
tidyr docs: tidyr.tidyverse.org/reference/
tidyr vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
dplyr docs: http://dplyr.tidyverse.org/reference/
dplyr one-table vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
dplyr two-table (join operations) vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html
What is data wrangling? Intro, Motivation, Outline, Setup – Pt. 1 Data Wrangling Introduction
Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier. These videos introduce you to these tools. Keep your R code clean and clear and reduce the cognitive load required for common but often complex data science tasks.
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup https://youtu.be/jOd65mR1zfw
- 01:44 Intro and what’s covered; ground rules
- 02:40 What’s a tibble
- 04:50 Use View
- 05:25 The pipe operator
- 07:20 What do I mean by data wrangling?
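As a rough sketch of two of the setup topics listed above (tibbles and the pipe), the snippet below builds a small tibble and runs the same two steps with and without the pipe. The `scores` data is invented for illustration and is not taken from the video.

```r
library(dplyr)  # re-exports tibble() and the %>% pipe

# A small, made-up tibble (hypothetical data)
scores <- tibble(
  student = c("Ana", "Ben", "Cy"),
  score   = c(90, 72, 85)
)

# Without the pipe: nested calls read inside-out
nested <- arrange(filter(scores, score > 80), desc(score))

# With the pipe: each step reads left to right, top to bottom
piped <- scores %>%
  filter(score > 80) %>%
  arrange(desc(score))

# Both forms compute the same result
all.equal(nested, piped)
```

The pipe passes the value on its left as the first argument of the call on its right, which is why the piped version lists the steps in the order they happen.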
Pt. 2: Tidy Data and tidyr https://youtu.be/1ELALQlO-yM
- 00:48 Goal 1: Making your data suitable for R
- 01:40 tidyr: “Tidy” data introduced and motivated
- 08:15 tidyr::gather
- 12:38 tidyr::spread
- 15:30 tidyr::unite
- 15:30 tidyr::separate
Pt. 3: Data manipulation tools: dplyr https://youtu.be/Zc_ufg4uW4U
- 00:40 Setup
- 02:00 dplyr::select
- 03:40 dplyr::filter
- 05:05 dplyr::mutate
- 07:05 dplyr::summarise
- 08:30 dplyr::arrange
- 09:55 Combining these tools with the pipe (setup for the Grammar of Data Manipulation)
- 11:45 dplyr::group_by
- 15:00 dplyr::group_by
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins https://youtu.be/AuBgYDCg1Cg
Combining two datasets together:
- 00:42 dplyr::bind_cols
- 01:27 dplyr::bind_rows
- 01:42 Set operations: dplyr::union, dplyr::intersect, dplyr::setdiff
- 02:15 Joining data: dplyr::left_join, dplyr::inner_join, dplyr::right_join, dplyr::full_join
Cheatsheets: https://www.rstudio.com/resources/cheatsheets/
Documentation:
tidyr docs: tidyr.tidyverse.org/reference/
tidyr vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
dplyr docs: http://dplyr.tidyverse.org/reference/
dplyr one-table vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
dplyr two-table (join operations) vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html
New York Times, “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights”, by Steve Lohr, Aug. 17, 2014: https://www.nytimes.com/2014/08/18/technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html
Tidy Data and tidyr – Pt 2 Intro to Data Wrangling with R and the Tidyverse
Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier. Keep your code clean and clear and reduce the cognitive load required for common but often complex data science tasks.
http://tidyr.tidyverse.org/reference/
- http://tidyr.tidyverse.org/reference/gather
- http://tidyr.tidyverse.org/reference/spread
- http://tidyr.tidyverse.org/reference/unite
- http://tidyr.tidyverse.org/reference/separate
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup https://youtu.be/jOd65mR1zfw
- 01:44 Intro and what’s covered; ground rules
- 02:40 What’s a tibble
- 04:50 Use View
- 05:25 The pipe operator
- 07:20 What do I mean by data wrangling?
Pt. 2: Tidy Data and tidyr https://youtu.be/1ELALQlO-yM
- 00:48 Goal 1: Making your data suitable for R
- 01:40 tidyr: “Tidy” data introduced and motivated
- 08:10 tidyr::gather
- 12:30 tidyr::spread
- 15:23 tidyr::unite
- 15:23 tidyr::separate
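The four tidyr functions listed above can be sketched on a toy table as follows. The `wide` data and its column names are made up for illustration and do not come from the video. In tidyr 1.0.0 and later, `gather` and `spread` still work but are superseded by `pivot_longer` and `pivot_wider`.

```r
library(tidyr)

# Hypothetical "wide" data: one column per year (made up for illustration)
wide <- tibble::tibble(
  country = c("A", "B"),
  `2019`  = c(10, 20),
  `2020`  = c(11, 22)
)

# gather: wide -> long, one row per country-year pair
long <- gather(wide, key = "year", value = "cases", `2019`, `2020`)

# spread: long -> wide again, one column per year
wide_again <- spread(long, key = year, value = cases)

# unite: paste two columns into one
united <- unite(long, "id", country, year, sep = "_")

# separate: split that column back into its parts
separated <- separate(united, id, into = c("country", "year"), sep = "_")
```

Here `gather` and `spread` undo each other, as do `unite` and `separate`, which makes the pairs easy to check against one another on small data.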
Pt. 3: Data manipulation tools: dplyr https://youtu.be/Zc_ufg4uW4U
- 00:40 Setup
- 02:00 dplyr::select
- 03:40 dplyr::filter
- 05:05 dplyr::mutate
- 07:05 dplyr::summarise
- 08:30 dplyr::arrange
- 09:55 Combining these tools with the pipe (setup for the Grammar of Data Manipulation)
- 11:45 dplyr::group_by
- 15:00 dplyr::group_by
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins https://youtu.be/AuBgYDCg1Cg
Combining two datasets together:
- 00:42 dplyr::bind_cols
- 01:27 dplyr::bind_rows
- 01:42 Set operations: dplyr::union, dplyr::intersect, dplyr::setdiff
- 02:15 Joining data: dplyr::left_join, dplyr::inner_join, dplyr::right_join, dplyr::full_join
Cheatsheets: https://www.rstudio.com/resources/cheatsheets/
Documentation:
tidyr docs: tidyr.tidyverse.org/reference/
tidyr vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
dplyr docs: http://dplyr.tidyverse.org/reference/
dplyr one-table vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
dplyr two-table (join operations) vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html
Data Manipulation Tools: dplyr – Pt 3 Intro to the Grammar of Data Manipulation with R
Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier. Keep your code clean and clear and reduce the cognitive load required for common but often complex data science tasks.
dplyr docs: dplyr.tidyverse.org/reference/
- http://dplyr.tidyverse.org/reference/union.html
- http://dplyr.tidyverse.org/reference/intersect.html
- http://dplyr.tidyverse.org/reference/set_diff.htm
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup https://youtu.be/jOd65mR1zfw
- 01:44 Intro and what’s covered; ground rules
- 02:40 What’s a tibble
- 04:50 Use View
- 05:25 The pipe operator
- 07:20 What do I mean by data wrangling?
Pt. 2: Tidy Data and tidyr https://youtu.be/1ELALQlO-yM
- 00:48 Goal 1: Making your data suitable for R
- 01:40 tidyr: “Tidy” data introduced and motivated
- 08:10 tidyr::gather
- 12:30 tidyr::spread
- 15:23 tidyr::unite
- 15:23 tidyr::separate
Pt. 3: Data manipulation tools: dplyr https://youtu.be/Zc_ufg4uW4U
- 00:40 Setup
- 02:00 dplyr::select
- 03:40 dplyr::filter
- 05:05 dplyr::mutate
- 07:05 dplyr::summarise
- 08:30 dplyr::arrange
- 09:55 Combining these tools with the pipe (setup for the Grammar of Data Manipulation)
- 11:45 dplyr::group_by
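The single-table verbs listed above can be chained with the pipe into one readable pipeline. The sketch below is only an illustration of that idea: the `sales` data and its columns are invented, not taken from the video.

```r
library(dplyr)

# Hypothetical data (made up for illustration)
sales <- tibble(
  region = c("east", "east", "west", "west"),
  units  = c(5, 15, 10, 30),
  price  = c(2, 2, 3, 3)
)

result <- sales %>%
  select(region, units, price) %>%     # keep only these columns
  filter(units > 5) %>%                # keep rows that meet a condition
  mutate(revenue = units * price) %>%  # add a computed column
  group_by(region) %>%                 # later steps run once per region
  summarise(total = sum(revenue)) %>%  # collapse each group to one row
  arrange(desc(total))                 # sort, largest total first
```

Each verb takes a data frame and returns a data frame, which is what lets them combine freely in any order the analysis needs.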
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins https://youtu.be/AuBgYDCg1Cg
Combining two datasets together:
- 00:42 dplyr::bind_cols
- 01:27 dplyr::bind_rows
- 01:42 Set operations: dplyr::union, dplyr::intersect, dplyr::setdiff
- 02:15 Joining data: dplyr::left_join, dplyr::inner_join, dplyr::right_join, dplyr::full_join
Cheatsheets: https://www.rstudio.com/resources/cheatsheets/
Documentation:
tidyr docs: tidyr.tidyverse.org/reference/
tidyr vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
dplyr docs: http://dplyr.tidyverse.org/reference/
dplyr one-table vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
dplyr two-table (join operations) vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html