
Sharla Gelfand 8d784f65f1 Rename package and rproj file for consistency		2021-04-05 18:11:07 -04:00
R	done adding back the titles	2020-11-25 09:35:46 -05:00
data	added files; need to fix dependencies	2020-11-22 15:06:37 -05:00
data-raw	added files; need to fix dependencies	2020-11-22 15:06:37 -05:00
man	First pass at README	2021-04-05 18:09:57 -04:00
tests	first commit	2020-11-22 14:51:51 -05:00
.DS_Store	added install instructions	2020-11-23 12:43:35 -05:00
.Rbuildignore	first commit	2020-11-22 14:51:51 -05:00
.gitignore	first commit	2020-11-22 14:51:51 -05:00
DESCRIPTION	Rename package and rproj file for consistency	2021-04-05 18:11:07 -04:00
NAMESPACE	basically working	2020-11-24 21:09:26 -05:00
README.Rmd	First pass at README	2021-04-05 18:09:57 -04:00
README.md	First pass at README	2021-04-05 18:09:57 -04:00
datamations.Rproj	Rename package and rproj file for consistency	2021-04-05 18:11:07 -04:00
start-up.R	first commit	2020-11-22 14:51:51 -05:00

README.md

datamations

Installation

You can install datamations from GitHub with:

# install.packages("devtools")
devtools::install_github("jhofman/datamations")

Usage

To get started, load datamations and the tidyverse, which provides the operations to animate:

library(datamations)
library(tidyverse)

Plot-based datamations

In plot-based datamations, the datamation shows a plot of what the data looks like at each step of a pipeline. The following shows an example taking the built-in small_salary data set, grouping by Degree, and calculating the mean Salary.

First, define the code for the pipeline, and titles for each of the steps:

mean_salary_by_degree_pipe <- "small_salary %>% group_by(Degree) %>% summarize(mean = mean(Salary))"

degree_title_step1 <- "Step 1: Each dot shows one person\n and each group shows degree type"
degree_title_step2 <- "Step 2: Next you plot the salary of each person\n within each group"
degree_title_step3 <- "Step 3: Lastly you plot the average salary \n of each group and zoom in"

And generate the datamation with datamation_sanddance():

datamation_sanddance(
  pipeline = mean_salary_by_degree_pipe,
  output = "mean_salary_group_by_degree.gif",
  titles = c(degree_title_step1, degree_title_step2, degree_title_step3),
  nframes = 30
)
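To check what a pipeline string computes before animating it, the string can be parsed and evaluated directly. The following is a minimal sketch, assuming dplyr is installed. It uses a made-up data frame, `toy_salary`, as a stand-in for small_salary; its values are invented for illustration and do not come from the package. It does not show how datamations handles the string internally.

```r
library(dplyr)

# Hypothetical stand-in for small_salary; values are invented for illustration
toy_salary <- data.frame(
  Degree = c("Masters", "Masters", "PhD", "PhD"),
  Salary = c(80, 90, 85, 95)
)

# Same shape of pipeline as above, written against the toy data
toy_pipe <- "toy_salary %>% group_by(Degree) %>% summarize(mean = mean(Salary))"

# Parse the string into an expression and evaluate it
toy_result <- eval(parse(text = toy_pipe))
print(toy_result)
```

The result has one row per Degree, which is the final state that the last step of the datamation animates toward.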

You can also group by multiple variables. This example groups by Degree and Work before calculating the mean Salary:

mean_salary_by_degree_work <- "small_salary %>% group_by(Degree, Work) %>% summarize(mean = mean(Salary))"
work_degree_title_step1 <- "Step 1: Each dot shows one person and each group\n shows degree type AND work setting"

datamation_sanddance(
  pipeline = mean_salary_by_degree_work,
  output = "mean_salary_group_by_degree_work.gif",
  titles = c(work_degree_title_step1, degree_title_step2, degree_title_step3),
  nframes = 30
)

Table-based datamations

A table-based datamation shows a mock table of what the data looks like at each step of a pipeline. The following shows our same first example: taking the built-in small_salary data set, grouping by Degree, and calculating the mean Salary, using the same pipeline and titles.

You can generate a table-based datamation with datamation_tibble():

datamation_tibble(
  pipeline = mean_salary_by_degree_pipe,
  output = "mean_salary_group_by_degree.gif"
)

# The command below takes 20 seconds to execute on my machine
pipeline <- "small_salary %>% group_by(Degree)"
datamations::datamation_tibble(pipeline, output = "salary_group_degree.gif")

# The command below takes 40 seconds to execute on my machine
pipeline <- "mtcars %>% group_by(cyl)"
datamations::datamation_tibble(pipeline, output = "mtcars_group_cyl.gif")

# The command below takes 50 seconds to execute on my machine
pipeline <- "small_salary %>% group_by(Degree, Work) %>% summarize(Avg_Salary = mean(Salary))"
datamations::datamation_tibble(pipeline, output = "salary_group2_summarize_mean.gif")

# The command below takes 50 seconds to execute on my machine
pipeline <- "small_salary %>% group_by(Degree, Work) %>% summarize(Avg_Salary = mean(Salary))"
datamations::datamation_tibble(pipeline,
  output = "salary_group2_summarize_mean.gif",
  titles = c(
    "Grouping by Degree and Work",
    "Calculating the Mean of Each Group"
  )
)
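To see the data "at each step" as a plain table rather than an animation, one option is to split the pipeline string on the pipe operator and evaluate each cumulative prefix. This is only a sketch under stated assumptions: dplyr is installed, `toy_salary` is a made-up stand-in for small_salary with invented values, and the naive split would break if a step contained the text `%>%` inside a string literal. It is not how datamation_tibble() is implemented.

```r
library(dplyr)

# Hypothetical stand-in for small_salary; values are invented for illustration
toy_salary <- data.frame(
  Degree = c("Masters", "Masters", "PhD", "PhD"),
  Work   = c("Academia", "Industry", "Academia", "Industry"),
  Salary = c(80, 90, 85, 95)
)

toy_pipe <- "toy_salary %>% group_by(Degree) %>% summarize(mean = mean(Salary))"

# Split on the pipe operator to recover the individual steps
steps <- trimws(strsplit(toy_pipe, "%>%", fixed = TRUE)[[1]])

# Evaluate each cumulative prefix of the pipeline to get the data at that stage
stages <- lapply(seq_along(steps), function(i) {
  partial <- paste(steps[seq_len(i)], collapse = " %>% ")
  eval(parse(text = partial))
})
names(stages) <- steps

# Print each stage under the step that produced it
for (step in names(stages)) {
  cat("After:", step, "\n")
  print(stages[[step]])
}
```

Each element of `stages` corresponds to one frame group of the table-based view: the raw data, the grouped data, and the summarized result.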