<!-- README.md is generated from README.Rmd. Please edit that file -->

# datamations

<!-- badges: start -->

[![R-CMD-check](https://github.com/microsoft/datamations/workflows/R-CMD-check/badge.svg)](https://github.com/microsoft/datamations/actions)
<!-- badges: end -->

datamations is a framework for automatically generating explanations of
the steps in an analysis pipeline. It turns code into animations that
show the state of the data at each step of the analysis.

For more information, please visit the [package
website](https://microsoft.github.io/datamations/), which includes
[additional
examples](https://microsoft.github.io/datamations/articles/Examples.html),
[defaults and
conventions](https://microsoft.github.io/datamations/articles/details.html),
and more.

## Installation

You can install datamations from GitHub with:

``` r
# install.packages("devtools")
devtools::install_github("microsoft/datamations")
```

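If you would rather not reinstall on every run of a script, a guarded install is one option. The sketch below is only an illustration: `install_if_missing()` is a hypothetical helper and not part of datamations, and it assumes the `remotes` package (which provides `install_github()`) is an acceptable substitute for `devtools`.

``` r
# Hypothetical helper (not part of datamations): install a GitHub package
# only when it is not already available in the current library.
install_if_missing <- function(pkg, repo) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE)) # already installed, nothing to do
  }
  if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
  }
  remotes::install_github(repo)
  invisible(TRUE)
}

# Only attempt the install in an interactive session
if (interactive()) {
  install_if_missing("datamations", "microsoft/datamations")
}
```

The helper returns `FALSE` invisibly when the package is already present and `TRUE` after an install attempt, so a script can tell the two cases apart.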
## Usage

To get started, load datamations and dplyr, as the example below does.

A datamation shows a plot of what the data looks like at each step of a
tidyverse pipeline, animated by the transitions that lead to each state.
The following example takes the built-in `small_salary` data
set, groups by `Degree`, and calculates the mean `Salary`.

First, define the code for the pipeline, then generate the datamation
with `datamation_sanddance()`:

``` r
library(datamations)
library(dplyr)

"small_salary %>%
  group_by(Degree) %>%
  summarize(mean = mean(Salary))" %>%
  datamation_sanddance()
```

<img src="man/figures/README-mean_salary_grouped_degree.gif" width="80%" />

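As a rough illustration of the states a datamation animates between, the sketch below runs the same group-then-summarize steps with plain dplyr on a small made-up data frame. `toy_salary` and its values are hypothetical stand-ins for `small_salary`, and the intermediate object names are chosen only for this example.

``` r
library(dplyr)

# Hypothetical stand-in for small_salary (values are made up for illustration)
toy_salary <- tibble(
  Degree = c("Masters", "Masters", "PhD", "PhD"),
  Salary = c(80, 90, 85, 95)
)

# State 1: the raw data, one row per person
toy_salary

# State 2: the same rows, now grouped by Degree
grouped <- toy_salary %>% group_by(Degree)

# State 3: one row per group, holding the mean Salary
result <- grouped %>% summarize(mean = mean(Salary))
result
```

With these made-up values, `result` has two rows: a mean of 85 for `Masters` and a mean of 90 for `PhD`.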
datamations supports the following `dplyr` functions:

- `group_by()` (up to three grouping variables)
- `summarize()`/`summarise()` (limited to summarizing one variable)
- `filter()`
- `count()`/`tally()`
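The other supported verbs can be tried in the same way, by passing the pipeline as a string. The sketch below is an untested illustration rather than documented behavior: the `Salary > 90` threshold is an arbitrary assumption about the values in `small_salary`, and it assumes `filter()` can be combined with `group_by()` and `summarize()` in one pipeline. The `requireNamespace()` guard is only there so the snippet does nothing when datamations is not installed.

``` r
library(dplyr)

# Pipelines are plain strings, so they can be stored in variables and reused
count_code <- "small_salary %>% count(Degree)"

# Assumption: 90 is an arbitrary threshold chosen for illustration
filter_code <- "small_salary %>%
  filter(Salary > 90) %>%
  group_by(Degree) %>%
  summarize(mean = mean(Salary))"

if (requireNamespace("datamations", quietly = TRUE)) {
  library(datamations)
  # Piping a string into a function is the same as passing it as the first argument
  print(datamation_sanddance(count_code))
  print(datamation_sanddance(filter_code))
}
```

Keeping each pipeline in its own variable makes it easy to check the code string before generating the animation.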