This commit is contained in:
Martin Chan 2020-10-26 21:21:24 +00:00
Commit 020c455952
306 changed files with 17550 additions and 0 deletions

1
.github/.gitignore Vendored Normal file
View File

@@ -0,0 +1 @@
*.html

106
.github/developer_guide.md Vendored Normal file
View File

@@ -0,0 +1,106 @@
# :hammer: Developer Guide
This is a guide for "developers", or anyone who is looking to contribute code to the {wpa} R package.
## Pre-requisites
You should have the following installed before starting:
1. [R](https://www.r-project.org/)
2. [RStudio Desktop](https://rstudio.com/products/rstudio/download/#download)
3. [GitHub Desktop](https://desktop.github.com/) (Highly recommended, unless you are very familiar with git)
You should also have the following packages installed. These should all be available from CRAN:
1. {devtools}
2. {tidyverse}
3. {roxygen2}
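A quick way to install all three from CRAN (assuming default repositories) is:
```R
install.packages(c("devtools", "tidyverse", "roxygen2"))
```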
Once these are all installed, you should be ready to roll!
---
## :hammer: Quick start guide - For developers
### Updating the package and installing it locally
Once you've made changes to a script, here is how to update the package and install it locally on your computer.
#### 1. Generating documentation
```R
roxygen2::roxygenise()
```
`roxygenise()` generates documentation files for the R package, and makes them available to the user.
#### 2. R CMD Checks
Next, run this to build and check the package. You will need {devtools} installed.
```R
devtools::check()
```
This step runs an R CMD check against the package, to verify that there will be no errors or issues if somebody installs it. The check can take around five minutes, so it is usually worth running it only after you have implemented all the desired changes, towards the end of your workflow.
#### 3. Install the package
If everything runs smoothly, run this final line to install the local package. Make sure that your working directory is set to the package directory!
```R
devtools::install()
```
---
## To increment or change a package version
Each package release has a corresponding package version (e.g. 1.4.2). To change the package version number for a particular release, simply edit the `DESCRIPTION` file at the project root folder of the {wpa} package. You can edit this with any notepad software or RStudio.
Change the values that come after `Version: ` to change the package version. Prior to incrementing the package version, please align with the main developer team on the agreed version release details.
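For reference, the version is stored as a single field in `DESCRIPTION`; the relevant line looks like the following (the number shown is only illustrative):
```
Version: 1.3.0
```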
## How to release a package version
Our current protocol for package release is to distribute it as a zipped file or a tarball. The advantage of this approach is that users do not need to have a GitHub account, and the package release can be distributed via SharePoint or even email. Our current approach is to offer **both** options for installation.
Once you are happy with the package and all checks have passed, you can run the following to generate a **zipped** distribution of the package. If your platform is Windows, this should generate a zipped file one level above your project directory:
```R
devtools::build(binary = TRUE)
```
The code to generate a **tarball** distribution is similar:
```R
devtools::build(binary = FALSE)
```
After the **zipped file** and the **tarball** files have been generated, they are ready for distribution.
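Users can then install the package directly from either file using `install.packages()` with `repos = NULL`. A minimal sketch, assuming the generated files are named as below for version 1.3.0:
```R
# Install from the tarball (source) distribution
install.packages("wpa_1.3.0.tar.gz", repos = NULL, type = "source")

# Install from the zipped (Windows binary) distribution
install.packages("wpa_1.3.0.zip", repos = NULL, type = "win.binary")
```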
## How to build the PDF documentation
It is good practice to distribute a package with an updated version of the documentation, so that users are not left referring to outdated documentation. You can generate the familiar PDF documentation by running the following:
```R
devtools::build_manual()
```
## A short guide to adding a function to the package
1. Let us assume that you have written a new function, e.g. a checking function that returns TRUE if PersonId exists in the data, and FALSE if it does not.
1. Once you check the function and ensure that it works, save this into a R file under the "R" folder of the package directory.
- Ensure that your current working directory is set to the package directory. You can either open the RStudio session with the package .Rproj file, or run `setwd()` to change your working directory. Run `getwd()` to check what your current working directory is.
- Ensure that the relevant roxygen headers are present (see [here](https://roxygen2.r-lib.org/articles/rd.html) for more reference); a sketch of a complete function file is shown after this list.
- Whilst it is possible to have more than one function saved in a single R file, we recommend saving them in their own files with their own names for clarity. For instance, the function `collaboration_dist()` would be saved in a file called _collaboration_dist.R_.
1. Follow the steps in the Developer Guide above to generate the documentation (`roxygen2::roxygenise()`), check the package (`devtools::check()`), and install the package (`devtools::install()`). 
1. Once you are sure that everything works, open **GitHub Desktop**, and **check that you have selected the relevant branch**. If you are unsure about the impact of your changes, it is always recommended that you work on a new branch so that you do not introduce any breaking changes to the master branch.
1. _Fetch / pull_ from the branch. This ensures that your local package is updated with changes that others have made to the code.
1. _Commit_ your changes, and add an intuitive commit message so others will know what you have changed in the code. Please see the wiki page of this repo for a style guide on the commit messages.
1. _Push_ your changes to the branch. If there are no errors, you should be able to see the changes on the GitHub repository!
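As an illustration of steps 1 to 3 above, here is a minimal sketch of what such a new function file might look like with its roxygen headers. The function name, the file name (_check_personid.R_) and the implementation are hypothetical:
```R
#' @title Check whether a query contains a PersonId column
#'
#' @description
#' Returns TRUE if the `PersonId` variable exists in the data,
#' and FALSE if it does not.
#'
#' @param data A person-level query in the form of a data frame.
#'
#' @examples
#' check_personid(sq_data)
#'
#' @export
check_personid <- function(data){
  "PersonId" %in% names(data)
}
```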
## Using GitHub
*It's on its way...*
## Documenting functions with {roxygen2}
*It's on its way...*
## :link: Useful links
- [Sharing Internal R Packages](https://support.rstudio.com/hc/en-us/articles/115000239587-Sharing-Internal-R-Packages)
- [Writing an R package from scratch by Hilary Parker](https://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/)
- [Semantic versioning](https://semver.org/)

19
.github/pull_request_template.md Vendored Normal file
View File

@@ -0,0 +1,19 @@
# Summary
This branch *<succinct summary of the purpose>*
# Changes
The changes made in this PR are:
1. Change 1
1. Change 2
...
# Check
- [ ] Check 1
- [ ] Check 2
...
# (OPTIONAL) Note
This "fixes #<issue_number>"
*<other things, such as how to incorporate new changes>*
*<brief summary of the purpose of this pull request>*

18
.github/workflows/R-CMD-check.yaml Vendored Normal file
View File

@@ -0,0 +1,18 @@
on: [push, pull_request]
name: R-CMD-check
jobs:
R-CMD-check:
runs-on: macOS-latest
steps:
- uses: actions/checkout@v2
- uses: r-lib/actions/setup-r@master
- name: Install dependencies
run: |
install.packages(c("remotes", "rcmdcheck"))
remotes::install_deps(dependencies = TRUE)
shell: Rscript {0}
- name: Check
run: rcmdcheck::rcmdcheck(args = "--no-manual", error_on = "error")
shell: Rscript {0}

12
.gitignore Vendored Normal file
View File

@@ -0,0 +1,12 @@
.Rproj.user
.Rhistory
.Rbuildignore
_data/
_development/
.RData
doc
Meta
.RDataTmp
SQ-overview.html
wpa export 20200427_131327.png
*.html

14
CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,14 @@
# Contributing
This project welcomes contributions and suggestions. Most contributions require you to
agree to a Contributor License Agreement (CLA) declaring that you have the right to,
and actually do, grant us the rights to use your contribution. For details, visit
https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need
to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the
instructions provided by the bot. You will only need to do this once across all repositories using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.

65
DESCRIPTION Normal file
View File

@@ -0,0 +1,65 @@
Package: wpa
Type: Package
Title: Tools for Analysing and Visualising Workplace Analytics data
Version: 1.3.0
Authors@R: c(
person(given = "Martin", family = "Chan", role = c("aut", "cre"), email = "martin.chan@microsoft.com"),
person(given = "Carlos", family = "Morales", role = "ctb", email = "carlos.morales@microsoft.com"),
person(given = "Mark", family = "Powers", role = "ctb", email = "mark.powers@microsoft.com"),
person(given = "Ainize", family = "Cidoncha", role = "ctb", email = "ainize.cidoncha@microsoft.com"),
person(given = "Rosamary", family = "Ochoa Vargas", role = "ctb", email = "rosamary.ochoa@microsoft.com"),
person(given = "Tannaz", family = "Sattari", role = "ctb", email = "tannaz.sattari@microsoft.com"))
Description: Opinionated functions that enable easier and faster
analysis of Workplace Analytics data. There are three main types of functions in wpa:
(i) Standard functions create a ggplot visual or a summary table based on a specific
Workplace Analytics metric; (ii) Report Generation functions generate HTML reports on
a specific analysis area, e.g. Collaboration; (iii) Other miscellaneous functions cover
more specific applications (e.g. Subject Line text mining) of Workplace Analytics data.
This package adheres to tidyverse principles and works well with the pipe syntax.
wpa is built with beginner-to-intermediate R users in mind, and is optimised for
simplicity.
URL: https://github.com/martinctc/wpa
BugReports: https://github.com/martinctc/wpa/issues
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Depends:
R (>= 3.1.2)
Imports:
base,
dplyr,
stats,
utils,
tibble,
tidyr,
tidyselect (>= 1.0.0),
magrittr,
purrr,
reshape2,
ggplot2,
ggrepel,
Information,
scales,
htmltools,
knitr,
markdown,
rmarkdown,
networkD3,
DT,
tidytext,
ggraph,
igraph,
widyr,
proxy,
rlang,
GGally,
network,
ggwordcloud,
methods,
data.table
RoxygenNote: 7.1.1
Roxygen: list(markdown = TRUE)
VignetteBuilder: knitr
Suggests:
extrafont
Language: en-US

2
LICENSE Normal file
View File

@@ -0,0 +1,2 @@
YEAR: 2020
COPYRIGHT HOLDER: Microsoft Corporation

23
LICENSE.txt Normal file
View File

@@ -0,0 +1,23 @@
wpa
Copyright (c) Microsoft Corporation.
MIT License
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

170
NAMESPACE Normal file
View File

@@ -0,0 +1,170 @@
# Generated by roxygen2: do not edit by hand
export("%>%")
export(IV_by_period)
export(IV_report)
export(afterhours_dist)
export(afterhours_distribution)
export(afterhours_rank)
export(afterhours_sum)
export(afterhours_summary)
export(afterhours_trend)
export(analysis_scope)
export(camel_clean)
export(capacity_report)
export(check_inputs)
export(check_query)
export(coaching_report)
export(collab_sum)
export(collab_summary)
export(collaboration_area)
export(collaboration_dist)
export(collaboration_distribution)
export(collaboration_fizz)
export(collaboration_line)
export(collaboration_rank)
export(collaboration_report)
export(collaboration_sum)
export(collaboration_summary)
export(collaboration_trend)
export(comma)
export(connectivity_report)
export(copy_df)
export(create_IV)
export(create_bar)
export(create_bar_asis)
export(create_boxplot)
export(create_dist)
export(create_dt)
export(create_fizz)
export(create_line)
export(create_period_scatter)
export(create_rank)
export(create_scatter)
export(create_stacked)
export(cut_hour)
export(email_dist)
export(email_fizz)
export(email_line)
export(email_rank)
export(email_summary)
export(email_trend)
export(export)
export(external_network_plot)
export(extract_date_range)
export(extract_hr)
export(flag_ch_ratio)
export(flag_em_ratio)
export(flag_extreme)
export(flag_outlooktime)
export(g2g_network)
export(generate_report)
export(hr_trend)
export(hrvar_count)
export(hrvar_count_all)
export(identify_churn)
export(identify_holidayweeks)
export(identify_inactiveweeks)
export(identify_nkw)
export(identify_outlier)
export(identify_privacythreshold)
export(identify_query)
export(identify_shifts)
export(identify_tenure)
export(import_wpa)
export(internal_network_plot)
export(is_date_format)
export(keymetrics_scan)
export(meeting_dist)
export(meeting_distribution)
export(meeting_fizz)
export(meeting_line)
export(meeting_quality)
export(meeting_rank)
export(meeting_skim)
export(meeting_summary)
export(meeting_tm_report)
export(meeting_trend)
export(meetingtype_dist)
export(meetingtype_dist_ca)
export(meetingtype_dist_mt)
export(meetingtype_summary)
export(mgrcoatt_dist)
export(mgrrel_matrix)
export(one2one_dist)
export(one2one_line)
export(one2one_rank)
export(one2one_sum)
export(one2one_trend)
export(period_change)
export(personas_hclust)
export(read_preamble)
export(remove_outliers)
export(rgb2hex)
export(standardise_pq)
export(subject_validate)
export(subject_validate_report)
export(theme_wpa)
export(theme_wpa_basic)
export(tm_clean)
export(tm_cooc)
export(tm_freq)
export(tm_wordcloud)
export(totals_bind)
export(totals_reorder)
export(track_HR_change)
export(tstamp)
export(validation_report)
export(workloads_dist)
export(workloads_distribution)
export(workloads_line)
export(workloads_summary)
export(workloads_trend)
export(workpatterns_area)
export(workpatterns_hclust)
export(workpatterns_rank)
export(wrap)
import(DT)
import(Information)
import(dplyr)
import(ggplot2)
import(ggraph)
import(ggrepel)
import(reshape2)
import(scales)
import(tidyselect)
importFrom(data.table,"%between%")
importFrom(data.table,"%like%")
importFrom(data.table,":=")
importFrom(dplyr,`%>%`)
importFrom(dplyr,mutate_if)
importFrom(htmltools,HTML)
importFrom(igraph,graph_from_data_frame)
importFrom(magrittr,"%>%")
importFrom(markdown,markdownToHTML)
importFrom(methods,is)
importFrom(proxy,dist)
importFrom(purrr,map_if)
importFrom(purrr,pmap)
importFrom(purrr,reduce)
importFrom(rlang,"!!")
importFrom(rlang,sym)
importFrom(scales,percent)
importFrom(scales,wrap_format)
importFrom(stats,cutree)
importFrom(stats,hclust)
importFrom(stats,median)
importFrom(stats,na.omit)
importFrom(stats,rect.hclust)
importFrom(stats,reorder)
importFrom(stats,sd)
importFrom(stats,setNames)
importFrom(stats,wilcox.test)
importFrom(tidyr,gather)
importFrom(tidyr,replace_na)
importFrom(tidyr,spread)
importFrom(tidyselect,all_of)
importFrom(tidytext,unnest_tokens)
importFrom(utils,write.csv)
importFrom(utils,write.table)
importFrom(widyr,pairwise_count)

154
R/IV_by_Period.R Normal file
View File

@@ -0,0 +1,154 @@
#' @title Identify the WPA metrics that have the biggest change between two periods.
#'
#' @description
#' This function uses the Information Value algorithm to predict which WPA metrics are most explained by the change in dates.
#'
#' @param data Person Query as a data frame, including a date column named `Date`.
#' This function assumes the date format is MM/DD/YYYY, as is standard in a WpA query output.
#' @param before_start Start date of "before" time period in YYYY-MM-DD. Defaults to earliest date in dataset.
#' @param before_end End date of "before" time period in YYYY-MM-DD
#' @param after_start Start date of "after" time period in YYYY-MM-DD. Defaults to day after before_end.
#' @param after_end End date of "after" time period in YYYY-MM-DD. Defaults to latest date in dataset.
#' @param mybins Number of bins to cut the data into for Information Value analysis. Defaults to 10.
#' @param return Specify whether to return a summary table or detailed Excel and PDF files.
#' Valid inputs are "table" and "detailed". Defaults to "table".
#'
#' @import dplyr
#' @import Information
#'
#'
#' @family Flexible Input
#'
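#' @examples
#' ## Sketch of typical usage with the `sq_data` sample dataset; the date
#' ## below is a placeholder and assumes it falls within the dataset's range.
#' \dontrun{
#' IV_by_period(sq_data, before_end = "2019-12-31")
#' }
#'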
#' @export
IV_by_period <-
function(data,
before_start = min(as.Date(data$Date, "%m/%d/%Y")),
before_end,
after_start = as.Date(before_end) + 1,
after_end = max(as.Date(data$Date, "%m/%d/%Y")),
mybins = 10,
return = "table") {
## Check inputs
required_variables <- c("Date",
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
daterange_1_start <- as.Date(before_start)
daterange_1_end <- as.Date(before_end)
daterange_2_start <- as.Date(after_start)
daterange_2_end <- as.Date(after_end)
WpA_dataset <- data %>% mutate(Date = as.Date(Date, "%m/%d/%Y"))
# Check for dates in data file
if (daterange_1_start < min(WpA_dataset$Date) |
daterange_1_start > max(WpA_dataset$Date) |
daterange_1_end < min(WpA_dataset$Date) |
daterange_1_end > max(WpA_dataset$Date) |
daterange_2_start < min(WpA_dataset$Date) |
daterange_2_start > max(WpA_dataset$Date) |
daterange_2_end < min(WpA_dataset$Date) |
daterange_2_end > max(WpA_dataset$Date)) {
stop('Dates not found in dataset')
geterrmessage()
}
# Create variable => Period
WpA_dataset_table <-
WpA_dataset %>%
mutate(
Period = case_when(
Date >= daterange_1_start &
Date <= daterange_1_end ~ "Before",
Date >= daterange_2_start &
Date <= daterange_2_end ~ "After"
)
) %>% filter(Period == "Before" | Period == "After")
WpA_dataset_table <-
WpA_dataset_table %>% mutate(outcome = case_when(Period == "Before" ~ "0",
Period == 'After' ~ "1"))
# Make the use of exponential notation less likely
options(scipen = 10)
# De-select character columns
train <-
WpA_dataset_table %>%
transform(outcome = as.numeric(outcome)) %>%
select_if(is.numeric)
# Filter out NAs
train <- train %>%
filter(rowSums(is.na(.[, ])) < 1)
# Rename Outcome Variable
# train <- transform(train, outcome = as.numeric(outcome))
train <- rename(train, 'Outcome' = "outcome")
colnames(train)
# Calculate Odds
odds <-
sum(train$Outcome) / (length(train$Outcome) - sum(train$Outcome))
lnodds <- log(odds)
# IV Analysis
IV <- create_infotables(data = train, y = "Outcome", bins = mybins)
# if(return == "detailed"){
# # Ranking variables using IV
# wb <- createWorkbook()
# addWorksheet(wb, "Ranking")
# writeDataTable(wb, "Ranking", x = data.frame(IV$Summary))
#
# # Export Individual Tables
# for(i in names(IV$Tables)){
# print(i)
# addWorksheet(wb, substr(i, start = nchar(i) - 30, stop = nchar(i)))
# temp <- IV$Tables[[i]]
# temp$ODDS <- exp(temp$WOE + lnodds)
# temp$PROB <- (temp$ODDS / (temp$ODDS + 1))
# writeDataTable(wb, substr(i, start = nchar(i) - 30, stop = nchar(i)) , x = data.frame(temp))
# }
#
# # Save Workbook
# saveWorkbook(wb, "Output_IV_v2.xlsx", overwrite = TRUE)
#
# # Plot Graph
# pdf("Output_IV_v2.pdf")
# plot_infotables(IV, IV$Summary$Variable[], same_scale=TRUE)
# dev.off()
# } else
if (return == "table") {
# Store all individual dataframes
Tables <- c()
Summary <- data.frame(IV$Summary)
Tables$Summary <- Summary
for (i in names(IV$Tables)) {
temp <- IV$Tables[[i]]
temp$ODDS <- exp(temp$WOE + lnodds)
temp$PROB <- (temp$ODDS / (temp$ODDS + 1))
Tables[[i]] <- create_dt(temp, rounding = 2)
}
# Return ranking table
return(create_dt(Tables$Summary, rounding = 2))
# print("Access individual metrics via Outputs[[metric_name]], e.g., Outputs[[Workweek_span]]")
# # Store each variable's plot
# plots <- c()
# for (i in names(IV$Tables)) {
# plots[[i]] <- plot_infotables(IV, i)
# }
} else {
stop("Please enter a valid input for `return`, either detailed or table.")
}
}

72
R/IV_report.R Normal file
View File

@@ -0,0 +1,72 @@
#' @title Information Value HTML Report
#'
#' @description
#' The function generates an interactive HTML report using
#' Standard Query data as an input. The report contains a full Information Value analysis, a data exploration technique that helps determine which columns in a data set have predictive power or influence on the value of a specified dependent variable.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param predictors A character vector specifying the columns to be used as predictors.
#' Defaults to NULL, where all numeric vectors in the data will be used as predictors.
#' @param outcome A string specifying a binary variable, i.e. can only contain
#' the values 1 or 0.
#' @param bins Number of bins to use in `Information::create_infotables()`, defaults to 5.
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
#' For example, "IV report".
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
#' Defaults to TRUE.
#'
#' @family Reports
#'
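#' @examples
#' ## Sketch of typical usage: `X` is a hypothetical binary (1/0) outcome
#' ## derived from `Collaboration_hours` purely for illustration.
#' \dontrun{
#' sq_data %>%
#'   dplyr::mutate(X = ifelse(Collaboration_hours > 12, 1, 0)) %>%
#'   IV_report(outcome = "X")
#' }
#'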
#' @export
IV_report <- function(data,
predictors = NULL,
outcome,
bins = 5,
path = "IV report",
timestamp = TRUE){
## Create timestamped path (if applicable)
if(timestamp == TRUE){
newpath <- paste(path, wpa::tstamp())
} else {
newpath <- path
}
table_list <-
data %>%
create_IV(outcome = outcome,
predictors = predictors,
bins = bins,
return = "list")
table_names <- gsub("_", " ", x = names(table_list))
output_list <-
list(data %>% check_query(return = "text") %>% md2html(),
data %>% create_IV(outcome = outcome, predictors = predictors, bins = bins),
data %>% create_IV(outcome = outcome, predictors = predictors, bins = bins, return = "summary"),
data %>% create_IV(outcome = outcome, predictors = predictors, bins = bins, return = "plot-WOE")) %>%
c(table_list) %>%
purrr::map_if(is.data.frame, create_dt)
title_list <- c("Data Overview", "Top Predictors", "", "WOE Analysis", table_names)
n_title <- length(title_list)
title_levels <- c(2,2,4,2, rep(3, n_title-4))
generate_report(title = "Information Value Report",
filename = newpath,
outputs = output_list,
titles = title_list,
subheaders = rep("", n_title),
echos = rep(FALSE, n_title),
levels = title_levels,
theme = "cosmo",
preamble = read_preamble("IV_report.md"))
}

64
R/afterhours_dist.R Normal file
View File

@@ -0,0 +1,64 @@
#' @title After-Hours distribution
#' @name afterhours_dist
#'
#' @description Analyse the distribution of weekly after-hours collaboration time.
#' Returns a stacked bar plot by default.
#' Additional options available to return a table with distribution elements.
#'
#' @details
#' Uses the metric \code{After_hours_collaboration_hours}.
#'
#' @param data A Standard Person Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size, defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#' @param cut A vector specifying the cuts to use for the data,
#' accepting "default" or "range-cut" as a character vector,
#' or a numeric vector of length three to specify the exact breaks to use, e.g. c(1, 3, 5).
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom tidyr spread
#' @importFrom stats median
#' @importFrom stats sd
#'
#' @family After-Hours
#'
#' @examples
#' \dontrun{
#' ## Return a plot
#' afterhours_dist(sq_data, hrvar = "Organization")
#'
#' ## Return a table
#' afterhours_dist(sq_data, hrvar = "Organization", return = "table")
#'
#' ## Return result with custom specified breaks
#' afterhours_dist(sq_data, hrvar = "LevelDesignation", cut = c(4, 7, 9))
#' }
#' @export
afterhours_dist <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot",
cut = c(1, 2, 3)) {
create_dist(data = data,
metric = "After_hours_collaboration_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return,
cut = cut,
dist_colours = c("#FE7F4F",
"#ffdfd3",
"#bed6f2",
"#e9f1fb"))
}
#' @rdname afterhours_dist
#' @export
afterhours_distribution <- afterhours_dist

52
R/afterhours_rank.R Normal file
View File

@@ -0,0 +1,52 @@
#' @title Rank groups with high After-Hours Collaboration Hours
#'
#' @description
#' This function scans a Standard Person Query for groups with high levels of After-Hours Collaboration.
#' Returns a table with all groups (across multiple HR attributes) ranked by After-Hours Collaboration Hours.
#'
#' @details
#' Uses the metric \code{After_hours_collaboration_hours}.
#'
#' @param data A Standard Person Query dataset in the form of a data frame.
#' @param hrvar A list of HR Variables to consider in the scan.
#' Defaults to all HR attributes identified.
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
#' Defaults to 5.
#' @param return A character vector specifying what to return.
#' Valid values include "table" (default, returning a data frame)
#' and "html" (returning an interactive DataTable).
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats reorder
#'
#' @family After-Hours
#'
#' @return
#' Returns a data frame by default, where 'table' is passed in `return`.
#' When 'html' is passed, an interactive DataTable is returned.
#'
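#' @examples
#' ## Example usage with the `sq_data` sample dataset:
#' ## return the ranked output as a data frame
#' afterhours_rank(sq_data, return = "table")
#'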
#' @export
afterhours_rank <- function(data,
hrvar = extract_hr(data),
mingroup = 5,
return = "table"){
output <-
data %>%
create_rank(metric = "After_hours_collaboration_hours",
hrvar = hrvar,
mingroup = mingroup,
return = "table")
if(return == "html"){
return(create_dt(output))
} else if(return == "table"){
return(output)
} else {
stop("Invalid `return` argument.")
}
}

47
R/afterhours_summary.R Normal file
View File

@@ -0,0 +1,47 @@
#' @title Summary of After-Hours Collaboration Hours
#'
#' @description
#' Provides an overview analysis of after-hours collaboration time.
#' Returns a bar plot showing average weekly after-hours collaboration hours by default.
#' Additional options available to return a summary table.
#'
#' @details
#' Uses the metric \code{After_hours_collaboration_hours}.
#'
#' @inheritParams create_bar
#'
#' @family After-Hours
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @examples
#' # Return a ggplot bar chart
#' afterhours_summary(sq_data, hrvar = "LevelDesignation")
#'
#' # Return a summary table
#' afterhours_summary(sq_data, hrvar = "LevelDesignation", return = "table")
#' @export
afterhours_summary <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
create_bar(data = data,
metric = "After_hours_collaboration_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return,
bar_colour = "alert")
}
#' @rdname afterhours_summary
#' @export
afterhours_sum <- afterhours_summary

106
R/afterhours_trend.R Normal file
View File

@@ -0,0 +1,106 @@
#' @title After-Hours Time Trend
#'
#' @description
#' Provides a week by week view of after-hours collaboration time.
#' By default returns a week by week heatmap, highlighting the points in time with most activity.
#' Additional options available to return a summary table.
#'
#' @details
#' Uses the metric \code{After_hours_collaboration_hours}.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#'
#' @family After-Hours
#'
#' @examples
#' ## Run plot
#' afterhours_trend(sq_data)
#'
#' ## Run table
#' afterhours_trend(sq_data, hrvar = "LevelDesignation", return = "table")
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export
afterhours_trend <- function(data, hrvar = "Organization", mingroup=5, return = "plot"){
## Check inputs
required_variables <- c("Date",
"After_hours_collaboration_hours",
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
myTable <-
data %>%
mutate(Date=as.Date(Date, "%m/%d/%Y")) %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
select(PersonId, Date, group, After_hours_collaboration_hours) %>%
group_by(group) %>%
mutate(Employee_Count = n_distinct(PersonId)) %>%
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
myTable <-
myTable %>%
group_by(Date, group) %>%
summarize(Employee_Count=mean(Employee_Count),
After_hours_collaboration_hours=mean(After_hours_collaboration_hours))
myTable_plot <-
myTable %>%
select(Date, group, After_hours_collaboration_hours)
myTable_return <-
myTable_plot %>%
spread(Date, After_hours_collaboration_hours)
plot_object <-
myTable_plot %>%
ggplot(aes(x =Date , y = group , fill = After_hours_collaboration_hours)) +
geom_tile(height=.5) +
scale_fill_gradient(name="Hours", low = "white", high = "red") +
theme_classic() +
theme(axis.text=element_text(size=12),
plot.title = element_text(color="grey40", face="bold", size=18),
plot.subtitle = element_text(size=14), legend.position = "right",
legend.justification = "right",
legend.title=element_text(size=14),
legend.text=element_text(size=14)) +
labs(title = "After-Hours Collaboration",
subtitle = paste("Meeting and Email Time Outside Working Hours by", hrvar)) +
xlab("Date") +
ylab(hrvar) +
labs(caption = extract_date_range(data, return = "text"))
if(return == "table"){
myTable_return %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

101
R/capacity_report.R Normal file
View File

@@ -0,0 +1,101 @@
#' @title Generate a Capacity report in HTML
#'
#' @description
#' The function generates an interactive HTML report using the
#' Standard Person Query data as an input. The report contains a series
#' of summary analysis and visualisations relating to key **capacity**
#' metrics in Workplace Analytics, including length of week and
#' time in after-hours collaboration.
#'
#' @param data A Standard Person Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
#' For example, "capacity report".
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
#' Defaults to TRUE.
#'
#' @importFrom purrr map_if
#' @importFrom dplyr `%>%`
#'
#' @family Reports
#'
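#' @examples
#' ## Example usage with the `sq_data` sample dataset;
#' ## this writes an HTML file to the working directory.
#' \dontrun{
#' capacity_report(sq_data, hrvar = "Organization")
#' }
#'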
#' @export
capacity_report <- function(data,
hrvar = "Organization",
mingroup = 5,
path = "capacity report",
timestamp = TRUE){
## Create timestamped path (if applicable)
if(timestamp == TRUE){
newpath <- paste(path, wpa::tstamp())
} else {
newpath <- path
}
# Set outputs
output_list <-
list(data %>% check_query(return = "text") %>% md2html(),
md2html(text = read_preamble("workloads_section.md")), # Header
data %>% workloads_summary(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% workloads_summary(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% workloads_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% workloads_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% workloads_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% workloads_trend(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% workloads_line(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% workloads_line(hrvar = hrvar, mingroup = mingroup, return = "table"),
md2html(text = read_preamble("afterhours_section.md")), # Header
data %>% afterhours_summary(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% afterhours_summary(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% afterhours_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% afterhours_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% afterhours_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% afterhours_trend(hrvar = hrvar, mingroup = mingroup, return = "table")) %>%
purrr::map_if(is.data.frame, create_dt)
# Set header titles
title_list <-
c("Data Overview",
"Workloads",
"Workloads Summary - Plot",
"Workloads Summary - Table",
"Workloads Distribution - Plot",
"Workloads Distribution - Table",
"Workloads Trend - Plot",
"Workloads Trend - Table",
"Workloads over time - Plot",
"Workloads over time - Table",
"After hours",
"After hours Summary - Plot",
"After hours Summary - Table",
"After hours Distribution - Plot",
"After hours Distribution - Table",
"After hours Trend - Plot",
"After hours Trend - Table")
# Set header levels
n_title <- length(title_list)
levels_list <- rep(3, n_title)
levels_list[c(1, 2, 11)] <- 2 # Section header
generate_report(title = "Capacity Report",
filename = newpath,
outputs = output_list,
titles = title_list,
subheaders = rep("", n_title),
echos = rep(FALSE, n_title),
levels = levels_list,
theme = "cosmo",
preamble = read_preamble("capacity_report.md"))
}

206
R/check_query.R Normal file
View File

@@ -0,0 +1,206 @@
#' @title Check a query to ensure that it is suitable for analysis
#'
#' @description
#' Prints diagnostic data about the data query to the R console, with information
#' such as date range, number of employees, HR attributes identified, etc.
#'
#' @details
#' This can be used with any person-level query, such as the standard person query,
#' collaboration assessment query, and the hourly collaboration query. When run,
#' this prints diagnostic data to the R console.
#'
#' @param data A person-level query in the form of a data frame, including the standard person query,
#' collaboration assessment query, and the hourly collaboration query.
#' @param return Character string to specify whether to return a console message ("message"),
#' a character string ("text"), or a data frame. Defaults to "message".
#' @param validation Logical value to specify whether to return a check used by the `validation_report()`.
#' Defaults to FALSE.
#'
#' @examples
#' check_query(sq_data)
#'
#' @family Data Validation
#'
#' @export
check_query <- function(data, return = "message", validation = FALSE){
if(!is.data.frame(data)){
stop("Input is not a data frame.")
}
if("PersonId" %in% names(data)){
if(validation == FALSE){
check_person_query(data = data, return = return)
} else if(validation == TRUE){
# Different displays required for validation_report()
check_query_validation(data = data, return = return)
}
} else {
message("Note: checks are currently unavailable for a non-Person query")
}
}
#' @title Check a Person Query to ensure that it is suitable for analysis
#'
#' @description
#' Prints diagnostic data about the data query to the R console, with information
#' such as date range, number of employees, HR attributes identified, etc.
#'
#' @inheritParams check_query
#'
#' @details Used as part of `check_query()`.
#'
#'
check_person_query <- function(data, return){
## Query Type - Initialise
## Uses `identify_query()`
main_chunk <- paste0("The data used is a ", identify_query(data), ".")
## PersonId
if(!("PersonId" %in% names(data))){
stop("There is no `PersonId` variable in the input.")
} else {
new_chunk <- paste("There are", dplyr::n_distinct(data$PersonId), "employees in this dataset.")
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
}
## Date
if(!("Date" %in% names(data))){
stop("There is no `Date` variable in the input.")
} else {
data$Date <- as.Date(data$Date, "%m/%d/%Y")
new_chunk <- paste0("Date ranges from ", min(data$Date), " to ", max(data$Date), ".")
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
}
## HR Variables
hr_chr <- extract_hr(data, max_unique = 200) %>% wrap(wrapper = "`")
new_chunk <- paste("There are", length(hr_chr), "(estimated) HR attributes in the data:" )
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
new_chunk <- paste(hr_chr, collapse = ", ")
main_chunk <- paste(main_chunk, new_chunk, sep = "\n")
## `IsActive` flag
if(!("IsActive" %in% names(data))){
new_chunk <- "The `IsActive` flag is not present in the data."
main_chunk <- paste(main_chunk, new_chunk, sep = "\n")
} else {
data$IsActive <- as.logical(data$IsActive) # Force to logical
active_n <- dplyr::n_distinct(data[data$IsActive == TRUE, "PersonId"])
new_chunk <- paste0("There are ", active_n, " active employees out of all in the dataset.")
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
}
## Variable check header
new_chunk <- "Variable name check:"
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
## Collaboration_hours
if(!("Collaboration_hours" %in% names(data)) &
("Collaboration_hrs" %in% names(data))){
new_chunk <- "`Collaboration_hrs` is used instead of `Collaboration_hours` in the data."
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
} else if(!("Collaboration_hrs" %in% names(data)) &
("Collaboration_hours" %in% names(data))){
new_chunk <- "`Collaboration_hours` is used instead of `Collaboration_hrs` in the data."
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
} else {
new_chunk <- "No collaboration hour metric exists in the data."
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
}
## Instant_Message_hours
if(!("Instant_message_hours" %in% names(data)) &
("Instant_Message_hours" %in% names(data))){
new_chunk <- "`Instant_Message_hours` is used instead of `Instant_message_hours` in the data."
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
} else if(!("Instant_Message_hours" %in% names(data)) &
("Instant_message_hours" %in% names(data))){
new_chunk <- "`Instant_message_hours` is used instead of `Instant_Message_hours` in the data."
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
} else {
new_chunk <- "No instant message hour metric exists in the data."
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
}
## Return
if(return == "message"){
main_chunk <- paste("", main_chunk, sep = "\n")
message(main_chunk)
} else if(return == "text"){
main_chunk
} else {
stop("Please check inputs for `return`")
}
}
#' @title Perform a query check for the validation report
#'
#' @description
#' Prints diagnostic data about the data query to the R console, with information
#' such as date range, number of employees, HR attributes identified, etc.
#' Optimised for the `validation_report()`.
#'
#' @inheritParams check_query
#'
#' @details Used as part of `check_query()`.
#'
check_query_validation <- function(data, return){
## Query Type - Initialise
main_chunk <- ""
## PersonId
if(!("PersonId" %in% names(data))){
stop("There is no `PersonId` variable in the input.")
} else {
new_chunk <- paste("There are", dplyr::n_distinct(data$PersonId), "employees in this dataset.")
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
}
## Date
if(!("Date" %in% names(data))){
stop("There is no `Date` variable in the input.")
} else {
data$Date <- as.Date(data$Date, "%m/%d/%Y")
new_chunk <- paste0("Date ranges from ", min(data$Date), " to ", max(data$Date), ".")
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
}
## HR Variables
hr_chr <- extract_hr(data, max_unique = 200) %>% wrap(wrapper = "`")
new_chunk <- paste("There are", length(hr_chr), "(estimated) HR attributes in the data:" )
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
new_chunk <- paste(hr_chr, collapse = ", ")
main_chunk <- paste(main_chunk, new_chunk, sep = "\n")
## `IsActive` flag
if(!("IsActive" %in% names(data))){
new_chunk <- "The `IsActive` flag is not present in the data."
main_chunk <- paste(main_chunk, new_chunk, sep = "\n")
} else {
data$IsActive <- as.logical(data$IsActive) # Force to logical
active_n <- dplyr::n_distinct(data[data$IsActive == TRUE, "PersonId"])
new_chunk <- paste0("There are ", active_n, " active employees out of all in the dataset.")
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
}
## Return
if(return == "message"){
main_chunk <- paste("", main_chunk, sep = "\n")
message(main_chunk)
} else if(return == "text"){
main_chunk
} else {
stop("Please check inputs for `return`")
}
}

81
R/coaching_report.R Normal file
View File

@@ -0,0 +1,81 @@
#' @title Generate a Coaching report in HTML
#'
#' @description
#' The function generates an interactive HTML report using
#' Standard Query data as an input. The report contains a series
#' of summary analysis and visualisations relating to key **coaching**
#' metrics in Workplace Analytics, specifically relating to the time
#' spent between managers and their direct reports.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "LevelDesignation"
#' but accepts any character vector, e.g. "Organization"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
#' For example, "coaching report".
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
#' Defaults to TRUE.
#'
#' @importFrom purrr map_if
#' @importFrom dplyr `%>%`
#'
#' @family Reports
#'
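#' @examples
#' ## Example usage with the `sq_data` sample dataset;
#' ## this writes an HTML file to the working directory.
#' \dontrun{
#' coaching_report(sq_data, hrvar = "LevelDesignation")
#' }
#'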
#' @export
coaching_report <- function(data,
hrvar = "LevelDesignation",
mingroup = 5,
path = "coaching report",
timestamp = TRUE){
## Create timestamped path (if applicable)
if(timestamp == TRUE){
newpath <- paste(path, wpa::tstamp())
} else {
newpath <- path
}
output_list <-
list(data %>% check_query(return = "text") %>% md2html(),
data %>% mgrrel_matrix(hrvar = hrvar, return = "plot"), # no mingroup arg
data %>% mgrrel_matrix(hrvar = hrvar, return = "table"), # no mingroup arg
data %>% one2one_sum(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% one2one_sum(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% one2one_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% one2one_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% one2one_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% one2one_trend(hrvar = hrvar, mingroup = mingroup, return = "table")) %>%
purrr::map_if(is.data.frame, create_dt)
title_list <-
c("Data Overview",
"Manager Relation Style - Plot",
"Manager Relation Style - Table",
"1-to-1 Summary - Plot",
"1-to-1 Summary - Table",
"1-to-1 Distribution - Plot",
"1-to-1 Distribution - Table",
"1-to-1 Trend - Plot",
"1-to-1 Trend - Table")
n_title <- length(title_list)
generate_report(title = "Coaching Report",
filename = newpath,
outputs = output_list,
titles = title_list,
subheaders = rep("", n_title),
echos = rep(FALSE, n_title),
levels = rep(3, n_title),
theme = "cosmo",
preamble = read_preamble("coaching_report.md"))
}

134
R/collaboration_area.R Normal file
View File

@@ -0,0 +1,134 @@
#' @title Collaboration - Stacked Area Plot
#'
#' @description
#' Provides an overview analysis of 'Weekly Digital Collaboration'.
#' Returns a stacked area plot of Email and Meeting Hours by default.
#' Additional options available to return a summary table.
#'
#' @details
#' Uses the metrics `Meeting_hours`, `Email_hours`, `Unscheduled_Call_hours`,
#' and `Instant_Message_hours`.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' A Collaboration assessment dataset may also be provided, in which
#' case Unscheduled call hours would be included in the output.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#'
#' @family Collaboration
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
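#' @examples
#' ## Example usage with the `sq_data` sample dataset:
#' \dontrun{
#' ## Return a stacked area plot
#' collaboration_area(sq_data, hrvar = "Organization")
#'
#' ## Return a summary table
#' collaboration_area(sq_data, hrvar = "Organization", return = "table")
#' }
#'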
#' @export
collaboration_area <- function(data,
hrvar = "Organization",
mingroup=5,
return = "plot"){
data$Date <- as.Date(data$Date, format = "%m/%d/%Y")
if("Instant_message_hours" %in% names(data)){
data <- rename(data, Instant_Message_hours = "Instant_message_hours")
}
if("Unscheduled_call_hours" %in% names(data)){
data <- rename(data, Unscheduled_Call_hours = "Unscheduled_call_hours")
}
if("Unscheduled_Call_hours" %in% names(data)){
main_vars <- c("Meeting_hours",
"Email_hours",
"Instant_Message_hours",
"Unscheduled_Call_hours")
} else {
main_vars <- c("Meeting_hours",
"Email_hours",
"Instant_Message_hours")
}
myTable <-
data %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
select(PersonId,
Date,
group,
main_vars) %>%
group_by(Date, group) %>%
summarise_at(vars(main_vars), ~mean(.)) %>%
left_join(hrvar_count(data, hrvar, return = "table"),
by = c("group" = hrvar)) %>%
rename(Employee_Count = "n") %>%
filter(Employee_Count >= mingroup) %>%
ungroup()
myTable_long <-
myTable %>%
select(Date, group, ends_with("_hours")) %>%
gather(Metric, Hours, -Date, -group) %>%
mutate(Metric = sub(pattern = "_hours", replacement = "", x = Metric))
## Levels
level_chr <- sub(pattern = "_hours", replacement = "", x = main_vars)
## Colour definitions
colour_defs <-
c("Meeting" = rgb2hex(174, 239, 168),
"Email" = rgb2hex(192, 117, 205),
"Instant_Message" = rgb2hex(113, 200, 234),
"Unscheduled_Call" = rgb2hex(252, 161, 124))
colour_defs <- colour_defs[names(colour_defs) %in% level_chr]
plot_object <-
myTable_long %>%
mutate(Metric = factor(Metric, levels = level_chr)) %>%
ggplot(aes(x = Date, y = Hours, colour = Metric)) +
geom_area(aes(fill = Metric), alpha = 1.0, position = 'stack') +
theme_wpa_basic() +
scale_y_continuous(labels = round) +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
scale_colour_manual(values = colour_defs) +
scale_fill_manual(values = colour_defs) +
facet_wrap(.~group) +
labs(title = "Total Collaboration Hours",
subtitle = paste("Weekly collaboration hours by", camel_clean(hrvar))) +
labs(caption = extract_date_range(data, return = "text"))
if(return == "table"){
myTable %>%
as_tibble() %>%
mutate(Collaboration_hours = select(., main_vars) %>%
apply(1, sum, na.rm = TRUE))
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

40
R/collaboration_dist.R Normal file
View File

@@ -0,0 +1,40 @@
#' @title Collaboration Hours distribution
#'
#' @description
#' Analyze the distribution of Collaboration Hours.
#' Returns a stacked bar plot by default.
#' Additional options available to return a table with distribution elements.
#'
#' @details
#' Uses the metric `Collaboration_hours`.
#'
#' @inheritParams create_dist
#' @family Collaboration
#'
#' @examples
#' ## Return a plot
#' collaboration_dist(sq_data, hrvar = "Organization")
#'
#' ## Return a table
#' collaboration_dist(sq_data, hrvar = "Organization", return = "table")
#' @export
collaboration_dist <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot",
cut = c(15, 20, 25)) {
create_dist(data = data,
metric = "Collaboration_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return,
cut = cut)
}
#' @rdname collaboration_dist
#' @export
collaboration_distribution <- collaboration_dist

30
R/collaboration_fizz.R Normal file
View File

@@ -0,0 +1,30 @@
#' @title Distribution of Collaboration Hours (Fizzy Drink plot)
#'
#' @description
#' Analyze weekly collaboration hours distribution, and returns
#' a 'fizzy' scatter plot by default.
#' Additional options available to return a table with distribution elements.
#'
#' @details
#' Uses the metric `Collaboration_hours`.
#'
#' @inheritParams create_fizz
#'
#' @family Collaboration
#'
#' @examples
#' collaboration_fizz(sq_data, hrvar = "Organization", return = "table")
#' @export
collaboration_fizz <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
create_fizz(data = data,
metric = "Collaboration_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return)
}

61
R/collaboration_line.R Normal file
View File

@@ -0,0 +1,61 @@
#' @title Collaboration Time Trend - Line Chart
#'
#' @description
#' Provides a week by week view of collaboration time, visualised as line charts.
#' By default returns a line chart for collaboration hours,
#' with a separate panel per value in the HR attribute.
#' Additional options available to return a summary table.
#'
#' @details
#' Uses the metric `Collaboration_hours`.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#'
#' @family Collaboration
#'
#' @examples
#'
#' ## Return a line plot
#' collaboration_line(sq_data, hrvar = "LevelDesignation")
#'
#'
#' ## Return a table
#' collaboration_line(sq_data, hrvar = "LevelDesignation", return = "table")
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export
collaboration_line <- function(data,
hrvar = "Organization",
mingroup=5,
return = "plot"){
## Inherit arguments
output <- create_line(data = data,
metric = "Collaboration_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return)
if(return == "plot"){
output +
labs(title = "Collaboration Hours")
} else if(return == "table"){
output
} else {
stop("Invalid `return` value")
}
}

50
R/collaboration_rank.R Normal file
View File

@@ -0,0 +1,50 @@
#' @title Collaboration Ranking
#'
#' @description
#' This function scans a standard query output for groups with high levels of 'Weekly Digital Collaboration'.
#' Returns a table with all groups (across multiple HR attributes) ranked by hours of digital collaboration.
#'
#' @details
#' Uses the metric `Collaboration_hours`.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar A list of HR Variables to consider in the scan.
#' Defaults to all HR attributes identified.
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
#' Defaults to 5.
#' @param return A character vector specifying what to return.
#' Valid values include "table" (default, returning a data frame)
#' and "html" (returning an interactive DataTable).
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats reorder
#'
#' @family Collaboration
#'
#' @return
#' Returns a data frame by default, where 'table' is passed in `return`.
#' When 'html' is passed, an interactive DataTable is returned.
#'
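#' @examples
#' ## Example usage with the `sq_data` sample dataset:
#' ## return the ranked output as a data frame
#' collaboration_rank(sq_data, return = "table")
#'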
#' @export
collaboration_rank <- function(data,
hrvar = extract_hr(data),
mingroup = 5,
return = "table"){
output <-
data %>% create_rank(metric="Collaboration_hours", hrvar = hrvar,
mingroup = mingroup,
return = "table")
if(return == "html"){
return(create_dt(output))
} else if(return == "table"){
return(output)
} else {
stop("Invalid `return` argument.")
}
}

125
R/collaboration_report.R Normal file
View File

@@ -0,0 +1,125 @@
#' @title Generate a Collaboration report in HTML
#'
#' @description
#' The function generates an interactive HTML report using
#' Standard Query data as an input. The report contains a series
#' of summary analysis and visualisations relating to key **collaboration**
#' metrics in Workplace Analytics, including email and meeting hours.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
#' For example, "collaboration report".
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
#' Defaults to TRUE.
#'
#' @importFrom purrr map_if
#' @importFrom dplyr `%>%`
#'
#' @family Reports
#'
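#' @examples
#' ## Example usage with the `sq_data` sample dataset;
#' ## this writes an HTML file to the working directory.
#' \dontrun{
#' collaboration_report(sq_data, hrvar = "Organization")
#' }
#'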
#' @export
collaboration_report <- function(data,
hrvar = "Organization",
mingroup = 5,
path = "collaboration report",
timestamp = TRUE){
## Create timestamped path (if applicable)
if(timestamp == TRUE){
newpath <- paste(path, wpa::tstamp())
} else {
newpath <- path
}
# Set outputs
output_list <-
list(data %>% check_query(return = "text") %>% md2html(),
md2html(text = read_preamble("collaboration_section.md")), # Header
data %>% collaboration_sum(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% collaboration_sum(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% afterhours_sum(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% afterhours_sum(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% collaboration_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% collaboration_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% collaboration_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% collaboration_trend(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% collaboration_line(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% collaboration_line(hrvar = hrvar, mingroup = mingroup, return = "table"),
md2html(text = read_preamble("email_section.md")), # Header
data %>% email_summary(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% email_summary(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% email_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% email_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% email_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% email_trend(hrvar = hrvar, mingroup = mingroup, return = "table"),
md2html(text = read_preamble("meeting_section.md")), # Header
data %>% meeting_summary(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% meeting_summary(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% meeting_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% meeting_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% meeting_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% meeting_trend(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% meeting_quality(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% meeting_quality(hrvar = hrvar, mingroup = mingroup, return = "table")) %>%
purrr::map_if(is.data.frame, create_dt)
# Set header titles
title_list <-
c("Data Overview",
"Collaboration", # Section header
"Collaboration Summary - Plot",
"Collaboration Summary - Table",
"Afterhours Summary - Plot",
"Afterhours Summary - Table",
"Collaboration Distribution - Plot",
"Collaboration Distribution - Table",
"Collaboration Trend - Plot",
"Collaboration Trend - Table",
"Collaboration over time - Plot",
"Collaboration over time - Table",
"Email", # Section header
"Email Summary - Plot",
"Email Summary - Table",
"Email Distribution - Plot",
"Email Distribution - Table",
"Email Trend - Plot",
"Email Trend - Table",
"Meeting", # Section header
"Meeting Summary - Plot",
"Meeting Summary - Table",
"Meeting Distribution - Plot",
"Meeting Distribution - Table",
"Meeting Trend - Plot",
"Meeting Trend - Table",
"Meeting Quality - Plot",
"Meeting Quality - Table")
# Set header levels
n_title <- length(title_list)
levels_list <- rep(3, n_title)
levels_list[c(1, 2, 13, 20)] <- 2 # Section header
# Generate report
generate_report(title = "Collaboration Report",
filename = newpath,
outputs = output_list,
titles = title_list,
subheaders = rep("", n_title),
echos = rep(FALSE, n_title),
levels = levels_list,
theme = "cosmo",
preamble = read_preamble("collaboration_report.md"))
}

R/collaboration_sum.R Normal file

@ -0,0 +1,79 @@
#' @title Collaboration Summary
#'
#' @description
#' Provides an overview analysis of 'Weekly Digital Collaboration'.
#' Returns a stacked bar plot of Email and Meeting Hours by default.
#' Additional options available to return a summary table.
#'
#' @details
#' Uses the metrics `Meeting_hours`, `Email_hours`, `Unscheduled_Call_hours`,
#' and `Instant_Message_hours` where available; if unscheduled call data is not
#' present in the query, only `Meeting_hours` and `Email_hours` are used.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats reorder
#'
#' @family Collaboration
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
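#' @examples
#' \dontrun{
#' # Illustrative usage with the built-in `sq_data` sample dataset
#' collaboration_sum(sq_data, hrvar = "LevelDesignation", return = "plot")
#'
#' # Return a summary table instead of a plot
#' collaboration_sum(sq_data, hrvar = "LevelDesignation", return = "table")
#' }
#'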
#' @export
collaboration_sum <- function(data,
hrvar = "Organization",
mingroup=5,
return = "plot"){
if("Instant_message_hours" %in% names(data)){
data <- rename(data, Instant_Message_hours = "Instant_message_hours")
}
if("Unscheduled_Call_hours" %in% names(data)){
main_vars <- c("Meeting_hours",
"Email_hours",
"Instant_Message_hours",
"Unscheduled_Call_hours")
} else {
main_vars <- c("Meeting_hours",
"Email_hours")
}
create_stacked(data = data,
hrvar = hrvar,
metrics = main_vars,
mingroup = mingroup,
return = return)
}
#' @rdname collaboration_sum
#' @export
collab_sum <- collaboration_sum
#' @rdname collaboration_sum
#' @export
collaboration_summary <- collaboration_sum
#' @rdname collaboration_sum
#' @export
collab_summary <- collaboration_sum

R/collaboration_trend.R Normal file

@ -0,0 +1,100 @@
#' @title Collaboration Time Trend
#'
#' @description
#' Provides a week by week view of collaboration time.
#' By default returns a week by week heatmap, highlighting the points in time with most activity.
#' Additional options available to return a summary table.
#'
#' @details
#' Uses the metric `Collaboration_hours`.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#'
#' @family Collaboration
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
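#' @examples
#' \dontrun{
#' # Illustrative usage with the built-in `sq_data` sample dataset
#' collaboration_trend(sq_data, hrvar = "LevelDesignation")
#'
#' # Return a week-by-week table instead of a heatmap
#' collaboration_trend(sq_data, hrvar = "LevelDesignation", return = "table")
#' }
#'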
#' @export
collaboration_trend <- function(data,
hrvar = "Organization",
mingroup=5,
return = "plot"){
## Check inputs
required_variables <- c("Date",
"Collaboration_hours",
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
myTable <-
data %>%
mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
select(PersonId, Date, group, Collaboration_hours) %>%
group_by(group) %>%
mutate(Employee_Count = n_distinct(PersonId)) %>%
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
myTable <-
myTable %>%
group_by(Date, group) %>%
summarize(Employee_Count = mean(Employee_Count),
Collaboration_hours = mean(Collaboration_hours))
myTable_plot <- myTable %>% select(Date, group, Collaboration_hours)
myTable_return <- myTable_plot %>% spread(Date, Collaboration_hours)
plot_object <-
myTable_plot %>%
ggplot(aes(x = Date , y = group , fill = Collaboration_hours)) +
geom_tile(height=.5) +
scale_fill_gradient(name="Hours", low = "white", high = "red") +
theme_classic() +
theme(axis.text=element_text(size=12),
plot.title = element_text(color="grey40", face="bold", size=18),
plot.subtitle = element_text(size=14),
legend.position = "right",
legend.justification = "right",
legend.title=element_text(size=14),
legend.text=element_text(size=14)) +
labs(title = "Collaboration Hours",
subtitle = paste("Total meeting and email time by", tolower(hrvar))) +
xlab("Date") +
ylab(hrvar) +
labs(caption = extract_date_range(data, return = "text"))
if(return == "table"){
myTable_return %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

R/connectivity_report.R Normal file

@ -0,0 +1,73 @@
#' @title Generate a Connectivity report in HTML
#'
#' @description
#' The function generates an interactive HTML report using
#' Standard Query data as an input. The report contains a series
#' of summary analysis and visualisations relating to key **connectivity**
#' metrics in Workplace Analytics, including external/internal network size
#' vs breadth.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
#' For example, "collaboration report".
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
#' Defaults to TRUE.
#'
#' @importFrom purrr map_if
#' @importFrom dplyr `%>%`
#'
#' @family Reports
#'
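#' @examples
#' \dontrun{
#' # Illustrative usage only (not run): writes an HTML report to the working
#' # directory, using the built-in `sq_data` sample dataset as the input query
#' connectivity_report(sq_data, hrvar = "LevelDesignation")
#' }
#'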
#' @export
connectivity_report <- function(data,
hrvar = "LevelDesignation",
mingroup = 5,
path = "connectivity report",
timestamp = TRUE){
## Create timestamped path (if applicable)
if(timestamp == TRUE){
newpath <- paste(path, wpa::tstamp())
} else {
newpath <- path
}
output_list <-
list(data %>% check_query(return = "text") %>% md2html(),
data %>% external_network_plot(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% external_network_plot(hrvar = hrvar, mingroup = mingroup, return = "table"),
data %>% internal_network_plot(hrvar = hrvar, mingroup = mingroup, return = "plot"),
data %>% internal_network_plot(hrvar = hrvar, mingroup = mingroup, return = "table")) %>%
purrr::map_if(is.data.frame, create_dt)
title_list <-
c("Data Overview",
"External network - Plot",
"External network - Table",
"Internal network - Plot",
"Internal network - Table")
n_title <- length(title_list)
generate_report(title = "Connectivity Report",
filename = newpath,
outputs = output_list,
titles = title_list,
subheaders = rep("", n_title),
echos = rep(FALSE, n_title),
levels = rep(3, n_title),
theme = "cosmo",
preamble = read_preamble("connectivity_report.md"))
}

R/copy_df.R Normal file

@ -0,0 +1,32 @@
#' @title Copy a data frame to clipboard for pasting in Excel
#'
#' @description
#' This is a pipe-optimised function that feeds into `wpa::export()`,
#' but can be used as a stand-alone function.
#'
#' Based on the original function at http://www.github.com/martinctc/surveytoolbox.
#'
#' @param x Data frame to be passed through. Cannot contain list-columns or nested data frames.
#' @param row.names A logical value specifying whether to include row names. Defaults to FALSE.
#' @param col.names A logical value specifying whether to include column names. Defaults to TRUE.
#' @param quietly Set this to TRUE to suppress printing the data frame to the console. Defaults to FALSE.
#' @param ... Additional arguments for write.table().
#'
#' @importFrom utils write.table
#'
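#' @examples
#' \dontrun{
#' # Illustrative usage only (not run): copies a summary table to the clipboard
#' # for pasting into Excel; requires an interactive session with clipboard access
#' sq_data %>%
#' collaboration_sum(return = "table") %>%
#' copy_df()
#' }
#'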
#' @export
copy_df <- function(x,
row.names = FALSE,
col.names = TRUE,
quietly = FALSE, ...) {
utils::write.table(x,"clipboard-50000",
sep="\t",
row.names=row.names,
col.names=col.names,...)
if(quietly==FALSE) print(x)
}

R/create_IV.R Normal file

@ -0,0 +1,92 @@
#' @title Calculate Information Value for a selected outcome variable
#'
#' @description
#' Specify an outcome variable and return IV outputs.
#' By default, all numeric variables in the dataset are used as predictor variables.
#'
#' @param data A Person Query dataset in the form of a data frame.
#' @param predictors A character vector specifying the columns to be used as predictors.
#' Defaults to NULL, where all numeric vectors in the data will be used as predictors.
#' @param outcome A string specifying a binary variable, i.e. can only contain
#' the values 1 or 0.
#' @param bins Number of bins to use in `Information::create_infotables()`, defaults to 5.
#' @param return String specifying what output to return.
#' Defaults to "plot", which returns a bar plot summarising the information value.
#' "summary" returns a summary table, "list" returns a list of outputs for all the
#' input variables, and "plot-WOE" returns Weight of Evidence (WOE) plots for the predictors.
#'
#' @import dplyr
#'
#' @family Information Value
#'
#' @examples
#' \dontrun{
#' sq_data %>%
#' mutate(X = ifelse(Collaboration_hours > 12, 1, 0)) %>%
#' create_IV(outcome = "X")
#' sq_data %>%
#' mutate(X = ifelse(Collaboration_hours > 12, 1, 0)) %>%
#' create_IV(outcome = "X",
#' predictors = c("Email_hours", "Meeting_hours"),
#' return = "list")
#' }
#'
#' @export
create_IV <- function(data,
predictors = NULL,
outcome,
bins = 5,
return = "plot"){
if(is.null(predictors)){
train <-
data %>%
rename(outcome = outcome) %>%
select_if(is.numeric) %>%
tidyr::drop_na()
} else {
train <-
data %>%
rename(outcome = outcome) %>%
select(predictors, outcome) %>%
tidyr::drop_na()
}
# Calculate Odds
odds <- sum(train$outcome) / (length(train$outcome) - sum(train$outcome))
lnodds <- log(odds)
# IV Analysis
IV <- Information::create_infotables(data = train, y = "outcome", bins = bins)
IV_summary <- IV$Summary
IV_names <- names(IV$Tables)
# Output list
output_list <-
IV_names %>%
purrr::map(function(x){
IV$Tables[[x]] %>%
mutate(ODDS = exp(WOE + lnodds),
PROB = ODDS / (ODDS + 1))
}) %>%
purrr::set_names(IV_names)
if(return == "summary"){
IV_summary
} else if(return == "plot"){
IV_summary %>%
utils::head(12) %>%
create_bar_asis(group_var = "Variable",
bar_var = "IV",
title = "Information Value (IV)",
subtitle = "Showing top 12 only")
} else if(return == "plot-WOE"){
Information::plot_infotables(IV, IV$Summary$Variable, same_scale = TRUE) %>% grDevices::recordPlot()
} else if(return == "list"){
output_list
} else {
stop("Please enter a valid input for `return`.")
}
}

R/create_bar.R Normal file

@ -0,0 +1,168 @@
#' @title Mean Bar Plot (General Purpose)
#'
#' @description
#' Provides an overview analysis of a selected metric by calculating a mean per metric.
#' Returns a bar plot showing the average of a selected metric by default.
#' Additional options available to return a summary table.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param metric Character string containing the name of the metric,
#' e.g. "Collaboration_hours"
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#' @param bar_colour String to specify colour to use for bars.
#' In-built accepted values include "default" (default), "alert" (red), and
#' "darkblue". Otherwise, hex codes are also accepted. You can also supply
#' RGB values via `rgb2hex()`.
#' @param na.rm A logical value indicating whether NA values should be stripped
#' before the computation proceeds. Defaults to FALSE.
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @importFrom scales wrap_format
#' @importFrom stats reorder
#'
#' @family General
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @examples
#' # Return a ggplot bar chart
#' create_bar(sq_data, metric = "Collaboration_hours", hrvar = "LevelDesignation")
#' create_bar(sq_data, metric = "Generated_workload_email_hours")
#' create_bar(sq_data, metric = "After_hours_collaboration_hours")
#'
#' # Return a summary table
#' create_bar(sq_data, metric = "Collaboration_hours", hrvar = "LevelDesignation", return = "table")
#' @export
create_bar <- function(data,
metric,
hrvar = "Organization",
mingroup = 5,
return = "plot",
bar_colour = "default",
na.rm = FALSE){
## Check inputs
required_variables <- c("Date",
metric,
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
## Clean metric name
clean_nm <- gsub(pattern = "_", replacement = " ", x = metric)
## Data for bar plot
plot_data <-
data %>%
rename(group = !!sym(hrvar)) %>%
group_by(PersonId, group) %>%
summarise(!!sym(metric) := mean(!!sym(metric), na.rm = na.rm)) %>%
ungroup() %>%
left_join(data %>%
rename(group = !!sym(hrvar)) %>%
group_by(group) %>%
summarise(Employee_Count = n_distinct(PersonId)),
by = "group") %>%
filter(Employee_Count >= mingroup)
## Colour bar override
if(bar_colour == "default"){
bar_colour <- rgb2hex(0, 120, 212)
} else if(bar_colour == "alert"){
bar_colour <- rgb2hex(215, 59, 1)
} else if(bar_colour == "darkblue"){
bar_colour <- rgb2hex(0, 32, 80)
}
## Employee count / base size table
plot_legend <-
plot_data %>%
group_by(group) %>%
summarize(Employee_Count = first(Employee_Count)) %>%
mutate(Employee_Count = paste("n=",Employee_Count))
## Data for bar plot
plot_table <-
plot_data %>%
group_by(group) %>%
summarise_at(metric, ~mean(., na.rm = na.rm)) %>%
arrange(desc(!!sym(metric)))
## Table for annotation
annot_table <-
plot_legend %>%
dplyr::left_join(plot_table, by = "group")
## Location attribute for x axis
location <- plot_table %>% select(!!sym(metric)) %>% max()
## Bar plot
plot_object <-
plot_table %>%
ggplot(aes(x = stats::reorder(group, !!sym(metric)), y = !!sym(metric))) +
geom_bar(stat = "identity",
fill = bar_colour) +
geom_text(aes(label = round(!!sym(metric), 1)),
hjust = 1.3,
color = "#FFFFFF",
fontface = "bold",
size = 4) +
scale_y_continuous(limits = c(0, location * 1.25)) +
annotate("text",
x = plot_legend$group,
y = location * 1.15,
label = plot_legend$Employee_Count) +
annotate("rect", xmin = 0.5, xmax = length(plot_legend$group) + 0.5, ymin = location * 1.05, ymax = location * 1.25, alpha = .2) +
coord_flip() +
theme_classic() +
theme(axis.text = element_text(size=12),
plot.title = element_text(color="grey40", face="bold", size=18),
plot.subtitle = element_text(size=14),
legend.position = "top",
legend.justification = "right",
legend.title = element_text(size=14),
legend.text = element_text(size=14)) +
labs(title = clean_nm,
subtitle = paste("Average", clean_nm, "by", camel_clean(hrvar))) +
xlab(camel_clean(hrvar)) +
ylab(paste("Average weekly", clean_nm)) +
labs(caption = extract_date_range(data, return = "text"))
summary_table <-
plot_data %>%
select(group, !!sym(metric)) %>%
group_by(group) %>%
summarise(!!sym(metric) := mean(!!sym(metric)),
n = n())
if(return == "table"){
return(summary_table)
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

R/create_bar_asis.R Normal file

@ -0,0 +1,87 @@
#' @title Create a bar chart without aggregation
#'
#' @description
#' This function creates a bar chart directly from the aggregated / summarised data.
#' Unlike `create_bar()`, which performs a person-level aggregation, `create_bar_asis()`
#' performs no calculation and renders the values exactly as they are passed into
#' the function.
#'
#' @param data Plotting data as a data frame.
#' @param group_var String containing name of variable for the group.
#' @param bar_var String containing name of variable representing the value of the bars.
#' @param title Title of the plot.
#' @param subtitle Subtitle of the plot.
#' @param caption Caption of the plot.
#' @param ylab Y-axis label for the plot (group axis)
#' @param xlab X-axis label of the plot (bar axis).
#' @param percent Logical value to determine whether to show labels as percentage signs. Defaults to FALSE.
#' @param bar_colour String to specify colour to use for bars.
#' In-built accepted values include "default" (default), "alert" (red), and
#' "darkblue". Otherwise, hex codes are also accepted. You can also supply
#' RGB values via `rgb2hex()`.
#'
#' @import ggplot2
#' @import dplyr
#'
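#' @examples
#' \dontrun{
#' # A minimal sketch using a small, made-up summary data frame
#' org_summary <- data.frame(Organization = c("Sales", "Finance", "HR"),
#' Email_hours = c(12.4, 9.1, 7.8))
#'
#' create_bar_asis(org_summary,
#' group_var = "Organization",
#' bar_var = "Email_hours",
#' title = "Average email hours",
#' subtitle = "Illustrative data only")
#' }
#'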
#' @export
create_bar_asis <- function(data,
group_var,
bar_var,
title = NULL,
subtitle = NULL,
caption = NULL,
ylab = group_var,
xlab = bar_var,
percent = FALSE,
bar_colour = "default"){
## Colour bar override
if(bar_colour == "default"){
bar_colour <- rgb2hex(0, 120, 212)
} else if(bar_colour == "alert"){
bar_colour <- rgb2hex(215, 59, 1)
} else if(bar_colour == "darkblue"){
bar_colour <- rgb2hex(0, 32, 80)
}
up_break <- max(data[[bar_var]], na.rm = TRUE) * 1.3
if(percent == FALSE){
returnPlot <-
data %>%
ggplot(aes(x = reorder(!!sym(group_var), !!sym(bar_var)), y = !!sym(bar_var))) +
geom_col(fill = bar_colour) +
geom_text(aes(label = round(!!sym(bar_var), 2)),
hjust = -0.25,
color = "#000000",
fontface = "bold",
size = 4)
} else if(percent == TRUE){
returnPlot <-
data %>%
ggplot(aes(x = reorder(!!sym(group_var), !!sym(bar_var)), y = !!sym(bar_var))) +
geom_col(fill = bar_colour) +
geom_text(aes(label = scales::percent(!!sym(bar_var), accuracy = 1)),
hjust = -0.25,
color = "#000000",
fontface = "bold",
size = 4)
}
returnPlot +
scale_y_continuous(limits = c(0, up_break)) +
coord_flip() +
labs(title = title,
subtitle = subtitle,
caption = caption,
y = xlab,
x = ylab) +
theme_wpa_basic()
}

R/create_boxplot.R Normal file

@ -0,0 +1,148 @@
#' @title Box Plot (General Purpose)
#'
#' @description
#' Analyzes a selected metric and returns a box plot by default.
#' Additional options available to return a table with distribution elements.
#'
#' @details
#' This is a general purpose function that powers all the functions
#' in the package that produce box plots.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param metric Character string containing the name of the metric,
#' e.g. "Collaboration_hours"
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size, defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats median
#' @importFrom stats sd
#'
#' @family General
#'
#' @examples
#' ## Create a fizzy plot for Work Week Span by Level Designation
#' create_boxplot(sq_data, metric = "Workweek_span", hrvar = "LevelDesignation", return = "plot")
#'
#' ## Create a summary statistics table for Work Week Span by Organization
#' create_boxplot(sq_data, metric = "Workweek_span", hrvar = "Organization", return = "table")
#'
#' ## Create a fizzy plot for Collaboration Hours by Level Designation
#' create_boxplot(sq_data, metric = "Collaboration_hours", hrvar = "LevelDesignation", return = "plot")
#' @export
create_boxplot <- function(data,
metric,
hrvar = "Organization",
mingroup = 5,
return = "plot") {
## Check inputs
required_variables <- c("Date",
metric,
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
## Clean metric name
clean_nm <- gsub(pattern = "_", replacement = " ", x = metric)
plot_data <-
data %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
group_by(PersonId, group) %>%
summarise(!!sym(metric) := mean(!!sym(metric))) %>%
ungroup() %>%
left_join(data %>%
rename(group = !!sym(hrvar)) %>%
group_by(group) %>%
summarise(Employee_Count = n_distinct(PersonId)),
by = "group") %>%
filter(Employee_Count >= mingroup)
## Get max value
max_point <- max(plot_data[[metric]]) * 1.2
plot_legend <-
plot_data %>%
group_by(group) %>%
summarize(Employee_Count = first(Employee_Count)) %>%
mutate(Employee_Count = paste("n=",Employee_Count))
## summary table
summary_table <-
plot_data %>%
select(group, tidyselect::all_of(metric)) %>%
group_by(group) %>%
summarise(mean = mean(!!sym(metric)),
median = median(!!sym(metric)),
sd = sd(!!sym(metric)),
min = min(!!sym(metric)),
max = max(!!sym(metric)),
range = max - min,
n = n())
## group order
group_ord <-
summary_table %>%
arrange(desc(mean)) %>%
pull(group)
plot_object <-
plot_data %>%
mutate(group = factor(group, levels = group_ord)) %>%
ggplot(aes(x = group, y = !!sym(metric))) +
geom_boxplot(color = "#578DB8") +
ylim(0, max_point) +
annotate("text", x = plot_legend$group, y = 0, label = plot_legend$Employee_Count) +
scale_x_discrete(labels = scales::wrap_format(10)) +
theme_classic() +
theme(axis.text=element_text(size=12),
axis.text.x = element_text(angle = 30, hjust = 1),
plot.title = element_text(color="grey40", face="bold", size=18),
plot.subtitle = element_text(size=14),
legend.position = "top",
legend.justification = "right",
legend.title=element_text(size=14),
legend.text=element_text(size=14)) +
labs(title = clean_nm,
subtitle = paste("Distribution of",
tolower(clean_nm),
"by",
camel_clean(hrvar))) +
xlab(hrvar) +
ylab(paste("Average", clean_nm)) +
labs(caption = extract_date_range(data, return = "text"))
if(return == "table"){
summary_table %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else if(return == "data"){
plot_data %>%
mutate(group = factor(group, levels = group_ord)) %>%
arrange(desc(group))
} else {
stop("Please enter a valid input for `return`.")
}
}

R/create_dist.R Normal file

@ -0,0 +1,152 @@
#' @title Horizontal 100 percent stacked bar plot (General Purpose)
#'
#' @description
#' Provides an analysis of the distribution of a selected metric.
#' Returns a stacked bar plot by default.
#' Additional options available to return a table with distribution elements.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param metric Character string containing the name of the metric,
#' e.g. "Collaboration_hours"
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size, defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#' @param cut A numeric vector of length three to specify the breaks for the distribution,
#' e.g. c(10, 15, 20)
#' @param dist_colours A character vector of length four to specify colour
#' codes for the stacked bars.
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom tidyr spread
#' @importFrom stats median
#' @importFrom stats sd
#'
#' @family Collaboration
#'
#' @examples
#' ## Return a plot
#' create_dist(sq_data, metric = "Collaboration_hours", hrvar = "Organization")
#'
#' ## Return a table
#' create_dist(sq_data, metric = "Collaboration_hours", hrvar = "Organization", return = "table")
#' @export
create_dist <- function(data,
metric,
hrvar = "Organization",
mingroup = 5,
return = "plot",
cut = c(15, 20, 25),
dist_colours = c("#FE7F4F",
"#ffdfd3",
"#bed6f2",
"#e9f1fb")) {
## Check inputs
required_variables <- c("Date",
metric,
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
## Clean metric name
clean_nm <- gsub(pattern = "_", replacement = " ", x = metric)
## Basic Data for bar plot
plot_data <-
data %>%
rename(group = !!sym(hrvar)) %>%
group_by(PersonId, group) %>%
summarise(!!sym(metric) := mean(!!sym(metric))) %>%
ungroup() %>%
left_join(data %>%
rename(group = !!sym(hrvar)) %>%
group_by(group) %>%
summarise(Employee_Count = n_distinct(PersonId)),
by = "group") %>%
filter(Employee_Count >= mingroup)
## Create buckets of collaboration hours
plot_data <-
plot_data %>%
mutate(bucket_hours = cut_hour(!!sym(metric),
cuts = cut))
## Employee count / base size table
plot_legend <-
plot_data %>%
group_by(group) %>%
summarize(Employee_Count=first(Employee_Count)) %>%
mutate(Employee_Count = paste("n=",Employee_Count))
## Data for bar plot
plot_table <-
plot_data %>%
group_by(group, bucket_hours) %>%
summarize(Employees=n(),
Employee_Count=first(Employee_Count),
percent= Employees / Employee_Count ) %>%
arrange(group, desc(bucket_hours))
## Table for annotation
annot_table <-
plot_legend %>%
dplyr::left_join(plot_table, by = "group")
## Bar plot
plot_object <-
plot_table %>%
ggplot(aes(x = group, y=Employees, fill = bucket_hours)) +
geom_bar(stat = "identity", position = position_fill(reverse = TRUE)) +
scale_y_continuous(labels = function(x) paste0(x*100, "%")) +
coord_flip() +
annotate("text", x = plot_legend$group, y = -.05, label = plot_legend$Employee_Count ) +
scale_fill_manual(name="",
values = rev(dist_colours)) +
theme_wpa_basic() +
labs(title = clean_nm,
subtitle = paste("Distribution of", clean_nm, "by", camel_clean(hrvar))) +
xlab(camel_clean(hrvar)) +
ylab("Fraction of employees") +
labs(caption = extract_date_range(data, return = "text"))
## Table to return
return_table <-
plot_table %>%
select(group, bucket_hours, percent) %>%
spread(bucket_hours, percent) %>%
left_join(data %>%
rename(group = !!sym(hrvar)) %>%
group_by(group) %>%
summarise(Employee_Count = n_distinct(PersonId)),
by = "group")
if(return == "table"){
return_table %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}
#' @rdname collaboration_dist
#' @export
collaboration_distribution <- collaboration_dist

R/create_dt.R Normal file

@ -0,0 +1,47 @@
#' @title Create interactive tables in HTML with 'download' buttons.
#'
#' @description
#' See https://martinctc.github.io/blog/vignette-downloadable-tables-in-rmarkdown-with-the-dt-package/ for more.
#'
#' @param x Data frame to be passed through.
#' @param rounding Numeric value specifying the number of decimal places to display. Defaults to 1.
#' @param freeze Number of columns from the left to 'freeze'. Defaults to 2,
#' which includes the row number column.
#'
#' @import DT
#' @importFrom dplyr mutate_if
#'
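#' @examples
#' \dontrun{
#' # Illustrative usage: render an interactive table from a summary output,
#' # using the built-in `sq_data` sample dataset
#' sq_data %>%
#' collaboration_sum(return = "table") %>%
#' create_dt(rounding = 1, freeze = 2)
#' }
#'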
#' @export
create_dt <- function(x, rounding = 1, freeze = 2){
# Round all numeric to "rounding" number of dp
num_cols <- dplyr::select_if(x, is.numeric) %>% names()
if(length(num_cols) == 0){ # No numeric columns
DT::datatable(x,
extensions = c('Buttons',
'FixedColumns'),
options = list(dom = 'Blfrtip',
fixedColumns = list(leftColumns = freeze),
scrollX = TRUE,
buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
lengthMenu = list(c(10,25,50,-1),
c(10,25,50,"All"))))
} else {
DT::datatable(x,
extensions = c('Buttons',
'FixedColumns'),
options = list(dom = 'Blfrtip',
fixedColumns = list(leftColumns = freeze),
scrollX = TRUE,
buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
lengthMenu = list(c(10,25,50,-1),
c(10,25,50,"All")))) %>%
DT::formatRound(columns = num_cols, rounding)
}
}

R/create_fizz.R Normal file

@ -0,0 +1,136 @@
#' @title Fizzy Drink / Jitter Scatter Plot (General Purpose)
#'
#' @description
#' Analyzes a selected metric and returns a 'fizzy' scatter plot by default.
#' Additional options available to return a table with distribution elements.
#'
#' @details
#' This is a general purpose function that powers all the functions
#' in the package that produce 'fizzy drink' / jitter scatter plots.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param metric Character string containing the name of the metric,
#' e.g. "Collaboration_hours"
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size, defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats median
#' @importFrom stats sd
#'
#' @family General
#'
#' @examples
#' ## Create a fizzy plot for Work Week Span by Level Designation
#' create_fizz(sq_data, metric = "Workweek_span", hrvar = "LevelDesignation", return = "plot")
#'
#' ## Create a summary statistics table for Work Week Span by Organization
#' create_fizz(sq_data, metric = "Workweek_span", hrvar = "Organization", return = "table")
#'
#' ## Create a fizzy plot for Collaboration Hours by Level Designation
#' create_fizz(sq_data, metric = "Collaboration_hours", hrvar = "LevelDesignation", return = "plot")
#' @export
create_fizz <- function(data,
metric,
hrvar = "Organization",
mingroup = 5,
return = "plot") {
## Check inputs
required_variables <- c("Date",
metric,
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
## Clean metric name
clean_nm <- gsub(pattern = "_", replacement = " ", x = metric)
## Plot data
plot_data <-
data %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
group_by(PersonId, group) %>%
summarise(!!sym(metric) := mean(!!sym(metric))) %>%
ungroup() %>%
left_join(data %>%
rename(group = !!sym(hrvar)) %>%
group_by(group) %>%
summarise(Employee_Count = n_distinct(PersonId)),
by = "group") %>%
filter(Employee_Count >= mingroup)
## Get max value
max_point <- max(plot_data[[metric]]) * 1.2
plot_legend <-
plot_data %>%
group_by(group) %>%
summarize(Employee_Count = first(Employee_Count)) %>%
mutate(Employee_Count = paste("n=",Employee_Count))
plot_object <-
plot_data %>%
ggplot(aes(x = group, y = !!sym(metric))) +
geom_point(size = 1,
alpha = 1/5,
color = "#578DB8",
position = position_jitter(width = 0.1, height = 0.1)) +
ylim(0, max_point) +
annotate("text", x = plot_legend$group, y = 0, label = plot_legend$Employee_Count) +
scale_x_discrete(labels = scales::wrap_format(10)) +
theme_classic() +
theme(axis.text=element_text(size=12),
axis.text.x = element_text(angle = 30, hjust = 1),
plot.title = element_text(color="grey40", face="bold", size=18),
plot.subtitle = element_text(size=14),
legend.position = "top",
legend.justification = "right",
legend.title=element_text(size=14),
legend.text=element_text(size=14)) +
labs(title = clean_nm,
subtitle = paste("Distribution of",
tolower(clean_nm),
"by",
camel_clean(hrvar))) +
xlab(hrvar) +
ylab(paste("Average", clean_nm)) +
labs(caption = extract_date_range(data, return = "text"))
summary_table <-
plot_data %>%
select(group, tidyselect::all_of(metric)) %>%
group_by(group) %>%
summarise(mean = mean(!!sym(metric)),
median = median(!!sym(metric)),
sd = sd(!!sym(metric)),
min = min(!!sym(metric)),
max = max(!!sym(metric)),
range = max - min,
n = n())
if(return == "table"){
summary_table %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

R/create_line.R Normal file

@ -0,0 +1,140 @@
#' @title Time Trend - Line Chart (General Purpose)
#'
#' @description
#' Provides a week by week view of a selected metric, visualised as line charts.
#' By default returns a line chart for the defined metric,
#' with a separate panel per value in the HR attribute.
#' Additional options available to return a summary table.
#'
#' @details
#' This is a general purpose function that powers all the functions
#' in the package that produce faceted line plots.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param metric Character string containing the name of the metric,
#' e.g. "Collaboration_hours"
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom tidyselect all_of
#'
#' @family General
#'
#' @examples
#' \dontrun{
#' sq_data %>% create_line(metric = "Email_hours", return = "plot")
#'
#' sq_data %>% create_line(metric = "Collaboration_hours", return = "plot")
#'
#' sq_data %>% create_line(metric = "Workweek_span", hrvar = "LevelDesignation")
#' }
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export
create_line <- function(data,
metric,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
## Check inputs
required_variables <- c("Date",
metric,
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
## Clean metric name
clean_nm <- gsub(pattern = "_", replacement = " ", x = metric)
myTable <-
data %>%
mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
select(PersonId, Date, group, all_of(metric)) %>%
group_by(group) %>%
mutate(Employee_Count = n_distinct(PersonId)) %>%
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
myTable <-
myTable %>%
group_by(Date, group) %>%
summarize(Employee_Count = mean(Employee_Count),
!!sym(metric) := mean(!!sym(metric)))
## Data frame to return
myTable_return <-
myTable %>%
select(Date, group, all_of(metric)) %>%
spread(Date, !!sym(metric))
## Data frame for creating plot
myTable_plot <-
myTable %>%
select(Date, group, all_of(metric)) %>%
group_by(Date, group) %>%
summarise_at(vars(all_of(metric)), ~mean(., na.rm = TRUE)) %>%
ungroup()
plot_object <-
myTable_plot %>%
ggplot(aes(x = Date, y = !!sym(metric))) +
geom_line(colour = "grey40") +
facet_wrap(.~group) +
scale_fill_gradient(name="Hours", low = "white", high = "red") +
theme_classic() +
theme(plot.title = element_text(color = "grey40",
face = "bold",
size = 18),
plot.subtitle = element_text(size = 14),
strip.background = element_rect(color = "grey40",
fill = "grey40"),
strip.text = element_text(size = 10,
colour = "#FFFFFF",
face = "bold"),
axis.text = element_text(size = 8, face = "bold"),
axis.line = element_line(colour = "grey40"),
legend.position = "right",
legend.justification = "right",
legend.title=element_text(size = 10),
legend.text=element_text(size = 10)) +
labs(title = clean_nm,
subtitle = paste("Total",
tolower(clean_nm),
"by",
camel_clean(hrvar))) +
xlab("Date") +
ylab("Weekly hours") +
labs(caption = extract_date_range(data, return = "text"))
if(return == "table"){
myTable_return %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

R/create_period_scatter.R Normal file

@ -0,0 +1,185 @@
#' @title Period comparison scatter plot (General Purpose)
#'
#' @description
#' Returns two side-by-side scatter plots representing two selected metrics,
#' using colour to map an HR attribute and size to represent number of employees.
#' Returns a faceted scatter plot by default, with additional options
#' to return a summary table.
#'
#' @details
#' This is a general purpose function that powers all the functions
#' in the package that produce faceted scatter plots.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector,
#' defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
#' @param metric_x Character string containing the name of the metric,
#' e.g. "Collaboration_hours"
#' @param metric_y Character string containing the name of the metric,
#' e.g. "Collaboration_hours"
#' @param before_start Start date of "before" time period in YYYY-MM-DD
#' @param before_end End date of "before" time period in YYYY-MM-DD
#' @param after_start Start date of "after" time period in YYYY-MM-DD
#' @param after_end End date of "after" time period in YYYY-MM-DD
#' @param before_label String to specify a label for the "before" period. Defaults to "Period 1".
#' @param after_label String to specify a label for the "after" period. Defaults to "Period 2".
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
#' Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#'
#' @family General
#'
#' @examples
#' \dontrun{
#'
#' create_period_scatter(sq_data,
#'   hrvar = "LevelDesignation",
#'   before_start = "2019-11-03",
#'   before_end = "2019-12-31",
#'   after_start = "2020-01-01",
#'   after_end = "2020-01-26")
#'
#'
#' create_period_scatter(sq_data, before_end = "2019-12-31", return = "table")
#' }
#'
#' @export
create_period_scatter <- function(data,
hrvar = "Organization",
metric_x = "Multitasking_meeting_hours",
metric_y = "Meeting_hours",
before_start = min(as.Date(data$Date, "%m/%d/%Y")),
before_end,
after_start = as.Date(before_end) + 1,
after_end = max(as.Date(data$Date, "%m/%d/%Y")),
before_label = "Period 1",
after_label = "Period 2",
mingroup = 5,
return = "plot"){
## Check inputs
## Update these column names as per appropriate
required_variables <- c("Date",
hrvar,
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
daterange_1_start <- as.Date(before_start)
daterange_1_end <- as.Date(before_end)
daterange_2_start <- as.Date(after_start)
daterange_2_end <- as.Date(after_end)
# Fix dates format for WpA Queries
WpA_dataset <- data %>% mutate(Date = as.Date(Date, "%m/%d/%Y"))
# Check for dates in data file
if (daterange_1_start < min(WpA_dataset$Date) |
daterange_1_start > max(WpA_dataset$Date) |
daterange_1_end < min(WpA_dataset$Date) |
daterange_1_end > max(WpA_dataset$Date) |
daterange_2_start < min(WpA_dataset$Date) |
daterange_2_start > max(WpA_dataset$Date) |
daterange_2_end < min(WpA_dataset$Date) |
daterange_2_end > max(WpA_dataset$Date)) {
stop("Dates not found in dataset")
}
## Employee count
emp_count <-
WpA_dataset %>%
group_by(!!sym(hrvar)) %>%
summarise(n = n_distinct(PersonId))
data_p1 <-
WpA_dataset %>%
rename(group = hrvar) %>%
filter(between(Date, daterange_1_start, daterange_1_end)) %>%
group_by(PersonId, group) %>%
summarise_at(vars(!!sym(metric_x), !!sym(metric_y)), ~mean(.)) %>%
ungroup() %>%
group_by(group) %>%
summarise_at(vars(!!sym(metric_x), !!sym(metric_y)), ~mean(., na.rm = TRUE)) %>%
mutate(Period = before_label) %>%
left_join(emp_count, by = c(group = hrvar)) %>%
filter(n >= mingroup)
data_p2 <-
WpA_dataset %>%
rename(group = hrvar) %>%
filter(between(Date, daterange_2_start, daterange_2_end)) %>%
group_by(PersonId, group) %>%
summarise_at(vars(!!sym(metric_x), !!sym(metric_y)), ~mean(.)) %>%
ungroup() %>%
group_by(group) %>%
summarise_at(vars(!!sym(metric_x), !!sym(metric_y)), ~mean(., na.rm = TRUE)) %>%
mutate(Period = after_label) %>%
left_join(emp_count, by = c(group = hrvar)) %>%
filter(n >= mingroup)
## bind data
data_both <- rbind(data_p1, data_p2)
date_range_str <-
paste("Data from",
daterange_1_start,
"to",
daterange_1_end,
"and",
daterange_2_start,
"to",
daterange_2_end)
clean_x <- gsub(pattern = "_", replacement = " ", x = metric_x)
clean_y <- gsub(pattern = "_", replacement = " ", x = metric_y)
plot_title <-
paste(clean_x, "and", clean_y,
"over", before_label,
"and", after_label)
plot_object <-
data_both %>%
ggplot(aes(x = !!sym(metric_x),
y = !!sym(metric_y),
colour = group,
size = n)) +
geom_point(alpha = 0.5) +
scale_size(range = c(1, 20)) +
facet_wrap(.~Period) +
guides(size = FALSE) +
theme(legend.position = "bottom") +
ggtitle(plot_title,
subtitle = paste("Weekly data by", hrvar)) +
ylab(clean_y) +
xlab(clean_x) +
labs(caption = date_range_str) +
wpa::theme_wpa_basic()
if(return == "table"){
# return(myTable_return)
return(data_both)
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

R/create_rank.R Normal file

@ -0,0 +1,70 @@
#' @title Create Ranking
#'
#' @description
#' This function scans a standard query output for groups with high levels of a given Workplace Analytics Metric.
#' Returns a table with all groups (across multiple HR attributes) ranked by the specified metric.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param metric Character string containing the name of the metric,
#' e.g. "Collaboration_hours"
#' @param hrvar A list of HR Variables to consider in the scan.
#' Defaults to all HR attributes identified.
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
#' Defaults to 5.
#' @param return A character vector specifying what to return.
#' Valid values include "table" (default) and "df" (data frame)
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats reorder
#'
#' @family General
#'
#' @return
#' When 'table' (default) is passed in `return`, a data frame is returned with
#' one row per group across all the HR attributes scanned, ranked by the specified metric.
#'
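#' @examples
#' \dontrun{
#' # Illustrative usage: rank all identified HR attributes by average
#' # collaboration hours, using the built-in `sq_data` sample dataset
#' create_rank(sq_data, metric = "Collaboration_hours", return = "table")
#' }
#'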
#' @export
create_rank <- function(data,
metric,
hrvar = extract_hr(data),
mingroup = 5,
return = "table"){
results <-
data %>% create_bar(
metric = metric,
hrvar = hrvar[1],
mingroup = mingroup,
return = return, bar_colour = "default")
results$hrvar <- ""
results <- results[0,]
for (p in hrvar) {
table1 <-
data %>%
create_bar(metric = metric,
hrvar = p,
mingroup = mingroup,
return = "table", bar_colour = "default")
table1$hrvar <- p
results <- rbind(results,table1)
}
output <-
results %>% arrange(desc(get(metric))) %>%
select(hrvar, everything())
if(return == "table"){
return(output)
} else {
stop("Invalid `return` argument.")
}
}

R/create_scatter.R Normal file

@ -0,0 +1,118 @@
#' @title Scatter plot (General Purpose)
#'
#' @description
#' Returns a scatter plot of two selected metrics, using colour to map
#' an HR attribute.
#' Returns a scatter plot by default, with additional options
#' to return a summary table.
#'
#' @details
#' This is a general purpose function that powers all the functions
#' in the package that produce scatter plots.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param metric_x Character string containing the name of the metric,
#' e.g. "Collaboration_hours"
#' @param metric_y Character string containing the name of the metric,
#' e.g. "Collaboration_hours"
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import scales
#'
#' @family General
#'
#' @examples
#' create_scatter(sq_data,
#' "Internal_network_size",
#' "External_network_size",
#' "Organization")
#'
#' create_scatter(sq_data,
#' "Generated_workload_call_hours",
#' "Generated_workload_email_hours",
#' "Organization", mingroup = 100, return = "plot")
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export
create_scatter <- function(data,
metric_x = "Internal_network_size",
metric_y = "External_network_size",
hrvar = "Organization",
mingroup = 5,
return = "plot"){
## Check inputs
required_variables <- c(hrvar,
metric_x,
metric_y,
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
## Extract values violating privacy threshold
violate_thres_chr <-
data %>%
group_by(!!sym(hrvar)) %>%
summarise(n = n_distinct(PersonId)) %>%
filter(n < mingroup) %>%
pull(!!sym(hrvar))
clean_x <- gsub(pattern = "_", replacement = " ", x = metric_x)
clean_y <- gsub(pattern = "_", replacement = " ", x = metric_y)
myTable <-
data %>%
filter(!(!!sym(hrvar) %in% violate_thres_chr)) %>%
group_by(PersonId, !!sym(hrvar)) %>%
summarise_at(vars(!!sym(metric_x),
!!sym(metric_y)),
~mean(.)) %>%
ungroup()
plot_object <-
myTable %>%
ggplot(aes(x = !!sym(metric_x),
y = !!sym(metric_y),
colour = !!sym(hrvar))) +
geom_point(alpha = 0.5) +
labs(title = paste0(clean_x, " and\n", clean_y, " by ", camel_clean(hrvar)),
caption = extract_date_range(data, return = "text")) +
xlab(clean_x) +
ylab(clean_y) +
theme_wpa_basic()
myTable_return <-
myTable %>%
group_by(!!sym(hrvar)) %>%
summarise_at(vars(!!sym(metric_x),
!!sym(metric_y)),
~mean(.))
if(return == "table"){
return(myTable_return)
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

R/create_stacked.R Normal file

@ -0,0 +1,174 @@
#' @title Horizontal stacked bar plot (General Purpose)
#'
#' @description
#' Creates a sum total calculation using selected metrics,
#' where the typical use case is to create different definitions
#' of collaboration hours.
#' Returns a stacked bar plot by default.
#' Additional options available to return a summary table.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param metrics A character vector to specify variables to be used
#' in calculating the "Total" value, e.g. c("Meeting_hours", "Email_hours").
#' The order of the variable names supplied determine the order in which they
#' appear on the stacked plot.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#' @param stack_colours
#' A character vector to specify the colour codes for the stacked bar charts.
#' @param plot_title An option to override plot title.
#' @param plot_subtitle An option to override plot subtitle.
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats reorder
#'
#' @family General
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @examples
#' \dontrun{
#' sq_data %>%
#' create_stacked(hrvar = "LevelDesignation",
#' metrics = c("Meeting_hours", "Email_hours"),
#' return = "plot")
#'
#' sq_data %>%
#' create_stacked(hrvar = "FunctionType",
#' metrics = c("Meeting_hours",
#' "Email_hours",
#' "Call_hours",
#' "Instant_Message_hours"),
#' return = "plot")
#'
#' sq_data %>%
#' create_stacked(hrvar = "FunctionType",
#' metrics = c("Meeting_hours",
#' "Email_hours",
#' "Call_hours",
#' "Instant_Message_hours"),
#' return = "table")
#'}
#' @export
create_stacked <- function(data,
hrvar = "Organization",
metrics = c("Meeting_hours",
"Email_hours"),
mingroup = 5,
return = "plot",
stack_colours = c("#203864",
"#5191DD",
"darkgrey",
"lightgrey"),
plot_title = "Collaboration Hours",
plot_subtitle = "Weekly collaboration hours"){
## Check inputs
required_variables <- c("Date",
metrics,
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
n_count <-
data %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
group_by(group) %>%
summarise(Employee_Count = n_distinct(PersonId))
## Person level table
myTable <-
data %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
select(PersonId, group, metrics) %>%
group_by(PersonId, group) %>%
summarise_at(vars(metrics), ~mean(.)) %>%
ungroup() %>%
mutate(Total = select(., metrics) %>% apply(1, sum)) %>%
left_join(n_count, by = "group") %>%
# Keep only groups above privacy threshold
filter(Employee_Count >= mingroup)
myTableReturn <-
myTable %>%
group_by(group) %>%
summarise_at(vars(metrics, Total), ~mean(.)) %>%
left_join(n_count, by = "group")
plot_table <-
myTable %>%
select(PersonId, group, metrics, Total) %>%
gather(Metric, Value, -PersonId, -group)
totalTable <-
plot_table %>%
filter(Metric == "Total") %>%
group_by(group) %>%
summarise(Total = mean(Value))
myTable_legends <-
n_count %>%
filter(Employee_Count >= mingroup) %>%
mutate(Employee_Count = paste("n=",Employee_Count)) %>%
left_join(totalTable, by = "group")
## Get maximum value
location <- max(myTable_legends$Total)
plot_object <-
plot_table %>%
filter(Metric != "Total") %>%
mutate(Metric = factor(Metric, levels = rev(metrics))) %>%
group_by(group, Metric) %>%
summarise_at(vars(Value), ~mean(.)) %>%
ggplot(aes(x = stats::reorder(group, Value, mean), y = Value, fill = Metric)) +
geom_bar(position = "stack", stat = "identity") +
geom_text(aes(label = round(Value, 1)),
position = position_stack(vjust = 0.5),
color = "#FFFFFF",
fontface = "bold") +
scale_y_continuous(limits = c(0, location * 1.25)) +
annotate("text",
x = myTable_legends$group,
y = location * 1.15,
label = myTable_legends$Employee_Count) +
annotate("rect", xmin = 0.5, xmax = length(myTable_legends$group) + 0.5, ymin = location * 1.05, ymax = location * 1.25, alpha = .2) +
scale_fill_manual(name="",
values = stack_colours,
breaks = metrics,
labels = gsub("_", " ", metrics)) +
coord_flip() +
theme_wpa_basic() +
labs(title = plot_title,
subtitle = paste(plot_subtitle, "by", camel_clean(hrvar)),
caption = extract_date_range(data, return = "text")) +
xlab(hrvar) +
ylab("Average weekly hours")
if(return == "table"){
myTableReturn
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

R/cut_hour.R Normal file

@ -0,0 +1,43 @@
#' @title Convert a numeric variable for hours into categorical
#'
#' @description
#' Supply a numeric variable, e.g. `Collaboration_hours`,
#' and the function returns a categorical (factor) variable, with the hours
#' binned according to the supplied cut points.
#'
#' @param metric A numeric variable representing hours.
#' @param cuts A numeric variable of length 3 to represent the
#' cut points required.
#'
#' @family General
#'
#' @examples
#' ## Direct use
#' cut_hour(1:30, cuts = c(15, 20, 25))
#'
#' ## Use on a query
#' cut_hour(sq_data$Collaboration_hours, cuts = c(10, 15, 20))
#'
#' @export
cut_hour <- function(metric, cuts){
label1 <- paste0("< ", cuts[1], " hours")
label2 <- paste0(cuts[1], " - ", cuts[2], " hours")
label3 <- paste0(cuts[2], " - ", cuts[3], " hours")
label4 <- paste0(cuts[3], "+ hours")
out <-
cut(metric,
breaks = c(0, cuts, 100),
include.lowest = TRUE,
labels = c(label1,
label2,
label3,
label4))
# out <- as.character(out)
return(out)
}

R/dv_data.R Normal file

@ -0,0 +1,81 @@
#' @title Sample Standard Query dataset for Data Validation
#'
#' @description
#' A dataset generated from a Standard Query from WpA.
#'
#' @format A data frame with 897 rows and 69 variables:
#' \describe{
#' \item{PersonId}{ }
#' \item{Date}{ }
#' \item{Workweek_span}{ }
#' \item{Meetings_with_skip_level}{ }
#' \item{Meeting_hours_with_skip_level}{ }
#' \item{Generated_workload_email_hours}{ }
#' \item{Generated_workload_email_recipients}{ }
#' \item{Generated_workload_instant_messages_hours}{ }
#' \item{Generated_workload_instant_messages_recipients}{ }
#' \item{Generated_workload_call_hours}{ }
#' \item{Generated_workload_call_participants}{ }
#' \item{Generated_workload_calls_organized}{ }
#' \item{External_network_size}{ }
#' \item{Internal_network_size}{ }
#' \item{Networking_outside_company}{ }
#' \item{Networking_outside_organization}{ }
#' \item{After_hours_meeting_hours}{ }
#' \item{Open_1_hour_block}{ }
#' \item{Open_2_hour_blocks}{ }
#' \item{Total_focus_hours}{ }
#' \item{Low_quality_meeting_hours}{ }
#' \item{Total_emails_sent_during_meeting}{ }
#' \item{Meetings}{ }
#' \item{Meeting_hours}{ }
#' \item{Conflicting_meeting_hours}{ }
#' \item{Multitasking_meeting_hours}{ }
#' \item{Redundant_meeting_hours__lower_level_}{ }
#' \item{Redundant_meeting_hours__organizational_}{ }
#' \item{Time_in_self_organized_meetings}{ }
#' \item{Meeting_hours_during_working_hours}{ }
#' \item{Generated_workload_meeting_attendees}{ }
#' \item{Generated_workload_meeting_hours}{ }
#' \item{Generated_workload_meetings_organized}{ }
#' \item{Manager_coaching_hours_1_on_1}{ }
#' \item{Meetings_with_manager}{ }
#' \item{Meeting_hours_with_manager}{ }
#' \item{Meetings_with_manager_1_on_1}{ }
#' \item{Meeting_hours_with_manager_1_on_1}{ }
#' \item{After_hours_email_hours}{ }
#' \item{Emails_sent}{ }
#' \item{Email_hours}{ }
#' \item{Working_hours_email_hours}{ }
#' \item{After_hours_instant_messages}{ }
#' \item{Instant_messages_sent}{ }
#' \item{Instant_Message_hours}{ }
#' \item{Working_hours_instant_messages}{ }
#' \item{After_hours_collaboration_hours}{ }
#' \item{Collaboration_hours}{ }
#' \item{Collaboration_hours_external}{ }
#' \item{Working_hours_collaboration_hours}{ }
#' \item{After_hours_in_calls}{ }
#' \item{Total_calls}{ }
#' \item{Call_hours}{ }
#' \item{Working_hours_in_calls}{ }
#' \item{Domain}{ }
#' \item{FunctionType}{ }
#' \item{LevelDesignation}{ }
#' \item{Layer}{ }
#' \item{Region}{ }
#' \item{Organization}{ }
#' \item{zId}{ }
#' \item{attainment}{ }
#' \item{TimeZone}{ }
#' \item{HourlyRate}{ }
#' \item{IsInternal}{ }
#' \item{IsActive}{ }
#' \item{HireDate}{ }
#' \item{WorkingStartTimeSetInOutlook}{ }
#' \item{WorkingEndTimeSetInOutlook}{ }
#'
#' ...
#' }
#' @source \url{https://workplaceanalytics-demo.office.com/en-us/Home}
"dv_data"

R/em_data.R Normal file

@ -0,0 +1,71 @@
#' @title Sample Hourly Collaboration data with Email and IMs
#'
#' @description
#' A sample dataset generated from an Hourly Collaboration query
#' from the WpA demo tenant. The data is grouped by week and contains
#' columns for both IMs sent and Emails sent.
#'
#' @format A data frame with 41567 rows and 56 variables:
#' \describe{
#' \item{PersonId}{ }
#' \item{Date}{ }
#' \item{IMs_sent_23_24}{ }
#' \item{IMs_sent_22_23}{ }
#' \item{IMs_sent_21_22}{ }
#' \item{IMs_sent_20_21}{ }
#' \item{IMs_sent_19_20}{ }
#' \item{IMs_sent_18_19}{ }
#' \item{IMs_sent_17_18}{ }
#' \item{IMs_sent_16_17}{ }
#' \item{IMs_sent_15_16}{ }
#' \item{IMs_sent_14_15}{ }
#' \item{IMs_sent_13_14}{ }
#' \item{IMs_sent_12_13}{ }
#' \item{IMs_sent_11_12}{ }
#' \item{IMs_sent_10_11}{ }
#' \item{IMs_sent_09_10}{ }
#' \item{IMs_sent_08_09}{ }
#' \item{IMs_sent_07_08}{ }
#' \item{IMs_sent_06_07}{ }
#' \item{IMs_sent_05_06}{ }
#' \item{IMs_sent_04_05}{ }
#' \item{IMs_sent_03_04}{ }
#' \item{IMs_sent_02_03}{ }
#' \item{IMs_sent_01_02}{ }
#' \item{IMs_sent_00_01}{ }
#' \item{Emails_sent_23_24}{ }
#' \item{Emails_sent_22_23}{ }
#' \item{Emails_sent_21_22}{ }
#' \item{Emails_sent_20_21}{ }
#' \item{Emails_sent_19_20}{ }
#' \item{Emails_sent_18_19}{ }
#' \item{Emails_sent_17_18}{ }
#' \item{Emails_sent_16_17}{ }
#' \item{Emails_sent_15_16}{ }
#' \item{Emails_sent_14_15}{ }
#' \item{Emails_sent_13_14}{ }
#' \item{Emails_sent_12_13}{ }
#' \item{Emails_sent_11_12}{ }
#' \item{Emails_sent_10_11}{ }
#' \item{Emails_sent_09_10}{ }
#' \item{Emails_sent_08_09}{ }
#' \item{Emails_sent_07_08}{ }
#' \item{Emails_sent_06_07}{ }
#' \item{Emails_sent_05_06}{ }
#' \item{Emails_sent_04_05}{ }
#' \item{Emails_sent_03_04}{ }
#' \item{Emails_sent_02_03}{ }
#' \item{Emails_sent_01_02}{ }
#' \item{Emails_sent_00_01}{ }
#' \item{Domain}{ }
#' \item{LevelDesignation}{ }
#' \item{Layer}{ }
#' \item{Organization}{ }
#' \item{IsInternal}{ }
#' \item{IsActive}{ }
#'
#' ...
#' }
#' @source \url{https://workplaceanalytics-demo.office.com/en-us/Home}
"em_data"

R/email_dist.R Normal file

@ -0,0 +1,37 @@
#' @title Email Hours distribution
#'
#' @description
#' Analyze Email Hours distribution.
#' Returns a stacked bar plot by default.
#' Additional options available to return a table with distribution elements.
#'
#' @inheritParams create_dist
#'
#' @family Emails
#'
#' @examples
#' ## Return a plot
#' email_dist(sq_data, hrvar = "Organization")
#'
#' ## Return a table
#' email_dist(sq_data, hrvar = "Organization", return = "table")
#'
#' ## Return result with custom-specified breaks
#' email_dist(sq_data, hrvar = "LevelDesignation", cut = c(4, 7, 9))
#'
#' @export
email_dist <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot",
cut = c(5, 10, 15)) {
create_dist(data = data,
metric = "Email_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return,
cut = cut)
}

27
R/email_fizz.R Normal file

@ -0,0 +1,27 @@
#' @title Distribution of Email Hours (Fizzy Drink plot)
#'
#' @description
#' Analyze weekly email hours distribution, and returns
#' a 'fizzy' scatter plot by default.
#' Additional options available to return a table with distribution elements.
#'
#' @inheritParams create_fizz
#'
#' @family Emails
#'
#' @examples
#' email_fizz(sq_data, hrvar = "Organization", return = "table")
#' @export
email_fizz <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
create_fizz(data = data,
metric = "Email_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return)
}

37
R/email_line.R Normal file

@ -0,0 +1,37 @@
#' @title Email Time Trend - Line Chart
#'
#' @description
#' Provides a week by week view of email time, visualised as line charts.
#' By default returns a line chart for email hours,
#' with a separate panel per value in the HR attribute.
#' Additional options available to return a summary table.
#'
#' @inheritParams create_line
#'
#' @family Emails
#'
#' @examples
#'
#' ## Return a line plot
#' email_line(sq_data, hrvar = "LevelDesignation")
#'
#'
#' ## Return a table
#' email_line(sq_data, hrvar = "LevelDesignation", return = "table")
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export
email_line <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
## Inherit arguments
create_line(data = data,
metric = "Email_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return)
}

47
R/email_rank.R Normal file

@ -0,0 +1,47 @@
#' @title Email Hours Ranking
#'
#' @description
#' This function scans a standard query output for groups with high levels of 'Weekly Email Collaboration'.
#' Returns a table with all groups (across multiple HR attributes) ranked by average weekly email hours.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar A list of HR Variables to consider in the scan.
#' Defaults to all HR attributes identified.
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
#' Defaults to 5.
#' @param return A character vector specifying what to return.
#' Valid values include "table" (default, returning a data frame)
#' and "html" (returning an interactive DataTable).
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats reorder
#'
#' @family Emails
#'
#' @return
#' Returns a summary table (data frame) by default, where 'table' is passed in `return`.
#' When 'html' is passed, an interactive DataTable is returned.
#'
#' @export
email_rank <- function(data,
hrvar = extract_hr(data),
mingroup = 5,
return = "table"){
output <-
data %>% create_rank(metric="Email_hours", hrvar = hrvar,
mingroup = mingroup,
return = "table")
if(return == "html"){
return(create_dt(output))
} else if(return == "table"){
return(output)
} else {
stop("Invalid `return` argument.")
}
}
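For illustration only (not part of the original file): a minimal sketch of the two return modes, assuming the installed {wpa} package and its bundled `sq_data` demo dataset.

```R
library(wpa)

# Rank groups across all detected HR attributes by average weekly email hours
email_rank(sq_data, return = "table")

# Render the same ranking as an interactive DataTable (opens in the viewer/browser)
email_rank(sq_data, return = "html")
```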

45
R/email_summary.R Normal file

@ -0,0 +1,45 @@
#' @title Email Summary
#'
#' @description
#' Provides an overview analysis of weekly email hours.
#' Returns a bar plot showing average weekly email hours by default.
#' Additional options available to return a summary table.
#'
#' @inheritParams create_bar
#'
#'
#' @family Emails
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @examples
#' # Return a ggplot bar chart
#' email_summary(sq_data, hrvar = "LevelDesignation")
#'
#' # Return a summary table
#' email_summary(sq_data, hrvar = "LevelDesignation", return = "table")
#' @export
email_summary <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
create_bar(data = data,
metric = "Email_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return,
bar_colour = "darkblue")
}

99
R/email_trend.R Normal file

@ -0,0 +1,99 @@
#' Email Hours Time Trend
#'
#' Provides a week by week view of email time.
#' By default returns a week by week heatmap, highlighting the points in time with most activity.
#' Additional options available to return a summary table.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#'
#' @family Emails
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export
email_trend <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
## Check inputs
required_variables <- c("Date",
"Email_hours",
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
## Date range data frame
myPeriod <- extract_date_range(data)
myTable <-
data %>%
mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
select(PersonId, Date, group, Email_hours) %>%
group_by(group) %>%
mutate(Employee_Count = n_distinct(PersonId)) %>%
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
myTable <-
myTable %>%
group_by(Date, group) %>%
summarize(Employee_Count = mean(Employee_Count),
Email_hours = mean(Email_hours))
myTable_plot <- myTable %>% select(Date, group, Email_hours)
myTable_return <- myTable_plot %>% spread(Date, Email_hours)
plot_object <-
myTable_plot %>%
ggplot(aes(x = Date , y = group , fill = Email_hours)) +
geom_tile(height=.5) +
scale_fill_gradient(name="Hours", low = "white", high = "red") +
theme_classic() +
theme(axis.text=element_text(size=12),
plot.title = element_text(color="grey40", face="bold", size=18),
plot.subtitle = element_text(size=14),
legend.position = "right",
legend.justification = "right",
legend.title=element_text(size=14),
legend.text=element_text(size=14)) +
labs(title = "Email Hours",
subtitle = paste("Total email time by", tolower(hrvar))) +
xlab("Date") +
ylab(hrvar) +
labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))
if(return == "table"){
myTable_return %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}
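A minimal usage sketch (illustrative, not from the source), assuming the bundled `sq_data` demo dataset:

```R
library(wpa)

# Week-by-week heatmap of average email hours, split by level designation
email_trend(sq_data, hrvar = "LevelDesignation")

# The same averages in wide format, one column per week
email_trend(sq_data, hrvar = "LevelDesignation", return = "table")
```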

76
R/export.R Normal file

@ -0,0 +1,76 @@
#' @title Export {wpa} outputs to CSV, clipboard, or save as images
#'
#' @description
#' A general use function to export {wpa} outputs to CSV,
#' clipboard, or save as images. By default, `export()` copies
#' a data frame to the clipboard. If the input is a ggplot object,
#' the default behaviour is to export a PNG.
#'
#' @param x Data frame or ggplot object to be passed through.
#' @param method Character string specifying the method of export.
#' Valid inputs include "clipboard" (default), "csv", "png", and "svg".
#' @param path If exporting a file, enter the path and the desired file name, _excluding the file extension_.
#' For example, "Analysis/SQ Overview".
#' @param timestamp Logical value specifying whether to include a timestamp in the file name. Defaults to TRUE.
#' @param width Width of the exported plot in inches.
#' @param height Height of the exported plot in inches.
#'
#' @importFrom utils write.csv
#'
#' @export
export <- function(x,
method = "clipboard",
path = "wpa export",
timestamp = TRUE,
width = 12,
height = 9){
## Create timestamped path (if applicable)
if(timestamp == TRUE){
newpath <- paste(path, wpa::tstamp())
} else {
newpath <- path
}
## Force method to png if is.ggplot and method not appropriate
if(is.ggplot(x) & method %in% c("clipboard", "csv")){
message("Input is a ggplot object. Defaulted to exporting as PNG...")
method <- "png"
}
## Main export function
if(method == "clipboard"){
copy_df(x)
message(c("Data frame copied to clipboard.\n",
"You may paste the contents directly to Excel."))
## Export option: CSV
} else if(method == "csv"){
newpath <- paste0(newpath, ".csv")
write.csv(x = x, file = newpath)
## Export option: PNG
} else if(method == "png"){
newpath <- paste0(newpath, ".png")
ggsave(filename = newpath, plot = x, width = width, height = height)
## Export option: SVG
} else if(method == "svg"){
newpath <- paste0(newpath, ".svg")
ggsave(filename = newpath, plot = x, width = width, height = height)
} else {
stop("Please check inputs. Enter `?export` for more details.")
}
}
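As a rough sketch of the export routes described above (the file names below are hypothetical; files are written to the working directory), assuming the bundled `sq_data` demo dataset:

```R
library(wpa)

p <- email_summary(sq_data, hrvar = "Organization")          # a ggplot object
export(p)                                                    # ggplot input: falls back to PNG
export(p, method = "svg", path = "email_summary", timestamp = FALSE)

tb <- email_summary(sq_data, hrvar = "Organization", return = "table")
export(tb)                                                   # data frame: copied to clipboard
export(tb, method = "csv", path = "email_summary")
```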

111
R/external_network_plot.R Normal file

@ -0,0 +1,111 @@
#' @title Plot the external network metrics for a HR variable
#'
#' @description
#' Plot the external network metrics for a HR variable.
#'
#' @param data Person Query as a data frame, including a date column named "Date".
#' This function assumes the date format is MM/DD/YYYY, as is standard in a WpA query output.
#' @param hrvar WpA variable for an HR attribute to group networks by,
#' for example "Layer".
#' @param mingroup Numeric vector for minimum group size for aggregation
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#' @param bubble_size A numeric vector of length two to specify the size range of the bubbles
#'
#' @import dplyr
#' @import reshape2
#' @import ggplot2
#' @import ggrepel
#'
#' @examples
#' \dontrun{
#' external_network_plot(sq_data)
#' }
#'
#' @family Connectivity
#'
#' @export
external_network_plot <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot",
bubble_size = c(1, 5)){
plot_data <-
data %>%
rename(group = !!sym(hrvar))
plot_data <-
plot_data %>%
group_by(group) %>%
summarize(Ext_network_size = mean(External_network_size),
Ext_network_breadth = mean(Networking_outside_company),
Employee_count = n_distinct(PersonId)
) %>%
filter(Employee_count >= mingroup)
summary_table <- plot_data %>% arrange(., desc(Ext_network_size))
ext_network_plot <- ggplot(plot_data,
aes(x=Ext_network_size,
y=Ext_network_breadth)
) +
geom_point(aes(size=Employee_count),
color = rgb2hex(0, 120, 212),
alpha = 0.5) +
geom_text_repel(label=plot_data$group) +
scale_x_continuous(name = "External Network Size") +
scale_y_continuous(name = "External Network Breadth") +
scale_size(range = bubble_size) +
theme_classic() +
theme(
axis.text = element_text(size = 10),
axis.text.x = element_text(
angle = 90,
hjust = 1,
vjust = 0.5
),
plot.title = element_text(
color = "grey40",
face = "bold",
size = 18
),
plot.subtitle = element_text(size = 14)
) +
labs(
size = "Size"
) +
labs(
title = paste("External network metrics by", hrvar),
subtitle = paste(
"Network size is number of people, breadth is number of organizations"
)
) +
labs(
caption = paste(
"Total employees =",
sum(plot_data$Employee_count),
"| Data from",
min(as.Date(data$Date, "%m/%d/%Y")),
"to",
max(as.Date(data$Date, "%m/%d/%Y"))
)
)
if(return == "table"){
summary_table %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(ext_network_plot)
} else {
stop("Please enter a valid input for `return`.")
}
}

48
R/extract_hr.R Normal file

@ -0,0 +1,48 @@
#' @title Extract HR attribute variables
#'
#' @description
#' This function uses a combination of variable class and
#' number of unique values to extract HR / organisational
#' attributes from a data frame, excluding date columns and Outlook working-time settings.
#'
#' @param data A data frame to be passed through.
#' @param max_unique A numeric value representing the maximum
#' number of unique values to accept for an HR attribute. Defaults to 50.
#' @param return Character string with values "names" (default)
#' or "vars" to determine whether the function returns a character
#' vector of variable names or a data frame containing the HR
#' attributes.
#'
#' @family General
#'
#' @examples
#' sq_data %>% extract_hr(return = "names")
#'
#' sq_data %>% extract_hr(return = "vars")
#'
#' @export
extract_hr <- function(data,
max_unique = 50,
return = "names"){
hr_var <-
data %>%
dplyr::select_if(~(is.character(.) | is.logical(.) | is.factor(.))) %>%
dplyr::select_if(~(dplyr::n_distinct(.) < max_unique)) %>%
dplyr::select_if(~!all(is_date_format(.))) %>%
names() %>%
.[.!= "WorkingStartTimeSetInOutlook"] %>%
.[.!= "WorkingEndTimeSetInOutlook"] %>%
.[.!= "WorkingDaysSetInOutlook"]
if(return == "names"){
return(hr_var)
} else if(return == "vars"){
return(dplyr::select(data, tidyselect::all_of(hr_var)))
} else {
stop("Invalid input for return")
}
}

80
R/flag_ch_ratio.R Normal file

@ -0,0 +1,80 @@
#' @title Flag unusual high collaboration hours to after-hours collaboration hours ratio
#'
#' @description This function flags persons who have an unusual ratio
#' of collaboration hours to after-hours collaboration hours.
#' Returns a character string by default.
#'
#' @import dplyr
#'
#' @param data A data frame containing a Person Query.
#' @param threshold Numeric vector of length two specifying the lower and upper thresholds for flagging. Defaults to c(1, 30).
#' @param return String to specify what to return. Options include "message" for console return, and "text" for string return.
#'
#' @family Data Validation
#'
#' @examples
#' flag_ch_ratio(sq_data)
#'
#' \dontrun{
#' tibble(PersonId = c("Alice", "Bob"),
#' Collaboration_hours = c(30, 0.5),
#' After_hours_collaboration_hours = c(0.5, 30)) %>%
#' flag_ch_ratio()
#' }
#'
#' @export
flag_ch_ratio <- function(data, threshold = c(1, 30), return = "message"){
min_thres <- min(threshold, na.rm = TRUE)
max_thres <- max(threshold, na.rm = TRUE)
## Check for high collab hours but lower afterhour collab hours
## Because of faulty outlook settings
ch_summary <-
data %>%
group_by(PersonId) %>%
summarise_at(vars(Collaboration_hours, After_hours_collaboration_hours), ~mean(.)) %>%
mutate(CH_ratio = Collaboration_hours / After_hours_collaboration_hours) %>%
arrange(desc(CH_ratio)) %>%
mutate(CH_FlagLow = ifelse(CH_ratio < min_thres, TRUE, FALSE),
CH_FlagHigh = ifelse(CH_ratio > max_thres, TRUE, FALSE),
CH_Flag = ifelse(CH_ratio > max_thres | CH_ratio < min_thres, TRUE, FALSE))
## Percent of people with high collab hours + low afterhour collab hours
CHFlagN <- sum(ch_summary$CH_Flag, na.rm = TRUE)
CHFlagProp <- mean(ch_summary$CH_Flag, na.rm = TRUE)
CHFlagProp2 <- paste(round(CHFlagProp * 100), "%") # Formatted
CHFlagMessage_Warning <- paste0("[Warning] The ratio of after-hours collaboration to total collaboration hours is outside the expected threshold for ", CHFlagN, " employees (", CHFlagProp2, " of the total).")
CHFlagMessage_Pass_Low <- paste0("[Pass] The ratio of after-hours collaboration to total collaboration hours is outside the expected threshold for only ", CHFlagN, " employees (", CHFlagProp2, " of the total).")
CHFlagMessage_Pass_Zero <- paste0("[Pass] The ratio of after-hours collaboration to total collaboration hours falls within the expected threshold for all employees.")
CHFlagLowN <- sum(ch_summary$CH_FlagLow, na.rm = TRUE)
CHFlagLowProp <- mean(ch_summary$CH_FlagLow, na.rm = TRUE)
CHFlagLowProp2 <- paste(round(CHFlagLowProp * 100), "%") # Formatted
CHFlagLowMessage <- paste0("- ", CHFlagLowN, " employees (", CHFlagLowProp2,
") have an unusually low after-hours collaboration")
CHFlagHighN <- sum(ch_summary$CH_FlagHigh, na.rm = TRUE)
CHFlagHighProp <- mean(ch_summary$CH_FlagHigh, na.rm = TRUE)
CHFlagHighProp2 <- paste(round(CHFlagHighProp * 100), "%") # Formatted
CHFlagHighMessage <- paste0("- ", CHFlagHighN, " employees (", CHFlagHighProp2 , ") have an unusually high after-hours collaboration (relative to weekly collaboration hours)")
if(CHFlagProp >= .05){
CHFlagMessage <- paste(CHFlagMessage_Warning, CHFlagHighMessage, CHFlagLowMessage, sep = "\n")
} else if(CHFlagProp < .05 & CHFlagProp > 0){
CHFlagMessage <- paste(CHFlagMessage_Pass_Low, CHFlagHighMessage, CHFlagLowMessage, sep = "\n")
} else if(CHFlagProp==0){
CHFlagMessage <- CHFlagMessage_Pass_Zero
}
## Print diagnosis
## Should implement options to return the PersonIds or a full data frame
if(return == "message"){
message(CHFlagMessage)
} else if(return == "text"){
CHFlagMessage
}
}

43
R/flag_em_ratio.R Normal file

@ -0,0 +1,43 @@
#' @title Flag unusual high Email hours to emails sent ratio
#'
#' @description This function flags persons who have an unusual ratio
#' of email hours to emails sent.
#' Returns a summary table by default.
#'
#' @import dplyr
#'
#' @family Data Validation
#'
#' @param data A data frame containing a Person Query.
#' @param threshold Numeric value specifying the threshold for flagging. Defaults to 1.
#'
#' @examples
#' flag_em_ratio(sq_data)
#'
#' @export
flag_em_ratio <- function(data, threshold = 1){
## Check for high collab hours but lower afterhour collab hours
## Because of faulty outlook settings
em_summary <-
data %>%
group_by(PersonId) %>%
summarise_at(vars(Email_hours, Emails_sent), ~mean(.)) %>%
mutate(Email_ratio = Email_hours / Emails_sent) %>%
arrange(desc(Email_ratio)) %>%
mutate(Email_Flag = ifelse(Email_ratio > threshold, TRUE, FALSE))
## Percent of people with high collab hours + low afterhour collab hours
EmailFlagN <- sum(em_summary$Email_Flag, na.rm = TRUE)
EmailFlagProp <- mean(em_summary$Email_Flag, na.rm = TRUE)
EmailFlagProp2 <- paste(round(EmailFlagProp * 100), "%") # Formatted
EmailFlagMessage <- paste0(EmailFlagProp2, " (", EmailFlagN, ") ",
"of the population have an unusually high email hours to emails sent ratio.")
## Print diagnosis
## Should implement options to return the PersonIds or a full data frame
# EmailFlagMessage
em_summary
}

88
R/flag_extreme.R Normal file

@ -0,0 +1,88 @@
#' @title Warn if a certain metric exceeds an arbitrary threshold
#'
#' @description
#' This is used as part of data validation to check if there are extreme values
#' in the dataset.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param metric A character string specifying the metric to test.
#' @param person A logical value to specify whether to calculate person-averages.
#' Defaults to TRUE (person-averages calculated).
#' @param threshold Numeric value specifying the threshold for flagging.
#' @param return A character string specifying what to return.
#' Valid options include "message" (default), "text", and "table".
#'
#' @family Data Validation
#'
#' @import dplyr
#'
#' @examples
#' \dontrun{
#' # The threshold values are intentionally set low to trigger messages.
#' flag_extreme(sq_data, "Email_hours", threshold = 15)
#' flag_extreme(sq_data, "Email_hours", threshold = 15, return = "table")
#' flag_extreme(sq_data, "Email_hours", person = FALSE, threshold = 15)
#' }
#'
#' @export
flag_extreme <- function(data,
metric,
person = TRUE,
threshold,
return = "message"){
## Data frame containing the extreme values
if(person == TRUE){
extreme_df <-
data %>%
rename(metric = !!sym(metric)) %>%
group_by(PersonId) %>%
summarise_at(vars(metric), ~mean(.)) %>%
filter(metric > threshold) %>%
rename(!!sym(metric) := "metric")
} else if(person == FALSE){
extreme_df <-
data %>%
rename(metric = !!sym(metric)) %>%
filter(metric > threshold) %>%
rename(!!sym(metric) := "metric")
}
## Clean names for pretty printing
metric_nm <- gsub(pattern = "_", replacement = " ", x = metric)
metric_nm <- camel_clean(metric_nm)
## Define MessageLevel
if(person == TRUE){
MessageLevel <- " persons where their average "
} else if(person == FALSE){
MessageLevel <- " rows where their value of "
}
## Define FlagMessage
if(nrow(extreme_df) == 0){
FlagMessage <-
paste0("[Pass] There are no ",
MessageLevel,
metric_nm,
" exceeds ",
threshold, ".")
} else {
FlagMessage <-
paste0("[Warning] There are ",
nrow(extreme_df),
MessageLevel,
metric_nm,
" exceeds ",
threshold, ".")
}
if(return == "text"){
FlagMessage
} else if(return == "message"){
message(FlagMessage)
} else if(return == "table"){
extreme_df
}
}

105
R/flag_outlooktime.R Normal file

@ -0,0 +1,105 @@
#' @title Flag unusual outlook time settings for work day start and end time
#'
#' @description This function flags unusual outlook calendar settings for
#' start and end time of work day.
#'
#' @import dplyr
#'
#' @param data A data frame containing a Person Query.
#' @param threshold A numeric vector of length two, specifying the hour threshold for flagging.
#' Defaults to c(4, 15).
#' @param return String to specify what to return.
#' Valid options include "message" (default), "text", and "data".
#'
#' @family Data Validation
#'
#' @examples
#' \dontrun{
#' flag_outlooktime(sq_data)
#' }
#' @export
flag_outlooktime <- function(data, threshold = c(4, 15), return = "message"){
# pad_times <- function(x){
# if(nchar(x) == 1){
# x <- paste0("0", x, "00")
# } else if(nchar(x) == 2){
# x <- paste0(x, "00")
# } else if(nchar(x) == 3){
# x <- paste0("0", x)
# } else {
# x
# }
# }
#
# pad_times <- Vectorize(pad_times)
if(any(nchar(data$WorkingStartTimeSetInOutlook) != 5 | nchar(data$WorkingEndTimeSetInOutlook) != 5)){
stop("Please check data format for `WorkingStartTimeSetInOutlook` or `WorkingEndTimeSetInOutlook.\n
These variables must be character vectors, and have the format `%H:%M`, such as `07:30` or `23:00`.")
}
clean_times <- function(x){
out <- gsub(pattern = ":", replacement = "", x = x)
# out <- pad_times(out)
strptime(out, format = "%H%M")
}
flagged_data <-
data %>%
# mutate_at(vars(WorkingStartTimeSetInOutlook, WorkingEndTimeSetInOutlook), ~clean_times(.)) %>%
mutate_at(vars(WorkingStartTimeSetInOutlook, WorkingEndTimeSetInOutlook), ~gsub(pattern = ":", replacement = "", x = .)) %>%
mutate_at(vars(WorkingStartTimeSetInOutlook, WorkingEndTimeSetInOutlook), ~strptime(., format = "%H%M")) %>%
mutate(WorkdayRange = as.numeric(WorkingEndTimeSetInOutlook - WorkingStartTimeSetInOutlook, units = "hours"),
WorkdayFlag1 = WorkdayRange < threshold[[1]],
WorkdayFlag2 = WorkdayRange > threshold[[2]],
WorkdayFlag = WorkdayRange < threshold[[1]] | WorkdayRange > threshold[[2]]) %>%
select(PersonId, WorkdayRange, WorkdayFlag, WorkdayFlag1, WorkdayFlag2)
## Short working hour settings
FlagN1 <- sum(flagged_data$WorkdayFlag1, na.rm = TRUE)
FlagProp1 <- mean(flagged_data$WorkdayFlag1, na.rm = TRUE)
FlagProp1F <- paste0(round(FlagProp1 * 100, 1), "%") # Formatted
## Long working hour settings
FlagN2 <- sum(flagged_data$WorkdayFlag2, na.rm = TRUE)
FlagProp2 <- mean(flagged_data$WorkdayFlag2, na.rm = TRUE)
FlagProp2F <- paste0(round(FlagProp2 * 100, 1), "%") # Formatted
## Short or long working hour settings
FlagN <- sum(flagged_data$WorkdayFlag, na.rm = TRUE)
FlagProp <- mean(flagged_data$WorkdayFlag, na.rm = TRUE)
FlagPropF <- paste0(round(FlagProp * 100, 1), "%") # Formatted
## Flag Messages
Warning_Message <- paste0("[Warning] ", FlagPropF, " (", FlagN, ") ", "of the person-date rows in the data have extreme Outlook settings.")
Pass_Message1 <- paste0("[Pass] Only ", FlagPropF, " (", FlagN, ") ", "of the person-date rows in the data have extreme Outlook settings.")
Pass_Message2 <- paste0("There are no extreme Outlook settings in this dataset (working hours shorter than ", threshold[[1]], " hours, or longer than ", threshold[[2]], " hours).")
Detail_Message <- paste0(FlagProp1F, " (", FlagN1, ") ", " have an Outlook workday shorter than ", threshold[[1]], " hours, while ",
FlagProp2F, " (", FlagN2, ") ", "have a workday longer than ", threshold[[2]], " hours.")
if(FlagProp >= .05){
FlagMessage <- paste(Warning_Message, Detail_Message, sep = "\n")
} else if(FlagProp < .05 & FlagProp > 0){
FlagMessage <- paste(Pass_Message1, Detail_Message, sep = "\n")
} else if(FlagProp==0){
FlagMessage <- Pass_Message2
}
## Print diagnosis
## Should implement options to return the PersonIds or a full data frame
if(return == "text"){
FlagMessage
} else if(return == "message"){
message(FlagMessage)
} else if(return == "data"){
flagged_data[flagged_data$WorkdayFlag == TRUE,]
} else {
stop("Error: please check inputs for `return`")
}
}
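A sketch of the remaining `threshold` and `return` options, assuming the bundled `sq_data` demo dataset (like the example above, this is best run interactively):

```R
library(wpa)

flag_outlooktime(sq_data)                          # pass / warning message
flag_outlooktime(sq_data, threshold = c(6, 12))    # stricter definition of a usual workday
flag_outlooktime(sq_data, return = "data")         # only the flagged person rows
```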

19
R/g2g_data.R Normal file

@ -0,0 +1,19 @@
#' @title Sample Group-to-Group dataset
#'
#' @description
#' A dataset generated from a Group-to-Group Query from WpA.
#'
#' @format A data frame with 3517 rows and 7 variables:
#' \describe{
#' \item{TimeInvestors_Organization}{ }
#' \item{Collaborators_Organization}{ }
#' \item{Date}{ }
#' \item{Meetings}{ }
#' \item{Meeting_hours}{ }
#' \item{Email_hours}{ }
#' \item{Collaboration_hours}{ }
#'
#' ...
#' }
#' @source \url{https://workplaceanalytics-demo.office.com/en-us/Home}
"g2g_data"

93
R/g2g_network.R Normal file

@ -0,0 +1,93 @@
#' @title Create a network plot with the G2G query
#'
#' @description
#' Pass a data frame containing a G2G query and return a network
#' plot.
#' Automatically handles "Collaborators_within_group" and "Other_collaborators" within query data.
#'
#' @param data Data frame containing a G2G query.
#' @param time_investor String containing the variable name for the Time Investor column.
#' @param collaborator String containing the variable name for the Collaborator column.
#' @param metric String containing the variable name for metric.
#' @param exc_threshold Exclusion threshold to apply, as a proportion of each node's total collaboration; edges at or below this proportion are not displayed. Defaults to 0.1.
#' @param subtitle String to override default plot subtitle.
#' @param ... Additional arguments to pass to `GGally::ggnet2()`
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#'
#' @import ggplot2
#' @import dplyr
#'
#' @examples
#' \dontrun{
#' g2g_data %>%
#' g2g_network(time_investor = "TimeInvestors_Organization",
#' collaborator = "Collaborators_Organization",
#' metric = "Collaboration_hours")
#' }
#'
#' @export
g2g_network <- function(data,
time_investor,
collaborator,
metric,
exc_threshold = 0.1,
subtitle = "Collaboration Across Organizations",
return = "plot",
...){
plot_data <-
data %>%
rename(TimeInvestorOrg = time_investor,
CollaboratorOrg = collaborator,
Metric = metric) %>%
mutate(CollaboratorOrg = case_when(CollaboratorOrg == "Collaborators Within Group" ~ TimeInvestorOrg,
TRUE ~ CollaboratorOrg)) %>%
group_by(TimeInvestorOrg, CollaboratorOrg) %>%
filter(TimeInvestorOrg != "Other_Collaborators" &
CollaboratorOrg!="Other_Collaborators") %>%
summarise_at("Metric", ~mean(.)) %>%
group_by(TimeInvestorOrg) %>%
mutate(metric_prop = Metric / sum(Metric, na.rm = TRUE)) %>%
select(TimeInvestorOrg, CollaboratorOrg, metric_prop) %>%
ungroup()
if(return == "table"){
plot_data
} else if(return == "plot"){
mynet_em <-
plot_data %>%
filter(metric_prop > exc_threshold) %>%
mutate_at(vars(TimeInvestorOrg, CollaboratorOrg), ~sub(pattern = " ", replacement = "\n", x = .)) %>%
mutate(metric_prop = metric_prop * 10) %>%
network::network(matrix.type = "edgelist",
ignore.eval = FALSE,
names.eval = "weights")
mynet_em %>%
GGally::ggnet2(size = 12,
color = "lightblue",
label = TRUE,
label.size = 4,
label.color = "black",
edge.size = "weights",
edge.alpha = .5,
...) +
ggtitle("Group to Group Collaboration",
subtitle = subtitle) +
xlab(label = "") +
ylab(label = "") +
theme_wpa_basic() +
labs(caption = paste("Displays only collaboration above ", exc_threshold * 100, "% of node's total collaboration", sep = "")) +
theme(axis.line = element_blank())
} else {
stop("Please enter a valid input for `return`.")
}
}

196
R/generate_report.R Normal file

@ -0,0 +1,196 @@
#' @title Generate HTML report with list inputs
#'
#' @description
#' This is a support function using a list-pmap workflow to
#' create a HTML document, using RMarkdown as the engine.
#'
#' @param title Character string to specify the title of the chunk.
#' @param filename File name to be used in the exported HTML.
#' @param outputs A list of outputs to be added to the HTML report.
#' Note that `outputs`, `titles`, `subheaders`, `echos`, and `levels` must all have the same length.
#' @param titles A list/vector of character strings to specify the title of the chunks.
#' @param subheaders A list/vector of character strings to specify the subheaders for each chunk.
#' @param echos A list/vector of logical values to specify whether to display code.
#' @param levels A list/vector of numeric value to specify the header level of the chunk.
#' @param theme Character vector to specify theme to be used for the report.
#' E.g. "united", "default".
#' @param preamble A preamble to appear at the beginning of the report, passed as a text string.
#'
#' @importFrom purrr pmap
#' @importFrom purrr reduce
#'
#' @family Reports
#'
#' @examples
#' \dontrun{
#' # Step 1: Define Content
#' output_list <-
#' list(sq_data %>% workloads_summary(return = "plot"),
#' sq_data %>%
#' workloads_summary(return = "table")) %>%
#' purrr::map_if(is.data.frame, create_dt)
#'
#' # Step 2: Add Corresponding Titles
#' title_list <- c("Workloads Summary - Plot", "Workloads Summary - Table")
#' n_title <- length(title_list)
#'
#'# Step 3: Generate Report
#' generate_report(title = "My First Report",
#' filename = "My First Report",
#' outputs = output_list,
#' titles = title_list,
#' subheaders = rep("", n_title),
#' echos = rep(FALSE, n_title),
#' levels = rep(3, n_title))
#'
#' }
#' @export
generate_report <- function(title = "My minimal HTML generator",
filename = "minimal_html",
outputs = output_list,
titles,
subheaders,
echos,
levels,
theme = "united",
preamble = ""){
## Title of document
title_chr <- paste0('title: \"', title, '\"')
## chunk loopage
## merged to create `chunk_merged`
chunk_merged <-
list(output = outputs,
title = titles,
subheader = subheaders,
echo = echos,
level = levels,
id = seq(1, length(outputs))) %>%
purrr::pmap(function(output, title, subheader, echo, level, id){
generate_chunks(level = level,
title = title,
subheader = subheader,
echo = echo,
object = paste0("outputs[[", id, "]]"))
}) %>%
purrr::reduce(c)
# wpa_logo <- system.file("logos/logo.PNG", package = "wpa")
## markdown object
markobj <- c('---',
title_chr,
'output: ',
' html_document:',
paste0(' theme: ', theme),
# ' theme: united',
' toc: true',
' toc_float:',
' collapsed: false',
' smooth_scroll: true',
'---',
# paste0('![]("', wpa_logo, '")'),
'',
preamble,
'',
chunk_merged)
writeLines(markobj, paste0(filename, ".Rmd"))
rmarkdown::render(paste0(filename, ".Rmd"))
## Load in browser
utils::browseURL(paste0(filename, ".html"))
## Deletes specified files
unlink(c(paste0(filename, ".Rmd"),
paste0(filename, ".md")))
}
#' @title Generate chunk strings
#'
#' @description This is used as a supporting function for `generate_report()`
#' and not directly used. `generate_report()` works by creating a
#' loop structure around generate_chunks(), and binds them together
#' to create a report.
#'
#' @details
#' `generate_chunks()` is primarily a wrapper around paste() functions,
#' to create a structured character vector that will form the individual
#' chunks. No plots 'exist' within the environment of `generate_chunks()`.
#'
#' @param level Numeric value to specify the header level of the chunk.
#' @param title Character string to specify the title of the chunk.
#' @param subheader Character string to specify the subheader of the chunk.
#' @param echo Logical value to specify whether to display code.
#' @param object Character string to specify name of the object to show.
generate_chunks <- function(level = 3,
title,
subheader = "",
echo = FALSE,
object){
level_hash <- paste(rep('#', level), collapse = "")
obj <- c(paste(level_hash, title),
subheader,
paste0('```{r, echo=',
echo,
', fig.height=9, fig.width=12}'),
object,
'```',
' ')
return(obj)
}
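To make the chunk structure concrete, a small sketch of what `generate_chunks()` assembles. Since it is an internal helper (not exported), this assumes the package source is loaded, e.g. via `devtools::load_all()`; the title and object names here are invented.

```R
chunk <- generate_chunks(level = 3,
                         title = "Email Summary - Plot",
                         subheader = "Average weekly email hours",
                         echo = FALSE,
                         object = "outputs[[1]]")

# Prints a '###'-level header, the subheader, and an R chunk that evaluates outputs[[1]]
cat(chunk, sep = "\n")
```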
#' Read preamble
#'
#' Read in a preamble to be used within each individual reporting function.
#' Reads from the Markdown file installed with the package.
#'
#' @param path Text string containing the path for the appropriate Markdown file.
#'
#' @export
read_preamble <- function(path){
full_path <- paste0("/preamble/", path)
complete_path <- paste0(path.package("wpa"), full_path)
text <- suppressWarnings(readLines(complete_path))
return(text)
}
#' Display HTML fragment in RMarkdown chunk, from Markdown text
#'
#' @description
#' This is a wrapper around `markdown::markdownToHTML()`, where
#' the default behaviour is to produce a HTML fragment.
#' `htmltools::HTML()` is then used to evaluate the HTML code
#' within a RMarkdown chunk.
#'
#' @importFrom htmltools HTML
#' @importFrom markdown markdownToHTML
#'
#' @param text Character vector containing Markdown text
#'
md2html <- function(text){
html_chunk <- markdown::markdownToHTML(text = text,
fragment.only = TRUE)
htmltools::HTML(html_chunk)
}
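A tiny sketch of `md2html()` on invented Markdown text; like `generate_chunks()`, it is an internal helper, so this assumes the package source is loaded.

```R
preamble_text <- c("## About this report",
                   "Generated with the **{wpa}** R package.")

md2html(preamble_text)   # returns an HTML fragment wrapped with htmltools::HTML()
```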

184
R/globals.R Normal file

@ -0,0 +1,184 @@
###################################################################
## Global Variables
## This file is added to minimize the false positives flagged during R CMD check.
## Example: afterhours_trend: no visible binding for global variable 'Date'
###################################################################
utils::globalVariables(
c(
"Date",
"Start",
"End",
"PersonId",
"group",
"Period",
"outcome",
".",
"After_hours_collaboration_hours",
"Employee_Count",
"bins",
"Metric",
"Hours",
"Collaboration_hours",
"bucket_hours",
"Employees",
"Total",
"Value",
"Email_hours",
"External_network_size",
"Networking_outside_company",
"Ext_network_size",
"Ext_network_breadth",
"CH_ratio",
"Emails_sent",
"Email_ratio",
"WorkingEndTimeSetInOutlook",
"WorkingStartTimeSetInOutlook",
"WorkdayRange",
"WorkdayFlag",
"WorkdayFlag1",
"WorkdayFlag2",
"TimeInvestorOrg",
"CollaboratorOrg",
"metric_prop",
"output_list",
"n_unique",
"attribute",
"NA_per",
"no visible global function definition for head",
"mean_collab",
"holidayweek",
"ymin",
"ymax",
"z_score",
"Organization",
"flag_nkw",
"perc",
"tenure_years",
"odd_tenure",
"Internal_network_size",
"Networking_outside_organization",
"Int_network_size",
"Int_network_breadth",
"Workweek_span",
"Meeting_hours",
"Meetings",
"After_hours_meeting_hours",
"Low_quality_meeting_hours",
"Generated_workload_email_recipients",
"After_hours_email_hours",
"Total_focus_hours",
"variable",
"Multitasking_meeting_hours",
"Redundant_meeting_hours__organizational_",
"Conflicting_meeting_hours",
"Meeting_count",
"perc_after_hours_m",
"perc_low_quality",
"perc_Multitasking",
"perc_Redundant",
"perc_conflicting",
"Redundant_meeting_hours__lower_level_",
"HourType",
"extract_prop",
"extract_raw",
"RawHours",
"MeetingType",
"Attendee_meeting_hours",
"AttendeeMeetingHours",
"Prop",
"Duration",
"Attendees",
"Percent",
"value2",
"x",
"y",
"Meeting_hours_with_manager",
"coattendman_rate",
"bucket_coattendman_rate",
"Meeting_hours_with_manager_1_on_1",
"mgr1on1",
"coattendande",
"mgrRel",
"xmin",
"xmax",
"bucket_manager_1_on_1",
"Minutes_with_manager_1_on_1",
"Metrics",
"Values",
"Checks",
"Variables",
"UniqueValues",
"MissingValues",
"Examples",
"KPI",
"After",
"Before",
"delta",
"perc_diff",
"Mybins",
"Employee_perc",
"cluster",
"key",
"clusters",
"label",
"Signals",
"StartEnd",
"Before_start",
"Within_hours",
"After_end",
"Cases",
"subjectFlag",
"Subject",
"line",
"text",
"word",
"name",
"freq",
"pre_group",
"Utilization_hrs",
"UH_bin",
"UH_over_45",
"h_clust",
"dist_m",
"Signals_Total",
"Signals_sent",
"Personas",
"Employee_count",
"external_network_plot",
"total",
"value",
"prop",
"ODDS",
"..input_var2",
".N",
".SD",
"Freq",
"PersonCount",
"Shifts",
"Shoulders",
"per",
"sent",
"Fill",
"PersonWeekId",
"WeekCount",
"WeekPercentage",
"patternRank",
"Count",
"Day_Span",
"First_signal",
"Last_signal",
"MeetingHoursInLongOrLargeMeetings",
"Percentage",
"TenureYear",
"calculation",
"perc_nkw",
"value_rescaled",
"values",
"cleaned_data",
"zscore"
)
)

58
R/hr_trend.R Normal file

@ -0,0 +1,58 @@
#' @title Employee count over time
#'
#' @description Returns a line chart showing the change in
#' employee count over time. Part of a data validation process to check
#' for unusual license growth / declines over time.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#'
#' @examples
#' \dontrun{
#' hr_trend(sq_data)
#' hr_trend(dv_data)
#' hr_trend(dv_data, return = "table")
#' }
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export
hr_trend <- function(data, return = "plot"){
options(dplyr.summarise.inform = FALSE)
data$Date <- as.Date(data$Date, format = "%m/%d/%Y")
## Date range data frame
myPeriod <- extract_date_range(data)
plot_data <-
data %>%
group_by(Date) %>%
summarise(n = n_distinct(PersonId)) %>%
ungroup()
if(return == "plot"){
plot_data %>%
ggplot(aes(x = Date, y = n)) +
geom_line(size = 1) +
labs(title = "Population over time",
subtitle = "Unique licensed population by week",
caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End)) +
ylab("Employee count") +
xlab("Date") +
scale_y_continuous(labels = round, limits = c(0,NA)) +
theme_wpa_basic()
} else if(return == "table"){
plot_data
} else {
stop("Please enter a valid input for `return`.")
}
}

102
R/hrvar_count.R Normal file

@ -0,0 +1,102 @@
#' @title Create a count of distinct people in a specified HR variable
#'
#' @description
#' This function enables you to create a count of the distinct people
#' by the specified HR attribute. The default behaviour is to return a
#' bar chart as typically seen in 'Analysis Scope'.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation".
#' If a vector with more than one value is provided, the HR attributes are automatically
#' concatenated.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import ggplot2
#' @import dplyr
#' @importFrom data.table ":=" "%like%" "%between%"
#'
#' @family General
#' @family Data Validation
#'
#' @examples
#' # Return a bar plot
#' hrvar_count(sq_data, hrvar = "LevelDesignation")
#'
#' # Return a summary table
#' hrvar_count(sq_data, hrvar = "LevelDesignation", return = "table")
#'
#'@export
hrvar_count <- function(data,
hrvar = "Organization",
return = "plot"){
## Allow multiple HRvar inputs
if(length(hrvar) > 1){
hrvar_flat <- paste(hrvar, collapse = ", ")
summary_table <-
data %>%
select(PersonId, all_of(hrvar)) %>%
mutate(!!sym(hrvar_flat) := select(., hrvar) %>%
apply(1, paste, collapse = ", ")) %>%
group_by(!!sym(hrvar_flat)) %>%
summarise(n = n_distinct(PersonId)) %>%
arrange(desc(n))
# Single reference for single and multiple org attributes
hrvar_label <- hrvar_flat
} else {
summary_table <-
data %>%
select(PersonId, all_of(hrvar)) %>%
group_by(!!sym(hrvar)) %>%
summarise(n = n_distinct(PersonId)) %>%
arrange(desc(n))
# Single reference for single and multiple org attributes
hrvar_label <- hrvar
}
if(return == "table"){
data %>%
data.table::as.data.table() %>%
.[, .(n = n_distinct(PersonId)), by = hrvar] %>%
as_tibble() %>%
arrange(desc(n))
} else if(return == "plot"){
## This is re-run to enable multi-attribute grouping without concatenation
summary_table %>%
ggplot(aes(x = stats::reorder(!!sym(hrvar_label), -n),
y = n)) +
geom_col(fill = rgb2hex(0, 120, 212)) +
geom_text(aes(label = n),
vjust = -1,
fontface = "bold",
size = 4)+
theme_wpa_basic() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(title = paste("People by", camel_clean(hrvar_label))) +
scale_y_continuous(limits = c(0, max(summary_table$n) * 1.1)) +
xlab(camel_clean(hrvar_label)) +
ylab("Number of employees")
} else {
stop("Please enter a valid input for `return`.")
}
}
#' @rdname hrvar_count
#' @export
analysis_scope <- hrvar_count
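A short sketch of the single- and multi-attribute behaviour described above, assuming the bundled `sq_data` demo dataset (the concatenated group label in the comment is illustrative):

```R
library(wpa)

# Single attribute: bar chart of distinct people per level
hrvar_count(sq_data, hrvar = "LevelDesignation")

# Two attributes are concatenated into one grouping, e.g. "<Organization>, <LevelDesignation>"
hrvar_count(sq_data, hrvar = c("Organization", "LevelDesignation"), return = "table")
```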

105
R/hrvar_count_all.R Normal file

@ -0,0 +1,105 @@
#' @title Create count of distinct fields and percentage of employees with NAs for all HR variables
#'
#' @description
#' This function enables you to create a summary table to validate organizational data.
#' The table provides a summary of the data found in the WpA Sources page,
#' listing the count of distinct fields per HR attribute and the percentage of
#' employees with missing values for that attribute.
#' See the `hrvar_count()` function for more detail on a specific HR attribute of interest.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param n_var Number of HR variables to include in the report as rows. Defaults to 50.
#' @param return String to specify what to return.
#' Valid options include "message" (default), "text", and "table".
#' @param threshold The max number of unique values allowed for any attribute. Default is 100.
#' @param maxna The max percentage of NAs allowable for any column. Default is 20.
#'
#' @import dplyr
#'
#' @family Data Validation
#'
#' @return
#' Returns a console message by default, where 'message' is passed in `return`,
#' indicating which attributes exceed the specified thresholds.
#' When 'table' is passed, a summary table listing the number of distinct fields and the percentage of NAs for each HR attribute is returned.
#' When 'text' is passed, the same diagnostic message is returned as a character string.
#'
#' @export
hrvar_count_all <- function(data,
n_var = 50,
return = "message",
threshold = 100,
maxna = 20
){
## Character vector of HR attributes
extracted_chr <- extract_hr(data, return = "names")
summary_table_n <-
data %>%
select(PersonId, extracted_chr) %>%
summarise_at(vars(extracted_chr), ~n_distinct(.,na.rm = TRUE)) # Excludes NAs from unique count
## Note: WPA here is used for a matching separator
results <-
data %>%
select(PersonId, extracted_chr) %>%
summarise_at(vars(extracted_chr),
list(`WPAn_unique` = ~n_distinct(., na.rm = TRUE), # Excludes NAs from unique count
`WPAper_na` = ~(sum(is.na(.))/ nrow(data) * 100),
`WPAsum_na` = ~sum(is.na(.)) # Number of missing values
)) %>% # % of missing values
tidyr::gather(attribute, values) %>%
tidyr::separate(col = attribute, into = c("attribute", "calculation"), sep = "_WPA") %>%
tidyr::spread(calculation, values)
## Single print message
if(sum(results$n_unique >= threshold)==0){
printMessage <- paste("No attributes have greater than", threshold, "unique values.")
}
if(sum(results$per_na >= maxna)==0){
newMessage <- paste("No attributes have more than", maxna, "percent NA values.")
printMessage <- paste(printMessage, newMessage, collapse = "\n")
}
for (i in 1:nrow(results)) {
if(results$n_unique[i] >= threshold){
newMessage <- paste0("The attribute '",results$attribute[i],"' has a large amount of unique values. Please check.")
printMessage <- paste(printMessage, newMessage, collapse = "\n")
}
if(results$per_na[i]>=maxna){
newMessage <- paste0("The attribute '",results$attribute[i],"' has a large amount of NA values. Please check.")
printMessage <- paste(printMessage, newMessage, collapse = "\n")
}
}
if(return == "table"){
results <-
results %>%
select(Attributes = "attribute",
`Unique values` = "n_unique",
`Total missing values` = "sum_na",
`% missing values` = "per_na")
return(utils::head(results, n_var))
} else if(return == "text"){
printMessage
} else if(return == "message"){
message(printMessage)
} else {
stop("Error: please check inputs for `return`")
}
}
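A brief sketch of the validation workflow, assuming the bundled `sq_data` demo dataset:

```R
library(wpa)

# Console message flagging attributes with too many unique values or too many NAs
hrvar_count_all(sq_data)

# Summary table: unique values, missing counts and missing-value share per HR attribute
hrvar_count_all(sq_data, return = "table")
```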

113
R/identify_churn.R Normal file

@ -0,0 +1,113 @@
#' @title Identify employees who have churned from the dataset
#'
#' @description
#' This function identifies and counts the number of employees who have churned from
#' the dataset by measuring whether an employee who is present in the first `n` (n1) weeks
#' of the data is present in the last `n` (n2) weeks of the data.
#'
#' @details
#' An additional use case of this function is the ability to identify "new-joiners" by using
#' the argument `flip`.
#'
#' @param data A Person Query as a data frame. Must contain a `PersonId`.
#' @param n1 A numeric value specifying the number of weeks at the beginning of the period
#' that defines the measured employee set. Defaults to 6.
#' @param n2 A numeric value specifying the number of weeks at the end of the period
#' to calculate whether employees have churned from the data. Defaults to 6.
#' @param return String specifying what to return. Defaults to "message", with options to
#' return a character string ("text") or the `PersonId` of employees who have been identified
#' as churned ("data").
#' @param flip Logical, defaults to FALSE. This determines whether to reverse the logic of identifying the
#' non-overlapping set. If set to `TRUE`, this effectively identifies new-joiners, or those
#' who were not present in the first n weeks of the data but were present in the final n weeks.
#'
#' @details
#' If an employee is present in the first `n` weeks of the data but not present in the last
#' `n` weeks of the data, the function considers the employee as churned. As the measurement period
#' is defined by the number of weeks from the start and the end of the passed data frame, you
#' may consider filtering the dates accordingly before running this function.
#'
#' Another assumption that is in place is that any employee whose `PersonId` is not available in the
#' data has churned. Note that there may be other reasons why an employee's `PersonId` may not
#' be present, e.g. maternity/paternity leave, Workplace Analytics license has been removed,
#' shift to a low-collaboration role (to the extent that he/she becomes inactive).
#' @examples
#' \dontrun{
#' sq_data %>% identify_churn(n1 = 3, n2 = 3, return = "message")
#' }
#' @export
identify_churn <- function(data,
n1 = 6,
n2 = 6,
return = "message",
flip = FALSE){
data$Date <- as.Date(data$Date, format = "%m/%d/%Y") # Ensure correct format
unique_dates <- unique(data$Date) # Vector of unique dates
nlen <- length(unique_dates) # Total number of unique dates
# First and last n weeks
firstnweeks <- sort(unique_dates)[1:n1]
lastnweeks <- sort(unique_dates, decreasing = TRUE)[1:n2]
## People in the first week
first_peeps <-
data %>%
dplyr::filter(Date %in% firstnweeks) %>%
dplyr::pull(PersonId) %>%
unique()
## People in the last week
final_peeps <-
data %>%
dplyr::filter(Date %in% lastnweeks) %>%
dplyr::pull(PersonId) %>%
unique()
if(flip == FALSE){
## In first, not in last
churner_id <- setdiff(first_peeps, final_peeps)
## Message
printMessage <-
paste0("Churn:\nThere are ", length(churner_id),
" employees from ", min(firstnweeks), " to ",
max(firstnweeks), " (", n1, " weeks)",
" who are no longer present in ",
min(lastnweeks), " to ", max(lastnweeks),
" (", n2, " weeks).")
} else if(flip == TRUE){
## In last, not in first
## new joiners
churner_id <- dplyr::setdiff(final_peeps, first_peeps)
## Message
printMessage <-
paste0("New joiners:\nThere are ", length(churner_id),
" employees from ", min(lastnweeks), " to ",
max(lastnweeks), " (", n2, " weeks)",
" who were not present in ",
min(firstnweeks), " to ", max(firstnweeks),
" (", n1, " weeks).")
} else {
stop("Invalid argument for `flip`")
}
if(return == "message"){
message(printMessage)
} else if(return == "text"){
printMessage
} else if(return == "data"){
churner_id
} else {
stop("Invalid `return`")
}
}
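To complement the example above, a sketch of the `flip` behaviour that surfaces new-joiners rather than churners (again using the bundled `sq_data`):

```R
library(wpa)

# Employees present in the first six weeks but absent from the last six weeks
identify_churn(sq_data)

# Flip the comparison: present at the end but not at the start ("new-joiners")
identify_churn(sq_data, flip = TRUE, return = "data")
```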

105
R/identify_holidayweeks.R Normal file

@ -0,0 +1,105 @@
#' @title Identify Holiday Weeks
#'
#' @description
#' This function scans a standard query output for weeks where average collaboration hours are far below the mean.
#' Returns a list of weeks that appear to be holiday weeks and optionally an edited dataframe with outliers removed.
#' By default, missing values are excluded.
#'
#' As best practice, run this function prior to any analysis to remove atypical collaboration weeks from your dataset.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param sd The standard deviation below the mean for collaboration hours that should define an outlier week. Enter a positive number. Default is 1 standard deviation.
#' @param return String to specify what to return
#'
#' @import dplyr
#' @import ggplot2
#' @importFrom methods is
#'
#' @family Data Validation
#'
#' @return
#' Returns a message by default (`return` = "message").
#' When 'message' is passed, a message is printed identifying holiday weeks.
#' When 'data' is passed, a dataset with outlier weeks flagged in a new column is returned as a dataframe.
#' When 'data_cleaned' is passed, a dataset with outlier weeks removed is returned as a dataframe.
#' When 'data_dirty' is passed, a dataset with only outlier weeks is returned as a dataframe.
#' When 'plot' is passed, a plot with holiday weeks highlighted is returned as a ggplot object.
#' When 'text' is passed, the diagnostic message is returned as a character string.
#'
#' @export
identify_holidayweeks <- function(data, sd = 1, return = "message"){
## Ensure date is formatted
if(all(is_date_format(data$Date))){
data$Date <- as.Date(data$Date, format = "%m/%d/%Y")
} else if(is(data$Date, "Date")){
# Do nothing
} else {
stop("`Date` appears not to be properly formatted.\n
It needs to be in the format MM/DD/YYYY.\n
Also check for missing values or stray values with inconsistent formats.")
}
Calc <-
data %>%
group_by(Date) %>%
summarize(mean_collab = mean(Collaboration_hours, na.rm = TRUE),.groups = 'drop') %>%
mutate(z_score = (mean_collab - mean(mean_collab, na.rm = TRUE))/ sd(mean_collab, na.rm = TRUE))
Outliers = (Calc$Date[Calc$z_score < -sd])
mean_collab_hrs <- mean(Calc$mean_collab, na.rm = TRUE)
Message <- paste0("The weeks where collaboration was ",
sd,
" standard deviations below the mean (",
round(mean_collab_hrs, 1),
") are: \n",
paste(wrap(Outliers, wrapper = "`"),collapse = ", "))
myTable_plot <-
data %>%
mutate(holidayweek = (Date %in% Outliers)) %>%
select("Date", "holidayweek", "Collaboration_hours") %>%
group_by(Date) %>%
summarise(Collaboration_hours=mean(Collaboration_hours), holidayweek=first(holidayweek)) %>%
mutate(Date=as.Date(Date, format = "%m/%d/%Y"))
myTable_plot_shade <-
myTable_plot %>%
filter(holidayweek == TRUE) %>%
mutate(min = Date - 3 , max = Date + 3 , ymin = -Inf, ymax = +Inf)
plot <-
myTable_plot %>%
ggplot(aes(x = Date, y = Collaboration_hours, group = 1)) +
geom_line(colour = "grey40") +
theme_wpa_basic() +
geom_rect(data = myTable_plot_shade,
aes(xmin = min, xmax = max, ymin = ymin, ymax = ymax),
color="transparent", fill="steelblue", alpha=0.3) +
labs(title = "Holiday Weeks",
subtitle = "Showing average collaboration hours over time")+
ylab("Collaboration Hours") +
xlab("Date")
if(return == "text"){
return(Message)
} else if(return == "message"){
message(Message)
}else if(return %in% c("data_clean", "data_cleaned")){
return(data %>% filter(!(Date %in% Outliers)) %>% data.frame())
} else if(return == "data_dirty"){
return(data %>% filter((Date %in% Outliers)) %>% data.frame())
} else if(return == "data"){
return(data %>% mutate(holidayweek = (Date %in% Outliers)) %>% data.frame())
} else if(return == "plot"){
return(plot)
} else {
stop("Error: please check inputs for `return`")
}
}
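A short usage sketch (illustrative only), assuming the bundled `sq_data` demo dataset:

```R
library(wpa)

identify_holidayweeks(sq_data)                                       # message listing outlier weeks
identify_holidayweeks(sq_data, return = "plot")                      # line chart with those weeks shaded
cleaned <- identify_holidayweeks(sq_data, return = "data_cleaned")   # drop them before analysis
```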


@ -0,0 +1,55 @@
#' @title Identify Inactive Weeks
#'
#' @description
#' This function scans a standard query output for weeks where a person's collaboration hours are far below that person's own mean.
#' Returns a list of weeks that appear to be inactive weeks and optionally an edited dataframe with outliers removed.
#'
#' As best practice, run this function prior to any analysis to remove atypical collaboration weeks from your dataset.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param sd The standard deviation below the mean for collaboration hours that should define an outlier week. Enter a positive number. Default is 1 standard deviation.
#' @param return String to specify what to return.
#'
#' @import dplyr
#'
#' @family Data Validation
#'
#' @return
#' Returns a diagnostic message by default, where 'text' is passed in `return`.
#' When 'data_cleaned' is passed, a dataset with outlier weeks removed is returned as a dataframe.
#' When 'data_dirty' is passed, a dataset with only outlier weeks is returned as a dataframe.
#' When 'data' is passed, the full dataset is returned with an additional `inactiveweek` flag column.
#'
#' @export
identify_inactiveweeks <- function(data, sd = 2, return = "text"){
Calc <-
data %>%
group_by(PersonId) %>%
mutate(z_score = (Collaboration_hours - mean(Collaboration_hours))/sd(Collaboration_hours)) %>%
filter(z_score<= -sd) %>%
select(PersonId,Date,z_score) %>%
data.frame()
Message <- paste0("There are ",nrow(Calc), " rows of data with weekly collaboration hours more than ",sd," standard deviations below the mean.")
if(return == "text"){
return(Message)
} else if(return == "data_dirty"){
data %>% group_by(PersonId) %>% mutate(
z_score = (Collaboration_hours - mean(Collaboration_hours))/sd(Collaboration_hours)) %>%
filter(z_score<= -sd) %>% select(-z_score) %>% data.frame()
} else if(return == "data_cleaned"){
data %>% group_by(PersonId) %>% mutate(
z_score = (Collaboration_hours - mean(Collaboration_hours))/sd(Collaboration_hours)) %>%
filter(z_score> -sd) %>% select(-z_score) %>% data.frame()
} else if(return == "data"){
data %>% group_by(PersonId) %>% mutate(
z_score = (Collaboration_hours - mean(Collaboration_hours))/sd(Collaboration_hours)) %>%
mutate(inactiveweek = (z_score<= -sd)) %>% select(-z_score) %>% data.frame()
} else {
stop("Error: please check inputs for `return`")
}
}
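Similarly, a sketch for removing person-weeks that look inactive, assuming the bundled `sq_data`:

```R
library(wpa)

identify_inactiveweeks(sq_data)                                   # diagnostic text
cleaned <- identify_inactiveweeks(sq_data, sd = 2, return = "data_cleaned")
```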

96
R/identify_nkw.R Normal file

@ -0,0 +1,96 @@
#' @title Identify Non-Knowledge workers (nkw)
#'
#' @description
#' This function scans a standard query output to identify employees with consistently low collaboration signals.
#' Returns the % of non-knowledge workers identified by Organization, and optionally an edited dataframe with non-knowledge workers removed, or the full dataframe with the kw/nkw flag added.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param collab_threshold The collaboration hours threshold that should be exceeded as an average for the entire analysis period for the employee to be categorized as a knowledge worker ("kw").
#' Enter a positive number. Default is set to 5 collaboration hours.
#' @param return String to specify what to return
#'
#' @import dplyr
#'
#' @family Data Validation
#'
#' @return
#' Returns a summary table by default, where 'data_summary' is passed in `return`,
#' listing the number and % of non-knowledge workers by organization.
#' When 'data_with_flag' is passed, a copy of your original dataframe with an additional column containing the kw/nkw flag will be returned.
#' When 'data_clean' is passed, a full dataframe excluding "nkw" employees will be returned.
#' When 'text' is passed, a diagnostic message is returned as a character string.
#'
#' @export
identify_nkw <- function(data, collab_threshold = 5, return = "data_summary"){
options(dplyr.summarise.inform = FALSE)
summary_byPersonId <-
data %>%
group_by(PersonId, Organization) %>%
summarize(mean_collab = mean(Collaboration_hours))%>%
mutate(flag_nkw = case_when(mean_collab > collab_threshold ~ "kw",
TRUE ~ "nkw"))
data_with_flag <- left_join(data,
summary_byPersonId %>%
dplyr::select(PersonId,flag_nkw), by = 'PersonId')
summary_byOrganization <-
summary_byPersonId %>%
group_by(Organization, flag_nkw)%>%
summarise(total = n())%>%
group_by(Organization)%>%
mutate(perc = total/sum(total))%>% #need to format to %
filter(flag_nkw == "nkw")%>%
rename(n_nkw = total, perc_nkw = perc)%>%
select(-flag_nkw) %>%
ungroup()
## Number of NKW identified
n_nkw <- sum(summary_byPersonId$flag_nkw == "nkw")
if(n_nkw == 0){
flagMessage <- paste0("[Pass] There are no non-knowledge workers identified",
" (average collaboration hours below ",
collab_threshold,
" hours).")
} else {
flagMessage <-
paste0("[Warning] Out of a population of ", n_distinct(data$PersonId),
", there are ", n_nkw,
" employees who may be non-knowledge workers (average collaboration hours below ",
collab_threshold, " hours).")
}
if(return == "data_with_flag"){
return(data_with_flag)
} else if(return %in% c("data_clean", "data_cleaned")){
return(data_with_flag %>% filter(flag_nkw == "kw") %>% data.frame())
} else if(return == "text"){
flagMessage
} else if(return =="data_summary"){
summary_byOrganization %>%
mutate(perc_nkw = scales::percent(perc_nkw, accuracy = 1)) %>%
rename(`Non-knowledge workers (count)` = "n_nkw",
`Non-knowledge workers (%)` = "perc_nkw")
} else {
stop("Error: please check inputs for `return`")
}
}

51
R/identify_outlier.R Normal file

@ -0,0 +1,51 @@
#' @title Identify outliers across time
#'
#' @description
#' This function takes in a selected metric and uses
#' z-score (number of standard deviations) to identify outliers
#' across time. There are applications in this for identifying
#' weeks with abnormally low collaboration activity, e.g. holidays.
#' Time as a grouping variable can be overridden with the `group_var`
#' argument.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param group_var A string with the name of the grouping variable.
#' Defaults to `Date`.
#' @param metric Character string containing the name of the metric,
#' e.g. "Collaboration_hours"
#'
#' @import dplyr
#'
#' @examples
#' identify_outlier(sq_data, metric = "Collaboration_hours")
#'
#' @return
#' Returns a data frame with `Date` (if grouping variable is not set),
#' the metric, and the corresponding z-score.
#'
#' @family General
#' @family Data Validation
#'
#' @export
identify_outlier <- function(data,
group_var = "Date",
metric = "Collaboration_hours"){
## Check inputs
required_variables <- c(group_var,
"PersonId",
metric)
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
main_table <-
data %>%
group_by(!!sym(group_var)) %>%
summarise_at(vars(!!sym(metric)), ~mean(.)) %>%
mutate(zscore = (!!sym(metric) - mean(!!sym(metric)))/sd(!!sym(metric)))
return(main_table)
}

81
R/identify_privacythreshold.R Normal file

@ -0,0 +1,81 @@
#' @title Identify groups under privacy threshold
#'
#' @description
#' This function scans a standard query output for groups of employees under the privacy threshold.
#' The method consists of reviewing each individual HR attribute and counting the distinct people within each group.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar A list of HR Variables to consider in the scan.
#' Defaults to all HR attributes identified.
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
#' Defaults to 5.
#' @param return A character vector specifying what to return.
#' Valid values include "table" (default), "message", and "text".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats reorder
#'
#' @family Data Validation
#'
#' @return
#' Returns a summary table by default, where 'table' is passed in `return`.
#' When 'message' is passed, a diagnostic message is printed to the console;
#' when 'text' is passed, the same message is returned as a character string.
#'
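#' @examples
#' \dontrun{
#' # Minimal usage sketch: scan all identified HR attributes against a threshold of 5.
#' # Assumes the built-in `sq_data` demo Standard Person Query is available.
#' identify_privacythreshold(sq_data, mingroup = 5, return = "table")
#' }
#'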
#' @export
identify_privacythreshold <- function(data,
hrvar = extract_hr(data),
mingroup = 5,
return = "table"){
results <-
data %>% hrvar_count(
hrvar = hrvar[1],
return = "table")
results$hrvar <- ""
results <- results[0,]
for (p in hrvar) {
table1 <-
data %>%
hrvar_count(hrvar = p,
return = "table")
table1$hrvar <- p
colnames(table1)[1] <- "group"
results <- rbind(results,table1)
}
output <- results %>% arrange(n) %>% select(hrvar, everything())
groups_under <- results %>% filter(n<mingroup) %>% nrow()
MinGroupFlagMessage_Warning <- paste0("[Warning] There are ", groups_under, " groups under the minimum group size privacy threshold of ", mingroup, ".")
MinGroupFlagMessage_Low <- paste0("[Pass] There is only ", groups_under, " group under the minimum group size privacy threshold of ", mingroup, ".")
MinGroupFlagMessage_Zero <- paste0("[Pass] There are no groups under the minimum group size privacy threshold of ", mingroup, ".")
if(groups_under > 1){
MinGroupFlagMessage <- MinGroupFlagMessage_Warning
} else if(groups_under == 1 ){
MinGroupFlagMessage <- MinGroupFlagMessage_Low
} else if(groups_under ==0){
MinGroupFlagMessage <- MinGroupFlagMessage_Zero
}
if(return == "table"){
return(output)
} else if(return == "message"){
message(MinGroupFlagMessage)
} else if(return == "text"){
MinGroupFlagMessage
} else {
stop("Invalid `return` argument.")
}
}

70
R/identify_query.R Normal file

@ -0,0 +1,70 @@
#' @title Identify the query type of the passed data frame
#'
#' @description
#' Pass a Workplace Analytics dataset and return the identified
#' query type as a string. This function uses variable name string
#' matching to 'guess' the query type of the data frame.
#'
#' @param data A Workplace Analytics dataset in the form of a data frame.
#' If the data is not identified as a Workplace Analytics dataset, the function
#' will return an error.
#' @param threshold Debugging use only. Increase to raise the 'strictness' of the
#' guessing algorithm. Defaults to 2.
#'
#' @family Data Validation
#'
#' @examples
#' \dontrun{
#' identify_query(sq_data) # Standard query
#' identify_query(mt_data) # Meeting query
#' identify_query(em_data) # Hourly collaboration query
#' identify_query(iris) # Will return an error
#' identify_query(mtcars) # Will return an error
#' }
#'
#' @export
identify_query <- function(data, threshold = 2){
## variables to check for in each query type
spq_var <- c("PersonId", "Collaboration_hours", "Instant_Message_hours") # Standard Person query
caq_var <- c("PersonId", "Collaboration_hrs", "Instant_message_hours") # Collaboration Assessment query
smq_var <- c("MeetingId", "Date", "Attendees") # Standard Meeting Query
shc_var <- c("PersonId", "Emails_sent_00_01", "IMs_sent_23_24") # Standard Hourly Collaboration
## see if there are columns which do not exist
spq_check <- check_inputs(input = data, requirements = spq_var, return = "names")
caq_check <- check_inputs(input = data, requirements = caq_var, return = "names")
smq_check <- check_inputs(input = data, requirements = smq_var, return = "names")
shc_check <- check_inputs(input = data, requirements = shc_var, return = "names")
## length of checks
spq_check_n <- length(spq_check)
caq_check_n <- length(caq_check)
smq_check_n <- length(smq_check)
shc_check_n <- length(shc_check)
total_check_vec <- c(spq_check_n, caq_check_n, smq_check_n, shc_check_n)
## should never be zero
total_check_n <- sum(total_check_vec, na.rm = TRUE)
## Labels
qlabels <- c("Person Query",
"Collaboration Assessment Query",
"Meeting Query",
"Hourly Collaboration Query")
## Minimum number of non-matches
min_nm <- min(total_check_vec)
## Final guess
f_guess <- qlabels[which.min(total_check_vec)]
if(total_check_n == 0){
stop("Error: please check if query data is properly formatted query data.")
} else if(min_nm > threshold){
stop("Column mismatches: please check if query data is properly formatted query data.")
} else {
f_guess
}
}

63
R/identify_shifts.R Normal file

@ -0,0 +1,63 @@
#' @title Identify shifts based on outlook time settings for work day start and end time
#'
#' @description
#' This function uses outlook calendar settings for start and end time of work day to identify
#' work shifts. The relevant variables are `WorkingStartTimeSetInOutlook` and
#' `WorkingEndTimeSetInOutlook`.
#'
#' @param data A data frame containing data from the Hourly Collaboration query.
#'
#' @param return Character vector to specify what to return.
#' "plot" (default) returns a bar plot of the ten most frequent shifts.
#' "table" returns a summary table of the counts of shifts.
#' "data" returns the original input data frame with the `Shifts` column appended.
#'
#' @importFrom data.table ":=" "%like%" "%between%"
#'
#' @examples
#' dv_data %>% identify_shifts()
#'
#' @export
identify_shifts <- function(data, return = "plot"){
clean_times <- function(x){
out <- gsub(pattern = ":00", replacement = "", x = x)
as.numeric(out)
}
data <- data.table::as.data.table(data)
# data <- data.table::copy(data)
# Make sure data.table knows we know we're using it
.datatable.aware = TRUE
data[, Shifts := paste(WorkingStartTimeSetInOutlook, WorkingEndTimeSetInOutlook, sep = "-")]
# outputTable <- data[, .(count = .N), by = Shifts]
outputTable <- data[, list(WeekCount = .N,
PersonCount = dplyr::n_distinct(PersonId)), by = Shifts]
outputTable <- data.table::setorder(outputTable, -PersonCount)
if(return == "table"){
dplyr::as_tibble(outputTable)
} else if(return == "plot"){
outputTable %>%
utils::head(10) %>%
create_bar_asis(group_var = "Shifts",
bar_var = "WeekCount",
title = "Most frequent outlook shifts",
subtitle = "Showing top 10 only",
caption = extract_date_range(data, return = "text"),
ylab = "Shifts",
xlab = "Frequency")
} else if(return == "data"){
output_data <- data
dplyr::as_tibble(output_data)
}
}

114
R/identify_tenure.R Normal file

@ -0,0 +1,114 @@
#' @title Tenure calculation based on different input dates, returns data summary table or histogram
#'
#' @description
#' This function calculates employee tenure based on different input dates.
#' `identify_tenure` uses the latest date available if the user selects "Date",
#' but also has the flexibility to select a specific date, e.g. "1/1/2020".
#'
#' @family Data Validation
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param end_date A string specifying the name of the date variable representing the latest date. Defaults to "Date".
#' @param beg_date A string specifying the name of the date variable representing the hire date. Defaults to "HireDate".
#' @param maxten A numeric value representing the maximum tenure.
#' If the tenure exceeds this threshold, it would be accounted for in the flag message.
#' @param return String to specify what to return.
#' Defaults to "message".
#' Other valid values include "text", "plot", "data_cleaned", "data_dirty", and "data".
#' For "data", a data frame with the `PersonId` and a calculated variable called `TenureYear` is returned.
#'
#' @examples
#' \dontrun{
#' # Add HireDate to sq_data - method #1
#' sq_data2 <- sq_data %>% mutate(HireDate = as.Date("1/1/2015", format = "%m/%d/%Y" ))
#' identify_tenure(sq_data2)
#'
#' # Add HireDate to sq_data - method #2
#' sq_data$HireDate <-
#' rep(sample(seq(as.Date('1975/01/01'),
#' as.Date('2019/11/01'), by="day"), 15119),2)
#'
#' identify_tenure(sq_data)
#' }
#'
#' @export
identify_tenure <- function(data,
end_date = "Date",
beg_date = "HireDate",
maxten = 40,
return = "message"){
required_variables <- c("HireDate")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
data_prep <-
data %>%
mutate(Date = as.Date(Date, format= "%m/%d/%Y"), # Re-format `Date`
end_date = as.Date(!!sym(end_date), format= "%m/%d/%Y"), # Access a symbol, not a string
beg_date = as.Date(!!sym(beg_date), format= "%m/%d/%Y")) %>% # Access a symbol, not a string
arrange(end_date) %>%
mutate(End = last(end_date))
last_date <- data_prep$End
# graphing data
tenure_summary <-
data_prep %>%
filter(Date == last_date) %>%
mutate(tenure_years = (Date - beg_date)/365) %>%
group_by(tenure_years)%>%
summarise(n = n(),.groups = 'drop')
# Identify person IDs with unusually long (odd) tenure values
oddpeople <-
data_prep %>%
filter(Date == last_date) %>%
mutate(tenure_years = (Date - beg_date)/365) %>%
filter(tenure_years >= maxten) %>%
select(PersonId)
# message
Message <- paste0("The mean tenure is ",round(mean(tenure_summary$tenure_years,na.rm = TRUE),1)," years.\nThe max tenure is ",
round(max(tenure_summary$tenure_years,na.rm = TRUE),1),".\nThere are ",
length(tenure_summary$tenure_years[tenure_summary$tenure_years>=maxten])," employees with a tenure greater than ",maxten," years.")
if(return == "text"){
return(Message)
} else if(return == "message"){
message(Message)
} else if(return == "plot"){
suppressWarnings(
ggplot(data = tenure_summary,aes(x = as.numeric(tenure_years))) +
geom_density() +
labs(title = "Tenure - Density",
subtitle = "Calculated with `HireDate`") +
xlab("Tenure in Years") +
ylab("Density - number of employees") +
theme_wpa_basic()
)
} else if(return == "data_cleaned"){
return(data %>% filter(!(PersonId %in% oddpeople$PersonId)) %>% data.frame())
} else if(return == "data_dirty"){
return(data %>% filter((PersonId %in% oddpeople$PersonId)) %>% data.frame())
} else if(return == "data"){
data_prep %>%
filter(Date == last_date) %>%
mutate(TenureYear = as.numeric((Date - beg_date)/365)) %>%
select(PersonId, TenureYear)
} else {
stop("Error: please check inputs for `return`")
}
}

52
R/import_wpa.R Normal file

@ -0,0 +1,52 @@
#' @title Import a Workplace Analytics Query
#'
#' @description
#' Import a Workplace Analytics Query from a local CSV File, with variable classifications optimised
#' for other **wpa** functions.
#'
#' @details
#' `import_wpa()` uses `data.table::fread()` to import CSV files for speed,
#' and by default `stringsAsFactors` is set to FALSE.
#' A data frame is returned by the function (not a `data.table`).
#'
#' @param x String containing the path to the Workplace Analytics query to be imported.
#' The input file must be a CSV file, and the file extension must be explicitly entered,
#' e.g. "/files/standard query.csv"
#' @param standardise Logical. If TRUE, `import_wpa()` runs `standardise_pq()` to make a Collaboration
#' Assessment query's column names standard and consistent with a Standard Person Query. Note that this
#' will have no effect if the query being imported is not a Collaboration Assessment query. Defaults
#' to FALSE.
#'
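#' @examples
#' \dontrun{
#' # Minimal usage sketch; the file path below is purely illustrative.
#' person_query <- import_wpa("/files/standard query.csv")
#' }
#'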
#' @export
import_wpa <- function(x, standardise = FALSE){
return_data <- data.table::fread(x, stringsAsFactors = FALSE) %>% as.data.frame()
# Columns which are Dates
dateCols <- sapply(return_data, function(x) all(is_date_format(x)))
dateCols <- dateCols[dateCols == TRUE]
return_data <-
return_data %>%
dplyr::mutate_at(dplyr::vars(names(dateCols)), ~as.Date(., format = "%m/%d/%Y"))
message("Query has been imported successfully!")
## Query check only available for Person Queries
if(identify_query(return_data) == "Person Query"){
check_query(return_data)
}
## Standardise query if `standardise == TRUE`
if(standardise == TRUE & identify_query(return_data) == "Collaboration Assessment Query"){
message("Standardising column names for a Collaboration Assessment query to
a Person query...")
return_data <- standardise_pq(return_data)
}
dplyr::as_tibble(return_data)
}

7
R/init.R Normal file

@ -0,0 +1,7 @@
.onAttach <- function(libname, pkgname) {
message <- c("\n Thank you for using the {wpa} R package!",
"\n \n Our analysts have taken every care to ensure that this package runs smoothly and bug-free.",
"\n However, if you do happen to encounter any, please email mac@microsoft.com to report any issues.",
"\n \n Happy coding!")
packageStartupMessage(message)
}

111
R/internal_network_plot.R Normal file

@ -0,0 +1,111 @@
#' @title Plot the internal network metrics for an HR variable
#'
#' @description
#' Plot the internal network metrics for an HR variable.
#'
#' @param data Person Query as a dataframe, including a date column named "Date".
#' This function assumes the date format is MM/DD/YYYY, as is standard in a WpA query output.
#' @param hrvar WpA variable for an HR attribute to group networks by,
#' e.g. "Layer"
#' @param mingroup Numeric vector for minimum group size for aggregation
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#' @param bubble_size A numeric vector of length two to specify the size range of the bubbles
#'
#' @import dplyr
#' @import reshape2
#' @import ggplot2
#' @import ggrepel
#'
#' @examples
#' \dontrun{
#' internal_network_plot(sq_data)
#' }
#'
#' @family Connectivity
#'
#' @export
internal_network_plot <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot",
bubble_size = c(1, 5))
{
plot_data <-
data %>%
rename(group = !!sym(hrvar))
plot_data <-
plot_data %>%
group_by(group) %>%
summarize(Int_network_size = mean(Internal_network_size),
Int_network_breadth = mean(Networking_outside_organization),
Employee_count = n_distinct(PersonId)
) %>%
filter(Employee_count >= mingroup)
# Create summary table
summary_table <- plot_data %>% arrange(., desc(Int_network_size))
# Create plot
int_network_plot <-ggplot(plot_data,
aes(x=Int_network_size,
y=Int_network_breadth)
) +
geom_point(aes(size=Employee_count),
color = rgb2hex(0, 120, 212),
alpha = 0.5) +
geom_text_repel(label=plot_data$group) +
scale_x_continuous(name = "Internal Network Size") +
scale_y_continuous(name = "Internal Network Breadth") +
scale_size(range = bubble_size) +
theme_classic() +
theme(
axis.text = element_text(size = 10),
axis.text.x = element_text(
angle = 90,
hjust = 1,
vjust = 0.5
),
plot.title = element_text(
color = "grey40",
face = "bold",
size = 18
),
plot.subtitle = element_text(size = 14)
) +
labs(
title = paste("Internal network metrics by", hrvar),
subtitle = paste(
"Network size is number of people, breadth is number of organizations"
)
) +
labs(
caption = paste(
"Total employees =",
sum(plot_data$Employee_count),
"| Data from",
min(as.Date(data$Date, "%m/%d/%Y")),
"to",
max(as.Date(data$Date, "%m/%d/%Y"))
)
)
if(return == "table"){
summary_table %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(int_network_plot)
} else {
stop("Please enter a valid input for `return`.")
}
}

19
R/is_date_format.R Normal file

@ -0,0 +1,19 @@
#' @title Identify whether string is a date format
#'
#' @description
#' This function uses a regular expression to determine whether
#' a string is of the format "mdy", separated by "-", "/",
#' or ".", returning a logical vector.
#'
#' @param string Character string to test whether is a date format.
#'
#' @examples
#' \dontrun{
#' is_date_format("1/5/2020")
#' }
#'
#' @export
is_date_format <- function(string){
grepl("^\\d{1,2}[- /.]\\d{1,2}[- /.]\\d{1,4}$",
string)
}

183
R/keymetrics_report.R Normal file

@ -0,0 +1,183 @@
# #' @title Create an exploratory overview of Standard Query dataset
# #'
# #' @description
# #' Input a Standard Query dataset as a data frame in the argument, returning an overview of the key metrics.
# #'
# #' @param data A Standard Query dataset in the form of a data frame.
# #' @param mingroup Numeric value setting the privacy threshold / minimum group size.
# #'
# #' @import dplyr
# #' @import ggplot2
# #' @import reshape2
# #'
# #' @return
# #' Returns a HTML report containing an overview of the key metrics in a Standard Person Query.
# #'
# #' @export
#
# keymetrics_report <- function(data, mingroup = 10){
# ## Table by Person (Period Averages)
# data %>%
# group_by(PersonId) %>%
# summarise(periods = n(),
# Organization = first(Organization),
# LevelDesignation = first(LevelDesignation)) -> mydata_person
#
# ## Plot available data by Organization
# mydata_person %>% count(Organization, name = "count") -> mydata_OrgLevel
#
# mydata_OrgLevel %>%
# ggplot(aes(x = Organization, y = 1, label=count)) +
# geom_point(aes(size = count), colour = "steelblue", alpha = 0.5) +
# scale_size_continuous(range = c(20, 50)) +
# geom_label() +
# ggtitle("Measured Individuals (by Organization)") +
# ylab("") +
# xlab("Organization") +
# theme(legend.position="none",
# axis.text.x = element_text(angle = 90, hjust = 1),
# axis.title.y = element_blank(),
# axis.text.y = element_blank(),
# axis.ticks.y = element_blank()) -> plot_OrgLevel
#
# ## Plot available data by LevelDesignation
# mydata_person %>% count(LevelDesignation, name = "count") -> mydata_LevelDesignation
#
# mydata_LevelDesignation %>%
# ggplot(aes(x=LevelDesignation, y=1, label=count)) +
# geom_point(aes(size = count), colour = "steelblue", alpha = 0.5) +
# scale_size_continuous(range = c(20, 50)) +
# geom_label() +
# ggtitle("Measured Individuals (by Level Designation)") +
# ylab("") + xlab("Level Designation") +
# theme(legend.position="none",
# axis.text.x = element_text(angle = 90, hjust = 1),
# axis.title.y = element_blank(),
# axis.text.y = element_blank(),
# axis.ticks.y = element_blank(),
# plot.title = element_text(color="grey40", face="bold", size=20)) -> plot_LevelDesignation
#
# ## Plot available data by Organization and Level
# mydata_person %>% count(Organization, LevelDesignation, name = "count") -> mydata_OrgLevelDesignation
#
# mydata_OrgLevelDesignation %>%
# ggplot(aes(x = Organization, y = LevelDesignation)) +
# geom_tile(aes(fill = count), colour = "grey50") +
# geom_text(aes(label = round(count, 1))) +
# scale_fill_gradient(low = "lightblue", high = "blue",
# limits=c(mingroup,NA)) +
# labs(fill = "Count") +
# ggtitle("Group Sizes (by Organization and Level)") +
# ylab("Level Designation") +
# theme(axis.text.x = element_text(angle = 45, hjust = 1),
# plot.title = element_text(color="grey40", face="bold", size=20)) -> plot_OrgLevelDesignation
#
# ## Pipe in Manager Relationship
# data %>% mgrrel_matrix(return = "plot") -> plot_ManagerRelationship
#
# ## Pipe in WITL
# data %>% keymetrics_scan(hrvar = "Organization", return = "plot") -> plot_WITL
#
# ## Pipe in Meeting Habits
# data %>% meeting_summary(hrvar = "Organization", return = "plot") -> plot_MeetingHabits
#
# ## Generate kables
#
# mydata_OrgLevel %>%
# knitr::kable(caption = "Data by Organization") %>%
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_OrgLevel
#
# mydata_LevelDesignation %>%
# knitr::kable(caption = "Data by Level Designation") %>%
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_LevelDesignation
#
# mydata_OrgLevelDesignation %>%
# spread(LevelDesignation, count) %>%
# knitr::kable(caption = "Data by Level Designation") %>%
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_OrgLevelDesignation
#
# ## Pipe in Manager Relationship
# data %>%
# mgrrel_matrix(return = "table") %>%
# knitr::kable(caption = "Distribution of Manager-Direct Relationship") %>%
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_ManagerRelationship
#
# ## Pipe in WITL
# data %>%
# keymetrics_scan(hrvar = "Organization", return = "table") %>%
# mutate_at(vars(-variable), ~round(., 1)) %>%
# knitr::kable(caption = "Week in the Life - by Organization") %>%
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_WITL
#
# ## Pipe in Meeting Habits
# data %>%
# meeting_summary(hrvar = "Organization", return = "table") %>%
# mutate_at(vars(-variable), ~round(., 1)) %>%
# knitr::kable(caption = "Meeting Habits - by Organization") %>%
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_MeetingHabits
#
# markobj <- c('---',
# 'title: "WPA Standard Query Overview"',
# 'output: ',
# ' html_document:',
# ' theme: united',
# ' toc: true',
# ' toc_float:',
# ' collapsed: false',
# ' smooth_scroll: true',
#
# '---',
# '',
# '',
#
# '### Week in the Life',
# '```{r, echo=FALSE}',
# 'plot_WITL',
# '```',
# '```{r, echo=FALSE}',
# 'tb_WITL',
# '```',
#
# '### Organization Level',
#
# '```{r, echo=FALSE}',
# 'tb_OrgLevel',
# '```',
#
# '### Level Designation',
# '```{r, echo=FALSE}',
# 'plot_LevelDesignation',
# '```',
# '```{r, echo=FALSE}',
# 'tb_LevelDesignation',
# '```',
#
# '### Organization and Level Designation',
# '```{r, echo=FALSE}',
# 'plot_OrgLevelDesignation',
# '```',
# '```{r, echo=FALSE}',
# 'tb_OrgLevelDesignation',
# '```',
#
# '### Meeting Habits',
# '```{r, echo=FALSE}',
# 'plot_MeetingHabits',
# '```',
# '```{r, echo=FALSE}',
# 'tb_MeetingHabits',
# '```',
#
# '### Manager-direct relationship Distribution',
# '```{r, echo=FALSE}',
# 'plot_ManagerRelationship',
# '```',
# '```{r, echo=FALSE}',
# 'tb_ManagerRelationship')
#
# writeLines(markobj, "SQ-overview.Rmd")
# rmarkdown::render("SQ-overview.Rmd")
# utils::browseURL("SQ-overview.html")
# unlink(c("SQ-overview.Rmd",
# "SQ-overview.md"))
# }

118
R/keymetrics_scan.R Normal file

@ -0,0 +1,118 @@
#' @title Run a summary of Key Metrics from the Standard Query data
#'
#' @description
#' Returns a heatmap of the key metrics by default, with an option to return a summary table.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector, e.g. "Organization"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param metrics A character vector containing the variable names to calculate averages of.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#' @param textsize A numeric value specifying the text size to show in the plot.
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @importFrom stats reorder
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @examples
#' keymetrics_scan(sq_data, hrvar = "LevelDesignation", return = "table")
#'
#' @export
keymetrics_scan <- function(data,
hrvar = "Organization",
mingroup = 5,
metrics = c("Workweek_span",
"Collaboration_hours",
"After_hours_collaboration_hours",
"Meetings",
"Meeting_hours",
"After_hours_meeting_hours",
"Low_quality_meeting_hours",
"Meeting_hours_with_manager_1_on_1",
"Meeting_hours_with_manager",
"Emails_sent",
"Email_hours",
"After_hours_email_hours",
"Generated_workload_email_hours",
"Total_focus_hours",
"Internal_network_size",
"Networking_outside_organization",
"External_network_size",
"Networking_outside_company"),
return = "plot",
textsize = 2){
myTable <-
data %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
group_by(group, PersonId) %>%
summarise_at(vars(metrics), ~mean(., na.rm = TRUE)) %>%
group_by(group) %>%
summarise_at(vars(metrics), ~mean(., na.rm = TRUE)) %>%
left_join(hrvar_count(data, hrvar = hrvar, return = "table") %>%
rename(Employee_Count = "n"),
by = c("group" = hrvar)) %>%
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
myTable %>%
reshape2::melt(id.vars = "group") %>%
reshape2::dcast(variable ~ group) -> myTable_wide
myTable_long <- reshape2::melt(myTable, id.vars=c("group")) %>%
mutate(variable = factor(variable)) %>%
group_by(variable) %>%
# Heatmap by row
mutate(value_rescaled = value/mean(value)) %>%
ungroup()
# Underscore to space
us_to_space <- function(x){
gsub(pattern = "_", replacement = " ", x = x)
}
plot_object <-
myTable_long %>%
filter(variable != "Employee_Count") %>%
ggplot(aes(x = group,
y = stats::reorder(variable, desc(variable)))) +
geom_tile(aes(fill = value_rescaled)) +
geom_text(aes(label=round(value, 1)), size = textsize) +
scale_fill_distiller(palette = "Blues", direction = 1) +
scale_x_discrete(position = "top") +
scale_y_discrete(labels = us_to_space) +
theme_light() +
labs(title = "Key Workplace Analytics metrics",
subtitle = paste("Weekly average by", camel_clean(hrvar)),
y =" ",
x =" ",
caption = extract_date_range(data, return = "text")) +
theme(axis.text.x = element_text(angle = 90, hjust = 0),
plot.title = element_text(color="grey40", face="bold", size=20)) +
guides(fill=FALSE)
if(return == "table"){
myTable_wide %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

41
R/meeting_dist.R Normal file

@ -0,0 +1,41 @@
#' @title Meeting Hours distribution
#'
#' @description
#' Analyze Meeting Hours distribution.
#' Returns a stacked bar plot by default.
#' Additional options available to return a table with distribution elements.
#'
#' @inheritParams create_dist
#'
#' @family Meeting Culture
#'
#' @examples
#' ## Return a plot
#' meeting_dist(sq_data, hrvar = "Organization")
#'
#' ## Return a table
#' meeting_dist(sq_data, hrvar = "Organization", return = "table")
#'
#' ## Return result with custom-specified breaks
#' meeting_dist(sq_data, hrvar = "LevelDesignation", cut = c(4, 7, 9))
#'
#'
#' @export
meeting_dist <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot",
cut = c(5, 10, 15)) {
create_dist(data = data,
metric = "Meeting_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return,
cut = cut)
}
#' @rdname meeting_dist
#' @export
meeting_distribution <- meeting_dist

27
R/meeting_fizz.R Normal file

@ -0,0 +1,27 @@
#' @title Distribution of Meeting Hours (Fizzy Drink plot)
#'
#' @description
#' Analyze weekly meeting hours distribution, and returns
#' a 'fizzy' scatter plot by default.
#' Additional options available to return a table with distribution elements.
#'
#' @inheritParams create_fizz
#'
#' @family Meetings
#'
#' @examples
#' meeting_fizz(sq_data, hrvar = "Organization", return = "table")
#' @export
meeting_fizz <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
create_fizz(data = data,
metric = "Meeting_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return)
}

37
R/meeting_line.R Normal file

@ -0,0 +1,37 @@
#' @title Meeting Time Trend - Line Chart
#'
#' @description
#' Provides a week by week view of meeting time, visualised as line charts.
#' By default returns a line chart for meeting hours,
#' with a separate panel per value in the HR attribute.
#' Additional options available to return a summary table.
#'
#' @inheritParams create_line
#'
#' @family Meetings
#'
#' @examples
#'
#' ## Return a line plot
#' meeting_line(sq_data, hrvar = "LevelDesignation")
#'
#'
#' ## Return a table
#' meeting_line(sq_data, hrvar = "LevelDesignation", return = "table")
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export
meeting_line <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
## Inherit arguments
create_line(data = data,
metric = "Meeting_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return)
}

106
R/meeting_quality.R Normal file

@ -0,0 +1,106 @@
#' @title Run a meeting habits / meeting quality analysis
#'
#' @description
#' Returns a scatter plot of average weekly meeting hours against the percentage of
#' low-quality meeting hours by group. Additional option available to return a summary table.
#'
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector, e.g. "Organization"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" or "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @family Meeting Culture
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
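#' @examples
#' \dontrun{
#' # Minimal usage sketch (assumes the built-in `sq_data` demo dataset).
#' meeting_quality(sq_data, hrvar = "Organization", return = "plot")
#' }
#'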
#' @export
meeting_quality <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
## Date range data frame
myPeriod <- extract_date_range(data)
## Prepare Table
data %>%
rename(group = !!sym(hrvar)) %>% # Rename hrvar to `group`
group_by(PersonId, group) %>%
summarize(Meeting_count = mean(Meetings),
Meeting_hours = mean(Meeting_hours),
Low_quality_meeting_hours = mean(Low_quality_meeting_hours),
perc_after_hours_m = 100*mean(After_hours_meeting_hours)/mean(Meeting_hours),
perc_low_quality = 100*mean(Low_quality_meeting_hours)/mean(Meeting_hours),
perc_Multitasking = 100*mean(Multitasking_meeting_hours)/mean(Meeting_hours),
perc_Redundant = 100*mean(Redundant_meeting_hours__organizational_)/mean(Meeting_hours),
perc_conflicting = 100*mean(Conflicting_meeting_hours)/mean(Meeting_hours)) %>%
group_by(group) %>%
summarise_at(vars(Meeting_count,
Meeting_hours,
Low_quality_meeting_hours,
perc_after_hours_m,
perc_low_quality,
perc_Multitasking,
perc_Redundant,
perc_conflicting),
~mean(.)) %>%
left_join(data %>%
rename(group = !!sym(hrvar)) %>%
group_by(group) %>%
summarise(Employee_Count = n_distinct(PersonId)),
by = "group") %>%
filter(Employee_Count >= mingroup) -> myTable
myTable_wide <- myTable %>%
reshape2::melt(id.vars = "group") %>%
reshape2::dcast(variable ~ group)
myTable_long <- reshape2::melt(myTable, id.vars=c("group"))
myTable_plot <- myTable %>% select(group, perc_low_quality, Meeting_hours)
## Bar plot
plot_object <-
myTable_plot %>%
ggplot(aes(x = perc_low_quality, y = Meeting_hours, size=2)) +
geom_point(stat = "identity",
fill = "#203864", alpha=0.1) +
geom_text(aes(label = group),
hjust = 0.2,
color = "black",
fontface = "bold",
size = 4)+
ylim(min(myTable_plot$Meeting_hours),max(myTable_plot$Meeting_hours) + 2) +
xlim(min(myTable_plot$perc_low_quality),max(myTable_plot$perc_low_quality) + 2) +
theme_wpa_basic() +
theme(axis.text=element_text(size=12),
plot.title = element_text(color="grey40", face="bold", size=18),
plot.subtitle = element_text(size=14),
legend.position = "none",
legend.justification = "right",
legend.title=element_text(size=14),
legend.text=element_text(size=14)) +
labs(title = "Meeting Quality",
subtitle = paste("Meeting time and low-quality percentage by", tolower(hrvar))) +
ylab("Average weekly meeting hours") +
xlab("Average weekly percentage of low-quality meeting hours") +
labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))
if(return == "table"){
myTable %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

47
R/meeting_rank.R Normal file

@ -0,0 +1,47 @@
#' @title Meeting Hours Ranking
#'
#' @description
#' This function scans a standard query output for groups with high levels of weekly meeting hours.
#' Returns a table with all groups (across multiple HR attributes) ranked by meeting hours.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar A list of HR Variables to consider in the scan.
#' Defaults to all HR attributes identified.
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
#' Defaults to 5.
#' @param return A character vector specifying what to return.
#' Valid values include "table" (default, returning a data frame)
#' and "html" (an interactive DataTable).
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats reorder
#'
#' @family Meetings
#'
#' @return
#' Returns a data frame by default, where 'table' is passed in `return`.
#' When 'html' is passed, an interactive DataTable is returned.
#'
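#' @examples
#' \dontrun{
#' # Minimal usage sketch: rank all HR attribute groups by meeting hours.
#' # Assumes the built-in `sq_data` demo dataset is available.
#' meeting_rank(sq_data, return = "table")
#' }
#'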
#' @export
meeting_rank <- function(data,
hrvar = extract_hr(data),
mingroup = 5,
return = "table"){
output <-
data %>% create_rank(metric="Meeting_hours", hrvar = hrvar,
mingroup = mingroup,
return = "table")
if(return == "html"){
return(create_dt(output))
} else if(return == "table"){
return(output)
} else {
stop("Invalid `return` argument.")
}
}

101
R/meeting_skim.R Normal file

@ -0,0 +1,101 @@
#' @title Produce a skim summary of meeting hours
#'
#' @description
#' This function returns a skim summary in the console
#' when provided a standard person query as the input.
#'
#' @param data A standard person query data in the form of a data frame.
#' @param return A character string to specify what to return.
#' Defaults to "message". Valid options include "message", "text", and
#' "table".
#'
#' @import dplyr
#' @examples
#' meeting_skim(sq_data)
#'
#' @export
meeting_skim <- function(data, return = "message"){
key_output <-
data %>%
select(PersonId,
Meeting_hours,
Conflicting_meeting_hours,
Multitasking_meeting_hours,
Redundant_meeting_hours__lower_level_,
Redundant_meeting_hours__organizational_,
Low_quality_meeting_hours) %>%
summarise_at(vars(-PersonId), ~sum(.)) %>% # sum total
tidyr::gather(HourType, Hours, -Meeting_hours) %>%
mutate_at(vars(Hours), ~./Meeting_hours) %>%
mutate(RawHours = Hours * Meeting_hours)
mh_total <- round(key_output$Meeting_hours[1])
mh_lowqualityprop <-
key_output %>%
filter(HourType == "Low_quality_meeting_hours") %>%
pull(Hours) %>%
`*`(100) %>%
round() %>%
paste0("%")
bracket <- function(text){
paste0("(", text, ")")
}
extract_prop <- function(filt_chr){
key_output %>%
filter(HourType == filt_chr) %>%
pull(Hours) %>%
`*`(100) %>%
round() %>%
paste0("%")
}
extract_raw <- function(filt_chr){
key_output %>%
filter(HourType == filt_chr) %>%
pull(RawHours) %>%
round()
}
print_text <-
paste("There are",
mh_total,
"total meeting hours across the analysis population.\n",
">>>",
extract_raw("Low_quality_meeting_hours"),
"are low quality",
bracket(extract_prop("Low_quality_meeting_hours")),
"\n",
">>>",
extract_raw("Redundant_meeting_hours__organizational_"),
"are redundant",
bracket(extract_prop("Redundant_meeting_hours__organizational_")),
"\n",
">>>",
extract_raw("Conflicting_meeting_hours"),
"are conflicting",
bracket(extract_prop("Conflicting_meeting_hours")),
"\n",
">>>",
extract_raw("Multitasking_meeting_hours"),
"are multitasking.",
bracket(extract_prop("Multitasking_meeting_hours")))
if(return == "message"){
message(print_text)
} else if(return == "text"){
print_text <- gsub(pattern = ">>>", replacement = " - ", x = print_text)
print_text
} else if(return == "table"){
key_output
} else {
stop("Please check `return`")
}
}

42
R/meeting_summary.R Normal file

@ -0,0 +1,42 @@
#' @title Meeting Summary
#'
#' @description
#' Provides an overview analysis of weekly meeting hours.
#' Returns a bar plot showing average weekly meeting hours by default.
#' Additional options available to return a summary table.
#'
#' @inheritParams create_bar
#'
#' @family Meetings
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @examples
#' # Return a ggplot bar chart
#' meeting_summary(sq_data, hrvar = "LevelDesignation")
#'
#' # Return a summary table
#' meeting_summary(sq_data, hrvar = "LevelDesignation", return = "table")
#' @export
meeting_summary <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
create_bar(data = data,
metric = "Meeting_hours",
hrvar = hrvar,
mingroup = mingroup,
return = return,
bar_colour = "darkblue")
}

78
R/meeting_tm_report.R Normal file

@ -0,0 +1,78 @@
#' @title Generate a Meeting Text Mining report in HTML
#'
#' @description
#' Create a text mining report in HTML based on Meeting Subject Lines
#'
#' @param data A Meeting Query dataset in the form of a data frame.
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
#' For example, "meeting text mining report".
#' @param stopwords A single-column data frame labelled 'word' containing custom stopwords to remove.
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
#' Defaults to TRUE.
#' @param keep A numeric vector specifying maximum number of words to keep.
#' @param seed A numeric vector to set seed for random generation.
#'
#' @family Text-mining
#'
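#' @examples
#' \dontrun{
#' # Minimal usage sketch; writes an HTML report to the working directory.
#' # Assumes `mt_data`, the demo Meeting Query dataset with subject lines, is available.
#' meeting_tm_report(mt_data, path = "meeting text mining report")
#' }
#'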
#' @export
meeting_tm_report <- function(data,
path = "meeting text mining report",
stopwords = NULL,
timestamp = TRUE,
keep = 100,
seed = 100){
## Create timestamped path (if applicable)
if(timestamp == TRUE){
newpath <- paste(path, wpa::tstamp())
} else {
newpath <- path
}
# Set outputs
output_list <-
list(md2html(text = read_preamble("blank.md")), # Header
data %>% tm_wordcloud(stopwords = stopwords, keep = keep),
data %>% tm_freq(token = "words", stopwords = stopwords, keep = keep),
data %>% tm_freq(token = "words", stopwords = stopwords, keep = keep, return = "table"),
data %>% tm_freq(token = "ngrams", stopwords = stopwords, keep = keep),
data %>% tm_freq(token = "ngrams", stopwords = stopwords, keep = keep, return = "table"),
data %>% tm_cooc(stopwords = stopwords, seed = seed),
data %>% tm_cooc(stopwords = stopwords, seed = seed, return="table")) %>%
purrr::map_if(is.data.frame, create_dt)
# Set header titles
title_list <-
c("Text Mining Report", # Section header
"Word cloud",
"Word Frequency",
"",
"Phrase Frequency",
"",
"Word Co-occurrence",
"")
# Set header levels
n_title <- length(title_list)
levels_list <- rep(3, n_title)
levels_list[c(1)] <- 2 # Section header
# Generate report
generate_report(title = "Analysis of Meeting Subject Lines",
filename = newpath,
outputs = output_list,
titles = title_list,
subheaders = rep("", n_title),
echos = rep(FALSE, n_title),
levels = levels_list,
theme = "cosmo",
preamble = read_preamble("meeting_tm_report.md"))
}

105
R/meeting_trend.R Normal file

@ -0,0 +1,105 @@
#' @title Meeting Hours Time Trend
#'
#' @description
#' Provides a week by week view of meeting time.
#' By default returns a week by week heatmap, highlighting the points in time with most activity.
#' Additional options available to return a summary table.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#'
#' @family Collaboration
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
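#' @examples
#' \dontrun{
#' # Minimal usage sketch (assumes the built-in `sq_data` demo dataset).
#' meeting_trend(sq_data, hrvar = "LevelDesignation")
#' }
#'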
#' @export
meeting_trend <- function(data,
hrvar = "Organization",
mingroup=5,
return = "plot"){
## Check inputs
required_variables <- c("Date",
"Meeting_hours",
"PersonId")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
myPeriod <-
data %>%
mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
arrange(Date) %>%
mutate(Start = first(Date), End=last(Date)) %>%
filter(row_number() == 1) %>%
select(Start, End)
myTable <-
data %>%
mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
select(PersonId, Date, group, Meeting_hours) %>%
group_by(group) %>%
mutate(Employee_Count = n_distinct(PersonId)) %>%
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
myTable <-
myTable %>%
group_by(Date, group) %>%
summarize(Employee_Count = mean(Employee_Count),
Meeting_hours = mean(Meeting_hours))
myTable_plot <- myTable %>% select(Date, group, Meeting_hours)
myTable_return <- myTable_plot %>% spread(Date, Meeting_hours)
plot_object <-
myTable_plot %>%
ggplot(aes(x = Date , y = group , fill = Meeting_hours)) +
geom_tile(height=.5) +
scale_fill_gradient(name="Hours", low = "white", high = "red") +
theme_classic() +
theme(axis.text=element_text(size=12),
plot.title = element_text(color="grey40", face="bold", size=18),
plot.subtitle = element_text(size=14),
legend.position = "right",
legend.justification = "right",
legend.title=element_text(size=14),
legend.text=element_text(size=14)) +
labs(title = "Meeting Hours",
subtitle = paste("Total meeting time by", tolower(hrvar))) +
xlab("Date") +
ylab(hrvar) +
labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))
if(return == "table"){
myTable_return %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

40
R/meetingtype_dist.R Normal file

@ -0,0 +1,40 @@
#' @title Meeting Type Distribution
#'
#' @description
#' Calculate the hour distribution of internal meeting types.
#' This is a wrapper around `meetingtype_dist_mt()` and
#' `meetingtype_dist_ca()`, depending on whether a Meeting Query or a Collaboration Assessment Query
#' is passed as an input.
#'
#' @param data Data frame. If a Meeting Query, must contain the variables `Attendees` and `DurationHours`.
#' @param hrvar Character string to specify the HR attribute to split the data by.
#' Note that this is only applicable if a Collaboration Assessment query is passed to the function. If a Meeting Query
#' is passed instead, this argument is ignored.
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' As with `hrvar`, this is only applicable if a Collaboration Assessment query is passed to the function.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import ggplot2
#' @import dplyr
#' @importFrom stats setNames
#'
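#' @examples
#' \dontrun{
#' # Minimal usage sketch; dispatches on the query type of the input.
#' # Assumes `mt_data`, the demo Meeting Query dataset, is available.
#' meetingtype_dist(mt_data)
#' }
#'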
#' @export
meetingtype_dist <- function(data,
hrvar = NULL,
mingroup = 5,
return = "plot"){
if("MeetingId" %in% names(data)){
message("Calculating results using a Meeting Query...")
meetingtype_dist_mt(data, return = return)
} else if("PersonId" %in% names(data)){
message("Calculating results using a Collaboration Assessment Query...")
meetingtype_dist_ca(data, hrvar = hrvar, mingroup = mingroup, return = return)
} else {
stop("Please check query type. Must be either a Collaboration Assessment Query or a Meeting Query.")
}
}

230
R/meetingtype_dist_ca.R Normal file

@ -0,0 +1,230 @@
#' @title Meeting Type Distribution (Collaboration Assessment Query)
#'
#' @description
#' Calculate the hour distribution of internal meeting types,
#' using a Collaboration Assessment Query with core WpA variables as an input.
#'
#' @param data A Collaboration Assessment query in the form of a data frame.
#' Must contain the core WpA meeting type hour variables, e.g. `Bloated_meeting_hours`.
#' @param hrvar Character string to specify the HR attribute to split the data by. Defaults to NULL.
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import ggplot2
#' @import dplyr
#' @importFrom stats setNames
#'
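#' @examples
#' \dontrun{
#' # Minimal usage sketch; `ca_query_df` is a placeholder for a Collaboration
#' # Assessment query containing the core meeting type hour variables.
#' meetingtype_dist_ca(ca_query_df, hrvar = "Organization")
#' }
#'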
#' @export
meetingtype_dist_ca <- function(data,
hrvar = NULL,
mingroup = 5,
return = "plot"){
mt_dist_str <- c("Bloated_meeting_hours",
"Lengthy_meeting_hours",
"Workshop_meeting_hours",
"All_hands_meeting_hours",
"Status_update_meeting_hours",
"Decision_making_meeting_hours",
"One_on_one_meeting_hours")
mt_dist_str_clean <-
mt_dist_str %>%
gsub(pattern = "_meeting_hours", replacement = "", x = .) %>%
gsub(pattern = "_", replacement = " ", x = .)
## Add dummy "Total" column if hrvar = NULL
if(is.null(hrvar)){
data <- mutate(data, Total = "Total")
hrvar <- "Total"
}
## No org splits
if(hrvar == "Total"){
myResultsTable <-
data %>%
summarise_at(vars(mt_dist_str), ~sum(., na.rm = TRUE)) %>%
gather(MeetingType, AttendeeMeetingHours) %>%
mutate(Prop = AttendeeMeetingHours / sum(AttendeeMeetingHours),
Percent = paste(round(Prop * 100), "%")) %>%
mutate(MeetingType = gsub(pattern = "_meeting_hours", replacement = "", x = MeetingType)) %>%
mutate(MeetingType = gsub(pattern = "_", replacement = " ", x = MeetingType))
## Only for creating the bottom row data
myResultsTableTotal <-
data %>%
summarise_at(vars(mt_dist_str), ~sum(., na.rm = TRUE)) %>%
gather(MeetingType, AttendeeMeetingHours) %>%
mutate(MeetingType = "Total") %>%
group_by(MeetingType) %>%
summarise(AttendeeMeetingHours = sum(AttendeeMeetingHours, na.rm = TRUE)) %>%
mutate(Prop = AttendeeMeetingHours / sum(AttendeeMeetingHours),
Percent = paste(round(Prop * 100), "%"))
outputTable <-
rbind(myResultsTable,
myResultsTableTotal)
} else {
## Group by hrvar
myResultsTable <-
data %>%
group_by(!!sym(hrvar)) %>%
summarise_at(vars(mt_dist_str), ~sum(., na.rm = TRUE)) %>%
left_join(data %>% hrvar_count(hrvar = hrvar, return = "table"),
by = hrvar) %>%
filter(n >= mingroup) %>%
gather(MeetingType, AttendeeMeetingHours, -!!sym(hrvar), -n) %>%
group_by(!!sym(hrvar)) %>%
mutate(Prop = AttendeeMeetingHours / sum(AttendeeMeetingHours),
Percent = paste(round(Prop * 100), "%")) %>%
mutate(MeetingType = gsub(pattern = "_meeting_hours", replacement = "", x = MeetingType)) %>%
mutate(MeetingType = gsub(pattern = "_", replacement = " ", x = MeetingType))
outputTable <-
myResultsTable %>%
ungroup() %>%
select(!!sym(hrvar), MeetingType, Prop) %>%
spread(MeetingType, Prop)
}
if(hrvar == "Total"){
base_df <-
data.frame(id = 1:7,
value = c("All hands",
"Bloated",
"Status update",
"Decision making",
"One on one",
"Lengthy",
"Workshop"))
duration <-
c(0, 0, 3, 3,
0, 0, 2, 2,
0, 0, 1, 1,
0, 0, 1, 1,
0, 0, 1, 1,
1, 1, 2, 2,
2, 2, 3, 3)
attendees <-
c(4, 5, 5, 4,
3, 4, 4, 3,
2, 3, 3, 2,
1, 2, 2, 1,
0, 1, 1, 0,
0, 3, 3, 0,
0, 4, 4, 0)
main_plot_df <-
rbind(base_df,
base_df,
base_df,
base_df) %>%
arrange(id) %>%
mutate(Duration = duration,
Attendees = attendees)
label_df <-
main_plot_df %>%
group_by(id, value) %>%
summarise(x = mean(Duration),
y = mean(Attendees)) %>%
mutate(value2 = sub(" ", "\n", value),
text = ifelse(id %in% c(1, 2, 6, 7),"#FFFFFF", "black"),
fill = ifelse(id %in% c(1, 2, 6, 7),"#31617C", "#9FE5FF")) %>%
left_join(select(myResultsTable, MeetingType, Percent),
by = c("value" = "MeetingType")) %>%
mutate(Percent = ifelse(is.na(Percent), "0 %", Percent)) %>%
mutate(value2 = paste(value2, Percent, sep = "\n"))
colo_v <- setNames(label_df$fill, nm = base_df$value)
text_v <- setNames(label_df$text, nm = label_df$value)
plot_object <-
main_plot_df %>%
ggplot(aes(x = Duration, y = Attendees)) +
geom_polygon(aes(fill = value, group = id),
colour = "#FFFFFF") +
scale_fill_manual(values = colo_v) +
geom_text(data = label_df, aes(x = x, y = y, label = value2, colour = value), size = 3) +
scale_colour_manual(values = text_v) +
scale_x_continuous(breaks = 0:3,
labels = c("0", "1", "2+", "")) +
scale_y_continuous(breaks = 0:5,
labels = c("2", "3-8", "9-18",
"19-50", "51+", "")) +
labs(title = "Meeting types by attendees and duration",
subtitle = "% of total time spent in meetings",
ylab = "Attendees",
xlab = "Duration (hours)",
caption = extract_date_range(data, return = "text")) +
theme_wpa_basic() +
theme(legend.position = "none")
} else {
plot_object <-
myResultsTable %>%
mutate(MeetingType = factor(MeetingType, levels = rev(mt_dist_str_clean))) %>%
mutate(Fill = case_when(MeetingType == "Bloated" ~ rgb2hex(180,180,180),
MeetingType == "Lengthy" ~ rgb2hex(120,120,120),
MeetingType == "Workshop" ~ rgb2hex(90,90,90),
MeetingType == "All hands" ~ rgb2hex(60,60,60),
MeetingType == "Status update" ~ rgb2hex(45,160,160),
MeetingType == "Decision making" ~ rgb2hex(25,120,140),
MeetingType == "One on one" ~ rgb2hex(5,85,100))) %>%
ggplot(aes(x = !!sym(hrvar), y = Prop, group = MeetingType, fill = Fill)) +
geom_bar(position = "stack", stat = "identity") +
geom_text(aes(label = paste(round(Prop * 100), "%")),
position = position_stack(vjust = 0.5),
color = "#FFFFFF",
fontface = "bold") +
scale_fill_identity(name = "Coaching styles",
breaks = c(
rgb2hex(180,180,180),
rgb2hex(120,120,120),
rgb2hex(90,90,90),
rgb2hex(60,60,60),
rgb2hex(45,160,160),
rgb2hex(25,120,140),
rgb2hex(5,85,100)
),
labels = mt_dist_str_clean,
guide = "legend") +
coord_flip() +
theme_wpa_basic() +
labs(title = "Meeting types by attendees and duration",
subtitle = "% of total time spent in meetings",
y = "Percentage",
caption = extract_date_range(data, return = "text"))
}
if(return == "table"){
outputTable
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

143
R/meetingtype_dist_mt.R Normal file

@ -0,0 +1,143 @@
#' @title Meeting Type Distribution (Meeting Query)
#'
#' @description
#' Calculate the hour distribution of internal meeting types,
#' using a Meeting Query with core WpA variables as an input.
#'
#' @param data Meeting Query data frame. Must contain the variables `Attendees`, `DurationHours` and `Attendee_meeting_hours`.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import ggplot2
#' @import dplyr
#' @importFrom stats setNames
#'
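#' @examples
#' \dontrun{
#' # Minimal usage sketch; assumes `mt_data`, the demo Meeting Query dataset,
#' # carries `Attendees`, `DurationHours` and `Attendee_meeting_hours`.
#' meetingtype_dist_mt(mt_data)
#' }
#'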
#' @export
meetingtype_dist_mt <- function(data, return = "plot"){
## Date range data frame
myPeriod <- extract_date_range(data)
data_typed <-
data %>%
mutate(MeetingType = case_when(
Attendees == 2 & DurationHours <= 1 ~ "One on one",
Attendees >= 3 & Attendees <= 8 & DurationHours <= 1 ~ "Decision making",
Attendees >= 9 & Attendees <= 18 & DurationHours <= 1 ~ "Status update",
Attendees >= 19 & Attendees <= 50 & DurationHours <= 2 ~ "Bloated",
Attendees <= 18 & DurationHours >= 1 & DurationHours <= 2 ~ "Lengthy",
Attendees >= 51 ~ "All hands",
Attendees <= 50 & DurationHours >= 2 ~ "Workshop",
TRUE ~ NA_character_))
myResultsTable <-
data_typed %>%
group_by(MeetingType) %>%
summarise(AttendeeMeetingHours = sum(Attendee_meeting_hours),
TotalMeetings = n()) %>%
mutate(Prop = AttendeeMeetingHours / sum(AttendeeMeetingHours)) %>%
mutate(Percent = paste(round(Prop * 100), "%"))
myResultsTableTotal <-
data_typed %>%
summarise(AttendeeMeetingHours = sum(Attendee_meeting_hours),
TotalMeetings = n()) %>%
mutate(Prop = AttendeeMeetingHours / sum(AttendeeMeetingHours),
Percent = paste(round(Prop * 100), "%"),
MeetingType = "Total") %>%
select(MeetingType, tidyselect::everything())
base_df <-
data.frame(id = 1:7,
value = c("All hands",
"Bloated",
"Status update",
"Decision making",
"One on one",
"Lengthy",
"Workshop"))
duration <-
c(0, 0, 3, 3,
0, 0, 2, 2,
0, 0, 1, 1,
0, 0, 1, 1,
0, 0, 1, 1,
1, 1, 2, 2,
2, 2, 3, 3)
attendees <-
c(4, 5, 5, 4,
3, 4, 4, 3,
2, 3, 3, 2,
1, 2, 2, 1,
0, 1, 1, 0,
0, 3, 3, 0,
0, 4, 4, 0)
main_plot_df <-
rbind(base_df,
base_df,
base_df,
base_df) %>%
arrange(id) %>%
mutate(Duration = duration,
Attendees = attendees)
label_df <-
main_plot_df %>%
group_by(id, value) %>%
summarise(x = mean(Duration),
y = mean(Attendees)) %>%
mutate(value2 = sub(" ", "\n", value),
text = ifelse(id %in% c(1, 2, 6, 7),"#FFFFFF", "black"),
fill = ifelse(id %in% c(1, 2, 6, 7),"#31617C", "#9FE5FF")) %>%
left_join(select(myResultsTable, MeetingType, Percent),
by = c("value" = "MeetingType")) %>%
mutate(Percent = ifelse(is.na(Percent), "0 %", Percent)) %>%
mutate(value2 = paste(value2, Percent, sep = "\n"))
colo_v <- setNames(label_df$fill, nm = base_df$value)
text_v <- setNames(label_df$text, nm = label_df$value)
plot_object <-
main_plot_df %>%
ggplot(aes(x = Duration, y = Attendees)) +
geom_polygon(aes(fill = value, group = id),
colour = "#FFFFFF") +
scale_fill_manual(values = colo_v) +
geom_text(data = label_df, aes(x = x, y = y, label = value2, colour = value), size = 3) +
scale_colour_manual(values = text_v) +
scale_x_continuous(breaks = 0:3,
labels = c("0", "1", "2+", "")) +
scale_y_continuous(breaks = 0:5,
labels = c("2", "3-8", "9-18",
"19-50", "51+", "")) +
labs(title = "Meeting types by attendees and duration",
subtitle = "% of total time spent in meetings",
ylab = "Attendees",
xlab = "Duration (hours)",
caption = paste("Data from week of",
myPeriod$Start,
"to week of",
myPeriod$End)) +
theme_wpa_basic() +
theme(legend.position = "none")
if(return == "table"){
rbind(myResultsTable,
myResultsTableTotal)
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

69
R/meetingtype_summary.R Normal file

@ -0,0 +1,69 @@
#' @title Create a summary bar chart of the proportion of Meeting Hours spent in Long or Large Meetings
#'
#' @description
#' This function creates a bar chart showing the percentage of meeting hours which are spent in
#' long or large meetings.
#'
#' @param data Collaboration Assessment query in the form of a data frame. Requires the following variables:
#' - `Bloated_meeting_hours`
#' - `Lengthy_meeting_hours`
#' - `Workshop_meeting_hours`
#' - `All_hands_meeting_hours`
#' - `Status_update_meeting_hours`
#' - `Decision_making_meeting_hours`
#' - `One_on_one_meeting_hours`
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import ggplot2
#' @import dplyr
#'
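#' @examples
#' \dontrun{
#' # Minimal usage sketch; `ca_query_df` is a placeholder for a Collaboration
#' # Assessment query containing the meeting type hour variables listed above.
#' meetingtype_summary(ca_query_df, hrvar = "Organization")
#' }
#'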
#' @export
meetingtype_summary <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
mt_dist_str <- c("Bloated_meeting_hours",
"Lengthy_meeting_hours",
"Workshop_meeting_hours",
"All_hands_meeting_hours",
"Status_update_meeting_hours",
"Decision_making_meeting_hours",
"One_on_one_meeting_hours")
returnTable <-
data %>%
group_by(!!sym(hrvar)) %>%
summarise_at(vars(mt_dist_str), ~sum(., na.rm = TRUE)) %>%
gather(MeetingType, AttendeeMeetingHours, -!!sym(hrvar)) %>%
mutate(MeetingType = gsub(pattern = "_meeting_hours", replacement = "", x = MeetingType)) %>%
mutate(MeetingType = gsub(pattern = "_", replacement = " ", x = MeetingType)) %>%
group_by(!!sym(hrvar)) %>%
mutate(AttendeeMeetingHours = AttendeeMeetingHours / sum(AttendeeMeetingHours)) %>%
spread(MeetingType, AttendeeMeetingHours) %>%
left_join(hrvar_count(data, hrvar, return = "table"), by = hrvar) %>%
filter(n >= mingroup) %>%
ungroup() %>%
mutate(MeetingHoursInLongOrLargeMeetings = select(., c("All hands", "Bloated", "Lengthy", "Workshop")) %>%
apply(1, sum, na.rm = TRUE)) %>%
select(!!sym(hrvar), MeetingHoursInLongOrLargeMeetings, n)
if(return == "plot"){
returnTable %>%
create_bar_asis(group_var = hrvar,
bar_var = "MeetingHoursInLongOrLargeMeetings",
title = "% of Meeting Hours\nin Long or Large Meetings",
subtitle = paste0("By ", camel_clean(hrvar)),
caption = extract_date_range(data, return = "text"),
percent = TRUE,
bar_colour = "alert")
} else if(return == "table"){
returnTable
} else {
stop("Please enter a valid input for `return`.")
}
}

120
R/mgrcoatt_dist.R Normal file
View file

@ -0,0 +1,120 @@
#' @title Manager meeting coattendance distribution
#'
#' @description
#' Analyze the degree of meeting coattendance between employees and their managers.
#' Returns a stacked bar plot of different buckets of coattendance.
#' Additional options available to return a table with distribution elements.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics. Defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size, defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats median
#' @importFrom stats sd
#'
#' @family Managerial Relations
#' @family Meeting Culture
#'
#' @examples
#' mgrcoatt_dist(sq_data, hrvar = "Organization", return = "table")
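#' ## Illustrative variation (a sketch, assuming `sq_data` carries the
#' ## manager coattendance metrics used by this function):
#' \dontrun{
#' mgrcoatt_dist(sq_data, hrvar = "LevelDesignation", return = "plot")
#' }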
#' @export
mgrcoatt_dist <- function(data, hrvar = "Organization", mingroup = 5, return = "plot") {
myPeriod <-
data %>%
mutate(Date=as.Date(Date, "%m/%d/%Y")) %>%
arrange(Date) %>%
mutate(Start=first(Date), End=last(Date)) %>%
filter(row_number()==1) %>%
select(Start, End)
## Basic Data for bar plot
plot_data <-
data %>%
rename(group = !!sym(hrvar)) %>%
group_by(PersonId) %>%
filter(Meeting_hours>0) %>%
mutate(coattendman_rate = Meeting_hours_with_manager / Meeting_hours) %>%
summarise(periods = n(),
group = first(group), coattendman_rate=mean(coattendman_rate)) %>%
group_by(group) %>%
mutate(Employee_Count = n_distinct(PersonId)) %>%
filter(Employee_Count >= mingroup)
## Create buckets of coattendance time
plot_data <-
plot_data %>%
mutate(bucket_coattendman_rate = case_when(coattendman_rate>=0 & coattendman_rate<.20 ~ "0 - 20%",
coattendman_rate>=.20 & coattendman_rate<.4 ~ "20 - 40%",
coattendman_rate>=.40 & coattendman_rate<.6 ~ "40 - 60%",
coattendman_rate>=.6 ~ "60% +"))
## Employee count / base size table
plot_legend <-
plot_data %>%
group_by(group) %>%
summarize(Employee_Count=first(Employee_Count)) %>%
mutate(Employee_Count = paste("n=",Employee_Count))
## Data for bar plot
plot_table <-
plot_data %>%
group_by(group, bucket_coattendman_rate) %>%
summarize(Employees=n(),
Employee_Count=first(Employee_Count), percent= Employees / Employee_Count) %>%
arrange(group, bucket_coattendman_rate)
## Table for annotation
annot_table <-
plot_legend %>%
dplyr::left_join(plot_table, by = "group")
## Bar plot
plot_object <-
plot_table %>%
ggplot(aes(x = group, y=Employees, fill = bucket_coattendman_rate)) +
geom_bar(stat = "identity", position = position_fill(reverse = TRUE)) +
coord_flip() +
scale_y_continuous(labels = function(x) paste0(x*100, "%")) +
annotate("text", x = plot_legend$group, y = -.05, label = plot_legend$Employee_Count ) +
scale_fill_manual(name="", values = c("#bed6f2", "#e9f1fb","#ffdfd3","#FE7F4F")) +
theme_classic() +
theme(axis.text=element_text(size=12),
plot.title = element_text(color="grey40", face="bold", size=18),
plot.subtitle = element_text(size=14), legend.position = "top", legend.justification = "right",
legend.title=element_text(size=14), legend.text=element_text(size=14)) +
labs(title = "Time with Manager", subtitle = paste("Meeting Co-attendance Rate by", hrvar)) +
xlab(hrvar) +
ylab("Fraction of Employees") +
labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))
## Table to return
return_table <-
plot_table %>%
select(group, bucket_coattendman_rate, percent) %>%
spread(bucket_coattendman_rate, percent)
if(return == "table"){
return_table %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

223
R/mgrrel_matrix.R Normal file
View file

@ -0,0 +1,223 @@
#' @title Manager Relationship 2x2 Matrix
#'
#' @description
#' Generate the Manager-Relationship 2x2 matrix, returning a ggplot object by default.
#' Additional options available to return a "wide" or "long" summary table.
#'
#' @param data Standard Query data to pass through. Accepts a data frame.
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector,
#' e.g. "Organization". Defaults to NULL.
#' @param return A character vector specifying whether to return a matrix plot or a table.
#' Defaults to the 2 by 2 matrix. Valid values are "plot", "table", and "chartdata".
#' @param plot_colors
#' Pass a character vector of length 4 containing HEX codes to specify colors to use in plotting.
#' @param threshold
#' Specify a numeric value to determine threshold (in minutes) for 1:1 manager hours.
#' Defaults to 15.
#'
#' @import dplyr
#' @import reshape2
#' @import ggplot2
#' @importFrom scales percent
#'
#' @family Managerial Relations
#' @family Meeting Culture
#'
#' @examples
#' mgrrel_matrix(sq_data)
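#' ## Illustrative variations (a sketch, using parameters documented above):
#' \dontrun{
#' mgrrel_matrix(sq_data, hrvar = "Organization", return = "table")
#' mgrrel_matrix(sq_data, threshold = 30)
#' }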
#'
#' @export
mgrrel_matrix <- function(data,
hrvar = NULL,
return = "plot",
plot_colors = c("#FE7F4F", "#C3E5FF", "#FFD4C4", "#FEAA8A"),
threshold = 15){
## Add dummy "Total" column if hrvar = NULL
if(is.null(hrvar)){
data <- mutate(data, Total = "Total")
hrvar <- "Total"
}
## Check inputs
required_variables <- c("Date",
hrvar,
"PersonId",
"Meeting_hours_with_manager",
"Meeting_hours",
"Meeting_hours_with_manager_1_on_1")
## Error message if variables are not present
## Nothing happens if all present
data %>%
check_inputs(requirements = required_variables)
## Create a Person Weekly Average
data1 <-
data %>%
mutate(coattendman_rate = Meeting_hours_with_manager / Meeting_hours) %>% # Coattendance Rate with Manager
filter(!is.na(coattendman_rate)) %>%
group_by(PersonId, !!sym(hrvar)) %>%
summarise_at(vars(Meeting_hours_with_manager,
Meeting_hours,
Meeting_hours_with_manager_1_on_1,
coattendman_rate),
~mean(.)) %>%
ungroup()
## Threshold
thres_low_chr <- paste("<", threshold, "min")
thres_top_chr <- paste(">", threshold, "min")
## Create key variables
data2 <-
data1 %>%
mutate(coattendande = ifelse(coattendman_rate < 0.5, "<50%", ">50%"),
mgr1on1 = ifelse(Meeting_hours_with_manager_1_on_1 * 60 < threshold,
thres_low_chr,
thres_top_chr))
## Grouping variable split
if(hrvar == "Total"){
data2 %>%
count(mgr1on1, coattendande) %>%
mutate(perc = n / sum(n)) %>% # Calculate percentages
mutate(xmin = ifelse(mgr1on1 == thres_low_chr, -sqrt(perc), 0),
xmax = ifelse(mgr1on1 == thres_top_chr, sqrt(perc), 0),
ymin = ifelse(coattendande == "<50%", -sqrt(perc), 0),
ymax = ifelse(coattendande == ">50%", sqrt(perc), 0),
mgrRel = case_when(mgr1on1 == thres_low_chr & coattendande == "<50%" ~ "Under-coached",
mgr1on1 == thres_low_chr & coattendande == ">50%" ~ "Co-attending",
mgr1on1 == thres_top_chr & coattendande == ">50%" ~ "Highly managed",
TRUE ~ "Coaching")) %>%
mutate_at("mgrRel", ~as.factor(.)) -> chart
chart %>%
select(mgrRel, n, perc) %>%
group_by(mgrRel) %>%
summarise_all(~sum(., na.rm = TRUE)) -> clean_tb
} else if(hrvar != "Total"){
data2 %>%
count(!!sym(hrvar), mgr1on1, coattendande) %>%
group_by(!!sym(hrvar)) %>%
mutate(perc = n / sum(n)) %>% # Calculate percentages
mutate(xmin = ifelse(mgr1on1 == thres_low_chr, -sqrt(perc), 0),
xmax = ifelse(mgr1on1 == thres_top_chr, sqrt(perc), 0),
ymin = ifelse(coattendande == "<50%", -sqrt(perc), 0),
ymax = ifelse(coattendande == ">50%", sqrt(perc), 0),
mgrRel = case_when(mgr1on1 == thres_low_chr & coattendande == "<50%" ~ "Under-coached",
mgr1on1 == thres_low_chr & coattendande == ">50%" ~ "Co-attending",
mgr1on1 == thres_top_chr & coattendande == ">50%" ~ "Highly managed",
TRUE ~ "Coaching")) %>%
ungroup() %>%
mutate_at("mgrRel", ~as.factor(.)) -> chart
chart %>%
select(mgrRel, !!sym(hrvar), n, perc) %>%
group_by(mgrRel, !!sym(hrvar)) %>%
summarise_all(~sum(., na.rm = TRUE)) -> clean_tb
}
## Sort colours out
# Legacy variable names
myColors <- plot_colors
names(myColors) <- levels(chart$mgrRel)
## Show stacked bar chart if multiple groups
if(hrvar == "Total"){
plot <-
chart %>%
ggplot() +
geom_rect(aes(xmin=xmin, xmax=xmax, ymin=ymin, ymax=ymax, fill = mgrRel), color = "white") +
scale_fill_manual(name = "mgrRel", values = myColors) +
geom_text(aes(x = xmin + 0.5*sqrt(perc),
y = ymin + 0.5*sqrt(perc),
label = scales::percent(perc, accuracy = 1))) +
coord_equal() +
xlab("Weekly 1:1 time with manager") +
scale_x_continuous(breaks = c(-max(abs(chart$xmin),abs(chart$xmax))/2,max(abs(chart$xmin),abs(chart$xmax))/2),
labels = c(thres_low_chr, thres_top_chr),
limits = c(-max(abs(chart$xmin), abs(chart$xmax)), max(abs(chart$xmin), abs(chart$xmax)))) +
ylab("Employee and manager coattend") +
scale_y_continuous(breaks = c(-max(abs(chart$ymin), abs(chart$ymax))/2, max(abs(chart$ymin), abs(chart$ymax))/2),
labels = c("<50%", ">50%"),
limits = c(-max(abs(chart$ymin), abs(chart$ymax)), max(abs(chart$ymin), abs(chart$ymax)))) +
theme_minimal() +
theme(panel.grid = element_blank(),
plot.title = element_text(color="grey40", face="bold", size=20),
axis.line = element_line(),
legend.position = "bottom",
legend.title = element_blank(),
axis.title = element_text(size = 12),
axis.text = element_text(size = 12),
legend.text = element_text(size = 12),
strip.text.x = element_text(color = "grey40", face = "bold", size = 14)) +
labs(caption = extract_date_range(data, return = "text"))
} else if(hrvar != "Total"){
plot <-
chart %>%
mutate(Fill = case_when(mgrRel == "Co-attending" ~ rgb2hex(68,151,169),
mgrRel == "Coaching" ~ rgb2hex(95,190,212),
mgrRel == "Highly managed" ~ rgb2hex(49,97,124),
mgrRel == "Under-coached" ~ rgb2hex(89,89,89))) %>%
ggplot(aes(x = !!sym(hrvar), y = perc, group = mgrRel, fill = Fill)) +
geom_bar(position = "stack", stat = "identity") +
geom_text(aes(label = paste(round(perc * 100), "%")),
position = position_stack(vjust = 0.5),
color = "#FFFFFF",
fontface = "bold") +
scale_fill_identity(name = "Coaching styles",
breaks = c(rgb2hex(68,151,169),
rgb2hex(95,190,212),
rgb2hex(49,97,124),
rgb2hex(89,89,89)),
labels = c("Co-attending",
"Coaching",
"Highly managed",
"Under-coached"),
guide = "legend") +
coord_flip() +
theme_wpa_basic() +
labs(title = "Distribution of types of \nmanager-direct relationship across organizations",
subtitle = "Based on manager 1:1 time and percentage of overall time spent with managers")
}
if(return == "plot"){
plot +
labs(title = "Distribution of types of \nmanager-direct relationship",
subtitle = "Based on manager 1:1 time and percentage of\noverall time spent with managers")
} else if(return == "table"){
clean_tb %>%
as_tibble() %>%
return()
} else if(return == "chartdata"){
chart %>%
as_tibble() %>%
return()
} else {
stop("Please enter a valid input for `return`.")
}
}

42
R/mt_data.R Normal file
View file

@ -0,0 +1,42 @@
#' @title Sample Meeting Query dataset
#'
#' @description
#' A dataset generated from a Meeting Query from WpA.
#'
#' @format A data frame with 2000 rows and 30 variables:
#' \describe{
#' \item{MeetingId}{ }
#' \item{StartDate}{ }
#' \item{StartTimeUTC}{ }
#' \item{EndDate}{ }
#' \item{EndTimeUTC}{ }
#' \item{Attendee_meeting_hours}{ }
#' \item{Attendees}{ }
#' \item{Organizer_Domain}{ }
#' \item{Organizer_FunctionType}{ }
#' \item{Organizer_LevelDesignation}{ }
#' \item{Organizer_Layer}{ }
#' \item{Organizer_Region}{ }
#' \item{Organizer_Organization}{ }
#' \item{Organizer_zId}{ }
#' \item{Organizer_attainment}{ }
#' \item{Organizer_TimeZone}{ }
#' \item{Organizer_HourlyRate}{ }
#' \item{Organizer_IsInternal}{ }
#' \item{Organizer_PersonId}{ }
#' \item{IsCancelled}{ }
#' \item{DurationHours}{ }
#' \item{IsRecurring}{ }
#' \item{Subject}{ }
#' \item{TotalAccept}{ }
#' \item{TotalNoResponse}{ }
#' \item{TotalDecline}{ }
#' \item{TotalNoEmailsDuringMeeting}{ }
#' \item{TotalNoDoubleBooked}{ }
#' \item{TotalNoAttendees}{ }
#' \item{MeetingResources}{ }
#'
#' ...
#' }
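#' @examples
#' ## Minimal illustrative check of the sample dataset (a sketch; assumes
#' ## {dplyr} is available):
#' \dontrun{
#' dplyr::glimpse(mt_data)
#' }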
#' @source \url{https://workplaceanalytics-demo.office.com/en-us/Home}
"mt_data"

110
R/one2one_dist.R Normal file
View file

@ -0,0 +1,110 @@
#' @title Manager 1:1 Time Trend distribution
#'
#' @description
#' Analyze Manager 1:1 Time distribution.
#' Returns a stacked bar plot of different buckets of 1:1 time.
#' Additional options available to return a table with distribution elements.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics. Defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size, defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats median
#' @importFrom stats sd
#'
#' @family Managerial Relations
#'
#' @examples
#' one2one_dist(sq_data, hrvar = "Organization", return = "table")
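#' ## Illustrative default usage returning the stacked bar plot (a sketch):
#' \dontrun{
#' one2one_dist(sq_data)
#' }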
#' @export
one2one_dist <- function(data, hrvar = "Organization", mingroup = 5, return = "plot") {
## Date range data frame
myPeriod <- extract_date_range(data)
## Basic Data for bar plot
plot_data <-
data %>%
rename(group = !!sym(hrvar)) %>%
group_by(PersonId, group) %>%
summarise(Minutes_with_manager_1_on_1 = mean(Meeting_hours_with_manager_1_on_1 * 60)) %>%
ungroup() %>%
left_join(data %>%
rename(group = !!sym(hrvar)) %>%
group_by(group) %>%
summarise(Employee_Count = n_distinct(PersonId)),
by = "group") %>%
filter(Employee_Count >= mingroup)
## Create buckets of 1:1 time
plot_data <-
plot_data %>%
mutate(bucket_manager_1_on_1 = case_when(Minutes_with_manager_1_on_1 == 0 ~ "0 minutes",
Minutes_with_manager_1_on_1 > 0 & Minutes_with_manager_1_on_1 < 15 ~ "1 - 15 minutes",
Minutes_with_manager_1_on_1 >= 15 & Minutes_with_manager_1_on_1 < 30 ~ "15 - 30 minutes",
Minutes_with_manager_1_on_1 >= 30 ~ "30 min +"))
## Employee count / base size table
plot_legend <-
plot_data %>%
group_by(group) %>%
summarize(Employee_Count=first(Employee_Count)) %>%
mutate(Employee_Count = paste("n=",Employee_Count))
## Data for bar plot
plot_table <-
plot_data %>%
group_by(group, bucket_manager_1_on_1) %>%
summarize(Employees = n(),
Employee_Count = first(Employee_Count),
percent = Employees / Employee_Count) %>%
arrange(group, desc(bucket_manager_1_on_1))
## Table for annotation
annot_table <-
plot_legend %>%
dplyr::left_join(plot_table, by = "group")
## Bar plot
plot_object <-
plot_table %>%
ggplot(aes(x = group, y=Employees, fill = bucket_manager_1_on_1)) +
geom_bar(stat = "identity", position = position_fill(reverse = TRUE)) +
scale_y_continuous(labels = function(x) paste0(x*100, "%")) +
coord_flip() +
annotate("text", x = plot_legend$group, y = -.05, label = plot_legend$Employee_Count ) +
scale_fill_manual(name="", values = c("#FE7F4F", "#ffdfd3", "#bed6f2", "#e9f1fb")) +
theme_wpa_basic() +
labs(title = "Time with Manager", subtitle = paste("Scheduled 1:1 weekly meeting minutes by", tolower(hrvar))) +
xlab(hrvar) +
ylab("Fraction of employees") +
labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))
## Table to return
return_table <- plot_table %>% select(group, bucket_manager_1_on_1, percent) %>% spread(bucket_manager_1_on_1, percent)
if(return == "table"){
return_table %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

47
R/one2one_line.R Normal file
View file

@ -0,0 +1,47 @@
#' @title Manager 1:1 Time Trend - Line Chart
#'
#' @description
#' Provides a week by week view of 1:1 time with managers, visualised as line charts.
#' By default returns a line chart for 1:1 meeting hours,
#' with a separate panel per value in the HR attribute.
#' Additional options available to return a summary table.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#'
#' @family Managerial Relations
#'
#' @examples
#'
#' ## Return a line plot
#' one2one_line(sq_data, hrvar = "LevelDesignation")
#'
#'
#' ## Return a table
#' one2one_line(sq_data, hrvar = "LevelDesignation", return = "table")
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export
one2one_line <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
## Inherit arguments
create_line(data = data,
metric = "Meeting_hours_with_manager_1_on_1",
hrvar = hrvar,
mingroup = mingroup,
return = return)
}

47
R/one2one_rank.R Normal file
View file

@ -0,0 +1,47 @@
#' @title Manager 1:1 Time Ranking
#'
#' @description
#' This function scans a standard query output for groups with high levels of 'Manager 1:1 Time'.
#' Returns a table with all groups (across multiple HR attributes) ranked by manager 1:1 meeting hours.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar A list of HR Variables to consider in the scan.
#' Defaults to all HR attributes identified.
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
#' Defaults to 5.
#' @param return A character vector specifying what to return.
#' Valid values include "html" (default, returning an interactive DataTable)
#' and "df" (data frame)
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats reorder
#'
#' @family Managerial Relations
#'
#' @return
#' Returns a summary table as a data frame by default, where 'table' is passed in `return`.
#' When 'html' is passed, an interactive DataTable is returned.
#'
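#' @examples
#' ## Illustrative usage (a sketch): rank groups across all identified HR
#' ## attributes by scheduled manager 1:1 meeting hours.
#' \dontrun{
#' one2one_rank(sq_data, return = "table")
#' }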
#' @export
one2one_rank <- function(data,
hrvar = extract_hr(data),
mingroup = 5,
return = "table"){
output <-
data %>% create_rank(metric="Meeting_hours_with_manager_1_on_1", hrvar = hrvar,
mingroup = mingroup,
return = "table")
if(return == "html"){
return(create_dt(output))
} else if(return == "table"){
return(output)
} else {
stop("Invalid `return` argument.")
}
}

45
R/one2one_sum.R Normal file
View file

@ -0,0 +1,45 @@
#' @title Manager 1:1 Time Summary
#'
#' @description
#' Provides an overview analysis of Manager 1:1 Time.
#' Returns a bar plot showing average weekly minutes of Manager 1:1 Time by default.
#' Additional options available to return a summary table.
#'
#' @inheritParams create_bar
#'
#' @family Managerial Relations
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
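#' @examples
#' ## Illustrative usage (a sketch, using the package sample dataset):
#' \dontrun{
#' one2one_sum(sq_data, hrvar = "LevelDesignation")
#' }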
#' @export
one2one_sum <- function(data,
hrvar = "Organization",
mingroup = 5,
return = "plot"){
cleaned_data <-
data %>%
mutate(`Scheduled 1:1 meeting minutes with manager` = Meeting_hours_with_manager_1_on_1 * 60)
create_bar(data = cleaned_data,
hrvar = hrvar,
mingroup = mingroup,
metric = "Scheduled 1:1 meeting minutes with manager",
return = return,
bar_colour = "darkblue")
}

86
R/one2one_trend.R Normal file
View file

@ -0,0 +1,86 @@
#' @title Manager 1:1 Time Trend
#'
#' @description
#' Provides a week by week view of scheduled manager 1:1 Time.
#' By default returns a week-by-week heatmap, highlighting the points in time with the most activity.
#' Additional options available to return a summary table.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization" but accepts any character vector (e.g. "LevelDesignation")
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#'
#' @family Managerial Relations
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
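#' @examples
#' ## Illustrative usage (a sketch): weekly heatmap and summary table.
#' \dontrun{
#' one2one_trend(sq_data, hrvar = "Organization", return = "plot")
#' one2one_trend(sq_data, return = "table")
#' }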
#' @export
one2one_trend <- function(data, hrvar = "Organization", mingroup=5, return = "plot"){
## Date range data frame
myPeriod <- extract_date_range(data)
myTable <-
data %>%
mutate(Date=as.Date(Date, "%m/%d/%Y")) %>%
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
select(PersonId, Date, group, Meeting_hours_with_manager_1_on_1) %>%
group_by(group) %>%
mutate(Employee_Count = n_distinct(PersonId)) %>%
filter(Employee_Count >= mingroup) # Keep only groups above the privacy threshold
myTable <-
myTable %>%
group_by(Date, group) %>%
summarize(Employee_Count=mean(Employee_Count),
Minutes_with_manager_1_on_1 = mean(Meeting_hours_with_manager_1_on_1 * 60))
myTable_plot <-
myTable %>%
select(Date, group, Minutes_with_manager_1_on_1)
myTable_return <- myTable_plot %>% spread(Date, Minutes_with_manager_1_on_1)
plot_object <-
myTable_plot %>%
ggplot(aes(x =Date , y = group , fill = Minutes_with_manager_1_on_1)) +
geom_tile(height=.5) +
scale_fill_gradient(name="Minutes", low = "white", high = "red") +
theme_wpa_basic() +
labs(title = "Time with Manager",
subtitle = paste("Average scheduled 1:1 meeting minutes by", tolower(hrvar))) +
xlab("Date") +
ylab(hrvar) +
labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))
if(return == "table"){
myTable_return %>%
as_tibble() %>%
return()
} else if(return == "plot"){
return(plot_object)
} else {
stop("Please enter a valid input for `return`.")
}
}

Some files were not shown because too many files have changed.