Mirror of https://github.com/microsoft/wpa.git
init: first commit
Commit 020c455952
@ -0,0 +1 @@
*.html
@ -0,0 +1,106 @@
# :hammer: Developer Guide

This is a guide for developers, or anyone who is looking to contribute code to the {wpa} R package.

## Prerequisites

You should have the following installed before starting:

1. [R](https://www.r-project.org/)
2. [RStudio Desktop](https://rstudio.com/products/rstudio/download/#download)
3. [GitHub Desktop](https://desktop.github.com/) (Highly recommended, unless you are very familiar with git)

You should also have the following packages installed; these should all be available from CRAN (see the snippet below for a one-line install):

1. {devtools}
2. {tidyverse}
3. {roxygen2}
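A minimal sketch for installing all three packages in one go from CRAN (package names as listed above):

```R
# Install the packages required for development
install.packages(c("devtools", "tidyverse", "roxygen2"))
```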
Once these are all installed, you should be ready to roll!

---

## :hammer: Quick start guide - For developers

### Updating the package and installing it locally

Once you've made changes to a script, here is how to update the package and install it locally on your computer.

#### 1. Generate the documentation

```R
roxygen2::roxygenise()
```

`roxygenise()` generates the documentation files for the R package and makes them available to the user.

#### 2. Run R CMD checks

Next, run the following to build and check the package. You will need {devtools} installed.

```R
devtools::check()
```

This step runs an R CMD check against the package, verifying that there will be no errors or issues if somebody installs it. The check can take around five minutes, so it is usually worth running it only towards the end, after you've implemented all the desired changes.

#### 3. Install the package

If everything runs smoothly, run this final line to install the local package. Make sure that your working directory is set to the package!

```R
devtools::install()
```

---

## To increment or change a package version

Each package release has a corresponding package version (e.g. 1.4.2). To change the package version number for a particular release, simply edit the `DESCRIPTION` file at the project root folder of the {wpa} package. You can edit this with any text editor or RStudio.

Change the value that comes after `Version:` to change the package version, as shown below. Prior to incrementing the package version, please align with the main developer team on the agreed version release details.
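For example, the relevant field in `DESCRIPTION` looks like this (the version number shown is illustrative):

```
Version: 1.3.0
```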
## How to release a package version

Our current protocol for a package release is to distribute it as a zipped file or a tarball. The advantage of this approach is that users do not need a GitHub account, and the release can be distributed via SharePoint or even email. Our current approach is to offer **both** options for installation; a sketch of what installation looks like for the end user follows.
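A minimal sketch of how an end user might install from the distributed files using base R (file names assume version 1.3.0 and may differ per release):

```R
# Install from the zipped (Windows binary) distribution
install.packages("wpa_1.3.0.zip", repos = NULL, type = "win.binary")

# Install from the tarball (source) distribution
install.packages("wpa_1.3.0.tar.gz", repos = NULL, type = "source")
```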
Once you are happy with the package and all checks have passed, you can run the following to generate a **zipped** distribution of the package. If your platform is Windows, this should generate a zipped file one level above your project directory:

```R
devtools::build(binary = TRUE)
```

The code to generate a **tarball** distribution is similar:

```R
devtools::build(binary = FALSE)
```

After the **zipped file** and the **tarball** have been generated, they are ready for distribution.

## How to build the PDF documentation

It is good practice to distribute a package together with an up-to-date version of the documentation, so users won't be referring to outdated documentation. You can generate the familiar PDF documentation by running the following:

```R
devtools::build_manual()
```

## A short guide to adding a function to the package

1. Let us assume that you have written a new function, e.g. a checking function that returns TRUE if `PersonId` exists and FALSE if it does not.
1. Once you have checked the function and ensured that it works, save it into an R file under the "R" folder of the package directory (a minimal example follows this list).
   - Ensure that your current working directory is set to the package directory. You can either open the RStudio session with the package .Rproj file, or run `setwd()` to change your working directory. Run `getwd()` to check what your current working directory is.
   - Ensure that the relevant roxygen headers are present (see [here](https://roxygen2.r-lib.org/articles/rd.html) for more reference).
   - Whilst it is possible to have more than one function saved in a single R file, we recommend saving each function in its own file with its own name for clarity. For instance, the function `collaboration_dist()` would be saved in a file called _collaboration_dist.R_.
1. Follow the steps in the Developer Guide above to generate the documentation (`roxygen2::roxygenise()`), check the package (`devtools::check()`), and install the package (`devtools::install()`).
1. Once you are sure that everything works, open **GitHub Desktop**, and **check that you have selected the relevant branch**. If you are unsure what your changes are, it is always recommended that you work on a new branch so that you do not introduce any breaking changes to the master branch.
1. _Fetch / pull_ from the branch. This ensures that your local package is updated with changes that others have made to the code.
1. _Commit_ your changes, and add an intuitive commit message so others will know what you have changed in the code. Please see the wiki page of this repo for a style guide on the commit messages.
1. _Push_ your changes to the branch. If there are no errors, you should be able to see the changes on the GitHub repository!
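A minimal sketch of what such a file could look like (the function name, file name, and logic are illustrative and not part of the package):

```R
#' @title Check whether a query contains a `PersonId` column
#'
#' @description Returns TRUE if the `PersonId` column exists in the
#' supplied data frame, and FALSE if it does not.
#'
#' @param data A data frame, e.g. a Standard Person Query.
#'
#' @export
has_person_id <- function(data) {
  "PersonId" %in% names(data)
}
```

This would be saved as _has_person_id.R_ under the "R" folder, following the one-function-per-file convention described above.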
## Using GitHub

*It's on its way...*

## Documenting functions with {roxygen2}

*It's on its way...*

## :link: Useful links

- [Sharing Internal R Packages](https://support.rstudio.com/hc/en-us/articles/115000239587-Sharing-Internal-R-Packages)
- [Writing an R package from scratch by Hilary Parker](https://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/)
- [Semantic versioning](https://semver.org/)
@ -0,0 +1,19 @@
# Summary

This branch *<succinct summary of the purpose>*

# Changes

The changes made in this PR are:

1. Change 1
1. Change 2
...

# Check

- [ ] Check 1
- [ ] Check 2
...

# (OPTIONAL) Note

This "fixes #<issue_number>"

*<other things, such as how to incorporate new changes>*

*<brief summary of the purpose of this pull request>*
@ -0,0 +1,18 @@
on: [push, pull_request]

name: R-CMD-check

jobs:
  R-CMD-check:
    runs-on: macOS-latest
    steps:
      - uses: actions/checkout@v2
      - uses: r-lib/actions/setup-r@master
      - name: Install dependencies
        run: |
          install.packages(c("remotes", "rcmdcheck"))
          remotes::install_deps(dependencies = TRUE)
        shell: Rscript {0}
      - name: Check
        run: rcmdcheck::rcmdcheck(args = "--no-manual", error_on = "error")
        shell: Rscript {0}
@ -0,0 +1,12 @@
.Rproj.user
.Rhistory
.Rbuildignore
_data/
_development/
.RData
doc
Meta
.RDataTmp
SQ-overview.html
wpa export 20200427_131327.png
*.html
@ -0,0 +1,14 @@
# Contributing

This project welcomes contributions and suggestions. Most contributions require you to
agree to a Contributor License Agreement (CLA) declaring that you have the right to,
and actually do, grant us the rights to use your contribution. For details, visit
https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need
to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the
instructions provided by the bot. You will only need to do this once across all repositories using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
@ -0,0 +1,65 @@
Package: wpa
Type: Package
Title: Tools for Analysing and Visualising Workplace Analytics Data
Version: 1.3.0
Authors@R: c(
    person(given = "Martin", family = "Chan", role = c("aut", "cre"), email = "martin.chan@microsoft.com"),
    person(given = "Carlos", family = "Morales", role = "ctb", email = "carlos.morales@microsoft.com"),
    person(given = "Mark", family = "Powers", role = "ctb", email = "mark.powers@microsoft.com"),
    person(given = "Ainize", family = "Cidoncha", role = "ctb", email = "ainize.cidoncha@microsoft.com"),
    person(given = "Rosamary", family = "Ochoa Vargas", role = "ctb", email = "rosamary.ochoa@microsoft.com"),
    person(given = "Tannaz", family = "Sattari", role = "ctb", email = "tannaz.sattari@microsoft.com"))
Description: Opinionated functions that enable easier and faster
    analysis of Workplace Analytics data. There are three main types of functions in wpa:
    (1) Standard functions create a ggplot visual or a summary table based on a specific
    Workplace Analytics metric; (2) Report Generation functions generate HTML reports on
    a specific analysis area, e.g. Collaboration; (3) Other miscellaneous functions cover
    more specific applications (e.g. Subject Line text mining) of Workplace Analytics data.
    This package adheres to tidyverse principles and works well with the pipe syntax.
    wpa is built with beginner-to-intermediate R users in mind, and is optimised for
    simplicity.
URL: https://github.com/martinctc/wpa
BugReports: https://github.com/martinctc/wpa/issues
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Depends:
    R (>= 3.1.2)
Imports:
    base,
    dplyr,
    stats,
    utils,
    tibble,
    tidyr,
    tidyselect (>= 1.0.0),
    magrittr,
    purrr,
    reshape2,
    ggplot2,
    ggrepel,
    Information,
    scales,
    htmltools,
    knitr,
    markdown,
    rmarkdown,
    networkD3,
    DT,
    tidytext,
    ggraph,
    igraph,
    widyr,
    proxy,
    rlang,
    GGally,
    network,
    ggwordcloud,
    methods,
    data.table
RoxygenNote: 7.1.1
Roxygen: list(markdown = TRUE)
VignetteBuilder: knitr
Suggests:
    extrafont
Language: en-US
@ -0,0 +1,2 @@
YEAR: 2020
COPYRIGHT HOLDER: Microsoft Corporation
@ -0,0 +1,23 @@
wpa

Copyright (c) Microsoft Corporation.

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@ -0,0 +1,170 @@
|
|||
# Generated by roxygen2: do not edit by hand
|
||||
|
||||
export("%>%")
|
||||
export(IV_by_period)
|
||||
export(IV_report)
|
||||
export(afterhours_dist)
|
||||
export(afterhours_distribution)
|
||||
export(afterhours_rank)
|
||||
export(afterhours_sum)
|
||||
export(afterhours_summary)
|
||||
export(afterhours_trend)
|
||||
export(analysis_scope)
|
||||
export(camel_clean)
|
||||
export(capacity_report)
|
||||
export(check_inputs)
|
||||
export(check_query)
|
||||
export(coaching_report)
|
||||
export(collab_sum)
|
||||
export(collab_summary)
|
||||
export(collaboration_area)
|
||||
export(collaboration_dist)
|
||||
export(collaboration_distribution)
|
||||
export(collaboration_fizz)
|
||||
export(collaboration_line)
|
||||
export(collaboration_rank)
|
||||
export(collaboration_report)
|
||||
export(collaboration_sum)
|
||||
export(collaboration_summary)
|
||||
export(collaboration_trend)
|
||||
export(comma)
|
||||
export(connectivity_report)
|
||||
export(copy_df)
|
||||
export(create_IV)
|
||||
export(create_bar)
|
||||
export(create_bar_asis)
|
||||
export(create_boxplot)
|
||||
export(create_dist)
|
||||
export(create_dt)
|
||||
export(create_fizz)
|
||||
export(create_line)
|
||||
export(create_period_scatter)
|
||||
export(create_rank)
|
||||
export(create_scatter)
|
||||
export(create_stacked)
|
||||
export(cut_hour)
|
||||
export(email_dist)
|
||||
export(email_fizz)
|
||||
export(email_line)
|
||||
export(email_rank)
|
||||
export(email_summary)
|
||||
export(email_trend)
|
||||
export(export)
|
||||
export(external_network_plot)
|
||||
export(extract_date_range)
|
||||
export(extract_hr)
|
||||
export(flag_ch_ratio)
|
||||
export(flag_em_ratio)
|
||||
export(flag_extreme)
|
||||
export(flag_outlooktime)
|
||||
export(g2g_network)
|
||||
export(generate_report)
|
||||
export(hr_trend)
|
||||
export(hrvar_count)
|
||||
export(hrvar_count_all)
|
||||
export(identify_churn)
|
||||
export(identify_holidayweeks)
|
||||
export(identify_inactiveweeks)
|
||||
export(identify_nkw)
|
||||
export(identify_outlier)
|
||||
export(identify_privacythreshold)
|
||||
export(identify_query)
|
||||
export(identify_shifts)
|
||||
export(identify_tenure)
|
||||
export(import_wpa)
|
||||
export(internal_network_plot)
|
||||
export(is_date_format)
|
||||
export(keymetrics_scan)
|
||||
export(meeting_dist)
|
||||
export(meeting_distribution)
|
||||
export(meeting_fizz)
|
||||
export(meeting_line)
|
||||
export(meeting_quality)
|
||||
export(meeting_rank)
|
||||
export(meeting_skim)
|
||||
export(meeting_summary)
|
||||
export(meeting_tm_report)
|
||||
export(meeting_trend)
|
||||
export(meetingtype_dist)
|
||||
export(meetingtype_dist_ca)
|
||||
export(meetingtype_dist_mt)
|
||||
export(meetingtype_summary)
|
||||
export(mgrcoatt_dist)
|
||||
export(mgrrel_matrix)
|
||||
export(one2one_dist)
|
||||
export(one2one_line)
|
||||
export(one2one_rank)
|
||||
export(one2one_sum)
|
||||
export(one2one_trend)
|
||||
export(period_change)
|
||||
export(personas_hclust)
|
||||
export(read_preamble)
|
||||
export(remove_outliers)
|
||||
export(rgb2hex)
|
||||
export(standardise_pq)
|
||||
export(subject_validate)
|
||||
export(subject_validate_report)
|
||||
export(theme_wpa)
|
||||
export(theme_wpa_basic)
|
||||
export(tm_clean)
|
||||
export(tm_cooc)
|
||||
export(tm_freq)
|
||||
export(tm_wordcloud)
|
||||
export(totals_bind)
|
||||
export(totals_reorder)
|
||||
export(track_HR_change)
|
||||
export(tstamp)
|
||||
export(validation_report)
|
||||
export(workloads_dist)
|
||||
export(workloads_distribution)
|
||||
export(workloads_line)
|
||||
export(workloads_summary)
|
||||
export(workloads_trend)
|
||||
export(workpatterns_area)
|
||||
export(workpatterns_hclust)
|
||||
export(workpatterns_rank)
|
||||
export(wrap)
|
||||
import(DT)
|
||||
import(Information)
|
||||
import(dplyr)
|
||||
import(ggplot2)
|
||||
import(ggraph)
|
||||
import(ggrepel)
|
||||
import(reshape2)
|
||||
import(scales)
|
||||
import(tidyselect)
|
||||
importFrom(data.table,"%between%")
|
||||
importFrom(data.table,"%like%")
|
||||
importFrom(data.table,":=")
|
||||
importFrom(dplyr,`%>%`)
|
||||
importFrom(dplyr,mutate_if)
|
||||
importFrom(htmltools,HTML)
|
||||
importFrom(igraph,graph_from_data_frame)
|
||||
importFrom(magrittr,"%>%")
|
||||
importFrom(markdown,markdownToHTML)
|
||||
importFrom(methods,is)
|
||||
importFrom(proxy,dist)
|
||||
importFrom(purrr,map_if)
|
||||
importFrom(purrr,pmap)
|
||||
importFrom(purrr,reduce)
|
||||
importFrom(rlang,"!!")
|
||||
importFrom(rlang,sym)
|
||||
importFrom(scales,percent)
|
||||
importFrom(scales,wrap_format)
|
||||
importFrom(stats,cutree)
|
||||
importFrom(stats,hclust)
|
||||
importFrom(stats,median)
|
||||
importFrom(stats,na.omit)
|
||||
importFrom(stats,rect.hclust)
|
||||
importFrom(stats,reorder)
|
||||
importFrom(stats,sd)
|
||||
importFrom(stats,setNames)
|
||||
importFrom(stats,wilcox.test)
|
||||
importFrom(tidyr,gather)
|
||||
importFrom(tidyr,replace_na)
|
||||
importFrom(tidyr,spread)
|
||||
importFrom(tidyselect,all_of)
|
||||
importFrom(tidytext,unnest_tokens)
|
||||
importFrom(utils,write.csv)
|
||||
importFrom(utils,write.table)
|
||||
importFrom(widyr,pairwise_count)
|
|
@ -0,0 +1,154 @@
|
|||
#' @title Identify the WPA metrics that have the biggest change between two periods.
|
||||
#'
|
||||
#' @description
|
||||
#' This function uses the Information Value algorithm to predict which WPA metrics are most explained by the change in dates.
|
||||
#'
|
||||
#' @param data A Person Query dataset in the form of a data frame, including a date column named "Date".
|
||||
#' This function assumes the date format is MM/DD/YYYY, as is standard in a WpA query output.
|
||||
#' @param before_start Start date of "before" time period in YYYY-MM-DD. Defaults to earliest date in dataset.
|
||||
#' @param before_end End date of "before" time period in YYYY-MM-DD
|
||||
#' @param after_start Start date of "after" time period in YYYY-MM-DD. Defaults to day after before_end.
|
||||
#' @param after_end End date of "after" time period in YYYY-MM-DD. Defaults to latest date in dataset.
|
||||
#' @param mybins Number of bins to cut the data into for Information Value analysis. Defaults to 10.
|
||||
#' @param return Specify whether to return a summary table or detailed Excel and PDF files
|
||||
#' Specify "table" or "detailed" for outputs. Defaults to table.
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import Information
|
||||
#'
|
||||
#'
|
||||
#' @family Flexible Input
|
||||
#'
|
||||
#' @export
|
||||
|
||||
IV_by_period <-
|
||||
function(data,
|
||||
before_start = min(as.Date(data$Date, "%m/%d/%Y")),
|
||||
before_end,
|
||||
after_start = as.Date(before_end) + 1,
|
||||
after_end = max(as.Date(data$Date, "%m/%d/%Y")),
|
||||
mybins = 10,
|
||||
return = "table") {
|
||||
## Check inputs
|
||||
required_variables <- c("Date",
|
||||
"PersonId")
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
daterange_1_start <- as.Date(before_start)
|
||||
daterange_1_end <- as.Date(before_end)
|
||||
daterange_2_start <- as.Date(after_start)
|
||||
daterange_2_end <- as.Date(after_end)
|
||||
|
||||
WpA_dataset <- data %>% mutate(Date = as.Date(Date, "%m/%d/%Y"))
|
||||
|
||||
|
||||
# Check for dates in data file
|
||||
if (daterange_1_start < min(WpA_dataset$Date) |
|
||||
daterange_1_start > max(WpA_dataset$Date) |
|
||||
daterange_1_end < min(WpA_dataset$Date) |
|
||||
daterange_1_end > max(WpA_dataset$Date) |
|
||||
daterange_2_start < min(WpA_dataset$Date) |
|
||||
daterange_2_start > max(WpA_dataset$Date) |
|
||||
daterange_2_end < min(WpA_dataset$Date) |
|
||||
daterange_2_end > max(WpA_dataset$Date)) {
|
||||
stop('Dates not found in dataset')
|
||||
}
|
||||
|
||||
# Create variable => Period
|
||||
WpA_dataset_table <-
|
||||
WpA_dataset %>%
|
||||
mutate(
|
||||
Period = case_when(
|
||||
Date >= daterange_1_start &
|
||||
Date <= daterange_1_end ~ "Before",
|
||||
Date >= daterange_2_start &
|
||||
Date <= daterange_2_end ~ "After"
|
||||
)
|
||||
) %>% filter(Period == "Before" | Period == "After")
|
||||
|
||||
WpA_dataset_table <-
|
||||
WpA_dataset_table %>% mutate(outcome = case_when(Period == "Before" ~ "0",
|
||||
Period == 'After' ~ "1"))
|
||||
|
||||
# Make the use of exponential notation less likely
|
||||
options(scipen = 10)
|
||||
|
||||
# De-select character columns
|
||||
train <-
|
||||
WpA_dataset_table %>%
|
||||
transform(outcome = as.numeric(outcome)) %>%
|
||||
select_if(is.numeric)
|
||||
|
||||
# Filter out NAs
|
||||
train <- train %>%
|
||||
filter(rowSums(is.na(.[, ])) < 1)
|
||||
|
||||
# Rename Outcome Variable
|
||||
# train <- transform(train, outcome = as.numeric(outcome))
|
||||
train <- rename(train, 'Outcome' = "outcome")
|
||||
colnames(train)
|
||||
|
||||
|
||||
# Calculate Odds
|
||||
odds <-
|
||||
sum(train$Outcome) / (length(train$Outcome) - sum(train$Outcome))
|
||||
lnodds <- log(odds)
|
||||
|
||||
# IV Analysis
|
||||
IV <- create_infotables(data = train, y = "Outcome", bins = mybins)
|
||||
|
||||
# if(return == "detailed"){
|
||||
# # Ranking variables using IV
|
||||
# wb <- createWorkbook()
|
||||
# addWorksheet(wb, "Ranking")
|
||||
# writeDataTable(wb, "Ranking", x = data.frame(IV$Summary))
|
||||
#
|
||||
# # Export Individual Tables
|
||||
# for(i in names(IV$Tables)){
|
||||
# print(i)
|
||||
# addWorksheet(wb, substr(i, start = nchar(i) - 30, stop = nchar(i)))
|
||||
# temp <- IV$Tables[[i]]
|
||||
# temp$ODDS <- exp(temp$WOE + lnodds)
|
||||
# temp$PROB <- (temp$ODDS / (temp$ODDS + 1))
|
||||
# writeDataTable(wb, substr(i, start = nchar(i) - 30, stop = nchar(i)) , x = data.frame(temp))
|
||||
# }
|
||||
#
|
||||
# # Save Workbook
|
||||
# saveWorkbook(wb, "Output_IV_v2.xlsx", overwrite = TRUE)
|
||||
#
|
||||
# # Plot Graph
|
||||
# pdf("Output_IV_v2.pdf")
|
||||
# plot_infotables(IV, IV$Summary$Variable[], same_scale=TRUE)
|
||||
# dev.off()
|
||||
# } else
|
||||
if (return == "table") {
|
||||
# Store all individual dataframes
|
||||
Tables <- c()
|
||||
Summary <- data.frame(IV$Summary)
|
||||
Tables$Summary <- Summary
|
||||
for (i in names(IV$Tables)) {
|
||||
temp <- IV$Tables[[i]]
|
||||
temp$ODDS <- exp(temp$WOE + lnodds)
|
||||
temp$PROB <- (temp$ODDS / (temp$ODDS + 1))
|
||||
Tables[[i]] <- create_dt(temp, rounding = 2)
|
||||
}
|
||||
|
||||
# Return ranking table
|
||||
return(create_dt(Tables$Summary, rounding = 2))
|
||||
# print("Access individual metrics via Outputs[[metric_name]], e.g., Outputs[[Workweek_span]]")
|
||||
|
||||
# # Store each variable's plot
|
||||
# plots <- c()
|
||||
# for (i in names(IV$Tables)) {
|
||||
# plots[[i]] <- plot_infotables(IV, i)
|
||||
# }
|
||||
|
||||
} else {
|
||||
stop("Please enter a valid input for `return`, either detailed or table.")
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,72 @@
|
|||
#' @title Information Value HTML Report
|
||||
#'
|
||||
#' @description
|
||||
#' The function generates an interactive HTML report using
|
||||
#' Standard Query data as an input. The report contains a full Information Value analysis, a data exploration technique that helps determine which columns in a data set have predictive power or influence on the value of a specified dependent variable.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param predictors A character vector specifying the columns to be used as predictors.
|
||||
#' Defaults to NULL, where all numeric vectors in the data will be used as predictors.
|
||||
#' @param outcome A string specifying a binary variable, i.e. can only contain
|
||||
#' the values 1 or 0.
|
||||
#' @param bins Number of bins to use in `Information::create_infotables()`, defaults to 5.
|
||||
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
|
||||
#' For example, "collaboration report".
|
||||
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
|
||||
#' Defaults to TRUE.
|
||||
#'
|
||||
#' @family Reports
|
||||
#'
|
||||
#' @export
|
||||
IV_report <- function(data,
|
||||
predictors = NULL,
|
||||
outcome,
|
||||
bins = 5,
|
||||
path = "IV report",
|
||||
timestamp = TRUE){
|
||||
|
||||
## Create timestamped path (if applicable)
|
||||
if(timestamp == TRUE){
|
||||
newpath <- paste(path, wpa::tstamp())
|
||||
} else {
|
||||
newpath <- path
|
||||
}
|
||||
|
||||
table_list <-
|
||||
data %>%
|
||||
create_IV(outcome = outcome,
|
||||
predictors = predictors,
|
||||
bins = bins,
|
||||
return = "list")
|
||||
|
||||
table_names <- gsub("_", " ", x = names(table_list))
|
||||
|
||||
output_list <-
|
||||
list(data %>% check_query(return = "text") %>% md2html(),
|
||||
data %>% create_IV(outcome = outcome, predictors=predictors, bins= bins),
|
||||
data %>% create_IV(outcome = outcome, predictors=predictors, bins= bins, return="summary"),
|
||||
data %>% create_IV(outcome = outcome, predictors=predictors, bins= bins, return="plot-WOE")) %>%
|
||||
c(table_list) %>%
|
||||
purrr::map_if(is.data.frame, create_dt)
|
||||
|
||||
title_list <- c("Data Overview", "Top Predictors", "", "WOE Analysis", table_names)
|
||||
|
||||
n_title <- length(title_list)
|
||||
|
||||
title_levels <- c(2,2,4,2, rep(3, n_title-4))
|
||||
|
||||
generate_report(title = "Information Value Report",
|
||||
filename = newpath,
|
||||
outputs = output_list,
|
||||
titles = title_list,
|
||||
subheaders = rep("", n_title),
|
||||
echos = rep(FALSE, n_title),
|
||||
levels = title_levels,
|
||||
theme = "cosmo",
|
||||
preamble = read_preamble("IV_report.md"))
|
||||
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,64 @@
|
|||
#' @title After-Hours distribution
|
||||
#' @name afterhours_dist
|
||||
#'
|
||||
#' @description Analyse the distribution of weekly after-hours collaboration time.
|
||||
#' Returns a stacked bar plot by default.
|
||||
#' Additional options available to return a table with distribution elements.
|
||||
#'
|
||||
#' @details
|
||||
#' Uses the metric \code{After_hours_collaboration_hours}.
|
||||
#'
|
||||
#' @param data A Standard Person Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics. Defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation".
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size, defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#' @param cut A vector specifying the cuts to use for the data,
#' accepting "default" or "range-cut" as a character vector,
#' or a numeric vector of length three to specify the exact breaks to use, e.g. c(1, 3, 5).
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom tidyr spread
|
||||
#' @importFrom stats median
|
||||
#' @importFrom stats sd
|
||||
#'
|
||||
#' @family After-Hours
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' ## Return a plot
|
||||
#' afterhours_dist(sq_data, hrvar = "Organization")
|
||||
#'
|
||||
#' ## Return a table
|
||||
#' afterhours_dist(sq_data, hrvar = "Organization", return = "table")
|
||||
#'
|
||||
#' ## Return result with custom-specified breaks
|
||||
#' afterhours_dist(sq_data, hrvar = "LevelDesignation", cut = c(4, 7, 9))
|
||||
#' }
|
||||
#' @export
|
||||
afterhours_dist <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot",
|
||||
cut = c(1, 2, 3)) {
|
||||
|
||||
create_dist(data = data,
|
||||
metric = "After_hours_collaboration_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return,
|
||||
cut = cut,
|
||||
dist_colours = c("#FE7F4F",
|
||||
"#ffdfd3",
|
||||
"#bed6f2",
|
||||
"#e9f1fb"))
|
||||
|
||||
|
||||
}
|
||||
|
||||
#' @rdname afterhours_dist
|
||||
#' @export
|
||||
afterhours_distribution <- afterhours_dist
|
|
@ -0,0 +1,52 @@
|
|||
#' @title Rank groups with high After-Hours Collaboration Hours
|
||||
#'
|
||||
#' @description
|
||||
#' This function scans a Standard Person Query for groups with high levels of After-Hours Collaboration.
|
||||
#' Returns a table with all groups (across multiple HR attributes) ranked by After-Hours Collaboration Hours.
|
||||
#'
|
||||
#' @details
|
||||
#' Uses the metric \code{After_hours_collaboration_hours}.
|
||||
#'
|
||||
#' @param data A Standard Person Query dataset in the form of a data frame.
|
||||
#' @param hrvar A list of HR Variables to consider in the scan.
|
||||
#' Defaults to all HR attributes identified.
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
|
||||
#' Defaults to 5.
|
||||
#' @param return A character vector specifying what to return.
|
||||
#' Valid values include "table" (default, returning a data frame)
#' and "html" (returning an interactive DataTable).
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom stats reorder
|
||||
#'
|
||||
#' @family After-Hours
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a summary table as a data frame by default, where "table" is passed in `return`.
#' When "html" is passed, an interactive DataTable is returned.
|
||||
#'
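#' @examples
#' ## Illustrative sketch only: `sq_data` is the package's demo Standard Person Query
#' afterhours_rank(sq_data, return = "table")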
|
||||
#' @export
|
||||
|
||||
afterhours_rank <- function(data,
|
||||
hrvar = extract_hr(data),
|
||||
mingroup = 5,
|
||||
return = "table"){
|
||||
|
||||
output <-
|
||||
data %>%
|
||||
create_rank(metric = "After_hours_collaboration_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = "table")
|
||||
|
||||
if(return == "html"){
|
||||
return(create_dt(output))
|
||||
} else if(return == "table"){
|
||||
return(output)
|
||||
} else {
|
||||
stop("Invalid `return` argument.")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,47 @@
|
|||
#' @title Summary of After-Hours Collaboration Hours
|
||||
#'
|
||||
#' @description
|
||||
#' Provides an overview analysis of after-hours collaboration time.
|
||||
#' Returns a bar plot showing average weekly after-hours collaboration hours by default.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @details
|
||||
#' Uses the metric \code{After_hours_collaboration_hours}.
|
||||
#'
|
||||
#' @inheritParams create_bar
|
||||
#'
|
||||
#' @family After-Hours
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @examples
|
||||
#' # Return a ggplot bar chart
|
||||
#' afterhours_summary(sq_data, hrvar = "LevelDesignation")
|
||||
#'
|
||||
#' # Return a summary table
|
||||
#' afterhours_summary(sq_data, hrvar = "LevelDesignation", return = "table")
|
||||
#' @export
|
||||
afterhours_summary <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
create_bar(data = data,
|
||||
metric = "After_hours_collaboration_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return,
|
||||
bar_colour = "alert")
|
||||
|
||||
}
|
||||
|
||||
|
||||
#' @rdname afterhours_summary
|
||||
#' @export
|
||||
afterhours_sum <- afterhours_summary
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,106 @@
|
|||
#' @title After-Hours Time Trend
|
||||
#'
|
||||
#' @description
|
||||
#' Provides a week by week view of after-hours collaboration time.
|
||||
#' By default returns a week by week heatmap, highlighting the points in time with most activity.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @details
|
||||
#' Uses the metric \code{After_hours_collaboration_hours}.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#'
|
||||
#' @family After-Hours
|
||||
#'
|
||||
#' @examples
|
||||
#' ## Run plot
|
||||
#' afterhours_trend(sq_data)
|
||||
#'
|
||||
#' ## Run table
|
||||
#' afterhours_trend(sq_data, hrvar = "LevelDesignation", return = "table")
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @export
|
||||
|
||||
afterhours_trend <- function(data, hrvar = "Organization", mingroup=5, return = "plot"){
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c("Date",
|
||||
"After_hours_collaboration_hours",
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
myTable <-
|
||||
data %>%
|
||||
mutate(Date=as.Date(Date, "%m/%d/%Y")) %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
|
||||
select(PersonId, Date, group, After_hours_collaboration_hours) %>%
|
||||
group_by(group) %>%
|
||||
mutate(Employee_Count = n_distinct(PersonId)) %>%
|
||||
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
|
||||
|
||||
myTable <-
|
||||
myTable %>%
|
||||
group_by(Date, group) %>%
|
||||
summarize(Employee_Count=mean(Employee_Count),
|
||||
After_hours_collaboration_hours=mean(After_hours_collaboration_hours))
|
||||
|
||||
myTable_plot <-
|
||||
myTable %>%
|
||||
select(Date, group, After_hours_collaboration_hours)
|
||||
|
||||
myTable_return <-
|
||||
myTable_plot %>%
|
||||
spread(Date, After_hours_collaboration_hours)
|
||||
|
||||
plot_object <-
|
||||
myTable_plot %>%
|
||||
ggplot(aes(x =Date , y = group , fill = After_hours_collaboration_hours)) +
|
||||
geom_tile(height=.5) +
|
||||
scale_fill_gradient(name="Hours", low = "white", high = "red") +
|
||||
theme_classic() +
|
||||
theme(axis.text=element_text(size=12),
|
||||
plot.title = element_text(color="grey40", face="bold", size=18),
|
||||
plot.subtitle = element_text(size=14), legend.position = "right",
|
||||
legend.justification = "right",
|
||||
legend.title=element_text(size=14),
|
||||
legend.text=element_text(size=14)) +
|
||||
labs(title = "After-Hours Collaboration",
|
||||
subtitle = paste("Meeting and Email Time Outside Working Hours by", hrvar)) +
|
||||
xlab("Date") +
|
||||
ylab(hrvar) +
|
||||
labs(caption = extract_date_range(data, return = "text"))
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
myTable_return %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,101 @@
|
|||
#' @title Generate a Capacity report in HTML
|
||||
#'
|
||||
#' @description
|
||||
#' The function generates an interactive HTML report using the
|
||||
#' Standard Person Query data as an input. The report contains a series
|
||||
#' of summary analysis and visualisations relating to key **capacity**
|
||||
#' metrics in Workplace Analytics, including length of week and
|
||||
#' time in after-hours collaboration.
|
||||
#'
|
||||
#' @param data A Standard Person Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
|
||||
#' For example, "capacity report".
|
||||
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
|
||||
#' Defaults to TRUE.
|
||||
#'
|
||||
#' @importFrom purrr map_if
|
||||
#' @importFrom dplyr `%>%`
|
||||
#'
|
||||
#' @family Reports
|
||||
#'
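#' @examples
#' \dontrun{
#' ## Illustrative sketch only: writes an HTML report to the working directory
#' capacity_report(sq_data, hrvar = "LevelDesignation")
#' }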
|
||||
#' @export
|
||||
capacity_report <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
path = "capacity report",
|
||||
timestamp = TRUE){
|
||||
|
||||
## Create timestamped path (if applicable)
|
||||
if(timestamp == TRUE){
|
||||
newpath <- paste(path, wpa::tstamp())
|
||||
} else {
|
||||
newpath <- path
|
||||
}
|
||||
|
||||
# Set outputs
|
||||
output_list <-
|
||||
list(data %>% check_query(return = "text") %>% md2html(),
|
||||
md2html(text = read_preamble("workloads_section.md")), # Header
|
||||
data %>% workloads_summary(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% workloads_summary(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% workloads_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% workloads_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% workloads_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% workloads_trend(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% workloads_line(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% workloads_line(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
|
||||
md2html(text = read_preamble("afterhours_section.md")), # Header
|
||||
data %>% afterhours_summary(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% afterhours_summary(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% afterhours_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% afterhours_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% afterhours_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% afterhours_trend(hrvar = hrvar, mingroup = mingroup, return = "table")) %>%
|
||||
purrr::map_if(is.data.frame, create_dt)
|
||||
|
||||
# Set header titles
|
||||
title_list <-
|
||||
c("Data Overview",
|
||||
|
||||
"Workloads",
|
||||
"Workloads Summary - Plot",
|
||||
"Workloads Summary - Table",
|
||||
"Workloads Distribution - Plot",
|
||||
"Workloads Distribution - Table",
|
||||
"Workloads Trend - Plot",
|
||||
"Workloads Trend - Table",
|
||||
"Workloads over time - Plot",
|
||||
"Workloads over time - Table",
|
||||
|
||||
"After hours",
|
||||
"After hours Summary - Plot",
|
||||
"After hours Summary - Table",
|
||||
"After hours Distribution - Plot",
|
||||
"After hours Distribution - Table",
|
||||
"After hours Trend - Plot",
|
||||
"After hours Trend - Table")
|
||||
|
||||
# Set header levels
|
||||
n_title <- length(title_list)
|
||||
levels_list <- rep(3, n_title)
|
||||
levels_list[c(1, 2, 11)] <- 2 # Section header
|
||||
|
||||
generate_report(title = "Capacity Report",
|
||||
filename = newpath,
|
||||
outputs = output_list,
|
||||
titles = title_list,
|
||||
subheaders = rep("", n_title),
|
||||
echos = rep(FALSE, n_title),
|
||||
levels = levels_list, # Use the computed levels so section headers render at level 2
|
||||
theme = "cosmo",
|
||||
preamble = read_preamble("capacity_report.md"))
|
||||
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,206 @@
|
|||
#' @title Check a query to ensure that it is suitable for analysis
|
||||
#'
|
||||
#' @description
|
||||
#' Prints diagnostic data about the data query to the R console, with information
|
||||
#' such as date range, number of employees, HR attributes identified, etc.
|
||||
#'
|
||||
#' @details
|
||||
#' This can be used with any person-level query, such as the standard person query,
|
||||
#' collaboration assessment query, and the hourly collaboration query. When run,
|
||||
#' this prints diagnostic data to the R console.
|
||||
#'
|
||||
#' @param data A person-level query in the form of a data frame, including the standard person query,
|
||||
#' collaboration assessment query, and the hourly collaboration query.
|
||||
#' @param return Character string to specify whether to return a console message ("message"),
|
||||
#' a character string ("text"), or a data frame. Defaults to "message".
|
||||
#' @param validation Logical value to specify whether to return a check used by the `validation_report()`.
|
||||
#' Defaults to FALSE.
|
||||
#'
|
||||
#' @examples
|
||||
#' check_query(sq_data)
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @export
|
||||
check_query <- function(data, return = "message", validation = FALSE){
|
||||
|
||||
if(!is.data.frame(data)){
|
||||
stop("Input is not a data frame.")
|
||||
}
|
||||
|
||||
if("PersonId" %in% names(data)){
|
||||
|
||||
if(validation == FALSE){
|
||||
|
||||
check_person_query(data = data, return = return)
|
||||
|
||||
} else if(validation == TRUE){
|
||||
|
||||
# Different displays required for validation_report()
|
||||
check_query_validation(data = data, return = return)
|
||||
|
||||
}
|
||||
} else {
|
||||
message("Note: checks are currently unavailable for a non-Person query")
|
||||
}
|
||||
}
|
||||
|
||||
#' @title Check a Person Query to ensure that it is suitable for analysis
|
||||
#'
|
||||
#' @description
|
||||
#' Prints diagnostic data about the data query to the R console, with information
|
||||
#' such as date range, number of employees, HR attributes identified, etc.
|
||||
#'
|
||||
#' @inheritParams check_query
|
||||
#'
|
||||
#' @details Used as part of `check_query()`.
|
||||
#'
|
||||
#'
|
||||
check_person_query <- function(data, return){
|
||||
|
||||
## Query Type - Initialise
|
||||
## Uses `identify_query()`
|
||||
main_chunk <- paste0("The data used is a ", identify_query(data), ".")
|
||||
|
||||
## PersonId
|
||||
if(!("PersonId" %in% names(data))){
|
||||
stop("There is no `PersonId` variable in the input.")
|
||||
} else {
|
||||
new_chunk <- paste("There are", dplyr::n_distinct(data$PersonId), "employees in this dataset.")
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
}
|
||||
|
||||
## Date
|
||||
if(!("Date" %in% names(data))){
|
||||
stop("There is no `Date` variable in the input.")
|
||||
} else {
|
||||
data$Date <- as.Date(data$Date, "%m/%d/%Y")
|
||||
new_chunk <- paste0("Date ranges from ", min(data$Date), " to ", max(data$Date), ".")
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
}
|
||||
|
||||
## HR Variables
|
||||
hr_chr <- extract_hr(data, max_unique = 200) %>% wrap(wrapper = "`")
|
||||
new_chunk <- paste("There are", length(hr_chr), "(estimated) HR attributes in the data:" )
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
new_chunk <- paste(hr_chr, collapse = ", ")
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n")
|
||||
|
||||
## `IsActive` flag
|
||||
if(!("IsActive" %in% names(data))){
|
||||
new_chunk <- "The `IsActive` flag is not present in the data."
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n")
|
||||
} else {
|
||||
data$IsActive <- as.logical(data$IsActive) # Force to logical
|
||||
active_n <- dplyr::n_distinct(data[data$IsActive == TRUE, "PersonId"])
|
||||
new_chunk <- paste0("There are ", active_n, " active employees out of all in the dataset.")
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
}
|
||||
|
||||
## Variable check header
|
||||
new_chunk <- "Variable name check:"
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
|
||||
## Collaboration_hours
|
||||
if(!("Collaboration_hours" %in% names(data)) &
|
||||
("Collaboration_hrs" %in% names(data))){
|
||||
new_chunk <- "`Collaboration_hrs` is used instead of `Collaboration_hours` in the data."
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
} else if(!("Collaboration_hrs" %in% names(data)) &
|
||||
("Collaboration_hours" %in% names(data))){
|
||||
new_chunk <- "`Collaboration_hours` is used instead of `Collaboration_hrs` in the data."
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
} else {
|
||||
new_chunk <- "No collaboration hour metric exists in the data."
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
}
|
||||
|
||||
## Instant_Message_hours
|
||||
if(!("Instant_message_hours" %in% names(data)) &
|
||||
("Instant_Message_hours" %in% names(data))){
|
||||
new_chunk <- "`Instant_Message_hours` is used instead of `Instant_message_hours` in the data."
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
} else if(!("Instant_Message_hours" %in% names(data)) &
|
||||
("Instant_message_hours" %in% names(data))){
|
||||
new_chunk <- "`Instant_message_hours` is used instead of `Instant_Message_hours` in the data."
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
} else {
|
||||
new_chunk <- "No instant message hour metric exists in the data."
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
}
|
||||
|
||||
## Return
|
||||
if(return == "message"){
|
||||
|
||||
main_chunk <- paste("", main_chunk, sep = "\n")
|
||||
message(main_chunk)
|
||||
|
||||
} else if(return == "text"){
|
||||
main_chunk
|
||||
} else {
|
||||
stop("Please check inputs for `return`")
|
||||
}
|
||||
}
|
||||
|
||||
#' @title Perform a query check for the validation report
|
||||
#'
|
||||
#' @description
|
||||
#' Prints diagnostic data about the data query to the R console, with information
|
||||
#' such as date range, number of employees, HR attributes identified, etc.
|
||||
#' Optimised for the `validation_report()`
|
||||
#'
|
||||
#' @inheritParams check_query
|
||||
#'
|
||||
#' @details Used as part of `check_query()`.
|
||||
#'
|
||||
check_query_validation <- function(data, return){
|
||||
## Query Type - Initialise
|
||||
main_chunk <- ""
|
||||
|
||||
## PersonId
|
||||
if(!("PersonId" %in% names(data))){
|
||||
stop("There is no `PersonId` variable in the input.")
|
||||
} else {
|
||||
new_chunk <- paste("There are", dplyr::n_distinct(data$PersonId), "employees in this dataset.")
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
}
|
||||
|
||||
## Date
|
||||
if(!("Date" %in% names(data))){
|
||||
stop("There is no `Date` variable in the input.")
|
||||
} else {
|
||||
data$Date <- as.Date(data$Date, "%m/%d/%Y")
|
||||
new_chunk <- paste0("Date ranges from ", min(data$Date), " to ", max(data$Date), ".")
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
}
|
||||
|
||||
## HR Variables
|
||||
hr_chr <- extract_hr(data, max_unique = 200) %>% wrap(wrapper = "`")
|
||||
new_chunk <- paste("There are", length(hr_chr), "(estimated) HR attributes in the data:" )
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
new_chunk <- paste(hr_chr, collapse = ", ")
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n")
|
||||
|
||||
## `IsActive` flag
|
||||
if(!("IsActive" %in% names(data))){
|
||||
new_chunk <- "The `IsActive` flag is not present in the data."
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n")
|
||||
} else {
|
||||
data$IsActive <- as.logical(data$IsActive) # Force to logical
|
||||
active_n <- dplyr::n_distinct(data[data$IsActive == TRUE, "PersonId"])
|
||||
new_chunk <- paste0("There are ", active_n, " active employees out of all in the dataset.")
|
||||
main_chunk <- paste(main_chunk, new_chunk, sep = "\n\n")
|
||||
}
|
||||
|
||||
## Return
|
||||
if(return == "message"){
|
||||
|
||||
main_chunk <- paste("", main_chunk, sep = "\n")
|
||||
message(main_chunk)
|
||||
|
||||
} else if(return == "text"){
|
||||
main_chunk
|
||||
} else {
|
||||
stop("Please check inputs for `return`")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,81 @@
|
|||
#' @title Generate a Coaching report in HTML
|
||||
#'
|
||||
#' @description
|
||||
#' The function generates an interactive HTML report using
|
||||
#' Standard Query data as an input. The report contains a series
|
||||
#' of summary analysis and visualisations relating to key **coaching**
|
||||
#' metrics in Workplace Analytics, specifically relating to the time
|
||||
#' spent between managers and their direct reports.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "LevelDesignation"
|
||||
#' but accepts any character vector, e.g. "Organization"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
|
||||
#' For example, "collaboration report".
|
||||
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
|
||||
#' Defaults to TRUE.
|
||||
#'
|
||||
#' @importFrom purrr map_if
|
||||
#' @importFrom dplyr `%>%`
|
||||
#'
|
||||
#' @family Reports
|
||||
#'
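#' @examples
#' \dontrun{
#' ## Illustrative sketch only: writes an HTML report to the working directory
#' coaching_report(sq_data, hrvar = "Organization")
#' }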
|
||||
#' @export
|
||||
coaching_report <- function(data,
|
||||
hrvar = "LevelDesignation",
|
||||
mingroup = 5,
|
||||
path = "coaching report",
|
||||
timestamp = TRUE){
|
||||
|
||||
## Create timestamped path (if applicable)
|
||||
if(timestamp == TRUE){
|
||||
newpath <- paste(path, wpa::tstamp())
|
||||
} else {
|
||||
newpath <- path
|
||||
}
|
||||
|
||||
output_list <-
|
||||
list(data %>% check_query(return = "text") %>% md2html(),
|
||||
|
||||
data %>% mgrrel_matrix(hrvar = hrvar, return = "plot"), # no mingroup arg
|
||||
data %>% mgrrel_matrix(hrvar = hrvar, return = "table"), # no mingroup arg
|
||||
|
||||
data %>% one2one_sum(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% one2one_sum(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% one2one_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% one2one_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% one2one_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% one2one_trend(hrvar = hrvar, mingroup = mingroup, return = "table")) %>%
|
||||
purrr::map_if(is.data.frame, create_dt)
|
||||
|
||||
title_list <-
|
||||
c("Data Overview",
|
||||
|
||||
"Manager Relation Style - Plot",
|
||||
"Manager Relation Style - Table",
|
||||
|
||||
"1-to-1 Summary - Plot",
|
||||
"1-to-1 Summary - Table",
|
||||
"1-to-1 Distribution - Plot",
|
||||
"1-to-1 Distribution - Table",
|
||||
"1-to-1 Trend - Plot",
|
||||
"1-to-1 Trend - Table")
|
||||
|
||||
n_title <- length(title_list)
|
||||
|
||||
generate_report(title = "Coaching Report",
|
||||
filename = newpath,
|
||||
outputs = output_list,
|
||||
titles = title_list,
|
||||
subheaders = rep("", n_title),
|
||||
echos = rep(FALSE, n_title),
|
||||
levels = rep(3, n_title),
|
||||
theme = "cosmo",
|
||||
preamble = read_preamble("coaching_report.md"))
|
||||
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,134 @@
|
|||
#' @title Collaboration - Stacked Area Plot
|
||||
#'
|
||||
#' @description
|
||||
#' Provides an overview analysis of 'Weekly Digital Collaboration'.
|
||||
#' Returns a stacked area plot of Email and Meeting Hours by default.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @details
|
||||
#' Uses the metrics `Meeting_hours`, `Email_hours`, `Unscheduled_Call_hours`,
|
||||
#' and `Instant_Message_hours`.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' A Collaboration Assessment dataset may also be provided, in which case
|
||||
#' Unscheduled call hours would be included in the output.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#'
|
||||
#' @family Collaboration
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
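#' @examples
#' ## Illustrative sketch only
#' collaboration_area(sq_data, hrvar = "Organization")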
|
||||
#' @export
|
||||
|
||||
collaboration_area <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup=5,
|
||||
return = "plot"){
|
||||
|
||||
data$Date <- as.Date(data$Date, format = "%m/%d/%Y")
|
||||
|
||||
if("Instant_message_hours" %in% names(data)){
|
||||
|
||||
data <- rename(data, Instant_Message_hours = "Instant_message_hours")
|
||||
|
||||
}
|
||||
|
||||
if("Unscheduled_call_hours" %in% names(data)){
|
||||
|
||||
data <- rename(data, Unscheduled_Call_hours = "Unscheduled_call_hours")
|
||||
|
||||
}
|
||||
|
||||
if("Unscheduled_Call_hours" %in% names(data)){
|
||||
|
||||
main_vars <- c("Meeting_hours",
|
||||
"Email_hours",
|
||||
"Instant_Message_hours",
|
||||
"Unscheduled_Call_hours")
|
||||
|
||||
} else {
|
||||
|
||||
main_vars <- c("Meeting_hours",
|
||||
"Email_hours",
|
||||
"Instant_Message_hours")
|
||||
|
||||
}
|
||||
|
||||
myTable <-
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
|
||||
select(PersonId,
|
||||
Date,
|
||||
group,
|
||||
main_vars) %>%
|
||||
group_by(Date, group) %>%
|
||||
summarise_at(vars(main_vars), ~mean(.)) %>%
|
||||
left_join(hrvar_count(data, hrvar, return = "table"),
|
||||
by = c("group" = hrvar)) %>%
|
||||
rename(Employee_Count = "n") %>%
|
||||
filter(Employee_Count >= mingroup) %>%
|
||||
ungroup()
|
||||
|
||||
|
||||
myTable_long <-
|
||||
myTable %>%
|
||||
select(Date, group, ends_with("_hours")) %>%
|
||||
gather(Metric, Hours, -Date, -group) %>%
|
||||
mutate(Metric = sub(pattern = "_hours", replacement = "", x = Metric))
|
||||
|
||||
## Levels
|
||||
level_chr <- sub(pattern = "_hours", replacement = "", x = main_vars)
|
||||
|
||||
## Colour definitions
|
||||
colour_defs <-
|
||||
c("Meeting" = rgb2hex(174, 239, 168),
|
||||
"Email" = rgb2hex(192, 117, 205),
|
||||
"Instant_Message" = rgb2hex(113, 200, 234),
|
||||
"Unscheduled_Call" = rgb2hex(252, 161, 124))
|
||||
|
||||
colour_defs <- colour_defs[names(colour_defs) %in% level_chr]
|
||||
|
||||
plot_object <-
|
||||
myTable_long %>%
|
||||
mutate(Metric = factor(Metric, levels = level_chr)) %>%
|
||||
ggplot(aes(x = Date, y = Hours, colour = Metric)) +
|
||||
geom_area(aes(fill = Metric), alpha = 1.0, position = 'stack') +
|
||||
theme_wpa_basic() +
|
||||
scale_y_continuous(labels = round) +
|
||||
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
|
||||
scale_colour_manual(values = colour_defs) +
|
||||
scale_fill_manual(values = colour_defs) +
|
||||
facet_wrap(.~group) +
|
||||
labs(title = "Total Collaboration Hours",
|
||||
subtitle = paste("Weekly collaboration hours by", camel_clean(hrvar))) +
|
||||
labs(caption = extract_date_range(data, return = "text"))
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
myTable %>%
|
||||
as_tibble() %>%
|
||||
mutate(Collaboration_hours = select(., main_vars) %>%
|
||||
apply(1, sum, na.rm = TRUE))
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,40 @@
|
|||
#' @title Collaboration Hours distribution
|
||||
#'
|
||||
#' @description
|
||||
#' Analyze the distribution of Collaboration Hours.
|
||||
#' Returns a stacked bar plot by default.
|
||||
#' Additional options available to return a table with distribution elements.
|
||||
#'
|
||||
#' @details
|
||||
#' Uses the metric `Collaboration_hours`.
|
||||
#'
|
||||
#' @inheritParams create_dist
|
||||
#' @family Collaboration
|
||||
#'
|
||||
#' @examples
|
||||
#' ## Return a plot
|
||||
#' collaboration_dist(sq_data, hrvar = "Organization")
|
||||
#'
|
||||
#' ## Return a table
|
||||
#' collaboration_dist(sq_data, hrvar = "Organization", return = "table")
|
||||
#' @export
|
||||
|
||||
collaboration_dist <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot",
|
||||
cut = c(15, 20, 25)) {
|
||||
|
||||
create_dist(data = data,
|
||||
metric = "Collaboration_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return,
|
||||
cut = cut)
|
||||
|
||||
|
||||
}
|
||||
|
||||
#' @rdname collaboration_dist
|
||||
#' @export
|
||||
collaboration_distribution <- collaboration_dist
|
|
@ -0,0 +1,30 @@
|
|||
#' @title Distribution of Collaboration Hours (Fizzy Drink plot)
|
||||
#'
|
||||
#' @description
|
||||
#' Analyzes the distribution of weekly collaboration hours and returns
#' a 'fizzy' scatter plot by default.
|
||||
#' Additional options available to return a table with distribution elements.
|
||||
#'
|
||||
#' @details
|
||||
#' Uses the metric `Collaboration_hours`.
|
||||
#'
|
||||
#' @inheritParams create_fizz
|
||||
#'
|
||||
#' @family Collaboration
|
||||
#'
|
||||
#' @examples
|
||||
#' collaboration_fizz(sq_data, hrvar = "Organization", return = "table")
|
||||
#' @export
|
||||
|
||||
collaboration_fizz <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
create_fizz(data = data,
|
||||
metric = "Collaboration_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return)
|
||||
|
||||
}
|
|
@ -0,0 +1,61 @@
|
|||
#' @title Collaboration Time Trend - Line Chart
|
||||
#'
|
||||
#' @description
|
||||
#' Provides a week by week view of collaboration time, visualised as line charts.
|
||||
#' By default returns a line chart for collaboration hours,
|
||||
#' with a separate panel per value in the HR attribute.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @details
|
||||
#' Uses the metric `Collaboration_hours`.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#'
|
||||
#' @family Collaboration
|
||||
#'
|
||||
#' @examples
|
||||
#'
|
||||
#' ## Return a line plot
|
||||
#' collaboration_line(sq_data, hrvar = "LevelDesignation")
|
||||
#'
|
||||
#'
|
||||
#' ## Return a table
|
||||
#' collaboration_line(sq_data, hrvar = "LevelDesignation", return = "table")
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @export
|
||||
|
||||
collaboration_line <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup=5,
|
||||
return = "plot"){
|
||||
|
||||
## Inherit arguments
|
||||
output <- create_line(data = data,
|
||||
metric = "Collaboration_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return)
|
||||
|
||||
if(return == "plot"){
|
||||
output +
|
||||
labs(title = "Collaboration Hours")
|
||||
} else if(return == "table"){
|
||||
output
|
||||
} else {
|
||||
stop("Invalid `return` value")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,50 @@
|
|||
#' @title Collaboration Ranking
|
||||
#'
|
||||
#' @description
|
||||
#' This function scans a standard query output for groups with high levels of 'Weekly Digital Collaboration'.
|
||||
#' Returns a table with all groups (across multiple HR attributes) ranked by hours of digital collaboration.
|
||||
#'
|
||||
#' @details
|
||||
#' Uses the metric `Collaboration_hours`.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar A list of HR Variables to consider in the scan.
|
||||
#' Defaults to all HR attributes identified.
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
|
||||
#' Defaults to 5.
|
||||
#' @param return A character vector specifying what to return.
|
||||
#' Valid values include "table" (default, returning a data frame)
#' and "html" (returning an interactive DataTable via `create_dt()`).
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom stats reorder
|
||||
#'
|
||||
#' @family Collaboration
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a data frame by default, where "table" is passed in `return`.
#' When "html" is passed, an interactive DataTable is returned.
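#'
#' @examples
#' ## Illustrative call (not from the original source), using the bundled
#' ## `sq_data` sample dataset:
#' collaboration_rank(sq_data, return = "table")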
|
||||
#'
|
||||
#' @export
|
||||
|
||||
collaboration_rank <- function(data,
|
||||
hrvar = extract_hr(data),
|
||||
mingroup = 5,
|
||||
return = "table"){
|
||||
|
||||
output <-
|
||||
data %>% create_rank(metric = "Collaboration_hours", hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = "table")
|
||||
|
||||
if(return == "html"){
|
||||
return(create_dt(output))
|
||||
} else if(return == "table"){
|
||||
return(output)
|
||||
} else {
|
||||
stop("Invalid `return` argument.")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,125 @@
|
|||
#' @title Generate a Collaboration report in HTML
|
||||
#'
|
||||
#' @description
|
||||
#' The function generates an interactive HTML report using
|
||||
#' Standard Query data as an input. The report contains a series
|
||||
#' of summary analysis and visualisations relating to key **collaboration**
|
||||
#' metrics in Workplace Analytics, including email and meeting hours.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
|
||||
#' For example, "collaboration report".
|
||||
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
|
||||
#' Defaults to TRUE.
|
||||
#'
|
||||
#' @importFrom purrr map_if
|
||||
#' @importFrom dplyr `%>%`
|
||||
#'
|
||||
#' @family Reports
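#'
#' @examples
#' \dontrun{
#' ## Illustrative call (not from the original source); writes an HTML report
#' ## to the working directory, using the bundled `sq_data` sample dataset:
#' collaboration_report(sq_data, hrvar = "LevelDesignation")
#' }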
|
||||
#'
|
||||
#' @export
|
||||
collaboration_report <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
path = "collaboration report",
|
||||
timestamp = TRUE){
|
||||
|
||||
## Create timestamped path (if applicable)
|
||||
if(timestamp == TRUE){
|
||||
newpath <- paste(path, wpa::tstamp())
|
||||
} else {
|
||||
newpath <- path
|
||||
}
|
||||
|
||||
# Set outputs
|
||||
output_list <-
|
||||
list(data %>% check_query(return = "text") %>% md2html(),
|
||||
md2html(text = read_preamble("collaboration_section.md")), # Header
|
||||
data %>% collaboration_sum(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% collaboration_sum(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
|
||||
data %>% afterhours_sum(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% afterhours_sum(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
|
||||
data %>% collaboration_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% collaboration_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% collaboration_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% collaboration_trend(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% collaboration_line(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% collaboration_line(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
|
||||
md2html(text = read_preamble("email_section.md")), # Header
|
||||
data %>% email_summary(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% email_summary(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% email_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% email_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% email_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% email_trend(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
|
||||
md2html(text = read_preamble("meeting_section.md")), # Header
|
||||
data %>% meeting_summary(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% meeting_summary(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% meeting_dist(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% meeting_dist(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% meeting_trend(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% meeting_trend(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
data %>% meeting_quality(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% meeting_quality(hrvar = hrvar, mingroup = mingroup, return = "table")) %>%
|
||||
purrr::map_if(is.data.frame, create_dt)
|
||||
|
||||
# Set header titles
|
||||
title_list <-
|
||||
c("Data Overview",
|
||||
|
||||
"Collaboration", # Section header
|
||||
"Collaboration Summary - Plot",
|
||||
"Collaboration Summary - Table",
|
||||
|
||||
"Afterhours Summary - Plot",
|
||||
"Afterhours Summary - Table",
|
||||
|
||||
"Collaboration Distribution - Plot",
|
||||
"Collaboration Distribution - Table",
|
||||
"Collaboration Trend - Plot",
|
||||
"Collaboration Trend - Table",
|
||||
"Collaboration over time - Plot",
|
||||
"Collaboration over time - Table",
|
||||
|
||||
"Email", # Section header
|
||||
"Email Summary - Plot",
|
||||
"Email Summary - Table",
|
||||
"Email Distribution - Plot",
|
||||
"Email Distribution - Table",
|
||||
"Email Trend - Plot",
|
||||
"Email Trend - Table",
|
||||
|
||||
"Meeting", # Section header
|
||||
"Meeting Summary - Plot",
|
||||
"Meeting Summary - Table",
|
||||
"Meeting Distribution - Plot",
|
||||
"Meeting Distribution - Table",
|
||||
"Meeting Trend - Plot",
|
||||
"Meeting Trend - Table",
|
||||
"Meeting Quality - Plot",
|
||||
"Meeting Quality - Table")
|
||||
|
||||
# Set header levels
|
||||
n_title <- length(title_list)
|
||||
levels_list <- rep(3, n_title)
|
||||
levels_list[c(1, 2, 13, 20)] <- 2 # Section header
|
||||
|
||||
# Generate report
|
||||
generate_report(title = "Collaboration Report",
|
||||
filename = newpath,
|
||||
outputs = output_list,
|
||||
titles = title_list,
|
||||
subheaders = rep("", n_title),
|
||||
echos = rep(FALSE, n_title),
|
||||
levels = levels_list,
|
||||
theme = "cosmo",
|
||||
preamble = read_preamble("collaboration_report.md"))
|
||||
|
||||
}
|
|
@ -0,0 +1,79 @@
|
|||
#' @title Collaboration Summary
|
||||
#'
|
||||
#' @description
|
||||
#' Provides an overview analysis of 'Weekly Digital Collaboration'.
|
||||
#' Returns a stacked bar plot of Email and Meeting Hours by default.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @details
|
||||
#' Uses the metrics `Meeting_hours` and `Email_hours`, plus
#' `Instant_Message_hours` and `Unscheduled_Call_hours` where these are available in the data.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom stats reorder
|
||||
#'
|
||||
#' @family Collaboration
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
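#'
#' @examples
#' ## Illustrative calls (not from the original source), using the bundled
#' ## `sq_data` sample dataset:
#' collaboration_sum(sq_data, hrvar = "LevelDesignation")
#' collaboration_sum(sq_data, hrvar = "LevelDesignation", return = "table")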
|
||||
#'
|
||||
#' @export
|
||||
|
||||
collaboration_sum <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup=5,
|
||||
return = "plot"){
|
||||
|
||||
if("Instant_message_hours" %in% names(data)){
|
||||
|
||||
data <- rename(data, Instant_Message_hours = "Instant_message_hours")
|
||||
|
||||
}
|
||||
|
||||
if("Unscheduled_Call_hours" %in% names(data)){
|
||||
|
||||
main_vars <- c("Meeting_hours",
|
||||
"Email_hours",
|
||||
"Instant_Message_hours",
|
||||
"Unscheduled_Call_hours")
|
||||
|
||||
} else {
|
||||
|
||||
main_vars <- c("Meeting_hours",
|
||||
"Email_hours")
|
||||
|
||||
}
|
||||
|
||||
create_stacked(data = data,
|
||||
hrvar = hrvar,
|
||||
metrics = main_vars,
|
||||
mingroup = mingroup,
|
||||
return = return)
|
||||
|
||||
}
|
||||
|
||||
#' @rdname collaboration_sum
|
||||
#' @export
|
||||
collab_sum <- collaboration_sum
|
||||
|
||||
#' @rdname collaboration_sum
|
||||
#' @export
|
||||
collaboration_summary <- collaboration_sum
|
||||
|
||||
#' @rdname collaboration_sum
|
||||
#' @export
|
||||
collab_summary <- collaboration_sum
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,100 @@
|
|||
#' @title Collaboration Time Trend
|
||||
#'
|
||||
#' @description
|
||||
#' Provides a week by week view of collaboration time.
|
||||
#' By default returns a week by week heatmap, highlighting the points in time with most activity.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @details
|
||||
#' Uses the metric `Collaboration_hours`.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#'
|
||||
#' @family Collaboration
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
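#'
#' @examples
#' ## Illustrative calls (not from the original source), using the bundled
#' ## `sq_data` sample dataset:
#' collaboration_trend(sq_data)
#' collaboration_trend(sq_data, hrvar = "LevelDesignation", return = "table")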
|
||||
#'
|
||||
#' @export
|
||||
|
||||
collaboration_trend <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup=5,
|
||||
return = "plot"){
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c("Date",
|
||||
"Collaboration_hours",
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
myTable <-
|
||||
data %>%
|
||||
mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
|
||||
select(PersonId, Date, group, Collaboration_hours) %>%
|
||||
group_by(group) %>%
|
||||
mutate(Employee_Count = n_distinct(PersonId)) %>%
|
||||
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
|
||||
|
||||
myTable <-
|
||||
myTable %>%
|
||||
group_by(Date, group) %>%
|
||||
summarize(Employee_Count = mean(Employee_Count),
|
||||
Collaboration_hours = mean(Collaboration_hours))
|
||||
|
||||
myTable_plot <- myTable %>% select(Date, group, Collaboration_hours)
|
||||
|
||||
myTable_return <- myTable_plot %>% spread(Date, Collaboration_hours)
|
||||
|
||||
plot_object <-
|
||||
myTable_plot %>%
|
||||
ggplot(aes(x = Date , y = group , fill = Collaboration_hours)) +
|
||||
geom_tile(height=.5) +
|
||||
scale_fill_gradient(name="Hours", low = "white", high = "red") +
|
||||
theme_classic() +
|
||||
theme(axis.text=element_text(size=12),
|
||||
plot.title = element_text(color="grey40", face="bold", size=18),
|
||||
plot.subtitle = element_text(size=14),
|
||||
legend.position = "right",
|
||||
legend.justification = "right",
|
||||
legend.title=element_text(size=14),
|
||||
legend.text=element_text(size=14)) +
|
||||
labs(title = "Collaboration Hours",
|
||||
subtitle = paste("Total meeting and email time by", tolower(hrvar))) +
|
||||
xlab("Date") +
|
||||
ylab(hrvar) +
|
||||
labs(caption = extract_date_range(data, return = "text"))
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
myTable_return %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,73 @@
|
|||
#' @title Generate a Connectivity report in HTML
|
||||
#'
|
||||
#' @description
|
||||
#' The function generates an interactive HTML report using
|
||||
#' Standard Query data as an input. The report contains a series
|
||||
#' of summary analysis and visualisations relating to key **connectivity**
|
||||
#' metrics in Workplace Analytics, including external/internal network size
|
||||
#' vs breadth.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
|
||||
#' For example, "connectivity report".
|
||||
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
|
||||
#' Defaults to TRUE.
|
||||
#'
|
||||
#' @importFrom purrr map_if
|
||||
#' @importFrom dplyr `%>%`
|
||||
#'
|
||||
#' @family Reports
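#'
#' @examples
#' \dontrun{
#' ## Illustrative call (not from the original source); writes an HTML report
#' ## to the working directory, using the bundled `sq_data` sample dataset:
#' connectivity_report(sq_data, hrvar = "LevelDesignation")
#' }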
|
||||
#'
|
||||
#' @export
|
||||
connectivity_report <- function(data,
|
||||
hrvar = "LevelDesignation",
|
||||
mingroup = 5,
|
||||
path = "connectivity report",
|
||||
timestamp = TRUE){
|
||||
|
||||
## Create timestamped path (if applicable)
|
||||
if(timestamp == TRUE){
|
||||
newpath <- paste(path, wpa::tstamp())
|
||||
} else {
|
||||
newpath <- path
|
||||
}
|
||||
|
||||
output_list <-
|
||||
list(data %>% check_query(return = "text") %>% md2html(),
|
||||
|
||||
data %>% external_network_plot(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% external_network_plot(hrvar = hrvar, mingroup = mingroup, return = "table"),
|
||||
|
||||
data %>% internal_network_plot(hrvar = hrvar, mingroup = mingroup, return = "plot"),
|
||||
data %>% internal_network_plot(hrvar = hrvar, mingroup = mingroup, return = "table")) %>%
|
||||
purrr::map_if(is.data.frame, create_dt)
|
||||
|
||||
title_list <-
|
||||
c("Data Overview",
|
||||
|
||||
"External network - Plot",
|
||||
"External network - Table",
|
||||
|
||||
"Internal network - Plot",
|
||||
"Internal network - Table")
|
||||
|
||||
n_title <- length(title_list)
|
||||
|
||||
generate_report(title = "Connectivity Report",
|
||||
filename = newpath,
|
||||
outputs = output_list,
|
||||
titles = title_list,
|
||||
subheaders = rep("", n_title),
|
||||
echos = rep(FALSE, n_title),
|
||||
levels = rep(3, n_title),
|
||||
theme = "cosmo",
|
||||
preamble = read_preamble("connectivity_report.md"))
|
||||
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,32 @@
|
|||
#' @title Copy a data frame to clipboard for pasting in Excel
|
||||
#'
|
||||
#' @description
|
||||
#' This is a pipe-optimised function, that feeds into `wpa::export()`,
|
||||
#' but can be used as a stand-alone function.
|
||||
#'
|
||||
#' Based on the original function at http://www.github.com/martinctc/surveytoolbox.
|
||||
#'
|
||||
#' @param x Data frame to be passed through. Cannot contain list-columns or nested data frames.
|
||||
#' @param row.names A logical vector specifying whether to include row names. Defaults to FALSE.
#' @param col.names A logical vector specifying whether to include column names. Defaults to TRUE.
#' @param quietly Logical value. Set to TRUE to suppress printing the data frame to the console. Defaults to FALSE.
|
||||
#' @param ... Additional arguments for write.table().
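#'
#' @examples
#' \dontrun{
#' ## Illustrative call (not from the original source); copies a summary table
#' ## to the clipboard for pasting into Excel (uses the Windows clipboard):
#' collaboration_sum(sq_data, return = "table") %>% copy_df()
#' }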
|
||||
#'
|
||||
#' @importFrom utils write.table
|
||||
#'
|
||||
#' @export
|
||||
|
||||
copy_df <- function(x,
|
||||
row.names = FALSE,
|
||||
col.names = TRUE,
|
||||
quietly = FALSE,...) {
|
||||
|
||||
|
||||
utils::write.table(x,"clipboard-50000",
|
||||
sep="\t",
|
||||
row.names=row.names,
|
||||
col.names=col.names,...)
|
||||
|
||||
if(quietly==FALSE) print(x)
|
||||
|
||||
}
|
|
@ -0,0 +1,92 @@
|
|||
#' @title Calculate Information Value for a selected outcome variable
|
||||
#'
|
||||
#' @description
|
||||
#' Specify an outcome variable and return IV outputs.
|
||||
#' All numeric variables in the dataset are used as predictor variables.
|
||||
#'
|
||||
#' @param data A Person Query dataset in the form of a data frame.
|
||||
#' @param predictors A character vector specifying the columns to be used as predictors.
|
||||
#' Defaults to NULL, where all numeric vectors in the data will be used as predictors.
|
||||
#' @param outcome A string specifying a binary variable, i.e. can only contain
|
||||
#' the values 1 or 0.
|
||||
#' @param bins Number of bins to use in `Information::create_infotables()`, defaults to 5.
|
||||
#' @param return String specifying what output to return.
|
||||
#' Defaults to "plot", which returns a bar plot summarising the information value.
#' "summary" returns a summary table, "list" returns a list of outputs for all the
#' input variables, and "plot-WOE" compares the weight of evidence (WOE) distributions for the top predictors.
|
||||
#'
|
||||
#' @import dplyr
|
||||
#'
|
||||
#' @family Information Value
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' sq_data %>%
|
||||
#' mutate(X = ifelse(Collaboration_hours > 12, 1, 0)) %>%
|
||||
#' create_IV(outcome = "X")
|
||||
#' sq_data %>%
|
||||
#' mutate(X = ifelse(Collaboration_hours > 12, 1, 0)) %>%
|
||||
#' create_IV(outcome = "X",
|
||||
#' predictors = c("Email_hours", "Meeting_hours"),
|
||||
#' return = "list")
|
||||
#' }
|
||||
#'
|
||||
#' @export
|
||||
create_IV <- function(data,
|
||||
predictors = NULL,
|
||||
outcome,
|
||||
bins = 5,
|
||||
return = "plot"){
|
||||
|
||||
if(is.null(predictors)){
|
||||
train <-
|
||||
data %>%
|
||||
rename(outcome = !!sym(outcome)) %>%
|
||||
select_if(is.numeric) %>%
|
||||
tidyr::drop_na()
|
||||
} else {
|
||||
train <-
|
||||
data %>%
|
||||
rename(outcome = !!sym(outcome)) %>%
select(tidyselect::all_of(predictors), outcome) %>%
|
||||
tidyr::drop_na()
|
||||
}
|
||||
|
||||
# Calculate Odds
|
||||
odds <- sum(train$outcome) / (length(train$outcome) - sum(train$outcome))
|
||||
lnodds <- log(odds)
|
||||
|
||||
# IV Analysis
|
||||
IV <- Information::create_infotables(data = train, y = "outcome", bins = bins)
|
||||
IV_summary <- IV$Summary
|
||||
IV_names <- names(IV$Tables)
|
||||
|
||||
# Output list
|
||||
output_list <-
|
||||
IV_names %>%
|
||||
purrr::map(function(x){
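# For each predictor, convert the bin-level weight of evidence (WOE) back to
# the posterior odds and probability of the outcome:
# odds = exp(WOE + log(baseline odds)); prob = odds / (1 + odds)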
|
||||
IV$Tables[[x]] %>%
|
||||
mutate(ODDS = exp(WOE + lnodds),
|
||||
PROB = ODDS / (ODDS + 1))
|
||||
}) %>%
|
||||
purrr::set_names(IV_names)
|
||||
|
||||
if(return == "summary"){
|
||||
IV_summary
|
||||
} else if(return == "plot"){
|
||||
IV_summary %>%
|
||||
utils::head(12) %>%
|
||||
create_bar_asis(group_var = "Variable",
|
||||
bar_var = "IV",
|
||||
title = "Information Value (IV)",
|
||||
subtitle = "Showing top 12 only")
|
||||
|
||||
} else if(return == "plot-WOE"){
|
||||
Information::plot_infotables(IV, IV$Summary$Variable, same_scale = TRUE) %>% grDevices::recordPlot()
|
||||
|
||||
} else if(return == "list"){
|
||||
output_list
|
||||
} else {
|
||||
stop("Please enter a valid input for `return`.")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,168 @@
|
|||
#' @title Mean Bar Plot (General Purpose)
|
||||
#'
|
||||
#' @description
|
||||
#' Provides an overview analysis of a selected metric by calculating a mean per group.
|
||||
#' Returns a bar plot showing the average of a selected metric by default.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param metric Character string containing the name of the metric,
|
||||
#' e.g. "Collaboration_hours"
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#' @param bar_colour String to specify colour to use for bars.
|
||||
#' In-built accepted values include "default" (default), "alert" (red), and
|
||||
#' "darkblue". Otherwise, hex codes are also accepted. You can also supply
|
||||
#' RGB values via `rgb2hex()`.
|
||||
#' @param na.rm A logical value indicating whether NA values should be stripped
|
||||
#' before the computation proceeds. Defaults to FALSE.
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @importFrom scales wrap_format
|
||||
#' @importFrom stats reorder
|
||||
#'
|
||||
#' @family General
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @examples
|
||||
#' # Return a ggplot bar chart
|
||||
#' create_bar(sq_data, metric = "Collaboration_hours", hrvar = "LevelDesignation")
|
||||
#' create_bar(sq_data, metric = "Generated_workload_email_hours")
|
||||
#' create_bar(sq_data, metric = "After_hours_collaboration_hours")
|
||||
#'
|
||||
#' # Return a summary table
|
||||
#' create_bar(sq_data, metric = "Collaboration_hours", hrvar = "LevelDesignation", return = "table")
|
||||
#' @export
|
||||
create_bar <- function(data,
|
||||
metric,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot",
|
||||
bar_colour = "default",
|
||||
na.rm = FALSE){
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c("Date",
|
||||
metric,
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
## Clean metric name
|
||||
clean_nm <- gsub(pattern = "_", replacement = " ", x = metric)
|
||||
|
||||
## Data for bar plot
|
||||
plot_data <-
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar)) %>%
|
||||
group_by(PersonId, group) %>%
|
||||
summarise(!!sym(metric) := mean(!!sym(metric), na.rm = na.rm)) %>%
|
||||
ungroup() %>%
|
||||
left_join(data %>%
|
||||
rename(group = !!sym(hrvar)) %>%
|
||||
group_by(group) %>%
|
||||
summarise(Employee_Count = n_distinct(PersonId)),
|
||||
by = "group") %>%
|
||||
filter(Employee_Count >= mingroup)
|
||||
|
||||
## Colour bar override
|
||||
if(bar_colour == "default"){
|
||||
|
||||
bar_colour <- rgb2hex(0, 120, 212)
|
||||
|
||||
} else if(bar_colour == "alert"){
|
||||
|
||||
bar_colour <- rgb2hex(215, 59, 1)
|
||||
|
||||
} else if(bar_colour == "darkblue"){
|
||||
|
||||
bar_colour <- rgb2hex(0, 32, 80)
|
||||
|
||||
}
|
||||
|
||||
## Employee count / base size table
|
||||
plot_legend <-
|
||||
plot_data %>%
|
||||
group_by(group) %>%
|
||||
summarize(Employee_Count = first(Employee_Count)) %>%
|
||||
mutate(Employee_Count = paste("n=",Employee_Count))
|
||||
|
||||
## Data for bar plot
|
||||
plot_table <-
|
||||
plot_data %>%
|
||||
group_by(group) %>%
|
||||
summarise_at(metric, ~mean(., na.rm = na.rm)) %>%
|
||||
arrange(desc(!!sym(metric)))
|
||||
|
||||
## Table for annotation
|
||||
annot_table <-
|
||||
plot_legend %>%
|
||||
dplyr::left_join(plot_table, by = "group")
|
||||
|
||||
## Location attribute for x axis
|
||||
location <- plot_table %>% select(!!sym(metric)) %>% max()
|
||||
|
||||
## Bar plot
|
||||
plot_object <-
|
||||
plot_table %>%
|
||||
ggplot(aes(x = stats::reorder(group, !!sym(metric)), y = !!sym(metric))) +
|
||||
geom_bar(stat = "identity",
|
||||
fill = bar_colour) +
|
||||
geom_text(aes(label = round(!!sym(metric), 1)),
|
||||
hjust = 1.3,
|
||||
color = "#FFFFFF",
|
||||
fontface = "bold",
|
||||
size = 4) +
|
||||
scale_y_continuous(limits = c(0, location * 1.25)) +
|
||||
annotate("text",
|
||||
x = plot_legend$group,
|
||||
y = location * 1.15,
|
||||
label = plot_legend$Employee_Count) +
|
||||
annotate("rect", xmin = 0.5, xmax = length(plot_legend$group) + 0.5, ymin = location * 1.05, ymax = location * 1.25, alpha = .2) +
|
||||
coord_flip() +
|
||||
theme_classic() +
|
||||
theme(axis.text = element_text(size=12),
|
||||
plot.title = element_text(color="grey40", face="bold", size=18),
|
||||
plot.subtitle = element_text(size=14),
|
||||
legend.position = "top",
|
||||
legend.justification = "right",
|
||||
legend.title = element_text(size=14),
|
||||
legend.text = element_text(size=14)) +
|
||||
labs(title = clean_nm,
|
||||
subtitle = paste("Average", clean_nm, "by", camel_clean(hrvar))) +
|
||||
xlab(camel_clean(hrvar)) +
|
||||
ylab(paste("Average weekly", clean_nm)) +
|
||||
labs(caption = extract_date_range(data, return = "text"))
|
||||
|
||||
summary_table <-
|
||||
plot_data %>%
|
||||
select(group, !!sym(metric)) %>%
|
||||
group_by(group) %>%
|
||||
summarise(!!sym(metric) := mean(!!sym(metric)),
|
||||
n = n())
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
return(summary_table)
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,87 @@
|
|||
#' @title Create a bar chart without aggregation
|
||||
#'
|
||||
#' @description
|
||||
#' This function creates a bar chart directly from the aggregated / summarised data.
|
||||
#' Unlike `create_bar()` which performs a person-level aggregation, there is no
|
||||
#' calculation for `create_bar_asis()` and the values are rendered as they are passed
|
||||
#' into the function.
|
||||
#'
|
||||
#' @param data Plotting data as a data frame.
|
||||
#' @param group_var String containing name of variable for the group.
|
||||
#' @param bar_var String containing name of variable representing the value of the bars.
|
||||
#' @param title Title of the plot.
|
||||
#' @param subtitle Subtitle of the plot.
|
||||
#' @param caption Caption of the plot.
|
||||
#' @param ylab Y-axis label for the plot (group axis)
|
||||
#' @param xlab X-axis label of the plot (bar axis).
|
||||
#' @param percent Logical value to determine whether to show labels as percentage signs. Defaults to FALSE.
|
||||
#' @param bar_colour String to specify colour to use for bars.
|
||||
#' In-built accepted values include "default" (default), "alert" (red), and
|
||||
#' "darkblue". Otherwise, hex codes are also accepted. You can also supply
|
||||
#' RGB values via `rgb2hex()`.
|
||||
#'
|
||||
#' @import ggplot2
|
||||
#' @import dplyr
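#'
#' @examples
#' ## Illustrative sketch (not from the original source): summarise the bundled
#' ## `sq_data` sample dataset first, then plot the aggregated values as-is.
#' library(dplyr)
#' sq_data %>%
#'   group_by(Organization) %>%
#'   summarise(Collaboration_hours = mean(Collaboration_hours)) %>%
#'   create_bar_asis(group_var = "Organization",
#'                   bar_var = "Collaboration_hours",
#'                   title = "Average Collaboration Hours by Organization")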
|
||||
#'
|
||||
#' @export
|
||||
create_bar_asis <- function(data,
|
||||
group_var,
|
||||
bar_var,
|
||||
title = NULL,
|
||||
subtitle = NULL,
|
||||
caption = NULL,
|
||||
ylab = group_var,
|
||||
xlab = bar_var,
|
||||
percent = FALSE,
|
||||
bar_colour = "default"){
|
||||
|
||||
## Colour bar override
|
||||
if(bar_colour == "default"){
|
||||
|
||||
bar_colour <- rgb2hex(0, 120, 212)
|
||||
|
||||
} else if(bar_colour == "alert"){
|
||||
|
||||
bar_colour <- rgb2hex(215, 59, 1)
|
||||
|
||||
} else if(bar_colour == "darkblue"){
|
||||
|
||||
bar_colour <- rgb2hex(0, 32, 80)
|
||||
|
||||
}
|
||||
|
||||
up_break <- max(data[[bar_var]], na.rm = TRUE) * 1.3
|
||||
|
||||
if(percent == FALSE){
|
||||
returnPlot <-
|
||||
data %>%
|
||||
ggplot(aes(x = reorder(!!sym(group_var), !!sym(bar_var)), y = !!sym(bar_var))) +
|
||||
geom_col(fill = bar_colour) +
|
||||
geom_text(aes(label = round(!!sym(bar_var), 2)),
|
||||
hjust = -0.25,
|
||||
color = "#000000",
|
||||
fontface = "bold",
|
||||
size = 4)
|
||||
} else if(percent == TRUE){
|
||||
returnPlot <-
|
||||
data %>%
|
||||
ggplot(aes(x = reorder(!!sym(group_var), !!sym(bar_var)), y = !!sym(bar_var))) +
|
||||
geom_col(fill = bar_colour) +
|
||||
geom_text(aes(label = scales::percent(!!sym(bar_var), accuracy = 1)),
|
||||
hjust = -0.25,
|
||||
color = "#000000",
|
||||
fontface = "bold",
|
||||
size = 4)
|
||||
|
||||
}
|
||||
|
||||
returnPlot +
|
||||
scale_y_continuous(limits = c(0, up_break)) +
|
||||
coord_flip() +
|
||||
labs(title = title,
|
||||
subtitle = subtitle,
|
||||
caption = caption,
|
||||
y = xlab,
|
||||
x = ylab) +
|
||||
theme_wpa_basic()
|
||||
}
|
|
@ -0,0 +1,148 @@
|
|||
#' @title Box Plot (General Purpose)
|
||||
#'
|
||||
#' @description
|
||||
#' Analyzes a selected metric and returns a box plot by default.
|
||||
#' Additional options available to return a table with distribution elements.
|
||||
#'
|
||||
#' @details
|
||||
#' This is a general purpose function that powers all the functions
|
||||
#' in the package that produce box plots.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param metric Character string containing the name of the metric,
|
||||
#' e.g. "Collaboration_hours"
|
||||
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size, defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot", "table", and "data" (person-level plotting data).
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom stats median
|
||||
#' @importFrom stats sd
|
||||
#'
|
||||
#' @family General
|
||||
#'
|
||||
#' @examples
|
||||
#' ## Create a box plot for Work Week Span by Level Designation
|
||||
#' create_boxplot(sq_data, metric = "Workweek_span", hrvar = "LevelDesignation", return = "plot")
|
||||
#'
|
||||
#' ## Create a summary statistics table for Work Week Span by Organization
|
||||
#' create_boxplot(sq_data, metric = "Workweek_span", hrvar = "Organization", return = "table")
|
||||
#'
|
||||
#' ## Create a box plot for Collaboration Hours by Level Designation
|
||||
#' create_boxplot(sq_data, metric = "Collaboration_hours", hrvar = "LevelDesignation", return = "plot")
|
||||
#' @export
|
||||
|
||||
create_boxplot <- function(data,
|
||||
metric,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot") {
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c("Date",
|
||||
metric,
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
## Clean metric name
|
||||
clean_nm <- gsub(pattern = "_", replacement = " ", x = metric)
|
||||
|
||||
plot_data <-
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
|
||||
group_by(PersonId, group) %>%
|
||||
summarise(!!sym(metric) := mean(!!sym(metric))) %>%
|
||||
ungroup() %>%
|
||||
left_join(data %>%
|
||||
rename(group = !!sym(hrvar)) %>%
|
||||
group_by(group) %>%
|
||||
summarise(Employee_Count = n_distinct(PersonId)),
|
||||
by = "group") %>%
|
||||
filter(Employee_Count >= mingroup)
|
||||
|
||||
## Get max value
|
||||
max_point <- max(plot_data[[metric]]) * 1.2
|
||||
|
||||
plot_legend <-
|
||||
plot_data %>%
|
||||
group_by(group) %>%
|
||||
summarize(Employee_Count = first(Employee_Count)) %>%
|
||||
mutate(Employee_Count = paste("n=",Employee_Count))
|
||||
|
||||
## summary table
|
||||
summary_table <-
|
||||
plot_data %>%
|
||||
select(group, tidyselect::all_of(metric)) %>%
|
||||
group_by(group) %>%
|
||||
summarise(mean = mean(!!sym(metric)),
|
||||
median = median(!!sym(metric)),
|
||||
sd = sd(!!sym(metric)),
|
||||
min = min(!!sym(metric)),
|
||||
max = max(!!sym(metric)),
|
||||
range = max - min,
|
||||
n = n())
|
||||
|
||||
## group order
|
||||
group_ord <-
|
||||
summary_table %>%
|
||||
arrange(desc(mean)) %>%
|
||||
pull(group)
|
||||
|
||||
plot_object <-
|
||||
plot_data %>%
|
||||
mutate(group = factor(group, levels = group_ord)) %>%
|
||||
ggplot(aes(x = group, y = !!sym(metric))) +
|
||||
geom_boxplot(color = "#578DB8") +
|
||||
ylim(0, max_point) +
|
||||
annotate("text", x = plot_legend$group, y = 0, label = plot_legend$Employee_Count) +
|
||||
scale_x_discrete(labels = scales::wrap_format(10)) +
|
||||
theme_classic() +
|
||||
theme(axis.text=element_text(size=12),
|
||||
axis.text.x = element_text(angle = 30, hjust = 1),
|
||||
plot.title = element_text(color="grey40", face="bold", size=18),
|
||||
plot.subtitle = element_text(size=14),
|
||||
legend.position = "top",
|
||||
legend.justification = "right",
|
||||
legend.title=element_text(size=14),
|
||||
legend.text=element_text(size=14)) +
|
||||
labs(title = clean_nm,
|
||||
subtitle = paste("Distribution of",
|
||||
tolower(clean_nm),
|
||||
"by",
|
||||
camel_clean(hrvar))) +
|
||||
xlab(hrvar) +
|
||||
ylab(paste("Average", clean_nm)) +
|
||||
labs(caption = extract_date_range(data, return = "text"))
|
||||
|
||||
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
summary_table %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else if(return == "data"){
|
||||
|
||||
plot_data %>%
|
||||
mutate(group = factor(group, levels = group_ord)) %>%
|
||||
arrange(desc(group))
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,152 @@
|
|||
#' @title Horizontal 100 percent stacked bar plot (General Purpose)
|
||||
#'
|
||||
#' @description
|
||||
#' Provides an analysis of the distribution of a selected metric.
|
||||
#' Returns a stacked bar plot by default.
|
||||
#' Additional options available to return a table with distribution elements.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param metric Character string containing the name of the metric,
|
||||
#' e.g. "Collaboration_hours"
|
||||
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size, defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#' @param cut A numeric vector of length three to specify the breaks for the distribution,
|
||||
#' e.g. c(10, 15, 20)
|
||||
#' @param dist_colours A character vector of length four to specify colour
|
||||
#' codes for the stacked bars.
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom tidyr spread
|
||||
#' @importFrom stats median
|
||||
#' @importFrom stats sd
|
||||
#'
|
||||
#' @family Collaboration
|
||||
#'
|
||||
#' @examples
|
||||
#' ## Return a plot
|
||||
#' create_dist(sq_data, metric = "Collaboration_hours", hrvar = "Organization")
|
||||
#'
|
||||
#' ## Return a table
|
||||
#' create_dist(sq_data, metric = "Collaboration_hours", hrvar = "Organization", return = "table")
|
||||
#' @export
|
||||
|
||||
create_dist <- function(data,
|
||||
metric,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot",
|
||||
cut = c(15, 20, 25),
|
||||
dist_colours = c("#FE7F4F",
|
||||
"#ffdfd3",
|
||||
"#bed6f2",
|
||||
"#e9f1fb")) {
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c("Date",
|
||||
metric,
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
## Clean metric name
|
||||
clean_nm <- gsub(pattern = "_", replacement = " ", x = metric)
|
||||
|
||||
## Basic Data for bar plot
|
||||
plot_data <-
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar)) %>%
|
||||
group_by(PersonId, group) %>%
|
||||
summarise(!!sym(metric) := mean(!!sym(metric))) %>%
|
||||
ungroup() %>%
|
||||
left_join(data %>%
|
||||
rename(group = !!sym(hrvar)) %>%
|
||||
group_by(group) %>%
|
||||
summarise(Employee_Count = n_distinct(PersonId)),
|
||||
by = "group") %>%
|
||||
filter(Employee_Count >= mingroup)
|
||||
|
||||
## Create buckets of collaboration hours
|
||||
plot_data <-
|
||||
plot_data %>%
|
||||
mutate(bucket_hours = cut_hour(!!sym(metric),
|
||||
cuts = cut))
|
||||
|
||||
## Employee count / base size table
|
||||
plot_legend <-
|
||||
plot_data %>%
|
||||
group_by(group) %>%
|
||||
summarize(Employee_Count=first(Employee_Count)) %>%
|
||||
mutate(Employee_Count = paste("n=",Employee_Count))
|
||||
|
||||
## Data for bar plot
|
||||
plot_table <-
|
||||
plot_data %>%
|
||||
group_by(group, bucket_hours) %>%
|
||||
summarize(Employees=n(),
|
||||
Employee_Count=first(Employee_Count),
|
||||
percent = Employees / Employee_Count) %>%
|
||||
arrange(group, desc(bucket_hours))
|
||||
|
||||
## Table for annotation
|
||||
annot_table <-
|
||||
plot_legend %>%
|
||||
dplyr::left_join(plot_table, by = "group")
|
||||
|
||||
## Bar plot
|
||||
|
||||
plot_object <-
|
||||
plot_table %>%
|
||||
ggplot(aes(x = group, y=Employees, fill = bucket_hours)) +
|
||||
geom_bar(stat = "identity", position = position_fill(reverse = TRUE)) +
|
||||
scale_y_continuous(labels = function(x) paste0(x*100, "%")) +
|
||||
coord_flip() +
|
||||
annotate("text", x = plot_legend$group, y = -.05, label = plot_legend$Employee_Count ) +
|
||||
scale_fill_manual(name="",
|
||||
values = rev(dist_colours)) +
|
||||
theme_wpa_basic() +
|
||||
labs(title = clean_nm,
|
||||
subtitle = paste("Distribution of", clean_nm, "by", camel_clean(hrvar))) +
|
||||
xlab(camel_clean(hrvar)) +
|
||||
ylab("Fraction of employees") +
|
||||
labs(caption = extract_date_range(data, return = "text"))
|
||||
|
||||
## Table to return
|
||||
return_table <-
|
||||
plot_table %>%
|
||||
select(group, bucket_hours, percent) %>%
|
||||
spread(bucket_hours, percent) %>%
|
||||
left_join(data %>%
|
||||
rename(group = !!sym(hrvar)) %>%
|
||||
group_by(group) %>%
|
||||
summarise(Employee_Count = n_distinct(PersonId)),
|
||||
by = "group")
|
||||
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
return_table %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
|
|
|
@ -0,0 +1,47 @@
|
|||
#' @title Create interactive tables in HTML with 'download' buttons.
|
||||
#'
|
||||
#' @description
|
||||
#' See https://martinctc.github.io/blog/vignette-downloadable-tables-in-rmarkdown-with-the-dt-package/ for more.
|
||||
#'
|
||||
#' @param x Data frame to be passed through.
|
||||
#' @param rounding Numeric value specifying the number of decimal places to display. Defaults to 1.
|
||||
#' @param freeze Number of columns from the left to 'freeze'. Defaults to 2,
|
||||
#' which includes the row number column.
|
||||
#'
|
||||
#' @import DT
|
||||
#' @importFrom dplyr mutate_if
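#'
#' @examples
#' ## Illustrative call (not from the original source): render a summary table
#' ## from the bundled `sq_data` sample dataset as an interactive DataTable.
#' create_dt(collaboration_sum(sq_data, return = "table"))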
|
||||
#'
|
||||
#' @export
|
||||
create_dt <- function(x, rounding = 1, freeze = 2){
|
||||
|
||||
# Round all numeric to "rounding" number of dp
|
||||
num_cols <- dplyr::select_if(x, is.numeric) %>% names()
|
||||
|
||||
if(length(num_cols) == 0){ # No numeric columns
|
||||
|
||||
DT::datatable(x,
|
||||
extensions = c('Buttons',
|
||||
'FixedColumns'),
|
||||
options = list(dom = 'Blfrtip',
|
||||
fixedColumns = list(leftColumns = freeze),
|
||||
scrollX = TRUE,
|
||||
buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
|
||||
lengthMenu = list(c(10,25,50,-1),
|
||||
c(10,25,50,"All"))))
|
||||
|
||||
} else {
|
||||
|
||||
DT::datatable(x,
|
||||
extensions = c('Buttons',
|
||||
'FixedColumns'),
|
||||
options = list(dom = 'Blfrtip',
|
||||
fixedColumns = list(leftColumns = freeze),
|
||||
scrollX = TRUE,
|
||||
buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
|
||||
lengthMenu = list(c(10,25,50,-1),
|
||||
c(10,25,50,"All")))) %>%
|
||||
DT::formatRound(columns = num_cols, rounding)
|
||||
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,136 @@
|
|||
#' @title Fizzy Drink / Jitter Scatter Plot (General Purpose)
|
||||
#'
|
||||
#' @description
|
||||
#' Analyzes a selected metric and returns a 'fizzy' scatter plot by default.
|
||||
#' Additional options available to return a table with distribution elements.
|
||||
#'
|
||||
#' @details
|
||||
#' This is a general purpose function that powers all the functions
|
||||
#' in the package that produce 'fizzy drink' / jitter scatter plots.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param metric Character string containing the name of the metric,
|
||||
#' e.g. "Collaboration_hours"
|
||||
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size, defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom stats median
|
||||
#' @importFrom stats sd
|
||||
#'
|
||||
#' @family General
|
||||
#'
|
||||
#' @examples
|
||||
#' ## Create a fizzy plot for Work Week Span by Level Designation
|
||||
#' create_fizz(sq_data, metric = "Workweek_span", hrvar = "LevelDesignation", return = "plot")
|
||||
#'
|
||||
#' ## Create a summary statistics table for Work Week Span by Organization
|
||||
#' create_fizz(sq_data, metric = "Workweek_span", hrvar = "Organization", return = "table")
|
||||
#'
|
||||
#' ## Create a fizzy plot for Collaboration Hours by Level Designation
|
||||
#' create_fizz(sq_data, metric = "Collaboration_hours", hrvar = "LevelDesignation", return = "plot")
|
||||
#' @export
|
||||
|
||||
create_fizz <- function(data,
|
||||
metric,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot") {
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c("Date",
|
||||
metric,
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
## Clean metric name
|
||||
clean_nm <- gsub(pattern = "_", replacement = " ", x = metric)
|
||||
|
||||
## Plot data
|
||||
plot_data <-
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
|
||||
group_by(PersonId, group) %>%
|
||||
summarise(!!sym(metric) := mean(!!sym(metric))) %>%
|
||||
ungroup() %>%
|
||||
left_join(data %>%
|
||||
rename(group = !!sym(hrvar)) %>%
|
||||
group_by(group) %>%
|
||||
summarise(Employee_Count = n_distinct(PersonId)),
|
||||
by = "group") %>%
|
||||
filter(Employee_Count >= mingroup)
|
||||
|
||||
## Get max value
|
||||
max_point <- max(plot_data[[metric]]) * 1.2
|
||||
|
||||
plot_legend <-
|
||||
plot_data %>%
|
||||
group_by(group) %>%
|
||||
summarize(Employee_Count = first(Employee_Count)) %>%
|
||||
mutate(Employee_Count = paste("n=",Employee_Count))
|
||||
|
||||
plot_object <-
|
||||
plot_data %>%
|
||||
ggplot(aes(x = group, y = !!sym(metric))) +
|
||||
geom_point(size = 1,
|
||||
alpha = 1/5,
|
||||
color = "#578DB8",
|
||||
position = position_jitter(width = 0.1, height = 0.1)) +
|
||||
ylim(0, max_point) +
|
||||
annotate("text", x = plot_legend$group, y = 0, label = plot_legend$Employee_Count) +
|
||||
scale_x_discrete(labels = scales::wrap_format(10)) +
|
||||
theme_classic() +
|
||||
theme(axis.text=element_text(size=12),
|
||||
axis.text.x = element_text(angle = 30, hjust = 1),
|
||||
plot.title = element_text(color="grey40", face="bold", size=18),
|
||||
plot.subtitle = element_text(size=14),
|
||||
legend.position = "top",
|
||||
legend.justification = "right",
|
||||
legend.title=element_text(size=14),
|
||||
legend.text=element_text(size=14)) +
|
||||
labs(title = clean_nm,
|
||||
subtitle = paste("Distribution of",
|
||||
tolower(clean_nm),
|
||||
"by",
|
||||
camel_clean(hrvar))) +
|
||||
xlab(hrvar) +
|
||||
ylab(paste("Average", clean_nm)) +
|
||||
labs(caption = extract_date_range(data, return = "text"))
|
||||
|
||||
summary_table <-
|
||||
plot_data %>%
|
||||
select(group, tidyselect::all_of(metric)) %>%
|
||||
group_by(group) %>%
|
||||
summarise(mean = mean(!!sym(metric)),
|
||||
median = median(!!sym(metric)),
|
||||
sd = sd(!!sym(metric)),
|
||||
min = min(!!sym(metric)),
|
||||
max = max(!!sym(metric)),
|
||||
range = max - min,
|
||||
n = n())
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
summary_table %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,140 @@
|
|||
#' @title Time Trend - Line Chart (General Purpose)
|
||||
#'
|
||||
#' @description
|
||||
#' Provides a week by week view of a selected metric, visualised as line charts.
|
||||
#' By default returns a line chart for the defined metric,
|
||||
#' with a separate panel per value in the HR attribute.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @details
|
||||
#' This is a general purpose function that powers all the functions
|
||||
#' in the package that produce faceted line plots.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param metric Character string containing the name of the metric,
|
||||
#' e.g. "Collaboration_hours"
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom tidyselect all_of
|
||||
#'
|
||||
#' @family General
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' sq_data %>% create_line(metric = "Email_hours", return = "plot")
|
||||
#'
|
||||
#' sq_data %>% create_line(metric = "Collaboration_hours", return = "plot")
|
||||
#'
|
||||
#' sq_data %>% create_line(metric = "Workweek_span", hrvar = "LevelDesignation")
|
||||
#' }
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @export
|
||||
create_line <- function(data,
|
||||
metric,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c("Date",
|
||||
metric,
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
|
||||
## Clean metric name
|
||||
clean_nm <- gsub(pattern = "_", replacement = " ", x = metric)
|
||||
|
||||
myTable <-
|
||||
data %>%
|
||||
mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
|
||||
select(PersonId, Date, group, all_of(metric)) %>%
|
||||
group_by(group) %>%
|
||||
mutate(Employee_Count = n_distinct(PersonId)) %>%
|
||||
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
|
||||
|
||||
myTable <-
|
||||
myTable %>%
|
||||
group_by(Date, group) %>%
|
||||
summarize(Employee_Count = mean(Employee_Count),
|
||||
!!sym(metric) := mean(!!sym(metric)))
|
||||
|
||||
## Data frame to return
|
||||
myTable_return <-
|
||||
myTable %>%
|
||||
select(Date, group, all_of(metric)) %>%
|
||||
spread(Date, !!sym(metric))
|
||||
|
||||
## Data frame for creating plot
|
||||
myTable_plot <-
|
||||
myTable %>%
|
||||
select(Date, group, all_of(metric)) %>%
|
||||
group_by(Date, group) %>%
|
||||
summarise_at(vars(all_of(metric)), ~mean(., na.rm = TRUE)) %>%
|
||||
ungroup()
|
||||
|
||||
|
||||
plot_object <-
|
||||
myTable_plot %>%
|
||||
ggplot(aes(x = Date, y = !!sym(metric))) +
|
||||
geom_line(colour = "grey40") +
|
||||
facet_wrap(.~group) +
|
||||
scale_fill_gradient(name="Hours", low = "white", high = "red") +
|
||||
theme_classic() +
|
||||
theme(plot.title = element_text(color = "grey40",
|
||||
face = "bold",
|
||||
size = 18),
|
||||
plot.subtitle = element_text(size = 14),
|
||||
strip.background = element_rect(color = "grey40",
|
||||
fill = "grey40"),
|
||||
strip.text = element_text(size = 10,
|
||||
colour = "#FFFFFF",
|
||||
face = "bold"),
|
||||
axis.text = element_text(size = 8, face = "bold"),
|
||||
axis.line = element_line(colour = "grey40"),
|
||||
legend.position = "right",
|
||||
legend.justification = "right",
|
||||
legend.title=element_text(size = 10),
|
||||
legend.text=element_text(size = 10)) +
|
||||
labs(title = clean_nm,
|
||||
subtitle = paste("Total",
|
||||
tolower(clean_nm),
|
||||
"by",
|
||||
camel_clean(hrvar))) +
|
||||
xlab("Date") +
|
||||
ylab("Weekly hours") +
|
||||
labs(caption = extract_date_range(data, return = "text"))
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
myTable_return %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,185 @@
|
|||
#' @title Period comparison scatter plot (General Purpose)
|
||||
#'
|
||||
#' @description
|
||||
#' Returns two side-by-side scatter plots representing two selected metrics,
|
||||
#' using colour to map an HR attribute and size to represent number of employees.
|
||||
#' Returns a faceted scatter plot by default, with additional options
|
||||
#' to return a summary table.
|
||||
#'
|
||||
#' @details
|
||||
#' This is a general purpose function that powers all the functions
|
||||
#' in the package that produce faceted scatter plots.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector,
|
||||
#' defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param metric_x Character string containing the name of the metric,
|
||||
#' e.g. "Collaboration_hours"
|
||||
#' @param metric_y Character string containing the name of the metric,
|
||||
#' e.g. "Collaboration_hours"
|
||||
#' @param before_start Start date of "before" time period in YYYY-MM-DD
|
||||
#' @param before_end End date of "before" time period in YYYY-MM-DD
|
||||
#' @param after_start Start date of "after" time period in YYYY-MM-DD
|
||||
#' @param after_end End date of "after" time period in YYYY-MM-DD
|
||||
#' @param before_label String to specify a label for the "before" period. Defaults to "Period 1".
|
||||
#' @param after_label String to specify a label for the "after" period. Defaults to "Period 2".
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
|
||||
#' Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#'
|
||||
#' @family General
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#'
|
||||
#' create_period_scatter(sq_data,
#'                       hrvar = "LevelDesignation",
#'                       before_start = "2019-11-03",
#'                       before_end = "2019-12-31",
#'                       after_start = "2020-01-01",
#'                       after_end = "2020-01-26")
|
||||
#'
|
||||
#'
|
||||
#' create_period_scatter(sq_data, before_end = "2019-12-31", return = "table")
|
||||
#' }
|
||||
#'
|
||||
#' @export
|
||||
create_period_scatter <- function(data,
|
||||
hrvar = "Organization",
|
||||
metric_x = "Multitasking_meeting_hours",
|
||||
metric_y = "Meeting_hours",
|
||||
before_start = min(as.Date(data$Date, "%m/%d/%Y")),
|
||||
before_end,
|
||||
after_start = as.Date(before_end) + 1,
|
||||
after_end = max(as.Date(data$Date, "%m/%d/%Y")),
|
||||
before_label = "Period 1",
|
||||
after_label = "Period 2",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
## Check inputs
|
||||
## Update these column names as per appropriate
|
||||
required_variables <- c("Date",
|
||||
hrvar,
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
|
||||
daterange_1_start <- as.Date(before_start)
|
||||
daterange_1_end <- as.Date(before_end)
|
||||
daterange_2_start <- as.Date(after_start)
|
||||
daterange_2_end <- as.Date(after_end)
|
||||
|
||||
# Fix dates format for WpA Queries
|
||||
WpA_dataset <- data %>% mutate(Date = as.Date(Date, "%m/%d/%Y"))
|
||||
|
||||
# Check for dates in data file
|
||||
if (daterange_1_start < min(WpA_dataset$Date) |
|
||||
daterange_1_start > max(WpA_dataset$Date) |
|
||||
daterange_1_end < min(WpA_dataset$Date) |
|
||||
daterange_1_end > max(WpA_dataset$Date) |
|
||||
daterange_2_start < min(WpA_dataset$Date) |
|
||||
daterange_2_start > max(WpA_dataset$Date) |
|
||||
daterange_2_end < min(WpA_dataset$Date) |
|
||||
daterange_2_end > max(WpA_dataset$Date)) {
|
||||
stop('Dates not found in dataset')
|
||||
}
|
||||
|
||||
## Employee count
|
||||
emp_count <-
|
||||
WpA_dataset %>%
|
||||
group_by(!!sym(hrvar)) %>%
|
||||
summarise(n = n_distinct(PersonId))
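
  ## Group-level averages of the two metrics for the "before" period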
|
||||
|
||||
data_p1 <-
|
||||
WpA_dataset %>%
|
||||
rename(group = !!sym(hrvar)) %>%
|
||||
filter(between(Date, daterange_1_start, daterange_1_end)) %>%
|
||||
group_by(PersonId, group) %>%
|
||||
summarise_at(vars(!!sym(metric_x), !!sym(metric_y)), ~mean(.)) %>%
|
||||
ungroup() %>%
|
||||
group_by(group) %>%
|
||||
summarise_at(vars(!!sym(metric_x), !!sym(metric_y)), ~mean(., na.rm = TRUE)) %>%
|
||||
mutate(Period = before_label) %>%
|
||||
left_join(emp_count, by = c(group = hrvar)) %>%
|
||||
filter(n >= mingroup)
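
  ## Group-level averages of the two metrics for the "after" period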
|
||||
|
||||
data_p2 <-
|
||||
WpA_dataset %>%
|
||||
rename(group = !!sym(hrvar)) %>%
|
||||
filter(between(Date, daterange_2_start, daterange_2_end)) %>%
|
||||
group_by(PersonId, group) %>%
|
||||
summarise_at(vars(!!sym(metric_x), !!sym(metric_y)), ~mean(.)) %>%
|
||||
ungroup() %>%
|
||||
group_by(group) %>%
|
||||
summarise_at(vars(!!sym(metric_x), !!sym(metric_y)), ~mean(., na.rm = TRUE)) %>%
|
||||
mutate(Period = after_label) %>%
|
||||
left_join(emp_count, by = c(group = hrvar)) %>%
|
||||
filter(n >= mingroup)
|
||||
|
||||
## bind data
|
||||
data_both <- rbind(data_p1, data_p2)
|
||||
|
||||
date_range_str <-
|
||||
paste("Data from",
|
||||
daterange_1_start,
|
||||
"to",
|
||||
daterange_1_end,
|
||||
"and",
|
||||
daterange_2_start,
|
||||
"to",
|
||||
daterange_2_end)
|
||||
|
||||
clean_x <- gsub(pattern = "_", replacement = " ", x = metric_x)
|
||||
clean_y <- gsub(pattern = "_", replacement = " ", x = metric_y)
|
||||
|
||||
plot_title <-
|
||||
paste(clean_x, "and", clean_y,
|
||||
"over", before_label,
|
||||
"and", after_label)
|
||||
|
||||
|
||||
plot_object <-
|
||||
data_both %>%
|
||||
ggplot(aes(x = !!sym(metric_x),
|
||||
y = !!sym(metric_y),
|
||||
colour = group,
|
||||
size = n)) +
|
||||
geom_point(alpha = 0.5) +
|
||||
scale_size(range = c(1, 20)) +
|
||||
facet_wrap(.~Period) +
|
||||
guides(size = FALSE) +
|
||||
theme(legend.position = "bottom") +
|
||||
ggtitle(plot_title,
|
||||
subtitle = paste("Weekly data by", hrvar)) +
|
||||
ylab(clean_y) +
|
||||
xlab(clean_x) +
|
||||
labs(caption = date_range_str) +
|
||||
wpa::theme_wpa_basic()
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
# return(myTable_return)
|
||||
return(data_both)
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -0,0 +1,70 @@
|
|||
#' @title Create Ranking
|
||||
#'
|
||||
#' @description
|
||||
#' This function scans a standard query output for groups with high levels of a given Workplace Analytics Metric.
|
||||
#' Returns a table with all groups (across multiple HR attributes) ranked by the specified metric.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param metric Character string containing the name of the metric,
|
||||
#' e.g. "Collaboration_hours"
|
||||
#' @param hrvar A list of HR Variables to consider in the scan.
|
||||
#' Defaults to all HR attributes identified.
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
|
||||
#' Defaults to 5.
|
||||
#' @param return A character vector specifying what to return.
|
||||
#' Valid values include "table" (default) and "df" (data frame)
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom stats reorder
|
||||
#'
|
||||
#' @family General
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a summary table as a data frame when "table" (default) is passed
#' to `return`, with one row per group across all HR attributes scanned.
|
||||
#'
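#' @examples
#' ## Illustrative usage (a sketch, assuming the bundled `sq_data` sample dataset):
#' \dontrun{
#' ## Rank all identified HR attributes by average weekly emails sent
#' create_rank(sq_data, metric = "Emails_sent")
#' }
#'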
|
||||
#' @export
|
||||
|
||||
create_rank <- function(data,
|
||||
metric,
|
||||
hrvar = extract_hr(data),
|
||||
mingroup = 5,
|
||||
return = "table"){
|
||||
|
||||
results <-
|
||||
data %>% create_bar(
|
||||
metric = metric,
|
||||
hrvar = hrvar[1],
|
||||
mingroup = mingroup,
|
||||
return = return, bar_colour = "default")
|
||||
|
||||
results$hrvar <- ""
|
||||
|
||||
results <- results[0,]
|
||||
|
||||
for (p in hrvar) {
|
||||
table1 <-
|
||||
data %>%
|
||||
create_bar(metric = metric,
|
||||
hrvar = p,
|
||||
mingroup = mingroup,
|
||||
return = "table", bar_colour = "default")
|
||||
|
||||
table1$hrvar <- p
|
||||
|
||||
results <- rbind(results,table1)
|
||||
}
|
||||
|
||||
output <-
|
||||
results %>% arrange(desc(get(metric))) %>%
|
||||
select(hrvar, everything())
|
||||
|
||||
if(return == "table"){
|
||||
return(output)
|
||||
} else {
|
||||
stop("Invalid `return` argument.")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,118 @@
|
|||
#' @title Scatter plot (General Purpose)
|
||||
#'
|
||||
#' @description
|
||||
#' Returns a scatter plot of two selected metrics, using colour to map
|
||||
#' an HR attribute.
|
||||
#' Returns a scatter plot by default, with additional options
|
||||
#' to return a summary table.
|
||||
#'
|
||||
#' @details
|
||||
#' This is a general purpose function that powers all the functions
|
||||
#' in the package that produce scatter plots.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param metric_x Character string containing the name of the metric,
|
||||
#' e.g. "Collaboration_hours"
|
||||
#' @param metric_y Character string containing the name of the metric,
|
||||
#' e.g. "Collaboration_hours"
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import scales
|
||||
#'
|
||||
#' @family General
|
||||
#'
|
||||
#' @examples
|
||||
#' create_scatter(sq_data,
|
||||
#' "Internal_network_size",
|
||||
#' "External_network_size",
|
||||
#' "Organization")
|
||||
#'
|
||||
#' create_scatter(sq_data,
|
||||
#' "Generated_workload_call_hours",
|
||||
#' "Generated_workload_email_hours",
|
||||
#' "Organization", mingroup = 100, return = "plot")
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @export
|
||||
create_scatter <- function(data,
|
||||
metric_x = "Internal_network_size",
|
||||
metric_y = "External_network_size",
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c(hrvar,
|
||||
metric_x,
|
||||
metric_y,
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
## Extract values violating privacy threshold
|
||||
violate_thres_chr <-
|
||||
data %>%
|
||||
group_by(!!sym(hrvar)) %>%
|
||||
summarise(n = n_distinct(PersonId)) %>%
|
||||
filter(n < mingroup) %>%
|
||||
pull(!!sym(hrvar))
|
||||
|
||||
|
||||
clean_x <- gsub(pattern = "_", replacement = " ", x = metric_x)
|
||||
clean_y <- gsub(pattern = "_", replacement = " ", x = metric_y)
|
||||
|
||||
myTable <-
|
||||
data %>%
|
||||
filter(!(!!sym(hrvar) %in% violate_thres_chr)) %>%
|
||||
group_by(PersonId, !!sym(hrvar)) %>%
|
||||
summarise_at(vars(!!sym(metric_x),
|
||||
!!sym(metric_y)),
|
||||
~mean(.)) %>%
|
||||
ungroup()
|
||||
|
||||
plot_object <-
|
||||
myTable %>%
|
||||
ggplot(aes(x = !!sym(metric_x),
|
||||
y = !!sym(metric_y),
|
||||
colour = !!sym(hrvar))) +
|
||||
geom_point(alpha = 0.5) +
|
||||
labs(title = paste0(clean_x, " and\n", clean_y, " by ", camel_clean(hrvar)),
|
||||
caption = extract_date_range(data, return = "text")) +
|
||||
xlab(clean_x) +
|
||||
ylab(clean_y) +
|
||||
theme_wpa_basic()
|
||||
|
||||
myTable_return <-
|
||||
myTable %>%
|
||||
group_by(!!sym(hrvar)) %>%
|
||||
summarise_at(vars(!!sym(metric_x),
|
||||
!!sym(metric_y)),
|
||||
~mean(.))
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
return(myTable_return)
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,174 @@
|
|||
#' @title Horizontal stacked bar plot (General Purpose)
|
||||
#'
|
||||
#' @description
|
||||
#' Creates a sum total calculation using selected metrics,
|
||||
#' where the typical use case is to create different definitions
|
||||
#' of collaboration hours.
|
||||
#' Returns a stacked bar plot by default.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param metrics A character vector to specify variables to be used
|
||||
#' in calculating the "Total" value, e.g. c("Meeting_hours", "Email_hours").
|
||||
#' The order of the variable names supplied determines the order in which they
|
||||
#' appear on the stacked plot.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#' @param stack_colours
|
||||
#' A character vector to specify the colour codes for the stacked bar charts.
|
||||
#' @param plot_title An option to override plot title.
|
||||
#' @param plot_subtitle An option to override plot subtitle.
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom stats reorder
|
||||
#'
|
||||
#' @family General
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' sq_data %>%
|
||||
#' create_stacked(hrvar = "LevelDesignation",
|
||||
#' metrics = c("Meeting_hours", "Email_hours"),
|
||||
#' return = "plot")
|
||||
#'
|
||||
#' sq_data %>%
|
||||
#' create_stacked(hrvar = "FunctionType",
|
||||
#' metrics = c("Meeting_hours",
|
||||
#' "Email_hours",
|
||||
#' "Call_hours",
|
||||
#' "Instant_Message_hours"),
|
||||
#' return = "plot")
|
||||
#'
|
||||
#' sq_data %>%
|
||||
#' create_stacked(hrvar = "FunctionType",
|
||||
#' metrics = c("Meeting_hours",
|
||||
#' "Email_hours",
|
||||
#' "Call_hours",
|
||||
#' "Instant_Message_hours"),
|
||||
#' return = "table")
|
||||
#'}
|
||||
#' @export
|
||||
create_stacked <- function(data,
|
||||
hrvar = "Organization",
|
||||
metrics = c("Meeting_hours",
|
||||
"Email_hours"),
|
||||
mingroup = 5,
|
||||
return = "plot",
|
||||
stack_colours = c("#203864",
|
||||
"#5191DD",
|
||||
"darkgrey",
|
||||
"lightgrey"),
|
||||
plot_title = "Collaboration Hours",
|
||||
plot_subtitle = "Weekly collaboration hours"){
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c("Date",
|
||||
metrics,
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
n_count <-
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
|
||||
group_by(group) %>%
|
||||
summarise(Employee_Count = n_distinct(PersonId))
|
||||
|
||||
## Person level table
|
||||
myTable <-
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
|
||||
select(PersonId, group, metrics) %>%
|
||||
group_by(PersonId, group) %>%
|
||||
summarise_at(vars(metrics), ~mean(.)) %>%
|
||||
ungroup() %>%
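      # Row-wise total of the selected metrics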
|
||||
mutate(Total = select(., metrics) %>% apply(1, sum)) %>%
|
||||
left_join(n_count, by = "group") %>%
|
||||
# Keep only groups above privacy threshold
|
||||
filter(Employee_Count >= mingroup)
|
||||
|
||||
myTableReturn <-
|
||||
myTable %>%
|
||||
group_by(group) %>%
|
||||
summarise_at(vars(metrics, Total), ~mean(.)) %>%
|
||||
left_join(n_count, by = "group")
|
||||
|
||||
plot_table <-
|
||||
myTable %>%
|
||||
select(PersonId, group, metrics, Total) %>%
|
||||
gather(Metric, Value, -PersonId, -group)
|
||||
|
||||
totalTable <-
|
||||
plot_table %>%
|
||||
filter(Metric == "Total") %>%
|
||||
group_by(group) %>%
|
||||
summarise(Total = mean(Value))
|
||||
|
||||
myTable_legends <-
|
||||
n_count %>%
|
||||
filter(Employee_Count >= mingroup) %>%
|
||||
mutate(Employee_Count = paste("n=",Employee_Count)) %>%
|
||||
left_join(totalTable, by = "group")
|
||||
|
||||
## Get maximum value
|
||||
location <- max(myTable_legends$Total)
|
||||
|
||||
plot_object <-
|
||||
plot_table %>%
|
||||
filter(Metric != "Total") %>%
|
||||
mutate(Metric = factor(Metric, levels = rev(metrics))) %>%
|
||||
group_by(group, Metric) %>%
|
||||
summarise_at(vars(Value), ~mean(.)) %>%
|
||||
ggplot(aes(x = stats::reorder(group, Value, mean), y = Value, fill = Metric)) +
|
||||
geom_bar(position = "stack", stat = "identity") +
|
||||
geom_text(aes(label = round(Value, 1)),
|
||||
position = position_stack(vjust = 0.5),
|
||||
color = "#FFFFFF",
|
||||
fontface = "bold") +
|
||||
scale_y_continuous(limits = c(0, location * 1.25)) +
|
||||
annotate("text",
|
||||
x = myTable_legends$group,
|
||||
y = location * 1.15,
|
||||
label = myTable_legends$Employee_Count) +
|
||||
annotate("rect", xmin = 0.5, xmax = length(myTable_legends$group) + 0.5, ymin = location * 1.05, ymax = location * 1.25, alpha = .2) +
|
||||
scale_fill_manual(name="",
|
||||
values = stack_colours,
|
||||
breaks = metrics,
|
||||
labels = gsub("_", " ", metrics)) +
|
||||
coord_flip() +
|
||||
theme_wpa_basic() +
|
||||
labs(title = plot_title,
|
||||
subtitle = paste(plot_subtitle, "by", camel_clean(hrvar)),
|
||||
caption = extract_date_range(data, return = "text")) +
|
||||
xlab(hrvar) +
|
||||
ylab("Average weekly hours")
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
myTableReturn
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
@ -0,0 +1,43 @@
|
|||
#' @title Convert a numeric variable for hours into categorical
|
||||
#'
|
||||
#' @description
|
||||
#' Supply a numeric variable, e.g. `Collaboration_hours`,
|
||||
#' and the function returns a categorical variable (factor) binned by the supplied cut points.
|
||||
#'
|
||||
#' @param metric A numeric variable representing hours.
|
||||
#' @param cuts A numeric variable of length 3 to represent the
|
||||
#' cut points required.
|
||||
#'
|
||||
#' @family General
|
||||
#'
|
||||
#' @examples
|
||||
#' ## Direct use
|
||||
#' cut_hour(1:30, cuts = c(15, 20, 25))
|
||||
#'
|
||||
#' ## Use on a query
|
||||
#' cut_hour(sq_data$Collaboration_hours, cuts = c(10, 15, 20))
|
||||
#'
|
||||
#' @export
|
||||
|
||||
cut_hour <- function(metric, cuts){
|
||||
|
||||
|
||||
label1 <- paste0("< ", cuts[1], " hours")
|
||||
label2 <- paste0(cuts[1], " - ", cuts[2], " hours")
|
||||
label3 <- paste0(cuts[2], " - ", cuts[3], " hours")
|
||||
label4 <- paste0(cuts[3], "+ hours")
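
  # Bins span 0 up to an upper bound of 100 hours; the four labels correspond
  # to the four intervals defined by `cuts`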
|
||||
|
||||
out <-
|
||||
cut(metric,
|
||||
breaks = c(0, cuts, 100),
|
||||
include.lowest = TRUE,
|
||||
labels = c(label1,
|
||||
label2,
|
||||
label3,
|
||||
label4))
|
||||
|
||||
# out <- as.character(out)
|
||||
return(out)
|
||||
}
|
||||
|
||||
|
|
@ -0,0 +1,81 @@
|
|||
#' @title Sample Standard Query dataset for Data Validation
|
||||
#'
|
||||
#' @description
|
||||
#' A dataset generated from a Standard Query from WpA.
|
||||
#'
|
||||
#' @format A data frame with 897 rows and 69 variables:
|
||||
#' \describe{
|
||||
#' \item{PersonId}{ }
|
||||
#' \item{Date}{ }
|
||||
#' \item{Workweek_span}{ }
|
||||
#' \item{Meetings_with_skip_level}{ }
|
||||
#' \item{Meeting_hours_with_skip_level}{ }
|
||||
#' \item{Generated_workload_email_hours}{ }
|
||||
#' \item{Generated_workload_email_recipients}{ }
|
||||
#' \item{Generated_workload_instant_messages_hours}{ }
|
||||
#' \item{Generated_workload_instant_messages_recipients}{ }
|
||||
#' \item{Generated_workload_call_hours}{ }
|
||||
#' \item{Generated_workload_call_participants}{ }
|
||||
#' \item{Generated_workload_calls_organized}{ }
|
||||
#' \item{External_network_size}{ }
|
||||
#' \item{Internal_network_size}{ }
|
||||
#' \item{Networking_outside_company}{ }
|
||||
#' \item{Networking_outside_organization}{ }
|
||||
#' \item{After_hours_meeting_hours}{ }
|
||||
#' \item{Open_1_hour_block}{ }
|
||||
#' \item{Open_2_hour_blocks}{ }
|
||||
#' \item{Total_focus_hours}{ }
|
||||
#' \item{Low_quality_meeting_hours}{ }
|
||||
#' \item{Total_emails_sent_during_meeting}{ }
|
||||
#' \item{Meetings}{ }
|
||||
#' \item{Meeting_hours}{ }
|
||||
#' \item{Conflicting_meeting_hours}{ }
|
||||
#' \item{Multitasking_meeting_hours}{ }
|
||||
#' \item{Redundant_meeting_hours__lower_level_}{ }
|
||||
#' \item{Redundant_meeting_hours__organizational_}{ }
|
||||
#' \item{Time_in_self_organized_meetings}{ }
|
||||
#' \item{Meeting_hours_during_working_hours}{ }
|
||||
#' \item{Generated_workload_meeting_attendees}{ }
|
||||
#' \item{Generated_workload_meeting_hours}{ }
|
||||
#' \item{Generated_workload_meetings_organized}{ }
|
||||
#' \item{Manager_coaching_hours_1_on_1}{ }
|
||||
#' \item{Meetings_with_manager}{ }
|
||||
#' \item{Meeting_hours_with_manager}{ }
|
||||
#' \item{Meetings_with_manager_1_on_1}{ }
|
||||
#' \item{Meeting_hours_with_manager_1_on_1}{ }
|
||||
#' \item{After_hours_email_hours}{ }
|
||||
#' \item{Emails_sent}{ }
|
||||
#' \item{Email_hours}{ }
|
||||
#' \item{Working_hours_email_hours}{ }
|
||||
#' \item{After_hours_instant_messages}{ }
|
||||
#' \item{Instant_messages_sent}{ }
|
||||
#' \item{Instant_Message_hours}{ }
|
||||
#' \item{Working_hours_instant_messages}{ }
|
||||
#' \item{After_hours_collaboration_hours}{ }
|
||||
#' \item{Collaboration_hours}{ }
|
||||
#' \item{Collaboration_hours_external}{ }
|
||||
#' \item{Working_hours_collaboration_hours}{ }
|
||||
#' \item{After_hours_in_calls}{ }
|
||||
#' \item{Total_calls}{ }
|
||||
#' \item{Call_hours}{ }
|
||||
#' \item{Working_hours_in_calls}{ }
|
||||
#' \item{Domain}{ }
|
||||
#' \item{FunctionType}{ }
|
||||
#' \item{LevelDesignation}{ }
|
||||
#' \item{Layer}{ }
|
||||
#' \item{Region}{ }
|
||||
#' \item{Organization}{ }
|
||||
#' \item{zId}{ }
|
||||
#' \item{attainment}{ }
|
||||
#' \item{TimeZone}{ }
|
||||
#' \item{HourlyRate}{ }
|
||||
#' \item{IsInternal}{ }
|
||||
#' \item{IsActive}{ }
|
||||
#' \item{HireDate}{ }
|
||||
#' \item{WorkingStartTimeSetInOutlook}{ }
|
||||
#' \item{WorkingEndTimeSetInOutlook}{ }
|
||||
#'
|
||||
#' ...
|
||||
#' }
|
||||
#' @source \url{https://workplaceanalytics-demo.office.com/en-us/Home}
|
||||
"dv_data"
|
|
@ -0,0 +1,71 @@
|
|||
#' @title Sample Hourly Collaboration data with Email and IMs
|
||||
#'
|
||||
#' @description
|
||||
#' A sample dataset generated from an Hourly Collaboration query
|
||||
#' from the WpA demo tenant. The data is grouped by week and contains
|
||||
#' columns for both IMs sent and Emails sent.
|
||||
#'
|
||||
#' @format A data frame with 41567 rows and 56 variables:
|
||||
#' \describe{
|
||||
#' \item{PersonId}{ }
|
||||
#' \item{Date}{ }
|
||||
#' \item{IMs_sent_23_24}{ }
|
||||
#' \item{IMs_sent_22_23}{ }
|
||||
#' \item{IMs_sent_21_22}{ }
|
||||
#' \item{IMs_sent_20_21}{ }
|
||||
#' \item{IMs_sent_19_20}{ }
|
||||
#' \item{IMs_sent_18_19}{ }
|
||||
#' \item{IMs_sent_17_18}{ }
|
||||
#' \item{IMs_sent_16_17}{ }
|
||||
#' \item{IMs_sent_15_16}{ }
|
||||
#' \item{IMs_sent_14_15}{ }
|
||||
#' \item{IMs_sent_13_14}{ }
|
||||
#' \item{IMs_sent_12_13}{ }
|
||||
#' \item{IMs_sent_11_12}{ }
|
||||
#' \item{IMs_sent_10_11}{ }
|
||||
#' \item{IMs_sent_09_10}{ }
|
||||
#' \item{IMs_sent_08_09}{ }
|
||||
#' \item{IMs_sent_07_08}{ }
|
||||
#' \item{IMs_sent_06_07}{ }
|
||||
#' \item{IMs_sent_05_06}{ }
|
||||
#' \item{IMs_sent_04_05}{ }
|
||||
#' \item{IMs_sent_03_04}{ }
|
||||
#' \item{IMs_sent_02_03}{ }
|
||||
#' \item{IMs_sent_01_02}{ }
|
||||
#' \item{IMs_sent_00_01}{ }
|
||||
#' \item{Emails_sent_23_24}{ }
|
||||
#' \item{Emails_sent_22_23}{ }
|
||||
#' \item{Emails_sent_21_22}{ }
|
||||
#' \item{Emails_sent_20_21}{ }
|
||||
#' \item{Emails_sent_19_20}{ }
|
||||
#' \item{Emails_sent_18_19}{ }
|
||||
#' \item{Emails_sent_17_18}{ }
|
||||
#' \item{Emails_sent_16_17}{ }
|
||||
#' \item{Emails_sent_15_16}{ }
|
||||
#' \item{Emails_sent_14_15}{ }
|
||||
#' \item{Emails_sent_13_14}{ }
|
||||
#' \item{Emails_sent_12_13}{ }
|
||||
#' \item{Emails_sent_11_12}{ }
|
||||
#' \item{Emails_sent_10_11}{ }
|
||||
#' \item{Emails_sent_09_10}{ }
|
||||
#' \item{Emails_sent_08_09}{ }
|
||||
#' \item{Emails_sent_07_08}{ }
|
||||
#' \item{Emails_sent_06_07}{ }
|
||||
#' \item{Emails_sent_05_06}{ }
|
||||
#' \item{Emails_sent_04_05}{ }
|
||||
#' \item{Emails_sent_03_04}{ }
|
||||
#' \item{Emails_sent_02_03}{ }
|
||||
#' \item{Emails_sent_01_02}{ }
|
||||
#' \item{Emails_sent_00_01}{ }
|
||||
#' \item{Domain}{ }
|
||||
#' \item{LevelDesignation}{ }
|
||||
#' \item{Layer}{ }
|
||||
#' \item{Organization}{ }
|
||||
#' \item{IsInternal}{ }
|
||||
#' \item{IsActive}{ }
|
||||
#'
|
||||
#' ...
|
||||
#' }
|
||||
#' @source \url{https://workplaceanalytics-demo.office.com/en-us/Home}
|
||||
"em_data"
|
||||
|
|
@ -0,0 +1,37 @@
|
|||
#' @title Email Hours distribution
|
||||
#'
|
||||
#' @description
|
||||
#' Analyze Email Hours distribution.
|
||||
#' Returns a stacked bar plot by default.
|
||||
#' Additional options available to return a table with distribution elements.
|
||||
#'
|
||||
#' @inheritParams create_dist
|
||||
#'
|
||||
#' @family Emails
|
||||
#'
|
||||
#' @examples
|
||||
#' ## Return a plot
|
||||
#' email_dist(sq_data, hrvar = "Organization")
|
||||
#'
|
||||
#' ## Return a table
|
||||
#' email_dist(sq_data, hrvar = "Organization", return = "table")
|
||||
#'
|
||||
#' ## Return results with custom-specified cut points
|
||||
#' email_dist(sq_data, hrvar = "LevelDesignation", cut = c(4, 7, 9))
|
||||
#'
|
||||
#' @export
|
||||
|
||||
email_dist <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot",
|
||||
cut = c(5, 10, 15)) {
|
||||
|
||||
create_dist(data = data,
|
||||
metric = "Email_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return,
|
||||
cut = cut)
|
||||
|
||||
}
|
|
@ -0,0 +1,27 @@
|
|||
#' @title Distribution of Email Hours (Fizzy Drink plot)
|
||||
#'
|
||||
#' @description
|
||||
#' Analyze weekly email hours distribution, and returns
|
||||
#' a 'fizzy' scatter plot by default.
|
||||
#' Additional options available to return a table with distribution elements.
|
||||
#'
|
||||
#' @inheritParams create_fizz
|
||||
#'
|
||||
#' @family Emails
|
||||
#'
|
||||
#' @examples
|
||||
#' email_fizz(sq_data, hrvar = "Organization", return = "table")
|
||||
#' @export
|
||||
|
||||
email_fizz <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
create_fizz(data = data,
|
||||
metric = "Email_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return)
|
||||
|
||||
}
|
|
@ -0,0 +1,37 @@
|
|||
#' @title Email Time Trend - Line Chart
|
||||
#'
|
||||
#' @description
|
||||
#' Provides a week by week view of email time, visualised as line charts.
|
||||
#' By default returns a line chart for email hours,
|
||||
#' with a separate panel per value in the HR attribute.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @inheritParams create_line
|
||||
#'
|
||||
#' @family Emails
|
||||
#'
|
||||
#' @examples
|
||||
#'
|
||||
#' ## Return a line plot
|
||||
#' email_line(sq_data, hrvar = "LevelDesignation")
|
||||
#'
|
||||
#'
|
||||
#' ## Return a table
|
||||
#' email_line(sq_data, hrvar = "LevelDesignation", return = "table")
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @export
|
||||
email_line <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
## Inherit arguments
|
||||
create_line(data = data,
|
||||
metric = "Email_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return)
|
||||
}
|
|
@ -0,0 +1,47 @@
|
|||
#' @title Email Hours Ranking
|
||||
#'
|
||||
#' @description
|
||||
#' This function scans a standard query output for groups with high levels of 'Weekly Email Collaboration'.
|
||||
#' Returns a table with all groups (across multiple HR attributes) ranked by average weekly email hours.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar A list of HR Variables to consider in the scan.
|
||||
#' Defaults to all HR attributes identified.
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
|
||||
#' Defaults to 5.
|
||||
#' @param return A character vector specifying what to return.
|
||||
#' Valid values include "html" (default, returning an interactive DataTable)
|
||||
#' and "df" (data frame)
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom stats reorder
|
||||
#'
|
||||
#' @family Emails
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ranked summary table as a data frame by default ("table").
#' When "html" is passed, an interactive DataTable is returned.
|
||||
#'
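#' @examples
#' ## Illustrative usage (a sketch, assuming the bundled `sq_data` sample dataset):
#' \dontrun{
#' ## Return a ranked data frame of email hours across all HR attributes
#' email_rank(sq_data, return = "table")
#' }
#'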
|
||||
#' @export
|
||||
|
||||
email_rank <- function(data,
|
||||
hrvar = extract_hr(data),
|
||||
mingroup = 5,
|
||||
return = "table"){
|
||||
|
||||
output <-
|
||||
data %>% create_rank(metric="Email_hours", hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = "table")
|
||||
|
||||
if(return == "html"){
|
||||
return(create_dt(output))
|
||||
} else if(return == "table"){
|
||||
return(output)
|
||||
} else {
|
||||
stop("Invalid `return` argument.")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,45 @@
|
|||
#' @title Email Summary
|
||||
#'
|
||||
#' @description
|
||||
#' Provides an overview analysis of weekly email hours.
|
||||
#' Returns a bar plot showing average weekly email hours by default.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @inheritParams create_bar
|
||||
#'
|
||||
#'
|
||||
#' @family Emails
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @examples
|
||||
#' # Return a ggplot bar chart
|
||||
#' email_summary(sq_data, hrvar = "LevelDesignation")
|
||||
#'
|
||||
#' # Return a summary table
|
||||
#' email_summary(sq_data, hrvar = "LevelDesignation", return = "table")
|
||||
#' @export
|
||||
|
||||
email_summary <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
create_bar(data = data,
|
||||
metric = "Email_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return,
|
||||
bar_colour = "darkblue")
|
||||
|
||||
|
||||
}
@ -0,0 +1,99 @@
|
|||
#' @title Email Hours Time Trend
#'
#' @description
#' Provides a week by week view of email time.
|
||||
#' By default returns a week by week heatmap, highlighting the points in time with most activity.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#'
|
||||
#' @family Emails
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
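#' @examples
#' ## Illustrative usage (a sketch, assuming the bundled `sq_data` sample dataset):
#' \dontrun{
#' ## Heatmap of weekly email hours by level designation
#' email_trend(sq_data, hrvar = "LevelDesignation")
#'
#' ## Summary table with one column per week
#' email_trend(sq_data, hrvar = "LevelDesignation", return = "table")
#' }
#'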
|
||||
#' @export
|
||||
|
||||
email_trend <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c("Date",
|
||||
"Email_hours",
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
## Date range data frame
|
||||
myPeriod <- extract_date_range(data)
|
||||
|
||||
myTable <-
|
||||
data %>%
|
||||
mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
|
||||
select(PersonId, Date, group, Email_hours) %>%
|
||||
group_by(group) %>%
|
||||
mutate(Employee_Count = n_distinct(PersonId)) %>%
|
||||
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
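
  ## Average email hours per group per week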
|
||||
|
||||
myTable <-
|
||||
myTable %>%
|
||||
group_by(Date, group) %>%
|
||||
summarize(Employee_Count = mean(Employee_Count),
|
||||
Email_hours = mean(Email_hours))
|
||||
|
||||
myTable_plot <- myTable %>% select(Date, group, Email_hours)
|
||||
|
||||
myTable_return <- myTable_plot %>% spread(Date, Email_hours)
|
||||
|
||||
plot_object <-
|
||||
myTable_plot %>%
|
||||
ggplot(aes(x = Date , y = group , fill = Email_hours)) +
|
||||
geom_tile(height=.5) +
|
||||
scale_fill_gradient(name="Hours", low = "white", high = "red") +
|
||||
theme_classic() +
|
||||
theme(axis.text=element_text(size=12),
|
||||
plot.title = element_text(color="grey40", face="bold", size=18),
|
||||
plot.subtitle = element_text(size=14),
|
||||
legend.position = "right",
|
||||
legend.justification = "right",
|
||||
legend.title=element_text(size=14),
|
||||
legend.text=element_text(size=14)) +
|
||||
labs(title = "Email Hours",
|
||||
subtitle = paste("Total email time by", tolower(hrvar))) +
|
||||
xlab("Date") +
|
||||
ylab(hrvar) +
|
||||
labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
myTable_return %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,76 @@
|
|||
#' @title Export {wpa} outputs to CSV, clipboard, or save as images
|
||||
#'
|
||||
#' @description
|
||||
#' A general use function to export {wpa} outputs to CSV,
|
||||
#' clipboard, or save as images. By default, `export()` copies
|
||||
#' a data frame to the clipboard. If the input is a ggplot object,
|
||||
#' the default behaviour is to export a PNG.
|
||||
#'
|
||||
#' @param x Data frame or ggplot object to be passed through.
|
||||
#' @param method Character string specifying the method of export.
|
||||
#' Valid inputs include "clipboard" (default), "csv", "png", and "svg".
|
||||
#' @param path If exporting a file, enter the path and the desired file name, _excluding the file extension_.
|
||||
#' For example, "Analysis/SQ Overview".
|
||||
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name. Defaults to TRUE.
|
||||
#' @param width Width of the exported plot in inches. Defaults to 12.
#' @param height Height of the exported plot in inches. Defaults to 9.
|
||||
#'
|
||||
#' @importFrom utils write.csv
|
||||
#'
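#' @examples
#' ## Illustrative usage (a sketch): `email_summary()` is defined elsewhere in
#' ## this package, and the output path shown is hypothetical.
#' \dontrun{
#' ## Copy a summary table to the clipboard
#' email_summary(sq_data, return = "table") %>% export()
#'
#' ## Save a plot as "Email Summary <timestamp>.png" in the working directory
#' email_summary(sq_data) %>% export(method = "png", path = "Email Summary")
#' }
#'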
|
||||
#' @export
|
||||
|
||||
export <- function(x,
|
||||
method = "clipboard",
|
||||
path = "wpa export",
|
||||
timestamp = TRUE,
|
||||
width = 12,
|
||||
height = 9){
|
||||
|
||||
## Create timestamped path (if applicable)
|
||||
|
||||
if(timestamp == TRUE){
|
||||
newpath <- paste(path, wpa::tstamp())
|
||||
} else {
|
||||
newpath <- path
|
||||
}
|
||||
|
||||
## Force method to png if is.ggplot and method not appropriate
|
||||
if(is.ggplot(x) & method %in% c("clipboard", "csv")){
|
||||
message("Input is a ggplot object. Defaulted to exporting as PNG...")
|
||||
method <- "png"
|
||||
}
|
||||
|
||||
## Main export function
|
||||
if(method == "clipboard"){
|
||||
copy_df(x)
|
||||
message(c("Data frame copied to clipboard.\n",
|
||||
"You may paste the contents directly to Excel."))
|
||||
|
||||
## Export option: CSV
|
||||
} else if(method == "csv"){
|
||||
|
||||
newpath <- paste0(newpath, ".csv")
|
||||
|
||||
write.csv(x = x, file = newpath)
|
||||
|
||||
## Export option: PNG
|
||||
} else if(method == "png"){
|
||||
|
||||
newpath <- paste0(newpath, ".png")
|
||||
|
||||
ggsave(filename = newpath, plot = x, width = width, height = height)
|
||||
|
||||
|
||||
## Export option: SVG
|
||||
} else if(method == "svg"){
|
||||
|
||||
newpath <- paste0(newpath, ".svg")
|
||||
|
||||
ggsave(filename = newpath, plot = x, width = width, height = height)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please check inputs. Enter `?export` for more details.")
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,111 @@
|
|||
#' @title Plot the external network metrics for a HR variable
|
||||
#'
|
||||
#' @description
|
||||
#' Plot the external network metrics for a HR variable.
|
||||
#'
|
||||
#' @param data Person Query as a data frame, including a date column named "Date".
#' This function assumes the date format is MM/DD/YYYY, as is standard in a WpA query output.
#' @param hrvar WpA variable for an HR attribute to group networks by,
#' e.g. "Layer". Defaults to "Organization".
|
||||
#' @param mingroup Numeric vector for minimum group size for aggregation
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#' @param bubble_size A numeric vector of length two to specify the size range of the bubbles
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import reshape2
|
||||
#' @import ggplot2
|
||||
#' @import ggrepel
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
#' external_network_plot(sq_data)
#' }
|
||||
#'
|
||||
#' @family Connectivity
|
||||
#'
|
||||
#' @export
|
||||
|
||||
external_network_plot <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot",
|
||||
bubble_size = c(1, 5)){
|
||||
plot_data <-
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar))
|
||||
|
||||
plot_data <-
|
||||
plot_data %>%
|
||||
group_by(group) %>%
|
||||
summarize(Ext_network_size = mean(External_network_size),
|
||||
Ext_network_breadth = mean(Networking_outside_company),
|
||||
Employee_count = n_distinct(PersonId)
|
||||
) %>%
|
||||
filter(Employee_count >= mingroup)
|
||||
|
||||
summary_table <- plot_data %>% arrange(., desc(Ext_network_size))
|
||||
|
||||
ext_network_plot <- ggplot(plot_data,
|
||||
aes(x=Ext_network_size,
|
||||
y=Ext_network_breadth)
|
||||
) +
|
||||
geom_point(aes(size=Employee_count),
|
||||
color = rgb2hex(0, 120, 212),
|
||||
alpha = 0.5) +
|
||||
geom_text_repel(label=plot_data$group) +
|
||||
scale_x_continuous(name = "External Network Size") +
|
||||
scale_y_continuous(name = "External Network Breadth") +
|
||||
scale_size(range = bubble_size) +
|
||||
theme_classic() +
|
||||
theme(
|
||||
axis.text = element_text(size = 10),
|
||||
axis.text.x = element_text(
|
||||
angle = 90,
|
||||
hjust = 1,
|
||||
vjust = 0.5
|
||||
),
|
||||
plot.title = element_text(
|
||||
color = "grey40",
|
||||
face = "bold",
|
||||
size = 18
|
||||
),
|
||||
plot.subtitle = element_text(size = 14)
|
||||
) +
|
||||
labs(
|
||||
size = "Size"
|
||||
) +
|
||||
labs(
|
||||
title = paste("External network metrics by", hrvar),
|
||||
subtitle = paste(
|
||||
"Network size is number of people, breadth is number of organizations"
|
||||
)
|
||||
) +
|
||||
labs(
|
||||
caption = paste(
|
||||
"Total employees =",
|
||||
sum(plot_data$Employee_count),
|
||||
"| Data from",
|
||||
min(as.Date(data$Date, "%m/%d/%Y")),
|
||||
"to",
|
||||
max(as.Date(data$Date, "%m/%d/%Y"))
|
||||
)
|
||||
)
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
summary_table %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(ext_network_plot)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,48 @@
|
|||
#' @title Extract HR attribute variables
|
||||
#'
|
||||
#' @description
|
||||
#' This function uses a combination of variable class,
|
||||
#' number of unique values, and regular expression matching
|
||||
#' to extract HR / organisational attributes from a data frame.
|
||||
#'
|
||||
#' @param data A data frame to be passed through.
|
||||
#' @param max_unique A numeric value representing the maximum
|
||||
#' number of unique values to accept for an HR attribute. Defaults to 50.
|
||||
#' @param return Character string with values "names" (default)
|
||||
#' or "vars" to determine whether the function returns a character
|
||||
#' vector of variable names or a data frame containing the HR
|
||||
#' attributes.
|
||||
#'
|
||||
#' @family General
|
||||
#'
|
||||
#' @examples
|
||||
#' sq_data %>% extract_hr(return = "names")
|
||||
#'
|
||||
#' sq_data %>% extract_hr(return = "vars")
|
||||
#'
|
||||
#' @export
|
||||
extract_hr <- function(data,
|
||||
max_unique = 50,
|
||||
return = "names"){
|
||||
|
||||
hr_var <-
|
||||
data %>%
|
||||
dplyr::select_if(~(is.character(.) | is.logical(.) | is.factor(.))) %>%
|
||||
dplyr::select_if(~(dplyr::n_distinct(.) < max_unique)) %>%
|
||||
dplyr::select_if(~!all(is_date_format(.))) %>%
|
||||
names() %>%
|
||||
.[.!= "WorkingStartTimeSetInOutlook"] %>%
|
||||
.[.!= "WorkingEndTimeSetInOutlook"] %>%
|
||||
.[.!= "WorkingDaysSetInOutlook"]
|
||||
|
||||
if(return == "names"){
|
||||
return(hr_var)
|
||||
} else if(return == "vars"){
|
||||
return(dplyr::select(data, tidyselect::all_of(hr_var)))
|
||||
} else {
|
||||
stop("Invalid input for return")
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,80 @@
|
|||
#' @title Flag unusual collaboration hours to after-hours collaboration hours ratios
|
||||
#'
|
||||
#' @description This function flags persons who have an unusual ratio
|
||||
#' of collaboration hours to after-hours collaboration hours.
|
||||
#' Prints a diagnostic message to the console by default.
|
||||
#'
|
||||
#' @import dplyr
|
||||
#'
|
||||
#' @param data A data frame containing a Person Query.
|
||||
#' @param threshold A numeric vector of length two specifying the lower and upper thresholds for flagging. Defaults to c(1, 30).
|
||||
#' @param return String to specify what to return. Options include "message" for console return, and "text" for string return.
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @examples
|
||||
#' flag_ch_ratio(sq_data)
|
||||
#'
|
||||
#' \dontrun{
|
||||
#' tibble(PersonId = c("Alice", "Bob"),
|
||||
#' Collaboration_hours = c(30, 0.5),
|
||||
#' After_hours_collaboration_hours = c(0.5, 30)) %>%
|
||||
#' flag_ch_ratio()
|
||||
#' }
|
||||
#'
|
||||
#' @export
|
||||
flag_ch_ratio <- function(data, threshold = c(1, 30), return = "message"){
|
||||
|
||||
min_thres <- min(threshold, na.rm = TRUE)
|
||||
max_thres <- max(threshold, na.rm = TRUE)
|
||||
|
||||
## Check for high collab hours but lower afterhour collab hours
|
||||
## Because of faulty outlook settings
|
||||
ch_summary <-
|
||||
data %>%
|
||||
group_by(PersonId) %>%
|
||||
summarise_at(vars(Collaboration_hours, After_hours_collaboration_hours), ~mean(.)) %>%
|
||||
mutate(CH_ratio = Collaboration_hours / After_hours_collaboration_hours) %>%
|
||||
arrange(desc(CH_ratio)) %>%
|
||||
mutate(CH_FlagLow = ifelse(CH_ratio < min_thres, TRUE, FALSE),
|
||||
CH_FlagHigh = ifelse(CH_ratio > max_thres, TRUE, FALSE),
|
||||
CH_Flag = ifelse(CH_ratio > max_thres | CH_ratio < min_thres, TRUE, FALSE))
|
||||
|
||||
## Percent of people with high collab hours + low afterhour collab hours
|
||||
CHFlagN <- sum(ch_summary$CH_Flag, na.rm = TRUE)
|
||||
CHFlagProp <- mean(ch_summary$CH_Flag, na.rm = TRUE)
|
||||
CHFlagProp2 <- paste(round(CHFlagProp * 100), "%") # Formatted
|
||||
|
||||
CHFlagMessage_Warning <- paste0("[Warning] The ratio of after-hours collaboration to total collaboration hours is outside the expected threshold for ", CHFlagN, " employees (", CHFlagProp2, " of the total).")
|
||||
CHFlagMessage_Pass_Low <- paste0("[Pass] The ratio of after-hours collaboration to total collaboration hours is outside the expected threshold for only ", CHFlagN, " employees (", CHFlagProp2, " of the total).")
|
||||
CHFlagMessage_Pass_Zero <- paste0("[Pass] The ratio of after-hours collaboration to total collaboration hours falls within the expected threshold for all employees.")
|
||||
|
||||
|
||||
CHFlagLowN <- sum(ch_summary$CH_FlagLow, na.rm = TRUE)
|
||||
CHFlagLowProp <- mean(ch_summary$CH_FlagLow, na.rm = TRUE)
|
||||
CHFlagLowProp2 <- paste(round(CHFlagLowProp * 100), "%") # Formatted
|
||||
CHFlagLowMessage <- paste0("- ", CHFlagLowN, " employees (", CHFlagLowProp2,
|
||||
") have an unusually low after-hours collaboration")
|
||||
|
||||
CHFlagHighN <- sum(ch_summary$CH_FlagHigh, na.rm = TRUE)
|
||||
CHFlagHighProp <- mean(ch_summary$CH_FlagHigh, na.rm = TRUE)
|
||||
CHFlagHighProp2 <- paste(round(CHFlagHighProp * 100), "%") # Formatted
|
||||
CHFlagHighMessage <- paste0("- ", CHFlagHighN, " employees (", CHFlagHighProp2, ") have an unusually high ratio of collaboration hours to after-hours collaboration hours")
|
||||
|
||||
if(CHFlagProp >= .05){
|
||||
CHFlagMessage <- paste(CHFlagMessage_Warning, CHFlagHighMessage, CHFlagLowMessage, sep = "\n")
|
||||
} else if(CHFlagProp < .05 & CHFlagProp > 0){
|
||||
CHFlagMessage <- paste(CHFlagMessage_Pass_Low, CHFlagHighMessage, CHFlagLowMessage, sep = "\n")
|
||||
} else if(CHFlagProp==0){
|
||||
CHFlagMessage <- CHFlagMessage_Pass_Zero
|
||||
}
|
||||
|
||||
## Print diagnosis
|
||||
## Should implement options to return the PersonIds or a full data frame
|
||||
if(return == "message"){
|
||||
message(CHFlagMessage)
|
||||
} else if(return == "text"){
|
||||
CHFlagMessage
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,43 @@
|
|||
#' @title Flag unusually high email hours to emails sent ratio
|
||||
#'
|
||||
#' @description This function flags persons who have an unusual ratio
|
||||
#' of email hours to emails sent.
|
||||
#' Returns a summary data frame of the computed ratio per person.
|
||||
#'
|
||||
#' @import dplyr
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @param data A data frame containing a Person Query.
|
||||
#' @param threshold Numeric value specifying the threshold for flagging. Defaults to 1.
|
||||
#'
|
||||
#' @examples
|
||||
#' flag_em_ratio(sq_data)
|
||||
#'
|
||||
#' @export
|
||||
flag_em_ratio <- function(data, threshold = 1){
|
||||
|
||||
  ## Check for persons with a high ratio of email hours to emails sent
|
||||
em_summary <-
|
||||
data %>%
|
||||
group_by(PersonId) %>%
|
||||
summarise_at(vars(Email_hours, Emails_sent), ~mean(.)) %>%
|
||||
mutate(Email_ratio = Email_hours / Emails_sent) %>%
|
||||
arrange(desc(Email_ratio)) %>%
|
||||
mutate(Email_Flag = ifelse(Email_ratio > threshold, TRUE, FALSE))
|
||||
|
||||
  ## Percent of people with a high email hours to emails sent ratio
|
||||
EmailFlagN <- sum(em_summary$Email_Flag, na.rm = TRUE)
|
||||
EmailFlagProp <- mean(em_summary$Email_Flag, na.rm = TRUE)
|
||||
EmailFlagProp2 <- paste(round(EmailFlagProp * 100), "%") # Formatted
|
||||
EmailFlagMessage <- paste0(EmailFlagProp2, " (", EmailFlagN, ") ",
|
||||
"of the population have an unusually high email hours to emails sent ratio.")
|
||||
|
||||
## Print diagnosis
|
||||
## Should implement options to return the PersonIds or a full data frame
|
||||
# EmailFlagMessage
|
||||
em_summary
|
||||
}
|
||||
|
||||
|
|
@ -0,0 +1,88 @@
|
|||
#' @title Warn if a certain metric exceeds an arbitrary threshold
|
||||
#'
|
||||
#' @description
|
||||
#' This is used as part of data validation to check if there are extreme values
|
||||
#' in the dataset.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param metric A character string specifying the metric to test.
|
||||
#' @param person A logical value to specify whether to calculate person-averages.
|
||||
#' Defaults to TRUE (person-averages calculated).
|
||||
#' @param threshold Numeric value specifying the threshold for flagging.
|
||||
#' @param return A character string specifying what to return. Valid options
#' include "message" (default), "text", and "table".
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @import dplyr
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' # The threshold values are intentionally set low to trigger messages.
|
||||
#' flag_extreme(sq_data, "Email_hours", threshold = 15)
|
||||
#' flag_extreme(sq_data, "Email_hours", threshold = 15, return = "table")
|
||||
#' flag_extreme(sq_data, "Email_hours", person = FALSE, threshold = 15)
|
||||
#' }
|
||||
#'
|
||||
#' @export
|
||||
flag_extreme <- function(data,
|
||||
metric,
|
||||
person = TRUE,
|
||||
threshold,
|
||||
return = "message"){
|
||||
|
||||
## Data frame containing the extreme values
|
||||
if(person == TRUE){
|
||||
extreme_df <-
|
||||
data %>%
|
||||
rename(metric = !!sym(metric)) %>%
|
||||
group_by(PersonId) %>%
|
||||
summarise_at(vars(metric), ~mean(.)) %>%
|
||||
filter(metric > threshold) %>%
|
||||
rename(!!sym(metric) := "metric")
|
||||
} else if(person == FALSE){
|
||||
extreme_df <-
|
||||
data %>%
|
||||
rename(metric = !!sym(metric)) %>%
|
||||
filter(metric > threshold) %>%
|
||||
rename(!!sym(metric) := "metric")
|
||||
}
|
||||
|
||||
|
||||
## Clean names for pretty printing
|
||||
metric_nm <- gsub(pattern = "_", replacement = " ", x = metric)
|
||||
metric_nm <- camel_clean(metric_nm)
|
||||
|
||||
## Define MessageLevel
|
||||
if(person == TRUE){
|
||||
MessageLevel <- " persons where their average "
|
||||
} else if(person == FALSE){
|
||||
MessageLevel <- " rows where their value of "
|
||||
}
|
||||
|
||||
## Define FlagMessage
|
||||
|
||||
if(nrow(extreme_df) == 0){
|
||||
FlagMessage <-
|
||||
paste0("[Pass] There are no ",
|
||||
MessageLevel,
|
||||
metric_nm,
|
||||
" exceeds ",
|
||||
threshold, ".")
|
||||
} else {
|
||||
FlagMessage <-
|
||||
paste0("[Warning] There are ",
|
||||
nrow(extreme_df),
|
||||
MessageLevel,
|
||||
metric_nm,
|
||||
" exceeds ",
|
||||
threshold, ".")
|
||||
}
|
||||
|
||||
if(return == "text"){
|
||||
FlagMessage
|
||||
} else if(return == "message"){
|
||||
message(FlagMessage)
|
||||
} else if(return == "table"){
|
||||
extreme_df
|
||||
}
|
||||
}
|
|
@ -0,0 +1,105 @@
|
|||
#' @title Flag unusual outlook time settings for work day start and end time
|
||||
#'
|
||||
#' @description This function flags unusual outlook calendar settings for
|
||||
#' start and end time of work day.
|
||||
#'
|
||||
#' @import dplyr
|
||||
#'
|
||||
#' @param data A data frame containing a Person Query.
|
||||
#' @param threshold A numeric vector of length two, specifying the hour threshold for flagging.
|
||||
#' Defaults to c(4, 15).
|
||||
#' @param return String to specify what to return.
|
||||
#' Valid options include "text" (default), "message", and "data".
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' flag_outlooktime(sq_data)
|
||||
#' }
|
||||
#' @export
|
||||
flag_outlooktime <- function(data, threshold = c(4, 15), return = "message"){
|
||||
|
||||
# pad_times <- function(x){
|
||||
# if(nchar(x) == 1){
|
||||
# x <- paste0("0", x, "00")
|
||||
# } else if(nchar(x) == 2){
|
||||
# x <- paste0(x, "00")
|
||||
# } else if(nchar(x) == 3){
|
||||
# x <- paste0("0", x)
|
||||
# } else {
|
||||
# x
|
||||
# }
|
||||
# }
|
||||
#
|
||||
# pad_times <- Vectorize(pad_times)
|
||||
|
||||
if(any(nchar(data$WorkingStartTimeSetInOutlook) != 5 | nchar(data$WorkingEndTimeSetInOutlook) != 5)){
|
||||
stop("Please check data format for `WorkingStartTimeSetInOutlook` or `WorkingEndTimeSetInOutlook.\n
|
||||
These variables must be character vectors, and have the format `%H:%M`, such as `07:30` or `23:00`.")
|
||||
}
|
||||
|
||||
clean_times <- function(x){
|
||||
out <- gsub(pattern = ":", replacement = "", x = x)
|
||||
# out <- pad_times(out)
|
||||
strptime(out, format = "%H%M")
|
||||
}
|
||||
|
||||
flagged_data <-
|
||||
data %>%
|
||||
# mutate_at(vars(WorkingStartTimeSetInOutlook, WorkingEndTimeSetInOutlook), ~clean_times(.)) %>%
|
||||
mutate_at(vars(WorkingStartTimeSetInOutlook, WorkingEndTimeSetInOutlook), ~gsub(pattern = ":", replacement = "", x = .)) %>%
|
||||
mutate_at(vars(WorkingStartTimeSetInOutlook, WorkingEndTimeSetInOutlook), ~strptime(., format = "%H%M")) %>%
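    # Workday length in hours, plus flags for settings below / above the thresholds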
|
||||
mutate(WorkdayRange = as.numeric(WorkingEndTimeSetInOutlook - WorkingStartTimeSetInOutlook, units = "hours"),
|
||||
WorkdayFlag1 = WorkdayRange < threshold[[1]],
|
||||
WorkdayFlag2 = WorkdayRange > threshold[[2]],
|
||||
WorkdayFlag = WorkdayRange < threshold[[1]] | WorkdayRange > threshold[[2]]) %>%
|
||||
select(PersonId, WorkdayRange, WorkdayFlag, WorkdayFlag1, WorkdayFlag2)
|
||||
|
||||
## Short working hour settings
|
||||
FlagN1 <- sum(flagged_data$WorkdayFlag1, na.rm = TRUE)
|
||||
FlagProp1 <- mean(flagged_data$WorkdayFlag1, na.rm = TRUE)
|
||||
FlagProp1F <- paste0(round(FlagProp1 * 100, 1), "%") # Formatted
|
||||
|
||||
## Long working hour settings
|
||||
FlagN2 <- sum(flagged_data$WorkdayFlag2, na.rm = TRUE)
|
||||
FlagProp2 <- mean(flagged_data$WorkdayFlag2, na.rm = TRUE)
|
||||
FlagProp2F <- paste0(round(FlagProp2 * 100, 1), "%") # Formatted
|
||||
|
||||
## Short or long working hour settings
|
||||
FlagN <- sum(flagged_data$WorkdayFlag, na.rm = TRUE)
|
||||
FlagProp <- mean(flagged_data$WorkdayFlag, na.rm = TRUE)
|
||||
FlagPropF <- paste0(round(FlagProp * 100, 1), "%") # Formatted
|
||||
|
||||
## Flag Messages
|
||||
Warning_Message <- paste0("[Warning] ", FlagPropF, " (", FlagN, ") ", "of the person-date rows in the data have extreme Outlook settings.")
|
||||
Pass_Message1 <- paste0("[Pass] Only ", FlagPropF, " (", FlagN, ") ", "of the person-date rows in the data have extreme Outlook settings.")
|
||||
Pass_Message2 <- paste0("There are no extreme Outlook settings in this dataset (Working hours shorter than ", threshold[[1]], " hours, or longer than ", threshold[[2]], " hours.")
|
||||
Detail_Message <- paste0(FlagProp1F, " (", FlagN1, ") ", " have an Outlook workday shorter than ", threshold[[1]], " hours, while ",
|
||||
FlagProp2F, " (", FlagN2, ") ", "have a workday longer than ", threshold[[2]], " hours.")
|
||||
|
||||
|
||||
if(FlagProp >= .05){
|
||||
FlagMessage <- paste(Warning_Message, Detail_Message, sep = "\n")
|
||||
|
||||
} else if(FlagProp < .05 & FlagProp > 0){
|
||||
FlagMessage <- paste(Pass_Message1, Detail_Message, sep = "\n")
|
||||
|
||||
} else if(FlagProp==0){
|
||||
FlagMessage <- Pass_Message2
|
||||
}
|
||||
|
||||
## Print diagnosis
|
||||
## Should implement options to return the PersonIds or a full data frame
|
||||
if(return == "text"){
|
||||
FlagMessage
|
||||
} else if(return == "message"){
|
||||
message(FlagMessage)
|
||||
} else if(return == "data"){
|
||||
flagged_data[flagged_data$WorkdayFlag == TRUE,]
|
||||
} else {
|
||||
stop("Error: please check inputs for `return`")
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -0,0 +1,19 @@
|
|||
#' @title Sample Group-to-Group dataset
|
||||
#'
|
||||
#' @description
|
||||
#' A dataset generated from a Group-to-Group Query from WpA.
|
||||
#'
|
||||
#' @format A data frame with 3517 rows and 7 variables:
|
||||
#' \describe{
|
||||
#' \item{TimeInvestors_Organization}{ }
|
||||
#' \item{Collaborators_Organization}{ }
|
||||
#' \item{Date}{ }
|
||||
#' \item{Meetings}{ }
|
||||
#' \item{Meeting_hours}{ }
|
||||
#' \item{Email_hours}{ }
|
||||
#' \item{Collaboration_hours}{ }
|
||||
#'
|
||||
#' ...
|
||||
#' }
|
||||
#' @source \url{https://workplaceanalytics-demo.office.com/en-us/Home}
|
||||
"g2g_data"
|
|
@ -0,0 +1,93 @@
|
|||
#' @title Create a network plot with the G2G query
|
||||
#'
|
||||
#' @description
|
||||
#' Pass a data frame containing a G2G query and return a network
|
||||
#' plot.
|
||||
#' Automatically handles "Collaborators_within_group" and "Other_collaborators" within query data.
|
||||
#'
|
||||
#' @param data Data frame containing a G2G query.
|
||||
#' @param time_investor String containing the variable name for the Time Investor column.
|
||||
#' @param collaborator String containing the variable name for the Collaborator column.
|
||||
#' @param metric String containing the variable name for metric.
|
||||
#' @param exc_threshold Exclusion threshold to apply, as a proportion of a node's total collaboration. Defaults to 0.1.
|
||||
#' @param subtitle String to override default plot subtitle.
|
||||
#' @param ... Additional arguments to pass to `GGally::ggnet2()`
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#'
|
||||
#' @import ggplot2
|
||||
#' @import dplyr
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' g2g_data %>%
|
||||
#' g2g_network(time_investor = "TimeInvestors_Organization",
|
||||
#' collaborator = "Collaborators_Organization",
|
||||
#' metric = "Collaboration_hours")
|
||||
#' }
|
||||
#'
|
||||
#' @export
|
||||
g2g_network <- function(data,
|
||||
time_investor,
|
||||
collaborator,
|
||||
metric,
|
||||
exc_threshold = 0.1,
|
||||
subtitle = "Collaboration Across Organizations",
|
||||
return = "plot",
|
||||
...){
|
||||
|
||||
plot_data <-
|
||||
data %>%
|
||||
rename(TimeInvestorOrg = time_investor,
|
||||
CollaboratorOrg = collaborator,
|
||||
Metric = metric) %>%
|
||||
mutate(CollaboratorOrg = case_when(CollaboratorOrg == "Collaborators Within Group" ~ TimeInvestorOrg,
|
||||
TRUE ~ CollaboratorOrg)) %>%
|
||||
group_by(TimeInvestorOrg, CollaboratorOrg) %>%
|
||||
filter(TimeInvestorOrg != "Other_Collaborators" &
|
||||
CollaboratorOrg!="Other_Collaborators") %>%
|
||||
summarise_at("Metric", ~mean(.)) %>%
|
||||
group_by(TimeInvestorOrg) %>%
|
||||
mutate(metric_prop = Metric / sum(Metric, na.rm = TRUE)) %>%
|
||||
select(TimeInvestorOrg, CollaboratorOrg, metric_prop) %>%
|
||||
ungroup()
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
plot_data
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
mynet_em <-
|
||||
plot_data %>%
|
||||
filter(metric_prop > exc_threshold) %>%
|
||||
mutate_at(vars(TimeInvestorOrg, CollaboratorOrg), ~sub(pattern = " ", replacement = "\n", x = .)) %>%
|
||||
mutate(metric_prop = metric_prop * 10) %>%
|
||||
network::network(matrix.type = "edgelist",
|
||||
ignore.eval = FALSE,
|
||||
names.eval = "weights")
|
||||
|
||||
mynet_em %>%
|
||||
GGally::ggnet2(size = 12,
|
||||
color = "lightblue",
|
||||
label = TRUE,
|
||||
label.size = 4,
|
||||
label.color = "black",
|
||||
edge.size = "weights",
|
||||
edge.alpha = .5,
|
||||
...) +
|
||||
ggtitle("Group to Group Collaboration",
|
||||
subtitle = subtitle) +
|
||||
xlab(label = "") +
|
||||
ylab(label = "") +
|
||||
theme_wpa_basic() +
|
||||
labs(caption = paste("Displays only collaboration above ", exc_threshold * 100, "% of node's total collaboration", sep = "")) +
|
||||
theme(axis.line = element_blank())
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,196 @@
|
|||
#' @title Generate HTML report with list inputs
|
||||
#'
|
||||
#' @description
|
||||
#' This is a support function using a list-pmap workflow to
|
||||
#' create a HTML document, using RMarkdown as the engine.
|
||||
#'
|
||||
#' @param title Character string to specify the title of the chunk.
|
||||
#' @param filename File name to be used in the exported HTML.
|
||||
#' @param outputs A list of outputs to be added to the HTML report.
|
||||
#' Note that `outputs`, `titles`, `subheaders`, `echos`, and `levels` must all have the same length.
|
||||
#' @param titles A list/vector of character strings to specify the title of the chunks.
|
||||
#' @param subheaders A list/vector of character strings to specify the subheaders for each chunk.
|
||||
#' @param echos A list/vector of logical values to specify whether to display code.
|
||||
#' @param levels A list/vector of numeric value to specify the header level of the chunk.
|
||||
#' @param theme Character vector to specify theme to be used for the report.
|
||||
#' E.g. "united", "default".
|
||||
#' @param preamble A preamble to appear at the beginning of the report, passed as a text string.
|
||||
#'
|
||||
#' @importFrom purrr pmap
|
||||
#' @importFrom purrr reduce
|
||||
#'
|
||||
#' @family Reports
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' # Step 1: Define Content
|
||||
#' output_list <-
|
||||
#' list(sq_data %>% workloads_summary(return = "plot"),
|
||||
#' sq_data %>%
|
||||
#' workloads_summary(return = "table")) %>%
|
||||
#' purrr::map_if(is.data.frame, create_dt)
|
||||
#'
|
||||
#' # Step 2: Add Corresponding Titles
|
||||
#' title_list <- c("Workloads Summary - Plot", "Workloads Summary - Table")
|
||||
#' n_title <- length(title_list)
|
||||
#'
|
||||
#' # Step 3: Generate Report
|
||||
#' generate_report(title = "My First Report",
|
||||
#' filename = "My First Report",
|
||||
#' outputs = output_list,
|
||||
#' titles = title_list,
|
||||
#' subheaders = rep("", n_title),
|
||||
#'                 echos = rep(FALSE, n_title),
#'                 levels = rep(3, n_title))
|
||||
#'
|
||||
#' }
|
||||
#' @export
|
||||
generate_report <- function(title = "My minimal HTML generator",
|
||||
filename = "minimal_html",
|
||||
outputs = output_list,
|
||||
titles,
|
||||
subheaders,
|
||||
echos,
|
||||
levels,
|
||||
theme = "united",
|
||||
preamble = ""){
|
||||
|
||||
|
||||
|
||||
## Title of document
|
||||
title_chr <- paste0('title: \"', title, '\"')
|
||||
|
||||
|
||||
## chunk loopage
|
||||
## merged to create `chunk_merged`
|
||||
chunk_merged <-
|
||||
list(output = outputs,
|
||||
title = titles,
|
||||
subheader = subheaders,
|
||||
echo = echos,
|
||||
level = levels,
|
||||
id = seq(1, length(outputs))) %>%
|
||||
purrr::pmap(function(output, title, subheader, echo, level, id){
|
||||
|
||||
generate_chunks(level = level,
|
||||
title = title,
|
||||
subheader = subheader,
|
||||
echo = echo,
|
||||
object = paste0("outputs[[", id, "]]"))
|
||||
|
||||
}) %>%
|
||||
purrr::reduce(c)
|
||||
|
||||
# wpa_logo <- system.file("logos/logo.PNG", package = "wpa")
|
||||
|
||||
## markdown object
|
||||
markobj <- c('---',
|
||||
title_chr,
|
||||
'output: ',
|
||||
' html_document:',
|
||||
paste0(' theme: ', theme),
|
||||
# ' theme: united',
|
||||
' toc: true',
|
||||
' toc_float:',
|
||||
' collapsed: false',
|
||||
' smooth_scroll: true',
|
||||
'---',
|
||||
# paste0('![]("', wpa_logo, '")'),
|
||||
'',
|
||||
preamble,
|
||||
'',
|
||||
chunk_merged)
|
||||
|
||||
writeLines(markobj, paste0(filename, ".Rmd"))
|
||||
rmarkdown::render(paste0(filename, ".Rmd"))
|
||||
|
||||
## Load in browser
|
||||
utils::browseURL(paste0(filename, ".html"))
|
||||
|
||||
## Deletes specified files
|
||||
unlink(c(paste0(filename, ".Rmd"),
|
||||
paste0(filename, ".md")))
|
||||
|
||||
|
||||
}
|
||||
|
||||
|
||||
#' @title Generate chunk strings
|
||||
#'
|
||||
#' @description This is used as a supporting function for `generate_report()`
|
||||
#' and not directly used. `generate_report()` works by creating a
|
||||
#' loop structure around generate_chunks(), and binds them together
|
||||
#' to create a report.
|
||||
#'
|
||||
#' @details
|
||||
#' `generate_chunks()` is primarily a wrapper around paste() functions,
|
||||
#' to create a structured character vector that will form the individual
|
||||
#' chunks. No plots 'exist' within the environment of `generate_chunks()`.
|
||||
#'
|
||||
#' @param level Numeric value to specify the header level of the chunk.
|
||||
#' @param title Character string to specify the title of the chunk.
|
||||
#' @param subheader Character string to specify the subheader of the chunk.
|
||||
#' @param echo Logical value to specify whether to display code.
|
||||
#' @param object Character string to specify name of the object to show.
|
||||
generate_chunks <- function(level = 3,
|
||||
title,
|
||||
subheader = "",
|
||||
echo = FALSE,
|
||||
object){
|
||||
|
||||
level_hash <- paste(rep('#', level), collapse = "")
|
||||
|
||||
obj <- c(paste(level_hash, title),
|
||||
subheader,
|
||||
paste0('```{r, echo=',
|
||||
echo,
|
||||
', fig.height=9, fig.width=12}'),
|
||||
object,
|
||||
'```',
|
||||
' ')
|
||||
|
||||
return(obj)
|
||||
}
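
## Illustrative sketch of how `generate_report()` calls this helper; the object
## string refers to an element of `outputs` that is assumed to exist in the
## environment where the generated RMarkdown document is knitted:
## generate_chunks(level = 3, title = "Workloads Summary", object = "outputs[[1]]")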
|
||||
|
||||
#' Read preamble
|
||||
#'
|
||||
#' Read in a preamble to be used within each individual reporting function.
|
||||
#' Reads from the Markdown file installed with the package.
|
||||
#'
|
||||
#' @param path Text string containing the path for the appropriate Markdown file.
|
||||
#'
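#' @examples
#' # A minimal sketch; "intro.md" is a placeholder name for a Markdown file
#' # assumed to be installed under the package's `preamble/` folder:
#' \dontrun{
#' preamble_text <- read_preamble("intro.md")
#' }
#'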
|
||||
#' @export
|
||||
read_preamble <- function(path){
|
||||
|
||||
full_path <- paste0("/preamble/", path)
|
||||
|
||||
complete_path <- paste0(path.package("wpa"), full_path)
|
||||
|
||||
text <- suppressWarnings(readLines(complete_path))
|
||||
|
||||
return(text)
|
||||
}
|
||||
|
||||
#' Display HTML fragment in RMarkdown chunk, from Markdown text
|
||||
#'
|
||||
#' @description
|
||||
#' This is a wrapper around `markdown::markdownToHTML()`, where
|
||||
#' the default behaviour is to produce a HTML fragment.
|
||||
#' `htmltools::HTML()` is then used to evaluate the HTML code
|
||||
#' within a RMarkdown chunk.
|
||||
#'
|
||||
#' @importFrom htmltools HTML
|
||||
#' @importFrom markdown markdownToHTML
|
||||
#'
|
||||
#' @param text Character vector containing Markdown text
|
||||
#'
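#' @examples
#' # A minimal sketch converting a Markdown string into an HTML fragment:
#' \dontrun{
#' md2html("**Key findings** are summarised below.")
#' }
#'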
|
||||
md2html <- function(text){
|
||||
|
||||
html_chunk <- markdown::markdownToHTML(text = text,
|
||||
fragment.only = TRUE)
|
||||
|
||||
htmltools::HTML(html_chunk)
|
||||
|
||||
}
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,184 @@
|
|||
###################################################################
|
||||
|
||||
## Global Variables
|
||||
|
||||
## This file is added to minimize the false positives flagged during R CMD check.
|
||||
## Example: afterhours_trend: no visible binding for global variable 'Date'
|
||||
|
||||
###################################################################
|
||||
|
||||
utils::globalVariables(
|
||||
c(
|
||||
"Date",
|
||||
"Start",
|
||||
"End",
|
||||
"PersonId",
|
||||
"group",
|
||||
"Period",
|
||||
"outcome",
|
||||
".",
|
||||
"After_hours_collaboration_hours",
|
||||
"Employee_Count",
|
||||
"bins",
|
||||
"Metric",
|
||||
"Hours",
|
||||
"Collaboration_hours",
|
||||
"bucket_hours",
|
||||
"Employees",
|
||||
"Total",
|
||||
"Value",
|
||||
"Email_hours",
|
||||
"External_network_size",
|
||||
"Networking_outside_company",
|
||||
"Ext_network_size",
|
||||
"Ext_network_breadth",
|
||||
"CH_ratio",
|
||||
"Emails_sent",
|
||||
"Email_ratio",
|
||||
"WorkingEndTimeSetInOutlook",
|
||||
"WorkingStartTimeSetInOutlook",
|
||||
"WorkdayRange",
|
||||
"WorkdayFlag",
|
||||
"WorkdayFlag1",
|
||||
"WorkdayFlag2",
|
||||
"TimeInvestorOrg",
|
||||
"CollaboratorOrg",
|
||||
"metric_prop",
|
||||
"output_list",
|
||||
"n_unique",
|
||||
"attribute",
|
||||
"NA_per",
|
||||
"no visible global function definition for head",
|
||||
"mean_collab",
|
||||
"holidayweek",
|
||||
"ymin",
|
||||
"ymax",
|
||||
"z_score",
|
||||
"Organization",
|
||||
"flag_nkw",
|
||||
"perc",
|
||||
"tenure_years",
|
||||
"odd_tenure",
|
||||
"Internal_network_size",
|
||||
"Networking_outside_organization",
|
||||
"Int_network_size",
|
||||
"Int_network_breadth",
|
||||
"Workweek_span",
|
||||
"Meeting_hours",
|
||||
"Meetings",
|
||||
"After_hours_meeting_hours",
|
||||
"Low_quality_meeting_hours",
|
||||
"Generated_workload_email_recipients",
|
||||
"After_hours_email_hours",
|
||||
"Total_focus_hours",
|
||||
"variable",
|
||||
"Multitasking_meeting_hours",
|
||||
"Redundant_meeting_hours__organizational_",
|
||||
"Conflicting_meeting_hours",
|
||||
"Meeting_count",
|
||||
"perc_after_hours_m",
|
||||
"perc_low_quality",
|
||||
"perc_Multitasking",
|
||||
"perc_Redundant",
|
||||
"perc_conflicting",
|
||||
"Redundant_meeting_hours__lower_level_",
|
||||
"HourType",
|
||||
"extract_prop",
|
||||
"extract_raw",
|
||||
"RawHours",
|
||||
"MeetingType",
|
||||
"Attendee_meeting_hours",
|
||||
"AttendeeMeetingHours",
|
||||
"Prop",
|
||||
"Duration",
|
||||
"Attendees",
|
||||
"Percent",
|
||||
"value2",
|
||||
"x",
|
||||
"y",
|
||||
"Meeting_hours_with_manager",
|
||||
"coattendman_rate",
|
||||
"bucket_coattendman_rate",
|
||||
"Meeting_hours_with_manager_1_on_1",
|
||||
"mgr1on1",
|
||||
"coattendande",
|
||||
"mgrRel",
|
||||
"xmin",
|
||||
"xmax",
|
||||
"bucket_manager_1_on_1",
|
||||
"Minutes_with_manager_1_on_1",
|
||||
"Metrics",
|
||||
"Values",
|
||||
"Checks",
|
||||
"Variables",
|
||||
"UniqueValues",
|
||||
"MissingValues",
|
||||
"Examples",
|
||||
"KPI",
|
||||
"After",
|
||||
"Before",
|
||||
"delta",
|
||||
"perc_diff",
|
||||
"Mybins",
|
||||
"Employee_perc",
|
||||
"cluster",
|
||||
"key",
|
||||
"clusters",
|
||||
"label",
|
||||
"Signals",
|
||||
"StartEnd",
|
||||
"Before_start",
|
||||
"Within_hours",
|
||||
"After_end",
|
||||
"Cases",
|
||||
"subjectFlag",
|
||||
"Subject",
|
||||
"line",
|
||||
"text",
|
||||
"word",
|
||||
"name",
|
||||
"freq",
|
||||
"pre_group",
|
||||
"Utilization_hrs",
|
||||
"UH_bin",
|
||||
"UH_over_45",
|
||||
"h_clust",
|
||||
"dist_m",
|
||||
"Signals_Total",
|
||||
"Signals_sent",
|
||||
"Personas",
|
||||
"Employee_count",
|
||||
"external_network_plot",
|
||||
"total",
|
||||
"value",
|
||||
"prop",
|
||||
"ODDS",
|
||||
"..input_var2",
|
||||
".N",
|
||||
".SD",
|
||||
"Freq",
|
||||
"PersonCount",
|
||||
"Shifts",
|
||||
"Shoulders",
|
||||
"per",
|
||||
"sent",
|
||||
"Fill",
|
||||
"PersonWeekId",
|
||||
"WeekCount",
|
||||
"WeekPercentage",
|
||||
"patternRank",
|
||||
"Count",
|
||||
"Day_Span",
|
||||
"First_signal",
|
||||
"Last_signal",
|
||||
"MeetingHoursInLongOrLargeMeetings",
|
||||
"Percentage",
|
||||
"TenureYear",
|
||||
"calculation",
|
||||
"perc_nkw",
|
||||
"value_rescaled",
|
||||
"values",
|
||||
"cleaned_data",
|
||||
"zscore"
|
||||
)
|
||||
)
|
|
@ -0,0 +1,58 @@
|
|||
#' @title Employee count over time
|
||||
#'
|
||||
#' @description Returns a line chart showing the change in
|
||||
#' employee count over time. Part of a data validation process to check
|
||||
#' for unusual license growth / declines over time.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' hr_trend(sq_data)
|
||||
#' hr_trend(dv_data)
|
||||
#' hr_trend(dv_data, return = "table")
|
||||
#' }
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @export
|
||||
hr_trend <- function(data, return = "plot"){
|
||||
|
||||
options(dplyr.summarise.inform = FALSE)
|
||||
|
||||
data$Date <- as.Date(data$Date, format = "%m/%d/%Y")
|
||||
|
||||
## Date range data frame
|
||||
myPeriod <- extract_date_range(data)
|
||||
|
||||
plot_data <-
|
||||
data %>%
|
||||
group_by(Date) %>%
|
||||
summarise(n = n_distinct(PersonId)) %>%
|
||||
ungroup()
|
||||
|
||||
if(return == "plot"){
|
||||
plot_data %>%
|
||||
ggplot(aes(x = Date, y = n)) +
|
||||
geom_line(size = 1) +
|
||||
labs(title = "Population over time",
|
||||
subtitle = "Unique licensed population by week",
|
||||
caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End)) +
|
||||
ylab("Employee count") +
|
||||
xlab("Date") +
|
||||
scale_y_continuous(labels = round, limits = c(0,NA)) +
|
||||
theme_wpa_basic()
|
||||
|
||||
} else if(return == "table"){
|
||||
plot_data
|
||||
} else {
|
||||
stop("Please enter a valid input for `return`.")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,102 @@
|
|||
#' @title Create a count of distinct people in a specified HR variable
|
||||
#'
|
||||
#' @description
|
||||
#' This function enables you to create a count of the distinct people
|
||||
#' by the specified HR attribute. The default behaviour is to return a
|
||||
#' bar chart as typically seen in 'Analysis Scope'.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation".
|
||||
#' If a vector with more than one value is provided, the HR attributes are automatically
|
||||
#' concatenated.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import ggplot2
|
||||
#' @import dplyr
|
||||
#' @importFrom data.table ":=" "%like%" "%between%"
|
||||
#'
|
||||
#' @family General
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @examples
|
||||
#' # Return a bar plot
|
||||
#' hrvar_count(sq_data, hrvar = "LevelDesignation")
|
||||
#'
|
||||
#' # Return a summary table
|
||||
#' hrvar_count(sq_data, hrvar = "LevelDesignation", return = "table")
|
||||
#'
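#' # Multiple HR attributes are concatenated when a vector is passed
#' # (illustrative sketch):
#' \dontrun{
#' hrvar_count(sq_data, hrvar = c("Organization", "LevelDesignation"), return = "table")
#' }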
|
||||
#' @export
|
||||
|
||||
hrvar_count <- function(data,
|
||||
hrvar = "Organization",
|
||||
return = "plot"){
|
||||
|
||||
## Allow multiple HRvar inputs
|
||||
if(length(hrvar) > 1){
|
||||
hrvar_flat <- paste(hrvar, collapse = ", ")
|
||||
|
||||
summary_table <-
|
||||
data %>%
|
||||
select(PersonId, all_of(hrvar)) %>%
|
||||
mutate(!!sym(hrvar_flat) := select(., hrvar) %>%
|
||||
apply(1, paste, collapse = ", ")) %>%
|
||||
group_by(!!sym(hrvar_flat)) %>%
|
||||
summarise(n = n_distinct(PersonId)) %>%
|
||||
arrange(desc(n))
|
||||
|
||||
# Single reference for single and multiple org attributes
|
||||
hrvar_label <- hrvar_flat
|
||||
|
||||
} else {
|
||||
|
||||
summary_table <-
|
||||
data %>%
|
||||
select(PersonId, all_of(hrvar)) %>%
|
||||
group_by(!!sym(hrvar)) %>%
|
||||
summarise(n = n_distinct(PersonId)) %>%
|
||||
arrange(desc(n))
|
||||
|
||||
# Single reference for single and multiple org attributes
|
||||
hrvar_label <- hrvar
|
||||
}
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
data %>%
|
||||
data.table::as.data.table() %>%
|
||||
.[, .(n = n_distinct(PersonId)), by = hrvar] %>%
|
||||
as_tibble() %>%
|
||||
arrange(desc(n))
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
## This is re-run to enable multi-attribute grouping without concatenation
|
||||
summary_table %>%
|
||||
ggplot(aes(x = stats::reorder(!!sym(hrvar_label), -n),
|
||||
y = n)) +
|
||||
geom_col(fill = rgb2hex(0, 120, 212)) +
|
||||
geom_text(aes(label = n),
|
||||
vjust = -1,
|
||||
fontface = "bold",
|
||||
size = 4)+
|
||||
theme_wpa_basic() +
|
||||
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
|
||||
labs(title = paste("People by", camel_clean(hrvar_label))) +
|
||||
scale_y_continuous(limits = c(0, max(summary_table$n) * 1.1)) +
|
||||
xlab(camel_clean(hrvar_label)) +
|
||||
ylab("Number of employees")
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
#' @rdname hrvar_count
|
||||
#' @export
|
||||
analysis_scope <- hrvar_count
|
||||
|
|
@ -0,0 +1,105 @@
|
|||
#' @title Create count of distinct fields and percentage of employees with NAs for all HR variables
|
||||
#'
|
||||
#' @description
|
||||
#' This function enables you to create a summary table to validate organizational data. This table will provide a summary of the data found
|
||||
#' in the WpA Sources page.
|
||||
#' This function will return a summary table with the count of distinct fields per HR attribute and the percentage of
|
||||
#' employees with NAs for that attribute.
|
||||
#' See hrvar_count function for more detail on the specific HR attribute of interest.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param n_var Number of HR variables to include in the report as rows. Defaults to 50.
|
||||
#' @param return String to specify what to return
|
||||
#' @param threshold The max number of unique values allowed for any attribute. Default is 100.
|
||||
#' @param maxna The max percentage of NAs allowable for any column. Default is 20.
|
||||
#'
|
||||
#' @import dplyr
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a diagnostic message by default (`return` = "message"), indicating which attributes exceed the specified thresholds.
#' When 'table' is passed, a summary table listing the number of distinct values and the percentage of missing values for the specified number of HR attributes is returned as a data frame.
#' When 'text' is passed, the same diagnostic message is returned as a character string.
|
||||
#'
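#' @examples
#' # Illustrative usage with the bundled sample Standard Person Query `sq_data`:
#' \dontrun{
#' hrvar_count_all(sq_data, return = "message")
#' hrvar_count_all(sq_data, return = "table")
#' }
#'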
|
||||
#' @export
|
||||
hrvar_count_all <- function(data,
|
||||
n_var = 50,
|
||||
return = "message",
|
||||
threshold = 100,
|
||||
maxna = 20
|
||||
){
|
||||
|
||||
## Character vector of HR attributes
|
||||
extracted_chr <- extract_hr(data, return = "names")
|
||||
|
||||
summary_table_n <-
|
||||
data %>%
|
||||
select(PersonId, extracted_chr) %>%
|
||||
summarise_at(vars(extracted_chr), ~n_distinct(.,na.rm = TRUE)) # Excludes NAs from unique count
|
||||
|
||||
## Note: WPA here is used for a matching separator
|
||||
results <-
|
||||
data %>%
|
||||
select(PersonId, extracted_chr) %>%
|
||||
summarise_at(vars(extracted_chr),
|
||||
list(`WPAn_unique` = ~n_distinct(., na.rm = TRUE), # Excludes NAs from unique count
|
||||
`WPAper_na` = ~(sum(is.na(.))/ nrow(data) * 100),
|
||||
`WPAsum_na` = ~sum(is.na(.)) # Number of missing values
|
||||
)) %>% # % of missing values
|
||||
tidyr::gather(attribute, values) %>%
|
||||
tidyr::separate(col = attribute, into = c("attribute", "calculation"), sep = "_WPA") %>%
|
||||
tidyr::spread(calculation, values)
|
||||
|
||||
  ## Build up diagnostic messages
  messages <- character(0)

  if(sum(results$n_unique >= threshold) == 0){

    messages <- c(messages, paste("No attributes have greater than", threshold, "unique values."))
  }

  if(sum(results$per_na >= maxna) == 0){

    messages <- c(messages, paste("No attributes have more than", maxna, "percent NA values."))
  }

  for (i in 1:nrow(results)) {
    if(results$n_unique[i] >= threshold){

      messages <- c(messages, paste0("The attribute '", results$attribute[i], "' has a large number of unique values. Please check."))
    }

    if(results$per_na[i] >= maxna){

      messages <- c(messages, paste0("The attribute '", results$attribute[i], "' has a large number of NA values. Please check."))
    }
  }

  ## Single print message
  printMessage <- paste(messages, collapse = "\n")
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
results <-
|
||||
results %>%
|
||||
select(Attributes = "attribute",
|
||||
`Unique values` = "n_unique",
|
||||
`Total missing values` = "sum_na",
|
||||
`% missing values` = "per_na")
|
||||
|
||||
return(utils::head(results, n_var))
|
||||
|
||||
} else if(return == "text"){
|
||||
|
||||
printMessage
|
||||
|
||||
} else if(return == "message"){
|
||||
|
||||
message(printMessage)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Error: please check inputs for `return`")
|
||||
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,113 @@
|
|||
#' @title Identify employees who have churned from the dataset
|
||||
#'
|
||||
#' @description
|
||||
#' This function identifies and counts the number of employees who have churned from
|
||||
#' the dataset by measuring whether an employee who is present in the first `n` (n1) weeks
|
||||
#' of the data is present in the last `n` (n2) weeks of the data.
|
||||
#'
|
||||
#' @details
|
||||
#' An additional use case of this function is the ability to identify "new-joiners" by using
|
||||
#' the argument `flip`.
|
||||
#'
|
||||
#' @param data A Person Query as a data frame. Must contain a `PersonId`.
|
||||
#' @param n1 A numeric value specifying the number of weeks at the beginning of the period
|
||||
#' that defines the measured employee set. Defaults to 6.
|
||||
#' @param n2 A numeric value specifying the number of weeks at the end of the period
|
||||
#' to calculate whether employees have churned from the data. Defaults to 6.
|
||||
#' @param return String specifying what to return. Defaults to "message", with options to
|
||||
#' return a character string ("text") or the `PersonId` of employees who have been identified
|
||||
#' as churned ("data").
|
||||
#' @param flip Logical, defaults to FALSE. This determines whether to reverse the logic of identifying the
|
||||
#' non-overlapping set. If set to `TRUE`, this effectively identifies new-joiners, or those
|
||||
#' who were not present in the first n weeks of the data but were present in the final n weeks.
|
||||
#'
|
||||
#' @details
|
||||
#' If an employee is present in the first `n` weeks of the data but not present in the last
|
||||
#' `n` weeks of the data, the function considers the employee as churned. As the measurement period
|
||||
#' is defined by the number of weeks from the start and the end of the passed data frame, you
|
||||
#' may consider filtering the dates accordingly before running this function.
|
||||
#'
|
||||
#' Another assumption that is in place is that any employee whose `PersonId` is not available in the
|
||||
#' data has churned. Note that there may be other reasons why an employee's `PersonId` may not
|
||||
#' be present, e.g. maternity/paternity leave, Workplace Analytics license has been removed,
|
||||
#' shift to a low-collaboration role (to the extent that he/she becomes inactive).
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' sq_data %>% identify_churn(n1 = 3, n2 = 3, return = "message")
|
||||
#' }
|
||||
#' @export
|
||||
identify_churn <- function(data,
|
||||
n1 = 6,
|
||||
n2 = 6,
|
||||
return = "message",
|
||||
flip = FALSE){
|
||||
|
||||
data$Date <- as.Date(data$Date, format = "%m/%d/%Y") # Ensure correct format
|
||||
|
||||
unique_dates <- unique(data$Date) # Vector of unique dates
|
||||
|
||||
nlen <- length(unique_dates) # Total number of unique dates
|
||||
|
||||
# First and last n weeks
|
||||
firstnweeks <- sort(unique_dates)[1:n1]
|
||||
lastnweeks <- sort(unique_dates, decreasing = TRUE)[1:n2]
|
||||
|
||||
## People in the first week
|
||||
first_peeps <-
|
||||
data %>%
|
||||
dplyr::filter(Date %in% firstnweeks) %>%
|
||||
dplyr::pull(PersonId) %>%
|
||||
unique()
|
||||
|
||||
## People in the last week
|
||||
final_peeps <-
|
||||
data %>%
|
||||
dplyr::filter(Date %in% lastnweeks) %>%
|
||||
dplyr::pull(PersonId) %>%
|
||||
unique()
|
||||
|
||||
if(flip == FALSE){
|
||||
|
||||
## In first, not in last
|
||||
churner_id <- setdiff(first_peeps, final_peeps)
|
||||
|
||||
## Message
|
||||
printMessage <-
|
||||
paste0("Churn:\nThere are ", length(churner_id),
|
||||
" employees from ", min(firstnweeks), " to ",
|
||||
max(firstnweeks), " (", n1, " weeks)",
|
||||
" who are no longer present in ",
|
||||
min(lastnweeks), " to ", max(lastnweeks),
|
||||
" (", n2, " weeks).")
|
||||
|
||||
} else if(flip == TRUE){
|
||||
|
||||
## In last, not in first
|
||||
## new joiners
|
||||
churner_id <- dplyr::setdiff(final_peeps, first_peeps)
|
||||
|
||||
## Message
|
||||
printMessage <-
|
||||
paste0("New joiners:\nThere are ", length(churner_id),
|
||||
" employees from ", min(lastnweeks), " to ",
|
||||
max(lastnweeks), " (", n2, " weeks)",
|
||||
" who were not present in ",
|
||||
min(firstnweeks), " to ", max(firstnweeks),
|
||||
" (", n1, " weeks).")
|
||||
|
||||
} else {
|
||||
stop("Invalid argument for `flip`")
|
||||
}
|
||||
|
||||
|
||||
|
||||
if(return == "message"){
|
||||
message(printMessage)
|
||||
} else if(return == "text"){
|
||||
printMessage
|
||||
} else if(return == "data"){
|
||||
churner_id
|
||||
} else {
|
||||
stop("Invalid `return`")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,105 @@
|
|||
#' @title Identify Holiday Weeks
|
||||
#'
|
||||
#' @description
|
||||
#' This function scans a standard query output for weeks where collaboration hours are far below the mean.
|
||||
#' Returns a list of weeks that appear to be holiday weeks and optionally an edited dataframe with outliers removed.
|
||||
#' By default, missing values are excluded.
|
||||
#'
|
||||
#' As best practice, run this function prior to any analysis to remove atypical collaboration weeks from your dataset.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param sd The standard deviation below the mean for collaboration hours that should define an outlier week. Enter a positive number. Default is 1 standard deviation.
|
||||
#' @param return String to specify what to return
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @importFrom methods is
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a message by default (`return` = "message").
|
||||
#' When 'message' is passed, a message is printed identifying holiday weeks.
|
||||
#' When 'data' is passed, a dataset with outlier weeks flagged in a new column is returned as a dataframe.
|
||||
#' When 'data_cleaned' is passed, a dataset with outlier weeks removed is returned as a dataframe.
|
||||
#' When 'data_dirty' is passed, a dataset with only outlier weeks is returned as a dataframe.
#' When 'plot' is passed, a plot with holiday weeks highlighted is returned as a ggplot object.
#' When 'text' is passed, the diagnostic message is returned as a character string.
|
||||
#'
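#' @examples
#' # Illustrative usage with the bundled sample query `sq_data`:
#' \dontrun{
#' identify_holidayweeks(sq_data)
#' identify_holidayweeks(sq_data, sd = 1, return = "plot")
#' }
#'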
|
||||
#' @export
|
||||
identify_holidayweeks <- function(data, sd = 1, return = "message"){
|
||||
|
||||
## Ensure date is formatted
|
||||
if(all(is_date_format(data$Date))){
|
||||
data$Date <- as.Date(data$Date, format = "%m/%d/%Y")
|
||||
} else if(is(data$Date, "Date")){
|
||||
# Do nothing
|
||||
} else {
|
||||
stop("`Date` appears not to be properly formatted.\n
|
||||
It needs to be in the format MM/DD/YYYY.\n
|
||||
Also check for missing values or stray values with inconsistent formats.")
|
||||
}
|
||||
|
||||
Calc <-
|
||||
data %>%
|
||||
group_by(Date) %>%
|
||||
summarize(mean_collab = mean(Collaboration_hours, na.rm = TRUE),.groups = 'drop') %>%
|
||||
mutate(z_score = (mean_collab - mean(mean_collab, na.rm = TRUE))/ sd(mean_collab, na.rm = TRUE))
|
||||
|
||||
  Outliers <- Calc$Date[Calc$z_score < -sd]
|
||||
|
||||
mean_collab_hrs <- mean(Calc$mean_collab, na.rm = TRUE)
|
||||
|
||||
Message <- paste0("The weeks where collaboration was ",
|
||||
sd,
|
||||
" standard deviations below the mean (",
|
||||
round(mean_collab_hrs, 1),
|
||||
") are: \n",
|
||||
paste(wrap(Outliers, wrapper = "`"),collapse = ", "))
|
||||
|
||||
myTable_plot <-
|
||||
data %>%
|
||||
mutate(holidayweek = (Date %in% Outliers)) %>%
|
||||
select("Date", "holidayweek", "Collaboration_hours") %>%
|
||||
group_by(Date) %>%
|
||||
summarise(Collaboration_hours=mean(Collaboration_hours), holidayweek=first(holidayweek)) %>%
|
||||
mutate(Date=as.Date(Date, format = "%m/%d/%Y"))
|
||||
|
||||
myTable_plot_shade <-
|
||||
myTable_plot %>%
|
||||
filter(holidayweek == TRUE) %>%
|
||||
mutate(min = Date - 3 , max = Date + 3 , ymin = -Inf, ymax = +Inf)
|
||||
|
||||
plot <-
|
||||
myTable_plot %>%
|
||||
ggplot(aes(x = Date, y = Collaboration_hours, group = 1)) +
|
||||
geom_line(colour = "grey40") +
|
||||
theme_wpa_basic() +
|
||||
geom_rect(data = myTable_plot_shade,
|
||||
aes(xmin = min, xmax = max, ymin = ymin, ymax = ymax),
|
||||
color="transparent", fill="steelblue", alpha=0.3) +
|
||||
labs(title = "Holiday Weeks",
|
||||
subtitle = "Showing average collaboration hours over time")+
|
||||
ylab("Collaboration Hours") +
|
||||
xlab("Date")
|
||||
|
||||
if(return == "text"){
|
||||
return(Message)
|
||||
} else if(return == "message"){
|
||||
message(Message)
|
||||
}else if(return %in% c("data_clean", "data_cleaned")){
|
||||
return(data %>% filter(!(Date %in% Outliers)) %>% data.frame())
|
||||
} else if(return == "data_dirty"){
|
||||
return(data %>% filter((Date %in% Outliers)) %>% data.frame())
|
||||
} else if(return == "data"){
|
||||
return(data %>% mutate(holidayweek = (Date %in% Outliers)) %>% data.frame())
|
||||
} else if(return == "plot"){
|
||||
return(plot)
|
||||
} else {
|
||||
stop("Error: please check inputs for `return`")
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,55 @@
|
|||
#' @title Identify Inactive Weeks
|
||||
#'
|
||||
#' @description
|
||||
#' This function scans a standard query output for weeks where collaboration hours are far below the mean for any individual person in the dataset.
|
||||
#' Returns a list of weeks that appear to be inactive weeks and optionally an edited dataframe with outliers removed.
|
||||
#'
|
||||
#' As best practice, run this function prior to any analysis to remove atypical collaboration weeks from your dataset.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param sd The standard deviation below the mean for collaboration hours that should define an outlier week. Enter a positive number. Default is 2 standard deviations.
|
||||
#' @param return String to specify what to return.
|
||||
#'
|
||||
#' @import dplyr
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a diagnostic message as a character string by default, where 'text' is passed in `return`.
#' When 'data_cleaned' is passed, a dataset with outlier weeks removed is returned as a dataframe.
#' When 'data_dirty' is passed, a dataset with only outlier weeks is returned as a dataframe.
#' When 'data' is passed, the full dataset with an additional `inactiveweek` flag column is returned as a dataframe.
|
||||
#'
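#' @examples
#' # Illustrative usage with the bundled sample query `sq_data`:
#' \dontrun{
#' identify_inactiveweeks(sq_data, sd = 2, return = "text")
#' }
#'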
|
||||
#' @export
|
||||
identify_inactiveweeks <- function(data, sd = 2, return = "text"){
|
||||
Calc <-
|
||||
data %>%
|
||||
group_by(PersonId) %>%
|
||||
mutate(z_score = (Collaboration_hours - mean(Collaboration_hours))/sd(Collaboration_hours)) %>%
|
||||
filter(z_score<= -sd) %>%
|
||||
select(PersonId,Date,z_score) %>%
|
||||
data.frame()
|
||||
|
||||
Message <- paste0("There are ",nrow(Calc), " rows of data with weekly collaboration hours more than ",sd," standard deviations below the mean.")
|
||||
|
||||
if(return == "text"){
|
||||
return(Message)
|
||||
|
||||
} else if(return == "data_dirty"){
|
||||
data %>% group_by(PersonId) %>% mutate(
|
||||
z_score = (Collaboration_hours - mean(Collaboration_hours))/sd(Collaboration_hours)) %>%
|
||||
filter(z_score<= -sd) %>% select(-z_score) %>% data.frame()
|
||||
|
||||
} else if(return == "data_cleaned"){
|
||||
data %>% group_by(PersonId) %>% mutate(
|
||||
z_score = (Collaboration_hours - mean(Collaboration_hours))/sd(Collaboration_hours)) %>%
|
||||
filter(z_score> -sd) %>% select(-z_score) %>% data.frame()
|
||||
|
||||
} else if(return == "data"){
|
||||
data %>% group_by(PersonId) %>% mutate(
|
||||
z_score = (Collaboration_hours - mean(Collaboration_hours))/sd(Collaboration_hours)) %>%
|
||||
mutate(inactiveweek = (z_score<= -sd)) %>% select(-z_score) %>% data.frame()
|
||||
|
||||
} else {
|
||||
stop("Error: please check inputs for `return`")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,96 @@
|
|||
#' @title Identify Non-Knowledge workers (nkw)
|
||||
#'
|
||||
#' @description
|
||||
#' This function scans a standard query output to identify employees with consistently low collaboration signals.
|
||||
#' Returns the % of non-knowledge workers identified by Organization, and optionally an edited dataframe with non-knowledge workers removed, or the full dataframe with the kw/nkw flag added.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param collab_threshold The collaboration hours threshold that should be exceeded as an average for the entire analysis period for the employee to be categorized as a knowledge worker ("kw").
|
||||
#' Enter a positive number. Default is set to 5 collaboration hours.
|
||||
#' @param return String to specify what to return
|
||||
#'
|
||||
#' @import dplyr
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a summary table by organization by default, where 'data_summary' is passed in `return`.
#' The summary table lists the number and % of non-knowledge workers per organization.
#' When 'data_with_flag' is passed, a copy of your original dataframe with an additional column containing the kw/nkw flag will be returned.
#' When 'data_clean' is passed, a full dataframe excluding "nkw" employees will be returned.
#' When 'text' is passed, a diagnostic message is returned as a character string.
|
||||
#'
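#' @examples
#' # Illustrative usage with the bundled sample query `sq_data`:
#' \dontrun{
#' identify_nkw(sq_data, collab_threshold = 5, return = "data_summary")
#' }
#'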
|
||||
#' @export
|
||||
identify_nkw <- function(data, collab_threshold = 5, return = "data_summary"){
|
||||
|
||||
options(dplyr.summarise.inform = FALSE)
|
||||
|
||||
summary_byPersonId <-
|
||||
data %>%
|
||||
group_by(PersonId, Organization) %>%
|
||||
summarize(mean_collab = mean(Collaboration_hours))%>%
|
||||
mutate(flag_nkw = case_when(mean_collab > collab_threshold ~ "kw",
|
||||
TRUE ~ "nkw"))
|
||||
|
||||
data_with_flag <- left_join(data,
|
||||
summary_byPersonId %>%
|
||||
dplyr::select(PersonId,flag_nkw), by = 'PersonId')
|
||||
|
||||
summary_byOrganization <-
|
||||
summary_byPersonId %>%
|
||||
group_by(Organization, flag_nkw)%>%
|
||||
summarise(total = n())%>%
|
||||
group_by(Organization)%>%
|
||||
mutate(perc = total/sum(total))%>% #need to format to %
|
||||
filter(flag_nkw == "nkw")%>%
|
||||
rename(n_nkw = total, perc_nkw = perc)%>%
|
||||
select(-flag_nkw) %>%
|
||||
ungroup()
|
||||
|
||||
## Number of NKW identified
|
||||
n_nkw <- sum(summary_byPersonId$flag_nkw == "nkw")
|
||||
|
||||
|
||||
if(n_nkw == 0){
|
||||
flagMessage <- paste0("[Pass] There are no non-knowledge workers identified",
|
||||
" (average collaboration hours below ",
|
||||
collab_threshold,
|
||||
" hours).")
|
||||
} else {
|
||||
flagMessage <-
|
||||
paste0("[Warning] Out of a population of ", n_distinct(data$PersonId),
|
||||
", there are ", n_nkw,
|
||||
" employees who may be non-knowledge workers (average collaboration hours below ",
|
||||
collab_threshold, " hours).")
|
||||
}
|
||||
|
||||
if(return == "data_with_flag"){
|
||||
return(data_with_flag)
|
||||
|
||||
} else if(return %in% c("data_clean", "data_cleaned")){
|
||||
|
||||
return(data_with_flag %>% filter(flag_nkw == "kw") %>% data.frame())
|
||||
|
||||
} else if(return == "text"){
|
||||
|
||||
flagMessage
|
||||
|
||||
} else if(return =="data_summary"){
|
||||
|
||||
summary_byOrganization %>%
|
||||
mutate(perc_nkw = scales::percent(perc_nkw, accuracy = 1)) %>%
|
||||
rename(`Non-knowledge workers (count)` = "n_nkw",
|
||||
`Non-knowledge workers (%)` = "perc_nkw")
|
||||
|
||||
} else {
|
||||
stop("Error: please check inputs for `return`")
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,51 @@
|
|||
#' @title Identify outliers across time
|
||||
#'
|
||||
#' This function takes in a selected metric and uses
|
||||
#' z-score (number of standard deviations) to identify outliers
|
||||
#' across time. There are applications in this for identifying
|
||||
#' weeks with abnormally low collaboration activity, e.g. holidays.
|
||||
#' Time as a grouping variable can be overridden with the `group_var`
|
||||
#' argument.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param group_var A string with the name of the grouping variable.
|
||||
#' Defaults to `Date`.
|
||||
#' @param metric Character string containing the name of the metric,
|
||||
#' e.g. "Collaboration_hours"
|
||||
#'
|
||||
#' @import dplyr
|
||||
#'
|
||||
#' @examples
|
||||
#' identify_outlier(sq_data, metric = "Collaboration_hours")
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a data frame with `Date` (if grouping variable is not set),
|
||||
#' the metric, and the corresponding z-score.
|
||||
#'
|
||||
#' @family General
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @export
|
||||
identify_outlier <- function(data,
|
||||
group_var = "Date",
|
||||
metric = "Collaboration_hours"){
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c(group_var,
|
||||
"PersonId",
|
||||
metric)
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
main_table <-
|
||||
data %>%
|
||||
group_by(!!sym(group_var)) %>%
|
||||
summarise_at(vars(!!sym(metric)), ~mean(.)) %>%
|
||||
mutate(zscore = (!!sym(metric) - mean(!!sym(metric)))/sd(!!sym(metric)))
|
||||
|
||||
return(main_table)
|
||||
}
|
||||
|
|
@ -0,0 +1,81 @@
|
|||
#' @title Identify groups under privacy threshold
|
||||
#'
|
||||
#' @description
|
||||
#' This function scans a standard query output for groups of employees under the privacy threshold.
#' The method consists of reviewing each individual HR attribute and counting the distinct people within each group.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar A list of HR Variables to consider in the scan.
|
||||
#' Defaults to all HR attributes identified.
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
|
||||
#' Defaults to 5.
|
||||
#' @param return A character vector specifying what to return.
|
||||
#' Valid values include "table" (default), "message", and "text".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom stats reorder
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a summary table as a data frame by default, where 'table' is passed in `return`.
#' When 'message' or 'text' is passed, a diagnostic message flagging groups under the privacy threshold is returned.
|
||||
#'
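#' @examples
#' # Illustrative usage with the bundled sample query `sq_data`:
#' \dontrun{
#' identify_privacythreshold(sq_data, mingroup = 5, return = "message")
#' }
#'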
|
||||
#' @export
|
||||
|
||||
identify_privacythreshold <- function(data,
|
||||
hrvar = extract_hr(data),
|
||||
mingroup = 5,
|
||||
return = "table"){
|
||||
|
||||
results <-
|
||||
data %>% hrvar_count(
|
||||
hrvar = hrvar[1],
|
||||
return = "table")
|
||||
|
||||
results$hrvar <- ""
|
||||
|
||||
results <- results[0,]
|
||||
|
||||
for (p in hrvar) {
|
||||
table1 <-
|
||||
data %>%
|
||||
hrvar_count(hrvar = p,
|
||||
return = "table")
|
||||
|
||||
table1$hrvar <- p
|
||||
colnames(table1)[1] <- "group"
|
||||
|
||||
results <- rbind(results,table1)
|
||||
}
|
||||
|
||||
output <- results %>% arrange(n) %>% select(hrvar, everything())
|
||||
|
||||
groups_under <- results %>% filter(n<mingroup) %>% nrow()
|
||||
|
||||
MinGroupFlagMessage_Warning <- paste0("[Warning] There are ", groups_under, " groups under the minimum group size privacy threshold of ", mingroup, ".")
|
||||
MinGroupFlagMessage_Low <- paste0("[Pass] There is only ", groups_under, " group under the minimum group size privacy threshold of ", mingroup, ".")
|
||||
MinGroupFlagMessage_Zero <- paste0("[Pass] There are no groups under the minimum group size privacy threshold of ", mingroup, ".")
|
||||
|
||||
|
||||
if(groups_under > 1){
|
||||
MinGroupFlagMessage <- MinGroupFlagMessage_Warning
|
||||
} else if(groups_under == 1 ){
|
||||
MinGroupFlagMessage <- MinGroupFlagMessage_Low
|
||||
} else if(groups_under ==0){
|
||||
MinGroupFlagMessage <- MinGroupFlagMessage_Zero
|
||||
}
|
||||
|
||||
if(return == "table"){
|
||||
return(output)
|
||||
} else if(return == "message"){
|
||||
message(MinGroupFlagMessage)
|
||||
} else if(return == "text"){
|
||||
MinGroupFlagMessage
|
||||
} else {
|
||||
stop("Invalid `return` argument.")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,70 @@
|
|||
#' @title Identify the query type of the passed data frame
|
||||
#'
|
||||
#' @description
|
||||
#' Pass a Workplace Analytics dataset and return the identified
|
||||
#' query type as a string. This function uses variable name string
|
||||
#' matching to 'guess' the query type of the data frame.
|
||||
#'
|
||||
#' @param data A Workplace Analytics dataset in the form of a data frame.
|
||||
#' If the data is not identified as a Workplace Analytics dataset, the function
|
||||
#' will return an error.
|
||||
#' @param threshold Debugging use only. Increase to raise the 'strictness' of the
|
||||
#' guessing algorithm. Defaults to 2.
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' identify_query(sq_data) # Standard query
|
||||
#' identify_query(mt_data) # Meeting query
|
||||
#' identify_query(em_data) # Hourly collaboration query
|
||||
#' identify_query(iris) # Will return an error
|
||||
#' identify_query(mtcars) # Will return an error
|
||||
#' }
|
||||
#'
|
||||
#' @export
|
||||
identify_query <- function(data, threshold = 2){
|
||||
|
||||
## variables to check for in each query type
|
||||
spq_var <- c("PersonId", "Collaboration_hours", "Instant_Message_hours") # Standard Person query
|
||||
caq_var <- c("PersonId", "Collaboration_hrs", "Instant_message_hours") # Collaboration Assessment query
|
||||
smq_var <- c("MeetingId", "Date", "Attendees") # Standard Meeting Query
|
||||
shc_var <- c("PersonId", "Emails_sent_00_01", "IMs_sent_23_24") # Standard Hourly Collaboration
|
||||
|
||||
## see if there are columns which do not exist
|
||||
spq_check <- check_inputs(input = data, requirements = spq_var, return = "names")
|
||||
caq_check <- check_inputs(input = data, requirements = caq_var, return = "names")
|
||||
smq_check <- check_inputs(input = data, requirements = smq_var, return = "names")
|
||||
shc_check <- check_inputs(input = data, requirements = shc_var, return = "names")
|
||||
|
||||
## length of checks
|
||||
spq_check_n <- length(spq_check)
|
||||
caq_check_n <- length(caq_check)
|
||||
smq_check_n <- length(smq_check)
|
||||
shc_check_n <- length(shc_check)
|
||||
total_check_vec <- c(spq_check_n, caq_check_n, smq_check_n, shc_check_n)
|
||||
|
||||
## should never be zero
|
||||
total_check_n <- sum(total_check_vec, na.rm = TRUE)
|
||||
|
||||
## Labels
|
||||
qlabels <- c("Person Query",
|
||||
"Collaboration Assessment Query",
|
||||
"Meeting Query",
|
||||
"Hourly Collaboration Query")
|
||||
|
||||
## Minimum number of non-matches
|
||||
min_nm <- min(total_check_vec)
|
||||
|
||||
## Final guess
|
||||
f_guess <- qlabels[which.min(total_check_vec)]
|
||||
|
||||
if(total_check_n == 0){
|
||||
stop("Error: please check if query data is properly formatted query data.")
|
||||
} else if(min_nm > threshold){
|
||||
stop("Column mismatches: please check if query data is properly formatted query data.")
|
||||
} else {
|
||||
f_guess
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,63 @@
|
|||
#' @title Identify shifts based on outlook time settings for work day start and end time
|
||||
#'
|
||||
#' @description
|
||||
#' This function uses outlook calendar settings for start and end time of work day to identify
|
||||
#' work shifts. The relevant variables are `WorkingStartTimeSetInOutlook` and
|
||||
#' `WorkingEndTimeSetInOutlook`.
|
||||
#'
|
||||
#' @param data A data frame containing data from the Hourly Collaboration query.
|
||||
#'
|
||||
#' @param return Character vector to specify what to return.
|
||||
#' "table" (default) returns a summary table of the counts of shifts.
|
||||
#' "data" returns the original input data frame with the `Shifts` column appended.
|
||||
#'
|
||||
#' @importFrom data.table ":=" "%like%" "%between%"
|
||||
#'
|
||||
#' @examples
|
||||
#' dv_data %>% identify_shifts()
|
||||
#'
|
||||
#' @export
|
||||
identify_shifts <- function(data, return = "plot"){
|
||||
|
||||
clean_times <- function(x){
|
||||
out <- gsub(pattern = ":00", replacement = "", x = x)
|
||||
as.numeric(out)
|
||||
}
|
||||
|
||||
data <- data.table::as.data.table(data)
|
||||
# data <- data.table::copy(data)
|
||||
|
||||
# Make sure data.table knows we know we're using it
|
||||
.datatable.aware = TRUE
|
||||
|
||||
data[, Shifts := paste(WorkingStartTimeSetInOutlook, WorkingEndTimeSetInOutlook, sep = "-")]
|
||||
|
||||
|
||||
# outputTable <- data[, .(count = .N), by = Shifts]
|
||||
outputTable <- data[, list(WeekCount = .N,
|
||||
PersonCount = dplyr::n_distinct(PersonId)), by = Shifts]
|
||||
|
||||
outputTable <- data.table::setorder(outputTable, -PersonCount)
|
||||
|
||||
|
||||
if(return == "table"){
|
||||
dplyr::as_tibble(outputTable)
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
outputTable %>%
|
||||
utils::head(10) %>%
|
||||
create_bar_asis(group_var = "Shifts",
|
||||
bar_var = "WeekCount",
|
||||
title = "Most frequent outlook shifts",
|
||||
subtitle = "Showing top 10 only",
|
||||
caption = extract_date_range(data, return = "text"),
|
||||
ylab = "Shifts",
|
||||
xlab = "Frequency")
|
||||
|
||||
} else if(return == "data"){
|
||||
|
||||
output_data <- data
|
||||
dplyr::as_tibble(output_data)
|
||||
}
|
||||
}
|
|
@ -0,0 +1,114 @@
|
|||
#' @title Tenure calculation based on different input dates, returns data summary table or histogram
|
||||
#'
|
||||
#' @description
|
||||
#' This function calculates employee tenure based on different input dates.
|
||||
#' `identify_tenure` uses the latest Date available if user selects "Date",
|
||||
#' but also has the flexibility to select a specific date, e.g. "1/1/2020".
|
||||
#'
|
||||
#' @family Data Validation
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param end_date A string specifying the name of the date variable representing the latest date. Defaults to "Date".
|
||||
#' @param beg_date A string specifying the name of the date variable representing the hire date. Defaults to "HireDate".
|
||||
#' @param maxten A numeric value representing the maximum tenure.
|
||||
#' If the tenure exceeds this threshold, it would be accounted for in the flag message.
|
||||
#' @param return String to specify what to return.
|
||||
#' Defaults to "message".
|
||||
#' Other valid values include "text", "plot", "data_cleaned", "data_dirty", and "data".
|
||||
#' For "data", a data frame with the `PersonId` and a calculated variable called `TenureYear` is returned.
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' # Add HireDate to sq_data - method #1
|
||||
#' sq_data2 <- sq_data %>% mutate(HireDate = as.Date("1/1/2015", format = "%m/%d/%Y" ))
|
||||
#' identify_tenure(sq_data2)
|
||||
#'
|
||||
#' # Add HireDate to sq_data - method #2
|
||||
#' sq_data$HireDate <-
|
||||
#' rep(sample(seq(as.Date('1975/01/01'),
|
||||
#' as.Date('2019/11/01'), by="day"), 15119),2)
|
||||
#'
|
||||
#' identify_tenure(sq_data)
|
||||
#' }
|
||||
#'
|
||||
#' @export
|
||||
identify_tenure <- function(data,
|
||||
end_date = "Date",
|
||||
beg_date = "HireDate",
|
||||
maxten = 40,
|
||||
return = "message"){
|
||||
|
||||
required_variables <- c("HireDate")
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
data_prep <-
|
||||
data %>%
|
||||
mutate(Date = as.Date(Date, format= "%m/%d/%Y"), # Re-format `Date`
|
||||
end_date = as.Date(!!sym(end_date), format= "%m/%d/%Y"), # Access a symbol, not a string
|
||||
beg_date = as.Date(!!sym(beg_date), format= "%m/%d/%Y")) %>% # Access a symbol, not a string
|
||||
arrange(end_date) %>%
|
||||
mutate(End = last(end_date))
|
||||
|
||||
last_date <- data_prep$End
|
||||
|
||||
# graphing data
|
||||
tenure_summary <-
|
||||
data_prep %>%
|
||||
filter(Date == last_date) %>%
|
||||
mutate(tenure_years = (Date - beg_date)/365) %>%
|
||||
group_by(tenure_years)%>%
|
||||
summarise(n = n(),.groups = 'drop')
|
||||
|
||||
|
||||
  # odd person IDs (tenure at or above `maxten`)
|
||||
oddpeople <-
|
||||
data_prep %>%
|
||||
filter(Date == last_date) %>%
|
||||
mutate(tenure_years = (Date - beg_date)/365) %>%
|
||||
filter(tenure_years >= maxten) %>%
|
||||
select(PersonId)
|
||||
|
||||
# message
|
||||
Message <- paste0("The mean tenure is ",round(mean(tenure_summary$tenure_years,na.rm = TRUE),1)," years.\nThe max tenure is ",
|
||||
round(max(tenure_summary$tenure_years,na.rm = TRUE),1),".\nThere are ",
|
||||
length(tenure_summary$tenure_years[tenure_summary$tenure_years>=maxten])," employees with a tenure greater than ",maxten," years.")
|
||||
|
||||
if(return == "text"){
|
||||
return(Message)
|
||||
|
||||
} else if(return == "message"){
|
||||
message(Message)
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
suppressWarnings(
|
||||
ggplot(data = tenure_summary,aes(x = as.numeric(tenure_years))) +
|
||||
geom_density() +
|
||||
labs(title = "Tenure - Density",
|
||||
subtitle = "Calculated with `HireDate`") +
|
||||
xlab("Tenure in Years") +
|
||||
ylab("Density - number of employees") +
|
||||
theme_wpa_basic()
|
||||
)
|
||||
|
||||
} else if(return == "data_cleaned"){
|
||||
return(data %>% filter(!(PersonId %in% oddpeople$PersonId)) %>% data.frame())
|
||||
|
||||
} else if(return == "data_dirty"){
|
||||
return(data %>% filter((PersonId %in% oddpeople$PersonId)) %>% data.frame())
|
||||
|
||||
} else if(return == "data"){
|
||||
|
||||
data_prep %>%
|
||||
filter(Date == last_date) %>%
|
||||
mutate(TenureYear = as.numeric((Date - beg_date)/365)) %>%
|
||||
select(PersonId, TenureYear)
|
||||
|
||||
} else {
|
||||
stop("Error: please check inputs for `return`")
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,52 @@
|
|||
#' @title Import a Workplace Analytics Query
|
||||
#'
|
||||
#' @description
|
||||
#' Import a Workplace Analytics Query from a local CSV File, with variable classifications optimised
|
||||
#' for other **wpa** functions.
|
||||
#'
|
||||
#' @details
|
||||
#' `import_wpa()` uses `data.table::fread()` to import CSV files for speed,
|
||||
#' and by default `stringsAsFactors` is set to FALSE.
|
||||
#' A data frame is returned by the function (not a `data.table`).
|
||||
#'
|
||||
#' @param x String containing the path to the Workplace Analytics query to be imported.
|
||||
#' The input file must be a CSV file, and the file extension must be explicitly entered,
|
||||
#' e.g. "/files/standard query.csv"
|
||||
#' @param standardise logical. If TRUE, `import_wpa()` runs `standardise_pq()` to make a Collaboration
|
||||
#' Assessment query's columns name standard and consistent with a Standard Person Query. Note that this
|
||||
#' will have no effect if the query being imported is not a Collaboration Assessment query. Defaults
|
||||
#' as FALSE.
|
||||
#'
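#' @examples
#' # Illustrative call; the path below is a placeholder and should point to an
#' # exported Workplace Analytics query saved as a CSV file:
#' \dontrun{
#' my_query <- import_wpa("/files/standard query.csv")
#' }
#'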
|
||||
#' @export
|
||||
import_wpa <- function(x, standardise = FALSE){
|
||||
|
||||
return_data <- data.table::fread(x, stringsAsFactors = FALSE) %>% as.data.frame()
|
||||
|
||||
# Columns which are Dates
|
||||
dateCols <- sapply(return_data, function(x) all(is_date_format(x)))
|
||||
dateCols <- dateCols[dateCols == TRUE]
|
||||
|
||||
return_data <-
|
||||
return_data %>%
|
||||
dplyr::mutate_at(dplyr::vars(names(dateCols)), ~as.Date(., format = "%m/%d/%Y"))
|
||||
|
||||
message("Query has been imported successfully!")
|
||||
|
||||
## Query check only available for Person Queries
|
||||
if(identify_query(return_data) == "Person Query"){
|
||||
check_query(return_data)
|
||||
}
|
||||
|
||||
## Standardise query if `standardise == TRUE`
|
||||
if(standardise == TRUE & identify_query(return_data) == "Collaboration Assessment Query"){
|
||||
message("Standardising column names for a Collaboration Assessment query to
|
||||
a Person query...")
|
||||
return_data <- standardise_pq(return_data)
|
||||
}
|
||||
|
||||
dplyr::as_tibble(return_data)
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,7 @@
|
|||
.onAttach <- function(libname, pkgname) {
|
||||
message <- c("\n Thank you for using the {wpa} R package!",
|
||||
"\n \n Our analysts have taken every care to ensure that this package runs smoothly and bug-free.",
|
||||
"\n However, if you do happen to encounter any, please email mac@microsoft.com to report any issues.",
|
||||
"\n \n Happy coding!")
|
||||
packageStartupMessage(message)
|
||||
}
|
|
@ -0,0 +1,111 @@
|
|||
#' @title Plot the internal network metrics for an HR variable
#'
#' @description
#' Plot the internal network metrics for an HR variable.
|
||||
#'
|
||||
#' @param data Person Query as a dataframe including date column named "Date"
|
||||
#' This function assumes the data format is MM/DD/YYYY as is standard in a WpA query output.
|
||||
#' @param hrvar WpA variable for an HR variable to group networks by
|
||||
#' For example, "Layer"
|
||||
#' @param mingroup Numeric vector for minimum group size for aggregation
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#' @param bubble_size A numeric vector of length two to specify the size range of the bubbles
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import reshape2
|
||||
#' @import ggplot2
|
||||
#' @import ggrepel
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
#' internal_network_plot(sq_data)
|
||||
#' }
|
||||
#'
|
||||
#' @family Connectivity
|
||||
#'
|
||||
#' @export
|
||||
|
||||
internal_network_plot <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot",
|
||||
bubble_size = c(1, 5))
|
||||
{
|
||||
plot_data <-
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar))
|
||||
|
||||
plot_data <-
|
||||
plot_data %>%
|
||||
group_by(group) %>%
|
||||
summarize(Int_network_size = mean(Internal_network_size),
|
||||
Int_network_breadth = mean(Networking_outside_organization),
|
||||
Employee_count = n_distinct(PersonId)
|
||||
) %>%
|
||||
filter(Employee_count >= mingroup)
|
||||
|
||||
# Create summary table
|
||||
summary_table <- plot_data %>% arrange(., desc(Int_network_size))
|
||||
|
||||
# Create plot
|
||||
int_network_plot <-ggplot(plot_data,
|
||||
aes(x=Int_network_size,
|
||||
y=Int_network_breadth)
|
||||
) +
|
||||
geom_point(aes(size=Employee_count),
|
||||
color = rgb2hex(0, 120, 212),
|
||||
alpha = 0.5) +
|
||||
geom_text_repel(label=plot_data$group) +
|
||||
scale_x_continuous(name = "Internal Network Size") +
|
||||
scale_y_continuous(name = "Internal Network Breadth") +
|
||||
scale_size(range = bubble_size) +
|
||||
theme_classic() +
|
||||
theme(
|
||||
axis.text = element_text(size = 10),
|
||||
axis.text.x = element_text(
|
||||
angle = 90,
|
||||
hjust = 1,
|
||||
vjust = 0.5
|
||||
),
|
||||
plot.title = element_text(
|
||||
color = "grey40",
|
||||
face = "bold",
|
||||
size = 18
|
||||
),
|
||||
plot.subtitle = element_text(size = 14)
|
||||
) +
|
||||
labs(
|
||||
title = paste("Internal network metrics by", hrvar),
|
||||
subtitle = paste(
|
||||
"Network size is number of people, breadth is number of organizations"
|
||||
)
|
||||
) +
|
||||
labs(
|
||||
caption = paste(
|
||||
"Total employees =",
|
||||
sum(plot_data$Employee_count),
|
||||
"| Data from",
|
||||
min(as.Date(data$Date, "%m/%d/%Y")),
|
||||
"to",
|
||||
max(as.Date(data$Date, "%m/%d/%Y"))
|
||||
)
|
||||
)
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
summary_table %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(int_network_plot)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
|
||||
}
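# Illustrative usage (a sketch, not run at package load; `sq_data` is the demo
# Standard Person Query bundled with {wpa}):
# internal_network_plot(sq_data, hrvar = "Organization")                   # bubble plot
# internal_network_plot(sq_data, hrvar = "Organization", return = "table") # summary table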
|
|
@ -0,0 +1,19 @@
|
|||
#' @title Identify whether string is a date format
|
||||
#'
|
||||
#' @description
|
||||
#' This function uses regular expression to determine whether
|
||||
#' a string is of the format "mdy", separated by "-", "/",
|
||||
#' or ".", returning a logical vector.
|
||||
#'
|
||||
#' @param string Character string to test whether is a date format.
|
||||
#'
|
||||
#' @examples
|
||||
#' \dontrun{
|
||||
#' is_date_format("1/5/2020")
|
||||
#' }
|
||||
#'
|
||||
#' @export
|
||||
is_date_format <- function(string){
|
||||
grepl("^\\d{1,2}[- /.]\\d{1,2}[- /.]\\d{1,4}$",
|
||||
string)
|
||||
}
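# Illustrative behaviour of the regex above (a sketch; expected results shown as comments):
# is_date_format("1/5/2020")    # TRUE  - m/d/y with "/" separator
# is_date_format("01-05-2020")  # TRUE  - "-" separator also accepted
# is_date_format("2020/05/01")  # FALSE - year-first strings do not match
# is_date_format("1 May 2020")  # FALSE - month names are not recognised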
|
|
@ -0,0 +1,183 @@
|
|||
# #' @title Create an exploratory overview of Standard Query dataset
|
||||
# #'
|
||||
# #' @description
|
||||
# #' Input a Standard Query dataset as a data frame in the argument, returning an overview of the key metrics.
|
||||
# #'
|
||||
# #' @param data A Standard Query dataset in the form of a data frame.
|
||||
# #' @param mingroup Numeric value setting the privacy threshold / minimum group size.
|
||||
# #'
|
||||
# #' @import dplyr
|
||||
# #' @import ggplot2
|
||||
# #' @import reshape2
|
||||
# #'
|
||||
# #' @return
|
||||
# #' Returns a HTML report containing an overview of the key metrics in a Standard Person Query.
|
||||
# #'
|
||||
# #' @export
|
||||
#
|
||||
# keymetrics_report <- function(data, mingroup = 10){
|
||||
# ## Table by Person (Period Averages)
|
||||
# data %>%
|
||||
# group_by(PersonId) %>%
|
||||
# summarise(periods = n(),
|
||||
# Organization = first(Organization),
|
||||
# LevelDesignation = first(LevelDesignation)) -> mydata_person
|
||||
#
|
||||
# ## Plot available data by Organization
|
||||
# mydata_person %>% count(Organization, name = "count") -> mydata_OrgLevel
|
||||
#
|
||||
# mydata_OrgLevel %>%
|
||||
# ggplot(aes(x = Organization, y = 1, label=count)) +
|
||||
# geom_point(aes(size = count), colour = "steelblue", alpha = 0.5) +
|
||||
# scale_size_continuous(range = c(20, 50)) +
|
||||
# geom_label() +
|
||||
# ggtitle("Measured Individuals (by Organization)") +
|
||||
# ylab("") +
|
||||
# xlab("Organization") +
|
||||
# theme(legend.position="none",
|
||||
# axis.text.x = element_text(angle = 90, hjust = 1),
|
||||
# axis.title.y = element_blank(),
|
||||
# axis.text.y = element_blank(),
|
||||
# axis.ticks.y = element_blank()) -> plot_OrgLevel
|
||||
#
|
||||
# ## Plot available data by LevelDesignation
|
||||
# mydata_person %>% count(LevelDesignation, name = "count") -> mydata_LevelDesignation
|
||||
#
|
||||
# mydata_LevelDesignation %>%
|
||||
# ggplot(aes(x=LevelDesignation, y=1, label=count)) +
|
||||
# geom_point(aes(size = count), colour = "steelblue", alpha = 0.5) +
|
||||
# scale_size_continuous(range = c(20, 50)) +
|
||||
# geom_label() +
|
||||
# ggtitle("Measured Individuals (by Level Designation)") +
|
||||
# ylab("") + xlab("Level Designation") +
|
||||
# theme(legend.position="none",
|
||||
# axis.text.x = element_text(angle = 90, hjust = 1),
|
||||
# axis.title.y = element_blank(),
|
||||
# axis.text.y = element_blank(),
|
||||
# axis.ticks.y = element_blank(),
|
||||
# plot.title = element_text(color="grey40", face="bold", size=20)) -> plot_LevelDesignation
|
||||
#
|
||||
# ## Plot available data by Organization and Level
|
||||
# mydata_person %>% count(Organization, LevelDesignation, name = "count") -> mydata_OrgLevelDesignation
|
||||
#
|
||||
# mydata_OrgLevelDesignation %>%
|
||||
# ggplot(aes(x = Organization, y = LevelDesignation)) +
|
||||
# geom_tile(aes(fill = count), colour = "grey50") +
|
||||
# geom_text(aes(label = round(count, 1))) +
|
||||
# scale_fill_gradient(low = "lightblue", high = "blue",
|
||||
# limits=c(mingroup,NA)) +
|
||||
# labs(fill = "Count") +
|
||||
# ggtitle("Group Sizes (by Organization and Level)") +
|
||||
# ylab("Level Designation") +
|
||||
# theme(axis.text.x = element_text(angle = 45, hjust = 1),
|
||||
# plot.title = element_text(color="grey40", face="bold", size=20)) -> plot_OrgLevelDesignation
|
||||
#
|
||||
# ## Pipe in Manager Relationship
|
||||
# data %>% mgrrel_matrix(return = "plot") -> plot_ManagerRelationship
|
||||
#
|
||||
# ## Pipe in WITL
|
||||
# data %>% keymetrics_scan(hrvar = "Organization", return = "plot") -> plot_WITL
|
||||
#
|
||||
# ## Pipe in Meeting Habits
|
||||
# data %>% meeting_summary(hrvar = "Organization", return = "plot") -> plot_MeetingHabits
|
||||
#
|
||||
# ## Generate kables
|
||||
#
|
||||
# mydata_OrgLevel %>%
|
||||
# knitr::kable(caption = "Data by Organization") %>%
|
||||
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_OrgLevel
|
||||
#
|
||||
# mydata_LevelDesignation %>%
|
||||
# knitr::kable(caption = "Data by Level Designation") %>%
|
||||
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_LevelDesignation
|
||||
#
|
||||
# mydata_OrgLevelDesignation %>%
|
||||
# spread(LevelDesignation, count) %>%
|
||||
# knitr::kable(caption = "Data by Level Designation") %>%
|
||||
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_OrgLevelDesignation
|
||||
#
|
||||
# ## Pipe in Manager Relationship
|
||||
# data %>%
|
||||
# mgrrel_matrix(return = "table") %>%
|
||||
# knitr::kable(caption = "Distribution of Manager-Direct Relationship") %>%
|
||||
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_ManagerRelationship
|
||||
#
|
||||
# ## Pipe in WITL
|
||||
# data %>%
|
||||
# keymetrics_scan(hrvar = "Organization", return = "table") %>%
|
||||
# mutate_at(vars(-variable), ~round(., 1)) %>%
|
||||
# knitr::kable(caption = "Week in the Life - by Organization") %>%
|
||||
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_WITL
|
||||
#
|
||||
# ## Pipe in Meeting Habits
|
||||
# data %>%
|
||||
# meeting_summary(hrvar = "Organization", return = "table") %>%
|
||||
# mutate_at(vars(-variable), ~round(., 1)) %>%
|
||||
# knitr::kable(caption = "Meeting Habits - by Organization") %>%
|
||||
# kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE) -> tb_MeetingHabits
|
||||
#
|
||||
# markobj <- c('---',
|
||||
# 'title: "WPA Standard Query Overview"',
|
||||
# 'output: ',
|
||||
# ' html_document:',
|
||||
# ' theme: united',
|
||||
# ' toc: true',
|
||||
# ' toc_float:',
|
||||
# ' collapsed: false',
|
||||
# ' smooth_scroll: true',
|
||||
#
|
||||
# '---',
|
||||
# '',
|
||||
# '',
|
||||
#
|
||||
# '### Week in the Life',
|
||||
# '```{r, echo=FALSE}',
|
||||
# 'plot_WITL',
|
||||
# '```',
|
||||
# '```{r, echo=FALSE}',
|
||||
# 'tb_WITL',
|
||||
# '```',
|
||||
#
|
||||
# '### Organization Level',
|
||||
#
|
||||
# '```{r, echo=FALSE}',
|
||||
# 'tb_OrgLevel',
|
||||
# '```',
|
||||
#
|
||||
# '### Level Designation',
|
||||
# '```{r, echo=FALSE}',
|
||||
# 'plot_LevelDesignation',
|
||||
# '```',
|
||||
# '```{r, echo=FALSE}',
|
||||
# 'tb_LevelDesignation',
|
||||
# '```',
|
||||
#
|
||||
# '### Organization and Level Designation',
|
||||
# '```{r, echo=FALSE}',
|
||||
# 'plot_OrgLevelDesignation',
|
||||
# '```',
|
||||
# '```{r, echo=FALSE}',
|
||||
# 'tb_OrgLevelDesignation',
|
||||
# '```',
|
||||
#
|
||||
# '### Meeting Habits',
|
||||
# '```{r, echo=FALSE}',
|
||||
# 'plot_MeetingHabits',
|
||||
# '```',
|
||||
# '```{r, echo=FALSE}',
|
||||
# 'tb_MeetingHabits',
|
||||
# '```',
|
||||
#
|
||||
# '### Manager-direct relationship Distribution',
|
||||
# '```{r, echo=FALSE}',
|
||||
# 'plot_ManagerRelationship',
|
||||
# '```',
|
||||
# '```{r, echo=FALSE}',
|
||||
# 'tb_ManagerRelationship')
|
||||
#
|
||||
# writeLines(markobj, "SQ-overview.Rmd")
|
||||
# rmarkdown::render("SQ-overview.Rmd")
|
||||
# utils::browseURL("SQ-overview.html")
|
||||
# unlink(c("SQ-overview.Rmd",
|
||||
# "SQ-overview.md"))
|
||||
# }
|
|
@ -0,0 +1,118 @@
|
|||
#' @title Run a summary of Key Metrics from the Standard Query data
|
||||
#'
|
||||
#' @description
|
||||
#' Returns a heatmapped table in the form of a ggplot visual by default, with an option to return the underlying summary table.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector, e.g. "Organization"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param metrics A character vector containing the variable names to calculate averages of.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#' @param textsize A numeric value specifying the text size to show in the plot.
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @importFrom stats reorder
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @examples
|
||||
#' keymetrics_scan(sq_data, hrvar = "LevelDesignation", return = "table")
|
||||
#'
|
||||
#' @export
|
||||
|
||||
keymetrics_scan <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
metrics = c("Workweek_span",
|
||||
"Collaboration_hours",
|
||||
"After_hours_collaboration_hours",
|
||||
"Meetings",
|
||||
"Meeting_hours",
|
||||
"After_hours_meeting_hours",
|
||||
"Low_quality_meeting_hours",
|
||||
"Meeting_hours_with_manager_1_on_1",
|
||||
"Meeting_hours_with_manager",
|
||||
"Emails_sent",
|
||||
"Email_hours",
|
||||
"After_hours_email_hours",
|
||||
"Generated_workload_email_hours",
|
||||
"Total_focus_hours",
|
||||
"Internal_network_size",
|
||||
"Networking_outside_organization",
|
||||
"External_network_size",
|
||||
"Networking_outside_company"),
|
||||
return = "plot",
|
||||
textsize = 2){
|
||||
|
||||
myTable <-
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
|
||||
group_by(group, PersonId) %>%
|
||||
summarise_at(vars(metrics), ~mean(., na.rm = TRUE)) %>%
|
||||
group_by(group) %>%
|
||||
summarise_at(vars(metrics), ~mean(., na.rm = TRUE)) %>%
|
||||
left_join(hrvar_count(data, hrvar = hrvar, return = "table") %>%
|
||||
rename(Employee_Count = "n"),
|
||||
by = c("group" = hrvar)) %>%
|
||||
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
|
||||
|
||||
myTable %>%
|
||||
reshape2::melt(id.vars = "group") %>%
|
||||
reshape2::dcast(variable ~ group) -> myTable_wide
|
||||
|
||||
myTable_long <- reshape2::melt(myTable, id.vars=c("group")) %>%
|
||||
mutate(variable = factor(variable)) %>%
|
||||
group_by(variable) %>%
|
||||
# Heatmap by row
|
||||
mutate(value_rescaled = value/mean(value)) %>%
|
||||
ungroup()
|
||||
|
||||
# Underscore to space
|
||||
us_to_space <- function(x){
|
||||
gsub(pattern = "_", replacement = " ", x = x)
|
||||
}
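  # e.g. us_to_space("Meeting_hours") returns "Meeting hours",
  # which keeps the y-axis labels of the heatmap readable.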
|
||||
|
||||
|
||||
plot_object <-
|
||||
myTable_long %>%
|
||||
filter(variable != "Employee_Count") %>%
|
||||
ggplot(aes(x = group,
|
||||
y = stats::reorder(variable, desc(variable)))) +
|
||||
geom_tile(aes(fill = value_rescaled)) +
|
||||
geom_text(aes(label=round(value, 1)), size = textsize) +
|
||||
scale_fill_distiller(palette = "Blues", direction = 1) +
|
||||
scale_x_discrete(position = "top") +
|
||||
scale_y_discrete(labels = us_to_space) +
|
||||
theme_light() +
|
||||
labs(title = "Key Workplace Analytics metrics",
|
||||
subtitle = paste("Weekly average by", camel_clean(hrvar)),
|
||||
y =" ",
|
||||
x =" ",
|
||||
caption = extract_date_range(data, return = "text")) +
|
||||
theme(axis.text.x = element_text(angle = 90, hjust = 0),
|
||||
plot.title = element_text(color="grey40", face="bold", size=20)) +
|
||||
guides(fill=FALSE)
|
||||
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
myTable_wide %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,41 @@
|
|||
#' @title Meeting Hours distribution
|
||||
#'
|
||||
#' @description
|
||||
#' Analyze Meeting Hours distribution.
|
||||
#' Returns a stacked bar plot by default.
|
||||
#' Additional options available to return a table with distribution elements.
|
||||
#'
|
||||
#' @inheritParams create_dist
|
||||
#'
|
||||
#' @family Meeting Culture
|
||||
#'
|
||||
#' @examples
|
||||
#' ## Return a plot
|
||||
#' meeting_dist(sq_data, hrvar = "Organization")
|
||||
#'
|
||||
#' ## Return a table
|
||||
#' meeting_dist(sq_data, hrvar = "Organization", return = "table")
|
||||
#'
|
||||
#' ## Return result with a custom specified breaks
|
||||
#' meeting_dist(sq_data, hrvar = "LevelDesignation", cut = c(4, 7, 9))
|
||||
#'
|
||||
#'
|
||||
#' @export
|
||||
|
||||
meeting_dist <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot",
|
||||
cut = c(5, 10, 15)) {
|
||||
|
||||
create_dist(data = data,
|
||||
metric = "Meeting_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return,
|
||||
cut = cut)
|
||||
}
|
||||
|
||||
#' @rdname meeting_dist
|
||||
#' @export
|
||||
meeting_distribution <- meeting_dist
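# `meeting_distribution()` is an alias pointing at the same function body,
# presumably kept so that scripts using the longer name keep working.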
|
|
@ -0,0 +1,27 @@
|
|||
#' @title Distribution of Meeting Hours (Fizzy Drink plot)
|
||||
#'
|
||||
#' @description
|
||||
#' Analyze weekly meeting hours distribution, and returns
|
||||
#' a 'fizzy' scatter plot by default.
|
||||
#' Additional options available to return a table with distribution elements.
|
||||
#'
|
||||
#' @inheritParams create_fizz
|
||||
#'
|
||||
#' @family Meetings
|
||||
#'
|
||||
#' @examples
|
||||
#' meeting_fizz(sq_data, hrvar = "Organization", return = "table")
|
||||
#' @export
|
||||
|
||||
meeting_fizz <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
create_fizz(data = data,
|
||||
metric = "Meeting_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return)
|
||||
|
||||
}
|
|
@ -0,0 +1,37 @@
|
|||
#' @title Meeting Time Trend - Line Chart
|
||||
#'
|
||||
#' @description
|
||||
#' Provides a week by week view of meeting time, visualised as line charts.
|
||||
#' By default returns a line chart for meeting hours,
|
||||
#' with a separate panel per value in the HR attribute.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @inheritParams create_line
|
||||
#'
|
||||
#' @family Meetings
|
||||
#'
|
||||
#' @examples
|
||||
#'
|
||||
#' ## Return a line plot
|
||||
#' meeting_line(sq_data, hrvar = "LevelDesignation")
|
||||
#'
|
||||
#'
|
||||
#' ## Return a table
|
||||
#' meeting_line(sq_data, hrvar = "LevelDesignation", return = "table")
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @export
|
||||
meeting_line <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
## Inherit arguments
|
||||
create_line(data = data,
|
||||
metric = "Meeting_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return)
|
||||
}
|
|
@ -0,0 +1,106 @@
|
|||
#' @title Run a meeting habits / meeting quality analysis
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector, e.g. "Organization"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" or "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @family Meeting Culture
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @export
|
||||
|
||||
meeting_quality <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
## Date range data frame
|
||||
myPeriod <- extract_date_range(data)
|
||||
|
||||
## Prepare Table
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename hrvar to `group`
|
||||
group_by(PersonId, group) %>%
|
||||
summarize(Meeting_count = mean(Meetings),
|
||||
Meeting_hours = mean(Meeting_hours),
|
||||
Low_quality_meeting_hours = mean(Low_quality_meeting_hours),
|
||||
perc_after_hours_m = 100*mean(After_hours_meeting_hours)/mean(Meeting_hours),
|
||||
perc_low_quality = 100*mean(Low_quality_meeting_hours)/mean(Meeting_hours),
|
||||
perc_Multitasking = 100*mean(Multitasking_meeting_hours)/mean(Meeting_hours),
|
||||
perc_Redundant = 100*mean(Redundant_meeting_hours__organizational_)/mean(Meeting_hours),
|
||||
perc_conflicting = 100*mean(Conflicting_meeting_hours)/mean(Meeting_hours)) %>%
|
||||
group_by(group) %>%
|
||||
summarise_at(vars(Meeting_count,
|
||||
Meeting_hours,
|
||||
Low_quality_meeting_hours,
|
||||
perc_after_hours_m,
|
||||
perc_low_quality,
|
||||
perc_Multitasking,
|
||||
perc_Redundant,
|
||||
perc_conflicting),
|
||||
~mean(.)) %>%
|
||||
left_join(data %>%
|
||||
rename(group = !!sym(hrvar)) %>%
|
||||
group_by(group) %>%
|
||||
summarise(Employee_Count = n_distinct(PersonId)),
|
||||
by = "group") %>%
|
||||
filter(Employee_Count >= mingroup) -> myTable
|
||||
|
||||
myTable_wide <- myTable %>%
|
||||
reshape2::melt(id.vars = "group") %>%
|
||||
reshape2::dcast(variable ~ group)
|
||||
|
||||
myTable_long <- reshape2::melt(myTable, id.vars=c("group"))
|
||||
myTable_plot <- myTable %>% select(group, perc_low_quality, Meeting_hours)
|
||||
|
||||
## Bar plot
|
||||
plot_object <-
|
||||
myTable_plot %>%
|
||||
ggplot(aes(x = perc_low_quality, y = Meeting_hours, size=2)) +
|
||||
geom_point(stat = "identity",
|
||||
fill = "#203864", alpha=0.1) +
|
||||
geom_text(aes(label = group),
|
||||
hjust = 0.2,
|
||||
color = "black",
|
||||
fontface = "bold",
|
||||
size = 4)+
|
||||
ylim(min(myTable_plot$Meeting_hours),max(myTable_plot$Meeting_hours) + 2) +
|
||||
xlim(min(myTable_plot$perc_low_quality),max(myTable_plot$perc_low_quality) + 2) +
|
||||
theme_wpa_basic() +
|
||||
theme(axis.text=element_text(size=12),
|
||||
plot.title = element_text(color="grey40", face="bold", size=18),
|
||||
plot.subtitle = element_text(size=14),
|
||||
legend.position = "none",
|
||||
legend.justification = "right",
|
||||
legend.title=element_text(size=14),
|
||||
legend.text=element_text(size=14)) +
|
||||
labs(title = "Meeting Quality",
|
||||
subtitle = paste("Meeting time and low-quality percentage by", tolower(hrvar))) +
|
||||
ylab("Average weekly meeting hours") +
|
||||
xlab("Average weekly percentage of low-quality meeting hours") +
|
||||
labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
myTable %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
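# Illustrative usage (a sketch; `sq_data` is the demo Standard Person Query bundled
# with {wpa}, assumed here to carry the meeting-quality columns referenced above):
# meeting_quality(sq_data, hrvar = "LevelDesignation")                    # scatter plot
# meeting_quality(sq_data, hrvar = "LevelDesignation", return = "table")  # summary table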
|
|
@ -0,0 +1,47 @@
|
|||
#' @title Meeting Hours Ranking
|
||||
#'
|
||||
#' @description
|
||||
#' This function scans a Standard Query output for groups with high levels of weekly meeting hours.
|
||||
#' Returns a table with all groups (across multiple HR attributes) ranked by meeting hours.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar A list of HR Variables to consider in the scan.
|
||||
#' Defaults to all HR attributes identified.
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
|
||||
#' Defaults to 5.
|
||||
#' @param return A character vector specifying what to return.
|
||||
#' Valid values include "html" (default, returning an interactive DataTable)
|
||||
#' and "df" (data frame)
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom stats reorder
|
||||
#'
|
||||
#' @family Meetings
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a summary table as a data frame by default, where 'table' is passed in `return`.
|
||||
#' When 'html' is passed, an interactive DataTable is returned instead.
|
||||
#'
|
||||
#' @export
|
||||
|
||||
meeting_rank <- function(data,
|
||||
hrvar = extract_hr(data),
|
||||
mingroup = 5,
|
||||
return = "table"){
|
||||
|
||||
output <-
|
||||
data %>% create_rank(metric="Meeting_hours", hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = "table")
|
||||
|
||||
if(return == "html"){
|
||||
return(create_dt(output))
|
||||
} else if(return == "table"){
|
||||
return(output)
|
||||
} else {
|
||||
stop("Invalid `return` argument.")
|
||||
}
|
||||
}
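# Illustrative usage (a sketch): rank groups across HR attributes by meeting hours.
# meeting_rank(sq_data, return = "table")   # data frame, the default
# meeting_rank(sq_data, return = "html")    # interactive DataTable via create_dt()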
|
|
@ -0,0 +1,101 @@
|
|||
#' @title Produce a skim summary of meeting hours
|
||||
#'
|
||||
#' @description
|
||||
#' This function returns a skim summary in the console
|
||||
#' when provided a standard query in the input.
|
||||
#'
|
||||
#' @param data A standard person query data in the form of a data frame.
|
||||
#' @param return A character string to specify whether to return a message
|
||||
#' or a text string. Defaults to "message". Valid options include "text" or
|
||||
#' "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @examples
|
||||
#' meeting_skim(sq_data)
|
||||
#'
|
||||
#' @export
|
||||
meeting_skim <- function(data, return = "message"){
|
||||
|
||||
key_output <-
|
||||
data %>%
|
||||
select(PersonId,
|
||||
Meeting_hours,
|
||||
Conflicting_meeting_hours,
|
||||
Multitasking_meeting_hours,
|
||||
Redundant_meeting_hours__lower_level_,
|
||||
Redundant_meeting_hours__organizational_,
|
||||
Low_quality_meeting_hours) %>%
|
||||
summarise_at(vars(-PersonId), ~sum(.)) %>% # sum total
|
||||
tidyr::gather(HourType, Hours, -Meeting_hours) %>%
|
||||
mutate_at(vars(Hours), ~./Meeting_hours) %>%
|
||||
mutate(RawHours = Hours * Meeting_hours)
|
||||
|
||||
|
||||
mh_total <- round(key_output$Meeting_hours[1])
|
||||
|
||||
mh_lowqualityprop <-
|
||||
key_output %>%
|
||||
filter(HourType == "Low_quality_meeting_hours") %>%
|
||||
pull(Hours) %>%
|
||||
`*`(100) %>%
|
||||
round() %>%
|
||||
paste0("%")
|
||||
|
||||
bracket <- function(text){
|
||||
paste0("(", text, ")")
|
||||
}
|
||||
|
||||
extract_prop <- function(filt_chr){
|
||||
key_output %>%
|
||||
filter(HourType == filt_chr) %>%
|
||||
pull(Hours) %>%
|
||||
`*`(100) %>%
|
||||
round() %>%
|
||||
paste0("%")
|
||||
}
|
||||
|
||||
extract_raw <- function(filt_chr){
|
||||
key_output %>%
|
||||
filter(HourType == filt_chr) %>%
|
||||
pull(RawHours) %>%
|
||||
round()
|
||||
}
|
||||
|
||||
|
||||
print_text <-
|
||||
paste("There are",
|
||||
mh_total,
|
||||
"total meeting hours across the analysis population.\n",
|
||||
|
||||
">>>",
|
||||
extract_raw("Low_quality_meeting_hours"),
|
||||
"are low quality",
|
||||
bracket(extract_prop("Low_quality_meeting_hours")),
|
||||
"\n",
|
||||
|
||||
">>>",
|
||||
extract_raw("Redundant_meeting_hours__organizational_"),
|
||||
"are redundant",
|
||||
bracket(extract_prop("Redundant_meeting_hours__organizational_")),
|
||||
"\n",
|
||||
">>>",
|
||||
extract_raw("Conflicting_meeting_hours"),
|
||||
"are conflicting",
|
||||
bracket(extract_prop("Conflicting_meeting_hours")),
|
||||
"\n",
|
||||
">>>",
|
||||
extract_raw("Multitasking_meeting_hours"),
|
||||
"are multitasking.",
|
||||
bracket(extract_prop("Multitasking_meeting_hours")))
|
||||
|
||||
if(return == "message"){
|
||||
message(print_text)
|
||||
} else if(return == "text"){
|
||||
print_text <- gsub(pattern = ">>>", replacement = " - ", x = print_text)
|
||||
print_text
|
||||
} else if(return == "table"){
|
||||
key_output
|
||||
} else {
|
||||
stop("Please check `return`")
|
||||
}
|
||||
}
|
|
@ -0,0 +1,42 @@
|
|||
#' @title Meeting Summary
|
||||
#'
|
||||
#' @description
|
||||
#' Provides an overview analysis of weekly meeting hours.
|
||||
#' Returns a bar plot showing average weekly meeting hours by default.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @inheritParams create_bar
|
||||
#'
|
||||
#' @family Meetings
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @examples
|
||||
#' # Return a ggplot bar chart
|
||||
#' meeting_summary(sq_data, hrvar = "LevelDesignation")
|
||||
#'
|
||||
#' # Return a summary table
|
||||
#' meeting_summary(sq_data, hrvar = "LevelDesignation", return = "table")
|
||||
#' @export
|
||||
meeting_summary <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
create_bar(data = data,
|
||||
metric = "Meeting_hours",
|
||||
hrvar = hrvar,
|
||||
mingroup = mingroup,
|
||||
return = return,
|
||||
bar_colour = "darkblue")
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,78 @@
|
|||
#' @title Generate a Meeting Text Mining report in HTML
|
||||
#'
|
||||
#' @description
|
||||
#' Create a text mining report in HTML based on Meeting Subject Lines
|
||||
#'
|
||||
#' @param data A Meeting Query dataset in the form of a data frame.
|
||||
#' @param path Pass the file path and the desired file name, _excluding the file extension_.
|
||||
#' For example, "meeting text mining report".
|
||||
#' @param stopwords A single-column data frame labelled 'word' containing custom stopwords to remove.
|
||||
#' @param timestamp Logical vector specifying whether to include a timestamp in the file name.
|
||||
#' Defaults to TRUE.
|
||||
#' @param keep A numeric vector specifying maximum number of words to keep.
|
||||
#' @param seed A numeric vector to set seed for random generation.
|
||||
#'
|
||||
#' @family Text-mining
|
||||
#'
|
||||
#' @export
|
||||
meeting_tm_report <- function(data,
|
||||
path = "meeting text mining report",
|
||||
stopwords = NULL,
|
||||
timestamp = TRUE,
|
||||
keep = 100,
|
||||
seed = 100){
|
||||
|
||||
## Create timestamped path (if applicable)
|
||||
if(timestamp == TRUE){
|
||||
newpath <- paste(path, wpa::tstamp())
|
||||
} else {
|
||||
newpath <- path
|
||||
}
|
||||
|
||||
# Set outputs
|
||||
output_list <-
|
||||
list(md2html(text = read_preamble("blank.md")), # Header
|
||||
data %>% tm_wordcloud(stopwords = stopwords, keep = keep),
|
||||
data %>% tm_freq(token = "words", stopwords = stopwords, keep = keep),
|
||||
data %>% tm_freq(token = "words", stopwords = stopwords, keep = keep, return = "table"),
|
||||
data %>% tm_freq(token = "ngrams", stopwords = stopwords, keep = keep),
|
||||
data %>% tm_freq(token = "ngrams", stopwords = stopwords, keep = keep, return = "table"),
|
||||
data %>% tm_cooc(stopwords = stopwords, seed = seed),
|
||||
data %>% tm_cooc(stopwords = stopwords, seed = seed, return="table")) %>%
|
||||
purrr::map_if(is.data.frame, create_dt)
|
||||
|
||||
# Set header titles
|
||||
title_list <-
|
||||
c("Text Mining Report", # Section header
|
||||
"Word cloud",
|
||||
"Word Frequency",
|
||||
"",
|
||||
"Phrase Frequency",
|
||||
"",
|
||||
"Word Co-occurrence",
|
||||
"")
|
||||
|
||||
# Set header levels
|
||||
n_title <- length(title_list)
|
||||
levels_list <- rep(3, n_title)
|
||||
levels_list[c(1)] <- 2 # Section header
|
||||
|
||||
# Generate report
|
||||
generate_report(title = "Analysis of Meeting Subject Lines",
|
||||
filename = newpath,
|
||||
outputs = output_list,
|
||||
titles = title_list,
|
||||
subheaders = rep("", n_title),
|
||||
echos = rep(FALSE, n_title),
|
||||
levels = levels_list,
|
||||
theme = "cosmo",
|
||||
preamble = read_preamble("meeting_tm_report.md"))
|
||||
|
||||
}
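# Illustrative call (a sketch; `mt_data` is a hypothetical Meeting Query data frame
# with meeting subject lines, loaded by the user):
# meeting_tm_report(mt_data, path = "meeting text mining report", keep = 50)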
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,105 @@
|
|||
#' @title Meeting Hours Time Trend
|
||||
#'
|
||||
#' @description
|
||||
#' Provides a week by week view of meeting time.
|
||||
#' By default returns a week by week heatmap, highlighting the points in time with most activity.
|
||||
#' Additional options available to return a summary table.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#'
|
||||
#' @family Collaboration
|
||||
#'
|
||||
#' @return
|
||||
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
|
||||
#' When 'table' is passed, a summary table is returned as a data frame.
|
||||
#'
|
||||
#' @export
|
||||
|
||||
meeting_trend <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup=5,
|
||||
return = "plot"){
|
||||
|
||||
## Check inputs
|
||||
required_variables <- c("Date",
|
||||
"Meeting_hours",
|
||||
"PersonId")
|
||||
|
||||
## Error message if variables are not present
|
||||
## Nothing happens if all present
|
||||
data %>%
|
||||
check_inputs(requirements = required_variables)
|
||||
|
||||
myPeriod <-
|
||||
data %>%
|
||||
mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
|
||||
arrange(Date) %>%
|
||||
mutate(Start = first(Date), End=last(Date)) %>%
|
||||
filter(row_number() == 1) %>%
|
||||
select(Start, End)
|
||||
|
||||
myTable <-
|
||||
data %>%
|
||||
mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
|
||||
rename(group = !!sym(hrvar)) %>% # Rename HRvar to `group`
|
||||
select(PersonId, Date, group, Meeting_hours) %>%
|
||||
group_by(group) %>%
|
||||
mutate(Employee_Count = n_distinct(PersonId)) %>%
|
||||
filter(Employee_Count >= mingroup) # Keep only groups above privacy threshold
|
||||
|
||||
myTable <-
|
||||
myTable %>%
|
||||
group_by(Date, group) %>%
|
||||
summarize(Employee_Count = mean(Employee_Count),
|
||||
Meeting_hours = mean(Meeting_hours))
|
||||
|
||||
myTable_plot <- myTable %>% select(Date, group, Meeting_hours)
|
||||
|
||||
myTable_return <- myTable_plot %>% spread(Date, Meeting_hours)
|
||||
|
||||
plot_object <-
|
||||
myTable_plot %>%
|
||||
ggplot(aes(x = Date , y = group , fill = Meeting_hours)) +
|
||||
geom_tile(height=.5) +
|
||||
scale_fill_gradient(name="Hours", low = "white", high = "red") +
|
||||
theme_classic() +
|
||||
theme(axis.text=element_text(size=12),
|
||||
plot.title = element_text(color="grey40", face="bold", size=18),
|
||||
plot.subtitle = element_text(size=14),
|
||||
legend.position = "right",
|
||||
legend.justification = "right",
|
||||
legend.title=element_text(size=14),
|
||||
legend.text=element_text(size=14)) +
|
||||
labs(title = "Meeting Hours",
|
||||
subtitle = paste("Total meeting time by", tolower(hrvar))) +
|
||||
xlab("Date") +
|
||||
ylab(hrvar) +
|
||||
labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
myTable_return %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
|
||||
}
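# Illustrative usage (a sketch; `sq_data` is the demo Standard Person Query bundled with {wpa}):
# meeting_trend(sq_data, hrvar = "LevelDesignation")                    # week-by-week heatmap
# meeting_trend(sq_data, hrvar = "LevelDesignation", return = "table")  # summary table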
|
|
@ -0,0 +1,40 @@
|
|||
#' @title Meeting Type Distribution
|
||||
#'
|
||||
#' @description
|
||||
#' Calculate the hour distribution of internal meeting types.
|
||||
#' This is a wrapper around `meetingtype_dist_mt()` and
|
||||
#' `meetingtype_dist_ca()`, depending on whether a Meeting Query or a Collaboration Assessment Query
|
||||
#' is passed as an input.
|
||||
#'
|
||||
#' @param data Data frame. If a Meeting Query is passed, it must contain the variables `Attendees` and `DurationHours`.
|
||||
#' @param hrvar Character string to specify the HR attribute to split the data by.
|
||||
#' Note that this is only applicable if a Collaboration Assessment query is passed to the function. If a Meeting Query
|
||||
#' is passed instead, this argument is ignored.
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
#' Defaults to 5. Only applied when a Collaboration Assessment query is passed.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import ggplot2
|
||||
#' @import dplyr
|
||||
#' @importFrom stats setNames
|
||||
#'
|
||||
#' @export
|
||||
|
||||
meetingtype_dist <- function(data,
|
||||
hrvar = NULL,
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
if("MeetingId" %in% names(data)){
|
||||
|
||||
message("Calculating results using a Meeting Query...")
|
||||
meetingtype_dist_mt(data, return = return)
|
||||
|
||||
} else if("PersonId" %in% names(data)){
|
||||
|
||||
message("Calculating results using a Collaboration Assessment Query...")
|
||||
meetingtype_dist_ca(data, hrvar = hrvar, mingroup = mingroup, return = return)
|
||||
|
||||
} else {
|
||||
stop("Please check query type. Must be either a Collaboration Assessment Query or a Meeting Query.")
|
||||
}
|
||||
}
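# Illustrative dispatch (a sketch; the object names are hypothetical):
# meetingtype_dist(mt_data)                          # `mt_data`: Meeting Query, contains `MeetingId`
# meetingtype_dist(ca_data, hrvar = "Organization")  # `ca_data`: Collaboration Assessment Query, contains `PersonId`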
|
|
@ -0,0 +1,230 @@
|
|||
#' @title Meeting Type Distribution (Collaboration Assessment Query)
|
||||
#'
|
||||
#' @description
|
||||
#' Calculate the hour distribution of internal meeting types,
|
||||
#' using a Collaboration Assessment Query with core WpA variables as an input.
|
||||
#'
|
||||
#' @param data Collaboration Assessment Query data frame. Must contain the core WpA meeting type
#' hour variables, e.g. `Bloated_meeting_hours` and `Lengthy_meeting_hours`.
#' @param hrvar Character string specifying the HR attribute to split the data by.
#' Defaults to `NULL`, which returns results for the whole population as a single "Total" group.
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import ggplot2
|
||||
#' @import dplyr
|
||||
#' @importFrom stats setNames
|
||||
#'
|
||||
#' @export
|
||||
|
||||
meetingtype_dist_ca <- function(data,
|
||||
hrvar = NULL,
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
mt_dist_str <- c("Bloated_meeting_hours",
|
||||
"Lengthy_meeting_hours",
|
||||
"Workshop_meeting_hours",
|
||||
"All_hands_meeting_hours",
|
||||
"Status_update_meeting_hours",
|
||||
"Decision_making_meeting_hours",
|
||||
"One_on_one_meeting_hours")
|
||||
|
||||
mt_dist_str_clean <-
|
||||
mt_dist_str %>%
|
||||
gsub(pattern = "_meeting_hours", replacement = "", x = .) %>%
|
||||
gsub(pattern = "_", replacement = " ", x = .)
|
||||
|
||||
|
||||
## Add dummy "Total" column if hrvar = NULL
|
||||
if(is.null(hrvar)){
|
||||
|
||||
data <- mutate(data, Total = "Total")
|
||||
hrvar <- "Total"
|
||||
|
||||
}
|
||||
|
||||
## No org splits
|
||||
if(hrvar == "Total"){
|
||||
|
||||
myResultsTable <-
|
||||
data %>%
|
||||
summarise_at(vars(mt_dist_str), ~sum(., na.rm = TRUE)) %>%
|
||||
gather(MeetingType, AttendeeMeetingHours) %>%
|
||||
mutate(Prop = AttendeeMeetingHours / sum(AttendeeMeetingHours),
|
||||
Percent = paste(round(Prop * 100), "%")) %>%
|
||||
mutate(MeetingType = gsub(pattern = "_meeting_hours", replacement = "", x = MeetingType)) %>%
|
||||
mutate(MeetingType = gsub(pattern = "_", replacement = " ", x = MeetingType))
|
||||
|
||||
## Only for creating the bottom row data
|
||||
myResultsTableTotal <-
|
||||
data %>%
|
||||
summarise_at(vars(mt_dist_str), ~sum(., na.rm = TRUE)) %>%
|
||||
gather(MeetingType, AttendeeMeetingHours) %>%
|
||||
mutate(MeetingType = "Total") %>%
|
||||
group_by(MeetingType) %>%
|
||||
summarise(AttendeeMeetingHours = sum(AttendeeMeetingHours, na.rm = TRUE)) %>%
|
||||
mutate(Prop = AttendeeMeetingHours / sum(AttendeeMeetingHours),
|
||||
Percent = paste(round(Prop * 100), "%"))
|
||||
|
||||
outputTable <-
|
||||
rbind(myResultsTable,
|
||||
myResultsTableTotal)
|
||||
|
||||
|
||||
} else {
|
||||
|
||||
## Group by hrvar
|
||||
myResultsTable <-
|
||||
data %>%
|
||||
group_by(!!sym(hrvar)) %>%
|
||||
summarise_at(vars(mt_dist_str), ~sum(., na.rm = TRUE)) %>%
|
||||
left_join(data %>% hrvar_count(hrvar = hrvar, return = "table"),
|
||||
by = hrvar) %>%
|
||||
filter(n >= mingroup) %>%
|
||||
gather(MeetingType, AttendeeMeetingHours, -!!sym(hrvar), -n) %>%
|
||||
group_by(!!sym(hrvar)) %>%
|
||||
mutate(Prop = AttendeeMeetingHours / sum(AttendeeMeetingHours),
|
||||
Percent = paste(round(Prop * 100), "%")) %>%
|
||||
mutate(MeetingType = gsub(pattern = "_meeting_hours", replacement = "", x = MeetingType)) %>%
|
||||
mutate(MeetingType = gsub(pattern = "_", replacement = " ", x = MeetingType))
|
||||
|
||||
outputTable <-
|
||||
myResultsTable %>%
|
||||
ungroup() %>%
|
||||
select(!!sym(hrvar), MeetingType, Prop) %>%
|
||||
spread(MeetingType, Prop)
|
||||
|
||||
}
|
||||
|
||||
|
||||
|
||||
if(hrvar == "Total"){
|
||||
|
||||
base_df <-
|
||||
data.frame(id = 1:7,
|
||||
value = c("All hands",
|
||||
"Bloated",
|
||||
"Status update",
|
||||
"Decision making",
|
||||
"One on one",
|
||||
"Lengthy",
|
||||
"Workshop"))
|
||||
|
||||
duration <-
|
||||
c(0, 0, 3, 3,
|
||||
0, 0, 2, 2,
|
||||
0, 0, 1, 1,
|
||||
0, 0, 1, 1,
|
||||
0, 0, 1, 1,
|
||||
1, 1, 2, 2,
|
||||
2, 2, 3, 3)
|
||||
|
||||
attendees <-
|
||||
c(4, 5, 5, 4,
|
||||
3, 4, 4, 3,
|
||||
2, 3, 3, 2,
|
||||
1, 2, 2, 1,
|
||||
0, 1, 1, 0,
|
||||
0, 3, 3, 0,
|
||||
0, 4, 4, 0)
|
||||
|
||||
main_plot_df <-
|
||||
rbind(base_df,
|
||||
base_df,
|
||||
base_df,
|
||||
base_df) %>%
|
||||
arrange(id) %>%
|
||||
mutate(Duration = duration,
|
||||
Attendees = attendees)
|
||||
|
||||
|
||||
label_df <-
|
||||
main_plot_df %>%
|
||||
group_by(id, value) %>%
|
||||
summarise(x = mean(Duration),
|
||||
y = mean(Attendees)) %>%
|
||||
mutate(value2 = sub(" ", "\n", value),
|
||||
text = ifelse(id %in% c(1, 2, 6, 7),"#FFFFFF", "black"),
|
||||
fill = ifelse(id %in% c(1, 2, 6, 7),"#31617C", "#9FE5FF")) %>%
|
||||
left_join(select(myResultsTable, MeetingType, Percent),
|
||||
by = c("value" = "MeetingType")) %>%
|
||||
mutate(Percent = ifelse(is.na(Percent), "0 %", Percent)) %>%
|
||||
mutate(value2 = paste(value2, Percent, sep = "\n"))
|
||||
|
||||
colo_v <- setNames(label_df$fill, nm = base_df$value)
|
||||
text_v <- setNames(label_df$text, nm = label_df$value)
|
||||
|
||||
plot_object <-
|
||||
main_plot_df %>%
|
||||
ggplot(aes(x = Duration, y = Attendees)) +
|
||||
geom_polygon(aes(fill = value, group = id),
|
||||
colour = "#FFFFFF") +
|
||||
scale_fill_manual(values = colo_v) +
|
||||
geom_text(data = label_df, aes(x = x, y = y, label = value2, colour = value), size = 3) +
|
||||
scale_colour_manual(values = text_v) +
|
||||
scale_x_continuous(breaks = 0:3,
|
||||
labels = c("0", "1", "2+", "")) +
|
||||
scale_y_continuous(breaks = 0:5,
|
||||
labels = c("2", "3-8", "9-18",
|
||||
"19-50", "51+", "")) +
|
||||
labs(title = "Meeting types by attendees and duration",
|
||||
subtitle = "% of total time spent in meetings",
|
||||
ylab = "Attendees",
|
||||
xlab = "Duration (hours)",
|
||||
caption = extract_date_range(data, return = "text")) +
|
||||
theme_wpa_basic() +
|
||||
theme(legend.position = "none")
|
||||
|
||||
} else {
|
||||
|
||||
plot_object <-
|
||||
myResultsTable %>%
|
||||
mutate(MeetingType = factor(MeetingType, levels = rev(mt_dist_str_clean))) %>%
|
||||
mutate(Fill = case_when(MeetingType == "Bloated" ~ rgb2hex(180,180,180),
|
||||
MeetingType == "Lengthy" ~ rgb2hex(120,120,120),
|
||||
MeetingType == "Workshop" ~ rgb2hex(90,90,90),
|
||||
MeetingType == "All hands" ~ rgb2hex(60,60,60),
|
||||
MeetingType == "Status update" ~ rgb2hex(45,160,160),
|
||||
MeetingType == "Decision making" ~ rgb2hex(25,120,140),
|
||||
MeetingType == "One on one" ~ rgb2hex(5,85,100))) %>%
|
||||
ggplot(aes(x = !!sym(hrvar), y = Prop, group = MeetingType, fill = Fill)) +
|
||||
geom_bar(position = "stack", stat = "identity") +
|
||||
geom_text(aes(label = paste(round(Prop * 100), "%")),
|
||||
position = position_stack(vjust = 0.5),
|
||||
color = "#FFFFFF",
|
||||
fontface = "bold") +
|
||||
scale_fill_identity(name = "Coaching styles",
|
||||
breaks = c(
|
||||
rgb2hex(180,180,180),
|
||||
rgb2hex(120,120,120),
|
||||
rgb2hex(90,90,90),
|
||||
rgb2hex(60,60,60),
|
||||
rgb2hex(45,160,160),
|
||||
rgb2hex(25,120,140),
|
||||
rgb2hex(5,85,100)
|
||||
),
|
||||
labels = mt_dist_str_clean,
|
||||
guide = "legend") +
|
||||
coord_flip() +
|
||||
theme_wpa_basic() +
|
||||
labs(title = "Meeting types by attendees and duration",
|
||||
subtitle = "% of total time spent in meetings",
|
||||
y = "Percentage",
|
||||
caption = extract_date_range(data, return = "text"))
|
||||
|
||||
|
||||
}
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
outputTable
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,143 @@
|
|||
#' @title Meeting Type Distribution (Meeting Query)
|
||||
#'
|
||||
#' @description
|
||||
#' Calculate the hour distribution of internal meeting types,
|
||||
#' using a Meeting Query with core WpA variables as an input.
|
||||
#'
|
||||
#' @param data Meeting Query data frame. Must contain the variables `Attendees`, `DurationHours` and `Attendee_meeting_hours`.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import ggplot2
|
||||
#' @import dplyr
|
||||
#' @importFrom stats setNames
|
||||
#'
|
||||
#' @export
|
||||
meetingtype_dist_mt <- function(data, return = "plot"){
|
||||
|
||||
## Date range data frame
|
||||
myPeriod <- extract_date_range(data)
|
||||
|
||||
data_typed <-
|
||||
data %>%
|
||||
mutate(MeetingType = case_when(
|
||||
Attendees == 2 & DurationHours <= 1 ~ "One on one",
|
||||
Attendees >= 3 & Attendees <= 8 & DurationHours <= 1 ~ "Decision making",
|
||||
Attendees >= 9 & Attendees <= 18 & DurationHours <= 1 ~ "Status update",
|
||||
Attendees >= 19 & Attendees <= 50 & DurationHours <= 2 ~ "Bloated",
|
||||
Attendees <= 18 & DurationHours >= 1 & DurationHours <= 2 ~ "Lengthy",
|
||||
Attendees >= 51 ~ "All hands",
|
||||
Attendees <= 50 & DurationHours >= 2 ~ "Workshop",
|
||||
TRUE ~ NA_character_))
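  # The rules above bucket each meeting by size and duration (first matching rule wins,
  # thresholds as coded):
  #   One on one       2 attendees,       <= 1 hour
  #   Decision making  3-8 attendees,     <= 1 hour
  #   Status update    9-18 attendees,    <= 1 hour
  #   Bloated          19-50 attendees,   <= 2 hours
  #   Lengthy          <= 18 attendees,   1-2 hours
  #   All hands        51+ attendees,     any duration
  #   Workshop         <= 50 attendees,   >= 2 hours
  # Anything else falls through to NA.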
|
||||
|
||||
myResultsTable <-
|
||||
data_typed %>%
|
||||
group_by(MeetingType) %>%
|
||||
summarise(AttendeeMeetingHours = sum(Attendee_meeting_hours),
|
||||
TotalMeetings = n()) %>%
|
||||
mutate(Prop = AttendeeMeetingHours / sum(AttendeeMeetingHours)) %>%
|
||||
mutate(Percent = paste(round(Prop * 100), "%"))
|
||||
|
||||
myResultsTableTotal <-
|
||||
data_typed %>%
|
||||
summarise(AttendeeMeetingHours = sum(Attendee_meeting_hours),
|
||||
TotalMeetings = n()) %>%
|
||||
mutate(Prop = AttendeeMeetingHours / sum(AttendeeMeetingHours),
|
||||
Percent = paste(round(Prop * 100), "%"),
|
||||
MeetingType = "Total") %>%
|
||||
select(MeetingType, tidyselect::everything())
|
||||
|
||||
base_df <-
|
||||
data.frame(id = 1:7,
|
||||
value = c("All hands",
|
||||
"Bloated",
|
||||
"Status update",
|
||||
"Decision making",
|
||||
"One on one",
|
||||
"Lengthy",
|
||||
"Workshop"))
|
||||
|
||||
duration <-
|
||||
c(0, 0, 3, 3,
|
||||
0, 0, 2, 2,
|
||||
0, 0, 1, 1,
|
||||
0, 0, 1, 1,
|
||||
0, 0, 1, 1,
|
||||
1, 1, 2, 2,
|
||||
2, 2, 3, 3)
|
||||
|
||||
attendees <-
|
||||
c(4, 5, 5, 4,
|
||||
3, 4, 4, 3,
|
||||
2, 3, 3, 2,
|
||||
1, 2, 2, 1,
|
||||
0, 1, 1, 0,
|
||||
0, 3, 3, 0,
|
||||
0, 4, 4, 0)
|
||||
|
||||
main_plot_df <-
|
||||
rbind(base_df,
|
||||
base_df,
|
||||
base_df,
|
||||
base_df) %>%
|
||||
arrange(id) %>%
|
||||
mutate(Duration = duration,
|
||||
Attendees = attendees)
|
||||
|
||||
|
||||
label_df <-
|
||||
main_plot_df %>%
|
||||
group_by(id, value) %>%
|
||||
summarise(x = mean(Duration),
|
||||
y = mean(Attendees)) %>%
|
||||
mutate(value2 = sub(" ", "\n", value),
|
||||
text = ifelse(id %in% c(1, 2, 6, 7),"#FFFFFF", "black"),
|
||||
fill = ifelse(id %in% c(1, 2, 6, 7),"#31617C", "#9FE5FF")) %>%
|
||||
left_join(select(myResultsTable, MeetingType, Percent),
|
||||
by = c("value" = "MeetingType")) %>%
|
||||
mutate(Percent = ifelse(is.na(Percent), "0 %", Percent)) %>%
|
||||
mutate(value2 = paste(value2, Percent, sep = "\n"))
|
||||
|
||||
colo_v <- setNames(label_df$fill, nm = base_df$value)
|
||||
text_v <- setNames(label_df$text, nm = label_df$value)
|
||||
|
||||
plot_object <-
|
||||
main_plot_df %>%
|
||||
ggplot(aes(x = Duration, y = Attendees)) +
|
||||
geom_polygon(aes(fill = value, group = id),
|
||||
colour = "#FFFFFF") +
|
||||
scale_fill_manual(values = colo_v) +
|
||||
geom_text(data = label_df, aes(x = x, y = y, label = value2, colour = value), size = 3) +
|
||||
scale_colour_manual(values = text_v) +
|
||||
scale_x_continuous(breaks = 0:3,
|
||||
labels = c("0", "1", "2+", "")) +
|
||||
scale_y_continuous(breaks = 0:5,
|
||||
labels = c("2", "3-8", "9-18",
|
||||
"19-50", "51+", "")) +
|
||||
labs(title = "Meeting types by attendees and duration",
|
||||
subtitle = "% of total time spent in meetings",
|
||||
ylab = "Attendees",
|
||||
xlab = "Duration (hours)",
|
||||
caption = paste("Data from week of",
|
||||
myPeriod$Start,
|
||||
"to week of",
|
||||
myPeriod$End)) +
|
||||
theme_wpa_basic() +
|
||||
theme(legend.position = "none")
|
||||
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
rbind(myResultsTable,
|
||||
myResultsTableTotal)
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,69 @@
|
|||
#' @title Create a summary bar chart of the proportion of Meeting Hours spent in Long or Large Meetings
|
||||
#'
|
||||
#' @description
|
||||
#' This function creates a bar chart showing the percentage of meeting hours which are spent in
|
||||
#' long or large meetings.
|
||||
#'
|
||||
#' @param data Collaboration Assessment query in the form of a data frame. Requires the following variables:
|
||||
#' - `Bloated_meeting_hours`
|
||||
#' - `Lengthy_meeting_hours`
|
||||
#' - `Workshop_meeting_hours`
|
||||
#' - `All_hands_meeting_hours`
|
||||
#' - `Status_update_meeting_hours`
|
||||
#' - `Decision_making_meeting_hours`
|
||||
#' - `One_on_one_meeting_hours`
|
||||
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
|
||||
#' but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import ggplot2
|
||||
#' @import dplyr
|
||||
#'
|
||||
#' @export
|
||||
meetingtype_summary <- function(data,
|
||||
hrvar = "Organization",
|
||||
mingroup = 5,
|
||||
return = "plot"){
|
||||
|
||||
mt_dist_str <- c("Bloated_meeting_hours",
|
||||
"Lengthy_meeting_hours",
|
||||
"Workshop_meeting_hours",
|
||||
"All_hands_meeting_hours",
|
||||
"Status_update_meeting_hours",
|
||||
"Decision_making_meeting_hours",
|
||||
"One_on_one_meeting_hours")
|
||||
|
||||
returnTable <-
|
||||
data %>%
|
||||
group_by(!!sym(hrvar)) %>%
|
||||
summarise_at(vars(mt_dist_str), ~sum(., na.rm = TRUE)) %>%
|
||||
gather(MeetingType, AttendeeMeetingHours, -!!sym(hrvar)) %>%
|
||||
mutate(MeetingType = gsub(pattern = "_meeting_hours", replacement = "", x = MeetingType)) %>%
|
||||
mutate(MeetingType = gsub(pattern = "_", replacement = " ", x = MeetingType)) %>%
|
||||
group_by(!!sym(hrvar)) %>%
|
||||
mutate(AttendeeMeetingHours = AttendeeMeetingHours / sum(AttendeeMeetingHours)) %>%
|
||||
spread(MeetingType, AttendeeMeetingHours) %>%
|
||||
left_join(hrvar_count(data, hrvar, return = "table"), by = hrvar) %>%
|
||||
filter(n >= mingroup) %>%
|
||||
ungroup() %>%
|
||||
mutate(MeetingHoursInLongOrLargeMeetings = select(., c("All hands", "Bloated", "Lengthy", "Workshop")) %>%
|
||||
apply(1, sum, na.rm = TRUE)) %>%
|
||||
select(!!sym(hrvar), MeetingHoursInLongOrLargeMeetings, n)
|
||||
|
||||
if(return == "plot"){
|
||||
returnTable %>%
|
||||
create_bar_asis(group_var = hrvar,
|
||||
bar_var = "MeetingHoursInLongOrLargeMeetings",
|
||||
title = "% of Meeting Hours\nin Long or Large Meetings",
|
||||
subtitle = paste0("By ", camel_clean(hrvar)),
|
||||
caption = extract_date_range(data, return = "text"),
|
||||
percent = TRUE,
|
||||
bar_colour = "alert")
|
||||
} else if(return == "table"){
|
||||
returnTable
|
||||
} else {
|
||||
stop("Please enter a valid input for `return`.")
|
||||
}
|
||||
}
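# Illustrative usage (a sketch; `ca_data` is a hypothetical Collaboration Assessment
# query containing the seven meeting-type hour columns listed in the documentation):
# meetingtype_summary(ca_data, hrvar = "LevelDesignation")                    # bar chart
# meetingtype_summary(ca_data, hrvar = "LevelDesignation", return = "table")  # summary table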
|
|
@ -0,0 +1,120 @@
|
|||
#' @title Manager meeting coattendance distribution
|
||||
#'
|
||||
#' @description
|
||||
#' Analyze the degree of meeting co-attendance between employees and their managers.
|
||||
#' Returns a stacked bar plot of different buckets of coattendance.
|
||||
#' Additional options available to return a table with distribution elements.
|
||||
#'
|
||||
#' @param data A Standard Query dataset in the form of a data frame.
|
||||
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector, defaults to "Organization" but accepts any character vector, e.g. "LevelDesignation"
|
||||
#' @param mingroup Numeric value setting the privacy threshold / minimum group size, defaults to 5.
|
||||
#' @param return Character vector specifying what to return, defaults to "plot".
|
||||
#' Valid inputs are "plot" and "table".
|
||||
#'
|
||||
#' @import dplyr
|
||||
#' @import ggplot2
|
||||
#' @import reshape2
|
||||
#' @import scales
|
||||
#' @importFrom stats median
|
||||
#' @importFrom stats sd
|
||||
#'
|
||||
#' @family Managerial Relations
|
||||
#' @family Meeting Culture
|
||||
#'
|
||||
#' @examples
|
||||
#' mgrcoatt_dist(sq_data, hrvar = "Organization", return = "table")
|
||||
#' @export
|
||||
|
||||
mgrcoatt_dist <- function(data, hrvar = "Organization", mingroup = 5, return = "plot") {
|
||||
|
||||
myPeriod <-
|
||||
data %>%
|
||||
mutate(Date=as.Date(Date, "%m/%d/%Y")) %>%
|
||||
arrange(Date) %>%
|
||||
mutate(Start=first(Date), End=last(Date)) %>%
|
||||
filter(row_number()==1) %>%
|
||||
select(Start, End)
|
||||
|
||||
## Basic Data for bar plot
|
||||
plot_data <-
|
||||
data %>%
|
||||
rename(group = !!sym(hrvar)) %>%
|
||||
group_by(PersonId) %>%
|
||||
filter(Meeting_hours>0) %>%
|
||||
mutate(coattendman_rate = Meeting_hours_with_manager / Meeting_hours) %>%
|
||||
summarise(periods = n(),
|
||||
group = first(group), coattendman_rate=mean(coattendman_rate)) %>%
|
||||
group_by(group) %>%
|
||||
mutate(Employee_Count = n_distinct(PersonId)) %>%
|
||||
filter(Employee_Count >= mingroup)
|
||||
|
||||
## Create buckets of coattendance time
|
||||
plot_data <-
|
||||
plot_data %>%
|
||||
mutate(bucket_coattendman_rate = case_when(coattendman_rate>=0 & coattendman_rate<.20 ~ "0 - 20%",
|
||||
coattendman_rate>=.20 & coattendman_rate<.4 ~ "20 - 40%",
|
||||
coattendman_rate>=.40 & coattendman_rate<.6 ~ "40 - 60%",
|
||||
coattendman_rate>=.6 ~ "60% +"))
|
||||
|
||||
|
||||
## Employee count / base size table
|
||||
plot_legend <-
|
||||
plot_data %>%
|
||||
group_by(group) %>%
|
||||
summarize(Employee_Count=first(Employee_Count)) %>%
|
||||
mutate(Employee_Count = paste("n=",Employee_Count))
|
||||
|
||||
## Data for bar plot
|
||||
plot_table <-
|
||||
plot_data %>%
|
||||
group_by(group, bucket_coattendman_rate) %>%
|
||||
summarize(Employees=n(),
|
||||
Employee_Count=first(Employee_Count), percent= Employees / Employee_Count) %>%
|
||||
arrange(group, bucket_coattendman_rate)
|
||||
|
||||
## Table for annotation
|
||||
annot_table <-
|
||||
plot_legend %>%
|
||||
dplyr::left_join(plot_table, by = "group")
|
||||
|
||||
## Bar plot
|
||||
plot_object <-
|
||||
plot_table %>%
|
||||
ggplot(aes(x = group, y=Employees, fill = bucket_coattendman_rate)) +
|
||||
geom_bar(stat = "identity", position = position_fill(reverse = TRUE)) +
|
||||
coord_flip() +
|
||||
scale_y_continuous(labels = function(x) paste0(x*100, "%")) +
|
||||
annotate("text", x = plot_legend$group, y = -.05, label = plot_legend$Employee_Count ) +
|
||||
scale_fill_manual(name="", values = c("#bed6f2", "#e9f1fb","#ffdfd3","#FE7F4F")) +
|
||||
theme_classic() +
|
||||
theme(axis.text=element_text(size=12),
|
||||
plot.title = element_text(color="grey40", face="bold", size=18),
|
||||
plot.subtitle = element_text(size=14), legend.position = "top", legend.justification = "right",
|
||||
legend.title=element_text(size=14), legend.text=element_text(size=14)) +
|
||||
labs(title = "Time with Manager", subtitle = paste("Meeting Co-attendance Rate by", hrvar)) +
|
||||
xlab(hrvar) +
|
||||
ylab("Fraction of Employees") +
|
||||
labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))
|
||||
|
||||
## Table to return
|
||||
return_table <-
|
||||
plot_table %>%
|
||||
select(group, bucket_coattendman_rate, percent) %>%
|
||||
spread(bucket_coattendman_rate, percent)
|
||||
|
||||
if(return == "table"){
|
||||
|
||||
return_table %>%
|
||||
as_tibble() %>%
|
||||
return()
|
||||
|
||||
} else if(return == "plot"){
|
||||
|
||||
return(plot_object)
|
||||
|
||||
} else {
|
||||
|
||||
stop("Please enter a valid input for `return`.")
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,223 @@
|
|||
#' @title Manager Relationship 2x2 Matrix
#'
#' @description
#' Generate the Manager-Relationship 2x2 matrix, returning a ggplot object by default.
#' Additional options available to return a "wide" or "long" summary table.
#'
#' @param data Standard Query data to pass through. Accepts a data frame.
#' @param hrvar HR Variable by which to split metrics. Accepts a character vector,
#' e.g. "Organization". Defaults to NULL.
#' @param return A character vector specifying whether to return a matrix plot or a table.
#' Defaults to the 2 by 2 matrix. Valid values are "plot", "table", and "chartdata".
#' @param plot_colors
#' Pass a character vector of length 4 containing HEX codes to specify colors to use in plotting.
#' @param threshold
#' Specify a numeric value to determine threshold (in minutes) for 1:1 manager hours.
#' Defaults to 15.
#'
#' @import dplyr
#' @import reshape2
#' @import ggplot2
#' @importFrom scales percent
#'
#' @family Managerial Relations
#' @family Meeting Culture
#'
#' @examples
#' mgrrel_matrix(sq_data)
#'
#' @export
mgrrel_matrix <- function(data,
                          hrvar = NULL,
                          return = "plot",
                          plot_colors = c("#FE7F4F", "#C3E5FF", "#FFD4C4", "#FEAA8A"),
                          threshold = 15){

  ## Add dummy "Total" column if hrvar = NULL
  if(is.null(hrvar)){

    data <- mutate(data, Total = "Total")
    hrvar <- "Total"

  }

  ## Check inputs
  required_variables <- c("Date",
                          hrvar,
                          "PersonId",
                          "Meeting_hours_with_manager",
                          "Meeting_hours",
                          "Meeting_hours_with_manager_1_on_1")

  ## Error message if variables are not present
  ## Nothing happens if all present
  data %>%
    check_inputs(requirements = required_variables)

  ## Create a Person Weekly Average
  data1 <-
    data %>%
    mutate(coattendman_rate = Meeting_hours_with_manager / Meeting_hours) %>% # Coattendance Rate with Manager
    filter(!is.na(coattendman_rate)) %>%
    group_by(PersonId, !!sym(hrvar)) %>%
    summarise_at(vars(Meeting_hours_with_manager,
                      Meeting_hours,
                      Meeting_hours_with_manager_1_on_1,
                      coattendman_rate),
                 ~mean(.)) %>%
    ungroup()

  ## Threshold
  thres_low_chr <- paste("<", threshold, "min")
  thres_top_chr <- paste(">", threshold, "min")

  ## Create key variables
  data2 <-
    data1 %>%
    mutate(coattendande = ifelse(coattendman_rate < 0.5, "<50%", ">50%"),
           mgr1on1 = ifelse(Meeting_hours_with_manager_1_on_1 * 60 < threshold,
                            thres_low_chr,
                            thres_top_chr))

  ## Grouping variable split
  if(hrvar == "Total"){

    data2 %>%
      count(mgr1on1, coattendande) %>%
      mutate(perc = n / sum(n)) %>% # Calculate percentages
      mutate(xmin = ifelse(mgr1on1 == thres_low_chr, -sqrt(perc), 0),
             xmax = ifelse(mgr1on1 == thres_top_chr, sqrt(perc), 0),
             ymin = ifelse(coattendande == "<50%", -sqrt(perc), 0),
             ymax = ifelse(coattendande == ">50%", sqrt(perc), 0),
             mgrRel = case_when(mgr1on1 == thres_low_chr & coattendande == "<50%" ~ "Under-coached",
                                mgr1on1 == thres_low_chr & coattendande == ">50%" ~ "Co-attending",
                                mgr1on1 == thres_top_chr & coattendande == ">50%" ~ "Highly managed",
                                TRUE ~ "Coaching")) %>%
      mutate_at("mgrRel", ~as.factor(.)) -> chart

    chart %>%
      select(mgrRel, n, perc) %>%
      group_by(mgrRel) %>%
      summarise_all(~sum(., na.rm = TRUE)) -> clean_tb

  } else if(hrvar != "Total"){

    data2 %>%
      count(!!sym(hrvar), mgr1on1, coattendande) %>%
      group_by(!!sym(hrvar)) %>%
      mutate(perc = n / sum(n)) %>% # Calculate percentages
      mutate(xmin = ifelse(mgr1on1 == thres_low_chr, -sqrt(perc), 0),
             xmax = ifelse(mgr1on1 == thres_top_chr, sqrt(perc), 0),
             ymin = ifelse(coattendande == "<50%", -sqrt(perc), 0),
             ymax = ifelse(coattendande == ">50%", sqrt(perc), 0),
             mgrRel = case_when(mgr1on1 == thres_low_chr & coattendande == "<50%" ~ "Under-coached",
                                mgr1on1 == thres_low_chr & coattendande == ">50%" ~ "Co-attending",
                                mgr1on1 == thres_top_chr & coattendande == ">50%" ~ "Highly managed",
                                TRUE ~ "Coaching")) %>%
      ungroup() %>%
      mutate_at("mgrRel", ~as.factor(.)) -> chart

    chart %>%
      select(mgrRel, !!sym(hrvar), n, perc) %>%
      group_by(mgrRel, !!sym(hrvar)) %>%
      summarise_all(~sum(., na.rm = TRUE)) -> clean_tb

  }

  ## Sort colours out
  # Legacy variable names
  myColors <- plot_colors
  names(myColors) <- levels(chart$mgrRel)

  ## Show stacked bar chart if multiple groups
  if(hrvar == "Total"){

    plot <-
      chart %>%
      ggplot() +
      geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, fill = mgrRel), color = "white") +
      scale_fill_manual(name = "mgrRel", values = myColors) +
      geom_text(aes(x = xmin + 0.5 * sqrt(perc),
                    y = ymin + 0.5 * sqrt(perc),
                    label = scales::percent(perc, accuracy = 1))) +
      coord_equal() +
      xlab("Weekly 1:1 time with manager") +
      scale_x_continuous(breaks = c(-max(abs(chart$xmin), abs(chart$xmax)) / 2, max(abs(chart$xmin), abs(chart$xmax)) / 2),
                         labels = c(thres_low_chr, thres_top_chr),
                         limits = c(-max(abs(chart$xmin), abs(chart$xmax)), max(abs(chart$xmin), abs(chart$xmax)))) +
      ylab("Employee and manager coattend") +
      scale_y_continuous(breaks = c(-max(abs(chart$ymin), abs(chart$ymax)) / 2, max(abs(chart$ymin), abs(chart$ymax)) / 2),
                         labels = c("<50%", ">50%"),
                         limits = c(-max(abs(chart$ymin), abs(chart$ymax)), max(abs(chart$ymin), abs(chart$ymax)))) +
      theme_minimal() +
      theme(panel.grid = element_blank(),
            plot.title = element_text(color = "grey40", face = "bold", size = 20),
            axis.line = element_line(),
            legend.position = "bottom",
            legend.title = element_blank(),
            axis.title = element_text(size = 12),
            axis.text = element_text(size = 12),
            legend.text = element_text(size = 12),
            strip.text.x = element_text(color = "grey40", face = "bold", size = 14)) +
      labs(caption = extract_date_range(data, return = "text"))

  } else if(hrvar != "Total"){

    plot <-
      chart %>%
      mutate(Fill = case_when(mgrRel == "Co-attending" ~ rgb2hex(68, 151, 169),
                              mgrRel == "Coaching" ~ rgb2hex(95, 190, 212),
                              mgrRel == "Highly managed" ~ rgb2hex(49, 97, 124),
                              mgrRel == "Under-coached" ~ rgb2hex(89, 89, 89))) %>%
      ggplot(aes(x = !!sym(hrvar), y = perc, group = mgrRel, fill = Fill)) +
      geom_bar(position = "stack", stat = "identity") +
      geom_text(aes(label = paste(round(perc * 100), "%")),
                position = position_stack(vjust = 0.5),
                color = "#FFFFFF",
                fontface = "bold") +
      scale_fill_identity(name = "Coaching styles",
                          breaks = c(rgb2hex(68, 151, 169),
                                     rgb2hex(95, 190, 212),
                                     rgb2hex(49, 97, 124),
                                     rgb2hex(89, 89, 89)),
                          labels = c("Co-attending",
                                     "Coaching",
                                     "Highly managed",
                                     "Under-coached"),
                          guide = "legend") +
      coord_flip() +
      theme_wpa_basic() +
      labs(title = "Distribution of types of \nmanager-direct relationship across organizations",
           subtitle = "Based on manager 1:1 time and percentage of overall time spent with managers")

  }

  if(return == "plot"){

    plot +
      labs(title = "Distribution of types of \nmanager-direct relationship",
           subtitle = "Based on manager 1:1 time and percentage of\noverall time spent with managers")

  } else if(return == "table"){

    clean_tb %>%
      as_tibble() %>%
      return()

  } else if(return == "chartdata"){

    chart %>%
      as_tibble() %>%
      return()

  } else {

    stop("Please enter a valid input for `return`.")

  }
}
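For reference, a minimal usage sketch of `mgrrel_matrix()` follows. It assumes the bundled `sq_data` sample dataset referenced in the roxygen example above; the alternative `hrvar` value shown is purely illustrative.

```R
library(wpa)

mgrrel_matrix(sq_data)                           # default: 2x2 matrix plot
mgrrel_matrix(sq_data, hrvar = "Organization")   # stacked bar of coaching styles per organization
mgrrel_matrix(sq_data, return = "table")         # summary counts and shares by coaching style
mgrrel_matrix(sq_data, return = "chartdata")     # underlying chart data as a tibble
```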
@ -0,0 +1,42 @@
#' @title Sample Meeting Query dataset
#'
#' @description
#' A dataset generated from a Meeting Query from WpA.
#'
#' @format A data frame with 2000 rows and 30 variables:
#' \describe{
#'   \item{MeetingId}{ }
#'   \item{StartDate}{ }
#'   \item{StartTimeUTC}{ }
#'   \item{EndDate}{ }
#'   \item{EndTimeUTC}{ }
#'   \item{Attendee_meeting_hours}{ }
#'   \item{Attendees}{ }
#'   \item{Organizer_Domain}{ }
#'   \item{Organizer_FunctionType}{ }
#'   \item{Organizer_LevelDesignation}{ }
#'   \item{Organizer_Layer}{ }
#'   \item{Organizer_Region}{ }
#'   \item{Organizer_Organization}{ }
#'   \item{Organizer_zId}{ }
#'   \item{Organizer_attainment}{ }
#'   \item{Organizer_TimeZone}{ }
#'   \item{Organizer_HourlyRate}{ }
#'   \item{Organizer_IsInternal}{ }
#'   \item{Organizer_PersonId}{ }
#'   \item{IsCancelled}{ }
#'   \item{DurationHours}{ }
#'   \item{IsRecurring}{ }
#'   \item{Subject}{ }
#'   \item{TotalAccept}{ }
#'   \item{TotalNoResponse}{ }
#'   \item{TotalDecline}{ }
#'   \item{TotalNoEmailsDuringMeeting}{ }
#'   \item{TotalNoDoubleBooked}{ }
#'   \item{TotalNoAttendees}{ }
#'   \item{MeetingResources}{ }
#'
#' ...
#' }
#' @source \url{https://workplaceanalytics-demo.office.com/en-us/Home}
"mt_data"
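As a quick sanity check of the format documented above, the bundled dataset can be inspected as follows (a sketch, assuming the {wpa} package is installed and loaded):

```R
library(wpa)

dim(mt_data)             # expected: 2000 rows, 30 variables
dplyr::glimpse(mt_data)  # column names and types at a glance
```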
@ -0,0 +1,110 @@
#' @title Manager 1:1 Time Distribution
#'
#' @description
#' Analyze Manager 1:1 Time distribution.
#' Returns a stacked bar plot of different buckets of 1:1 time.
#' Additional options available to return a table with distribution elements.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics. Defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats median
#' @importFrom stats sd
#'
#' @family Managerial Relations
#'
#' @examples
#' one2one_dist(sq_data, hrvar = "Organization", return = "table")
#' @export

one2one_dist <- function(data, hrvar = "Organization", mingroup = 5, return = "plot") {

  ## Date range data frame
  myPeriod <- extract_date_range(data)

  ## Basic Data for bar plot
  plot_data <-
    data %>%
    rename(group = !!sym(hrvar)) %>%
    group_by(PersonId, group) %>%
    summarise(Minutes_with_manager_1_on_1 = mean(Meeting_hours_with_manager_1_on_1 * 60)) %>%
    ungroup() %>%
    left_join(data %>%
                rename(group = !!sym(hrvar)) %>%
                group_by(group) %>%
                summarise(Employee_Count = n_distinct(PersonId)),
              by = "group") %>%
    filter(Employee_Count >= mingroup)

  ## Create buckets of 1:1 time
  plot_data <-
    plot_data %>%
    mutate(bucket_manager_1_on_1 = case_when(Minutes_with_manager_1_on_1 == 0 ~ "0 minutes",
                                             Minutes_with_manager_1_on_1 > 0 & Minutes_with_manager_1_on_1 < 15 ~ "1 - 15 minutes",
                                             Minutes_with_manager_1_on_1 >= 15 & Minutes_with_manager_1_on_1 < 30 ~ "15 - 30 minutes",
                                             Minutes_with_manager_1_on_1 >= 30 ~ "30 min +"))

  ## Employee count / base size table
  plot_legend <-
    plot_data %>%
    group_by(group) %>%
    summarize(Employee_Count = first(Employee_Count)) %>%
    mutate(Employee_Count = paste("n=", Employee_Count))

  ## Data for bar plot
  plot_table <-
    plot_data %>%
    group_by(group, bucket_manager_1_on_1) %>%
    summarize(Employees = n(),
              Employee_Count = first(Employee_Count),
              percent = Employees / Employee_Count) %>%
    arrange(group, desc(bucket_manager_1_on_1))

  ## Table for annotation
  annot_table <-
    plot_legend %>%
    dplyr::left_join(plot_table, by = "group")

  ## Bar plot
  plot_object <-
    plot_table %>%
    ggplot(aes(x = group, y = Employees, fill = bucket_manager_1_on_1)) +
    geom_bar(stat = "identity", position = position_fill(reverse = TRUE)) +
    scale_y_continuous(labels = function(x) paste0(x * 100, "%")) +
    coord_flip() +
    annotate("text", x = plot_legend$group, y = -.05, label = plot_legend$Employee_Count) +
    scale_fill_manual(name = "", values = c("#FE7F4F", "#ffdfd3", "#bed6f2", "#e9f1fb")) +
    theme_wpa_basic() +
    labs(title = "Time with Manager", subtitle = paste("Scheduled 1:1 weekly meeting minutes by", tolower(hrvar))) +
    xlab(hrvar) +
    ylab("Fraction of employees") +
    labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))

  ## Table to return
  return_table <-
    plot_table %>%
    select(group, bucket_manager_1_on_1, percent) %>%
    spread(bucket_manager_1_on_1, percent)

  if(return == "table"){

    return_table %>%
      as_tibble() %>%
      return()

  } else if(return == "plot"){

    return(plot_object)

  } else {

    stop("Please enter a valid input for `return`.")

  }
}
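A short usage sketch for `one2one_dist()`, assuming the `sq_data` sample Standard Query dataset; the alternative `hrvar` and `mingroup` values shown are illustrative:

```R
library(wpa)

one2one_dist(sq_data)                            # stacked bars of 1:1 time buckets by Organization
one2one_dist(sq_data,
             hrvar = "LevelDesignation",
             mingroup = 10,
             return = "table")                   # wide table: one column per 1:1 time bucket
```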
@ -0,0 +1,47 @@
#' @title Manager 1:1 Time Trend - Line Chart
#'
#' @description
#' Provides a week by week view of 1:1 time with managers, visualised as line charts.
#' By default returns a line chart for 1:1 meeting hours,
#' with a separate panel per value in the HR attribute.
#' Additional options available to return a summary table.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization"
#' but accepts any character vector, e.g. "LevelDesignation"
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#'
#' @family Managerial Relations
#'
#' @examples
#'
#' ## Return a line plot
#' one2one_line(sq_data, hrvar = "LevelDesignation")
#'
#' ## Return a table
#' one2one_line(sq_data, hrvar = "LevelDesignation", return = "table")
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export
one2one_line <- function(data,
                         hrvar = "Organization",
                         mingroup = 5,
                         return = "plot"){

  ## Inherit arguments
  create_line(data = data,
              metric = "Meeting_hours_with_manager_1_on_1",
              hrvar = hrvar,
              mingroup = mingroup,
              return = return)
}
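Because `one2one_line()` is a thin wrapper around `create_line()`, the two calls below should produce the same output (a sketch, assuming the `sq_data` sample dataset):

```R
library(wpa)

one2one_line(sq_data, hrvar = "LevelDesignation")

create_line(sq_data,
            metric = "Meeting_hours_with_manager_1_on_1",  # the 1:1 manager metric hard-coded by the wrapper
            hrvar = "LevelDesignation",
            mingroup = 5,
            return = "plot")
```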
@ -0,0 +1,47 @@
#' @title Manager 1:1 Time Ranking
#'
#' @description
#' This function scans a standard query output for groups with high levels of 'Manager 1:1 Time'.
#' Returns a table with all groups (across multiple HR attributes) ranked by manager 1:1 meeting hours.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar A list of HR Variables to consider in the scan.
#' Defaults to all HR attributes identified.
#' @param mingroup Numeric value setting the privacy threshold / minimum group size.
#' Defaults to 5.
#' @param return A character vector specifying what to return.
#' Valid values include "table" (default, returning a data frame)
#' and "html" (returning an interactive DataTable).
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#' @importFrom stats reorder
#'
#' @family Managerial Relations
#'
#' @return
#' Returns a data frame by default, where 'table' is passed in `return`.
#' When 'html' is passed, an interactive DataTable is returned.
#'
#' @export

one2one_rank <- function(data,
                         hrvar = extract_hr(data),
                         mingroup = 5,
                         return = "table"){

  output <-
    data %>%
    create_rank(metric = "Meeting_hours_with_manager_1_on_1",
                hrvar = hrvar,
                mingroup = mingroup,
                return = "table")

  if(return == "html"){
    return(create_dt(output))
  } else if(return == "table"){
    return(output)
  } else {
    stop("Invalid `return` argument.")
  }
}
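A usage sketch for `one2one_rank()`, assuming the `sq_data` sample dataset; by default the scan covers all HR attributes detected by `extract_hr()`:

```R
library(wpa)

one2one_rank(sq_data)                   # data frame ranking groups across all HR attributes
one2one_rank(sq_data, return = "html")  # interactive DataTable rendered via create_dt()
```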
@ -0,0 +1,45 @@
#' @title Manager 1:1 Time Summary
#'
#' @description
#' Provides an overview analysis of Manager 1:1 Time.
#' Returns a bar plot showing average weekly minutes of Manager 1:1 Time by default.
#' Additional options available to return a summary table.
#'
#' @inheritParams create_bar
#'
#' @family Managerial Relations
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export

one2one_sum <- function(data,
                        hrvar = "Organization",
                        mingroup = 5,
                        return = "plot"){

  cleaned_data <-
    data %>%
    mutate(`Scheduled 1:1 meeting minutes with manager` = Meeting_hours_with_manager_1_on_1 * 60)

  create_bar(data = cleaned_data,
             hrvar = hrvar,
             mingroup = mingroup,
             metric = "Scheduled 1:1 meeting minutes with manager",
             return = return,
             bar_colour = "darkblue")

}
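A usage sketch for `one2one_sum()`, assuming the `sq_data` sample dataset; because the wrapper converts hours to minutes before delegating to `create_bar()`, the output is expressed in weekly minutes:

```R
library(wpa)

one2one_sum(sq_data)                                               # bar plot of weekly 1:1 minutes by Organization
one2one_sum(sq_data, hrvar = "LevelDesignation", return = "table") # summary table by level designation
```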
@ -0,0 +1,86 @@
#' @title Manager 1:1 Time Trend
#'
#' @description
#' Provides a week by week view of scheduled manager 1:1 Time.
#' By default returns a week by week heatmap, highlighting the points in time with most activity.
#' Additional options available to return a summary table.
#'
#' @param data A Standard Query dataset in the form of a data frame.
#' @param hrvar HR Variable by which to split metrics, defaults to "Organization" but accepts any character vector (e.g. "LevelDesignation")
#' @param mingroup Numeric value setting the privacy threshold / minimum group size. Defaults to 5.
#' @param return Character vector specifying what to return, defaults to "plot".
#' Valid inputs are "plot" and "table".
#'
#' @import dplyr
#' @import ggplot2
#' @import reshape2
#' @import scales
#'
#' @family Managerial Relations
#'
#' @return
#' Returns a ggplot object by default, where 'plot' is passed in `return`.
#' When 'table' is passed, a summary table is returned as a data frame.
#'
#' @export

one2one_trend <- function(data, hrvar = "Organization", mingroup = 5, return = "plot"){

  ## Date range data frame
  myPeriod <- extract_date_range(data)

  myTable <-
    data %>%
    mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
    rename(group = !!sym(hrvar)) %>% # Rename HR variable to `group`
    select(PersonId, Date, group, Meeting_hours_with_manager_1_on_1) %>%
    group_by(group) %>%
    mutate(Employee_Count = n_distinct(PersonId)) %>%
    filter(Employee_Count >= mingroup) # Keep only groups above the privacy threshold

  myTable <-
    myTable %>%
    group_by(Date, group) %>%
    summarize(Employee_Count = mean(Employee_Count),
              Minutes_with_manager_1_on_1 = mean(Meeting_hours_with_manager_1_on_1 * 60))

  myTable_plot <-
    myTable %>%
    select(Date, group, Minutes_with_manager_1_on_1)

  myTable_return <- myTable_plot %>% spread(Date, Minutes_with_manager_1_on_1)

  plot_object <-
    myTable_plot %>%
    ggplot(aes(x = Date, y = group, fill = Minutes_with_manager_1_on_1)) +
    geom_tile(height = .5) +
    scale_fill_gradient(name = "Minutes", low = "white", high = "red") +
    theme_wpa_basic() +
    labs(title = "Time with Manager",
         subtitle = paste("Average scheduled 1:1 meeting minutes by", tolower(hrvar))) +
    xlab("Date") +
    ylab(hrvar) +
    labs(caption = paste("Data from week of", myPeriod$Start, "to week of", myPeriod$End))

  if(return == "table"){

    myTable_return %>%
      as_tibble() %>%
      return()

  } else if(return == "plot"){

    return(plot_object)

  } else {

    stop("Please enter a valid input for `return`.")

  }
}
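Finally, a usage sketch for `one2one_trend()`, assuming the `sq_data` sample dataset; the table option spreads dates into columns, so you get one column per week:

```R
library(wpa)

one2one_trend(sq_data)                                               # week-by-week heatmap by Organization
one2one_trend(sq_data, hrvar = "LevelDesignation", return = "table") # wide table: one column per week
```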