# TagAnomaly

Anomaly detection labeling tool, specifically for multiple time series (one time series per category).

TagAnomaly is a tool for creating labeled data for anomaly detection models. It allows the labeler to select points on a time series and inspect them further, either by looking at the behavior of other time series over the same time range, or by looking at the raw data behind the time series (assuming the time series is an aggregated metric that counts events per time range).
Click here to deploy on Azure using Azure Container Instances.
The app has four main windows:

1. The labeling window:
   - Time series labeling
   - Selected points table view
   - View raw data for the selected window (if it exists)
2. Compare this category with others over time
3. Find proposed anomalies using the Twitter AnomalyDetection package (a standalone sketch follows this list)
4. Observe the changes in distribution between categories. This can help determine whether an anomaly was univariate or multivariate.
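As a rough illustration of what window 3 does, here is a minimal standalone call to Twitter's AnomalyDetection package. The toy data and parameter values are assumptions for the sketch, not TagAnomaly's actual settings:

```r
# Sketch: proposing anomalies with Twitter's AnomalyDetection package.
# Install: devtools::install_github("twitter/AnomalyDetection")
library(AnomalyDetection)

# Toy series: hourly counts with one obvious spike (made-up data)
df <- data.frame(
  timestamp = seq(as.POSIXct("2019-01-01 00:00:00"), by = "hour", length.out = 200),
  count     = c(rnorm(99, 100, 5), 500, rnorm(100, 100, 5))
)

# Parameter values here are illustrative only
res <- AnomalyDetectionTs(df, max_anoms = 0.02, direction = "both", plot = FALSE)
res$anoms  # timestamps and values of proposed anomalies
```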
## How to run locally

This tool uses the Shiny framework for visualizing events. To run it, you need R, and preferably RStudio. Once you have everything installed, open the project in RStudio and click *Run App*, or call `runApp()` from the console. You might need to manually install the required packages (see the snippet under Requirements).
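For example, from an R session whose working directory is the project root:

```r
# Launch TagAnomaly; the port is an assumption -- any free port works
shiny::runApp(port = 3838)
```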
## Requirements

- R (3.4.0 or above)

Used packages:

- shiny
- dplyr
- gridExtra
- shinydashboard
- DT
- ggplot2
- shinythemes
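If any of these are missing, a one-time install along these lines should cover them. The README also relies on `parsedate` (for date inference) and Twitter's `AnomalyDetection`, which lives on GitHub rather than CRAN:

```r
# One-time setup: CRAN packages used by TagAnomaly
install.packages(c("shiny", "dplyr", "gridExtra", "shinydashboard",
                   "DT", "ggplot2", "shinythemes", "parsedate"))

# Twitter's AnomalyDetection package is installed from GitHub
install.packages("devtools")
devtools::install_github("twitter/AnomalyDetection")
```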
## How to deploy using Docker

Option 1: Deploy to Azure Web App for Containers or Azure Container Instances. More details here (webapp) and here (container instances).

Option 2: Deploy this image to your own environment.

Dockerize the Shiny app: follow the steps on rize for deploying on shiny-server. The default port is 3838, so make sure it is open, or change the default port to something else.
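For option 2, a typical run might look like the following. The image name is a placeholder, not the project's published image:

```bash
# Placeholder image name -- substitute the image you built or pulled
docker run -d -p 3838:3838 <registry>/taganomaly
# The app is then reachable at http://localhost:3838
```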
## Instructions of use

1. Import a time series CSV file (a minimal file matching this structure is sketched after this list). Assumed structure:
   - date (`"%Y-%m-%d %H:%M:%S"`). TagAnomaly will attempt to infer the date from other patterns as well, using the parsedate package
   - category (optional)
   - value
2. (Optional) Import a raw data time series CSV file. If the original time series is an aggregation over time windows, this file holds the raw values themselves, so you can dive deeper into an anomalous value and see what it is composed of. Assumed structure:
   - date (`"%Y-%m-%d %H:%M:%S"`). TagAnomaly will attempt to infer the date from other patterns as well, using the parsedate package
   - category (optional)
   - content
3. Select a category (optional, if one exists).
4. Select a time range on the slider.
5. Select points on the plot that look anomalous.
   - Optional (1): click on a time range in the table below the plot to see the raw data for that range.
   - Optional (2): open the *All Categories* tab to see how other time series behave over the same time range.
6. Once you decide that these are actual anomalies, save the resulting table to CSV by clicking *Download labels set*, and continue to the next category.
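For reference, here is a minimal sketch of an input file matching the assumed structure. The column names are taken from the list above; the values are made up:

```r
# Write a tiny example time series in the structure TagAnomaly expects
ts <- data.frame(
  date     = c("2019-01-01 00:00:00", "2019-01-01 01:00:00", "2019-01-01 02:00:00"),
  category = c("sales", "sales", "sales"),
  value    = c(42, 37, 410)
)
write.csv(ts, "example_timeseries.csv", row.names = FALSE)
```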
## Current limitations/issues

It is currently impossible to have multiple selections on one plot. A workaround is to select one area, download the CSV, and then select the next area. Each downloaded CSV gets a random string in its name so that files don't override each other. Once labeling is finished, you can run the provided prep_labels.py file to concatenate all of TagAnomaly's output files into one CSV.
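If you prefer to stay in R, something along these lines achieves the same result as prep_labels.py. The file name pattern is an assumption; adjust it to match your downloaded files:

```r
# Concatenate all downloaded label CSVs into a single file
# NOTE: the "labels.*\\.csv" pattern is a guess -- match it to your file names
files <- list.files(pattern = "labels.*\\.csv$")
all_labels <- do.call(rbind, lapply(files, read.csv))
write.csv(all_labels, "all_labels.csv", row.names = FALSE)
```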
## Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.