2018-08-08 03:11:03 +03:00
|
|
|
[![Build Status](https://circleci.com/gh/mozilla/telemetry-streaming/tree/master.svg?style=svg)](https://circleci.com/gh/mozilla/telemetry-streaming/tree/master)
|
2017-02-13 15:19:07 +03:00
|
|
|
[![codecov.io](https://codecov.io/github/mozilla/telemetry-streaming/coverage.svg?branch=master)](https://codecov.io/github/mozilla/telemetry-streaming?branch=master)
|
2017-02-10 15:46:04 +03:00
|
|
|
|
2019-12-05 22:09:03 +03:00
|
|
|
*This repository is no longer in use at Mozilla! It was designed to be run on our AWS-based telemetry infrastructure*
|
2019-12-05 22:08:40 +03:00
|
|
|
|
2017-02-10 14:47:10 +03:00
|
|
|
# telemetry-streaming
|
|
|
|
Spark Streaming ETL jobs for Mozilla Telemetry
|
2017-11-01 22:57:09 +03:00
|
|
|
|
2018-01-30 19:42:29 +03:00
|
|
|
This service currently contains jobs that aggregate error data
|
2018-01-30 18:44:05 +03:00
|
|
|
on 5 minute intervals. It is responsible for generating the (internal only)
|
|
|
|
`error_aggregates` and `experiment_error_aggregates` parquet tables at
|
|
|
|
Mozilla.
|
|
|
|
|
|
|
|
## Issue Tracking
|
|
|
|
|
2018-12-15 00:14:06 +03:00
|
|
|
Please file bugs related to the error aggregates streaming job in the
|
|
|
|
[Datasets: Error Aggregates](https://bugzilla.mozilla.org/enter_bug.cgi?product=Data%20Platform%20and%20Tools&component=Datasets%3A%20Error%20Aggregates) component.
|
|
|
|
|
|
|
|
## Deployment
|
|
|
|
|
|
|
|
The jobs defined in this repository are generally deployed as streaming jobs within
|
|
|
|
[our hosted Databricks account](https://docs.telemetry.mozilla.org/concepts/pipeline/data_pipeline_detail.html?highlight=databricks#databricks-managed-spark-analysis),
|
|
|
|
but some are deployed as periodic batch jobs via Airflow
|
|
|
|
using wrappers codified in
|
|
|
|
[telemetry-airflow](https://github.com/mozilla/telemetry-airflow)
|
|
|
|
that spin up EMR clusters whose configuration is governed by
|
|
|
|
[emr-bootstrap-spark](https://github.com/mozilla/emr-bootstrap-spark/).
|
|
|
|
Changes in production behavior that don't seem to correspond to changes
|
|
|
|
in this repository's code could be related to changes in those other projects.
|
2018-01-30 18:44:05 +03:00
|
|
|
|
2018-07-31 21:08:38 +03:00
|
|
|
## Amplitude Event Configuration
|
|
|
|
|
|
|
|
Some of the jobs defined in `telemetry-streaming` exist to transform telemetry events
|
|
|
|
and republish to [Amplitude](https://amplitude.com/) for further analysis.
|
|
|
|
Filtering and transforming events is accomplished via JSON configurations.
|
|
|
|
If you're creating or updating such a schema, see:
|
|
|
|
|
|
|
|
- [Amplitude event configuration docs](docs/amplitude)
|
|
|
|
|
2017-11-01 22:57:09 +03:00
|
|
|
## Development
|
|
|
|
|
|
|
|
The recommended workflow for running tests is to use your favorite editor for editing
|
|
|
|
the source code and running the tests via sbt. Some common invocations for sbt:
|
|
|
|
|
|
|
|
* `sbt test # run the basic set of tests (good enough for most purposes)`
|
|
|
|
* `sbt "testOnly *ErrorAgg*" # run the tests only for packages matching ErrorAgg`
|
|
|
|
* `sbt "testOnly *ErrorAgg* -- -z version" # run the tests only for packages matching ErrorAgg, limited to test cases with "version" in them`
|
|
|
|
* `sbt dockerComposeTest # run the docker compose tests (slow)`
|
|
|
|
* `sbt "dockerComposeTest -tags:DockerComposeTag" # run only tests with DockerComposeTag (while using docker)`
|
2018-08-03 18:50:17 +03:00
|
|
|
* `sbt scalastyle test:scalastyle # run linter`
|
|
|
|
* `sbt ci # run the full set of continuous integration tests`
|
2018-03-15 18:22:03 +03:00
|
|
|
|
|
|
|
Some tests need Kafka to run. If one prefers to run them via IDE, it's required to run the test cluster:
|
|
|
|
```bash
|
|
|
|
sbt dockerComposeUp
|
|
|
|
```
|
|
|
|
or via plain docker-compose:
|
|
|
|
```bash
|
|
|
|
export DOCKER_KAFKA_HOST=$(./docker_setup.sh)
|
|
|
|
docker-compose -f docker/docker-compose.yml up
|
|
|
|
```
|
|
|
|
It's also good to shut down the cluster afterwards:
|
|
|
|
```bash
|
|
|
|
sbt dockerComposeStop
|
|
|
|
```
|