telemetry-streaming

Spark Streaming ETL jobs for Mozilla Telemetry

This service currently contains jobs that aggregate error data on 5-minute intervals. It is responsible for generating the (internal-only) error_aggregates and experiment_error_aggregates Parquet tables at Mozilla.
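
For orientation, here is a minimal sketch of the kind of job this repository contains: a Spark Structured Streaming query that reads pings from Kafka, aggregates them on 5-minute event-time windows, and writes the results as Parquet. The broker address, topic name, output paths, and grouping columns below are illustrative assumptions, not the actual schema or configuration of the real jobs.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// A minimal sketch of a 5-minute windowed streaming aggregation, not the
// actual job: the real jobs parse the Telemetry ping payload into many
// dimensions and error counters before grouping.
object ErrorAggregatesSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("error-aggregates-sketch")
      .master("local[*]") // local mode for illustration only
      .getOrCreate()
    import spark.implicits._

    // Read raw messages from Kafka; broker address and topic are illustrative.
    val pings = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "telemetry")
      .load()

    // Count messages per 5-minute event-time window (keyed here by topic only).
    val aggregates = pings
      .withWatermark("timestamp", "10 minutes")
      .groupBy(window($"timestamp", "5 minutes"), $"topic")
      .count()

    // Write completed windows out as Parquet; paths are illustrative.
    aggregates.writeStream
      .format("parquet")
      .option("path", "/tmp/error_aggregates")
      .option("checkpointLocation", "/tmp/error_aggregates_checkpoints")
      .outputMode("append")
      .start()
      .awaitTermination()
  }
}
```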

Issue Tracking

Please file bugs in the Datasets: Error Aggregates component.

Development

The recommended workflow for running tests is to edit the source code in your favorite editor and run the tests via sbt. Some common sbt invocations:

  • sbt test # run the basic set of tests (good enough for most purposes)
  • sbt "testOnly *ErrorAgg*" # run only the test suites whose names match ErrorAgg (see the example suite after this list)
  • sbt "testOnly *ErrorAgg* -- -z version" # as above, but limited to test cases with "version" in their names
  • sbt dockerComposeTest # run the docker-compose tests (slow)
  • sbt "dockerComposeTest -tags:DockerComposeTag" # run only tests tagged with DockerComposeTag (while using docker)
  • sbt ci # run all tests
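
To make the filtering and tagging options above concrete, here is a hedged sketch of what a matching suite could look like, assuming ScalaTest's FlatSpec/Matchers style. The class name, test names, and the DockerComposeTag object are illustrative and not necessarily the project's actual definitions.

```scala
import org.scalatest.{FlatSpec, Matchers, Tag}

// Illustrative tag object; the tag actually used with `-tags:DockerComposeTag`
// may be defined elsewhere in the project.
object DockerComposeTag extends Tag("DockerComposeTag")

// A hypothetical suite: `sbt "testOnly *ErrorAgg*"` selects it by class name,
// and `-- -z version` keeps only test cases whose names contain "version".
class ErrorAggregatorTest extends FlatSpec with Matchers {

  "The aggregator" should "count errors per version" in {
    // selected by `-z version`
    val counts = Seq("57.0", "57.0", "58.0")
      .groupBy(identity)
      .map { case (version, xs) => version -> xs.size }
    counts should contain ("57.0" -> 2)
  }

  it should "read pings from Kafka" taggedAs DockerComposeTag in {
    // only runs under `sbt "dockerComposeTest -tags:DockerComposeTag"`,
    // when the docker-compose Kafka cluster is available
    assert(true)
  }
}
```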

Some tests need Kafka. If you prefer to run them from an IDE, you first need to start the test cluster:

sbt dockerComposeUp

or via plain docker-compose:

export DOCKER_KAFKA_HOST=$(./docker_setup.sh)
docker-compose -f docker/docker-compose.yml up
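
Once the cluster is up, a test run from the IDE needs to know where Kafka is listening. Below is a minimal sketch of resolving that address, assuming the DOCKER_KAFKA_HOST variable exported above and Kafka's default port 9092; the helper object itself is hypothetical.

```scala
// Hypothetical helper: resolve the Kafka bootstrap address for tests run
// outside of sbt/docker-compose. DOCKER_KAFKA_HOST is set via docker_setup.sh;
// the fallback host and the port are assumptions.
object KafkaTestConfig {
  val kafkaHost: String = sys.env.getOrElse("DOCKER_KAFKA_HOST", "localhost")
  val bootstrapServers: String = s"$kafkaHost:9092"
}
```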

Remember to shut down the cluster afterwards:

sbt dockerComposeStop