python_mozaggregator

Aggregator job for Telemetry. See this blog post for details.

Development and deployment

To clean, build, and run all containers:

make up

To build the containers and open a shell in the web container:

make shell

To manually open a shell in a running web container (e.g. after a 'make up'):

docker ps
docker exec -it <CONTAINER_ID of web container> /bin/bash

To build and run tests inside dev container:

make test

To manually run the tests in a running web container, open a shell in the container and run:

./run-tests.sh

TODO(hwoo) Migrate deployment to dockerflow and remove ansible/vagrant stuff, update readme

To deploy a new version of the aggregation service to the cloud:

ansible-playbook ansible/deploy.yml -e '@ansible/envs/dev.yml' -i ansible/inventory

To connect to the database from the host while it runs inside the Vagrant VM (for example with pgAdmin), use localhost as the host, 5432 as the port, vagrant as the username, and a blank password.

API

Aggregates are made available through an HTTP API. There are two kinds of aggregates: per submission date (the date a ping is received by the server) and per build-id (the date the submitting product was built).

To access the aggregates, use the aggregates_by/build_id/ and aggregates_by/submission_date/ prefixes, respectively.

In the URLs below, replace SERVICE with the origin of this service's instance. The official service is http://aggregates.telemetry.mozilla.org.

The following examples are based on build-id aggregates. Replace build_id with submission_date to use aggregates per submission date instead.

Get available channels:

curl -X GET http://SERVICE/aggregates_by/build_id/channels/
["nightly","beta","aurora"]

Get a list of options for the available dimensions on a given channel and version:

curl -X GET "http://SERVICE/filters/?channel=nightly&version=42"
{"metric":["A11Y_CONSUMERS","A11Y_IATABLE_USAGE_FLAG",...],
 "application":["Fennec","Firefox"],
 ...}

Get a list of available build-ids for a given channel:

curl -X GET "http://SERVICE/aggregates_by/build_id/channels/nightly/dates/"
[{"date":"20150630","version":"42"}, {"date":"20150629","version":"42"}]

Given a set of build-ids, retrieve for each build-id the aggregated histogram that matches the requested filters:

curl -X GET "http://SERVICE/aggregates_by/build_id/channels/nightly/?version=41&dates=20150615,20150616&metric=GC_MS&os=Windows_NT"
{"buckets":[0, ..., 10000],
 "data":[{"date":"20150615",
          "count":239459,
          "sum": 412346123,
          "histogram":[309, ..., 5047],
          "label":""},
         {"date":"20150616",
          "count":233688,
          "sum": 402241121,
          "histogram":[306, ..., 7875],
          "label":""}],
 "kind":"exponential",
 "description":"Time spent running JS GC (ms)"}

The available filters are:

  • metric, e.g. JS_TELEMETRY_ADDON_EXCEPTIONS
  • application, e.g. Firefox
  • architecture, e.g. x86
  • os, e.g. Windows_NT
  • osVersion, e.g. 6.1
  • e10sEnabled, e.g. true
  • label, e.g. Adblock-Plus
  • child, e.g. true, meaningful only if e10s is enabled
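
As a rough illustration of how the prefix, channel, and filters combine into a request URL, here is a minimal Python sketch. The helper name aggregates_url is hypothetical and not part of mozaggregator; the filter names come from the list above, and lowercasing booleans to true/false is an assumption based on the "e.g. true" examples.

```python
from urllib.parse import urlencode

# Filter names taken from the list of available filters.
KNOWN_FILTERS = {
    "metric", "application", "architecture", "os",
    "osVersion", "e10sEnabled", "label", "child",
}


def aggregates_url(service, kind, channel, version, dates, **filters):
    """Build a histogram query URL (hypothetical helper, not part of mozaggregator).

    kind is "build_id" or "submission_date"; dates is a list of YYYYMMDD strings.
    """
    if kind not in ("build_id", "submission_date"):
        raise ValueError("unknown aggregate kind: %r" % kind)
    unknown = set(filters) - KNOWN_FILTERS
    if unknown:
        raise ValueError("unknown filters: %s" % ", ".join(sorted(unknown)))

    params = [("version", version), ("dates", ",".join(dates))]
    for name, value in sorted(filters.items()):
        # Assumption: booleans are sent as lowercase "true"/"false".
        if isinstance(value, bool):
            value = str(value).lower()
        params.append((name, value))
    # safe="," keeps the comma-separated date list unescaped, as in the curl example.
    query = urlencode(params, safe=",")
    return "%s/aggregates_by/%s/channels/%s/?%s" % (
        service.rstrip("/"), kind, channel, query)


url = aggregates_url(
    "http://SERVICE", "build_id", "nightly",
    version="41", dates=["20150615", "20150616"],
    metric="GC_MS", os="Windows_NT",
)
print(url)
```

Passing "submission_date" as kind switches to the per-submission-date aggregates without changing anything else in the call.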

A reply has the following attributes:

  • buckets, which represents the bucket labels of the histogram
  • kind, the kind of histogram (e.g. exponential)
  • description, the histogram description
  • data, which is an array of metric objects with the following attributes:
    • date: a build-id (or a submission date, for per-submission-date aggregates)
    • count: number of metrics aggregated
    • sum: sum of accumulated values
    • histogram: bucket values
    • label: for keyed histograms, the key the entry belongs to, or otherwise a blank string
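
The attributes above can be read back with a few lines of Python. This is only a sketch: the reply below is a toy with made-up numbers, shaped like the example response, and summarize is a hypothetical helper. Treating sum divided by count as a mean assumes that count is the number of entries contributing to sum.

```python
import json

# Toy reply with made-up numbers, shaped like the example response.
reply = json.loads("""
{"buckets": [0, 1, 2],
 "data": [{"date": "20150615", "count": 4, "sum": 10,
           "histogram": [1, 2, 1], "label": ""}],
 "kind": "exponential",
 "description": "toy histogram"}
""")


def summarize(reply):
    """Return one summary dict per metric object in reply["data"]."""
    rows = []
    for entry in reply["data"]:
        rows.append({
            "date": entry["date"],
            "label": entry["label"],
            # sum / count; None when nothing was aggregated.
            "mean": entry["sum"] / entry["count"] if entry["count"] else None,
            # Pair each bucket label with the value recorded for that bucket.
            "by_bucket": dict(zip(reply["buckets"], entry["histogram"])),
        })
    return rows


summary = summarize(reply)
print(summary)
```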

Keyed histograms have the same format as unkeyed histograms, but there can be multiple metric objects with the same date, each with a different key (label).
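
Because several metric objects can share a date in a keyed histogram, a client may want to index them by date and then by label. A minimal sketch, using made-up metric objects and a hypothetical group_by_date helper:

```python
from collections import defaultdict

# Made-up metric objects for a keyed histogram: two labels share one date.
data = [
    {"date": "20150615", "label": "key-a", "count": 3, "sum": 30, "histogram": [1, 2]},
    {"date": "20150615", "label": "key-b", "count": 1, "sum": 5, "histogram": [1, 0]},
    {"date": "20150616", "label": "key-a", "count": 2, "sum": 8, "histogram": [0, 2]},
]


def group_by_date(data):
    """Map each date to a {label: metric object} dict."""
    grouped = defaultdict(dict)
    for entry in data:
        grouped[entry["date"]][entry["label"]] = entry
    return dict(grouped)


grouped = group_by_date(data)
for date in sorted(grouped):
    print(date, sorted(grouped[date]))
```

For an unkeyed histogram the same helper yields a single entry per date under the blank label.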