Aggregator job for Telemetry.
Wesley Dawson adb95aa886
Minimum change to use PBD main v5
2023-10-27 15:47:03 -07:00
.circleci Updated config.yml 2023-01-13 18:07:33 +01:00
bin Minimum change to use PBD main v5 2023-10-27 15:47:03 -07:00
mozaggregator Minimum change to use PBD main v5 2023-10-27 15:47:03 -07:00
queries Add drop tables query 2019-05-23 10:06:15 -05:00
requirements updated http to https for link to moztelemetry git repository 2023-01-13 16:57:39 +01:00
script/validation Add script for validating a copy of the database 2020-03-05 11:11:42 -08:00
tests [Bug 1656910] Remove saved-session ping processing 2020-09-16 15:29:27 -04:00
.coveragerc Update docker config to help out devs 2018-03-08 23:36:26 -06:00
.flake8 Add flake8 config 2018-03-08 23:36:26 -06:00
.gitignore Hack together a patched image 2023-09-20 15:35:30 -07:00
.pyup.yml Add pyup.io config 2018-03-08 23:36:26 -06:00
CODE_OF_CONDUCT.md See PR for details 2019-05-30 11:55:14 -05:00
Dockerfile Hack together a patched image 2023-09-20 15:35:30 -07:00
Dockerfile.dev Add Google Cloud SDK to the docker image 2020-01-02 12:23:59 -08:00
Makefile Update Docker config after feedback 2018-03-14 16:02:44 -07:00
README.md Rename branch in badge 2021-02-03 12:47:18 -08:00
build.sh Hack together a patched image 2023-09-20 15:35:30 -07:00
docker-compose.yml Expose postgres ports 2020-04-22 13:39:53 -04:00
setup.py updated http to https for link to moztelemetry git repository 2023-01-13 16:57:39 +01:00

README.md

python_mozaggregator

Aggregator job for Telemetry. See this blog post for details.

Development and deployment

To clean, build, and run all containers:

make up

To build the containers and open a shell in the web container:

make shell

To manually open a shell in a running web container (e.g. after a 'make up'):

docker ps
docker exec -it <CONTAINER_ID of web container> /bin/bash

To build and run the tests inside the dev container:

make test

To manually run the tests in a running web container:

make shell
./bin/run test

Deployment

The following environment variables need to be set up in hiera-sops: POSTGRES_HOST, POSTGRES_RO_HOST, and POSTGRES_PASS. There are Jenkins pipeline jobs to deploy this service. See cloudops-deployment/projects/mozaggregator for details.

Enabling and Disabling Metrics

To completely disable viewing of a metric, add it to METRICS_BLACKLIST. However the service is deployed, users will never be able to see that metric. Blacklist entries are regular expressions; for example, to disable all metrics prefixed with "USER_DATA", add r"USER_DATA.*".
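The effect of such a pattern can be sketched as follows. This is a minimal illustration that assumes each blacklist entry is matched from the start of the metric name with Python's `re.match`; the helper `is_blacklisted` and the sample list are hypothetical and are not taken from the service's code.

```python
import re

# Hypothetical blacklist for illustration; the real METRICS_BLACKLIST
# is defined in the service's own configuration.
METRICS_BLACKLIST = [r"USER_DATA.*"]


def is_blacklisted(metric, patterns=METRICS_BLACKLIST):
    """Return True if any pattern matches the metric name from its start."""
    return any(re.match(pattern, metric) for pattern in patterns)


print(is_blacklisted("USER_DATA_EXAMPLE"))  # True
print(is_blacklisted("GC_MS"))              # False
```

Under this assumption a name such as MY_USER_DATA would not be hidden, because `re.match` anchors at the beginning of the string; a pattern like r".*USER_DATA.*" would be needed to catch it.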

Release Metrics

By default, release metrics are not allowed by the service. To enable a specific release metric, add it to PUBLIC_RELEASE_METRICS. It will then be viewable publicly.

To enable all release metrics (except those in METRICS_BLACKLIST), set the environment variable ALLOW_ALL_RELEASE_METRICS to "True".
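The precedence between these settings can be sketched as below. This is an assumption-laden illustration, not the service's implementation: the function `release_metric_allowed`, the sample list contents, and the use of `re.match` are all hypothetical.

```python
import os
import re

# Hypothetical values for illustration only.
METRICS_BLACKLIST = [r"USER_DATA.*"]
PUBLIC_RELEASE_METRICS = {"GC_MS"}


def release_metric_allowed(metric, environ=os.environ):
    """Decide whether a release-channel metric may be served."""
    # The blacklist always wins, regardless of the other settings.
    if any(re.match(pattern, metric) for pattern in METRICS_BLACKLIST):
        return False
    # ALLOW_ALL_RELEASE_METRICS="True" opens up everything not blacklisted.
    if environ.get("ALLOW_ALL_RELEASE_METRICS") == "True":
        return True
    # Otherwise only explicitly listed metrics are public.
    return metric in PUBLIC_RELEASE_METRICS


allow_all = {"ALLOW_ALL_RELEASE_METRICS": "True"}
print(release_metric_allowed("GC_MS", {}))                     # True
print(release_metric_allowed("A11Y_CONSUMERS", {}))            # False
print(release_metric_allowed("A11Y_CONSUMERS", allow_all))     # True
print(release_metric_allowed("USER_DATA_EXAMPLE", allow_all))  # False
```

Passing the environment as a plain dictionary keeps the sketch easy to try without changing real environment variables.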

API

Aggregates are made available through an HTTP API. There are two kinds of aggregates: per submission date (the date a ping is received by the server) and per build-id (the date the submitting product was built).

To access the aggregates, use the aggregates_by/build_id/ and aggregates_by/submission_date/ prefixes, respectively.

In the URLs below, replace https://SERVICE with the origin of this service's instance. The official service is https://aggregates.telemetry.mozilla.org.

The following examples are based on build-id aggregates. Replace build_id with submission_date to use aggregates per submission date instead.
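Switching between the two kinds of aggregates only changes the URL prefix, which a small helper can make explicit. The function name `aggregates_prefix` is hypothetical; only the two prefixes and the service origin come from the text above.

```python
VALID_KINDS = ("build_id", "submission_date")


def aggregates_prefix(service, kind):
    """Return the URL prefix for the given kind of aggregate.

    `service` is the origin of a service instance, including the scheme.
    """
    if kind not in VALID_KINDS:
        raise ValueError(f"kind must be one of {VALID_KINDS}, got {kind!r}")
    return f"{service.rstrip('/')}/aggregates_by/{kind}/"


base = aggregates_prefix("https://aggregates.telemetry.mozilla.org", "build_id")
print(base + "channels/")
# https://aggregates.telemetry.mozilla.org/aggregates_by/build_id/channels/
```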

Get available channels:
curl -X GET https://SERVICE/aggregates_by/build_id/channels/
["nightly","beta","release"]

Get a list of options for the available dimensions on a given channel and version:
curl -X GET "https://SERVICE/filters/?channel=nightly&version=42"
{"metric":["A11Y_CONSUMERS","A11Y_IATABLE_USAGE_FLAG",...],
 "application":["Fennec","Firefox"],
 ...}

Get a list of available build-ids for a given channel:
curl -X GET "https://SERVICE/aggregates_by/build_id/channels/nightly/dates/"
[{"date":"20150630","version":"42"}, {"date":"20150629","version":"42"}]

Given a set of build-ids, retrieve for each build-id the aggregated histogram that matches the requested filters:
curl -X GET "https://SERVICE/aggregates_by/build_id/channels/nightly/?version=41&dates=20150615,20150616&metric=GC_MS&os=Windows_NT"
{"buckets":[0, ..., 10000],
 "data":[{"date":"20150615",
          "count":239459,
          "sum": 412346123,
          "histogram":[309, ..., 5047],
          "label":""},
         {"date":"20150616",
          "count":233688,
          "sum": 402241121,
          "histogram":[306, ..., 7875],
          "label":""}],
 "kind":"exponential",
 "description":"Time spent running JS GC (ms)"}

The available filters are:

  • metric, e.g. JS_TELEMETRY_ADDON_EXCEPTIONS
  • application, e.g. Firefox
  • architecture, e.g. x86
  • os, e.g. Windows_NT
  • osVersion, e.g. 6.1
  • label, e.g. Adblock-Plus
  • child, e.g. true, meaningful only if e10s is enabled
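The filters listed above are passed as query parameters alongside version and dates. A minimal sketch of assembling such a query string with the standard library follows; the helper `build_query` and its validation against the filter list are hypothetical conveniences, not part of the service.

```python
from urllib.parse import urlencode

# Filter names as listed above.
KNOWN_FILTERS = {
    "metric", "application", "architecture", "os", "osVersion", "label", "child",
}


def build_query(version, dates, **filters):
    """Build the query string for a histogram request."""
    unknown = set(filters) - KNOWN_FILTERS
    if unknown:
        raise ValueError(f"unknown filters: {sorted(unknown)}")
    params = {"version": version, "dates": ",".join(dates)}
    params.update(filters)
    # Keep commas unescaped so the dates list reads as in the curl example.
    return urlencode(params, safe=",")


print(build_query("41", ["20150615", "20150616"], metric="GC_MS", os="Windows_NT"))
# version=41&dates=20150615,20150616&metric=GC_MS&os=Windows_NT
```

Rejecting unknown filter names early is a design choice of this sketch; how the service itself treats unrecognized parameters is not stated here.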

A reply has the following attributes:

  • buckets, which represents the bucket labels of the histogram
  • kind, the kind of histogram (e.g. exponential)
  • description, the histogram description
  • data, which is an array of metric objects with the following attributes:
    • date: a build-id (or a submission date for per-submission-date aggregates)
    • count: number of metrics aggregated
    • sum: sum of accumulated values
    • histogram: bucket values
    • label: for keyed histograms, the key the entry belongs to; otherwise an empty string

Keyed histograms have the same format as unkeyed histograms, but there may be multiple metric objects with the same date, each with a different key (label).
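Because a keyed histogram can return several entries per date, a client typically groups the entries by date and label. The sketch below does that on a made-up reply; the values, the "linear" kind, and the helper `histograms_by_date` are illustrative assumptions, and only the attribute names follow the reply format described above.

```python
from collections import defaultdict

# Made-up reply for a hypothetical keyed histogram; values are illustrative only.
reply = {
    "buckets": [0, 1, 2],
    "kind": "linear",
    "description": "Hypothetical keyed histogram",
    "data": [
        {"date": "20150615", "count": 4, "sum": 4, "histogram": [1, 2, 1], "label": "a"},
        {"date": "20150615", "count": 2, "sum": 4, "histogram": [0, 0, 2], "label": "b"},
        {"date": "20150616", "count": 1, "sum": 0, "histogram": [1, 0, 0], "label": "a"},
    ],
}


def histograms_by_date(reply):
    """Map each date to {label: {bucket label: bucket value}}."""
    grouped = defaultdict(dict)
    for entry in reply["data"]:
        # Pair every bucket label with the value at the same position.
        buckets = dict(zip(reply["buckets"], entry["histogram"]))
        grouped[entry["date"]][entry["label"]] = buckets
    return dict(grouped)


grouped = histograms_by_date(reply)
print(sorted(grouped["20150615"]))  # ['a', 'b']
print(grouped["20150615"]["b"])     # {0: 0, 1: 0, 2: 2}
```

For an unkeyed histogram the same function yields a single entry per date under the empty-string label.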