
# Telemetry-batch-view Code Graveyard

This document records interesting code that we've deleted, so that it remains discoverable in the future.

## Heavy Users

Interesting bits: the Heavy Users job used a custom Partitioner called ConsistentPartitioner, which optimized for copartitioning the same client_ids together even as the set of client_ids grew and shrank over time.
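The general idea can be sketched with consistent hashing. This is an illustrative Python sketch, not the original Scala implementation; the class name and replica count are assumptions. The property it demonstrates is that a given client_id always lands in the same partition, and that changing the partition count moves only a small fraction of clients:

```python
import bisect
import hashlib


class ConsistentPartitioner:
    """Illustrative consistent-hashing partitioner (not the original Scala
    code): client_ids are mapped onto a hash ring, so most ids keep their
    partition even when the number of partitions changes."""

    def __init__(self, num_partitions, replicas=100):
        # Place several replica points per partition on the ring to
        # spread load evenly.
        self.ring = sorted(
            (self._hash(f"partition-{p}-replica-{r}"), p)
            for p in range(num_partitions)
            for r in range(replicas)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def partition(self, client_id):
        # A client_id belongs to the first ring point at or after its hash,
        # wrapping around at the end of the ring.
        i = bisect.bisect(self.keys, self._hash(client_id)) % len(self.ring)
        return self.ring[i][1]


p32 = ConsistentPartitioner(32)
p33 = ConsistentPartitioner(33)
ids = [f"client-{i}" for i in range(10_000)]
# Adding one partition should move only roughly 1/33 of the clients.
moved = sum(p32.partition(c) != p33.partition(c) for c in ids)
```

Compare this with Spark's default HashPartitioner, where changing the partition count reshuffles nearly every key.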

## Pioneer Online News Dwell Time v2

This dataset was created as a one-off for the Online News pioneer study. It built sessions measuring dwell time per TLD from logs sent by users, using a state machine to assemble the sessions, which is mildly interesting.
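The sessionization can be sketched as a small two-state machine. This is an illustrative Python sketch under assumed rules, not the original job: the idle timeout value and the exact session-break conditions are assumptions. It folds time-ordered (timestamp, tld) log events into sessions, closing a session when the user switches TLDs or goes idle past the timeout:

```python
from dataclasses import dataclass

# Assumed idle cutoff; the real study's value may have differed.
IDLE_TIMEOUT = 30 * 60  # seconds


@dataclass
class Session:
    tld: str
    start: int
    end: int

    @property
    def dwell_time(self):
        return self.end - self.start


def sessionize(events):
    """Two states: IDLE (current is None) and IN_SESSION (current is a
    Session). Events must be sorted by timestamp."""
    sessions = []
    current = None
    for ts, tld in events:
        if current is None:
            current = Session(tld, ts, ts)          # IDLE -> IN_SESSION
        elif tld != current.tld or ts - current.end > IDLE_TIMEOUT:
            sessions.append(current)                # close session, start anew
            current = Session(tld, ts, ts)
        else:
            current.end = ts                        # extend current session
    if current is not None:
        sessions.append(current)
    return sessions
```

For example, two events on the same TLD 60 seconds apart form one session with a dwell time of 60, while a gap longer than the timeout splits events on the same TLD into separate sessions.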

## Crash Aggregates

This dataset was created to count crashes on a daily basis, before we introduced error aggregates.

## Quantum RC

This dataset was created to monitor whether various metrics conformed to the Quantum release criteria.

## Crash Summary

This dataset was used to access crash pings, before we introduced the crash ping table in BigQuery.

## Generic Count and DAU

These datasets were created to count clients on a daily and monthly basis, before we introduced `clients_last_seen`.

## Longitudinal

This dataset was used to access all histograms for a 1% sample of clients, before we introduced the main ping table in BigQuery.

## Experiments Aggregates

This dataset was used for experiment analysis, before it was deprecated in Bug 1515134.

## Main Summary, Clients Daily, and Addons

These jobs were reimplemented as BigQuery SQL in `bigquery-etl/sql/telemetry_derived/`.

## Experiments Summary

This job was reimplemented as BigQuery SQL in `bigquery-etl/sql/telemetry_derived/experiments_v1/query.sql`.