Commit graph

8 commits

Author SHA1 Message Date
Anthony Miyaguchi ca2204625d
Add views for logical Fenix app ids in GLAM ETL (#1221)
* Add views for logical app ids

* Add new generated sql

* Update generate_glean_sql script to handle logical apps

* Update logical app view for partitiontime

* Make sure to generate view for all of the app ids

* Update last versions to be logical app id agnostic

* Add formatting for black

* Fix linting error

* Update bigquery_etl/glam/generate.py

Co-authored-by: Ben Wu <benjaminwu124@gmail.com>

* Add "all" option to STAGE

* Add new metrics added since last PR

Co-authored-by: Ben Wu <benjaminwu124@gmail.com>
2020-08-17 15:05:15 -07:00
Ben Wu a58821eaae
Add build date udf mapping for fenix_nightly (#1218)
2020-08-06 15:32:45 -04:00
Ben Wu ab50e40fc6
Generalize fenix glam generate and run code (#1183)
2020-07-20 11:25:27 -04:00
Anthony Miyaguchi 684380d37e
Add build date to glam_etl templates (#1109)
* Add updates to the glam fenix etl

* Update extract functions to include build date

* Add build_date to exports
2020-07-07 14:14:14 -07:00
Anthony Miyaguchi 2e68006911
[glam-etl] Use static combinations and avoid double counting (#1002)
* Use compact notation for static combinations

* Avoid double counting by moving cubing into bucket counts

* Modify udf_merged_user_data for reuse outside of clients aggregates

* Duplicate rows from all combos by client id
2020-05-28 11:53:03 -07:00
Anthony Miyaguchi a46197d268
Change naming convention of sql/glam_etl (#912)
* Rename all queries to use org_mozilla_fenix__

* Change all references within jobs to use new prefix

* Update run_glam_sql for new naming convention

* Rename probe counts and histogram counts

* Rename generate_fenix_sql to generate_sql

* Fix lint error and add new views to publish_view ignore

* Add updated clients daily queries

* Add consistent usage of header for better BigQuery job logs

* Add parameters for product to generate SQL for.

* Generate schemas in parallel

* Rename generate_sql to generate_glean_sql
2020-05-21 11:54:36 -07:00
Anthony Miyaguchi 82a6a5f687
Fenix exports for GLAM (#870)
* Add views for extracting to glam

* wip: Add export script

* Rename extract queries and don't run them

* Add user counts

* Add generated sql

* Update extract queries to rely on current project

* Fix optional day partitioning

* Fix extraction to glam-fenix-dev project

* Add globs for ignoring dryrun and format

* Reorder columns in probe count extract

* Filter on null channel and version

* Do not print header

* Refactor datacube groupings and fix scalar_percentiles

* Rename extract tables

* Convert user counts into a view to avoid needless materialization

* Rename client probe counts to probe counts

* Update publish_views so job does not fail
2020-04-14 11:45:59 -07:00
Anthony Miyaguchi 6e3351272e
Cleanup GLAM etl queries for Fenix (#851)
* Rename scalar_aggregates_incremental and remove telemetry

* Remove ping-type and telemetry variables

* Add initial files for final views

* Add glam templates to format ignore

* Add blacked scalar_percentiles

* Generate view for scalar aggregates

* Add generated view for daily histogram aggregate view

* Add probe counts to generated views

* Generalize writing out queries

* Add a latest_versions to generate

* Add histogram_percentiles to generate

* Add scalar bucket counts to generate

* Add histogram bucket counts to generate

* Add clients scalar aggregates to generate

* Move probe counts to generate

* Add scalar percentiles to generate

* Add clients histogram aggregates to generate

* Use generate within generate_fenix_sql script

* Fix probe_counts view and bucket counts

* Rename template for scalar bucket counts

* Fix client probe counts view

* Remove irrelevant telemetry where clause

* Add docstrings and shorten lines

* Add probe counts view to dryrun and publish_views ignores

* Rename udf for merged user data

* Use python3
2020-04-03 14:17:32 -07:00