Glean Usage
This generator generates the following queries for Glean applications:
- baseline_clients_daily: A daily aggregate of baseline pings per client_id
- baseline_clients_first_seen: Captures the earliest server date that we observe a particular client in the baseline table.
- baseline_clients_last_seen: Captures activity history of each client in 28-day windows for each submission date based on baseline pings.
- clients_last_seen_joined: Joins baseline and metrics views
- events_unnested: A view of unnested events
- metrics_clients_daily: Daily per-client aggregates on top of metrics pings
- metrics_clients_last_seen: Window over the previous 28 days of the clients metrics daily table
- App-specific views for Glean pings: a pointer to the main view to the stable ping table for the release channel of each Glean application
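To make the idea of a 28-day activity window more concrete, here is a minimal sketch. It is an illustration only: the function name activity_window and the list-of-booleans representation are assumptions made for this example, not how the generated queries store activity history.

```python
from datetime import date, timedelta


def activity_window(active_dates, submission_date, days=28):
    """Return one boolean per day in the window, most recent day first.

    Hypothetical helper: entry 0 is the submission date itself, entry 27
    is the day 27 days earlier.
    """
    active = set(active_dates)
    return [
        (submission_date - timedelta(days=offset)) in active
        for offset in range(days)
    ]


# A client seen on the submission date and on the oldest day of the window.
window = activity_window(
    [date(2024, 1, 28), date(2024, 1, 1)],
    submission_date=date(2024, 1, 28),
)
print(sum(window), "active day(s) in the last", len(window), "days")
```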
Depending on the specific query, queries get generated for per-app_id datasets and/or per app.
For example, for datasets related to Fenix this means that for each app_id (org_mozilla_firefox, org_mozilla_fenix_nightly, org_mozilla_fennec_aurora, org_mozilla_firefox_beta, org_mozilla_fenix) queries are generated that write their results to tables in the associated dataset. Additionally, queries write results to the app dataset fenix, which essentially UNIONs the results of the per-app_id datasets.
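The relationship between the per-app_id datasets and the app dataset can be sketched as follows. This is a simplified illustration: the helper union_per_app_id_tables, the project name my-project, the use of UNION ALL, and the exact dataset and table names are assumptions for this example. The real generator renders SQL templates rather than building strings like this.

```python
# Hypothetical sketch: build a query that combines one table across all
# per-app_id datasets of an app. Not the generator's actual implementation.
FENIX_APP_IDS = [
    "org_mozilla_firefox",
    "org_mozilla_fenix_nightly",
    "org_mozilla_fennec_aurora",
    "org_mozilla_firefox_beta",
    "org_mozilla_fenix",
]


def union_per_app_id_tables(project_id, app_ids, table):
    """Return SQL that selects from `table` in every per-app_id dataset."""
    selects = [
        f"SELECT * FROM `{project_id}.{app_id}.{table}`" for app_id in app_ids
    ]
    return "\nUNION ALL\n".join(selects)


sql = union_per_app_id_tables("my-project", FENIX_APP_IDS, "baseline_clients_daily")
print(sql)
```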
Tables for new Glean apps are generated automatically during nightly table deployment runs.
Adding Queries
Each query is generated by adding a corresponding class that is derived from GleanTable. For each of these classes a separate Python file is created inside this directory. The Python file and class are named after the query that they generate.
The GleanTable class has a few parameters and methods that can be overridden inside the derived classes to customize the generation.
The parameters that are available can and in some cases need to be set in the __init__ method of the new class definitions:
- [required] target_table_id: name of the target table that the query writes its results to
- [required] prefix: the general prefix of the query, used to get related derived tables and views (for example first_seen)
- per_app_id_enabled: default = True; if set to True, the query will be generated for each app_id dataset.
- per_app_enabled: default = True; if set to True, the query will be generated for app datasets.
- cross_channel_template: default = "cross_channel.view.sql"; file name of the template used to join data from different channels of the same app. Used when generating per-app queries.
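A new query class might set these parameters roughly as shown below. This is a sketch under stated assumptions: the GleanTable class here is a reduced stand-in so the example runs on its own, the attribute name prefix for the "general prefix" parameter is assumed, and ExampleUsageTable with its values is hypothetical.

```python
class GleanTable:
    """Stand-in for the real base class, reduced to the documented parameters."""

    def __init__(self):
        self.target_table_id = ""
        self.prefix = ""  # assumed attribute name for the general prefix
        self.per_app_id_enabled = True
        self.per_app_enabled = True
        self.cross_channel_template = "cross_channel.view.sql"


class ExampleUsageTable(GleanTable):
    """Hypothetical query class; it would live in example_usage.py."""

    def __init__(self):
        super().__init__()
        self.target_table_id = "example_usage_v1"  # required
        self.prefix = "example_usage"  # required
        self.per_app_enabled = False  # only generate per-app_id queries


table = ExampleUsageTable()
print(table.target_table_id, table.per_app_id_enabled, table.per_app_enabled)
```

Parameters that are not set explicitly, such as cross_channel_template here, keep the defaults from the base class.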
Each query depends on a couple of templates that need to be added to the templates/ directory:
- <target_table_id>.query.sql: Template of the generated query
- <target_table_id>.metadata.yaml: Template for the metadata that gets added alongside the generated query
- <target_table_id>.view.sql: Template for the user-facing view to expose the data written by the generated query
- dataset_metadata.yaml and derived_dataset_metadata.yaml: Template for the dataset_metadata.yaml, reused across all queries
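The naming convention for the per-query templates can be expressed as a small helper. This is only a sketch: expected_templates is a hypothetical function that follows the file names listed above, and example_usage_v1 is a made-up target_table_id.

```python
from pathlib import Path


def expected_templates(target_table_id, templates_dir="templates"):
    """Return the per-query template paths for a given target_table_id."""
    base = Path(templates_dir)
    suffixes = ["query.sql", "metadata.yaml", "view.sql"]
    return [base / f"{target_table_id}.{suffix}" for suffix in suffixes]


paths = expected_templates("example_usage_v1")
for path in paths:
    print(path.as_posix())
```

A helper like this could be used to check that all templates exist before adding a new query class.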
Additional, query-specific templates or config files used during the generation process can also be added to the templates/ directory.
The GleanTable class calls two methods that can be overridden by the query-specific classes:
- generate_per_app_id(self, project_id, baseline_table, output_dir=None): This method is for generating the per-app_id queries
- generate_per_app(self, project_id, app_info, output_dir=None): This method is for generating the per-app queries
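Overriding one of these methods might look like the sketch below. The GleanTable class here is a stand-in whose method bodies only return a description string so the example is runnable. The skip rule for nightly tables, the table names, and the shape of app_info (a list with one dict per channel) are assumptions for illustration.

```python
class GleanTable:
    """Stand-in base class exposing the two documented hooks."""

    def generate_per_app_id(self, project_id, baseline_table, output_dir=None):
        return f"per-app_id queries for {baseline_table} in {project_id}"

    def generate_per_app(self, project_id, app_info, output_dir=None):
        return f"per-app queries for {len(app_info)} channel(s) in {project_id}"


class ExampleUsageTable(GleanTable):
    """Hypothetical query class that customizes per-app_id generation."""

    def generate_per_app_id(self, project_id, baseline_table, output_dir=None):
        # Hypothetical rule: skip nightly tables, defer to the base class otherwise.
        if "nightly" in baseline_table:
            return None
        return super().generate_per_app_id(
            project_id, baseline_table, output_dir=output_dir
        )


table = ExampleUsageTable()
skipped = table.generate_per_app_id("my-project", "org_mozilla_fenix_nightly.baseline")
generated = table.generate_per_app_id("my-project", "org_mozilla_fenix.baseline")
per_app = table.generate_per_app("my-project", [{"app_id": "org_mozilla_fenix"}])
print(skipped, generated, per_app, sep="\n")
```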
Customizing the Generated Queries
In some cases it is necessary to use a custom, manually written query instead of a generated one, for example if the query logic is different for a single app_id.
For these cases, a query can be added in the sql/ directory. Queries that have been added there and are named like the generated queries will not be overwritten.
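The "do not overwrite" behavior can be pictured with the following sketch. It is not the generator's actual code: write_if_absent is a hypothetical helper, and the directory layout below sql/ is an assumption for this example.

```python
import tempfile
from pathlib import Path


def write_if_absent(path, content):
    """Write content unless a file already exists; return True if written."""
    path = Path(path)
    if path.exists():
        return False  # keep the manually written file untouched
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)
    return True


with tempfile.TemporaryDirectory() as tmp:
    # Assumed layout: sql/<project>/<dataset>/<table>/query.sql
    query = Path(tmp, "sql", "my-project", "org_mozilla_fenix_derived",
                 "example_usage_v1", "query.sql")
    query.parent.mkdir(parents=True)
    query.write_text("SELECT 1 -- manually written")

    wrote_existing = write_if_absent(query, "SELECT 2 -- generated")
    kept = query.read_text()
    wrote_new = write_if_absent(query.with_name("view.sql"), "SELECT 3 -- generated")

print(wrote_existing, wrote_new, kept)
```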