Ran YAMLlint on all yaml files and resolved linting issues (fixes #1297) (#1481)

* "Ran YAMLlint on all yaml files"

* "Moved product info metadata table to README file"

* "Reformatted yaml lists"

* "Updated line breaks so script runs"

* "Updated line breaks so script runs"

* "Undid line breaks"

* "Created custom config file"

* "Removed base document id"

* "Undid line breaks"

* "Reformatted code"

* "Trimmed whitespace"

* "Undid line break"

* "Introduced newline"

* "Trimmed whitespace"

* "Added yamillint to config file"

* "Added yamllint to config file"

* "Moved up yamllint test"

* "Trimmed whitespace"

* "Trimmed whitespace"

* "Trimmed whitespace"

* "Trimmed whitespace"

* "Removing hyphen to fix CI error"

* "Indentation to remove CI error"

* "Included yamllint install in build run"

* "Added yamllint in requirements.txt and .in file"

* "Moved install yamllint step to its own stage"

* "Updated yamllint test"

* "Updated circleci step"

* "Reformatted code"

* "Added yamllint to circleci steps"

* "Added checkout block to yamllint step"

* "Trimmed whitespace"

* "Undid yamllint step"

* "Specified directory name for yamllint test"

* "Fixed yamlint errors"

* "Fixed yamllint errors"

* "Fixed yamllint errors"

* "Fixed yamllint errors"

* "Ignore pathway in linting"

* "Added ignore venv pathway during linting"

* "Updated ignore block"

* "Updated ignore block"

* "Removed ignore block"

* "Updated ignore block"

* "Indented base as a list"

* "Indented base item"

* Update tests/sql/moz-fx-data-shared-prod/search_derived/mobile_search_clients_last_seen_v1/test_day_bit_shifting/expect.yaml

Co-authored-by: Anthony Miyaguchi <acmiyaguchi@gmail.com>

* "Resolved linting errors"

* "Referenced tables put back on same line"

* "Fixed linting error"

* Update sql/moz-fx-data-shared-prod/account_ecosystem_derived/fxa_logging_users_daily_v1/metadata.yaml

Co-authored-by: Anthony Miyaguchi <acmiyaguchi@gmail.com>

* "Fixed linting error"

Co-authored-by: Anthony Miyaguchi <acmiyaguchi@gmail.com>
Rhys 2020-10-29 20:24:55 -04:00 committed by GitHub
Parent 415ee2fb62
Commit 1ace0fe2b7
No key matching this signature was found
GPG key ID: 4AEE18F83AFDEB23
389 changed files: 1877 additions and 1328 deletions

View file

@ -1,3 +1,4 @@
---
version: 2
jobs:
build:
@ -7,8 +8,10 @@ jobs:
- checkout
- restore_cache:
keys:
# when lock files change, use increasingly general patterns to restore cache
# when lock files change, use increasingly general
# patterns to restore cache
- &cache_key
# yamllint disable-line rule:line-length
python-packages-v1-{{ .Branch }}-{{ checksum "requirements.in" }}-{{ checksum "requirements.txt" }}
- python-packages-v1-{{ .Branch }}-{{ checksum "requirements.in" }}-
- python-packages-v1-{{ .Branch }}-
@ -20,6 +23,9 @@ jobs:
python3.8 -m venv venv/
venv/bin/pip install pip-tools
venv/bin/pip-sync
- run:
name: Yamllint Test
command: PATH="venv/bin:$PATH" yamllint -c .yamllint.yaml .
- run:
name: PyTest with linters
command: PATH="venv/bin:$PATH" script/entrypoint
@ -39,7 +45,8 @@ jobs:
steps:
- checkout
- run:
name: Verify that requirements.txt contains the right dependencies for this python version
name: Verify that requirements.txt contains the right dependencies
for this python version
command: |
pip install pip-tools
pip-compile --quiet --generate-hashes requirements.in
@ -68,14 +75,16 @@ jobs:
name: Early return if this build is from a forked PR
command: |
if [ -n "$CIRCLE_PR_NUMBER" ]; then
echo "Cannot pass creds to forked PRs, so marking this step successful"
echo "Cannot pass creds to forked PRs,
so marking this step successful"
circleci step halt
fi
- *build
- &pytest_integration_test
run:
name: PyTest Integration Test
# Google's client libraries will check for GOOGLE_APPLICATION_CREDENTIALS
# Google's client libraries will check for
# GOOGLE_APPLICATION_CREDENTIALS
# and use a file in that location for credentials if present;
# See https://cloud.google.com/docs/authentication/production
command: |
@ -83,7 +92,8 @@ jobs:
echo "$GCLOUD_SERVICE_KEY" > "$GOOGLE_APPLICATION_CREDENTIALS"
PATH="venv/bin:$PATH" script/entrypoint -m integration
validate-dags:
# based on https://github.com/mozilla/telemetry-airflow/blob/master/.circleci/config.yml
# based on
# https://github.com/mozilla/telemetry-airflow/blob/master/.circleci/config.yml
machine:
image: ubuntu-1604:201903-01
docker_layer_caching: true
@ -134,7 +144,8 @@ jobs:
name: Early return if this build is from a forked PR
command: |
if [ -n "$CIRCLE_PR_NUMBER" ]; then
echo "Cannot pass creds to forked PRs, so marking this step successful"
echo "Cannot pass creds to forked PRs,
so marking this step successful"
circleci step halt
fi
- *build
@ -142,14 +153,15 @@ jobs:
name: Install dependencies
command: pip install mkdocs markdown-include
- add_ssh_keys:
fingerprints:
"ab:b5:f7:55:92:0a:72:c4:63:0e:57:be:cd:66:32:53"
fingerprints: "ab:b5:f7:55:92:0a:72:c4:63:0e:57:be:cd:66:32:53"
- run:
name: Build and deploy docs
command: |
PATH="venv/bin:$PATH" script/generate_docs --output_dir=generated_docs/
PATH="venv/bin:$PATH" script/generate_docs \
--output_dir=generated_docs/
cd generated_docs/
mkdocs gh-deploy -m "[ci skip] Deployed {sha} with MkDocs version: {version}"
mkdocs gh-deploy -m "[ci skip] Deployed
{sha} with MkDocs version: {version}"
deploy:
parameters:
image:
@ -163,14 +175,18 @@ jobs:
docker_layer_caching: true
- run:
name: Determine docker image name
command: echo 'IMAGE="${CIRCLE_PROJECT_USERNAME+$CIRCLE_PROJECT_USERNAME/}${CIRCLE_PROJECT_REPONAME:-bigquery-etl}:${CIRCLE_TAG:-latest}"' > $BASH_ENV
command:
# yamllint disable-line rule:line-length
echo 'IMAGE="${CIRCLE_PROJECT_USERNAME+$CIRCLE_PROJECT_USERNAME/} ${CIRCLE_PROJECT_REPONAME:-bigquery-etl}:${CIRCLE_TAG:-latest}"' > $BASH_ENV
- run:
name: Build docker image
command: docker build . --pull --tag "$IMAGE"
- run:
name: Deploy to Dockerhub
command: |
echo "${DOCKER_PASS:?}" | docker login -u "${DOCKER_USER:?}" --password-stdin
echo "${DOCKER_PASS:?}" | \
docker login -u "${DOCKER_USER:?}" --password-stdin
docker push "$IMAGE"
workflows:

View file

@ -1,3 +1,4 @@
---
env:
browser: true
commonjs: true
@ -9,4 +10,4 @@ globals:
parserOptions:
ecmaVersion: 2018
rules:
no-console: off
no-console: 0

1
.github/dependabot.yml Vendored
View file

@ -1,3 +1,4 @@
---
version: 2
updates:
- package-ecosystem: pip

8
.yamllint.yaml Normal file
View file

@ -0,0 +1,8 @@
---
rules:
line-length:
allow-non-breakable-words: true
allow-non-breakable-inline-mappings: true
ignore: |
venv/
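
The `line-length` options above explain why some long lines in this diff needed a `# yamllint disable-line rule:line-length` comment while long URLs did not. The following is a minimal Python sketch of that behaviour, not yamllint's own code: `is_too_long` is a hypothetical helper, the 80-character limit is assumed to be yamllint's documented default, and the `allow-non-breakable-inline-mappings` option is deliberately left out.

```python
MAX_LENGTH = 80  # assumption: yamllint's documented default for line-length


def is_too_long(line: str, max_length: int = MAX_LENGTH) -> bool:
    """Return True if the line would be flagged under this simplified rule."""
    line = line.rstrip("\n")
    if len(line) <= max_length:
        return False
    body = line.lstrip(" ")
    # Skip a leading comment marker or list dash, plus the space after it.
    if body.startswith("#"):
        body = body.lstrip("#")[1:]
    elif body.startswith("- "):
        body = body[2:]
    # A single unbroken word (for example a URL) is allowed to overflow;
    # anything containing a space could have been wrapped, so it is flagged.
    return " " in body


# The cache key from .circleci/config.yml contains spaces, so it is flagged
# and needs the disable-line comment. The indentation of the URL comment is
# assumed here, since this rendering of the diff has lost it.
cache_key = (
    'python-packages-v1-{{ .Branch }}-{{ checksum "requirements.in" }}'
    '-{{ checksum "requirements.txt" }}'
)
url_comment = (
    "    # https://github.com/mozilla/telemetry-airflow/"
    "blob/master/.circleci/config.yml"
)
print(is_too_long(cache_key), is_too_long(url_comment))
```

Under these assumptions, the cache key is reported and the URL comment is not, which matches how the two lines are treated in the CircleCI config hunk.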

122
dags.yaml
View file

@ -1,9 +1,15 @@
---
bqetl_error_aggregates:
schedule_interval: 3h
default_args:
owner: bewu@mozilla.com
email: ['telemetry-alerts@mozilla.com', 'bewu@mozilla.com', 'wlachance@mozilla.com']
start_date: '2019-11-01'
email:
[
"telemetry-alerts@mozilla.com",
"bewu@mozilla.com",
"wlachance@mozilla.com",
]
start_date: "2019-11-01"
retries: 1
retry_delay: 20m
depends_on_past: false
@ -12,8 +18,8 @@ bqetl_ssl_ratios:
schedule_interval: 0 2 * * *
default_args:
owner: chutten@mozilla.com
start_date: '2019-07-20'
email: ['telemetry-alerts@mozilla.com', 'chutten@mozilla.com']
start_date: "2019-07-20"
email: ["telemetry-alerts@mozilla.com", "chutten@mozilla.com"]
retries: 2
retry_delay: 30m
@ -21,8 +27,8 @@ bqetl_amo_stats:
schedule_interval: 0 3 * * *
default_args:
owner: jklukas@mozilla.com
start_date: '2020-06-01'
email: ['telemetry-alerts@mozilla.com', 'jklukas@mozilla.com']
start_date: "2020-06-01"
email: ["telemetry-alerts@mozilla.com", "jklukas@mozilla.com"]
retries: 2
retry_delay: 30m
@ -30,8 +36,13 @@ bqetl_vrbrowser:
schedule_interval: 0 2 * * *
default_args:
owner: jklukas@mozilla.com
start_date: '2019-07-25'
email: ['telemetry-alerts@mozilla.com', 'jklukas@mozilla.com', 'ascholtz@mozilla.com']
start_date: "2019-07-25"
email:
[
"telemetry-alerts@mozilla.com",
"jklukas@mozilla.com",
"ascholtz@mozilla.com",
]
retries: 1
retry_delay: 5m
@ -39,8 +50,8 @@ bqetl_core:
schedule_interval: 0 2 * * *
default_args:
owner: jklukas@mozilla.com
start_date: '2019-07-25'
email: ['telemetry-alerts@mozilla.com', 'jklukas@mozilla.com']
start_date: "2019-07-25"
email: ["telemetry-alerts@mozilla.com", "jklukas@mozilla.com"]
retries: 1
retry_delay: 5m
@ -48,8 +59,8 @@ bqetl_nondesktop:
schedule_interval: 0 3 * * *
default_args:
owner: jklukas@mozilla.com
start_date: '2019-07-25'
email: ['telemetry-alerts@mozilla.com', 'jklukas@mozilla.com']
start_date: "2019-07-25"
email: ["telemetry-alerts@mozilla.com", "jklukas@mozilla.com"]
retries: 1
retry_delay: 5m
@ -57,8 +68,8 @@ bqetl_mobile_search:
schedule_interval: 0 2 * * *
default_args:
owner: bewu@mozilla.com
start_date: '2019-07-25'
email: ['telemetry-alerts@mozilla.com', 'bewu@mozilla.com']
start_date: "2019-07-25"
email: ["telemetry-alerts@mozilla.com", "bewu@mozilla.com"]
retries: 1
retry_delay: 5m
@ -66,8 +77,8 @@ bqetl_fxa_events:
schedule_interval: 30 1 * * *
default_args:
owner: jklukas@mozilla.com
start_date: '2019-03-01'
email: ['telemetry-alerts@mozilla.com', 'jklukas@mozilla.com']
start_date: "2019-03-01"
email: ["telemetry-alerts@mozilla.com", "jklukas@mozilla.com"]
retries: 1
retry_delay: 10m
@ -75,8 +86,8 @@ bqetl_gud:
schedule_interval: 0 3 * * *
default_args:
owner: jklukas@mozilla.com
start_date: '2019-07-25'
email: ['telemetry-alerts@mozilla.com', 'jklukas@mozilla.com']
start_date: "2019-07-25"
email: ["telemetry-alerts@mozilla.com", "jklukas@mozilla.com"]
retries: 1
retry_delay: 5m
@ -84,8 +95,8 @@ bqetl_messaging_system:
schedule_interval: 0 2 * * *
default_args:
owner: najiang@mozilla.com
start_date: '2019-07-25'
email: ['telemetry-alerts@mozilla.com', 'najiang@mozilla.com']
start_date: "2019-07-25"
email: ["telemetry-alerts@mozilla.com", "najiang@mozilla.com"]
retries: 1
retry_delay: 5m
@ -93,8 +104,8 @@ bqetl_activity_stream:
schedule_interval: 0 2 * * *
default_args:
owner: jklukas@mozilla.com
start_date: '2019-07-25'
email: ['telemetry-alerts@mozilla.com', 'jklukas@mozilla.com']
start_date: "2019-07-25"
email: ["telemetry-alerts@mozilla.com", "jklukas@mozilla.com"]
retries: 1
retry_delay: 5m
@ -102,8 +113,9 @@ bqetl_search:
schedule_interval: 0 3 * * *
default_args:
owner: bewu@mozilla.com
start_date: '2018-11-27'
email: ['telemetry-alerts@mozilla.com', 'bewu@mozilla.com', 'frank@mozilla.com']
start_date: "2018-11-27"
email:
["telemetry-alerts@mozilla.com", "bewu@mozilla.com", "frank@mozilla.com"]
retries: 2
retry_delay: 30m
@ -111,8 +123,8 @@ bqetl_addons:
schedule_interval: 0 3 * * *
default_args:
owner: bmiroglio@mozilla.com
start_date: '2018-11-27'
email: ['telemetry-alerts@mozilla.com', 'bmiroglio@mozilla.com']
start_date: "2018-11-27"
email: ["telemetry-alerts@mozilla.com", "bmiroglio@mozilla.com"]
retries: 2
retry_delay: 30m
@ -120,8 +132,8 @@ bqetl_devtools:
schedule_interval: 0 3 * * *
default_args:
owner: jklukas@mozilla.com
start_date: '2018-11-27'
email: ['telemetry-alerts@mozilla.com', 'jklukas@mozilla.com']
start_date: "2018-11-27"
email: ["telemetry-alerts@mozilla.com", "jklukas@mozilla.com"]
retries: 2
retry_delay: 30m
@ -129,8 +141,14 @@ bqetl_main_summary:
schedule_interval: 0 2 * * *
default_args:
owner: dthorn@mozilla.com
start_date: '2018-11-27'
email: ['telemetry-alerts@mozilla.com', 'dthorn@mozilla.com', 'jklukas@mozilla.com', 'frank@mozilla.com']
start_date: "2018-11-27"
email:
[
"telemetry-alerts@mozilla.com",
"dthorn@mozilla.com",
"jklukas@mozilla.com",
"frank@mozilla.com",
]
retries: 2
retry_delay: 30m
@ -138,8 +156,8 @@ bqetl_experiments_daily:
schedule_interval: 0 3 * * *
default_args:
owner: ssuh@mozilla.com
start_date: '2018-11-27'
email: ['telemetry-alerts@mozilla.com', 'ssuh@mozilla.com']
start_date: "2018-11-27"
email: ["telemetry-alerts@mozilla.com", "ssuh@mozilla.com"]
retries: 2
retry_delay: 30m
@ -147,8 +165,8 @@ bqetl_document_sample:
schedule_interval: daily
default_args:
owner: amiyaguchi@mozilla.com
start_date: '2020-02-17'
email: ['telemetry-alerts@mozilla.com', 'amiyaguchi@mozilla.com']
start_date: "2020-02-17"
email: ["telemetry-alerts@mozilla.com", "amiyaguchi@mozilla.com"]
retries: 2
retry_delay: 30m
@ -156,19 +174,19 @@ bqetl_asn_aggregates:
schedule_interval: 0 2 * * *
default_args:
owner: ascholtz@mozilla.com
start_date: '2020-04-05'
email: ['ascholtz@mozilla.com', 'tdsmith@mozilla.com']
start_date: "2020-04-05"
email: ["ascholtz@mozilla.com", "tdsmith@mozilla.com"]
retries: 2
retry_delay: 30m
# DAG for exporting query data marked as public to GCS
# queries should not be explicitly assigned to this DAG (it's done automatically)
# queries should not be explicitly assigned to this DAG (done automatically)
bqetl_public_data_json:
schedule_interval: 0 4 * * *
default_args:
owner: ascholtz@mozilla.com
start_date: '2020-04-14'
email: ['telemetry-alerts@mozilla.com', 'ascholtz@mozilla.com']
start_date: "2020-04-14"
email: ["telemetry-alerts@mozilla.com", "ascholtz@mozilla.com"]
retries: 2
retry_delay: 30m
@ -177,8 +195,8 @@ bqetl_internet_outages:
schedule_interval: 0 3 * * *
default_args:
owner: aplacitelli@mozilla.com
start_date: '2020-01-01'
email: ['aplacitelli@mozilla.com', 'sguha@mozilla.com']
start_date: "2020-01-01"
email: ["aplacitelli@mozilla.com", "sguha@mozilla.com"]
retries: 2
retry_delay: 30m
@ -186,8 +204,8 @@ bqetl_deletion_request_volume:
schedule_interval: 0 1 * * *
default_args:
owner: dthorn@mozilla.com
start_date: '2020-06-29'
email: ['telemetry-alerts@mozilla.com', 'dthorn@mozilla.com']
start_date: "2020-06-29"
email: ["telemetry-alerts@mozilla.com", "dthorn@mozilla.com"]
retries: 2
retry_delay: 30m
@ -195,8 +213,8 @@ bqetl_fenix_event_rollup:
schedule_interval: 0 2 * * *
default_args:
owner: frank@mozilla.com
start_date: '2020-09-09'
email: ['frank@mozilla.com']
start_date: "2020-09-09"
email: ["frank@mozilla.com"]
retries: 2
retry_delay: 30m
@ -204,8 +222,8 @@ bqetl_account_ecosystem:
schedule_interval: 0 2 * * *
default_args:
owner: jklukas@mozilla.com
start_date: '2020-09-17'
email: ['jklukas@mozilla.com']
start_date: "2020-09-17"
email: ["jklukas@mozilla.com"]
retries: 2
retry_delay: 30m
@ -213,8 +231,8 @@ bqetl_stripe:
schedule_interval: daily
default_args:
owner: dthorn@mozilla.com
start_date: '2020-10-05'
email: ['telemetry-alerts@mozilla.com', 'dthorn@mozilla.com']
start_date: "2020-10-05"
email: ["telemetry-alerts@mozilla.com", "dthorn@mozilla.com"]
retries: 2
retry_delay: 5m
@ -222,8 +240,8 @@ bqetl_mozilla_vpn:
schedule_interval: daily
default_args:
owner: dthorn@mozilla.com
start_date: '2020-10-08'
email: ['telemetry-alerts@mozilla.com', 'dthorn@mozilla.com']
start_date: "2020-10-08"
email: ["telemetry-alerts@mozilla.com", "dthorn@mozilla.com"]
retries: 2
retry_delay: 30m
@ -238,7 +256,7 @@ bqetl_org_mozilla_fenix_derived:
owner: amiyaguchi@mozilla.com
retries: 2
retry_delay: 30m
start_date: '2020-10-18'
start_date: "2020-10-18"
schedule_interval: daily
bqetl_google_analytics_derived:

View file

@ -1,3 +1,4 @@
---
site_name: BigQuery ETL
site_description: Mozilla BigQuery ETL
site_author: Mozilla Data Platform Team

View file

@ -21,3 +21,4 @@ stripe==2.55.0
ujson==4.0.1
pandas==1.1.3
jsonschema==3.2.0
yamllint==1.25.0

View file

@ -366,7 +366,7 @@ pandas==1.1.3 \
pathspec==0.8.0 \
--hash=sha256:7d91249d21749788d07a2d0f94147accd8f845507400749ea19c1ec9054a12b0 \
--hash=sha256:da45173eb3a6f2a5a487efba21f050af2b41948be6ab52b6a1e3ff22bb8b7061 \
# via black
# via black, yamllint
pluggy==0.13.1 \
--hash=sha256:15b2acde666561e1298d71b523007ed7364de07029219b604cf808bfa1c765b0 \
--hash=sha256:966c145cd83c96502c3c3868f50408687b38434af77734af1e9ca461a4081d2d \
@ -475,7 +475,7 @@ pyyaml==5.3.1 \
--hash=sha256:b8eac752c5e14d3eca0e6dd9199cd627518cb5ec06add0de9d32baeee6fe645d \
--hash=sha256:cc8955cfbfc7a115fa81d85284ee61147059a753344bc51098f3ccd69b0d7e0c \
--hash=sha256:d13155f591e6fcc1ec3b30685d50bf0711574e2c0dfffd7644babf8b5102ca1a \
# via -r requirements.in, mozilla-schema-generator
# via -r requirements.in, mozilla-schema-generator, yamllint
regex==2020.9.27 \
--hash=sha256:088afc8c63e7bd187a3c70a94b9e50ab3f17e1d3f52a32750b5b77dbe99ef5ef \
--hash=sha256:1fe0a41437bbd06063aa184c34804efa886bcc128222e9916310c92cd54c3b4c \
@ -593,6 +593,10 @@ urllib3==1.25.10 \
--hash=sha256:91056c15fa70756691db97756772bb1eb9678fa585d9184f24534b100dc60f4a \
--hash=sha256:e7983572181f5e1522d9c98453462384ee92a0be7fac5f1413a1e35c56cc0461 \
# via requests
yamllint==1.25.0 \
--hash=sha256:b1549cbe5b47b6ba67bdeea31720f5c51431a4d0c076c1557952d841f7223519 \
--hash=sha256:c7be4d0d2584a1b561498fa9acb77ad22eb434a109725c7781373ae496d823b3 \
# via -r requirements.in
yarl==1.6.0 \
--hash=sha256:04a54f126a0732af75e5edc9addeaa2113e2ca7c6fce8974a63549a70a25e50e \
--hash=sha256:3cc860d72ed989f3b1f3abbd6ecf38e412de722fb38b8f1b1a086315cf0d69c5 \

View file

@ -1,5 +1,7 @@
---
friendly_name: Experimenter experiments
description: Experimenter experiments periodically imported via the Experimenter API.
description: Experimenter experiments periodically imported via the
Experimenter API.
labels:
incremental: false
schedule: ten-minute

View file

@ -1,4 +1,5 @@
description: Daily summary Google analytics data for blog.mozilla.org landing page
description: Daily summary Google analytics data for blog.mozilla.org
landing page
friendly_name: Blogs Landing Page Summary
labels:
incremental: true
@ -8,5 +9,5 @@ owners:
scheduling:
dag_name: bqetl_google_analytics_derived
referenced_tables:
- ['moz-fx-data-marketing-prod', 'ga_derived', 'blogs_goals_v1']
- ['moz-fx-data-marketing-prod', 'ga_derived', 'blogs_sessions_v1']
- ["moz-fx-data-marketing-prod", "ga_derived", "blogs_goals_v1"]
- ["moz-fx-data-marketing-prod", "ga_derived", "blogs_sessions_v1"]

View file

@ -1,4 +1,5 @@
description: Intermediate table containing normalized sessions for blog.mozilla.org
description: Intermediate table containing normalized sessions
for blog.mozilla.org
friendly_name: Blogs Sessions
labels:
incremental: true
@ -8,4 +9,4 @@ owners:
scheduling:
dag_name: bqetl_google_analytics_derived
referenced_tables:
- ['moz-fx-data-marketing-prod', 'ga_derived', 'blogs_empty_check_v1']
- ["moz-fx-data-marketing-prod", "ga_derived", "blogs_empty_check_v1"]

View file

@ -1,13 +1,16 @@
---
friendly_name: AET Clients Daily
description: >
One row per user per service per day, showing metrics across services.
The `user_id` and `client_id` are directly related to the `ecosystem_user_id`
and `ecosystem_client_id` primitive identifiers for Account Ecosystem Telemetry,
but are abstracted to prevent fingerprinting and to provide continuity across
user password reset events. In this view, we are guaranteed that a logical user
is represented by a consistent `user_id` over time.
The `user_id` and `client_id` are directly related to the
`ecosystem_user_id` and `ecosystem_client_id` primitive
identifiers for Account Ecosystem Telemetry, but are abstracted
to prevent fingerprinting and to provide continuity across
user password reset events. In this view, we are guaranteed that
a logical user is represented by a consistent `user_id` over time.
For rows representing client telemetry, this view looks up `user_id` at runtime
based on `client_id` so that we can have `user_id` values present for any client
that has ever logged in to FxA, even for older rows before the first login.
For rows representing client telemetry, this view looks up `user_id`
at runtime based on `client_id` so that we can have `user_id` values
present for any client that has ever logged in to FxA, even for older
rows before the first login.

View file

@ -1,6 +1,8 @@
---
friendly_name: AET Desktop Clients Daily
description: >
One row per desktop client per day aggregating all AET pings received for that client.
One row per desktop client per day aggregating all AET pings received
for that client.
owners:
- jklukas@mozilla.com
labels:

View file

@ -1,3 +1,4 @@
---
friendly_name: Ecosystem Client ID Lookup
description: >
Lookup table of ecosystem_client_id_hash to canonical_id.
@ -9,9 +10,11 @@ labels:
incremental: true
scheduling:
dag_name: bqetl_account_ecosystem
depends_on_past: True
depends_on_past: true
# We access a restricted table for getting an HMAC key, so cannot dry run
# and must explicitly list referenced tables.
referenced_tables:
- ['moz-fx-data-shared-prod', 'telemetry_stable', 'account_ecosystem_v4']
- ['moz-fx-data-shared-prod', 'account_ecosystem_derived', 'ecosystem_user_id_lookup_v1']
- ['moz-fx-data-shared-prod', 'telemetry_stable',
'account_ecosystem_v4']
- ['moz-fx-data-shared-prod', 'account_ecosystem_derived',
'ecosystem_user_id_lookup_v1']

View file

@ -1,3 +1,4 @@
---
friendly_name: Ecosystem User ID Lookup
description: >
Lookup table of ecosystem_user_id to canonical_id.
@ -14,9 +15,14 @@ labels:
incremental: true
scheduling:
dag_name: bqetl_account_ecosystem
depends_on_past: True
referenced_tables: [['moz-fx-data-shared-prod', 'firefox_accounts_stable', 'account_ecosystem_v1']]
depends_on_past: true
referenced_tables:
- [
"moz-fx-data-shared-prod",
"firefox_accounts_stable",
"account_ecosystem_v1",
]
# This is an unpartitioned table where the script adds rows via INSERT INTO,
# thus the custom settings below.
# thus the custom settings below.$
date_partition_parameter: null
parameters: ["submission_date:DATE:{{ds}}"]

View file

@ -1,6 +1,8 @@
---
friendly_name: AET Logging Users Daily
description: >
One row per canonical_id per oauth service per day aggregating all AET events received for that user.
One row per canonical_id per oauth service per day aggregating
all AET events received for that user.
owners:
- jklukas@mozilla.com
labels:
@ -12,5 +14,13 @@ scheduling:
# We access a restricted table for getting an HMAC key, so cannot dry run
# and must explicitly list referenced tables.
referenced_tables:
- ['moz-fx-data-shared-prod', 'firefox_accounts_stable', 'account_ecosystem_v1']
- ['moz-fx-data-shared-prod', 'account_ecosystem_derived', 'ecosystem_user_id_lookup_v1']
- [
"moz-fx-data-shared-prod",
"firefox_accounts_stable",
"account_ecosystem_v1",
]
- [
"moz-fx-data-shared-prod",
"account_ecosystem_derived",
"ecosystem_user_id_lookup_v1",
]

View file

@ -1,9 +1,11 @@
---
friendly_name: Ecosystem Client ID Deletion
description: >
Provides ecosystem_client_id values that have appeared in a deletion-request ping.
Provides ecosystem_client_id values that have appeared in a
deletion-request ping.
Also includes the hashed version of the ID to enable deletion of rows in derived
tables containing the hash.
Also includes the hashed version of the ID to enable deletion
of rows in derived tables containing the hash.
owners:
- jklukas@mozilla.com
labels:

View file

@ -1,5 +1,8 @@
---
friendly_name: Impression Stats By Experiment
description: Representation of tile impression statistics, clustered by experiment_id to allow efficient analysis of individual experiments
description: Representation of tile impression statistics,
clustered by experiment_id to allow efficient analysis of
individual experiments
owners:
- jklukas@mozilla.com
labels:

View file

@ -1,3 +1,4 @@
---
friendly_name: Impression Stats Flat
description: Unnested representation of tile impression statistics
owners:

View file

@ -1,3 +1,4 @@
---
friendly_name: AMO Stats DAU dev/stage
description: >-
Reduced stats table for dev and stage versions of the AMO service.

View file

@ -1,3 +1,4 @@
---
friendly_name: AMO Installs dev/stage
description: >
Reduced daily installs table for dev and stage versions of the AMO service.

View file

@ -1,6 +1,8 @@
---
friendly_name: AMO Stats DAU
description: >-
Daily user statistics to power addons.mozilla.org stats pages. See bug 1572873.
Daily user statistics to power addons.mozilla.org stats pages.
See bug 1572873.
Each row in this table represents a particular addon on a particular day
and provides all the information needed to populate the various

View file

@ -1,10 +1,12 @@
---
friendly_name: AMO Stats DAU
description: >
Daily install statistics to power addons.mozilla.org stats pages. See bug 1654330.
Note that this table uses a hashed_addon_id defined as `TO_HEX(SHA256(addon_id))`
because the underlying event pings have limitations on length of properties attached
to events and addon_id values are sometimes too long. The AMO stats application looks
up records in this table based on the hashed_addon_id.
Daily install statistics to power addons.mozilla.org stats pages.
See bug 1654330. Note that this table uses a hashed_addon_id
defined as `TO_HEX(SHA256(addon_id))` because the underlying event
pings have limitations on length of properties attached to events
and addon_id values are sometimes too long. The AMO stats application
looks up records in this table based on the hashed_addon_id.
owners:
- jklukas@mozilla.com
labels:

View file

@ -1,3 +1,4 @@
---
friendly_name: Desktop addons by client
description: >-
Clients_daily-like table that records only the dimensions and addon info
@ -12,4 +13,5 @@ scheduling:
dag_name: bqetl_amo_stats
# provide this value so that DAG generation does not have to dry run the
# query to get it, and that would be slow because main_v4 is referenced
referenced_tables: [['moz-fx-data-shared-prod', 'telemetry_stable', 'main_v4']]
referenced_tables: [['moz-fx-data-shared-prod', 'telemetry_stable',
'main_v4']]

View file

@ -1,3 +1,4 @@
---
friendly_name: Fenix addons by client
description: >-
Clients_daily-like table on top of the various Firefox for Android channels

View file

@ -1,3 +1,4 @@
---
friendly_name: Firefox Accounts Exact Mau 28
description: Base table for exact FxA MAU by dimensions
owners:

View file

@ -1,3 +1,4 @@
---
friendly_name: FxA Amplitude Export
description: >
Derived from FxA logs, this table contains active events and user property

View file

@ -1,8 +1,10 @@
---
friendly_name: FxA Hashed User IDs
description: >
Contains exactly one entry for each FxA user, giving the 64-character hashed
user ID used for all exports to Amplitude. This table is useful as a lookup
for sync pings which contain a truncated 32-character version of the same hash.
for sync pings which contain a truncated 32-character version of the same
hash.
owners:
- jklukas@mozilla.com
labels:

View file

@ -1,5 +1,7 @@
---
friendly_name: FxA Auth Bounce Events
description: Selected Amplitude events extracted from FxA auth_bounce server logs
description: Selected Amplitude events extracted from FxA auth_bounce
server logs
owners:
- jklukas@mozilla.com
labels:

View file

@ -1,3 +1,4 @@
---
friendly_name: FxA Auth Events
description: Selected Amplitude events extracted from FxA auth server logs
owners:

View file

@ -1,3 +1,4 @@
---
friendly_name: FxA Content Events
description: Selected Amplitude events extracted from FxA content server logs
owners:

View file

@ -1,5 +1,8 @@
---
friendly_name: FxA Delete Events
description: Deletion events extracted from FxA auth server logs used as signal for Mozilla to delete analysis data associated with the user
description: Deletion events extracted from FxA auth server logs
used as signal for Mozilla to delete analysis data associated with
the user
owners:
- jklukas@mozilla.com
labels:

View file

@ -1,6 +1,8 @@
---
friendly_name: FxA Log Auth Events
description: >-
A subset of FxA auth logs that is sometimes useful for ad hoc longitudinal analysis.
A subset of FxA auth logs that is sometimes useful
for ad hoc longitudinal analysis.
See https://bugzilla.mozilla.org/show_bug.cgi?id=1628708
owners:

View file

@ -1,6 +1,8 @@
---
friendly_name: FxA Log Content Events
description: >-
A subset of FxA content server logs that is sometimes useful for ad hoc longitudinal analysis.
A subset of FxA content server logs that is sometimes useful
for ad hoc longitudinal analysis.
See https://bugzilla.mozilla.org/show_bug.cgi?id=1628708
owners:

View file

@ -1,3 +1,4 @@
---
friendly_name: FxA Log Device Command Events
description: >-
A subset of FxA auth server logs related to "send tab" activity.

View file

@ -1,3 +1,4 @@
---
friendly_name: FxA Users Daily
description: Usage aggregations per FxA user per day
owners:

View file

@ -1,3 +1,4 @@
---
friendly_name: FxA Users Last Seen
description: Usage aggregations per FxA user per day over a 28-day window
owners:

View file

@ -1,3 +1,4 @@
---
friendly_name: FxA Users Services Daily
description: Usage aggregations per FxA user per FxA service per day
owners:

View file

@ -1,3 +1,4 @@
---
friendly_name: FxA Users Services First Seen
description: Usage aggregations describing when each FxA user was first seen
owners:

View file

@ -1,5 +1,7 @@
---
friendly_name: FxA Users Services Last Seen
description: Usage aggregations per FxA user per FxA service per day over a 28-day window
description: Usage aggregations per FxA user per FxA service
per day over a 28-day window
owners:
- jklukas@mozilla.com
labels:

View file

@ -1,30 +1,60 @@
---
friendly_name: Internet Outages
description: |-
This contains a set aggregated metrics that correlate to internet outages for different countries in the world.
This contains a set aggregated metrics that correlate to internet
outages for different countries in the world.
The dataset contains the following fields:
- `country`: the Country code of the client.
- `city`: the City name (only for cities with a population >= 15000, 'unknown' otherwise).
- `datetime`: the date and the time (truncated to hour) the data was submitted by the client.
- `proportion_undefined`: the proportion of users who failed to send telemetry for a reason that was not listed in the other cases.
- `proportion_timeout`: the proportion of users that had their connection timeout while uploading telemetry ([after 90s, in Firefox Desktop](https://searchfox.org/mozilla-central/rev/fa2df28a49883612bd7af4dacd80cdfedcccd2f6/toolkit/components/telemetry/app/TelemetrySend.jsm#81)).
- `proportion_abort`: the proportion of users that had their connection terminated by the client (for example, terminating open connections before shutting down).
- `proportion_unreachable`: the proportion of users that failed to upload telemetry because the server was not reachable (e.g. because the host was not reachable, proxy problems or OS waking up after a suspension).
- `proportion_terminated`: the proportion of users that had their connection terminated internally by the networking code.
- `proportion_channel_open`: the proportion of users for which the upload request was terminated immediately, by the client, because of a Necko internal error.
- `avg_dns_success_time`: the average time it takes for a successful DNS resolution, in milliseconds.
- `missing_dns_success`: counts how many sessions did not report the `DNS_LOOKUP_TIME` histogram.
- `avg_dns_failure_time`: the average time it takes for an unsuccessful DNS resolution, in milliseconds.
- `missing_dns_failure`: counts how many sessions did not report the `DNS_FAILED_LOOKUP_TIME` histogram.
- `count_dns_failure`: the average count of unsuccessful DNS resolutions reported.
- `ssl_error_prop`: the proportion of users that reported an error through the `SSL_CERT_VERIFICATION_ERRORS` histogram.
- `avg_tls_handshake_time`: the average time after the TCP SYN to ready for HTTP, in milliseconds.
- `city`: the City name (only for cities with a population >= 15000,
'unknown' otherwise).
- `datetime`: the date and the time (truncated to hour) the data was
submitted by the client.
- `proportion_undefined`: the proportion of users who failed to send
telemetry for a reason that was not listed in the other cases.
- `proportion_timeout`: the proportion of users that had their connection
timeout while uploading telemetry
([after 90s, in Firefox Desktop](
https://searchfox.org/mozilla-central/rev/fa2df28a49883612bd7af4dacd80cdfedcccd2f6/toolkit/components/telemetry/app/TelemetrySend.jsm#81)).
- `proportion_abort`: the proportion of users that had their connection
terminated by the client (for example, terminating open connections before
shutting down).
- `proportion_unreachable`: the proportion of users that failed to upload
telemetry because the server was not reachable (e.g. because the host was
not reachable, proxy problems or OS waking up after a suspension).
- `proportion_terminated`: the proportion of users that had their connection
terminated internally by the networking code.
- `proportion_channel_open`: the proportion of users for which the upload
request was terminated immediately, by the client, because of a Necko
internal error.
- `avg_dns_success_time`: the average time it takes for a successful DNS
resolution, in milliseconds.
- `missing_dns_success`: counts how many sessions did not report the
`DNS_LOOKUP_TIME` histogram.
- `avg_dns_failure_time`: the average time it takes for an unsuccessful DNS
resolution, in milliseconds.
- `missing_dns_failure`: counts how many sessions did not report the
`DNS_FAILED_LOOKUP_TIME` histogram.
- `count_dns_failure`: the average count of unsuccessful DNS resolutions
reported.
- `ssl_error_prop`: the proportion of users that reported an error through
the `SSL_CERT_VERIFICATION_ERRORS` histogram.
- `avg_tls_handshake_time`: the average time after the TCP SYN to ready
for HTTP, in milliseconds.
Caveats with the data:
As with any observational data, there are many caveats and interpretation must be done carefully. Below is a list of issues we have considered, but it is not exhaustive.
- Firefox users are not representative of the general population in their region.
- Users can experience multiple types of failures and so the proportions are not summable. For example, if 2.4% of clients had a timeout and 2.6% of clients had eUnreachable that doesn't necessarily mean that 5.0% of clients had a timeout or a eUnreachable
- Geographical data is based on IPGeo databases. These databases are imperfect, so some activity may be attributed to the wrong location. Further, proxy and VPN usage can create geo-attribution errors.
As with any observational data, there are many caveats and interpretation must
be done carefully. Below is a list of issues we have considered, but it is not
exhaustive.
- Firefox users are not representative of the general population in their
region.
- Users can experience multiple types of failures and so the proportions
are not summable. For example, if 2.4% of clients had a timeout and 2.6% of
clients had eUnreachable that doesn't necessarily mean that 5.0% of clients
had a timeout or a eUnreachable
- Geographical data is based on IPGeo databases. These databases are
imperfect, so some activity may be attributed to the wrong location.
Further, proxy and VPN usage can create geo-attribution errors.
owners:
- aplacitelli@mozilla.com
@ -41,6 +71,6 @@ scheduling:
# provide this value so that DAG generation does not have to dry run the
# query to get it, and that would be slow because main_v4 is referenced
referenced_tables:
- ['moz-fx-data-shared-prod', 'telemetry_stable', 'main_v4']
- ['moz-fx-data-shared-prod', 'telemetry_derived', 'clients_daily_v6']
- ['moz-fx-data-shared-prod', 'telemetry_stable', 'health_v4']
- ["moz-fx-data-shared-prod", "telemetry_stable", "main_v4"]
- ["moz-fx-data-shared-prod", "telemetry_derived", "clients_daily_v6"]
- ["moz-fx-data-shared-prod", "telemetry_stable", "health_v4"]
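The caveat in the description above says the failure proportions are not summable, because one client can hit several failure types. A minimal sketch of that point, using made-up client IDs (the sets below are illustrative assumptions, not real telemetry):

```python
# Hypothetical client IDs, for illustration only.
all_clients = set(range(1000))
timeout = set(range(0, 24))        # 24 clients (2.4%) had a timeout
unreachable = set(range(10, 36))   # 26 clients (2.6%) had eUnreachable


def proportion(subset, population):
    """Share of the population that falls in the subset."""
    return len(subset) / len(population)


p_timeout = proportion(timeout, all_clients)
p_unreachable = proportion(unreachable, all_clients)
# Clients that appear in both sets are counted once in the union.
p_either = proportion(timeout | unreachable, all_clients)
```

Because some clients appear in both sets, `p_either` comes out smaller than `p_timeout + p_unreachable`.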

@ -1,3 +1,4 @@
---
friendly_name: Messaging System CFR Exact MAU by Dimensions
description: >
Monthly active users using CFR aggregated across unique sets of dimensions.

@ -1,3 +1,4 @@
---
friendly_name: Messaging System CFR Users Daily
description: Daily users of CFR, partitioned by day
owners:

@ -1,3 +1,4 @@
---
friendly_name: Messaging System CFR Users Last Seen
description: >
Captures history of activity of each client using CFR in 28 day

@ -1,6 +1,8 @@
---
friendly_name: Messaging System Onboarding Exact MAU by Dimensions
description: >
Monthly active users using Onboarding aggregated across unique sets of dimensions.
Monthly active users using Onboarding aggregated across unique
sets of dimensions.
owners:
- najiang@mozilla.com
labels:

@ -1,3 +1,4 @@
---
friendly_name: Messaging System Onboarding Users Daily
description: Daily users of Onboarding, partitioned by day
owners:

@ -1,3 +1,4 @@
---
friendly_name: Messaging System Onboarding Users Last Seen
description: >
Captures history of activity of each client using Onboarding in 28 day

@ -1,6 +1,8 @@
---
friendly_name: Messaging System Snippets Exact MAU by Dimensions
description: >
Monthly active users using Snippets aggregated across unique sets of dimensions.
Monthly active users using Snippets aggregated across unique sets
of dimensions.
owners:
- najiang@mozilla.com
labels:

@ -1,3 +1,4 @@
---
friendly_name: Messaging System Snippets Users Daily
description: Daily users of Snippets, partitioned by day
owners:

@ -1,3 +1,4 @@
---
friendly_name: Messaging System Snippets Users Last Seen
description: >
Captures history of activity of each client using Snippets in 28 day

@ -1,3 +1,4 @@
---
friendly_name: Bigquery-etl Scheduled Queries Cost
description: Cost of scheduled bigquery-etl queries, partitioned by day.
labels:

@ -1,4 +1,6 @@
description: Number of accesses to destination tables of scheduled bigquery-etl queries, partitioned by day.
---
description: Number of accesses to destination tables of scheduled
bigquery-etl queries, partitioned by day.
friendly_name: Bigquery-etl Scheduled Query Usage
labels:
incremental: true

@ -1,4 +1,5 @@
description: Determines column sizes of specific tables and partitions via dry runs.
description: Determines column sizes of specific tables and partitions
via dry runs.
friendly_name: Column Size
labels:
incremental: true

@ -1,3 +1,4 @@
---
friendly_name: Deletion Request Volume
description: >
A daily count of deletion request pings by document namespace

@ -1,3 +1,4 @@
---
friendly_name: Document Sample
description: Document samples non-prod.
owners:

@ -1,3 +1,4 @@
---
friendly_name: Stable Table Sizes
description: >
Table sizes of stable tables, partitioned by day.

@ -1,6 +1,7 @@
friendly_name: Mozilla VPN FxA Login Flows
description: >
A list of users that attempted to login to FxA, to sign up for or login to Mozilla VPN.
A list of users that attempted to login to FxA, to sign up for or
login to Mozilla VPN.
owners:
- dthorn@mozilla.com
labels:
@ -8,10 +9,19 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_mozilla_vpn
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter: null
parameters:
- "date:DATE:{{ds}}"
referenced_tables:
- ["moz-fx-data-shared-prod", "firefox_accounts_derived", "fxa_auth_events_v1"]
- ["moz-fx-data-shared-prod", "firefox_accounts_derived", "fxa_content_events_v1"]
- [
"moz-fx-data-shared-prod",
"firefox_accounts_derived",
"fxa_auth_events_v1",
]
- [
"moz-fx-data-shared-prod",
"firefox_accounts_derived",
"fxa_content_events_v1",
]

@ -8,7 +8,8 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_mozilla_vpn
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter: null
parameters:
- "date:DATE:{{ds}}"
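The comment in this hunk explains that `date_partition_parameter` is null because the query replaces the whole table rather than one partition. A rough sketch of how a scheduler could act on that setting; `destination_for` and the table name are hypothetical, and this is not presented as bigquery-etl's actual logic. The `$YYYYMMDD` suffix is BigQuery's partition decorator.

```python
from datetime import date
from typing import Optional


def destination_for(table: str, ds: date,
                    date_partition_parameter: Optional[str]) -> str:
    """Return the write target for one scheduled run (hypothetical helper)."""
    if date_partition_parameter is None:
        # null in the metadata: the run overwrites the whole table
        return table
    # otherwise only the partition for the run date is targeted
    return f"{table}${ds:%Y%m%d}"


whole_table = destination_for("my_table", date(2020, 11, 1), None)
one_partition = destination_for("my_table", date(2020, 11, 1), "submission_date")
```

Both `date_partition_parameter: null` and a bare `date_partition_parameter:` parse as null in YAML, so either spelling would take the first branch.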

@ -1,6 +1,7 @@
friendly_name: Mozilla VPN Users
description: >
A subset of Mozilla VPN users columns that are accessible to a broader audience.
A subset of Mozilla VPN users columns that are accessible to a
broader audience.
owners:
- dthorn@mozilla.com
labels:
@ -8,6 +9,8 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_mozilla_vpn
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter: null
referenced_tables: [["moz-fx-data-shared-prod", "mozilla_vpn_external","users_v1"]]
referenced_tables:
[["moz-fx-data-shared-prod", "mozilla_vpn_external", "users_v1"]]

@ -1,6 +1,8 @@
---
friendly_name: Mozilla VPN Waitlist
description: >
A subset of Mozilla VPN waitlist columns and values that are accessible to a broader audience.
A subset of Mozilla VPN waitlist columns and values that are
accessible to a broader audience.
owners:
- dthorn@mozilla.com
labels:
@ -8,6 +10,8 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_mozilla_vpn
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter: null
referenced_tables: [['moz-fx-data-shared-prod', 'mozilla_vpn_external','waitlist_v1']]
referenced_tables: [['moz-fx-data-shared-prod', 'mozilla_vpn_external',
'waitlist_v1']]

@ -1,8 +1,11 @@
---
friendly_name: Mozilla VPN Devices
description: >
A mirror of the devices table from the Mozilla VPN (Guardian) CloudSQL database, updated daily to
match the current state of the table. The table history is not needed, because changes made are not
destructive, except in the case of self-serve data deletion.
A mirror of the devices table from the Mozilla VPN (Guardian)
CloudSQL database, updated daily to match the current state of
the table. The table history is not needed, because changes
made are not destructive, except in the case of self-serve
data deletion.
owners:
- dthorn@mozilla.com
labels:
@ -10,12 +13,16 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_mozilla_vpn
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter: null
depends_on_past: true
parameters:
# The external_database_query argument in EXTERNAL_QUERY must be a literal string or query
# parameter, and cannot be generated at runtime using function calls like CONCAT or FORMAT,
# so the entire value must be provided as a STRING query parameter to handle specific dates:
- "external_database_query:STRING:SELECT * FROM devices WHERE DATE(updated_at) = DATE '{{ds}}'"
# The external_database_query argument in EXTERNAL_QUERY must be
# a literal string or query parameter, and cannot be generated
# at runtime using function calls like CONCAT or FORMAT, so the
# entire value must be provided as a STRING query parameter to
# handle specific dates:
- "external_database_query:STRING:SELECT * FROM devices
WHERE DATE(updated_at) = DATE '{{ds}}'"
referenced_tables: []
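The `parameters` entries in these files follow a `name:TYPE:value` shape, with `{{ds}}` standing in for the run date. A minimal sketch of how such an entry could be split and filled in; `parse_parameter` is a hypothetical helper, and the plain string replacement stands in for whatever template engine the scheduler actually uses.

```python
def parse_parameter(raw: str, ds: str):
    """Split 'name:TYPE:value' and substitute the run date (illustrative)."""
    # Split at most twice so that any colons inside the value survive.
    name, type_, value = raw.split(":", 2)
    return name, type_, value.replace("{{ds}}", ds)


raw = (
    "external_database_query:STRING:"
    "SELECT * FROM devices WHERE DATE(updated_at) = DATE '{{ds}}'"
)
name, type_, value = parse_parameter(raw, "2020-11-01")
```

Passing the whole SQL text as one STRING parameter matches the constraint described in the comment: the query text reaches `EXTERNAL_QUERY` as a parameter instead of being assembled with `CONCAT` or `FORMAT` at runtime.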

@ -1,8 +1,11 @@
---
friendly_name: Mozilla VPN Subscriptions
description: >
A mirror of the subscriptions table from the Mozilla VPN (Guardian) CloudSQL database, updated daily to
match the current state of the table. The table history is not needed, because changes made are not
destructive, except in the case of self-serve data deletion.
A mirror of the subscriptions table from the Mozilla VPN
(Guardian) CloudSQL database, updated daily to
match the current state of the table. The table history
is not needed, because changes made are not destructive,
except in the case of self-serve data deletion.
owners:
- dthorn@mozilla.com
labels:
@ -10,12 +13,16 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_mozilla_vpn
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter: null
depends_on_past: true
parameters:
# The external_database_query argument in EXTERNAL_QUERY must be a literal string or query
# parameter, and cannot be generated at runtime using function calls like CONCAT or FORMAT,
# so the entire value must be provided as a STRING query parameter to handle specific dates:
- "external_database_query:STRING:SELECT * FROM subscriptions WHERE DATE(updated_at) = DATE '{{ds}}'"
# The external_database_query argument in EXTERNAL_QUERY must
# be a literal string or query parameter, and cannot be generated
# at runtime using function calls like CONCAT or FORMAT, so the
# entire value must be provided as a STRING query parameter to
# handle specific dates:
- "external_database_query:STRING:SELECT * FROM subscriptions
WHERE DATE(updated_at) = DATE '{{ds}}'"
referenced_tables: []

@ -1,8 +1,11 @@
---
friendly_name: Mozilla VPN Users
description: >
A mirror of the users table from the Mozilla VPN (Guardian) CloudSQL database, updated daily to
match the current state of the table. The table history is not needed, because changes made are not
destructive, except in the case of self-serve data deletion.
A mirror of the users table from the Mozilla VPN (Guardian)
CloudSQL database, updated daily to match the current state
of the table. The table history is not needed, because changes
made are not destructive, except in the case of self-serve
data deletion.
owners:
- dthorn@mozilla.com
labels:
@ -10,12 +13,16 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_mozilla_vpn
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter: null
depends_on_past: true
parameters:
# The external_database_query argument in EXTERNAL_QUERY must be a literal string or query
# parameter, and cannot be generated at runtime using function calls like CONCAT or FORMAT,
# so the entire value must be provided as a STRING query parameter to handle specific dates:
- "external_database_query:STRING:SELECT * FROM users WHERE DATE(updated_at) = DATE '{{ds}}'"
# The external_database_query argument in EXTERNAL_QUERY must
# be a literal string or query parameter, and cannot be generated
# at runtime using function calls like CONCAT or FORMAT, so
# the entire value must be provided as a STRING query parameter
# to handle specific dates:
- "external_database_query:STRING:SELECT * FROM users
WHERE DATE(updated_at) = DATE '{{ds}}'"
referenced_tables: []

@ -1,8 +1,11 @@
---
friendly_name: Mozilla VPN Waitlist
description: >
A mirror of the waitlist table from the Mozilla VPN (Guardian) CloudSQL database, updated daily to
match the current state of the table. The table history is not needed, because changes made are not
destructive, except in the case of self-serve data deletion.
A mirror of the waitlist table from the Mozilla VPN (Guardian)
CloudSQL database, updated daily to match the current state of
the table. The table history is not needed, because changes made
are not destructive, except in the case of self-serve
data deletion.
owners:
- dthorn@mozilla.com
labels:
@ -10,12 +13,16 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_mozilla_vpn
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter: null
depends_on_past: true
parameters:
# The external_database_query argument in EXTERNAL_QUERY must be a literal string or query
# parameter, and cannot be generated at runtime using function calls like CONCAT or FORMAT,
# so the entire value must be provided as a STRING query parameter to handle specific dates:
- "external_database_query:STRING:SELECT * FROM vpn_waitlist WHERE DATE(updated_at) = DATE '{{ds}}'"
# The external_database_query argument in EXTERNAL_QUERY must
# be a literal string or query parameter, and cannot be generated
# at runtime using function calls like CONCAT or FORMAT, so the
# entire value must be provided as a STRING query parameter to
# handle specific dates:
- "external_database_query:STRING:SELECT * FROM vpn_waitlist
WHERE DATE(updated_at) = DATE '{{ds}}'"
referenced_tables: []

@ -1,3 +1,4 @@
---
friendly_name: Geckoview versions corresponding to build hours
description: |
Enumerate the most common GeckoView version for each build hour for nightly"

@ -1,3 +1,4 @@
---
friendly_name: Firefox Android Release Event Types
description: Recreate a view on the most recent data from event_types_v1
owners:
@ -9,6 +10,8 @@ labels:
scheduling:
dag_name: bqetl_fenix_event_rollup
destination_table: null
sql_file_path: 'sql/moz-fx-data-shared-prod/org_mozilla_firefox/event_types_v1/query.sql'
referenced_tables: [['moz-fx-data-shared-prod', 'org_mozilla_firefox_derived', 'event_types_v1']]
sql_file_path: 'sql/moz-fx-data-shared-prod/org_mozilla_firefox
/event_types_v1/query.sql'
referenced_tables: [['moz-fx-data-shared-prod', 'org_mozilla_firefox_derived',
'event_types_v1']]
parameters: ['submission_date:DATE:{{ds}}']
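The rewrapped `sql_file_path` and `referenced_tables` values in this hunk rely on YAML line folding. Inside a quoted scalar, a single line break and the whitespace around it fold into one space. That is harmless between items of a flow sequence, but inside a quoted path it would insert a space. The sketch below is a simplified model of that rule (it assumes no blank lines and no escape sequences); `fold_quoted` is a hypothetical helper, not a YAML parser.

```python
def fold_quoted(lines):
    """Simplified model of YAML folding for a multi-line quoted scalar."""
    if len(lines) == 1:
        return lines[0]
    first, *middle, last = lines
    # Whitespace next to each folded line break is dropped,
    # and each break becomes a single space.
    parts = [first.rstrip()] + [m.strip() for m in middle] + [last.lstrip()]
    return " ".join(parts)


path = fold_quoted([
    "sql/moz-fx-data-shared-prod/org_mozilla_firefox",
    "  /event_types_v1/query.sql",
])
```

Under this reading, the folded path contains a space before `/event_types_v1`. Keeping the path on one line, for example with a `# yamllint disable-line rule:line-length` comment, would avoid that; whether downstream code tolerates the space is not shown in this diff.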

@ -1,5 +1,7 @@
---
friendly_name: Firefox Android Release Event Types
description: Retrieve the set of [events, event_properties] and record them in a table.
description: Retrieve the set of [events, event_properties]
and record them in a table.
owners:
- frank@mozilla.com
labels:

@ -1,3 +1,4 @@
---
friendly_name: Events Daily
description: Packed event representation with one-row per-client
owners:

@ -1,3 +1,4 @@
---
friendly_name: Migrated Clients
description: >
Contains the Fennec Client ID and Fenix Client ID for any clients

@ -1,3 +1,4 @@
---
friendly_name: VR Browser Baseline Daily
description: >
A daily aggregate of baseline pings from each Firefox Reality client,

@ -1,7 +1,8 @@
---
friendly_name: VR Browser Clients Daily
description: >
A daily aggregate of baseline and metrics pings from each Firefox Reality client,
partitioned by day
A daily aggregate of baseline and metrics pings from each
Firefox Reality client, partitioned by day
owners:
- jklukas@mozilla.com
- ascholtz@mozilla.com

@ -1,3 +1,4 @@
---
friendly_name: VR Browser Clients Last Seen
description: >
Captures history of activity of each Firefox Reality client in 28 day

@ -1,3 +1,4 @@
---
friendly_name: VR Browser Metrics Daily
description: >
A daily aggregate of metrics pings from each Firefox Reality client,

@ -1,3 +1,4 @@
---
friendly_name: Mobile Search Clients Daily
description: >
Daily mobile clients sending pings that have searches,

@ -1,7 +1,8 @@
---
friendly_name: Mobile Search Clients Daily
description: >
A daily aggregate of baseline and metrics pings that have searches from each mobile client,
partitioned by day
A daily aggregate of baseline and metrics pings that have
searches from each mobile client, partitioned by day
owners:
- bewu@mozilla.com
labels:

@ -1,3 +1,4 @@
---
friendly_name: Mobile Search Clients Last Seen
description: >-
Captures search activity of a single mobile client in the past 365 days

@ -1,3 +1,4 @@
---
friendly_name: Search Aggregates
description: >-
Daily search clients, aggregated across unique sets of dimensions

@ -1,3 +1,4 @@
---
friendly_name: Search Clients Daily
description: A daily aggregate of search clients, partitioned by day
owners:

@ -1,3 +1,4 @@
---
friendly_name: Search Clients Last Seen
description: >-
Captures history of activity of each search client in 28 day

@ -1,7 +1,11 @@
---
friendly_name: Search Metric Contribution
description: >
A table helps us look at question in 28 day
windows for each submission date: "What percent of one search metirc (ad clicks, search with ads, sap, active hours) are contributed by users in the top [10, 20, 30] percent of the (ad clicks, search with ads, sap, active hours) distribution?"
windows for each submission date: "What percent of one search
metirc (ad clicks, search with ads, sap, active hours) are contributed
by users in the top [10, 20, 30] percent of the (ad clicks, search with
ads, sap, active hours) distribution?"
owners:
- bmiroglio@mozilla.com
labels:
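The description in this file asks what share of a search metric comes from the top 10, 20, or 30 percent of users. A small sketch of that calculation on made-up numbers; `top_share`, the sample values, and the choice to round the cutoff up are all assumptions for illustration and are not taken from the table's query.

```python
import math


def top_share(values, top_percent):
    """Percent of the metric total contributed by the top `top_percent` of users."""
    ranked = sorted(values, reverse=True)
    # Number of users in the top slice, rounded up (an assumption).
    k = math.ceil(len(ranked) * top_percent / 100)
    total = sum(ranked)
    return 100 * sum(ranked[:k]) / total if total else 0.0


# Hypothetical per-user ad click counts over one 28 day window.
ad_clicks = [50, 20, 10, 5, 5, 4, 3, 2, 1, 0]
shares = {p: top_share(ad_clicks, p) for p in (10, 20, 30)}
```

With these sample values the single heaviest user accounts for half of all ad clicks, which is the kind of concentration the table is meant to expose.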

@ -7,6 +7,8 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
referenced_tables: [['moz-fx-data-shared-prod', 'stripe_external', 'customers_v1']]
referenced_tables:
[["moz-fx-data-shared-prod", "stripe_external", "customers_v1"]]

@ -1,3 +1,4 @@
---
friendly_name: Stripe Plans
description: Stripe Plans as of UTC midnight. See bug 1667889.
owners:
@ -7,6 +8,8 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
referenced_tables: [['moz-fx-data-shared-prod', 'stripe_external', 'plans_v1']]
referenced_tables:
[["moz-fx-data-shared-prod", "stripe_external", "plans_v1"]]

@ -1,3 +1,4 @@
---
friendly_name: Stripe Products
description: Stripe Products as of UTC midnight. See bug 1667889.
owners:
@ -7,6 +8,8 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
referenced_tables: [['moz-fx-data-shared-prod', 'stripe_external', 'products_v1']]
referenced_tables: [['moz-fx-data-shared-prod', 'stripe_external',
'products_v1']]

@ -1,3 +1,4 @@
---
friendly_name: Stripe Subscriptions
description: Stripe Subscriptions as of UTC midnight. See bug 1667889.
owners:
@ -7,6 +8,8 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
referenced_tables: [['moz-fx-data-shared-prod', 'stripe_external', 'subscriptions_v1']]
referenced_tables: [['moz-fx-data-shared-prod', 'stripe_external',
'subscriptions_v1']]

@ -1,3 +1,4 @@
---
friendly_name: Stripe Charges
description: Stripe Charges as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Stripe Credit Notes
description: Stripe Credit Notes as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Stripe Customers
description: Stripe Customers as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Stripe Disputes
description: Stripe Disputes as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Stripe Invoices
description: Stripe Invoices as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Stripe Payment Intents
description: Stripe Payment Intents as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Stripe Payouts
description: Stripe Payouts as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Stripe Plans
description: Stripe Plans as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Stripe Prices
description: Stripe Prices as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Stripe Products
description: Stripe Products as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Stripe Setup Intents
description: Stripe Setup Intents as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Stripe Subscriptions
description: Stripe Subscriptions as of UTC midnight
owners:
@ -7,10 +8,11 @@ labels:
schedule: daily
scheduling:
dag_name: bqetl_stripe
# destination is the whole table, not a single partition, so don't use date_partition_parameter
# destination is the whole table, not a single partition,
# so don't use date_partition_parameter
date_partition_parameter:
parameters:
- 'date:DATE:{{ds}}'
- "date:DATE:{{ds}}"
depends_on:
- dag_name: stripe
task_id: stripe_import_events

@ -1,3 +1,4 @@
---
friendly_name: Clients first seen
description: >-
Identical in schema to clients_daily except that each client appears only

@ -1,3 +1,4 @@
---
friendly_name: Desktop 1-Week Retention
description: >
Among profiles that were active at least once in the week starting on the

@ -1,3 +1,4 @@
---
friendly_name: Addon Aggregates
description: >-
Addon usages by clients, aggregated across unique sets of dimensions
@ -12,4 +13,5 @@ scheduling:
dag_name: bqetl_addons
# provide this value so that DAG generation does not have to dry run the
# query to get it, and that would be slow because main_v4 is referenced
referenced_tables: [['moz-fx-data-shared-prod', 'telemetry_stable', 'main_v4']]
referenced_tables: [['moz-fx-data-shared-prod', 'telemetry_stable',
'main_v4']]
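The comment in this hunk says `referenced_tables` is listed so that DAG generation can skip a slow dry run. A sketch of that fallback pattern; `resolve_referenced_tables` and the `dry_run` callable are hypothetical names, not bigquery-etl's API.

```python
def resolve_referenced_tables(scheduling, dry_run):
    """Prefer tables listed in the metadata; fall back to a (slow) dry run."""
    listed = scheduling.get("referenced_tables")
    if listed is not None:
        # Each entry is a [project, dataset, table] triple.
        return [".".join(parts) for parts in listed]
    return dry_run()


dry_run_calls = []


def fake_dry_run():
    """Stand-in for an expensive query dry run; records that it was called."""
    dry_run_calls.append(1)
    return []


tables = resolve_referenced_tables(
    {"referenced_tables": [
        ["moz-fx-data-shared-prod", "telemetry_stable", "main_v4"],
    ]},
    fake_dry_run,
)
```

In this sketch an explicit empty list (`referenced_tables: []`) also skips the dry run, since only a missing key triggers the fallback.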

@ -1,3 +1,4 @@
---
friendly_name: Addon Names
description: Addon IDs, names and number of occurences.
owners:
@ -15,4 +16,5 @@ scheduling:
parameters: ["submission_date:DATE:{{ds}}"]
# provide this value so that DAG generation does not have to dry run the
# query to get it, and that would be slow because main_v4 is referenced
referenced_tables: [['moz-fx-data-shared-prod', 'telemetry_stable', 'main_v4']]
referenced_tables: [['moz-fx-data-shared-prod', 'telemetry_stable',
'main_v4']]

Some files were not shown because too many files were changed.