* DENG-850 Add test setup.

* DS-2947. Create new dataset and tests for Firefox Desktop Clients.

* DS-2947. Update dataset name to clients_first_seen_v2.

* DS-2947. Dataset name to clients_first_seen_v2.

* DS-2947. Updating tests.

* DS-2947. Schema for clients_first_seen_v2.

* DS-2947. Tests update.

* Tests update

* Restore test files

* DS-2947. Get data from main and new profile ping. Get first dltoken and dlsource available. Update tests.

* DS-2947. Use main ping's submission_timestamp_min to find the earliest ping.

* Remove app_display_version as it is normalised in app_version. Update fields on a 7 day window. Retrieve data from the ping with the earliest NOT NULL value to remove NULLS when main ping is not available.

* Remove app_display_version as it is normalised in app_version. Update fields on a 7 day window. Retrieve data from the ping with the earliest NOT NULL value to remove NULLS when main ping is not available.

* Update schemas, remove duplicated columns from query and init. Adapt existing unit test and add unit test for 7-day window updates. Include scheduler in DAG bqetl_analytics_tables.

* Update to enable initialize from query.sql. Remove init.sql.

* Update DAGs dependencies.

* DAG bqetl_main_summary updated.

* Query and tests update to join with sample_id.

* Refactor metadata fields in query and tests.

* Schema and descriptions updated. Remove filter to query the existing table. Remove the DATETIME, the first_seen_date is equivalent.

* Column required to be explicit in the query to match the schema.

* Test fix.

* Tests tmp changes.

* Remove 7-day window update and update tests.

* Add second_seen_date to the query

* DS-3037 Add second_seen_date and tests.

* DS-3037 Add is_init to calculate second_seen_date.

* remove files in analysis dataset

* DS-3037 Add is_init to calculate second_seen_date. Formatting.

* DS-2986 Add initialize script. Change submission_timestamp_min to submission_date due to NULL values in that field.

* DENG-1314. Update metadata reported pings and tests in the query.

* DS-3054. Update bqetl initialize command and query to support parallel run.

* DS-3054. Update query to use submission_timestamp_min from main ping where available for precision in source ping for first_seen_date, add source ping of second_seen_date, get only first_seen date from new_profile and shutdown ping due to 16% clients with more than one new_profile ping. Add capability to run in parallel in bqetl. Update tests.

* DS-3054. Remove initialize.py.

* Reset unrelated formatting changes from this branch to match the main branch.

* Correct jira template.

* DS-2947. Update naming for attribution dltoken and dlsource.

* DS-2947. Update column names and tests and clarity for the initialization command.

* DS-2986. Create table with schema and metadata in command initialize.

* Document what is the result expected from each subquery.

* Documentation update.

* DS-3145. Include user agent and the source ping. Add query documentation.

* DS-3145. Update tests.

* DS-3146 Update logic to get attributes only from the ping that reports the first_seen_date, include locale, update the source for app_build_id and collect second_seen_date only from main ping.

* DS-3146 Update logic to get attributes only from the ping that reports the first_seen_date, include locale, update the source for app_build_id and collect second_seen_date only from main ping.

* DS-3054. Updates and save initialization.

* DS-3054. Table name required for DAG generation.

* DS-2947_implement_bigquery_changes_in_another_PR.

* DS-2947 Naming

* Add clients_first_seen_v2 to skip dry-run.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Lucia 2023-09-25 16:39:04 +02:00, committed by GitHub
Parent 7040463033
Commit 6826cbe395
No key found matching this signature
GPG key ID: 4AEE18F83AFDEB23
22 changed files: 1694 additions and 0 deletions


@@ -208,6 +208,7 @@ dry_run:
- sql/moz-fx-data-shared-prod/telemetry_derived/clients_last_seen_v1/init.sql
# HTTP Error 408: Request Time-out
- sql/moz-fx-data-shared-prod/telemetry_derived/clients_first_seen_v1/query.sql
- sql/moz-fx-data-shared-prod/telemetry_derived/clients_first_seen_v2/query.sql
- sql/moz-fx-data-shared-prod/telemetry_derived/clients_last_seen_v1/query.sql
- sql/moz-fx-data-shared-prod/telemetry_derived/latest_versions/query.sql
- sql/moz-fx-data-shared-prod/telemetry_derived/italy_covid19_outage_v1/query.sql


@@ -47,6 +47,22 @@ with DAG(
    doc_md=docs,
    tags=tags,
) as dag:
    clients_first_seen_v2 = bigquery_etl_query(
        task_id="clients_first_seen_v2",
        destination_table="clients_first_seen_v2",
        dataset_id="telemetry_derived",
        project_id="moz-fx-data-shared-prod",
        owner="lvargas@mozilla.com",
        email=[
            "gkaberere@mozilla.com",
            "lvargas@mozilla.com",
            "telemetry-alerts@mozilla.com",
        ],
        date_partition_parameter=None,
        depends_on_past=True,
        parameters=["submission_date:DATE:{{ds}}"],
    )
    firefox_android_clients = bigquery_etl_query(
        task_id="firefox_android_clients",
        destination_table="firefox_android_clients_v1",
@@ -75,6 +91,32 @@ with DAG(
    firefox_android_clients_external.set_upstream(firefox_android_clients)
    wait_for_copy_deduplicate_all = ExternalTaskSensor(
        task_id="wait_for_copy_deduplicate_all",
        external_dag_id="copy_deduplicate",
        external_task_id="copy_deduplicate_all",
        execution_delta=datetime.timedelta(seconds=3600),
        check_existence=True,
        mode="reschedule",
        allowed_states=ALLOWED_STATES,
        failed_states=FAILED_STATES,
        pool="DATA_ENG_EXTERNALTASKSENSOR",
    )
    clients_first_seen_v2.set_upstream(wait_for_copy_deduplicate_all)
    wait_for_telemetry_derived__clients_daily__v6 = ExternalTaskSensor(
        task_id="wait_for_telemetry_derived__clients_daily__v6",
        external_dag_id="bqetl_main_summary",
        external_task_id="telemetry_derived__clients_daily__v6",
        check_existence=True,
        mode="reschedule",
        allowed_states=ALLOWED_STATES,
        failed_states=FAILED_STATES,
        pool="DATA_ENG_EXTERNALTASKSENSOR",
    )
    clients_first_seen_v2.set_upstream(wait_for_telemetry_derived__clients_daily__v6)
    wait_for_baseline_clients_daily = ExternalTaskSensor(
        task_id="wait_for_baseline_clients_daily",
        external_dag_id="copy_deduplicate",


@@ -147,6 +147,12 @@ with DAG(
    with TaskGroup(
        "telemetry_derived__clients_daily__v6_external"
    ) as telemetry_derived__clients_daily__v6_external:
        ExternalTaskMarker(
            task_id="bqetl_analytics_tables__wait_for_telemetry_derived__clients_daily__v6",
            external_dag_id="bqetl_analytics_tables",
            external_task_id="wait_for_telemetry_derived__clients_daily__v6",
        )
        ExternalTaskMarker(
            task_id="bqetl_internet_outages__wait_for_telemetry_derived__clients_daily__v6",
            external_dag_id="bqetl_internet_outages",


@@ -0,0 +1,42 @@
friendly_name: Clients First Seen
description: |-
  Picks out just the first observations per client based on
  the main, new_profile and first_shutdown pings.
  - All attributes are retrieved from the ping that reports
  the first_seen_date, using the earliest timestamp and respecting nulls.
  - Only the second_seen_date and client's reported pings are updated
  daily when NULL.
  This table should be accessed through the user-facing view
  `telemetry.clients_first_seen`.
  The initialization command creates the table and then inserts data
  in parallel for each sample_id. Usage:
  `bqetl query initialize sql/moz-fx-data-shared-prod/telemetry_derived/clients_first_seen_v2/query.sql`
owners:
- lvargas@mozilla.com
labels:
  application: firefox
  incremental: true
  schedule: daily
  depends_on_past: true
  owner1: lvargas
scheduling:
  dag_name: bqetl_analytics_tables
  task_name: clients_first_seen_v2
  depends_on_past: true
  date_partition_parameter: null
  parameters:
  - submission_date:DATE:{{ds}}
bigquery:
  time_partitioning:
    type: day
    field: first_seen_date
    require_partition_filter: false
    expiration_days: null
  clustering:
    fields:
    - normalized_channel
    - sample_id
    - country
    - normalized_os
references: {}
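
The parallel initialization mentioned in the description above can be pictured roughly as follows: the query is rendered once per sample_id and the resulting statements are run concurrently. This is only a sketch, not the bqetl implementation. `QUERY_TEMPLATE`, `render_queries`, `run_in_parallel` and the stand-in `run_query` callable are hypothetical names, and the sketch assumes sample_id takes the values 0 through 99.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the rendered init query. Only the {sample_id}
# placeholder mirrors query.sql; the rest is abbreviated for illustration.
QUERY_TEMPLATE = (
    "INSERT INTO `telemetry_derived.clients_first_seen_v2` "
    "SELECT ... WHERE sample_id = {sample_id}"
)


def render_queries(template, sample_ids):
    """Render one statement per sample_id."""
    return [template.format(sample_id=s) for s in sample_ids]


def run_in_parallel(queries, run_query, max_workers=10):
    """Run every statement through run_query using a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_query, queries))


# Assumption: sample_id ranges over 0..99.
queries = render_queries(QUERY_TEMPLATE, range(100))

# A real run would submit each statement to BigQuery; here run_query is a
# harmless placeholder that only measures the statement length.
results = run_in_parallel(queries, run_query=len)
```

Because each statement only touches one sample_id, the inserts do not overlap, which is what makes running them side by side safe in this sketch.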


@@ -0,0 +1,360 @@
-- Query for telemetry_derived.clients_first_seen_v2
{% if is_init() and parallel_run() %}
INSERT INTO
`moz-fx-data-shared-prod.telemetry_derived.clients_first_seen_v2`
{% endif %}
-- Query for telemetry_derived.clients_first_seen_v2
-- Each ping type subquery retrieves all attributes as reported on the first
-- ping received and respecting NULLS.
-- Once the first_seen_date is identified after comparing all pings, attributes
-- are retrieved for each client_id from the ping type that reported it.
WITH new_profile_ping AS (
SELECT
client_id AS client_id,
sample_id AS sample_id,
MIN(submission_timestamp) AS first_seen_timestamp,
ARRAY_AGG(DATE(submission_timestamp) ORDER BY submission_timestamp ASC) AS all_dates,
ARRAY_AGG(application.architecture RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS architecture,
ARRAY_AGG(environment.build.build_id RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS app_build_id,
ARRAY_AGG(normalized_app_name RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS app_name,
ARRAY_AGG(environment.settings.locale RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS locale,
ARRAY_AGG(application.platform_version RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS platform_version,
ARRAY_AGG(application.vendor RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS vendor,
ARRAY_AGG(application.version RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS app_version,
ARRAY_AGG(application.xpcom_abi RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS xpcom_abi,
ARRAY_AGG(document_id RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS document_id,
ARRAY_AGG(environment.partner.distribution_id RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS distribution_id,
ARRAY_AGG(environment.partner.distribution_version RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS partner_distribution_version,
ARRAY_AGG(environment.partner.distributor RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS partner_distributor,
ARRAY_AGG(environment.partner.distributor_channel RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS partner_distributor_channel,
ARRAY_AGG(environment.partner.partner_id RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS partner_id,
ARRAY_AGG(environment.settings.attribution.campaign RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_campaign,
ARRAY_AGG(environment.settings.attribution.content RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_content,
ARRAY_AGG(environment.settings.attribution.experiment RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_experiment,
ARRAY_AGG(environment.settings.attribution.medium RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_medium,
ARRAY_AGG(environment.settings.attribution.source RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_source,
ARRAY_AGG(environment.settings.attribution.ua RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_ua,
ARRAY_AGG(environment.settings.default_search_engine_data.load_path RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS engine_data_load_path,
ARRAY_AGG(environment.settings.default_search_engine_data.name RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS engine_data_name,
ARRAY_AGG(environment.settings.default_search_engine_data.origin RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS engine_data_origin,
ARRAY_AGG(environment.settings.default_search_engine_data.submission_url RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS engine_data_submission_url,
ARRAY_AGG(environment.system.apple_model_id RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS apple_model_id,
ARRAY_AGG(metadata.geo.city RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS city,
ARRAY_AGG(metadata.geo.db_version RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS db_version,
ARRAY_AGG(metadata.geo.subdivision1 RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS subdivision1,
ARRAY_AGG(normalized_channel RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS normalized_channel,
ARRAY_AGG(normalized_country_code RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS country,
ARRAY_AGG(normalized_os RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS normalized_os,
ARRAY_AGG(normalized_os_version RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS normalized_os_version,
ARRAY_AGG(payload.processes.parent.scalars.startup_profile_selection_reason RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS startup_profile_selection_reason,
ARRAY_AGG(environment.settings.attribution.dltoken RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_dltoken,
ARRAY_AGG(environment.settings.attribution.dlsource RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_dlsource,
FROM
`moz-fx-data-shared-prod.telemetry.new_profile`
WHERE
{% if is_init() %}
DATE(submission_timestamp) >= '2010-01-01'
{% if parallel_run() %}
AND sample_id = {sample_id}
{% endif %}
{% else %}
DATE(submission_timestamp) = @submission_date
{% endif %}
GROUP BY
client_id,
sample_id
),
shutdown_ping AS (
SELECT
client_id AS client_id,
sample_id AS sample_id,
MIN(submission_timestamp) AS first_seen_timestamp,
ARRAY_AGG(DATE(submission_timestamp) ORDER BY submission_timestamp ASC) AS all_dates,
ARRAY_AGG(application.architecture RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS architecture,
ARRAY_AGG(environment.build.build_id RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS app_build_id,
ARRAY_AGG(normalized_app_name RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS app_name,
ARRAY_AGG(environment.settings.locale RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS locale,
ARRAY_AGG(application.platform_version RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS platform_version,
ARRAY_AGG(application.vendor RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS vendor,
ARRAY_AGG(application.version RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS app_version,
ARRAY_AGG(application.xpcom_abi RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS xpcom_abi,
ARRAY_AGG(document_id RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS document_id,
ARRAY_AGG(environment.partner.distribution_id RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS distribution_id,
ARRAY_AGG(environment.partner.distribution_version RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS partner_distribution_version,
ARRAY_AGG(environment.partner.distributor RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS partner_distributor,
ARRAY_AGG(environment.partner.distributor_channel RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS partner_distributor_channel,
ARRAY_AGG(environment.partner.partner_id RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS partner_id,
ARRAY_AGG(environment.settings.attribution.campaign RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_campaign,
ARRAY_AGG(environment.settings.attribution.content RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_content,
ARRAY_AGG(environment.settings.attribution.experiment RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_experiment,
ARRAY_AGG(environment.settings.attribution.medium RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_medium,
ARRAY_AGG(environment.settings.attribution.source RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_source,
ARRAY_AGG(environment.settings.attribution.ua RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_ua,
ARRAY_AGG(environment.settings.default_search_engine_data.load_path RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS engine_data_load_path,
ARRAY_AGG(environment.settings.default_search_engine_data.name RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS engine_data_name,
ARRAY_AGG(environment.settings.default_search_engine_data.origin RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS engine_data_origin,
ARRAY_AGG(environment.settings.default_search_engine_data.submission_url RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS engine_data_submission_url,
ARRAY_AGG(environment.system.apple_model_id RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS apple_model_id,
ARRAY_AGG(metadata.geo.city RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS city,
ARRAY_AGG(metadata.geo.db_version RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS db_version,
ARRAY_AGG(metadata.geo.subdivision1 RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS subdivision1,
ARRAY_AGG(normalized_channel RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS normalized_channel,
ARRAY_AGG(normalized_country_code RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS country,
ARRAY_AGG(normalized_os RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS normalized_os,
ARRAY_AGG(normalized_os_version RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS normalized_os_version,
ARRAY_AGG(payload.processes.parent.scalars.startup_profile_selection_reason RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS startup_profile_selection_reason,
ARRAY_AGG(environment.settings.attribution.dltoken RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_dltoken,
ARRAY_AGG(environment.settings.attribution.dlsource RESPECT NULLS ORDER BY submission_timestamp)[SAFE_OFFSET(0)] AS attribution_dlsource,
FROM
`moz-fx-data-shared-prod.telemetry.first_shutdown`
WHERE
{% if is_init() %}
DATE(submission_timestamp) >= '2010-01-01'
{% if parallel_run() %}
AND sample_id = {sample_id}
{% endif %}
{% else %}
DATE(submission_timestamp) = @submission_date
{% endif %}
GROUP BY
client_id,
sample_id
),
main_ping AS (
-- The columns set as NULL are not available in clients_daily_v6 and need to be
-- retrieved in the ETL from telemetry_stable.main_v4:<column>.
SELECT
client_id AS client_id,
sample_id AS sample_id,
-- The submission_timestamp_min is used to compare with the TIMESTAMP of
-- the new_profile and first shutdown pings.
-- It was implemented on Dec 16, 2019 and has data from 2018-10-30.
IF(
MIN(submission_date) >= '2018-10-30',
MIN(submission_timestamp_min),
TIMESTAMP(MIN(submission_date))
) AS first_seen_timestamp,
ARRAY_AGG(DATE(submission_date) ORDER BY submission_date ASC) AS all_dates,
CAST(NULL AS STRING) AS architecture, -- main_v4:environment.build.architecture
ARRAY_AGG(env_build_id RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS app_build_id,
ARRAY_AGG(app_name RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS app_name,
ARRAY_AGG(locale RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS locale,
CAST(NULL AS STRING) AS platform_version, -- main_v4:environment.build.platform_version
ARRAY_AGG(vendor RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS vendor,
ARRAY_AGG(app_version RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS app_version,
CAST(NULL AS STRING) AS xpcom_abi, -- main_v4:environment.build.xpcom_abi / application.xpcom_abi
CAST(NULL AS STRING) AS document_id, -- main_v4:document_id
ARRAY_AGG(distribution_id RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS distribution_id,
CAST(NULL AS STRING) AS partner_distribution_version, -- main_v4:environment.partner.distribution_version
CAST(NULL AS STRING) AS partner_distributor, -- main_v4:environment.partner.distributor
CAST(NULL AS STRING) AS partner_distributor_channel, -- main_v4:environment.partner.distributor_channel
CAST(NULL AS STRING) AS partner_id, -- main_v4:environment.partner.partner_id
ARRAY_AGG(attribution.campaign RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS attribution_campaign,
ARRAY_AGG(attribution.content RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS attribution_content,
ARRAY_AGG(attribution.experiment RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS attribution_experiment,
ARRAY_AGG(attribution.medium RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS attribution_medium,
ARRAY_AGG(attribution.source RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS attribution_source,
CAST(NULL AS STRING) AS attribution_ua, -- main_v4:environment.settings.attribution.ua
ARRAY_AGG(default_search_engine_data_load_path RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS engine_data_load_path,
ARRAY_AGG(default_search_engine_data_name RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS engine_data_name,
ARRAY_AGG(default_search_engine_data_origin RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS engine_data_origin,
ARRAY_AGG(default_search_engine_data_submission_url RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS engine_data_submission_url,
CAST(NULL AS STRING) AS apple_model_id, -- main_v4:environment.system.apple_model_id
ARRAY_AGG(city RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS city,
CAST(NULL AS STRING) AS db_version, -- main_v4:metadata.geo.db_version
ARRAY_AGG(geo_subdivision1 RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS subdivision1,
ARRAY_AGG(normalized_channel RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS normalized_channel,
ARRAY_AGG(country RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS country,
ARRAY_AGG(os RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS normalized_os,
ARRAY_AGG(normalized_os_version RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS normalized_os_version,
CAST(NULL AS STRING) AS startup_profile_selection_reason, -- main_v4:payload.processes.parent.scalars.startup_profile_selection_reason
ARRAY_AGG(attribution.dltoken RESPECT NULLS ORDER BY submission_date)[SAFE_OFFSET(0)] AS attribution_dltoken,
CAST(NULL AS STRING) AS attribution_dlsource -- main_v4:environment.settings.attribution.dlsource
FROM
`moz-fx-data-shared-prod.telemetry_derived.clients_daily_v6`
WHERE
{% if is_init() %}
submission_date >= '2010-01-01'
{% if parallel_run() %}
AND sample_id = {sample_id}
{% endif %}
{% else %}
submission_date = @submission_date
{% endif %}
GROUP BY
client_id,
sample_id
),
unioned AS (
SELECT
*,
'new_profile' AS source_ping
FROM
new_profile_ping
UNION ALL
SELECT
*,
'shutdown' AS source_ping
FROM
shutdown_ping
UNION ALL
SELECT
*,
'main' AS source_ping
FROM
main_ping
),
-- The next CTE unions all reported dates from all pings.
-- It's required to find the first and second seen dates.
dates_with_reporting_ping AS (
SELECT
client_id,
ARRAY_CONCAT(
ARRAY_AGG(
STRUCT(
all_dates AS value,
unioned.source_ping AS value_source,
all_dates AS value_date
) IGNORE NULLS
)
) AS seen_dates
FROM
unioned
LEFT JOIN
UNNEST(all_dates) AS all_dates
GROUP BY
client_id
),
-- The next CTE returns the first_seen_date and reporting ping.
-- The timestamp is also required to retrieve the first_seen attributes.
first_seen_date AS (
SELECT
client_id,
DATE(MIN(first_seen_timestamp)) AS first_seen_date,
MIN(first_seen_timestamp) AS first_seen_timestamp,
ANY_VALUE(source_ping HAVING MIN first_seen_timestamp) AS first_seen_source_ping,
FROM
unioned
GROUP BY
client_id
),
-- The next CTE returns the second_seen_date. It finds the next date after first_seen_date
-- as reported by the main ping. Further dates reported by other pings are skipped.
-- The source ping for second_seen_date is always the main ping.
second_seen_date AS (
SELECT
client_id,
IF(
ARRAY_LENGTH(ARRAY_AGG(seen_dates)) > 1,
ARRAY_AGG(seen_dates ORDER BY value_date ASC)[SAFE_OFFSET(1)],
NULL
) AS second_seen_date
FROM
dates_with_reporting_ping
LEFT JOIN
UNNEST(seen_dates) AS seen_dates
WHERE
seen_dates.value_source = 'main'
GROUP BY
client_id
),
-- The next CTE returns the pings ever reported by each client
-- Different from other attributes, this data is updated daily when it's NULL,
-- so it's not limited to the first_seen_date.
reported_pings AS (
SELECT
client_id,
'main' IN UNNEST(ARRAY_AGG(source_ping)) AS reported_main_ping,
'new_profile' IN UNNEST(ARRAY_AGG(source_ping)) AS reported_new_profile_ping,
'shutdown' IN UNNEST(ARRAY_AGG(source_ping)) AS reported_shutdown_ping
FROM
unioned
GROUP BY
client_id
),
_current AS (
SELECT
unioned.client_id AS client_id,
unioned.sample_id AS sample_id,
fsd.first_seen_date AS first_seen_date,
ssd.second_seen_date.value AS second_seen_date,
unioned.* EXCEPT (client_id, sample_id, first_seen_timestamp, all_dates, source_ping),
STRUCT(
fsd.first_seen_source_ping AS first_seen_date_source_ping,
pings.reported_main_ping AS reported_main_ping,
pings.reported_new_profile_ping AS reported_new_profile_ping,
pings.reported_shutdown_ping AS reported_shutdown_ping
) AS metadata
FROM
unioned
INNER JOIN
first_seen_date AS fsd
USING
(client_id, first_seen_timestamp)
LEFT JOIN
second_seen_date AS ssd
USING
(client_id)
LEFT JOIN
reported_pings AS pings
USING
(client_id)
),
_previous AS (
SELECT
*
FROM
`moz-fx-data-shared-prod.telemetry_derived.clients_first_seen_v2`
)
SELECT
{% if is_init() %}
IF(_previous.client_id IS NULL, _current, _previous).*
{% else %}
-- For the daily update:
-- The reported ping status in the metadata is updated when it's NULL.
-- The second_seen_date is updated when it's NULL and only if there is a
-- main ping reported on the submission_date.
-- Every other attribute remains as reported on the first_seen_date.
IF(_previous.client_id IS NULL, _current, _previous).* REPLACE (
IF(
_previous.first_seen_date IS NOT NULL
AND _previous.second_seen_date IS NULL
AND _current.client_id IS NOT NULL
AND _current.metadata.reported_main_ping,
@submission_date,
_previous.second_seen_date
) AS second_seen_date,
(
SELECT AS STRUCT
IF(_previous.client_id IS NULL, _current, _previous).metadata.* REPLACE (
IF(
_previous.client_id IS NULL
OR _previous.metadata.reported_main_ping IS FALSE
AND _current.metadata.reported_main_ping IS TRUE,
_current.metadata.reported_main_ping,
_previous.metadata.reported_main_ping
) AS reported_main_ping,
IF(
_previous.client_id IS NULL
OR _previous.metadata.reported_new_profile_ping IS FALSE
AND _current.metadata.reported_new_profile_ping IS TRUE,
_current.metadata.reported_new_profile_ping,
_previous.metadata.reported_new_profile_ping
) AS reported_new_profile_ping,
IF(
_previous.client_id IS NULL
OR _previous.metadata.reported_shutdown_ping IS FALSE
AND _current.metadata.reported_shutdown_ping IS TRUE,
_current.metadata.reported_shutdown_ping,
_previous.metadata.reported_shutdown_ping
) AS reported_shutdown_ping
)
) AS metadata
)
{% endif %}
FROM
_previous
FULL JOIN
_current
USING
(client_id)
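
The daily update performed by the final SELECT can be summarized with a small Python model. It is a sketch of the rules stated in the query comments, not code from the repository: `Row`, `second_main_date`, `keep_or_set` and `daily_merge` are hypothetical names, and only the fields relevant to the update are modelled.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Row:
    client_id: str
    first_seen_date: date
    second_seen_date: Optional[date]
    reported_main_ping: bool
    reported_new_profile_ping: bool
    reported_shutdown_ping: bool


def second_main_date(main_dates):
    """Second date reported by the main ping, or None (as in second_seen_date CTE)."""
    ordered = sorted(set(main_dates))
    return ordered[1] if len(ordered) > 1 else None


def keep_or_set(previous_flag, current_flag):
    """A reported_* flag only flips from False to True, never back."""
    if previous_flag is False and current_flag is True:
        return True
    return previous_flag


def daily_merge(previous, current, submission_date):
    """Model of the FULL JOIN between _previous and _current for a daily run."""
    if previous is None:
        # New client: take today's row; second_seen_date stays empty.
        return replace(current, second_seen_date=None)
    if current is None:
        # Known client with no pings today: nothing changes.
        return previous
    second = previous.second_seen_date
    if second is None and current.reported_main_ping:
        # Filled only once, and only when a main ping arrived today.
        second = submission_date
    return replace(
        previous,
        second_seen_date=second,
        reported_main_ping=keep_or_set(
            previous.reported_main_ping, current.reported_main_ping
        ),
        reported_new_profile_ping=keep_or_set(
            previous.reported_new_profile_ping, current.reported_new_profile_ping
        ),
        reported_shutdown_ping=keep_or_set(
            previous.reported_shutdown_ping, current.reported_shutdown_ping
        ),
    )


# Example: a client first seen through new_profile sends a main ping later.
prev = Row("a", date(2023, 9, 1), None, False, True, False)
cur = Row("a", date(2023, 9, 3), None, True, False, False)
merged = daily_merge(prev, cur, date(2023, 9, 3))
```

In this example `merged` keeps the original first_seen_date, gains a second_seen_date equal to the submission date, and has both the main and new_profile flags set.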


@@ -0,0 +1,186 @@
fields:
- description: Unique ID for the client installation.
name: client_id
type: STRING
mode: NULLABLE
- description: Sample ID to limit query results during an analysis.
name: sample_id
type: INTEGER
mode: NULLABLE
- description: Date when the server first received any of the 3 pings from this client.
name: first_seen_date
type: DATE
- description: Second date on which the server received a main ping from this client.
name: second_seen_date
type: DATE
mode: NULLABLE
- description: The application architecture reported by the client.
name: architecture
type: STRING
mode: NULLABLE
- description: The application build reported by the client.
name: app_build_id
type: STRING
mode: NULLABLE
- description: The name of the installed app/browser.
name: app_name
type: STRING
mode: NULLABLE
- description: The best locale that the application should be localized to.
name: locale
type: STRING
mode: NULLABLE
- description: The application platform version reported by the client.
name: platform_version
type: STRING
mode: NULLABLE
- description: The application vendor.
name: vendor
type: STRING
mode: NULLABLE
- description: The application version.
name: app_version
type: STRING
mode: NULLABLE
- description: 'A string tag identifying the binary ABI of the current processor and compiler vtable, as taken from the TARGET_XPCOM_ABI configure variable.
It may not be available on all platforms, especially unusual processor or compiler combinations. The result takes the form <processor>-<compilerABI>,
e.g. x86-msvc, ppc-gcc3. This value should almost always be used in combination with the OS.'
name: xpcom_abi
type: STRING
mode: NULLABLE
- description: The document ID specified in the URI when the client sent this message.
name: document_id
type: STRING
mode: NULLABLE
- description: The value of the `distribution.id` preference that identifies the Firefox distribution.
name: distribution_id
type: STRING
mode: NULLABLE
- description: The value selected for the `distribution.version` preference in the Partner Distribution Configuration File.
name: partner_distribution_version
type: STRING
mode: NULLABLE
- description: The value of the `app.distributor` preference in the Partner Distribution Configuration File.
name: partner_distributor
type: STRING
mode: NULLABLE
- description: The value of the `app.distributor.channel` preference in the Partner Distribution Configuration File.
name: partner_distributor_channel
type: STRING
mode: NULLABLE
- description: The value of the `mozilla.partner.id` preference in the Partner Distribution Configuration File.
name: partner_id
type: STRING
mode: NULLABLE
- description: Identifier of the particular campaign that led to the download of the product.
name: attribution_campaign
type: STRING
mode: NULLABLE
- description: Identifier to indicate the particular link within a campaign.
name: attribution_content
type: STRING
mode: NULLABLE
- description: Funnel experiment parameters.
name: attribution_experiment
type: STRING
mode: NULLABLE
- description: Category of the source, such as 'organic' for a search engine.
name: attribution_medium
type: STRING
mode: NULLABLE
- description: Referring partner domain, when install happens via a known partner.
name: attribution_source
type: STRING
mode: NULLABLE
- description: Client's user agent, which corresponds to the web browser used to download the Firefox installer.
name: attribution_ua
type: STRING
mode: NULLABLE
- description: The anonymized path of the engine xml file. For details on the components refer to the metadata for telemetry.new_profile.
name: engine_data_load_path
type: STRING
mode: NULLABLE
- description: The name of the default search engine.
name: engine_data_name
type: STRING
mode: NULLABLE
- description: 'The origin of the search engine. The value will be default for engines that are built-in or from distribution partners,
verified for user-installed engines with valid verification hashes, unverified for non-default engines without verification hash,
and invalid for engines with broken verification hashes.'
name: engine_data_origin
type: STRING
mode: NULLABLE
- description: The HTTP URL we would use to search. For privacy, we don't record this for user-installed engines.
name: engine_data_submission_url
type: STRING
mode: NULLABLE
- description: The model IDs for Apple desktop devices. Applies to Mac only.
name: apple_model_id
type: STRING
mode: NULLABLE
- description: City retrieved as a result of a geographic lookup based on the client's IP address.
name: city
type: STRING
- description: The specific geo database version used.
name: db_version
type: STRING
mode: NULLABLE
- description: First major country subdivision, typically a state, province, or county.
name: subdivision1
type: STRING
mode: NULLABLE
- description: The Firefox channel, set to Other for unrecognized channel names.
name: normalized_channel
type: STRING
mode: NULLABLE
- description: The ISO 3166-1 alpha-2 country code.
name: country
type: STRING
mode: NULLABLE
- description: The OS name, set to Other for unrecognized OS names.
name: normalized_os
type: STRING
mode: NULLABLE
- description: The OS version.
name: normalized_os_version
type: STRING
mode: NULLABLE
- description: 'How the profile was selected during startup. Possible reasons are:
unknown: Generally should not happen, set as a default in case no other reason occurred. profile-manager: The profile was selected by the profile manager.
profile-reset: The profile was selected for reset, normally this would mean a restart. restart: The user restarted the application,
the same profile as previous will be used. argument-profile: The profile was selected by the --profile command line argument.
argument-p: The profile was selected by the -p command line argument. firstrun-claimed-default: A first run of a dedicated profiles build chose the old
default profile to be the default for this install. firstrun-skipped-default: A first run of a dedicated profiles build skipped over the old default profile
and created a new profile. restart-claimed-default: A first run of a dedicated profiles build after a restart chose the old default.'
name: startup_profile_selection_reason
type: STRING
mode: NULLABLE
- description: Unique token created at Firefox download time.
name: attribution_dltoken
type: STRING
mode: NULLABLE
- description: Identifier that indicates where installations of Firefox originated.
name: attribution_dlsource
type: STRING
mode: NULLABLE
- description: Metadata about the pings reported by the client.
name: metadata
type: RECORD
mode: NULLABLE
fields:
- description: Ping that reported the first seen date (main, first_shutdown or new_profile).
name: first_seen_date_source_ping
type: STRING
mode: NULLABLE
- description: Indicates whether the client ever reported a main ping.
name: reported_main_ping
type: BOOLEAN
mode: NULLABLE
- description: Indicates whether the client ever reported a new profile ping.
name: reported_new_profile_ping
type: BOOLEAN
mode: NULLABLE
- description: Indicates whether the client ever reported a first shutdown ping.
name: reported_shutdown_ping
type: BOOLEAN
mode: NULLABLE


@ -0,0 +1,272 @@
[
{
"type": "STRING",
"name": "client_id"
},
{
"type": "INTEGER",
"name": "sample_id"
},
{
"type": "TIMESTAMP",
"name": "submission_timestamp",
"mode": "REQUIRED"
},
{
"type": "STRING",
"name": "normalized_channel"
},
{
"type": "STRING",
"name": "normalized_os"
},
{
"type": "STRING",
"name": "normalized_country_code"
},
{
"type": "STRING",
"name": "normalized_os_version"
},
{
"fields": [
{
"type": "STRING",
"name": "architecture"
},
{
"type": "STRING",
"name": "build_id"
},
{
"type": "STRING",
"name": "platform_version"
},
{
"type": "STRING",
"name": "vendor"
},
{
"type": "STRING",
"name": "xpcom_abi"
},
{
"type": "STRING",
"name": "version"
}
],
"name": "application",
"type": "RECORD"
},
{
"type": "STRING",
"name": "normalized_app_name"
},
{
"type": "STRING",
"name": "document_id"
},
{
"fields": [
{
"fields": [
{
"type": "STRING",
"name": "build_id"
}
],
"name": "build",
"type": "RECORD"
},
{
"fields": [
{
"type": "STRING",
"name": "key"
},
{
"fields": [
{
"type": "STRING",
"name": "branch"
},
{
"type": "STRING",
"name": "enrollment_id"
},
{
"type": "STRING",
"name": "type"
}
],
"name": "value",
"type": "RECORD"
}
],
"name": "experiments",
"type": "RECORD",
"mode": "REPEATED"
},
{
"fields": [
{
"type": "STRING",
"name": "distribution_id"
},
{
"type": "STRING",
"name": "distribution_version"
},
{
"type": "STRING",
"name": "distributor"
},
{
"type": "STRING",
"name": "distributor_channel"
},
{
"type": "STRING",
"name": "partner_id"
}
],
"name": "partner",
"type": "RECORD"
},
{
"fields": [
{
"type": "STRING",
"name": "locale"
},
{
"fields": [
{
"type": "STRING",
"name": "campaign"
},
{
"type": "STRING",
"name": "content"
},
{
"type": "STRING",
"name": "experiment"
},
{
"type": "STRING",
"name": "medium"
},
{
"type": "STRING",
"name": "source"
},
{
"type": "STRING",
"name": "dltoken"
},
{
"type": "STRING",
"name": "dlsource"
},
{
"type": "STRING",
"name": "ua"
}
],
"name": "attribution",
"type": "RECORD"
},
{
"fields": [
{
"type": "STRING",
"name": "load_path"
},
{
"type": "STRING",
"name": "name"
},
{
"type": "STRING",
"name": "origin"
},
{
"type": "STRING",
"name": "submission_url"
}
],
"name": "default_search_engine_data",
"type": "RECORD"
}
],
"name": "settings",
"type": "RECORD"
},
{
"fields": [
{
"type": "STRING",
"name": "apple_model_id"
}
],
"name": "system",
"type": "RECORD"
}
],
"name": "environment",
"type": "RECORD"
},
{
"fields": [
{
"fields": [
{
"type": "STRING",
"name": "city"
},
{
"type": "STRING",
"name": "db_version"
},
{
"type": "STRING",
"name": "subdivision1"
}
],
"type": "RECORD",
"name": "geo"
}
],
"type": "RECORD",
"name": "metadata"
},
{
"fields": [
{
"fields": [
{
"fields": [
{
"fields": [
{
"type": "STRING",
"name": "startup_profile_selection_reason"
}
],
"type": "RECORD",
"name": "scalars"
}
],
"type": "RECORD",
"name": "parent"
}
],
"type": "RECORD",
"name": "processes"
}
],
"type": "RECORD",
"name": "payload"
}
]
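
The nested ping schema above is easier to compare with the flat `clients_first_seen_v2` columns once it is reduced to dotted leaf paths. The following is a minimal sketch, not part of this change: `leaf_paths` and `SCHEMA_EXCERPT` are hypothetical names, and the excerpt copies only a few fields from the schema above.

```python
import json

# Hypothetical excerpt of the ping schema above (only a few fields copied).
SCHEMA_EXCERPT = json.loads("""
[
  {"type": "STRING", "name": "client_id"},
  {"type": "INTEGER", "name": "sample_id"},
  {
    "fields": [
      {
        "fields": [
          {"type": "STRING", "name": "city"},
          {"type": "STRING", "name": "db_version"}
        ],
        "type": "RECORD",
        "name": "geo"
      }
    ],
    "type": "RECORD",
    "name": "metadata"
  }
]
""")


def leaf_paths(fields, prefix=""):
    """Yield (dotted_path, type) for every non-RECORD field in a schema list."""
    for field in fields:
        path = prefix + field["name"]
        if field["type"] == "RECORD":
            # Recurse into nested records, extending the dotted prefix.
            yield from leaf_paths(field.get("fields", []), path + ".")
        else:
            yield path, field["type"]


paths = list(leaf_paths(SCHEMA_EXCERPT))
for path, field_type in paths:
    print(f"{path}: {field_type}")
```

Running the same helper over the full schema file would list paths such as `environment.settings.attribution.campaign`, which can then be matched by eye against the flattened column names in the destination table.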


@ -0,0 +1,272 @@
[
{
"type": "STRING",
"name": "client_id"
},
{
"type": "INTEGER",
"name": "sample_id"
},
{
"type": "TIMESTAMP",
"name": "submission_timestamp",
"mode": "REQUIRED"
},
{
"type": "STRING",
"name": "normalized_channel"
},
{
"type": "STRING",
"name": "normalized_os"
},
{
"type": "STRING",
"name": "normalized_country_code"
},
{
"type": "STRING",
"name": "normalized_os_version"
},
{
"fields": [
{
"type": "STRING",
"name": "architecture"
},
{
"type": "STRING",
"name": "build_id"
},
{
"type": "STRING",
"name": "platform_version"
},
{
"type": "STRING",
"name": "vendor"
},
{
"type": "STRING",
"name": "xpcom_abi"
},
{
"type": "STRING",
"name": "version"
}
],
"name": "application",
"type": "RECORD"
},
{
"type": "STRING",
"name": "normalized_app_name"
},
{
"type": "STRING",
"name": "document_id"
},
{
"fields": [
{
"fields": [
{
"type": "STRING",
"name": "build_id"
}
],
"name": "build",
"type": "RECORD"
},
{
"fields": [
{
"type": "STRING",
"name": "key"
},
{
"fields": [
{
"type": "STRING",
"name": "branch"
},
{
"type": "STRING",
"name": "enrollment_id"
},
{
"type": "STRING",
"name": "type"
}
],
"name": "value",
"type": "RECORD"
}
],
"name": "experiments",
"type": "RECORD",
"mode": "REPEATED"
},
{
"fields": [
{
"type": "STRING",
"name": "distribution_id"
},
{
"type": "STRING",
"name": "distribution_version"
},
{
"type": "STRING",
"name": "distributor"
},
{
"type": "STRING",
"name": "distributor_channel"
},
{
"type": "STRING",
"name": "partner_id"
}
],
"name": "partner",
"type": "RECORD"
},
{
"fields": [
{
"type": "STRING",
"name": "locale"
},
{
"fields": [
{
"type": "STRING",
"name": "campaign"
},
{
"type": "STRING",
"name": "content"
},
{
"type": "STRING",
"name": "experiment"
},
{
"type": "STRING",
"name": "medium"
},
{
"type": "STRING",
"name": "source"
},
{
"type": "STRING",
"name": "dltoken"
},
{
"type": "STRING",
"name": "dlsource"
},
{
"type": "STRING",
"name": "ua"
}
],
"name": "attribution",
"type": "RECORD"
},
{
"fields": [
{
"type": "STRING",
"name": "load_path"
},
{
"type": "STRING",
"name": "name"
},
{
"type": "STRING",
"name": "origin"
},
{
"type": "STRING",
"name": "submission_url"
}
],
"name": "default_search_engine_data",
"type": "RECORD"
}
],
"name": "settings",
"type": "RECORD"
},
{
"fields": [
{
"type": "STRING",
"name": "apple_model_id"
}
],
"name": "system",
"type": "RECORD"
}
],
"name": "environment",
"type": "RECORD"
},
{
"fields": [
{
"fields": [
{
"type": "STRING",
"name": "city"
},
{
"type": "STRING",
"name": "db_version"
},
{
"type": "STRING",
"name": "subdivision1"
}
],
"type": "RECORD",
"name": "geo"
}
],
"type": "RECORD",
"name": "metadata"
},
{
"fields": [
{
"fields": [
{
"fields": [
{
"fields": [
{
"type": "STRING",
"name": "startup_profile_selection_reason"
}
],
"type": "RECORD",
"name": "scalars"
}
],
"type": "RECORD",
"name": "parent"
}
],
"type": "RECORD",
"name": "processes"
}
],
"type": "RECORD",
"name": "payload"
}
]


@ -0,0 +1,116 @@
[
{
"type": "STRING",
"name": "client_id"
},
{
"type": "INTEGER",
"name": "sample_id"
},
{
"type": "DATE",
"name": "submission_date"
},
{
"type": "TIMESTAMP",
"name": "submission_timestamp_min"
},
{
"type": "STRING",
"name": "env_build_id"
},
{
"type": "STRING",
"name": "normalized_channel"
},
{
"type": "STRING",
"name": "app_display_version"
},
{
"type": "STRING",
"name": "app_name"
},
{
"type": "STRING",
"name": "locale"
},
{
"type": "STRING",
"name": "vendor"
},
{
"type": "STRING",
"name": "app_version"
},
{
"type": "STRING",
"name": "distribution_id"
},
{
"fields": [
{
"type": "STRING",
"name": "campaign"
},
{
"type": "STRING",
"name": "content"
},
{
"type": "STRING",
"name": "experiment"
},
{
"type": "STRING",
"name": "medium"
},
{
"type": "STRING",
"name": "source"
},
{
"type": "STRING",
"name": "dltoken"
}
],
"name": "attribution",
"type": "RECORD"
},
{
"type": "STRING",
"name": "default_search_engine_data_load_path"
},
{
"type": "STRING",
"name": "default_search_engine_data_name"
},
{
"type": "STRING",
"name": "default_search_engine_data_origin"
},
{
"type": "STRING",
"name": "default_search_engine_data_submission_url"
},
{
"type": "STRING",
"name": "city"
},
{
"type": "STRING",
"name": "geo_subdivision1"
},
{
"type": "STRING",
"name": "country"
},
{
"type": "STRING",
"name": "os"
},
{
"type": "STRING",
"name": "normalized_os_version"
}
]


@ -0,0 +1,180 @@
[
{
"type": "STRING",
"name": "client_id"
},
{
"type": "INTEGER",
"name": "sample_id"
},
{
"type": "DATE",
"name": "first_seen_date"
},
{
"type": "DATE",
"name": "second_seen_date"
},
{
"type": "STRING",
"name": "architecture"
},
{
"type": "STRING",
"name": "app_build_id"
},
{
"type": "STRING",
"name": "app_name"
},
{
"type": "STRING",
"name": "locale"
},
{
"type": "STRING",
"name": "platform_version"
},
{
"type": "STRING",
"name": "vendor"
},
{
"type": "STRING",
"name": "app_version"
},
{
"type": "STRING",
"name": "xpcom_abi"
},
{
"type": "STRING",
"name": "document_id"
},
{
"type": "STRING",
"name": "distribution_id"
},
{
"type": "STRING",
"name": "partner_distribution_version"
},
{
"type": "STRING",
"name": "partner_distributor"
},
{
"type": "STRING",
"name": "partner_distributor_channel"
},
{
"type": "STRING",
"name": "partner_id"
},
{
"type": "STRING",
"name": "attribution_campaign"
},
{
"type": "STRING",
"name": "attribution_content"
},
{
"type": "STRING",
"name": "attribution_experiment"
},
{
"type": "STRING",
"name": "attribution_medium"
},
{
"type": "STRING",
"name": "attribution_source"
},
{
"type": "STRING",
"name": "attribution_ua"
},
{
"type": "STRING",
"name": "engine_data_load_path"
},
{
"type": "STRING",
"name": "engine_data_name"
},
{
"type": "STRING",
"name": "engine_data_origin"
},
{
"type": "STRING",
"name": "engine_data_submission_url"
},
{
"type": "STRING",
"name": "apple_model_id"
},
{
"type": "STRING",
"name": "city"
},
{
"type": "STRING",
"name": "db_version"
},
{
"type": "STRING",
"name": "subdivision1"
},
{
"type": "STRING",
"name": "normalized_channel"
},
{
"type": "STRING",
"name": "country"
},
{
"type": "STRING",
"name": "normalized_os"
},
{
"type": "STRING",
"name": "normalized_os_version"
},
{
"type": "STRING",
"name": "startup_profile_selection_reason"
},
{
"type": "STRING",
"name": "attribution_dltoken"
},
{
"type": "STRING",
"name": "attribution_dlsource"
},
{
"fields": [
{
"type": "STRING",
"name": "first_seen_date_source_ping"
},
{
"type": "BOOLEAN",
"name": "reported_main_ping"
},
{
"type": "BOOLEAN",
"name": "reported_new_profile_ping"
},
{
"type": "BOOLEAN",
"name": "reported_shutdown_ping"
}
],
"name": "metadata",
"type": "RECORD"
}
]


@ -0,0 +1,56 @@
# client-1 has a first_seen_date from 2022-06-30 reported by the new_profile ping.
# On 2022-07-01 it reports a main ping which results in updating second_seen_date & reported_main_ping.
# Other attributes e.g. locale are not updated.
# client-2 reports two new_profile pings and one main ping on 2022-07-01.
# The first_seen_date and other attributes are retrieved from the first new_profile ping,
# therefore the attribution campaign is NULL, as reported by that ping;
# the campaigns reported by the main ping and the second new_profile ping are ignored.
# client-5 reports new_profile and first_shutdown pings on 2022-06-25.
# On 2022-07-01 there is another new_profile ping but the second_seen_date remains NULL because it should only
# be updated when there is a main ping reported.
---
- client_id: client-1
first_seen_date: 2022-06-30
second_seen_date: 2022-07-01
sample_id: 10
normalized_channel: release
metadata:
first_seen_date_source_ping: new_profile
reported_main_ping: true
reported_new_profile_ping: true
reported_shutdown_ping: false
- client_id: client-2
first_seen_date: 2022-07-01
sample_id: 20
attribution_source: source2
metadata:
first_seen_date_source_ping: new_profile
reported_main_ping: true
reported_new_profile_ping: true
reported_shutdown_ping: false
- client_id: client-3
first_seen_date: 2022-06-30
second_seen_date: 2022-07-01
sample_id: 30
normalized_channel: beta
metadata:
first_seen_date_source_ping: new_profile
reported_new_profile_ping: true
reported_shutdown_ping: true
- client_id: client-4
sample_id: 40
first_seen_date: 2022-06-29
second_seen_date: 2022-06-30
metadata:
first_seen_date_source_ping: new_profile
reported_new_profile_ping: true
reported_main_ping: true
reported_shutdown_ping: true
- client_id: client-5
sample_id: 50
first_seen_date: 2022-06-25
metadata:
first_seen_date_source_ping: new_profile
reported_new_profile_ping: true
reported_main_ping: false
reported_shutdown_ping: true
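
The update rule described in the comments of this expected-output file can be sketched in Python. This is only an illustration of the documented behaviour, not the query itself: `ClientRow` and `apply_daily_pings` are hypothetical names, and the condition that `second_seen_date` is set only on a day after `first_seen_date` is an assumption inferred from client-2, whose main ping arrives on its first day.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional, Set


@dataclass(frozen=True)
class ClientRow:
    """Hypothetical, reduced view of one clients_first_seen_v2 row."""
    client_id: str
    first_seen_date: date
    second_seen_date: Optional[date] = None
    reported_main_ping: bool = False
    reported_new_profile_ping: bool = False
    reported_shutdown_ping: bool = False


def apply_daily_pings(row: ClientRow, submission_date: date, pings: Set[str]) -> ClientRow:
    """Return the row after seeing the given ping types on submission_date."""
    saw_main = "main" in pings
    second_seen = row.second_seen_date
    # Assumption: only a main ping on a later day fills second_seen_date,
    # and an existing value is never overwritten.
    if second_seen is None and saw_main and submission_date > row.first_seen_date:
        second_seen = submission_date
    # The reported_* flags only ever flip from false to true.
    return replace(
        row,
        second_seen_date=second_seen,
        reported_main_ping=row.reported_main_ping or saw_main,
        reported_new_profile_ping=row.reported_new_profile_ping or "new_profile" in pings,
        reported_shutdown_ping=row.reported_shutdown_ping or "first_shutdown" in pings,
    )


# client-1: existing row from 2022-06-30, main ping on 2022-07-01.
client_1 = apply_daily_pings(
    ClientRow("client-1", date(2022, 6, 30), reported_new_profile_ping=True),
    date(2022, 7, 1),
    {"main"},
)

# client-5: existing row from 2022-06-25, only a new_profile ping on 2022-07-01.
client_5 = apply_daily_pings(
    ClientRow("client-5", date(2022, 6, 25),
              reported_new_profile_ping=True, reported_shutdown_ping=True),
    date(2022, 7, 1),
    {"new_profile"},
)
```

In this sketch client-1 gains a `second_seen_date` of 2022-07-01 and `reported_main_ping` becomes true, while client-5 keeps a NULL `second_seen_date`, matching the rows expected above.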


@ -0,0 +1,7 @@
---
- client_id: client-3
sample_id: 30
submission_timestamp: '2022-07-01 01:00:00'
- client_id: client-4
sample_id: 40
submission_timestamp: '2022-07-01 04:00:00'


@ -0,0 +1,21 @@
---
- client_id: client-2
sample_id: 20
submission_timestamp: '2022-07-01 00:02:00'
environment:
settings:
attribution:
source: source2
- client_id: client-2
sample_id: 20
submission_timestamp: '2022-07-01 01:02:00'
environment:
settings:
attribution:
campaign: campaign2
- client_id: client-5
sample_id: 50
submission_timestamp: '2022-07-01 01:00:00'
environment:
build:
build_id: build5


@ -0,0 +1,17 @@
---
- client_id: client-1
sample_id: 10
normalized_channel: beta
locale: abc1
submission_date: '2022-07-01'
submission_timestamp_min: '2022-07-01 00:01:00'
- client_id: client-2
sample_id: 20
submission_date: '2022-07-01'
submission_timestamp_min: '2022-07-01 00:03:00'
attribution:
campaign: campaign2
- client_id: client-3
sample_id: 30
submission_date: '2022-07-01'
submission_timestamp_min: '2022-07-01 13:00:00'
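
For client-2, the test inputs contain two new_profile pings and one main ping on 2022-07-01, and the expected row takes its attributes from the earliest of them. The sketch below illustrates that selection under the assumption that the earliest timestamp across all ping types decides; the actual query is not shown in this part of the diff, and `Ping` and `earliest_ping` are hypothetical names.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class Ping(NamedTuple):
    """Hypothetical, reduced view of one input ping."""
    ping_type: str
    submission_timestamp: datetime
    attribution_source: Optional[str] = None
    attribution_campaign: Optional[str] = None


def earliest_ping(pings: List[Ping]) -> Ping:
    """Pick the ping with the smallest submission timestamp."""
    return min(pings, key=lambda p: p.submission_timestamp)


# The three client-2 inputs from the fixtures on this page.
client_2_pings = [
    Ping("new_profile", datetime(2022, 7, 1, 0, 2), attribution_source="source2"),
    Ping("new_profile", datetime(2022, 7, 1, 1, 2), attribution_campaign="campaign2"),
    Ping("main", datetime(2022, 7, 1, 0, 3), attribution_campaign="campaign2"),
]

first = earliest_ping(client_2_pings)
```

Here `first` is the 00:02 new_profile ping, so the source is `source2` and the campaign stays NULL, which is what the expected output for client-2 records.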


@ -0,0 +1,35 @@
---
- client_id: client-1
sample_id: 10
first_seen_date: 2022-06-30
normalized_channel: release
metadata:
first_seen_date_source_ping: new_profile
reported_main_ping: false
reported_new_profile_ping: true
reported_shutdown_ping: false
- client_id: client-3
sample_id: 30
first_seen_date: 2022-06-30
normalized_channel: beta
metadata:
first_seen_date_source_ping: new_profile
reported_new_profile_ping: true
reported_shutdown_ping: false
- client_id: client-4
sample_id: 40
first_seen_date: 2022-06-29
second_seen_date: 2022-06-30
metadata:
first_seen_date_source_ping: new_profile
reported_new_profile_ping: true
reported_main_ping: true
reported_shutdown_ping: false
- client_id: client-5
sample_id: 50
first_seen_date: 2022-06-25
metadata:
first_seen_date_source_ping: new_profile
reported_new_profile_ping: true
reported_main_ping: false
reported_shutdown_ping: true


@ -0,0 +1,4 @@
---
- name: submission_date
type: DATE
value: 2022-07-01


@ -0,0 +1,31 @@
---
- client_id: client-1
first_seen_date: 2023-06-01
city: city1
sample_id: 10
metadata:
first_seen_date_source_ping: main
reported_main_ping: true
reported_new_profile_ping: false
reported_shutdown_ping: false
- client_id: client-3
first_seen_date: 2023-06-01
attribution_dltoken: token3
sample_id: 30
metadata:
first_seen_date_source_ping: new_profile
reported_main_ping: false
reported_new_profile_ping: true
reported_shutdown_ping: true
- client_id: client-4
sample_id: 40
first_seen_date: 2023-05-14
second_seen_date: 2023-06-01
normalized_channel: nightly
attribution_campaign: campaign4
attribution_dltoken: token4
metadata:
first_seen_date_source_ping: shutdown
reported_main_ping: true
reported_new_profile_ping: true
reported_shutdown_ping: true


@ -0,0 +1,4 @@
---
- client_id: client-3
sample_id: 30
submission_timestamp: '2023-06-01 11:03:00'


@ -0,0 +1,8 @@
---
- client_id: client-3
sample_id: 30
submission_timestamp: '2023-06-01 10:03:00'
environment:
settings:
attribution:
dltoken: token3


@ -0,0 +1,18 @@
---
- client_id: client-1
sample_id: 10
submission_date: '2023-06-01'
submission_timestamp_min: '2023-06-01 00:03:00'
city: city1
- client_id: client-2
sample_id: 20
submission_date: '2023-05-28'
submission_timestamp_min: '2023-05-28 00:03:00'
attribution:
campaign: campaign2
- client_id: client-4
sample_id: 40
submission_date: '2023-06-01'
submission_timestamp_min: '2023-06-01 00:04:00'
attribution:
campaign: campaign44


@ -0,0 +1,12 @@
---
- client_id: client-4
sample_id: 40
first_seen_date: 2023-05-14
normalized_channel: nightly
attribution_campaign: campaign4
attribution_dltoken: token4
metadata:
first_seen_date_source_ping: shutdown
reported_main_ping: true
reported_new_profile_ping: true
reported_shutdown_ping: true


@ -0,0 +1,4 @@
---
- name: submission_date
type: DATE
value: 2023-06-01