Commit graph

2463 commits

Author SHA1 Message Date
Anna Scholtz 52b5495592 Monitoring views for iOS experiments 2021-04-21 09:00:40 -07:00
Jeff Klukas b45770d3af
Update owners (#1969)
Removes some folks who are no longer active on the assigned projects.

Also see https://github.com/mozilla/telemetry-airflow/pull/1288
2021-04-21 09:39:39 -04:00
Anna Scholtz 4b8bbc1f35 Internet outages query from telemetry_stable.main_v4 2021-04-21 06:22:59 -07:00
Daniel Thorn 47d6047e75
Fix incorrect artifactId in pom.xml (#1972) 2021-04-20 18:01:47 -07:00
Anna Scholtz 8109d147e0 Remove time_partitioning_type from bqetl backfill 2021-04-20 10:46:30 -07:00
Jeff Klukas 4bde66d936
Fix task name for core_clients_first_seen (#1966)
Bugfix on top of #1962
2021-04-20 09:59:25 -07:00
Anna Scholtz cae0e6d7f7 Remove bqetl_deviations 2021-04-20 08:45:23 -07:00
Anna Scholtz 1ae324d576 Remove allow_field_addition_on_date from clients_daily 2021-04-20 08:34:39 -07:00
Jeff Klukas 6b2dbec0c0
Add first_seen_date to core_clients_daily and last seen (#1962)
* Add first_seen_date to core_clients_daily and last seen

Supports KPI work for iOS and Focus apps.
See https://docs.google.com/document/d/1-sifTuu3lWd5umvaUmncFrdBIK6eKVTPzmGDLv6GDak/edit?ts=6078667e#

* Update tests

* Add new_profiles to mobile_usage

* Make sure is_new_profile reflects only current day

* Remove is_new_profile from core_clients_last_seen

This field could be confusing.

If we do `COUNTIF(is_new_profile)`,
we'll overcount since a client that appears on a single day will continue
to appear in clients_last_seen with is_new_profile=True carried over from the
original day of observation.

* Remove is_new_profile from core_clients_last_seen query

* bugfixes

* DAG change
2021-04-20 09:49:01 -04:00
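
The overcounting concern raised in commit 6b2dbec0c0 can be sketched with a small simulation. This is an illustration only, not the repository's query: the 28-day window, the row layout, and the helper names `last_seen_rows` and `countif_new` are assumptions made for the example. It compares a flag copied forward from the first day of observation with a flag derived by comparing each row's date to `first_seen_date`.

```python
from datetime import date, timedelta

WINDOW_DAYS = 28  # assumed look-back window, chosen only for illustration


def last_seen_rows(first_seen, carry_flag):
    """Rows a last_seen-style table might hold for a client active on one day only.

    carry_flag=True mimics copying is_new_profile forward from the original day;
    carry_flag=False derives it by comparing submission_date to first_seen_date.
    """
    rows = []
    for offset in range(WINDOW_DAYS):
        submission_date = first_seen + timedelta(days=offset)
        if carry_flag:
            is_new_profile = True  # stale value carried over from day one
        else:
            is_new_profile = submission_date == first_seen
        rows.append(
            {
                "submission_date": submission_date,
                "first_seen_date": first_seen,
                "is_new_profile": is_new_profile,
            }
        )
    return rows


def countif_new(rows):
    """Rough Python equivalent of COUNTIF(is_new_profile) over the given rows."""
    return sum(1 for row in rows if row["is_new_profile"])


first_seen = date(2021, 4, 1)
carried = countif_new(last_seen_rows(first_seen, carry_flag=True))
derived = countif_new(last_seen_rows(first_seen, carry_flag=False))
print(carried, derived)  # 28 1
```

Under these assumptions a single new client is counted once per day of the window when the flag is carried over, and exactly once when the flag reflects only the current day.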
Anna Scholtz 67ebf2376c Remove allow_field_addition_on_date in clients_first_seen_v1 metadata 2021-04-19 21:25:16 -07:00
Anna Scholtz 54864c33c3 Add is_taskbar_pinned and launch_method to clients_daily 2021-04-19 10:22:24 -07:00
Jason Thomas de157cc03b
Add ACL metadata for taar access to clients_last_seen table (#1961) 2021-04-19 07:30:18 -07:00
Anna Scholtz b766682213 Add metadata for public release dataset views 2021-04-15 09:50:01 -07:00
Anna Scholtz e759016754 Schedule release data import 2021-04-15 09:50:01 -07:00
Anna Scholtz 7da6a63fd3 Dataset for Fenix releases 2021-04-15 09:50:01 -07:00
Anna Scholtz f3d7280d35 Dataset for Firefox releases 2021-04-15 09:50:01 -07:00
Daniel Thorn d416d7be3a
Expose an `interval` field on iap subscriptions (#1956)
also expose subscription id and device id for vpn tables, and product id for stripe subscriptions
2021-04-14 15:16:12 -07:00
Jeff Klukas fa4119fa62
Update contributes_to_2021_kpi values (#1957)
* Update contributes_to_2021_kpi values

We are tracking a more limited set of mobile apps for 2021:

- Firefox for Android
- Firefox for iOS
- Firefox Focus for Android
- Firefox Focus for iOS

* Mark Lockwise for iOS as false
2021-04-14 17:10:44 -04:00
Daniel Thorn 664eaad83b
Fix typo in stripe invoice schema (#1959) 2021-04-14 14:02:01 -07:00
Anna Scholtz 6173b8746d Public data export temporary stage directory 2021-04-13 11:38:12 -07:00
whd 11ecde27b2
Add workgroups metadata for activity_stream_bi (#1628)
* Add workgroups metadata for activity_stream_bi

* Update workgroup_access for activity_stream_bi
2021-04-12 23:30:07 +00:00
Anna Scholtz 36154a9f98 Ignore materialized view dry runs 2021-04-12 14:10:37 -07:00
Anthony Miyaguchi f58f0bfd3b
Revert "Add migration script for joining against first seen table (#1947)" (#1950)
This reverts commit e4dfedd285.
2021-04-12 15:47:16 -04:00
Anthony Miyaguchi e4dfedd285
Add migration script for joining against first seen table (#1947)
* Add migration script for joining against first seen table

* Update logic for is_new_profile

* Update templates to use DDL with partitioning/clustering

* Fix output of migrate tables to backfill-8

* Add instructions for backfilling

* Fix linting errors
2021-04-12 12:41:52 -07:00
Anthony Miyaguchi 871270f2c4
[DS-1424] Join baseline clients daily with first seen table (#1946)
* Add first_seen_date and related test fixtures

* Use is_new_profile instead of baseline_first_seen

* Update view for baseline_clients_first_seen

* Fix yamllint issues

* Set is_new_profile when submission matches first seen

* Include AS in table alias

* Nit: capitalize AS

* Update bigquery_etl/glean_usage/templates/baseline_clients_daily_v1.sql

Co-authored-by: Jeff Klukas <jklukas@mozilla.com>

* Update bigquery_etl/glean_usage/templates/baseline_clients_daily_v1.sql

Co-authored-by: Jeff Klukas <jklukas@mozilla.com>

* Update clustering specification

Co-authored-by: Jeff Klukas <jklukas@mozilla.com>
2021-04-12 12:29:57 -07:00
whd e84ec0c75b
Add ACL metadata for poucave access to live events view (#1945)
* Add ACL metadata for poucave access to live events view

* Remove table-level mozilla-confidential events live access for now

Semantically this makes sense, but while dataset-level access is already provided
for this workgroup and automation is in place, this may become stale.
2021-04-08 21:18:25 +00:00
Anna Scholtz 29aa8d4204 SKIP dryrun for telemetry_derived materialized views 2021-04-08 12:18:18 -07:00
Jeff Klukas d00cdc507d
Use fully-qualified targets in glean_usage view creation (#1943)
Follow-up to https://github.com/mozilla/bigquery-etl/pull/1941
2021-04-08 13:53:52 -04:00
Anthony Miyaguchi 459f64576c
Add baseline clients daily test (#1941)
* Update table_name_from_baseline to strip project

* Remove project ids from query to facilitate testing

* Rewrite require_partition_filter in tests

* Add basic tests for baseline clients daily
2021-04-08 08:39:28 -04:00
Jeff Klukas 39071e8229
Bug 1703362 Purge security.unexpectedload events (#1942)
* Bug 1703362 Purge security.unexpectedload events

See https://bugzilla.mozilla.org/show_bug.cgi?id=1703362

The fix that reduces the quantity of events is scheduled to be included
in Firefox 88, released in late April. From 88 on, the rate of events
is expected to be reasonable, and the filter here will allow them again.

I will run a backfill for this table, starting from 2020-12-15.

* Refactor with CTE
2021-04-07 11:54:15 -04:00
Anna Scholtz 0f898fc501 Fix not enough values to unpack for referenced tables 2021-04-06 15:09:14 -07:00
whd 7c1b03934b
Default branch (#1939)
* Rename default branch

* Rename branch

* Update circleci for default branch name
2021-04-06 21:15:21 +00:00
Anthony Miyaguchi 9bca48821a
Fix clients first seen matching org_mozilla_firefoxreality (#1938) 2021-04-06 15:13:48 -04:00
Anthony Miyaguchi 0aacbe5c22
Pull out main logic and generate example queries for glean usage (#1937)
* Move argument parser into shared function

* Move shared main entrypoint into common

* Update example script to include other usage queries

* Commit generated queries for example usage queries

* Parallelize generation of example queries

* Add docstring

* Remove ios example queries for daily and last seen

* Fix pydocstyle linting

* Add update_example_glean_usage to CI
2021-04-06 11:38:30 -07:00
Anthony Miyaguchi 1503a7fa89
[DS-1424] Implementation of mobile clients first seen (#1934)
* Add initial boilerplate for clients_first_seen

* Remove submission_timestamp as a field

* [wip] Join data against legacy fennec id if applicable

* Remove user facing view

* Revert "Remove user facing view"

This reverts commit a728a7882170eadad5413c7a7046c0f38297bb87.

* Add flag for fennec_id

* Update logic to limit rows in partitions to submission_date

* Add all sql in glean_usage to format ignores

* Separate init and query

* Add default encoders for testing sql

* Add test for initialization of baseline clients first seen in fenix

* Update query to update over previous history

* Add test for aggregation

* Add generated sql and tests for simple baseline clients first seen

* Add dry-run exceptions for clients first seen tables

* Add clients first seen to generated sql

* Update bigquery_etl/glean_usage/templates/baseline_clients_first_seen.metadata.yaml

Co-authored-by: Jeff Klukas <jklukas@mozilla.com>

* Update bigquery_etl/glean_usage/templates/baseline_clients_first_seen.metadata.yaml

Co-authored-by: Jeff Klukas <jklukas@mozilla.com>

* Group by sample id instead of min

* Add submission_date as baseline first seen date

Co-authored-by: Jeff Klukas <jklukas@mozilla.com>
2021-04-05 11:36:39 -07:00
William Lachance 420a1bfdba
Documentation preview on Netlify (#1933)
This necessitated adding support for python 3.7.
2021-03-31 18:30:15 -04:00
Ben Wu bfc4980f5d
Bug 1698578 - Parse fenix tagged ad click keys (#1906) 2021-03-31 16:05:35 -04:00
dependabot[bot] 12242c1447
Bump zetasql.version from 2021.02.1 to 2021.03.2 (#1931) 2021-03-31 19:00:16 +00:00
Anna Scholtz 576af2e2a3 Add mvn setup for circleci doc generation 2021-03-31 11:33:12 -07:00
Anna Scholtz 0fb24f1c1f Use zetasql for doc generation 2021-03-31 09:04:23 -07:00
Anna Scholtz 9fbbac8ee9 Fix referenced_tables error in generate_docs 2021-03-30 11:02:38 -07:00
William Lachance 9f21097a49
Remove .vscode settings and add to gitignore (#1929)
.vscode's settings.json can get cluttered with installation-specific
settings, which can lead to confusing pull requests. Instead, let's use
the approach outlined here:

https://stackoverflow.com/a/48387809

This provides a set of defaults that people can use, which we can
extend over time.
2021-03-30 12:25:05 -04:00
William Lachance ba797386f9
Exit gracefully after `bqetl bootstrap` succeeds (#1928)
It would previously try to run bqetl with no arguments, which led
to confusing output.
2021-03-30 11:57:59 -04:00
Jeff Klukas ac8f40f8d1
Bug 1701712 Add events_1pct (#1926)
This is a concrete step to allow querying events data over longer
time ranges. See https://bugzilla.mozilla.org/show_bug.cgi?id=1701712

Currently, Redash will reject longitudinal queries over `events` as being
too expensive, but this cannot take into account the benefits of the clustering
applied on the underlying tables. This 1% table sidesteps the issue for cases
where a 1% sample is sufficient.

We can consider expanding the time range beyond six months if we see clear use
cases for it.
2021-03-30 10:01:56 -04:00
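
As a rough illustration of the 1% sampling idea behind commit ac8f40f8d1: the commit message does not say how `events_1pct` selects its rows, so the CRC32-based bucketing, the `sample_bucket` and `one_percent_sample` helpers, and the synthetic events below are all assumptions. The sketch assigns each client to one of 100 buckets deterministically and keeps a single bucket, so a client is either always in the sample or never in it.

```python
import zlib

NUM_BUCKETS = 100  # one bucket out of 100 gives roughly a 1% sample


def sample_bucket(client_id):
    """Map a client id to a stable bucket in [0, 100) (hypothetical scheme)."""
    return zlib.crc32(client_id.encode("utf-8")) % NUM_BUCKETS


def one_percent_sample(events, bucket=0):
    """Keep only events whose client falls in the chosen bucket."""
    return [e for e in events if sample_bucket(e["client_id"]) == bucket]


# Synthetic events, one per made-up client id.
events = [{"client_id": f"client-{i}", "event": "click"} for i in range(10_000)]

sample = one_percent_sample(events)

# Scale the sampled count back up to estimate the full-population count.
estimated_total = len(sample) * NUM_BUCKETS
print(len(sample), estimated_total)
```

Because bucket membership depends only on the client id, longitudinal queries over such a sample follow the same clients across the whole time range, and counts can be scaled by 100 to approximate the full table.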
Anthony Miyaguchi 0d1b0c0321
Add DAG for iOS Firefox ETL (#1905)
* Add DAG for iOS Firefox ETL

* Format dags.yaml file

* Rename dag to betl_firefox_ios

* Update schedule_interval to align with other queries
2021-03-29 16:23:50 -07:00
Jeff Klukas 0db0caf29a
Revert "Skip dry run for looker forecast cache (#1922)" (#1923)
This reverts commit 9cbe5b90a9.
2021-03-29 15:16:56 -04:00
Jeff Klukas 9cbe5b90a9
Skip dry run for looker forecast cache (#1922)
We're getting Permission Denied, causing CI on the default branch
to show failure.
2021-03-29 11:10:55 -04:00
Jeff Klukas 16281865cb
DAG scheduling and docs cleanup (#1919)
* DAG scheduling and docs cleanup

This adjusts some DAG schedules in an attempt to minimize the number of
BQSensor rescheduling emails we receive under normal circumstances.

Of note, the `copy_deduplicate` DAG is often taking a little longer than an
hour to complete, meaning that DAGs starting at 02:00 are likely to hit
reschedules. We do not address that problem here in hopes that performance
improvements to copy_deduplicate can bring performance back under 1 hour.

We also make some documentation fixups, including consistently using `|`
rather than `>` or `>-` as our multi-line string indicator. This preserves
whitespace that is relevant for markdown processing.

* Update stripe schedule
2021-03-29 09:15:24 -04:00
Daniel Thorn 8eef98802e
Fix execution_delta for stripe_external queries (#1921) 2021-03-29 09:09:36 -04:00
Anna Scholtz bcfadec33b Link bqetl docs in README 2021-03-25 14:16:06 -07:00