Commit graph

1530 commits

Author SHA1 Message Date
Frank Bertsch 722e6ece6e
Init statement & sample id addition (#1304)
* Include sample_id in events_daily

* Add init query for events_daily

* Add sample_id to init
2020-09-11 11:53:21 -04:00
Jeff Klukas a9f00f3b15 Reformat 2020-09-11 11:31:34 -04:00
Jeff Klukas 30c70346bc Remove unused string_to_arr function 2020-09-11 11:31:34 -04:00
Jeff Klukas 9295a5d30f Add DETERMINISTIC modifier to JS functions
This recently released feature allows query results using JS UDFs to be cached.

See https://issuetracker.google.com/issues/138310623
2020-09-11 11:31:34 -04:00
Frank Bertsch f7090245d2
Drop order by clause in favor of clustering (#1303) 2020-09-11 07:10:56 -04:00
Frank Bertsch 079a409672
Use submission_date for join criteria (#1302) 2020-09-10 20:19:31 -04:00
Frank Bertsch 818f680052
Create DAG for events rollup (#1301)
* Create DAG for events rollup

* Update sql/org_mozilla_firefox_derived/events_daily_v1/metadata.yaml

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2020-09-10 18:50:45 -04:00
Frank Bertsch 5a17168e63
Create events daily rollup (#1296)
* Create events daily rollup

This takes the most recent days' data, rolls up the events,
and encodes them as unicode.

Add tests for android events daily

* Remove unnecessary file

* Use map.mode_last

* Reformat sql

* Fix experiment aggregation

* Address review feedback

* Fix submission_date

* Give proper alias to events table
2020-09-10 17:48:31 -04:00
Frank Bertsch 8ed92c9838
Give events ping a proper alias (#1300) 2020-09-10 17:35:00 -04:00
Frank Bertsch ddbab093ab
Fix event timestamps (#1299)
* Use correct timestamp representation

* Fix event timestamps for query
2020-09-10 17:14:41 -04:00
Frank Bertsch 156fd98d3d
Revenue LTV updates (#1291)
* Revenue LTV updates

* Remove explicit date references
2020-09-10 15:54:30 -04:00
Frank Bertsch aca88a3d45
Query for updates to event_types_v1 (#1295)
* Query for update event_types_v1

This query takes yesterday's events, yesterday's event_types,
and adds the new events, event_properties, and property values.

It writes it out to a new partition. This is not strictly
necessary but will aid debugging and redoes.

* Format SQL
2020-09-10 12:53:04 -04:00
Frank Bertsch e71840cec6
Android event types init (#1289)
* Init SQL for event_types_v1

* Fix comparison of differently-sized lists

* Add support for tests of init stmts

* Include metadata for event_types_v1

* Add tests for event_types init

* Reformat SQL

* Run black

* Skip invalid unicode sections in event_code_points_to_string

* Allow for init-only queries

* Partition events_daily_v1 by submission-date

This is not strictly required, but will aid in
debugging and reruns.

* Add assertion for not null

* Lint

* Alias events ping name

* Ignore time_ms event property
2020-09-10 12:12:56 -04:00
Anna Scholtz 7566b7c3f0 Add CLI UDF tests 2020-09-09 14:10:44 -07:00
Anna Scholtz f04d7bf507 Add UDF CLI command for publishing UDFs 2020-09-09 14:10:44 -07:00
Anna Scholtz e87bb3f4d4 Add CLI command for validating UDFs 2020-09-09 14:10:44 -07:00
Anna Scholtz cb8675fb92 CLI command for listing UDF information 2020-09-09 14:10:44 -07:00
Anna Scholtz 4e9e1619c6 Add CLI command for creating UDFs 2020-09-09 14:10:44 -07:00
Anna Scholtz 278a7b790b UDF CLI setup 2020-09-09 14:10:44 -07:00
Frank Bertsch 51a7631e09
Bug 1649871 - Include rolled up event types (#1294)
* Bug 1649871 - Include rolled up event types

* Reformat sql
2020-09-09 11:43:57 -04:00
Frank Bertsch de97647d11
Don't include most recent month in revenue calc (#1219) 2020-09-09 11:35:29 -04:00
Frank Bertsch 55f3ab5cb5
Use submission_date for partition pruning (#1257) 2020-09-09 10:59:38 -04:00
Ben Miroglio 23491d5054
Add ltv view without revenue fields (#1203)
* Add ltv view without revenue fields

* Change view to query

* Add submission_date filter

* Ignore normalized ltv export in dryrun

Co-authored-by: Frank Bertsch <fbertsch@mozilla.com>
2020-09-09 10:50:47 -04:00
Jeff Klukas 25363c44fc Remove mozfun reference in mozfun UDF definition
Airflow's publishing of mozfun definitions failed last night due to
https://github.com/mozilla/bigquery-etl/pull/1287

Long-term we should enforce that mozfun function definitions don't directly
reference mozfun.
2020-09-09 09:32:51 -04:00
Jeff Klukas fe0afb6ae0
Tolerate 10-digit Fenix app_build values when choosing channel (#1270)
See https://github.com/mozilla/bigquery-etl/pull/1250 for proposed logic of how
to parse a date from the 10-digit format, but this change should be reliable
for the specific case of determining whether we are before or after the
2020-07-03 epoch where naming of apps and channels changed.

This will affect GLAM and GUD.

Reverts https://github.com/mozilla/bigquery-etl/pull/1249

I will plan to run a backfill for GUD on the same day I merge this, and then
communicate out the change to data users, since channel breakdowns in GUD
will change for Fenix and Firefox Preview.
2020-09-09 08:53:06 -04:00
dependabot[bot] 9cb1b70b1f
Bump attrs from 20.1.0 to 20.2.0 (#1284) 2020-09-09 01:25:34 +00:00
dependabot[bot] b86f64fbfb
Bump gitpython from 3.1.7 to 3.1.8 (#1283) 2020-09-09 01:18:42 +00:00
Anna Scholtz ac56b5b1f6 Fix publishing UDFs 2020-09-08 12:28:16 -07:00
dependabot[bot] f25a096536
Bump pytest-black from 0.3.10 to 0.3.11 (#1282) 2020-09-08 16:46:47 +00:00
Felix Lawrence a8fc87b91c
Add `clients_last_seen.days_interacted_bits`. (#1280)
* Add `clients_last_seen.days_any_interaction_bits`.

This column is a 28-day bit array containing whether the client has any active ticks (thus nonzero active hours) associated with each of the last 28 submission dates.

* Change column name.

Follow the grammatical pattern for these column names, as pointed out by @jklukas.

* Update tests

* Reconcile order to satisfy IF

Co-authored-by: Jeff Klukas <jklukas@mozilla.com>
2020-09-02 16:21:32 -04:00
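The 28-day bit array described in the commit above can be illustrated with a short sketch. This is not the repository's implementation: the function names, the `BITS_28_MASK` constant, and the convention that the lowest bit represents the most recent submission date are all assumptions made for illustration.

```python
from typing import Optional

# Assumption: bit 0 is the most recent submission date, bit 27 the oldest.
BITS_28_MASK = (1 << 28) - 1  # keeps only the 28 most recent days


def shift_bits_one_day(previous_bits: int, interacted_today: bool) -> int:
    """Age the pattern by one day, then record today's value in the lowest bit."""
    return ((previous_bits << 1) & BITS_28_MASK) | int(interacted_today)


def days_since_last_interaction(bits: int) -> Optional[int]:
    """Return the position of the lowest set bit, or None if no bit is set."""
    if bits == 0:
        return None
    # bits & -bits isolates the lowest set bit.
    return (bits & -bits).bit_length() - 1
```

Under these assumptions, a client whose only interaction was 27 days ago drops out of the window on the next shift, because the mask discards any bit pushed past position 27.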
Daniel Thorn 26c67c7ee8
Upgrade to pytest 6.0.1 (#1281)
Also upgrade and fix pytest plugins
2020-09-02 11:30:14 -07:00
Anna Scholtz 6c7ad57b55 Prefix test data UDFs with test_ 2020-09-02 10:24:38 -07:00
Anna Scholtz 7564f8e46c Fix collecting UDFs for testing 2020-09-02 10:24:38 -07:00
Anna Scholtz 15a02710d2 Remove old parse_udf file 2020-09-02 10:24:38 -07:00
Anna Scholtz 3bdb418588 Add tests for parsing UDFs 2020-09-02 10:24:38 -07:00
Anna Scholtz 40e8a7ad91 Reformat UDFs 2020-09-02 10:24:38 -07:00
Anna Scholtz 56a475c572 Fix validating docs 2020-09-02 10:24:38 -07:00
Anna Scholtz 437cf67aa2 Refactor parse_udf 2020-09-02 10:24:38 -07:00
Anna Scholtz 0080ff8867 Refactor publish_udfs script 2020-09-02 10:24:38 -07:00
Anna Scholtz 2b29d24f59 Migrate UDFs to new format 2020-09-02 10:24:38 -07:00
dependabot[bot] 04014b483e
Bump google-cloud-storage from 1.27.0 to 1.31.0 (#1279) 2020-09-01 18:17:15 +00:00
dependabot[bot] c565d2e96b
Bump smart-open from 1.10.0 to 2.1.1 (#1278) 2020-09-01 17:20:29 +00:00
dependabot[bot] 357431e1b7
Bump google-cloud-bigquery from 1.24.0 to 1.27.2 (#1277) 2020-09-01 16:22:14 +00:00
dependabot[bot] 08369555f5
Bump attrs from 19.3.0 to 20.1.0 (#1275) 2020-09-01 16:12:43 +00:00
dependabot[bot] 288d6ed8db
Bump gitpython from 3.1.2 to 3.1.7 (#1274) 2020-09-01 16:05:43 +00:00
dependabot[bot] bf8983784a
Bump typing from 3.7.4.1 to 3.7.4.3 (#1267) 2020-09-01 15:57:55 +00:00
Anna Scholtz f0c3ad8fda
Fix writing to temporary table when publishing JSON (#1273)
* Fix writing to temporary table when publishing JSON

* Update test_script_incremental_query test

* Ensure temporary artifacts are deleted

* Wrap deletion of artifacts into try finally block
2020-09-01 08:51:15 -07:00
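The cleanup pattern named in the commit above ("Wrap deletion of artifacts into try finally block") might look roughly like the following. This is a generic sketch, not the repository's code: `publish_with_cleanup` and its three callable parameters are hypothetical names standing in for whatever creates, consumes, and deletes the temporary table.

```python
from typing import Callable, TypeVar

H = TypeVar("H")  # handle to the temporary artifact (hypothetical)
R = TypeVar("R")  # result of the publish step


def publish_with_cleanup(
    write_temp: Callable[[], H],
    publish: Callable[[H], R],
    delete_temp: Callable[[H], None],
) -> R:
    """Create a temporary artifact, publish from it, and always delete it."""
    handle = write_temp()
    try:
        return publish(handle)
    finally:
        # Runs whether publish() returned or raised, so nothing is left behind.
        delete_temp(handle)
```

The artifact is created outside the `try` block on purpose: if creation itself fails, there is no handle to delete.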
Frank Bertsch 3da15a3001
Prepare search_clients_last_seen for 100% backfill (#1261)
* Move null search engines to Other

* Remove sample_id limit
2020-09-01 09:46:35 -04:00
dependabot[bot] 5ed271b8b4
Bump pytest-black from 0.3.8 to 0.3.10 (#1265) 2020-08-31 21:10:47 +00:00
Anna Scholtz 34e2c21405 Add CLI command to initialize destination tables for queries 2020-08-31 12:47:58 -07:00