Commit graph

36 commits

Author SHA1 Message Date
Ben Wu 2156023e16 Add new search types, experiments, and uri count to mobile search (#1167) 2020-07-14 19:00:37 -04:00
Anna Scholtz 41b05f58d3 Rescheduling 2020-07-10 13:30:24 -07:00
Anna Scholtz c4054ccdbf Reschedule and set main_summary priority 2020-07-09 13:08:17 -07:00
Anna Scholtz 9928da015c Update execution delta 2020-07-08 13:46:03 -07:00
Anna Scholtz c2d1430ede Reschedule DAGs 2020-07-08 13:46:03 -07:00
Anna Scholtz c5d727c336 DAG for daily experiments clients 2020-07-06 11:27:26 -07:00
Frank Bertsch 21766bc700 Version search contribution and add task (#1091) 2020-06-24 13:00:44 -04:00
Ding Ding 0a2876e133 Add search_metric_contribution 2020-06-23 21:08:30 -04:00
Ding Ding 99919ea706 Remove query to add UDF first 2020-06-23 21:08:30 -04:00
Ding Ding 87351b6572 Update search_metric_contribution and add udf function
- add quantile_search_metric_contribution.sql in udf folder
- some fixes on search_metric_contribution
- update search_metric_contribution metadata
2020-06-23 21:08:30 -04:00
Ding Ding d5b7a71bfd update submission_date time period 2020-06-23 21:08:30 -04:00
Ding Ding af777bd8e8 add search_metric_contribution 2020-06-23 21:08:30 -04:00
Ben Wu fb13b90eee Move normalized search engine to views instead of tables (#1062) 2020-06-12 15:19:16 -04:00
Anna Scholtz 45fb7d41e5 Add bqetl_search DAG 2020-06-12 09:47:51 -07:00
Anna Scholtz 69d68c55c3 bqetl_mobile_search DAG 2020-06-08 13:33:19 -07:00
Anna Scholtz b5e837cb80 Ignore formatting template files 2020-05-26 15:06:01 -07:00
Anna Scholtz 581d5a32a9 Fix SQL format validation step in CI 2020-05-26 15:06:01 -07:00
Ben Wu 9c0bbefbc9 Add normalized_app_name to mobile search (#981) 2020-05-13 13:54:53 -04:00
Ben Wu 6450c0036f Bug 1632245 - Add new fenix apps to mobile search (#946) 2020-05-05 12:03:14 -04:00
Anna Scholtz 3f1cb398fa Undo formatting for old SQL files 2020-02-07 09:48:23 -08:00
Anna Scholtz 153d45d62f Reformat SQL 2020-02-07 09:48:23 -08:00
Anna Scholtz 97b5386b41 Change UDFs to persistent UDFs and remove sql generations script 2020-02-07 09:48:23 -08:00
Daniel Thorn 5fa7e4e61e Correctly format scripting keywords (#693) 2020-01-21 20:05:47 -08:00
Ben Wu a444b4c27b Bug 1597361 - Add experiments to search_clients_daily (#667) 2020-01-10 20:02:18 -05:00
William Lachance 28af89726e Bug 1606171 - Add is_default_browser to search clients daily and search aggregates (#637) 2020-01-07 16:18:40 -05:00
Daniel Thorn 1210224536
Preserve nulls in telemetry_derived.main_summary_v4 (#623)
and preserve schema by unnesting in the view
2019-12-20 18:35:07 -05:00
Frank Bertsch 65053ad5e1
Fix monthly_searches to account for null (#612)
* Fix monthly_searches to account for null

* Fix missing closing parens

* Add UDF to get NULL key
2019-12-19 15:19:10 -05:00
Ben Wu 8084fa604f Remove non-strict search engine normalization (#604) 2019-12-17 12:56:06 -05:00
Ben Wu 7d0934b800 Point search clients daily to main summary table (not view) (#606) 2019-12-17 12:52:39 -05:00
Ben Wu d67c760725 Use normalize engine udf in all search queries (#596) 2019-12-16 12:18:01 -05:00
Frank Bertsch 719f607a0a
Update UDF names with prefixed numbers (#593)
* Error on improper UDF names

* Rename udfs with prefixed numbers
2019-12-12 15:09:57 -05:00
Frank Bertsch 6c825425b3
Search clients last seen (#451)
* Improve error message for ndjson parsing

* Make JSON error messages nicer

* Cast BYTES fields to/from string

BYTES types are not JSON-serializable. To deal with that, we do
two things:
1. Assume the input tables are hex strings, and decode them
   to get the BYTES fields values (on input)
2. Encode BYTES fields as hex strings (on output)

This means that any data files use hex strings for BYTES fields.

Note: This only works on top-level fields

* Add better discrepancy reporting for test assertions

When JSON blobs differ, it can be hard to tell what is wrong.
These functions easily show what's different, and automatically
print the differences so they are available when tests fail.

* Add search_clients_last_seen for 1-year of history

This new dataset, search_clients_last_seen, contains a year
of history for each client. It is split into 3 main parts:

1. Recent info that is contained in search_clients_daily,
   similar to how we store that in clients_last_seen
2. A year of history, represented as a BYTES field,
   indicating which days they were active for different
   types of activity
3. Among the major search providers, arrays of totals of
   different metrics, split into 12 parts, to account for
   each month's total

This dataset will power LTV.

* Fix linting issues

* Enforce sampling on search_clients_daily

* Address review feedback

- Change all bits/bytes functions to include no. of bits
- Use fileobj for tests
- Rename some vars
- Use base64 for bytes in/out

* Generate sql

* Add missing comma

* Move search_clients_ls to search_derived

* Generate moar sql

* Use clients_daily_v8

* Fix query

* Move tests to search_derived

* Fix tests for search_clients_daily_v8

* Don't dryrun with search_clients_last_seen

* Update udf/new_monthly_engine_searches_struct.sql

Co-Authored-By: Jeff Klukas <jeff@klukas.net>

* sample_id is now an int

* Add documentation

* Update schemas

* Make tests use int sample-id
2019-12-12 12:43:09 -05:00
Ben Wu 7d9782b1ba Bug 1543434 - Create search datasets for mobile (#559) 2019-12-06 13:28:53 -05:00
Ben Wu 10e300465b Filter out overactive clients in search clients daily (#511) 2019-11-18 20:57:31 -05:00
Ben Wu 9a7e7f159f Re-add aggregated numerical values in search (#503) 2019-11-14 19:26:20 -05:00
Ben Wu 73dc724086 Switch search to read from flattened main summary (#500) 2019-11-13 13:20:14 -08:00