Commit Graph

1027 Commits

Author SHA1 Message Date
dependabot-preview[bot] 9f1655972c Bump sqlparse from 0.3.0 to 0.3.1
Bumps [sqlparse](https://github.com/andialbrecht/sqlparse) from 0.3.0 to 0.3.1.
- [Release notes](https://github.com/andialbrecht/sqlparse/releases)
- [Changelog](https://github.com/andialbrecht/sqlparse/blob/master/CHANGELOG)
- [Commits](https://github.com/andialbrecht/sqlparse/compare/0.3.0...0.3.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-08 15:51:15 -04:00
Frank Bertsch 8448e2f78b
Bug 1632635 - Make Amplitude properties top-level (#968)
* Add top-level Amplitude user props

* Don't send empty array user props

* Correct schema mismatch on union
2020-05-08 15:25:36 -04:00
Jeff Klukas 19323045c5 Use join to get 64-char fxa_uid hash
Per discussion in [DS-642](https://jira.mozilla.com/browse/DS-642?focusedCommentId=62023&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-62023)
2020-05-08 14:02:05 -04:00
Jeff Klukas 6d9ebad09c Refactor clients_first_seen
@jmccrosky reported that the first_seen_dates were not necessarily
correct. I neglected an important ORDER BY clause in init.sql.

This also simplifies the logic in the incremental query.
2020-05-08 12:59:29 -04:00
Jeff Klukas b68a2a7b52 Add udf.normalize_os 2020-05-08 12:31:57 -04:00
Jeff Klukas 2f35b9d3e3 Update sync send tab schema
Based on feedback in [DS-642](https://jira.mozilla.com/browse/DS-642?focusedCommentId=61469&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-61469)
2020-05-08 10:46:46 -04:00
Jeff Klukas fa60588b7b Bit pattern UDFs (bits28_*) 2020-05-07 16:54:56 -04:00
Frank Bertsch 8526e7a571
Make FxA export use PDT (#962) 2020-05-07 16:43:45 -04:00
Anna Scholtz fd9313620c deviations_anomdtct_v1 2020-05-07 12:57:39 -07:00
Anna Scholtz a42d19a2ab Update deviations query for Airflow anomdtct pipeline usage 2020-05-07 12:57:39 -07:00
Frank Bertsch 28268dcd50
Bug 1636231 - Remove background events from ios Amplitude export (#960)
* Bug 1636231 - Remove ios background events from Amplitude

* Format fennec ios events sql
2020-05-07 14:55:28 -04:00
Frank Bertsch bd7be1606c
Bug 1632635 - FxA Amplitude export for active events (#941)
* WIP: Initial implementation of FxA Amplitude export

* Use submission_date parameter

* Add hmac-sha256 SQL implementation

* Escape language column name

Co-Authored-By: Jeff Klukas <jeff@klukas.net>

* Use hmac_sha256; update for review feedback

* Reformat sql files

* Add docs for HMAC implementation

* Validate hmac_sha256 against NIST test vectors

* Add filepath as from_text arg

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>

* Explicitly use named argument

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>

* Add docs for hmac validation

* WIP: Derive os_used_week/month as incremental query

* Retrieve hmac_key from encrypted keys table

Co-authored-by: Jeff Klukas <jeff@klukas.net>

* Remove fxa_hmac param

* Reformat SQL files

* Use bytes_seen for os_used

* Rename udfs

* Format UDF sql

* Don't include NULL os values

* Don't include NULL user properties

* Update comment for UDF

* Use fully-named datasets, not fxa*

* Cast key_id to bytes

* Fix failing tests

* Fix test failures

* Use new dataset for view query

* Add access denied exception for secret access

* Remove flake8 changes

* Update description of fxa_amplitude_export

Co-authored-by: Jeff Klukas <jeff@klukas.net>

* Remove version suffix from view

Co-authored-by: Jeff Klukas <jeff@klukas.net>
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
2020-05-05 22:37:26 -04:00
Jeff Klukas 069c24ac60
Add clients_first_seen (#952) 2020-05-05 16:52:12 -04:00
Marina Samuel 2a965c1540 --no-replace flag not required. 2020-05-05 12:25:22 -04:00
Marina Samuel e47c17afec Update tests for clients_histogram_probe_counts and delete buckets tests. 2020-05-05 12:25:22 -04:00
Marina Samuel 3bcab84a9e Update dryrun and format scripts. 2020-05-05 12:25:22 -04:00
Marina Samuel d71bda7186 Update run_glam_sql script. 2020-05-05 12:25:22 -04:00
Marina Samuel f41ee1a7b1 Simplify histogram_probe_counts code. 2020-05-05 12:25:22 -04:00
Marina Samuel 733f484c09 Use mod to sample evenly. 2020-05-05 12:25:22 -04:00
Marina Samuel 88182f469c Replace histogram_bucket_counts with aggregates_unnested. 2020-05-05 12:25:22 -04:00
Marina Samuel 137efc5edd Update scalar bucket and probe counts tests. 2020-05-05 12:25:22 -04:00
Marina Samuel 794aca767f Fix double counting bug and add sample fudging for scalars. 2020-05-05 12:25:22 -04:00
Marina Samuel 40fcac38e9 Update probe counts test to account for Dirichlet. 2020-05-05 12:25:22 -04:00
Marina Samuel bd191b5244 Update the densities to follow a Dirichlet distribution. 2020-05-05 12:25:22 -04:00
Marina Samuel 3cb170fa6f Update bucket counts to fudge windows+release numbers. 2020-05-05 12:25:22 -04:00
Marina Samuel 4ff327ffb8 Update tests for double counting bug. 2020-05-05 12:25:22 -04:00
Marina Samuel 552d28861c Update histogram bucket and probe counts to fix double counting bug. 2020-05-05 12:25:22 -04:00
Marina Samuel 6f7910aea8 Add tests for clients_scalar_bucket_counts and clients_scalar_probe_counts. 2020-05-05 12:25:22 -04:00
Marina Samuel 991d41841c Add tests for clients_histogram_bucket_counts and clients_histogram_probe_counts. 2020-05-05 12:25:22 -04:00
Marina Samuel 45e882099d Update dryrun script. 2020-05-05 12:25:22 -04:00
Marina Samuel 08e553907e Update script for running desktop glam. 2020-05-05 12:25:22 -04:00
Marina Samuel 2a62e9fc99 Update sampling portion of query to be more concise. 2020-05-05 12:25:22 -04:00
Marina Samuel 57377f6b68 Adding sample_id to clients daily init files. 2020-05-05 12:25:22 -04:00
Marina Samuel 96bae09c92 Histogram test fixes. 2020-05-05 12:25:22 -04:00
Marina Samuel 62c1d06ec3 Scalar test fixes. 2020-05-05 12:25:22 -04:00
Marina Samuel 5bd6b446ed Add sampling to histograms. 2020-05-05 12:25:22 -04:00
Marina Samuel 788e818c1b Add new histogram probes. 2020-05-05 12:25:22 -04:00
Marina Samuel 867ffb220f Add sampling for scalars. 2020-05-05 12:25:22 -04:00
Marina Samuel adaefc47f9 Add formatting for glam desktop scalars. 2020-05-05 12:25:22 -04:00
Ben Wu 6450c0036f
Bug 1632245 - Add new fenix apps to mobile search (#946) 2020-05-05 12:03:14 -04:00
Daniel Thorn bba858ad31
Skip NotFound tables in shredder (#951)
Also remove deleted tables from config

Also add --partition-limit option for faster --dry-run
2020-05-04 12:20:44 -07:00
Anna Scholtz 6b3aa3969a Fix gzip content-encoding 2020-05-04 08:57:37 -07:00
Anna Scholtz 2e7b92e797 Update tests and remove gz ending 2020-05-01 10:48:01 -07:00
Anna Scholtz 3171f38143 Update README 2020-05-01 10:48:01 -07:00
Anna Scholtz daebf6d1e7 Remove .gz from exported JSON files 2020-05-01 10:48:01 -07:00
Anna Scholtz c63b185bb9 Set content-type for public data to json and encoding to gzip 2020-05-01 10:48:01 -07:00
Anna Scholtz af964e281d Add doc comments 2020-04-30 14:05:26 -07:00
Anna Scholtz f961ed3137 Create pytest fixtures 2020-04-30 14:05:26 -07:00
Anna Scholtz b3ddfc3cf5 BigQuery client fixture with random dataset 2020-04-30 14:05:26 -07:00
Marina Samuel 0199ca6b5f Use glean percentiles function for desktop. 2020-04-30 15:49:31 -04:00