Commit graph

1485 commits

Author SHA1 Message Date
Jeff Klukas 73268edb16 Add comment about applicability of this UDF 2019-03-27 08:56:24 -04:00
Jeff Klukas 59855dce0a Add seen_in_tier1_country for fxa tables 2019-03-27 08:56:24 -04:00
Daniel Thorn 97fe0540d1 Rename get_key to udf_get_key (#39) 2019-03-26 10:44:49 -07:00
Jeff Klukas b4d2444f31 Define views using CREATE OR REPLACE VIEW 2019-03-26 11:07:37 -04:00
Jeff Klukas 3d525cb6e7 Add first phase of FxA ETL for calculating MAU
There will be follow-ups in the next few days to include daily imports
from the FxA team's BQ projects.
2019-03-26 11:07:37 -04:00
dependabot[bot] 3d3dd3b4a8 Bump google-api-core from 1.8.1 to 1.8.2 (#35)
Bumps [google-api-core](https://github.com/GoogleCloudPlatform/google-cloud-python) from 1.8.1 to 1.8.2.
- [Release notes](https://github.com/GoogleCloudPlatform/google-cloud-python/releases)
- [Changelog](https://github.com/googleapis/google-cloud-python/blob/master/CHANGELOG.md)
- [Commits](https://github.com/GoogleCloudPlatform/google-cloud-python/compare/api_core-1.8.1...api_core-1.8.2)

Signed-off-by: dependabot[bot] <support@dependabot.com>
2019-03-25 09:09:55 -07:00
dependabot[bot] 5538e832bc Bump googleapis-common-protos from 1.5.8 to 1.5.9 (#34)
Bumps [googleapis-common-protos](https://github.com/googleapis/googleapis) from 1.5.8 to 1.5.9.
- [Release notes](https://github.com/googleapis/googleapis/releases)
- [Commits](https://github.com/googleapis/googleapis/commits)

Signed-off-by: dependabot[bot] <support@dependabot.com>
2019-03-25 08:37:02 -07:00
dependabot[bot] e8a1e16801 Bump pytest-xdist from 1.26.1 to 1.27.0 (#31)
Bumps [pytest-xdist](https://github.com/pytest-dev/pytest-xdist) from 1.26.1 to 1.27.0.
- [Release notes](https://github.com/pytest-dev/pytest-xdist/releases)
- [Changelog](https://github.com/pytest-dev/pytest-xdist/blob/master/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest-xdist/compare/v1.26.1...v1.27.0)

Signed-off-by: dependabot[bot] <support@dependabot.com>
2019-03-21 09:35:41 -07:00
Jeff Klukas ba1a53f62a Filter out null client_id 2019-03-20 12:37:30 -04:00
Jeff Klukas 3240a8eec3 Add WAU and id_bucket to desktop mau datasets
Mirrors what we've done for nondesktop.

Actually rolling this out will involve getting the change merged
to `spark-parquet-to-bigquery`, then doing a manual recreation of the table
and downstream. Because there's no windowing or joining involved, though,
those recreations can each be done efficiently in a single query that takes
perhaps a few minutes to run.

This change will allow us to move the Growth Dashboard off of Spark-based
analysis of Parquet datasets and onto BQ (via Redash).
2019-03-20 12:37:30 -04:00
Jeff Klukas 68f87ecbe4 Update live views to use fully qualified table names
Per [BQ docs on creating views](https://cloud.google.com/bigquery/docs/views#creating_a_view):

> For standard SQL views, the query must include the project ID in table and view references in the form `[PROJECT_ID].[DATASET].[TABLE]`.

We also remove the `observation_date` column. After further discussion with
the growth dashboard group, we're comfortable using the established definition
of the MAU window including the date of report; the concern is to make sure
we never include a partial day, which is already guaranteed by batch processing
only after a day is closed.
2019-03-20 12:28:55 -04:00
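To illustrate the window definition from commit 68f87ecbe4 (the MAU window includes the report date itself, and only closed days are counted), here is a minimal Python sketch. The 28-day length is an assumption taken from the `mau28` table names elsewhere in this log; the function names `mau_window` and `count_active` and the `last_seen` mapping are hypothetical and do not come from the repository.

```python
from __future__ import annotations

from datetime import date, timedelta

# Assumed window length, inferred from the "mau28" naming in this log.
WINDOW_DAYS = 28


def mau_window(report_date: date, window_days: int = WINDOW_DAYS) -> tuple[date, date]:
    """Return the inclusive (start, end) dates of the window ending on report_date."""
    return report_date - timedelta(days=window_days - 1), report_date


def count_active(last_seen: dict[str, date], report_date: date) -> int:
    """Count clients whose last-seen date falls inside the inclusive window.

    `last_seen` is a hypothetical mapping of client ID to the most recent
    fully processed day on which that client was active.
    """
    start, end = mau_window(report_date)
    return sum(1 for seen in last_seen.values() if start <= seen <= end)
```

Because the window ends on the report date, a client last seen on the report date counts, while one last seen 28 days before it does not.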
Jeff Klukas 3a93272d9b Add bucket_id to nondesktop tables
As discussed in the [Smoot Growth Dashboard Engineering Requirements](https://docs.google.com/document/d/1L8tWDUjccutGGAldhpypRtPCaw3kkXboPUTtTZb02OA/edit?usp=sharing):

> In order to support statistical inference (confidence intervals and hypothesis tests), we require the concept of a ID Bucket.  We divide the space of user IDs (probably client_id in most cases) into 20 buckets with equal probability of assignment into each bucket.  It is very important that this assignment is orthogonal to anything else that we care about - for example, it would be a problem if newer profiles were more likely to be assigned to particular buckets, or if profiles in certain experiment arms always ended up in particular buckets.
2019-03-19 21:29:36 -04:00
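The bucketing requirement quoted in commit 3a93272d9b can be sketched in Python as follows. This is only an illustration under stated assumptions: the use of SHA-256 and the name `id_bucket` are choices made for this example, not the hash or function the repository's SQL actually uses. A hash of the ID alone is one way to keep the assignment independent of profile age, experiment arm, or any other attribute.

```python
import hashlib

# Number of buckets, taken from the quoted requirement.
NUM_BUCKETS = 20


def id_bucket(client_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a client ID to a stable bucket in the range [0, num_buckets).

    SHA-256 is an assumption made for this sketch. The first 8 bytes of the
    digest are read as an unsigned integer and reduced modulo num_buckets;
    the modulo bias for 20 buckets over a 64-bit range is negligible.
    """
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets
```

The same ID always lands in the same bucket, so per-bucket aggregates can be compared across days when estimating confidence intervals.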
Jeff Klukas 863b2252ef Standardize use of _n and DATE 2019-03-19 20:42:16 -04:00
Daniel Thorn 0c7e3a9a60 fix readme link 2019-03-19 12:23:31 -04:00
Daniel Thorn a61a3ca236 Fix core_clients_daily to use framing (#25) 2019-03-19 09:06:22 -07:00
Daniel Thorn 124b201018 Reorganise README for slightly better readability (#24) 2019-03-19 08:57:15 -07:00
Jeff Klukas f000d0170e Refactor nondesktop exact_mau28 tables into live views
This is an alternative to #13 that separates ETL to a single
dimensional `_raw_v1` table and then delays details for the presentation
layer to live views where possible.
2019-03-18 17:31:45 -04:00
Jeff Klukas ae0f491033 Add normalized_channel to nondesktop dimensions
For a time, we thought we were going to use release only for KPI
calculations, but this is likely going to revert to the historical
stance of including prerelease in MAU calculations.

This PR removes the filter we had on channel, and retains normalized_channel
as a dimension so we can filter by channel if needed at the analysis level.
2019-03-18 16:47:40 -04:00
dependabot[bot] eec68e629d Bump google-api-core from 1.8.0 to 1.8.1 (#21)
Bumps [google-api-core](https://github.com/GoogleCloudPlatform/google-cloud-python) from 1.8.0 to 1.8.1.
- [Release notes](https://github.com/GoogleCloudPlatform/google-cloud-python/releases)
- [Changelog](https://github.com/googleapis/google-cloud-python/blob/master/CHANGELOG.md)
- [Commits](https://github.com/GoogleCloudPlatform/google-cloud-python/compare/logging-1.8.0...api_core-1.8.1)

Signed-off-by: dependabot[bot] <support@dependabot.com>
2019-03-14 09:52:04 -07:00
dependabot[bot] fc41b909fa [Security] Bump pyyaml from 3.13 to 5.1 (#20)
Bumps [pyyaml](https://github.com/yaml/pyyaml) from 3.13 to 5.1. **This update includes security fixes.**
- [Release notes](https://github.com/yaml/pyyaml/releases)
- [Changelog](https://github.com/yaml/pyyaml/blob/master/CHANGES)
- [Commits](https://github.com/yaml/pyyaml/compare/3.13...5.1)

Signed-off-by: dependabot[bot] <support@dependabot.com>
2019-03-13 15:33:35 -07:00
dependabot[bot] 7d99d5bccd Bump pytest from 4.3.0 to 4.3.1 (#18)
Bumps [pytest](https://github.com/pytest-dev/pytest) from 4.3.0 to 4.3.1.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/master/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/4.3.0...4.3.1)

Signed-off-by: dependabot[bot] <support@dependabot.com>
2019-03-13 10:12:42 -07:00
Daniel Thorn c8edf48c66 Remove dags until they are GA (#19) 2019-03-13 09:58:49 -07:00
William Lachance 9bef1b7fe3 Update README.md (#17)
* Add CircleCI badge
* Fix link to tests/
2019-03-12 11:24:56 -07:00
dependabot[bot] 61d5b6a528 Bump google-cloud-bigquery from 1.9.0 to 1.10.0 (#16)
Bumps [google-cloud-bigquery](https://github.com/GoogleCloudPlatform/google-cloud-python) from 1.9.0 to 1.10.0.
- [Release notes](https://github.com/GoogleCloudPlatform/google-cloud-python/releases)
- [Changelog](https://github.com/googleapis/google-cloud-python/blob/master/CHANGELOG.md)
- [Commits](https://github.com/GoogleCloudPlatform/google-cloud-python/compare/logging-1.9.0...logging-1.10.0)

Signed-off-by: dependabot[bot] <support@dependabot.com>
2019-03-11 15:24:18 -07:00
dependabot[bot] bb5bdcf517 Bump certifi from 2018.11.29 to 2019.3.9 (#15)
Bumps [certifi](https://github.com/certifi/python-certifi) from 2018.11.29 to 2019.3.9.
- [Release notes](https://github.com/certifi/python-certifi/releases)
- [Commits](https://github.com/certifi/python-certifi/compare/2018.11.29...2019.03.09)

Signed-off-by: dependabot[bot] <support@dependabot.com>
2019-03-11 15:23:06 -07:00
Jeff Klukas e7691d3a69 Bug 1524670 firefox_nondesktop_exact_mau28 (#11)
Bug 1524670 firefox_nondesktop_exact_mau28
2019-03-07 19:13:07 -05:00
Daniel Thorn aef4494831 Fix flake8 and docstyle on master (#12) 2019-03-07 14:27:39 -08:00
Daniel Thorn 79070068ad Add first test (#9) 2019-03-07 12:43:21 -08:00
Jeff Klukas 55da69cca9 Add core_clients_daily_v1 and core_clients_last_seen_v1 (#8) 2019-03-07 09:31:12 -05:00
Daniel Thorn 55a81dcc17 Add first airflow DAG (#5) 2019-03-05 10:31:31 -08:00
Daniel Thorn 9b62e6ba17 Optimize clients_last_seen_v1.init.sql (#4) 2019-02-28 09:25:35 -08:00
Daniel Thorn 75e75a3122 Fix exact_mau28_by_dimensions and update exact_mau28 to use it (#3) 2019-02-21 15:29:58 -08:00
Daniel Thorn 123552c320 version queries and add clients_last_seen and mau28_by_dimensions (#1) 2019-02-15 13:37:24 -08:00
Daniel Thorn 99546691cb include dataset id in SQL 2019-02-08 14:17:08 -08:00
Daniel Thorn b91dcdd15e Initial commit 2019-02-08 11:11:59 -08:00