Mirrors what we've done for nondesktop.
Actually rolling this out will involve getting the change merged
to `spark-parquet-to-bigquery`, then doing a manual recreation of the table
and its downstream dependencies. Because there's no windowing or joining involved,
though, those recreations can each be done efficiently in a single query that takes
perhaps a few minutes to run.
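As a rough sketch of what one of those single-query recreations could look like (all table, dataset, and column names below are placeholders modeled on a `clients_last_seen`-style upstream table, not the actual production names):

```sql
-- Hypothetical sketch: recreate the raw MAU table in one pass.
-- Because days_since_seen is precomputed upstream, no windowing or
-- joining is needed; a single aggregation suffices.
CREATE OR REPLACE TABLE
  `my-project.analysis.example_exact_mau28_raw_v1`
PARTITION BY
  submission_date
AS
SELECT
  submission_date,
  normalized_channel,
  COUNTIF(days_since_seen < 28) AS mau
FROM
  `my-project.telemetry.example_clients_last_seen_v1`
GROUP BY
  submission_date,
  normalized_channel;
```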
This change will allow us to move the Growth Dashboard off of Spark-based
analysis of Parquet datasets and onto BQ (via Redash).
Per [BQ docs on creating views](https://cloud.google.com/bigquery/docs/views#creating_a_view):
> For standard SQL views, the query must include the project ID in table and view references in the form `[PROJECT_ID].[DATASET].[TABLE]`.
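So a view definition along these lines (names are placeholders) has to spell out the project rather than using a bare `dataset.table` reference:

```sql
-- Placeholder names; the point is only that the table reference inside
-- a standard SQL view must be fully qualified as project.dataset.table.
-- A bare reference like `analysis.example_exact_mau28_raw_v1` would be rejected.
CREATE VIEW
  `my-project.analysis.example_mau_v1`
AS
SELECT
  *
FROM
  `my-project.analysis.example_exact_mau28_raw_v1`;
```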
We also remove the `observation_date` column. After further discussion with
the Growth Dashboard group, we're comfortable using the established definition
of the MAU window, which includes the date of report; the concern is to make sure
we never include a partial day, and that is already guaranteed because batch
processing runs only after a day is closed.
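As a sketch of what that definition means in practice (table and column names here are assumptions, not the actual schema): the 28-day window for a given report date runs from 27 days before it through the report date itself, and there is no partial final day because the job runs only after that day closes.

```sql
-- Sketch: exact 28-day MAU as of a report date, with the window
-- inclusive of the report date itself.
DECLARE report_date DATE DEFAULT DATE '2019-06-30';

SELECT
  COUNT(DISTINCT client_id) AS mau
FROM
  `my-project.telemetry.example_clients_daily_v1`
WHERE
  submission_date BETWEEN DATE_SUB(report_date, INTERVAL 27 DAY) AND report_date;
```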
As discussed in the [Smoot Growth Dashboard Engineering Requirements](https://docs.google.com/document/d/1L8tWDUjccutGGAldhpypRtPCaw3kkXboPUTtTZb02OA/edit?usp=sharing):
> In order to support statistical inference (confidence intervals and hypothesis tests), we require the concept of an ID Bucket. We divide the space of user IDs (probably client_id in most cases) into 20 buckets with equal probability of assignment into each bucket. It is very important that this assignment is orthogonal to anything else that we care about - for example, it would be a problem if newer profiles were more likely to be assigned to particular buckets, or if profiles in certain experiment arms always ended up in particular buckets.
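One common way to get such an assignment (a sketch only; the hash function and the choice of `client_id` are assumptions, not a statement of what this PR implements) is a stable hash of the ID taken mod 20, which is deterministic and unrelated to profile age, channel, or experiment enrollment:

```sql
-- Sketch: derive a 20-way ID bucket from a stable hash of client_id.
-- FARM_FINGERPRINT is deterministic, so a given client always lands in
-- the same bucket, and the hash is orthogonal to anything we analyze.
SELECT
  client_id,
  MOD(ABS(FARM_FINGERPRINT(client_id)), 20) AS id_bucket
FROM
  `my-project.telemetry.example_clients_last_seen_v1`;
```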
This is an alternative to #13 that separates ETL into a single
dimensional `_raw_v1` table and defers presentation-layer details to live
views where possible.
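Concretely (all names below are placeholders), the presentation layer can then be a plain view over the raw table, so labels and other display details can change without re-running ETL:

```sql
-- Sketch: presentation details live in a view over the _raw_v1 table,
-- so they can be updated in place without backfilling the raw data.
CREATE VIEW
  `my-project.analysis.example_exact_mau28_v1`
AS
SELECT
  submission_date,
  normalized_channel,
  IF(normalized_channel = 'release', 'Release', 'Prerelease') AS channel_group,
  mau
FROM
  `my-project.analysis.example_exact_mau28_raw_v1`;
```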
For a time, we thought we were going to use only the release channel for KPI
calculations, but this is likely to revert to the historical
stance of including prerelease channels in MAU calculations.
This PR removes the filter we had on channel and retains `normalized_channel`
as a dimension, so we can still filter by channel at the analysis level if needed.
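At analysis time (in Redash, for example), restricting to release then becomes a simple query-time filter; the names below are illustrative only:

```sql
-- Sketch: channel filtering moves to query time instead of ETL.
SELECT
  submission_date,
  mau
FROM
  `my-project.analysis.example_exact_mau28_v1`
WHERE
  normalized_channel = 'release'
ORDER BY
  submission_date;
```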