Commit graph

221 Commits

Author SHA1 Message Date
Anna Scholtz 08b45a40fe
Jinja queries support (#3691)
* Support Jinja templating in query files

* Formatting for Jinja

* ./bqetl query render command

* Fix running templates
2023-03-30 11:00:12 -07:00
kik-kik 51def19185
Bug 1825545 - Revert "Support Jinja templating in query files (#3685)" (#3689)
Bug 1825545 - This reverts commit a1c51124ec.
2023-03-30 10:47:17 -04:00
Anna Scholtz a1c51124ec
Support Jinja templating in query files (#3685)
* Support Jinja templating in query files

* Formatting for Jinja

* ./bqetl query render command
2023-03-29 10:38:08 -07:00
kik-kik 9ef6ba1557
issues(3416): added missing metadata.yaml files inside moz-fx-data-shared-prod (#3640)
* added missing metadata.yaml files inside telemetry_derived

* added a comment where we should include a check to see if all queries have a metadata.yaml

* added the remainder of missing metadata.yaml files under the moz-fx-data-shared-prod project

* fix metadata files for folders without version

* changes made as requested by ascholtz in PR#3640

---------

Co-authored-by: Leli Schiestl <lschiestl@mozilla.com>
2023-03-13 21:53:05 +00:00
Alexander 60c85e7c54
Revert CI changes for private UDFs and add stub documentation - DENG-735 (#3652)
* Revert "CI fixes for supporting private UDFs in bigquery-etl - DENG-735 (#3631)"

This reverts commit edcfe758f7.

* Added stub UDF for monetized_search

* Add docs for using a private internal UDF
2023-03-10 11:41:46 -05:00
Alexander edcfe758f7
CI fixes for supporting private UDFs in bigquery-etl - DENG-735 (#3631)
* Minimize stub normalize_search_engine UDF and usage in search_clients_last_seen tests
* Move sql tests downstream of private-generate-sql and copy UDFs into sql-dir for tests
2023-03-09 16:54:42 -05:00
Frank Bertsch 12336b8286
Run a nightly CI workflow after probe-scraper (#3641)
Probe-scraper takes at most 4 hours to run.
If those schemas get deployed, then this build will cache
the latest stable schemas and make them available in the
container.
2023-03-07 13:02:50 -05:00
Mikaël Ducharme e890be4c10
chore(ci): use unit tests from telemetry-airflow (#3600)
* use unit tests from telemetry-airflow
2023-02-22 13:24:56 -05:00
Anna Scholtz 3362ea470a
Deploy changes to stage env (#3587)
* Deploy changes to stage env

* Test deploy

* Remove tests queries

* Add CLI option to remove updated artifacts
2023-02-22 09:49:01 -08:00
Anna Scholtz 23b9c31e58
Update SSH key for pushing generated SQL (#3595) 2023-02-14 15:20:05 -08:00
Anna Scholtz a064fe246a
Update SSH doc deploy key (#3594) 2023-02-14 17:52:19 -05:00
Daniel Thorn ac053c326a
Update dependencies missed by dependabot (#3566) 2023-02-06 12:14:32 -08:00
Anna Scholtz 2e994cc19a
Re-enable gh-pages for docs (#3556) 2023-02-01 10:19:33 -08:00
Anna Scholtz c68a89b871
Disable gh-pages (#3531) 2023-01-25 17:14:59 -08:00
Daniel Thorn fffa763f88
Fix compatibility with dependabot (#3512)
by updating pip-tools and indicating no packages are unsafe

and update google-cloud-bigquery to fix dependabot version conflict error
2023-01-17 14:14:54 -08:00
Anna Scholtz 653d5bcb5e
Fix copying DAGs in CI (#3511) 2023-01-17 11:16:02 -08:00
Anna Scholtz 031320dfcf
Move check for dags being up to date after DAGs have been generated (#3508) 2023-01-17 08:49:29 -08:00
akkomar 5e42daae7f
Update bigquery-etl deploy key (#3492) 2023-01-09 22:54:43 +01:00
akkomar ef4bd86d9b
Update private-bigquery-etl deploy key (#3491) 2023-01-09 22:12:38 +01:00
Daniel Thorn a2ec2afeec
enforce isort in CI (#3450) 2022-12-12 15:11:38 -08:00
Daniel Thorn 797ac94268
Update CI to python 3.10 (#3248) 2022-10-05 11:32:53 -07:00
Anna Scholtz 81fbdf6c01 Move doc validation and generation commands to CLI 2022-08-25 08:51:41 -07:00
Anna Scholtz 9a5a98779a Remove scripts that have been replaced by CLI 2022-07-19 08:08:49 -07:00
Jeff Klukas f5b1b0d61c
Make link to manual trigger easier to copy/paste (#3015)
The link is currently followed by a '.' and it is easy to miss the next line of instruction. Indentation should make it easier to see these are grouped together.
2022-06-14 15:31:45 +00:00
Daniel Thorn 5ea2f1dea2
Remove redundant GOOGLE_APPLICATION_CREDENTIALS in integration (#2973) 2022-05-16 19:53:50 +00:00
Sean Rose 3fb6ebc160
Specify pyjnius in `requirements.in` (#2924) 2022-04-29 12:23:47 -07:00
Daniel Thorn 0235d6087c
Reapply "Update to zetasql 2022.02.1 (#2862)" (#2881) 2022-04-07 14:02:20 -07:00
Daniel Thorn 22835e5706
Revert "Update to zetasql 2022.02.1 (#2862)" (#2879)
This reverts commit 9c9761feac.
2022-04-07 12:03:09 -07:00
Daniel Thorn 9c9761feac
Update to zetasql 2022.02.1 (#2862) 2022-04-05 18:55:52 +02:00
Daniel Thorn 4406437be2
Use --no-deps when installing compiled requirements files (#2752) 2022-02-24 21:36:47 +00:00
Anna Scholtz b5f9ea4752 Update generate_sql script 2022-02-16 08:52:51 -08:00
Anna Scholtz 9582f323f5 Revert "Remove verify-dags-up-to-date from CI"
This reverts commit cbce03eb03.
2022-02-15 11:39:27 -08:00
Anna Scholtz 72d482210b Revert "Create directory when generating DAGs"
This reverts commit 60888156b0.
2022-02-15 11:39:27 -08:00
Anna Scholtz 60888156b0 Create directory when generating DAGs 2022-02-14 08:32:04 -08:00
Anna Scholtz cbce03eb03 Remove verify-dags-up-to-date from CI 2022-02-14 08:32:04 -08:00
Anna Scholtz 11139420c4 Copy generated DAGs when pushing generated SQL 2022-02-02 14:43:18 -08:00
Anna Scholtz c50e8615cf Test DAG SQL copy 2022-02-01 11:47:26 -08:00
Anna Scholtz 10ec0a4510 Copy generated DAGs to /tmp/workspace 2022-01-28 08:07:15 -08:00
Anna Scholtz cd5fefde6b Push DAGs to generated-sql branch 2022-01-19 10:11:47 -08:00
kik-kik a2728ee4ed
tweaked comment in validate-dags to mention that tags are also validated (#2655)
Co-authored-by: = <=>
2022-01-19 14:20:22 +00:00
kik-kik 1be3aafb82 Update Dockerfile
as suggested by @jklukas

Co-authored-by: Jeff Klukas <jklukas@mozilla.com>
2022-01-12 16:35:13 +01:00
Alexander Nicholson 1f9cb278f2 Dockerfile specifies platform for base image (to support Apple silicon), circleci using custom executor with updated docker to support --platform flag 2022-01-12 16:35:13 +01:00
Anna Scholtz 81a6a2cd60 Record dependencies when generating private SQL 2022-01-11 14:12:46 -08:00
Anna Scholtz c227e815eb Fix copying private generated SQL in CI 2022-01-11 08:50:33 -08:00
= 1af6d45726 bumped up version of ubuntu used for validate-dags step 2022-01-10 17:03:04 +01:00
Anna Scholtz f6da873783 Move DAG generation to CI 2022-01-07 14:25:04 -08:00
Anna Scholtz 8fa323719f CI run validation after generating SQL 2022-01-07 14:25:04 -08:00
Anna Scholtz 95332daeff Use generate all command in CI 2022-01-07 14:25:04 -08:00
Anna Scholtz 03f0b5abe0 Ensure diff has maximum length 2022-01-05 12:13:09 -08:00
Anna Scholtz 0ec87b045c Do not generate SQL diff on main 2022-01-04 12:50:28 -08:00
Anna Scholtz c1440c2e82 Show generated SQL diff in PR 2022-01-04 11:16:32 -08:00
Anna Scholtz 22fea1dca3 Add docs for sql_generators/ 2022-01-03 14:15:02 -08:00
Anna Scholtz dab97f5305 Move search to sql_generators/ 2022-01-03 14:15:02 -08:00
Daniel Thorn 0d327a5350
Ignore comments when verifying compiled requirements (#2542) 2021-12-01 14:23:43 -08:00
Will Lachance 1dcf62367a
Fix ci in production (#2530)
This step didn't get run in development, so we didn't notice it until now.
2021-11-26 21:13:33 +00:00
Will Lachance b6b146b396
Install mkdocs via requirements files (#2528)
This makes doc generation "work out of the box" and simplifies CI
somewhat.
2021-11-26 18:12:39 +00:00
Anna Scholtz 03c38dc203 Remove -n from entrypoint scripts 2021-11-08 10:29:14 -08:00
Anna Scholtz 6200c5ed36 Run SQL tests separately 2021-11-08 10:29:14 -08:00
Will Lachance 411af312e4
Replace `format_sql` with "bqetl format" (#2348)
* Make bqetl work equivalently to format_sql (which had some extra
options to format standard in)
* Remove `format_sql` and update everything that uses it to use
  `bqetl format` instead.
2021-09-20 17:18:25 +00:00
Will Lachance f7fff6351b
Streamline code to generate events_daily queries and re-run (#2307)
We're still checking the queries into the tree for now, but this moves the code to do so
into bqetl (more discoverable and easier to test) and generally streamlines things so it
will be easier to generate these queries at build-time. Also, check-in updated views from
a run of this code.
2021-08-31 18:56:46 +00:00
Daniel Thorn 5eeb7cddf7
Fix non-private image builds (#2302)
and build docker images on branches, but only deploy on main and tags
2021-08-30 12:08:57 -07:00
Will Lachance 703239b004
Minor improvements to bqetl script (#2267)
* Check for a valid python installation
* Actually exit if `venv` directory already exists
* Don't require virtualenv package (use `venv` builtin)
* Ensure that we don't attempt to recompile numpy or pandas when
  installing on more recent versions of Mac
2021-08-16 21:01:07 +00:00
Ben Wu dbf25769cd
Replace normalize search engine with stub implementation (#2258) 2021-08-12 16:30:13 +00:00
Anna Scholtz 4eb3dd1a60 Use ./script/bqetl in view scripts 2021-08-04 14:49:38 -07:00
Anna Scholtz cb84f91d9f Revert "Revert "Add view tests""
This reverts commit 63764c72cc.
2021-08-04 14:49:38 -07:00
Anna Scholtz 63764c72cc Revert "Add view tests"
This reverts commit d0dfeb9701.
2021-07-29 10:47:24 -07:00
Anna Scholtz d0dfeb9701 Add view tests 2021-07-29 10:07:53 -07:00
Anna Scholtz b60d4f4be2 CircleCI build check for fork 2021-07-12 14:10:20 -07:00
Anna Scholtz aafa54c346 Add separate CI step for SQL and routine tests 2021-07-12 14:10:20 -07:00
akkomar 374325adc7
Fix private SQL generation (#2177) 2021-07-09 14:18:17 +00:00
akkomar 9c313fe1dd
Push private generated SQL to private repository (#2174) 2021-07-09 13:57:36 +02:00
Anna Scholtz fe0ab35568 Use circleci-agent step halt 2021-06-24 13:56:53 -07:00
Anna Scholtz f3203ddd8c Update docs and add pull request template 2021-06-24 13:56:53 -07:00
Anna Scholtz 3d19b616c1 Fix GH workflow 2021-06-24 09:41:44 -07:00
Anna Scholtz e0f0e4a983 Fix GH action 2021-06-23 12:56:05 -07:00
Anna Scholtz dfdf195823 CircleCI manual-trigger-required-for-fork with parameter 2021-06-22 15:41:42 -07:00
Anna Scholtz 3af4b612aa Add CI step for integration trigger instructions 2021-06-22 15:41:42 -07:00
akkomar 11db3bf231
Bug 1709871 - Integrate private bigquery_etl into Docker image (#2110) 2021-06-11 09:47:09 -04:00
Arkadiusz Komarzewski 841a0b2cb7 Bug 1713631 - Publish public Docker image after private one 2021-06-09 21:37:08 +02:00
Arkadiusz Komarzewski 6837632855 Bug 1713631 - Publish Docker image to private GCR repository
This is a first step on a way to transitioning bigquery-etl to private GCR repository.
2021-06-07 13:04:25 +02:00
Arkadiusz Komarzewski aad7fd95e2 Bug 1709871 - Remove nightly scheduled build 2021-05-31 17:13:14 +02:00
Daniel Thorn 3c8894fdf1
Make schema validation part of dryrun (#2069) 2021-05-25 14:53:09 -04:00
Jeff Klukas 342431946d
The case of the missing sql/ directory (#2065)
Whitespace. Pass it on.
2021-05-24 11:47:21 -07:00
Jeff Klukas 05852b7098
Copy generated-sql content into docker container (#2062)
This also adds a scheduled build at midnight UTC to ensure that the generated
content is updated shortly before nightly ETL starts.

This should allow us to relieve some awkwardness like in various contexts
where we need to run `script/generate_sql` before invoking an endpoint in the
container. In particular, it will allow us to revert
https://github.com/mozilla/telemetry-airflow/pull/1312
and may allow us to remove [generation when publishing views in Jenkins](de1f9e61e2/projects/data-shared/Jenkinsfile.bigquery.prod (L139)).
2021-05-24 14:11:02 -04:00
Anna Scholtz 3e2e611ce6 Add tests 2021-05-19 12:51:11 -07:00
Anna Scholtz cee749c4ba Backfill with init option 2021-05-18 11:24:27 -07:00
Anna Scholtz e1bccb5530 Enable validate-schemas in CI 2021-05-07 12:30:04 -07:00
Daniel Thorn ca38326e91
Fix dependabot updates by separating pyjnius requirement (#1990) 2021-05-04 13:02:56 -04:00
whd 7c1b03934b
Default branch (#1939)
* Rename default branch

* Rename branch

* Update circleci for default branch name
2021-04-06 21:15:21 +00:00
Anthony Miyaguchi 0aacbe5c22
Pull out main logic and generate example queries for glean usage (#1937)
* Move argument parser into shared function

* Move shared main entrypoint into common

* Update example script to include other usage queries

* Commit generated queries for example usage queries

* Parallelize generation of example queries

* Add docstring

* Remove ios example queries for daily and last seen

* Fix pydocstyle linting

* Add update_example_glean_usage to CI
2021-04-06 11:38:30 -07:00
Anna Scholtz 576af2e2a3 Add mvn setup for circleci doc generation 2021-03-31 11:33:12 -07:00
Ben Wu 5049ca6551
Add check for ungenerated and untracked dags (#1900) 2021-03-17 16:28:06 -04:00
Jeff Klukas 2cab2338e9
Remove fxa_amplitude_email_clicks (#1892)
* Remove fxa_amplitude_email_clicks

Per @cvalaas, the SFMC email system is now retired, so the underlying table
here is no longer receiving updates.

Depends on https://github.com/mozilla/telemetry-airflow/pull/1268

* Add diff-filter

* yamllint hint

* More yamllint fixing
2021-03-15 14:02:20 -04:00
Daniel Thorn dfeea39ac5
Enforce more yaml lint rules (#1878) 2021-03-09 17:25:01 -05:00
Daniel Thorn 024e993c44
Record table references in metadata.yaml (#1875) 2021-03-09 12:29:05 -05:00
Anna Scholtz d220966394 Ignore materialized views when dry running 2021-03-03 15:46:45 -08:00
Jeff Klukas dd6ddee6b9
Use dataset labels to speed up stable view generation (#1863)
* Use dataset labels to speed up stable view generation

Builds on new dry run affordance from
https://github.com/mozilla/bigquery-etl/pull/1858

We also remove the `--no-dry-run` option now since only the single dry run
is now needed, and stable view generation completes in less than 2 seconds.
2021-03-02 15:05:39 -05:00
Daniel Thorn ba90a8c2b6
Fix query detection in dryrun (#1850) 2021-02-25 17:41:54 -05:00
Daniel Thorn 8b1ef6a913
Fix dryrun "'str' object has no attribute 'glob'" error (#1847) 2021-02-25 16:31:03 -05:00
Daniel Thorn 8cb740ba55
Limit dry run to modified queries (#1842) 2021-02-24 17:08:41 -08:00