Author | Commit | Message | Date
Anthony Miyaguchi | a1d3623542 | Merge pull request #2 from mozilla/dependabot/pip/bleach-3.3.0: Bump bleach from 3.2.3 to 3.3.0 | 2021-02-02 16:27:36 -08:00
dependabot[bot] | a073283be8 | Bump bleach from 3.2.3 to 3.3.0. Bumps [bleach](https://github.com/mozilla/bleach) from 3.2.3 to 3.3.0. [Release notes](https://github.com/mozilla/bleach/releases), [Changelog](https://github.com/mozilla/bleach/blob/master/CHANGES), [Commits](https://github.com/mozilla/bleach/compare/v3.2.3...v3.3.0). Signed-off-by: dependabot[bot] <support@github.com> | 2021-02-02 23:17:31 +00:00
Anthony Miyaguchi | 3e414c6d6f | Merge pull request #1 from mozilla/docker: Add dockerfile for project | 2021-02-01 10:05:54 -08:00
Anthony Miyaguchi | ef028e7e63 | Export POSTGRES variables instead of running from within container | 2021-01-28 17:16:10 -08:00
Anthony Miyaguchi | 487e3c2896 | Update README | 2021-01-28 17:10:32 -08:00
Anthony Miyaguchi | c509e2a5f1 | Use usr/bin/env and add py extension back for spark-submit | 2021-01-28 17:10:01 -08:00
Anthony Miyaguchi | 06d721dd44 | Merge scripts into bin | 2021-01-28 16:58:54 -08:00
Anthony Miyaguchi | 8c731a6288 | Upgrade to pyspark 3 | 2021-01-28 16:48:24 -08:00
Anthony Miyaguchi | 4884bcb46f | Fix issues with python3 for spark, POSTGRES_HOST location | 2021-01-28 16:47:48 -08:00
Anthony Miyaguchi | 4dd3769abf | Force start_ds and end_ds to be configured by the job | 2021-01-28 16:09:08 -08:00
Anthony Miyaguchi | f0b31ffa6c | Add initial dockerfile and docker-compose | 2021-01-28 16:08:50 -08:00
Anthony Miyaguchi | ae0ca52ed7 | Add script to export environment variables for aws access | 2021-01-28 16:08:15 -08:00
Anthony Miyaguchi | 716e3790e4 | Add .env file with variables | 2021-01-28 15:13:12 -08:00
Anthony Miyaguchi | 4347ead942 | Split dev and normal requirements | 2021-01-28 15:12:51 -08:00
Anthony Miyaguchi | b03a99815b | Add updated dependencies using pip-tools | 2021-01-28 15:07:38 -08:00
Anthony Miyaguchi | be82395e18 | Add separate script for loading aggregates into BigQuery | 2020-03-16 16:25:19 -07:00
Anthony Miyaguchi | ef3d082a9b | Update README | 2020-03-13 15:52:27 -07:00
Anthony Miyaguchi | fda5aea0bf | Use rm -rf in pg_dump_by_day | 2020-03-12 18:27:00 -07:00
Anthony Miyaguchi | 013824acab | Skip missing days | 2020-03-12 17:30:58 -07:00
Anthony Miyaguchi | 73d883c790 | Wrap rm with if | 2020-03-12 17:25:41 -07:00
Anthony Miyaguchi | aec03d9772 | Remove unused lines | 2020-03-12 17:07:44 -07:00
Anthony Miyaguchi | a848b78ed9 | Make start and end dates configurable | 2020-03-12 17:04:20 -07:00
Anthony Miyaguchi | d66bb6886f | Skip BQ load by default | 2020-03-11 12:44:31 -07:00
Anthony Miyaguchi | e6ccf29c9f | Add project to table qualifier | 2020-03-11 12:43:51 -07:00
Anthony Miyaguchi | 4a0c6e6d47 | Add backfill process for a single week | 2020-02-27 17:17:15 -08:00
Anthony Miyaguchi | f9f653bec4 | Add ingest_date to output parquet since aggregates change over time | 2020-02-27 15:59:45 -08:00
Anthony Miyaguchi | 46530e7138 | Remove files if dump already exists locally | 2020-02-27 15:29:08 -08:00
Anthony Miyaguchi | ccb77ecd46 | Add simple script to show data | 2020-02-27 15:19:52 -08:00
Anthony Miyaguchi | 1f5ab907c4 | Fix build_ids | 2020-02-27 15:19:35 -08:00
Anthony Miyaguchi | f1de856eb7 | Update notebook for converting into parquet | 2020-02-27 15:14:30 -08:00
Anthony Miyaguchi | 81c5892bf1 | Update parquet processing for string aggregate | 2020-02-27 14:52:42 -08:00
Anthony Miyaguchi | 7863789eb9 | Add functional backfill script | 2020-02-27 13:48:03 -08:00
Anthony Miyaguchi | 9045b6bfb0 | Create a backfill script | 2020-02-27 13:12:53 -08:00
Anthony Miyaguchi | bd5108b737 | Add section for processing pg_dump into parquet | 2019-12-17 16:15:58 -08:00
Anthony Miyaguchi | a3091cd8dd | Add a script to process pg_dump into parquet | 2019-12-17 16:11:57 -08:00
Anthony Miyaguchi | 849a211bd1 | Update .gitignore | 2019-12-17 16:11:10 -08:00
Anthony Miyaguchi | 352aae7165 | Add notebook for exploration into parsing aggregates | 2019-12-17 14:43:06 -08:00
Anthony Miyaguchi | cb341be9ce | Add notebook for converting a day of dumps into parquet | 2019-12-16 17:25:42 -08:00
Anthony Miyaguchi | d6d0b1d055 | Add initial scripts for dumping mozaggregator | 2019-12-16 15:52:08 -08:00