Leanplum Data Export

This repository contains the job that exports Mozilla's Leanplum data into BigQuery.

It's dockerized to run on GKE. To run locally:

pip install .
leanplum-data-export export-leanplum \
  --app-id $LEANPLUM_APP_ID \
  --client-key $LEANPLUM_CLIENT_KEY \
  --date 20190101 \
  --bucket gcs-leanplum-export \
  --table-prefix leanplum \
  --bq-dataset dev_external \
  --prefix dev
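
The example assumes your Leanplum app ID and client key are exported in your shell; the values below are placeholders:

export LEANPLUM_APP_ID="your-leanplum-app-id"
export LEANPLUM_CLIENT_KEY="your-leanplum-client-key"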

Doing it this way will, by default, use your local GCP credentials; note that GCP only allows you to authenticate this way a limited number of times.
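
If you haven't set up local credentials yet, one way to do so (assuming you have the gcloud CLI installed) is:

gcloud auth application-default login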

Alternatively, run in Docker.

First, create a service account with access to GCS and BigQuery, download a JSON key file for it, and make the key available in your environment as GCLOUD_SERVICE_KEY.
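A minimal sketch of that setup with the gcloud CLI follows; the service account name, project ID, and roles are placeholders, and whether GCLOUD_SERVICE_KEY should hold the key file's path or its contents depends on how your environment consumes it, so treat the last line as an assumption.

gcloud iam service-accounts create leanplum-export --display-name="Leanplum export"
gcloud projects add-iam-policy-binding your-project-id \
  --member="serviceAccount:leanplum-export@your-project-id.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataEditor"
gcloud projects add-iam-policy-binding your-project-id \
  --member="serviceAccount:leanplum-export@your-project-id.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
gcloud iam service-accounts keys create key.json \
  --iam-account=leanplum-export@your-project-id.iam.gserviceaccount.com
export GCLOUD_SERVICE_KEY="$(cat key.json)"   # assumption: key contents, not the file path

Then run: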

bq mk leanplum
make run COMMAND="leanplum-data-export export-leanplum \
  --app-id $LEANPLUM_APP_ID \
  --client-key $LEANPLUM_CLIENT_KEY \
  --date 20190101 \
  --bucket gcs-leanplum-export \
  --table-prefix leanplum \
  --bq-dataset leanplum \
  --prefix dev"

That will create the dataset in BigQuery, download the exported files to the GCS bucket, and make them available in that dataset as external tables.
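
To spot-check the result, you can list the tables in the dataset with the bq CLI; the exact table names depend on the --table-prefix and --prefix values you passed:

bq ls leanplum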

Development and Testing

While iterating on development, we recommend using virtualenv to run the tests locally.

Run tests locally

Install requirements locally:

python3 -m virtualenv venv
source venv/bin/activate
make install-requirements

Run tests locally:

pytest tests/
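
The repository also includes a flake8 configuration (.flake8). Assuming flake8 was installed by the requirements step above, you can run the same lint checks locally:

flake8 leanplum_data_export/ tests/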

Run tests in Docker

You can run the tests the same way CI does by building the container and running them inside it.

make clean && make build
make test

Deployment

This project deploys automatically to Docker Hub. The latest release is used to run the job.