Update ci config, dockerfile, makefile
Parent
f66f13ca12
Commit
3747e0b5d9
@@ -1,5 +1,16 @@
 version: 2.1
 
+commands:
+  restore-docker:
+    description: Restore Docker image cache
+    steps:
+      - setup_remote_docker
+      - restore_cache:
+          key: v1-{{.Branch}}
+      - run:
+          name: Restore Docker image cache
+          command: docker load -i /cache/docker.tar
+
 jobs:
   build:
     docker:
@@ -19,18 +30,21 @@ jobs:
             - /cache/docker.tar
 
   test:
-    docker:
+    docker: &docker
       - image: docker:18.06.0-ce
     steps:
-      - setup_remote_docker
-      - restore_cache:
-          key: v1-{{.Branch}}
-      - run:
-          name: Restore Docker image cache
-          command: docker load -i /cache/docker.tar
+      - restore-docker
       - run:
           name: Test Code
-          command: docker run app:build test
+          command: docker run app:build make test
 
+  lint:
+    docker: *docker
+    steps:
+      - restore-docker
+      - run:
+          name: Lint Code
+          command: docker run app:build make lint
+
 workflows:
   main:
@@ -40,3 +54,7 @@ workflows:
       - test:
           requires:
             - build
+
+      - lint:
+          requires:
+            - build
@@ -0,0 +1,3 @@
+[flake8]
+max-line-length = 100
+exclude = venv/*
10
Dockerfile
@@ -8,10 +8,14 @@ ARG HOME="/app"
 
 ENV HOME=${HOME}
 RUN groupadd --gid ${USER_ID} ${GROUP_ID} && \
-    useradd --create-home --uid ${USER_ID} --gid ${GROUP_ID} --home-dir /app ${GROUP_ID}
+    useradd --create-home --uid ${USER_ID} --gid ${GROUP_ID} --home-dir ${HOME} ${GROUP_ID}
 
 RUN pip install --upgrade pip
 
+COPY requirements.txt ./
+COPY requirements.dev.txt ./
+RUN pip install -r requirements.dev.txt
+
 WORKDIR ${HOME}
 
 COPY requirements.txt ./
@@ -20,8 +24,8 @@ RUN pip install -r requirements.dev.txt
 
 COPY . .
 
+RUN pip install .
+
 # Drop root and change ownership of the application folder to the user
 RUN chown -R ${USER_ID}:${GROUP_ID} ${HOME}
 USER ${USER_ID}
-
-ENTRYPOINT ["/app/entrypoint"]
@@ -0,0 +1,11 @@
+.PHONY: install lint test
+
+install:
+	pip install -r requirements.dev.txt
+	pip install .
+
+lint:
+	flake8
+
+test:
+	pytest tests/
55
README.md
@@ -2,13 +2,58 @@
 
 This Play Store export is a job to schedule backfills of Play Store data to BigQuery via the BigQuery Data Transfer service.
 
-The purpose of this job is to continuously backfill past days over time.
-Past Play Store data has been found to still update over time
-(e.g. data from a day two weeks ago can still be updated)
+The purpose of this job is to be scheduled to run regularly in order to continuously backfill past days over time.
+Past Play Store data has been found to still update over time (e.g. data from a day two weeks ago can still be updated)
 so regular backfills of at least 30 days are required.
-The BigQuery Play Store transfer job has a non-configurable refresh
-window size of 7 days which is insufficient.
+This is an issue with the retained installers metric in particular.
+The BigQuery Play Store transfer job has a non-configurable refresh window size of 7 days which is insufficient.
 
+These scripts require that a Play Store transfer config already exists and the current gcloud user has
+permission to create jobs in the project.
+
+See [Google Play transfers documentation](https://cloud.google.com/bigquery-transfer/docs/play-transfer) for more details.
+
+## Usage
+
+Start a backfill using the `python3 play_store_export/export.py` script:
+```sh
+usage: export.py [-h] --date DATE --project PROJECT --transfer-config
+                 TRANSFER_CONFIG [--transfer-location TRANSFER_LOCATION]
+                 [--backfill-day-count BACKFILL_DAY_COUNT]
+
+optional arguments:
+  -h, --help            show this help message and exit
+  --date DATE           Date at which the backfill will start, going backwards
+  --project PROJECT     Either the project that the source GCS project belongs
+                        to or the project that contains the transfer config
+  --transfer-config TRANSFER_CONFIG
+                        ID of the transfer config. This should be a UUID.
+  --transfer-location TRANSFER_LOCATION
+                        Region of the transfer config (defaults to `us`)
+  --backfill-day-count BACKFILL_DAY_COUNT
+                        Number of days to backfill
+```
+
+## Develop
+
 This project uses the BigQuery Data Transfer Python library:
 https://googleapis.dev/python/bigquerydatatransfer/latest/index.html
+
+Install python dependencies with:
+```sh
+make install
+```
+
+Run tests with:
+```sh
+make test
+```
+
+Run linter with:
+```sh
+make lint
+```
+
+A script to cancel all running transfer jobs exists in `play_store_export/cancel_transfers.py`
+which may be useful during development and testing.
 
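The usage text added to the README in this hunk can be read as an `argparse` declaration. The sketch below is reconstructed only from that help output and is not the contents of `export.py`: the names `build_parser` and `parse_date`, the `YYYY-MM-DD` date format, the default of 35 days (the value this commit removes from the `start_export` signature), and the example argument values are all assumptions.

```python
import argparse
import datetime


def parse_date(value: str) -> datetime.date:
    """Parse a calendar date; the YYYY-MM-DD format is an assumption."""
    return datetime.datetime.strptime(value, "%Y-%m-%d").date()


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed in the README usage text."""
    parser = argparse.ArgumentParser(prog="export.py")
    # Flags shown without brackets in the usage line are required.
    parser.add_argument("--date", type=parse_date, required=True,
                        help="Date at which the backfill will start, going backwards")
    parser.add_argument("--project", required=True,
                        help="Either the project that the source GCS project belongs "
                             "to or the project that contains the transfer config")
    parser.add_argument("--transfer-config", required=True,
                        help="ID of the transfer config. This should be a UUID.")
    # The README states that the location defaults to `us`.
    parser.add_argument("--transfer-location", default="us",
                        help="Region of the transfer config (defaults to `us`)")
    # Assumed default: the README does not state one.
    parser.add_argument("--backfill-day-count", type=int, default=35,
                        help="Number of days to backfill")
    return parser


# Example invocation with placeholder values (not real project or config IDs).
args = build_parser().parse_args([
    "--date", "2020-06-30",
    "--project", "example-project",
    "--transfer-config", "00000000-0000-0000-0000-000000000000",
])
print(args.date, args.transfer_location, args.backfill_day_count)
```

Passing an explicit argument list to `parse_args` keeps the example runnable without a command line; a real script would call `parse_args()` with no arguments.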
@@ -1,9 +0,0 @@
-#!/bin/bash
-
-set -e
-
-if [ "$1" = test ]; then
-    exec pytest "${@:2}"
-else
-    exec "$@"
-fi
@@ -103,7 +103,7 @@ def wait_for_transfer(transfer_name: str, timeout: int = 1200, polling_period: i
 
 
 def start_export(project: str, transfer_config_name: str, transfer_location: str,
-                 base_date: datetime.date, backfill_day_count: int = 35):
+                 base_date: datetime.date, backfill_day_count: int):
     """
     Start and wait for the completion of a backfill of `backfill_day_count` days, counting
     backwards from `base_date. The base date is included in the backfill and counts as a
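The `start_export` docstring in this hunk describes a window of `backfill_day_count` days counted backwards from `base_date`, with the base date included. A minimal sketch of that date arithmetic follows; `backfill_dates` is a hypothetical helper for illustration, and whether `export.py` builds an explicit list of days or passes a date range to the transfer service is not visible in this diff.

```python
import datetime
from typing import List


def backfill_dates(base_date: datetime.date, backfill_day_count: int) -> List[datetime.date]:
    """Return the days covered by a backfill, newest first.

    Hypothetical helper: the base date is included, so a count of 1
    yields only the base date itself.
    """
    if backfill_day_count < 1:
        raise ValueError("backfill_day_count must be at least 1")
    return [
        base_date - datetime.timedelta(days=offset)
        for offset in range(backfill_day_count)
    ]


# A 35-day window ending on 2020-06-30 reaches back to 2020-05-27.
dates = backfill_dates(datetime.date(2020, 6, 30), 35)
print(dates[0], dates[-1], len(dates))
```

Because the base date is included, the oldest day is `backfill_day_count - 1` days before it, not `backfill_day_count`.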
@@ -1,2 +1,3 @@
 flake8==3.8.2
 pytest==5.4.3
+-r requirements.txt
@@ -0,0 +1,26 @@
+#!/usr/bin/env python
+
+# -*- coding: utf-8 -*-
+
+# This Source Code Form is subject to the terms of the Mozilla Public
+# License, v. 2.0. If a copy of the MPL was not distributed with this
+# file, You can obtain one at http://mozilla.org/MPL/2.0/.
+
+from setuptools import setup, find_packages
+
+readme = open("README.md").read()
+
+setup(
+    name="play_store_export",
+    description="Scripts to export Play Store app data to BigQuery using Transfer Service",
+    author="Ben Wu",
+    author_email="bewu@mozilla.com",
+    url="https://github.com/mozilla/leanplum_data_export",
+    packages=find_packages(include=["play_store_export"]),
+    package_dir={"play-store-export": "play_store_export"},
+    python_requires=">=3.6.0",
+    version="0.1.0",
+    long_description=readme,
+    include_package_data=True,
+    license="Mozilla",
+)