diff --git a/docker-compose.yml b/docker-compose.yml index 011bdae27..2d5828b26 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -26,6 +26,7 @@ services: - SITE_URL=http://backend:8000/ - TREEHERDER_DEBUG=True - NEW_RELIC_INSIGHTS_API_KEY=${NEW_RELIC_INSIGHTS_API_KEY:-} + - PROJECTS_TO_INGEST=${PROJECTS_TO_INGEST:-autoland,try} entrypoint: './docker/entrypoint.sh' # We *ONLY* initialize the data when we're running the backend command: './initialize_data.sh ./manage.py runserver 0.0.0.0:8000' @@ -102,7 +103,7 @@ services: context: . dockerfile: docker/dev.Dockerfile environment: - - PULSE_URL=${PULSE_URL:-amqp://docker-shared-user:8r5VFxpJHtJahTVV5bYutykgDsXhGF@pulse.mozilla.org:5671/?ssl=1} + - PULSE_URL=${PULSE_URL:-amqp://docker-shared-user:oGv7P5%H94@pulse.mozilla.org:5671/?ssl=1} - LOGGING_LEVEL=INFO - PULSE_AUTO_DELETE_QUEUES=True - DATABASE_URL=mysql://root@mysql:3306/treeherder @@ -123,7 +124,7 @@ services: environment: - CELERY_BROKER_URL=amqp://guest:guest@rabbitmq:5672// - DATABASE_URL=mysql://root@mysql:3306/treeherder - - PROJECTS_TO_INGEST=${PROJECTS_TO_INGEST:-autoland} + - PROJECTS_TO_INGEST=${PROJECTS_TO_INGEST:-autoland,try} entrypoint: './docker/entrypoint.sh' command: celery worker -A treeherder --uid=nobody --gid=nogroup --without-gossip --without-mingle --without-heartbeat -Q store_pulse_pushes,store_pulse_tasks --concurrency=1 --loglevel=WARNING volumes: diff --git a/docs/installation.md b/docs/installation.md index 60bb258cb..66acc27ba 100644 --- a/docs/installation.md +++ b/docs/installation.md @@ -67,6 +67,9 @@ To get started: ### Starting a local Treeherder instance +By default, data will be ingested from the `autoland` and `try` repositories using a shared Pulse account. However, if you want to use your own Pulse account or change the repositories being ingested, you need to export those environment variables in the shell first or inline with the command below.
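For example, an inline override might look like the following (illustrative only: `mozilla-central` is a sample repository choice, and the Pulse credential placeholders must be replaced with your own account's values):

```shell
# Override the ingested repositories and, optionally, the Pulse account inline:
PROJECTS_TO_INGEST=autoland,mozilla-central \
PULSE_URL='amqp://<user>:<password>@pulse.mozilla.org:5671/?ssl=1' \
docker-compose up
```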
See [Pulse Ingestion Configuration](pulseload.md#pulse-ingestion-configuration) for more details. + - Open a shell, cd into the root of the Treeherder repository, and type: ```bash @@ -78,10 +81,7 @@ To get started: - Visit in your browser (NB: not port 8000). Both Django's runserver and webpack-dev-server will automatically refresh every time there's a change in the code. - - -!!! note - There will be no data to display until the ingestion tasks are run. +Proceed to [Running the ingestion tasks](#running-the-ingestion-tasks) to get data. ### Using the minified UI @@ -106,6 +106,8 @@ production UI from `.build/` instead. In addition to being minified and using the non-debug versions of React, the assets are served with the same `Content-Security-Policy` header as production. +Proceed to [Running the ingestion tasks](#running-the-ingestion-tasks) to get data. ### Running full stack with a custom DB setting If you want to develop both the frontend and backend, but have the database pointing to @@ -121,7 +123,7 @@ Alternatively, you can `export` that value in your terminal prior to executing `docker-compose up` or just specify it on the command line as you execute: ```bash -DATABASE_URL=mysql://user:password@hostname/treeherder docker-compose up +DATABASE_URL=mysql://user:password@hostname/treeherder SKIP_INGESTION=True docker-compose up ``` @@ -143,19 +145,28 @@ docker volume rm treeherder_mysql_data ### Running the ingestion tasks -Ingestion tasks populate the database with version control push logs, queued/running/completed jobs & output from log parsing, as well as maintain a cache of intermittent failure bugs. To run these: +Celery tasks include storing pushes and tasks, parsing logs (which provides failure lines and performance data), and generating alerts (for Perfherder). You can either run all the queues or only specific queues. -- Start up a celery worker to process async tasks: +Open a new shell tab.
To run all the queues, type: - ```bash - docker-compose run backend celery -A treeherder worker --concurrency 1 - ``` +```bash +docker-compose run -e PROJECTS_TO_INGEST=autoland backend celery -A treeherder worker --concurrency 1 +``` -- Then in a new terminal window, run `docker-compose run backend bash`, and follow the steps from the [loading pulse data](pulseload.md) page. + +!!! note + If you skip the `PROJECTS_TO_INGEST` variable, this command will store pushes and tasks from all repositories, and it will take a very long time for data to load. However, you can change it to ingest from other repositories if you wish. + +You can find the list of Celery queues in the `CELERY_TASK_QUEUES` variable in the [settings] file. For instance, to +only store tasks and pushes and omit log parsing: + +```bash +docker-compose run -e PROJECTS_TO_INGEST=autoland backend celery -A treeherder worker -Q store_pulse_tasks,store_pulse_pushes --concurrency 1 +``` ### Manual ingestion -`NOTE`; You have to include `--root-url https://community-tc.services.mozilla.com` in order to ingest from the [Taskcluster Community instance](https://community-tc.services.mozilla.com), otherwise, it will default to the Firefox CI. +`NOTE`: You have to include `--root-url https://community-tc.services.mozilla.com` in order to ingest from the [Taskcluster Community instance](https://community-tc.services.mozilla.com); otherwise, it will default to the Firefox CI. Open a terminal window and run `docker-compose up`. All of the following sections assume this step.
@@ -216,3 +227,4 @@ Continue to **Working with the Server** section after looking at the [Code Style [yarn]: https://yarnpkg.com/en/docs/install [package.json]: https://github.com/mozilla/treeherder/blob/master/package.json [eslint]: https://eslint.org +[settings]: https://github.com/mozilla/treeherder/blob/master/treeherder/config/settings.py#L318 diff --git a/docs/pulseload.md b/docs/pulseload.md index aae4a4fa6..b08c7af20 100644 --- a/docs/pulseload.md +++ b/docs/pulseload.md @@ -1,7 +1,7 @@ -# Loading Pulse data +# Pulse Ingestion Configuration By default, running the Docker container with `docker-compose up` will ingest data -from the `autoland` repo using a shared [Pulse Guardian] user. You can configure this the following ways: +from the `autoland` and `try` repositories using a shared [Pulse Guardian] user. You can configure this in the following ways: 1. Specify a custom set of repositories for which to ingest data 2. Create a custom **Pulse User** on [Pulse Guardian] @@ -42,26 +42,6 @@ See [Starting a local Treeherder instance] for more info. [starting a local treeherder instance]: installation.md#starting-a-local-treeherder-instance -## Advanced Celery Configuration - -If you only want to ingest the Pushes and Tasks, then the default will do that for you. -But if you want to do other processing like parsing logs, etc, then you can specify the other queues -you would like to process. - -Open a new terminal window. To run all the queues do: - -```bash -docker-compose run backend celery -A treeherder worker --concurrency 1 -``` - -You will see a list of activated queues. If you wanted to narrow that down, then note -which queues you'd like to run and add them to a comma-separated list.
For instance, to -only do Log Parsing for sheriffed trees (autoland, mozilla-*): - -```bash -docker-compose run backend celery -A treeherder worker -Q log_parser,log_parser_fail_raw_sheriffed,log_parser_fail_json_sheriffed --concurrency 1 -``` - ## Posting Data To post data to your own **Pulse** exchange, you can use the `publish_to_pulse` @@ -82,3 +62,4 @@ ex: [pulse guardian]: https://pulseguardian.mozilla.org/whats_pulse [yml schema]: https://github.com/mozilla/treeherder/blob/master/schemas/pulse-job.yml +[settings]: https://github.com/mozilla/treeherder/blob/master/treeherder/config/settings.py#L318 diff --git a/tests/etl/test_job_loader.py b/tests/etl/test_job_loader.py index 2a429b220..8ba15c471 100644 --- a/tests/etl/test_job_loader.py +++ b/tests/etl/test_job_loader.py @@ -5,10 +5,10 @@ import pytest import responses import slugid -from treeherder.etl.exceptions import MissingPushException from treeherder.etl.job_loader import JobLoader from treeherder.etl.taskcluster_pulse.handler import handleMessage from treeherder.model.models import Job, JobLog, TaskclusterMetadata +from django.core.exceptions import ObjectDoesNotExist @pytest.fixture @@ -233,7 +233,7 @@ def test_ingest_pulse_jobs_with_missing_push(pulse_jobs): status=200, ) - with pytest.raises(MissingPushException): + with pytest.raises(ObjectDoesNotExist): for pulse_job in pulse_jobs: jl.process_job(pulse_job, 'https://firefox-ci-tc.services.mozilla.com')
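For context on the final hunk above: Django's ORM raises a model-specific `DoesNotExist` exception that subclasses `django.core.exceptions.ObjectDoesNotExist`, which is why `pytest.raises(ObjectDoesNotExist)` still catches the missing-push case once the custom `MissingPushException` is dropped. A minimal pure-Python sketch of that exception hierarchy (stand-in classes for illustration, not Treeherder's actual models):

```python
# Stand-in for django.core.exceptions.ObjectDoesNotExist (illustration only).
class ObjectDoesNotExist(Exception):
    """Base class of every model-specific DoesNotExist in Django."""


class Push:
    """Pretend model: Django auto-generates a subclass like Push.DoesNotExist."""

    class DoesNotExist(ObjectDoesNotExist):
        pass

    _rows = {}  # pretend database table

    @classmethod
    def get(cls, revision):
        try:
            return cls._rows[revision]
        except KeyError:
            # Mirrors Model.objects.get() raising the per-model subclass.
            raise cls.DoesNotExist(f"Push matching {revision!r} does not exist.")


# Catching the generic base class also catches the subclass, so a test
# asserting on ObjectDoesNotExist keeps passing:
try:
    Push.get("deadbeef")
except ObjectDoesNotExist as exc:
    caught = type(exc).__name__

print(caught)  # prints: DoesNotExist
```

This is why the test can rely on Django's built-in exception instead of the project maintaining its own `MissingPushException`.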