Loading Pulse data
==================

When ingesting from **Pulse** exchanges on your local machine, you can choose
to ingest from any exchange you like. Some exchanges are registered in
``settings.py`` for use by the Treeherder servers. You can use those to get the
same data as Treeherder, or you can specify your own and experiment with
posting your own data.

The Simple Case
---------------

If you just want to get the same data that Treeherder gets, there are 5 steps:

1. Create a user on [Pulse Guardian] if you don't already have one
2. Create your ``PULSE_URL`` string
3. Open a Vagrant terminal to read Pushes
4. Open a Vagrant terminal to read Jobs
5. Open a Vagrant terminal to run **Celery**

### 1. Pulse Guardian

Visit [Pulse Guardian], sign in, and create a **Pulse User**. It will ask you to
set a username and password. Remember these, as you'll use them in the next step.
**Pulse** doesn't support creating queues with a guest account, so this step is
necessary.

### 2. Environment Variable

If your **Pulse User** has the username ``foo`` and the password ``bar``, your
config string would be:

```bash
PULSE_URL="amqp://foo:bar@pulse.mozilla.org:5671/?ssl=1"
```
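If your real credentials contain characters that are special in URLs (such as ``@`` or ``/``), they will probably need percent-encoding before they go into the string. The following is a minimal sketch of that idea; ``build_pulse_url`` is a hypothetical helper, not part of Treeherder, and it assumes the AMQP client decodes percent-encoded credentials the way URL parsers generally do. Calling ``build_pulse_url("foo", "bar")`` yields the same string as the example above.

```python
from urllib.parse import quote


def build_pulse_url(username, password, host="pulse.mozilla.org", port=5671):
    """Return an AMQP URL in the same shape as the PULSE_URL example.

    Hypothetical helper for illustration only; the default host and port
    are copied from the example string.
    """
    # Percent-encode the credentials so characters such as "@" or "/"
    # cannot be mistaken for URL delimiters.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"amqp://{user}:{secret}@{host}:{port}/?ssl=1"
```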

### 3. Read Pushes

```eval_rst
.. note:: Be sure your Vagrant environment is up-to-date. Reload it and run ``vagrant provision`` if you're not sure.
```

``ssh`` into Vagrant, then set your config environment variable:

```bash
export PULSE_URL="amqp://foo:bar@pulse.mozilla.org:5671/?ssl=1"
```

Next, run the Treeherder management command to read Pushes from the default **Pulse**
exchange:

```bash
./manage.py read_pulse_pushes
```

You will see a list of the exchanges your queue is bound to and a message for
each push as it is read. This process does not ingest the push into Treeherder;
it only adds each Push message to a local **Celery** queue. The queued messages
are ingested in step 5.

### 4. Read Jobs

As in step 3, open a Vagrant terminal and export your ``PULSE_URL``
variable. Then run the following management command:

```bash
./manage.py read_pulse_jobs
```

You will again see the list of exchanges that your queue is now bound to and
a message for each Job as it is read into your local **Celery** queue.

### 5. Celery

Open your next Vagrant terminal. You don't need to set your environment variable
in this one. Just run **Celery**:

```bash
celery -A treeherder worker -B --concurrency 5
```

That's it! With those processes running, you will begin ingesting Treeherder
data. To see the data, you will need to run the Treeherder UI and API.
See [Running the unminified UI with Vagrant] for more info.

[Running the unminified UI with Vagrant]: ui/installation.html#running-the-unminified-ui-with-vagrant

Advanced Configuration
----------------------

### Changing which Data to Ingest

``treeherder.services.pulse.sources`` provides default sources for both Jobs and Pushes.
However, you can override these defaults using the standard environment variables shown below.

#### Pushes

``PULSE_PUSH_SOURCES`` defines a list of dictionaries with exchange and routing key strings.
To change which exchanges you listen to for pushes, modify ``PULSE_PUSH_SOURCES``.
For instance, to get only GitHub pushes for Bugzilla, you would set:

```bash
export PULSE_PUSH_SOURCES='[{"exchange": "exchange/taskcluster-github/v1/push","routing_keys": ["bugzilla#"]}]'
```

#### Jobs

``PULSE_JOB_EXCHANGES`` defines a list of exchanges to listen to.

```bash
export PULSE_JOB_EXCHANGES="exchange/taskcluster-treeherder/v1/jobs,exchange/fxtesteng/jobs"
```

``PULSE_JOB_PROJECTS`` defines a list of projects to listen to.

```bash
export PULSE_JOB_PROJECTS="try,mozilla-central"
```

The source settings are combined such that all `projects` are applied to **each** `exchange`.
The example settings above would produce the following settings:

```python
[{
    "exchange": "exchange/taskcluster-treeherder/v1/jobs",
    "projects": [
        "try",
        "mozilla-central",
    ],
}, {
    "exchange": "exchange/fxtesteng/jobs",
    "projects": [
        "try",
        "mozilla-central",
    ],
}]
```
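To make the combination rule concrete, here is a minimal sketch of how two comma-separated values could be expanded into that structure. It is not Treeherder's actual implementation; ``build_job_sources`` is a hypothetical name, and the comma-splitting behavior is an assumption based on the example values above.

```python
def build_job_sources(exchanges_value, projects_value):
    """Pair every exchange with the full list of projects.

    Hypothetical helper: both arguments are comma-separated strings, in the
    same shape as the PULSE_JOB_EXCHANGES and PULSE_JOB_PROJECTS examples.
    """
    exchanges = [e.strip() for e in exchanges_value.split(",") if e.strip()]
    projects = [p.strip() for p in projects_value.split(",") if p.strip()]
    # Each exchange gets its own copy of the project list.
    return [{"exchange": e, "projects": list(projects)} for e in exchanges]


sources = build_job_sources(
    "exchange/taskcluster-treeherder/v1/jobs,exchange/fxtesteng/jobs",
    "try,mozilla-central",
)
```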

### Advanced Celery options

If you only want to ingest the Pushes and Jobs, but don't care about log parsing
and all the other processing Treeherder does, then you can limit the **Celery**
worker to the queues you need:

```bash
celery -A treeherder worker -B -Q pushlog,store_pulse_jobs,store_pulse_resultsets --concurrency 5
```

* The ``pushlog`` queue loads up to the last 10 Mercurial pushes that exist.
* The ``store_pulse_resultsets`` queue will ingest all the pushes from the exchanges
  specified in ``PULSE_PUSH_SOURCES``. These can be Mercurial and GitHub pushes.
* The ``store_pulse_jobs`` queue will ingest all the jobs from the exchanges
  specified in ``PULSE_JOB_EXCHANGES``.

```eval_rst
.. note:: Any job that comes from **Pulse** that does not have an associated push will be skipped.

.. note:: The ``store_pulse_resultsets`` name may look confusing. It exists for legacy reasons and will change to ``store_pulse_pushes`` at some point.
```

Posting Data
------------

To post data to your own **Pulse** exchange, you can use the ``publish_to_pulse``
management command. This command takes the ``routing_key``, ``connection_url``
and ``payload_file``. The payload file must be a ``JSON`` representation of
a job as specified in the [YML Schema].

Here is a set of example parameters that could be used to run it:

```bash
./manage.py publish_to_pulse mozilla-inbound.staging amqp://treeherder-test:mypassword@pulse.mozilla.org:5672/ ./scratch/test_job.json
```
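Before publishing, it can help to confirm that the payload file at least parses as JSON. The sketch below is a hypothetical pre-flight check, not part of Treeherder: ``parse_job_payload`` is an invented name, and it assumes a job payload is a single JSON object. It does not validate against the schema itself. You could call it on the contents of your payload file, for example ``parse_job_payload(open("./scratch/test_job.json").read())``.

```python
import json


def parse_job_payload(text):
    """Parse payload text and return it as a dict, or raise ValueError."""
    # json.JSONDecodeError is a subclass of ValueError, so malformed JSON
    # surfaces as a ValueError as well.
    payload = json.loads(text)
    # Assumption: a job payload is one JSON object, not a list or scalar.
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    return payload
```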

You can use the handy [Pulse Inspector] to view messages in your exchange to
test that they are arriving at Pulse the way you expect.

[Pulse Guardian]: https://pulseguardian.mozilla.org/whats_pulse
[Pulse Inspector]: https://tools.taskcluster.net/pulse-inspector/
[YML Schema]: https://github.com/mozilla/treeherder/blob/master/schemas/pulse-job.yml