Since it is footgun-prone, discourages upstreaming of useful development
tricks, and is unnecessary in an environment-variable-centric world.
The one remaining `BZ_API_URL` setting isn't actively used, and if this
changes in the future, it should be set via an environment variable
instead.
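Should it come back, a minimal sketch of reading it from the
environment in settings.py could look like the following (the fallback
URL is illustrative, not the project's actual default):

    import os

    # Read BZ_API_URL from the environment if set; the fallback value
    # here is purely illustrative.
    BZ_API_URL = os.environ.get("BZ_API_URL", "https://bugzilla.mozilla.org")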
Outputting to the console rather than a log file:
* is more user-friendly during development
* is more consistent with Heroku
* means the Vagrant-specific Django LOGGING config is now closer to the
one in settings.py, and so more easily combined with it (see the
sketch below)
Both gunicorn and celery default to outputting to stdout/stderr, so the
`logfile` options can be omitted entirely.
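As a rough sketch, a console-only Django LOGGING config looks
something like this (handler and level choices are illustrative
assumptions, not the exact settings.py contents):

    # Console-only logging; details are assumptions for illustration.
    LOGGING = {
        "version": 1,
        "disable_existing_loggers": False,
        "handlers": {
            "console": {"class": "logging.StreamHandler"},
        },
        "root": {
            "handlers": ["console"],
            "level": "INFO",
        },
    }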
In this commit, Sheriff access is still maintained in the
Treeherder DB, rather than using the scopes derived from
LDAP.
For local usage with Vagrant, this requires accessing
Treeherder with localhost instead of
local.treeherder.mozilla.org
Logging in to the Django Admin is not enabled in this
branch. To use the admin, you must first log in through
the normal Treeherder front-end. Then the admin will
be accessible if the user has the privileges to do so.
Persona login will still be technically possible through the
login.taskcluster.net site. But that choice will go away
shortly.
As a new contributor to Treeherder, I was confused about how to get
Treeherder to ingest several pushes. The celery worker appeared to
only ingest the last 10 pushes.
This commit enhances the "ingest_push" command to allow ingesting
the last N pushes. I've used this to ingest the last 100 pushes
to seed the database with sufficient pushlog data.
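As a sketch, the enhanced command can also be invoked from Python via
Django's call_command; the option name here is an assumption based on
the description above, so check the command's --help for the exact
flag:

    from django.core.management import call_command

    # Hypothetical usage: ingest the last 100 pushes for a repository.
    call_command("ingest_push", "mozilla-inbound", last_n_pushes=100)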
Now that Treeherder's data ingestion process doesn't hit its own API:
* `./manage.py runserver` is less susceptible to memory issues.
* The runserver/gunicorn process doesn't need to be running whilst the
data ingestion takes place.
Since it's only used for local testing, there's no need to populate
the description or set an owner. (They can always be added afterwards
via the Django admin UI.)
Prior to this, the command would fail if the provided owner email
address did not correspond to a valid user, which meant the added
hassle of creating a user beforehand if one did not already exist.
This makes it harder to inadvertently use HTTPS with local Vagrant
hostnames, and reduces the number of config variables users of the
client have to keep track of.
The docs have been tweaked to encourage people using production
Treeherder to just omit the `server_url` argument entirely, which
reduces the boilerplate, and also means they'll be less affected by
changes in the future.
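For example (a minimal sketch; the local URL/port is an assumption for
a typical Vagrant setup):

    from thclient import TreeherderClient

    # Production: omit server_url entirely and rely on the default.
    client = TreeherderClient()

    # Local Vagrant instance: pass the host explicitly.
    local_client = TreeherderClient(server_url="http://localhost:8000")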
Rename ``ingest_from_pulse`` management command to ``read_pulse_jobs`` to
indicate that this step does not actually do any ingesting. It just populates
the celery queue ``store_pulse_jobs`` that DOES do the actual ingesting.
Several people have not had logging set up when trying to debug
issues using the client/API, so instead of a helpful error message (eg
reminding them to sync their clock for Hawk auth), they get:
`No handlers could be found for logger "thclient.client"`
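Independent of whatever the client library itself now does, consumers
can avoid the silent failure by configuring a basic handler before
using the client, eg:

    import logging

    # Ensure messages from "thclient.client" are actually emitted.
    logging.basicConfig(level=logging.INFO)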
We're soon going to start blacklisting some default scripting User
Agents, to try and make accidental API abuse easier to trace back to the
source.
The docs are being updated in advance, so that the newsgroup posts have
the real pages available to link to.
For content that's not specifically related to only one of submitting or
retrieving data from the Treeherder API, to avoid duplication.
Ideally the submitting/retrieving sections would be nested under this
new REST API section, however there isn't a way to get Sphinx to do this
that doesn't then mis-display the subheadings:
http://stackoverflow.com/questions/25276415/prevent-sub-section-nesting-in-python-sphinx-when-using-toctree
Since npm 3 (which ships with nodejs 5+) fixes many of the previous
issues, making the third-party npm-shrinkwrap tool redundant (the tool
is also not compatible with npm 3).
New resultsets will still store a value in their ``revision_hash`` field, but it will
just be the same value as their ``long_revision`` field.
This will log an exception in New Relic when a new resultset or job is posted
to the API with only a ``revision_hash`` and not a ``revision`` value.
This also switches to using the longer 40 char revisions alongside the
12 char revisions, leveraging the longer ones for most actions. The
short revisions are stored and used so that people and the UI can still
locate a resultset (or set ranges) with short revisions.
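For illustration (values made up), the 12 char short form is simply a
prefix of the 40 char hash, as with Mercurial short hashes:

    long_revision = "a10a7b31e01d9a5a63a21b6e1c2b0f3d4e5f6a7b"
    short_revision = long_revision[:12]  # "a10a7b31e01d"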
Vagrant now supports a snapshots feature which makes this obsolete:
https://www.vagrantup.com/docs/cli/snapshot.html
By removing the 'scratch' VM config, we avoid confusing console
messages when performing Vagrant commands, eg:
[~/src/treeherder]$ vagrant provision
==> default: Running provisioner: puppet...
...
==> default: Notice: Finished catalog run in 14.75 seconds
==> scratch: VM not created. Moving on...
* Recommend using the same `client_id` for stage/prod.
* Mention the need to file a bug (and where) for requesting approval.
* Explain prod approval needs submission to be working on stage first.
Avoids this redirect seen in prod gunicorn logs:
[05/Jan/2016:05:55:42 -0800] "GET /docs HTTP/1.1" 301 -
"http://treeherder.readthedocs.org/retrieving_data.html"
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:46.0) Gecko/20100101 Firefox/46.0"
Also remove the duplication between the two pages, by having the
submitting data section not mention requesting credentials at all, and
leave that to the common tasks page instead.
If the worker is not running, any `apply_async()` calls are silently
thrown away, due to `ingest_push`'s use of `CELERY_ALWAYS_EAGER` and:
https://github.com/celery/celery/issues/2910
As such, running the worker after ingest_push doesn't help (since the
rabbitmq queues are empty) and so if people are interested in perf/log
data, then they must start the worker first instead.
In update.py, the line outputting revision.txt has to be moved later,
since the `dist/` directory won't exist until `grunt build` has run. In
addition, since `grunt build` removes the entire `dist/` directory, we
no longer need to manually remove *.gz.
We use the `--production` options for both `npm install` and
`grunt build`, so that the `devDependencies` in package.json are
ignored, and we only install/load the ones listed under `dependencies`
in package.json - since that's all that is required for the build.
We have to use `./node_modules/.bin/grunt` rather than `grunt`, since
grunt-cli is not installed globally on the treeherder admin machine for
greater isolation between stage and production.
Whilst the packages listed in package.json are pinned to exact versions,
they will have their own dependencies, which may be specified via
version ranges. In order to make production/local behaviour more
deterministic, these can be pinned too, using `npm shrinkwrap`.
However the stock shrinkwrap command has a few deficiencies, so we're
using a wrapper around it:
https://github.com/uber/npm-shrinkwrap
Note: Only packages listed under `dependencies` will be shrinkwrapped,
not those under `devDependencies`. This is because using the `--dev`
option (which would include the dev packages in npm-shrinkwrap.json)
means there would then be no way to exclude the dev packages when
installing in production.
For more information about shrinkwrap in general, see:
https://docs.npmjs.com/cli/shrinkwrap
http://tilomitra.com/why-you-should-use-npm-shrinkwrap/
https://nodejs.org/en/blog/npm/managing-node-js-dependencies-with-shrinkwrap/
And also:
* Explain the process for request/approval on stage/prod
* Remove the unnecessary export_project_credentials step in the
"add a new repository" section, since Treeherder's ETL no longer uses
credentials.json.
With the new per-user Hawk credentials, the same auth object can be
used for the whole session, so it should just be passed when
instantiating TreeherderClient.
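A minimal sketch (parameter names are assumptions rather than the
exact API; see the client docs):

    from thclient import TreeherderClient

    # Supply the per-user Hawk credentials once, at construction time,
    # instead of per request.
    client = TreeherderClient(client_id="my-client-id", secret="my-secret")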
Since they're not specific to the Django app 'webapp'.
Whilst we're there, the local & example settings files have been
renamed. In the future I'd like to combine settings_local.example.py
with puppet/files/treeherder/local.vagrant.py, but I'll do that in
another bug.
Heroku now generates it on deploy, and for stage/prod we generate it
fresh on the stage/prod branch and force push each time. As such, we
have no need for the directory on master, and by removing it we avoid
confusion when new contributors grep the repo.
As an added bonus, the stage/prod deploy script should fail if the dist
directory is missing, so the grunt build cannot be forgotten prior to
deploying. (Currently if it's forgotten, we end up deploying the ancient
dist directory from master that was last updated prior to us switching
to the new deployment strategy.)
This documentation instructs a user on how to set up their local
machine to ingest data from existing exchanges, as well as post to
their own exchange to test their jobs.
The MPL 2.0 terms state that as long as a LICENSE file is present, the
per-file header text is not required. See "Exhibit A" at the end of:
https://www.mozilla.org/MPL/2.0/
Since it can be performed whilst `vagrant up` is running. And although
modifying the hosts file is not necessary for running the tests, most
people will still want to do it.
Created using `isort -p tests -rc .` and a couple of manual tweaks.
The order is:
* futures
* std library
* third party packages
* local imports
* relative local imports
...with each group ordered with "import x" before "from x import y", and
then alphabetically.
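An illustrative (hypothetical) module header following that order;
module names are placeholders, not from the actual diff:

    from __future__ import unicode_literals

    import os
    from datetime import datetime

    import requests
    from django.conf import settings

    from treeherder.model import models

    from .base import TestBase  # assumes the module lives in a package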
* Reinforce push must be < 4 hours old
* s/mozilla-central/mozilla-inbound/ (better example, since mozilla-inbound is more likely to have pushes less than 4 hours old)
Since bug 1140349, the objectstore endpoint has been deprecated, and
performs the same function as the jobs endpoint. Now that there are no
remaining submitters to it, let's remove it.
After the previous commit, the Objectstore is effectively "dead code".
So this commit removes all the dead code after anything left over in
the Objectstore has been drained and added to the DB.
This adds the ability to specify a custom log name and have the log
viewer use the ``logname`` param of the ``text_log_summary`` to get the
right log.
This also improves the error message returned by the /logslice/ API if a
log name is used that is not found.
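A hedged sketch of exercising the endpoint; the query parameter names
and values are assumptions based on the description above, not the
confirmed API contract:

    import requests

    resp = requests.get(
        "https://treeherder.mozilla.org/api/project/mozilla-inbound/logslice/",
        params={"job_id": 123456, "logname": "my_custom_log",
                "start_line": 0, "end_line": 100},
    )
    print(resp.status_code)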