Commit Graph

95 Commits

Author SHA1 Message Date
Ed Morley b2ecb99390 Bug 1213230 - Remove peep.py
Since it's now unused.
2016-02-15 12:06:17 +00:00
Ed Morley aff281fa78 Bug 1246208 - Heroku: Remove manual collectstatic workaround
The Python buildpack has now rewritten the automatic collectstatic
feature, such that it no longer does an unnecessary (and time-consuming)
dry-run every time. As such, we can switch back to the buildpack's
automatic collectstatic:
https://github.com/heroku/heroku-buildpack-python/blob/master/bin/steps/collectstatic

Prior to this landing, I'll update us to the latest version of the
buildpack (using the `heroku buildpacks:set -i X ...` command) and
remove the `DISABLE_COLLECTSTATIC` environment variable.

[ci skip]
2016-02-08 10:21:12 +00:00
Ed Morley 6080a6cf95 Bug 1242471 - Heroku: Remove now obsolete node PATH workaround
Since the original issue has since been fixed.
2016-01-26 12:45:44 +00:00
Ed Morley 32222f4fe2 Bug 1242471 - Heroku: Remove nodejs files at the end of the compile step
Since once the `grunt build` has run they are no longer required, and
only serve to bloat the slug size, increase attack surface & overwrite
the Python .profile.d script's environment variables.
2016-01-26 12:45:44 +00:00
Cameron Dawson f3c0258a13 Bug 1241583 - celery scripts include autoclassify tasks that shouldn't be currently running
These tasks should not be currently running on stage and prod.  They
will be activated in a later PR.

r=emorley@mozilla.com
2016-01-21 11:02:51 -08:00
Ed Morley 8c5e12eadb Bug 1241144 - Update peep from v2.5.0 to v3.0.0
Is now compatible with pip 8.x.

https://github.com/erikrose/peep/releases/tag/3.0
https://github.com/erikrose/peep/compare/2.5...3.0
2016-01-21 11:12:05 +00:00
Ed Morley e98ff52d90 Bug 1230104 - Update peep from v2.4.1 to v2.5.0
* Is now compatible with pip 7.x
* Adds a `peep port` command to facilitate the transition to pip 8's
hashing feature:
https://pip.pypa.io/en/latest/reference/pip_install/#hash-checking-mode
* Fixes bug in which the right way to call `parse_requirements()` would
not be autodetected.

https://github.com/erikrose/peep/compare/2.4.1...2.5
2015-12-03 10:50:39 +00:00
William Lachance a8e663d61d Bug 1228154 - Generate new performance alerts as data is ingested 2015-12-02 13:20:17 -05:00
Ed Morley 620228cbf3 Bug 1196764 - Rename calculate_eta to calculate_durations
Since we're not calculating ETAs (the UI does that once it knows the
start time and expected duration), we're calculating recent average
durations instead.
2015-11-30 11:36:18 +00:00
Alice Scarpa 5d9e430cac Bug 1194830: Add a runnable_job API endpoint
This creates a 'runnable_job' table in the database, as well as an API
endpoint at /api/project/{branch}/runnable_jobs listing all existing
buildbot jobs and their symbols. A new daily task 'fetch_allthethings' is
added to update this table.
2015-11-14 13:56:06 -02:00
Ed Morley 50cec5c1fe Bug 1221806 - Change the stage/prod Gunicorn timeout from 120s to 30s
To help prevent excessively demanding queries from having a detrimental
effect on the API/DBs.
2015-11-05 14:22:30 +00:00
Ed Morley 49f4929236 Bug 1210367 - Heroku: Run collectstatic manually to speed up deploy
The Python buildpack's automatic collectstatic is slower, since it does
a `--dry-run` first. To avoid the time penalty of this, we disable it by
setting `DISABLE_COLLECTSTATIC` in the Heroku environment, and run
collectstatic manually in `bin/post_compile`. See:
https://github.com/heroku/heroku-buildpack-python/issues/252
2015-10-10 22:44:49 +01:00
Ed Morley a5999ac2b9 Bug 1197186 - Move wsgi.py to a generic config/ directory
Since it's not specific to the Django app 'webapp'.
2015-10-08 19:59:44 +01:00
Ed Morley 0c79292d68 Bug 1201523 - Run grunt build during the Heroku deploy
This commit relies on the nodejs buildpack being added to the list of
buildpacks for the app, and prior to the Python buildpack. See:
https://devcenter.heroku.com/articles/using-multiple-buildpacks-for-an-app

The nodejs buildpack will automatically install the packages listed in
`dependencies` in package.json, so that we have the requirements for
the grunt build. We don't actually need node or all of the files in
node_modules after we've run the grunt build, so in the future we could try
to remove them to reduce the resultant slug size (though it only
increased from 55MB to 70MB, so it's not urgent).

The dist directory has been added to `.slugignore` to prevent the
in-repo directory from being uploaded, since we'll be generating a new
one as part of the deploy. Once `dist/` is deleted from master, that
entry can be removed.
2015-09-30 18:34:56 +01:00
James Graham 7b6fa25402 Bug 1204942 - First cut at autoclassify / intermittent orange detection.
This adds an autoclassify command and a detect_intermittents command.
The former is designed to take an incoming job with an error summary
and look for existing results marked as intermittent that are a close
match for the new result. At present only one matcher is implemented;
this requires an exact match in terms of test name, result and error
message. Matching is also constrained to be based on single lines; it
is anticipated that future iterations may add support for matching on
groups of lines.

The detect_intermittents command is designed to take a group of jobs
running on the same push and with the same build job (i.e. same
testsuite, same chunk, etc.) and look for new intermittents to add to
the database. This currently only looks for test failures where there
is at least one green job and one non-green job.

There is currently no UI for seeing matches or for adding new
prototypical intermittents as match candidates. There is also no
integration with bugzilla; future development should add association
of frequent intermittents with bugs.
2015-09-21 22:47:19 +01:00
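The detection rule described in the commit above (group jobs by push and build job, then flag failures only where the group has at least one green job and one non-green job) can be sketched roughly as follows. This is an illustration only, not the code from the commit: the job dictionaries and their keys (`push`, `build`, `result`, `failures`) are hypothetical.

```python
from collections import defaultdict


def detect_intermittents(jobs):
    """Return sorted (push, build, test) tuples that look intermittent.

    Each job is a dict with hypothetical keys: "push", "build",
    "result" ("success" means green) and "failures" (failed test names).
    """
    # Group jobs that ran on the same push with the same build job.
    groups = defaultdict(list)
    for job in jobs:
        groups[(job["push"], job["build"])].append(job)

    candidates = set()
    for key, group in groups.items():
        has_green = any(job["result"] == "success" for job in group)
        failing = [job for job in group if job["result"] != "success"]
        # Only groups with both a green and a non-green job qualify.
        if has_green and failing:
            for job in failing:
                for test in job["failures"]:
                    candidates.add(key + (test,))
    return sorted(candidates)


jobs = [
    {"push": "abc", "build": "linux-mochitest-1", "result": "success", "failures": []},
    {"push": "abc", "build": "linux-mochitest-1", "result": "testfailed", "failures": ["test_a"]},
    {"push": "abc", "build": "linux-mochitest-2", "result": "testfailed", "failures": ["test_b"]},
]
candidates = detect_intermittents(jobs)
```

Here `test_b` is not reported, since its group has no green job to compare against.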
William Lachance 066f437ca5 Bug 1192976 - Refactor performance data + store in master db 2015-09-14 10:16:25 -04:00
Ed Morley 525866a553 Bug 1201517 - Export a revision.txt containing the Git SHA on Heroku too
The Git SHA is available in the SOURCE_VERSION environment variable:
https://devcenter.heroku.com/articles/buildpack-api#bin-compile-summary

Also update the What's Deployed links to include Heroku in the
comparison alongside stage/prod.
2015-09-08 17:22:24 +01:00
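As a rough illustration of the commit above, the following sketch writes a `revision.txt` from the `SOURCE_VERSION` environment variable. The function name and the choice to skip writing when the variable is unset are assumptions; the actual commit may well do this in the shell-based deploy script instead.

```python
import os
import tempfile


def write_revision_file(directory, environ=os.environ):
    """Write the Git SHA from SOURCE_VERSION to <directory>/revision.txt.

    Returns the file path, or None if SOURCE_VERSION is not set.
    """
    sha = environ.get("SOURCE_VERSION")
    if not sha:
        return None
    path = os.path.join(directory, "revision.txt")
    with open(path, "w") as f:
        f.write(sha + "\n")
    return path


# Demonstration with a fake environment and a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    path = write_revision_file(tmp, {"SOURCE_VERSION": "525866a553"})
    with open(path) as f:
        written = f.read()
```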
Mauro Doglio ee12c00a03 Bug 1182464 - Add celery task to store the error summary 2015-09-03 10:50:12 +02:00
Ed Morley f6a673a493 Bug 1160561 - Define IS_HEROKU in the dyno profile rather than Heroku env
`IS_HEROKU` isn't something that will differ between stage/prod, and is
not going to change any time soon. So let's just get post_compile to
add it to the dyno profile script, so it's one less variable to have to
remember to set via the Heroku CLI/dashboard.

Quoting from:
http://blog.doismellburning.co.uk/2014/10/06/twelve-factor-config-misunderstandings-and-advice/

"12factor says your applications should read their config from the
environment; it has very little to say about how you populate the
environment - use whatever works for you".
2015-08-28 10:25:12 +01:00
Ed Morley fce5c7e3c0 Bug 1176412 - Override the hostname reported by New Relic on Heroku
On Heroku, the hostname is just a bunch of hex characters, which the New
Relic dashboard displays as "Dynamic Hostname". The New Relic Python
agent (as of v2.52.0.40) supports overriding this, by setting the
environment variable `NEW_RELIC_PROCESS_HOST_DISPLAY_NAME`:
https://docs.newrelic.com/docs/agents/python-agent/installation-configuration/python-agent-configuration#process_host-display_name

We use the value of `DYNO`, since this is set for us in each dyno's
environment, and is of form "web.2", "worker_pushlog.1" etc. Rather than
prefixing every line in the Procfile, we instead append the export to
the profile script created by the Python buildpack, so it's populated at
runtime on every dyno. See:
https://devcenter.heroku.com/articles/profiled

To do this we make use of the Python buildpack's `set-env` function:
ce3bdb37ba/bin/utils (L29-L32)

Similar to what the buildpack does for its own env defines:
ce3bdb37ba/bin/compile (L190-L198)
2015-08-28 10:25:11 +01:00
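The idea in the commit above is that the export is appended to a profile script, so `$DYNO` is expanded at runtime on each dyno rather than at build time. The buildpack's `set-env` helper is a shell function; the Python below is only a sketch of the same idea, and the function name and script path are hypothetical.

```python
import os
import tempfile

# The shell line to append. "$DYNO" is left unexpanded on purpose, so
# that the shell resolves it when the dyno sources the profile script.
EXPORT_LINE = 'export NEW_RELIC_PROCESS_HOST_DISPLAY_NAME="$DYNO"\n'


def append_display_name_export(profile_script):
    """Append the export line to an existing profile script."""
    with open(profile_script, "a") as f:
        f.write(EXPORT_LINE)


# Demonstration against a stand-in profile script.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "python.sh")
    with open(script, "w") as f:
        f.write('export LANG="en_US.UTF-8"\n')
    append_display_name_export(script)
    with open(script) as f:
        contents = f.read()
```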
Ed Morley 5892c72eb2 Bug 1197796 - Make WhiteNoise serve the static assets gzipped
On Heroku, there is no load balancer or Varnish-like cache in front of
gunicorn, so we must handle gzipping responses in the app.

In order for WhiteNoise to serve gzipped static content, assets must be
gzipped on disk in advance (doing so on-demand in Python would not be
as performant). WhiteNoise will then serve the `.gz` version of files in
preference to the original, if the client indicated it supported gzip.

For assets covered by Django's collectstatic, gzipping the assets only
requires using WhiteNoise's GzipManifestStaticFilesStorage backend,
which wraps Django's ManifestStaticFilesStorage to create hashed+gzipped
versions of static assets:
http://whitenoise.evans.io/en/latest/django.html#add-gzip-and-caching-support

The collectstatic generated files will then contain the file hash in
their filename, so WhiteNoise can also serve them with a large max-age
to avoid further requests if the file contents have not changed.

For the UI files under `dist/`, we cannot rely on the Django storage
backend, since the directory isn't covered by STATICFILES_DIRS (it is
instead made known to WhiteNoise via `WHITENOISE_ROOT`). As such, files
under `dist/` are gzipped via an additional step during deployment. See:
http://whitenoise.evans.io/en/latest/base.html#gzip-support

Files whose extension is on the blacklist, or that are not >5% smaller
when compressed, are skipped during compression.
2015-08-26 22:10:05 +01:00
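The extra gzip step for `dist/` described in the commit above could look something like this sketch. It is not WhiteNoise's own implementation: the extension blacklist here is a made-up example, and only the "skip unless more than 5% smaller" rule is taken from the commit message.

```python
import gzip
import os
import tempfile

# Hypothetical blacklist of extensions that are already compressed.
SKIP_EXTENSIONS = {".gz", ".png", ".jpg", ".jpeg", ".gif", ".woff"}


def gzip_file(path, min_saving=0.05):
    """Write path + '.gz' if that saves more than min_saving; return True if written."""
    if os.path.splitext(path)[1].lower() in SKIP_EXTENSIONS:
        return False
    with open(path, "rb") as f:
        data = f.read()
    compressed = gzip.compress(data)
    # Skip files that are not more than 5% smaller once compressed.
    if len(compressed) >= len(data) * (1 - min_saving):
        return False
    with open(path + ".gz", "wb") as f:
        f.write(compressed)
    return True


def gzip_tree(root):
    """Gzip every eligible file under root; return the sorted .gz paths written."""
    written = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if gzip_file(path):
                written.append(path + ".gz")
    return sorted(written)


# Demonstration: one compressible file, one blacklisted, one too small to benefit.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "app.js": b"var x = 1;\n" * 200,
        "logo.png": b"\x89PNG" * 200,
        "tiny.txt": b"a",
    }
    for name, data in samples.items():
        with open(os.path.join(tmp, name), "wb") as f:
            f.write(data)
    written = [os.path.basename(p) for p in gzip_tree(tmp)]
```

Only `app.js.gz` is produced: `logo.png` is skipped by extension, and `tiny.txt` would grow rather than shrink.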
Ed Morley e401d3d26f Bug 1191080 - Use newrelic-admin run-program rather than manual init 2015-08-19 11:35:17 +01:00
Ed Morley 90ba77e596 Bug 1192801 - Remove per-file MPL boilerplate since it's unnecessary
The MPL 2.0 terms state that as long as a LICENSE file is present, the
per-file header text is not required. See "Exhibit A" at the end of:
https://www.mozilla.org/MPL/2.0/
2015-08-18 23:32:11 +01:00
Cameron Dawson 00cfe6643d Bug 1140349 - Remove the objectstore code
After the previous commit, the Objectstore is effectively "dead code".
So this commit removes all the dead code after anything left over in
the Objectstore has been drained and added to the DB.
2015-07-21 14:13:21 -07:00
Ed Morley b4720652d6 Bug 1181529 - Update peep to v2.4.1
* Tolerates `pip.__version__` being missing, which can happen in arcane
  situations during error handling, obscuring informative tracebacks.
* flake8 warnings are fixed again, so peep.py can be removed from the
  exclude list.

https://github.com/erikrose/peep/releases/tag/2.4.1
https://github.com/erikrose/peep/compare/2.4...2.4.1
2015-07-08 12:39:30 +01:00
Ed Morley 58813b0c51 Bug 1169916 - Stop using Cython to build the log parser
Since it only speeds up parsing by a few percent of total runtime, and
is therefore not worth the added complexity for deployment and local
hack-test-debug cycles when working on the log parser.

The .gitignore and update.py entries will be removed in a later commit,
once the stage/prod src directories have been cleaned up.
2015-06-30 14:51:57 +01:00
Ed Morley f1b0b6957f Bug 1131244 - Use newrelic-admin run-program when running celery beat
Copies the same pattern used in the other celery wrapper scripts.
2015-06-19 13:14:08 +01:00
Ed Morley 5b74d1d9c1 Bug 1164868 - Split the buildapi tasks into {pending,running,4hr} queues
Previously all three of the buildapi ETL ingestion tasks (pending,
running, 4hr) were run under one queue. In order to be able to tell
issues (such as backlogs or leaks) apart, these have now been split onto
their own queues.

On Heroku, these queues now also have a dyno each - which should mean we
can easily tell which is leaking and possibly also downgrade from the
expensive performance dyno even before the leak is fixed.
2015-05-21 15:44:58 +01:00
Cameron Dawson 358e90f685 Bug 1080760 - Auto-generate bug suggestions asynchronously
This introduces two new ways to generate ``Bug suggestions`` artifacts from
a ``text_log_summary`` artifact
1. POST a ``text_log_summary`` on the ``/artifact`` endpoint
2. POST a ``text_log_summary`` with a job on the ``/jobs`` endpoint.

Both of these cases will schedule an asynchronous task to generate the
``Bug suggestions`` artifact with ``celery``.

Artifact generation scenarios:

JobCollections
^^^^^^^^^^^^^^
Via the ``/jobs`` endpoint:

1. Submit a Log URL with no ``parse_status`` or ``parse_status`` set to "pending"
    * This will generate ``text_log_summary`` and ``Bug suggestions`` artifacts
    * Current *Buildbot* workflow

2. Submit a Log URL with ``parse_status`` set to "parsed" and a ``text_log_summary`` artifact
    * Will generate a ``Bug suggestions`` artifact only
    * Desired future state of *Task Cluster*

3. Submit a Log URL with ``parse_status`` of "parsed", with ``text_log_summary`` and ``Bug suggestions`` artifacts
    * Will generate nothing

ArtifactCollections
^^^^^^^^^^^^^^^^^^^
Via the ``/artifact`` endpoint:

1. Submit a ``text_log_summary`` artifact
    * Will generate a ``Bug suggestions`` artifact if it does not already exist for that job.

2. Submit ``text_log_summary`` and ``Bug suggestions`` artifacts
    * Will generate nothing
    * This is *Treeherder's* current internal log parser workflow
2015-05-20 16:28:32 -07:00
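The scenarios listed in the commit above amount to a small decision table. The sketch below encodes it as two functions, one per endpoint; the function names and argument shapes are hypothetical, and combinations the commit message does not describe are treated as generating nothing.

```python
def artifacts_for_job(parse_status=None, submitted=()):
    """Artifacts to generate for a log URL submitted via the /jobs endpoint."""
    submitted = set(submitted)
    if parse_status in (None, "pending"):
        # Scenario 1: the log still has to be parsed.
        return ["text_log_summary", "Bug suggestions"]
    if parse_status == "parsed" and "text_log_summary" in submitted:
        if "Bug suggestions" in submitted:
            return []  # Scenario 3: everything was supplied.
        return ["Bug suggestions"]  # Scenario 2.
    return []  # Not covered by the commit message (assumption).


def artifacts_for_artifact_submission(submitted, existing=()):
    """Artifacts to generate for a submission via the /artifact endpoint."""
    already_there = set(submitted) | set(existing)
    if "text_log_summary" in submitted and "Bug suggestions" not in already_there:
        return ["Bug suggestions"]
    return []


buildbot = artifacts_for_job()
taskcluster = artifacts_for_job("parsed", ["text_log_summary"])
internal = artifacts_for_artifact_submission(["text_log_summary", "Bug suggestions"])
```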
Jonathan French 19b71bc4b4 Bug 1164881 - Add MPL2.0 headers to recent treeherder repo files 2015-05-14 11:45:26 -04:00
Mauro Doglio 597282fe58 Bug 1145606 - Setup treeherder to deploy on heroku
I added a Procfile listing all the different python services treeherder needs.
Heroku provides deployment-specific settings via environment variables, so I had to modify the settings file to read them where that wasn't already the case. I created an environment variable IS_HEROKU, which allows a heroku-only configuration where needed.
The db service is provided by Amazon RDS, which requires an SSL connection. To enable SSL in the MySQLdb python client I had to modify Datasource (and bump up the version used).
The cache service is provided by the memcachier heroku addon. Heroku recommends using pylibmc, so I set it up according to the docs here https://devcenter.heroku.com/articles/memcachier#python.
The amqp service is provided by the CloudAMQP addon.
I added a post_compile script that runs every time we deploy. We should run every build step we require in there, like static asset minification, collection, etc.
To share the oauth credentials among the various services I used an environment variable. I also added an option to export_project_credentials so that the credentials can be printed to stdout. This should come in handy when we need to update the environment-stored credentials with the ones in the db.
2015-05-14 13:54:41 +01:00
Ed Morley eaf0c2a792 Bug 1155160 - Remove script for generating the vendor directory
We're no longer using the vendor directory & this script wasn't entirely
reliable anyway, so let's remove it. The virtualenv package can be
removed from dev.txt, since virtualenv is installed globally, and
nothing inside our virtualenv (which is where the packages in dev.txt
end up) needs a local installation of it.
2015-04-22 11:23:32 +01:00
William Lachance 2719eab213 Bug 1156746 - Bump the number of concurrent log processing workers 2015-04-21 08:48:55 -04:00
Ed Morley 58cc851204 Bug 1146184 - Add virtualenv's bin to PATH rather than using activate
In order to rsync the virtualenv as part of deployment, we'll need to
use |virtualenv --relocatable|. However this does not update the
activate scripts (making them unusable), and the changes it makes in the
virtualenv bin directory require that the virtualenv's python is first
in the path:
https://virtualenv.pypa.io/en/latest/userguide.html#making-environments-relocatable

As such, we must make the Treeherder runner scripts add the virtualenv
bin to PATH instead of using the activate pattern.
2015-04-20 23:48:26 +01:00
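The runner scripts touched by the commit above are shell scripts, but the pattern of putting the virtualenv's bin directory first in PATH instead of sourcing activate can be illustrated in Python. The helper name and the `/opt/venv` location are hypothetical.

```python
import os


def env_with_venv(venv_dir, environ=None):
    """Return a copy of the environment with the virtualenv's bin dir first in PATH."""
    env = dict(os.environ if environ is None else environ)
    bin_dir = os.path.join(venv_dir, "bin")
    existing = env.get("PATH")
    # Prepend so the virtualenv's python wins over any system python.
    env["PATH"] = bin_dir + os.pathsep + existing if existing else bin_dir
    return env


env = env_with_venv("/opt/venv", {"PATH": "/usr/bin"})
```

The resulting mapping can be passed as the `env` argument of `subprocess.run` to launch a command with the virtualenv's interpreter taking precedence.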
Ed Morley a2d66a85d1 Bug 1146184 - Clean up the directory variables in runner scripts
The existing name "curr_dir" isn't overly clear - does it mean the
current working directory or the directory in which this script exists?
It should also be uppercase. I think the behaviour is clearer when
reworked like this :-)

There's a lot of other cleanup that could be done (eg reducing
duplication), but saving that for bug 1153971.
2015-04-20 23:48:26 +01:00
Ed Morley 5ee006eb01 Bug 1155293 - Update to peep v2.4
Copied verbatim from:
https://github.com/erikrose/peep/archive/2.4.zip

Changes:
https://github.com/erikrose/peep/compare/2.2...2.4
2015-04-17 17:38:40 +01:00
Ed Morley b1f14bb458 Bug 1153966 - Remove unused Pulse consumer code
Unlike the Pulse publishing, the code for consuming data from Pulse is
unused, being a leftover from initial attempts to ingest buildbot data
via pulse, rather than builds-{4hr,running,pending}.js
2015-04-14 00:21:50 +01:00
Ed Morley 87f6605de1 Bug 1153186 - Split up the high_priority task queue
...into 'classification_mirroring' and 'publish_to_pulse'. This gives
greater visibility into the relative queue sizes of each, allows us to
move one of them to another node, plus means we can pause consumption of
the classification_mirroring task for when the Elasticsearch indices
are migrated in bug 1142538.
2015-04-13 18:55:34 +01:00
Ed Morley e8ad64842d Bug 1140850 - Remove gevent celery worker script from the repo
Since it's unused & we're moving away from gevent.
2015-04-13 17:12:44 +01:00
Ed Morley 19d0b51d2a Bug 1140882 - Use prefork scheduling for pushlog workers
Use prefork scheduling instead of gevent scheduling, to avoid issues
we've had with gevent - both with zombie tasks & also incompatibilities
with Python 2.7.9.
2015-04-13 17:09:55 +01:00
Ed Morley 8e67030a35 Bug 1143350 - Use peep instead of pip locally, on Travis & in Docker
We want to start using peep in production, to alleviate security
concerns with the idea of auto-updating packages from PyPI on deploy.
As a first step, we switch to using peep in the Vagrant environment,
on Travis and in the Docker build - so we can confirm the hashes are
correct.

Close bug 1143350.
2015-03-19 12:26:07 +00:00
Ed Morley 29ca673205 Bug 1143350 - Check in peep v2.2
We're checking this in so we have a known good starting point in the
chain of trust. It also simplifies our deployment requirements.

peep.py was taken from:
https://github.com/erikrose/peep/archive/2.2.tar.gz

The only alteration made was the addition of the licence block at the
top of the file, taken from LICENCE in the peep repo.
2015-03-19 12:12:01 +00:00
Ed Morley 99c6a1a9d6 Bug 1070470 - Rename pure.txt to checked-in.txt
The packages in this file are those that have been checked in to
vendor/, and the new name makes this more obvious.
2015-03-12 19:04:07 +00:00
Cameron Dawson d4bb357d5e Bug 1113873 - Enable the structured log parser celery queue 2015-03-11 08:49:10 -07:00
Ed Morley e9780bf1a0 Bug 1133482 - Remove download_logs.py since it's always been broken
Traceback (most recent call last):
  File "./bin/download_logs.py", line 97, in <module>
    download_logs(sys.argv[1])
  File "./bin/download_logs.py", line 74, in download_logs
    logrefs = job["job"]["log_references"]
TypeError: string indices must be integers, not str

And I don't think we have a need for it now anyway.
2015-03-03 17:13:35 +00:00
mdoglio fd9eb17603 Bug 1123479 - add startup script for a prefork-based log parser 2015-02-02 12:45:29 +00:00
mdoglio 40b01c751d Bug 1122139 - Split common tasks to separate queues 2015-01-28 14:33:38 +00:00
mdoglio f173254fd7 Bug 1094814 - reduce buildapi worker memory consumption 2014-12-10 11:07:57 +00:00
mdoglio 78ecb0900c Bug 1105800 - increase the web service timeout limit 2014-11-27 17:04:50 +00:00
Jonathan French dbb4d11e09 Bug 1090689 - Add MPL2.0 headers to the repo 2014-11-03 13:06:03 -05:00