Commit graph

288 commits

Author SHA1 Message Date
pyup-bot 172e220fc8 Pin notebook to latest version 6.0.2 2019-11-04 15:28:23 -08:00
Jannis Leidel f4ee2592f4 Remove unneeded comment. 2019-03-29 09:36:33 +01:00
Mozilla-GitHub-Standards f495076c99 See PR for details
_(Message COC002)_
2019-03-29 09:36:33 +01:00
Jeff Klukas cfc762f9d4 Add pull request template
This is motivated by
[a discussion doc](https://docs.google.com/document/d/1NwY1iReuxcbvxCi9jRUOYiymCUOoSq3cnShCI-OiW3c/edit#heading=h.c6fp0v2ffacr)
that focused on the problem of implicit dependencies between repos.
emr-bootstrap-spark is one of the projects where changes can cause unanticipated
problems for running code that lives elsewhere, and it can be difficult to
track down the source of the problem when an emr-bootstrap-spark deploy goes
wrong.

This will be followed up by documentation changes to other repos that depend
on this one, making the dependency more explicit.
2018-12-18 08:44:58 -05:00
Jeff Klukas a72d3c83cf Bug 1490400 Have telemetry.sh exit on failed commands
The existing comment about `set -e` appears to no longer apply.
I've been deploying clusters while iterating on these changes,
hitting many "Bootstrap failure" states. I have reliably been
able to find stdout.log.gz containing the relevant failure message
for these failed clusters.

There were a few commands failing for every cluster, so I had to
make some small cleanup changes in order for bootstrap to complete
with the new `set -eo pipefail` setting.
2018-12-13 13:50:49 -05:00
pyup-bot 682598c906 Update mozanalysis from 2018.11.1 to 2018.12.0 2018-12-05 10:44:18 -08:00
Jeff Klukas 4a140128ee Update python_moztelemetry and related python packages 2018-11-12 13:57:54 -05:00
pyup-bot 0316767d33 Update mozanalysis from 2018.9.0 to 2018.11.1 2018-11-01 16:33:07 -07:00
Jeff Klukas 942102f25c Make sure /mnt/var/log/spark is writable before touching spark.log 2018-10-10 12:55:07 -04:00
Jeff Klukas 0da923cf0b Make sure metrics.properties exists on all nodes 2018-10-10 09:21:35 -04:00
Jeff Klukas 762cf1a8c6 Ensure write permissions on spark.log 2018-10-10 09:21:35 -04:00
Jeff Klukas e640ff257f Send only Spark driver metrics
Our custom accumulator-based metrics only appear on the driver.
See https://github.com/apache/spark/blob/master/conf/metrics.properties.template
2018-10-10 09:21:35 -04:00
Jeff Klukas b92ab9e6ed Add retry logic around describe-clusters request 2018-10-10 09:21:35 -04:00
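The log does not show how the retry around the describe-clusters request was written. As a rough illustration only, retrying a flaky call with exponential backoff might look like the sketch below; the helper name `retry`, its parameters, and its defaults are all hypothetical and are not taken from the repository.

```python
import time


def retry(func, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or the attempts run out.

    Hypothetical helper: waits base_delay * 2**n seconds after the
    n-th failure (n starts at 0), then tries again.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for n in range(attempts):
        try:
            return func()
        except Exception:
            if n == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** n)
```

The `sleep` parameter exists only so the delay can be replaced in tests; a caller would normally pass just the function that issues the request.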
Jeff Klukas bb235672f6 Quote and kebab-case Datadog tags
Matches the style of tags Datadog automatically sends
(security-group, instance-type, etc.) and
allows cleaning of tag values containing characters like ':'
2018-10-10 09:21:35 -04:00
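To make the tag convention in the commit above concrete, here is a minimal sketch of kebab-casing a tag key and quoting the resulting `key:value` pair. The function names are hypothetical, and replacing `:` in values with `-` is an assumption about what "cleaning" means; the commit message does not say which substitution is used.

```python
import re


def kebab_case(name):
    """Convert names like 'InstanceType' or 'instance_type' to 'instance-type'."""
    # Insert a hyphen at lowercase/digit-to-uppercase boundaries (camelCase).
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", name)
    # Collapse each run of non-alphanumeric characters into a single hyphen.
    name = re.sub(r"[^A-Za-z0-9]+", "-", name)
    return name.strip("-").lower()


def datadog_tag(key, value):
    """Build a double-quoted 'key:value' tag string.

    Assumption: ':' inside the value is replaced with '-' so that the
    only colon left is the key/value separator.
    """
    clean_value = value.replace(":", "-")
    return '"{}:{}"'.format(kebab_case(key), clean_value)
```

For instance, `datadog_tag("app_name", "nightly:etl")` returns the string `"app-name:nightly-etl"`, double quotes included.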
Jeff Klukas ad2dced0be Bug 1465974 Add option for installing datadog-agent 2018-09-24 08:47:50 -04:00
Rob Hudson 18d2b2c09d Add mozanalysis library 2018-09-18 14:06:59 -07:00
Rob Hudson d628b37e28 Sort python requirements 2018-09-18 14:06:59 -07:00
pyup-bot e089c81f2d Update python_moztelemetry from 0.9.1 to 0.10.3 2018-09-13 14:32:28 -07:00
Sunah Suh 9e81e6be8d Bug 1490395: don't upgrade jupyterlab past last known working version 2018-09-11 13:44:33 -05:00
Wesley Dawson 883b234f52 Add sops role assumption permissions 2018-08-30 13:49:39 -07:00
Jeff Klukas 49787f035f Comment on bugfixes in EMR 5.16 and what we can deprecate 2018-08-14 11:41:24 -04:00
Rob Hudson 405298aa89 Configure pyup for security-only updates (bug 1478023) 2018-07-26 09:27:20 -07:00
Jeff Klukas c2199d419f Add link to AWS case 2018-07-24 13:33:24 -04:00
Jeff Klukas a601bf5ee0 Bug 1466936 Install telemetry-spark-packages-assembly.jar 2018-07-24 13:33:24 -04:00
Jeff Klukas 63f3249abc Add a PYTHONSTARTUP script that adds packages to sys.path 2018-07-24 13:33:24 -04:00
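The contents of that startup script are not shown in the log. As a sketch of the general idea, a file named by the `PYTHONSTARTUP` environment variable could extend `sys.path` along these lines; the helper name and the directory below are hypothetical placeholders.

```python
import os
import sys


def add_to_sys_path(directories, path=None):
    """Append each existing directory to path (sys.path by default).

    Directories that are missing or already listed are skipped.
    Returns the list of directories that were actually added.
    """
    if path is None:
        path = sys.path
    added = []
    for directory in directories:
        if os.path.isdir(directory) and directory not in path:
            path.append(directory)
            added.append(directory)
    return added


# Hypothetical location; the real script's directories are not given in the log.
add_to_sys_path(["/opt/example/python-packages"])
```

Python reads `PYTHONSTARTUP` only when starting an interactive interpreter, so a script like this affects interactive sessions and not programs run as `python script.py`.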
Jeff Klukas 5ac2080d21 Bug 1474347 Revert default spark-hyperloglog package
Undoes most of
https://github.com/mozilla/emr-bootstrap-spark/pull/393

which we found can't function on EMR versions before 5.13
due to the lack of support for the spark.jars.repositories parameter.
2018-07-10 21:16:57 +00:00
Daniel Thorn 963edfafd8 Pin python_moztelemetry in bootstrap
So that we can release a new version for python_moztelemetry without breaking ATMO jobs if there's an unexpected bug
2018-07-09 09:27:57 -07:00
Jeff Klukas 0f0796c9db Fix package coordinates 2018-07-09 08:38:28 -04:00
Jeff Klukas a9acd1f542 Update spark-hyperloglog version to match published tag 2018-07-09 08:38:28 -04:00
Jeff Klukas d9df20ff01 Add spark.jars.repositories to point to our S3 mavenrepo 2018-07-09 08:38:28 -04:00
Jeff Klukas 4d5e7acefd Bug 1466936 - Install hyperloglog as a spark package
It looks like we were already pre-loading the vitillo version
of spark-hyperloglog from spark-packages.org for zeppelin,
but we've recently published the mozilla version of the package
on spark-packages.org; this PR makes that package a cluster default.

Packages specified in spark.jars.packages aren't installed at
cluster spinup time, but rather at instantiation of a SparkContext,
so this adds a slight amount of work for each new SparkContext
that's created.
2018-07-09 08:38:28 -04:00
pyup-bot 63451f5b1a Update py4j from 0.8.2.1 to 0.10.7 2018-07-09 08:37:17 -04:00
Daniel Thorn 0bf22d890b Rename python3-requirements.txt to zeppelin-requirements.txt
so we can later add generic python3 support
2018-06-26 12:01:16 -07:00
Blake Imsland 734bd8f481 Use VPC endpoint for accessing unified metastore 2018-06-13 13:59:45 -07:00
Frank Bertsch 821ebe27cb Revert "Update python-mozaggregator from 0.3.0.1 to 0.3.0.2"
This reverts commit f4cac3679d.
2018-05-01 16:07:29 -05:00
pyup-bot f4cac3679d Update python-mozaggregator from 0.3.0.1 to 0.3.0.2 2018-04-30 09:06:17 -05:00
Wesley Dawson ee7739afdb Add telemetry artifacts bucket 2018-04-27 11:46:45 -07:00
Jason Thomas ebb234c136 Add net-mozaws-prod-us-west-2-data-public Bug 1455725 2018-04-20 17:19:11 -04:00
pyup-bot 76c1adf3ac Pin python-mozaggregator to latest version 0.3.0.1 2018-04-09 12:48:42 -05:00
Frank Bertsch 7e32c3bd55 Enable spark logging on executors
Previously, executor logs would fail with:
java.io.FileNotFoundException: /mnt/var/log/spark/spark.log (Permission denied)

As a result, the executor logs were not written.

This change makes them available, and subsequently available in S3.
2018-04-05 14:49:43 -05:00
Wesley Dawson ed047f014e Change Socorro telemetry ingestion bucket 2018-03-29 11:46:05 -07:00
Jason Thomas 68ce625ca0 Add S3 IAM permissions for probe info bucket Bug 1444942 2018-03-27 14:37:37 -04:00
Jason Thomas 3c11f8d3f7 Add DynamoDB IAM related access for TAAR jobs 2018-03-27 14:37:16 -04:00
Jannis Leidel 82fb40feee Install Jupyterlab and a newer Jupyter Notebook. 2018-03-21 11:26:19 -07:00
Allen Short 89dedd707c Enable Spark job cancellation
pursuant to https://github.com/mozilla/telemetry-analysis-service/issues/554
2018-02-28 22:40:49 +01:00
Jannis Leidel a7ccdfeed8 Fix installing arrow and feather. Fix #101. 2017-12-01 00:30:05 +01:00
Jannis Leidel 14fc2e2a0c Bug 1421828 - Pin awscli to old version to force mozaggregator dependency to that version. 2017-12-01 00:28:51 +01:00
Frank Bertsch e7e982b35d Pin mozaggregator version
See bug 1421828
2017-11-30 17:13:57 +01:00
Saptarshi Guha a6cf0cf237 Added feather and arrow libraries
Feather is an interoperable data format and arrow is a time library.
https://github.com/wesm/feather and http://arrow.readthedocs.io/en/latest/
2017-11-30 16:24:08 +01:00
Anthony Miyaguchi 4f31b02794 Make user defined variables in instructions more obvious 2017-11-30 16:21:52 +01:00