Commit graph

32 commits

Author SHA1 Message Date
Mauro Doglio bf8ee528c7 Decorate MainPings and CrashPings with a parse() method 2017-06-06 15:42:36 +01:00
Mauro Doglio 46548dcc92 Add non-main crashes to stats
This also increases the test coverage adding some tests for the
MainPing methods.
2017-06-06 15:42:36 +01:00
Mauro Doglio fe6b351b68 Use case classes to parse pings out of heka messages 2017-06-06 15:42:36 +01:00
Mauro Doglio 897bbe3593 Add case classes for main and crash pings 2017-06-06 15:42:36 +01:00
Mauro Doglio d7fc30eed4 Upgrade Spark version to 2.1.1 2017-05-22 12:38:03 +01:00
Frank Bertsch 650c6d9c12 Add options to run in batch mode 2017-05-05 11:32:03 +01:00
Mauro Doglio 8d7b4b3a39 Use moztelemetry to autogenerate protobuf classes 2017-04-27 14:47:14 +00:00
Mauro Doglio d523870990 Add SLOW_SCRIPT_PAGE_COUNT to ErrorAggregator stats
This is one of the metrics suggested by releng.
I will use it to test schema updates as per #19
2017-02-28 15:09:43 +00:00
Mauro Doglio 053d5ed00b Set root logger log level to warning 2017-02-21 14:05:26 +00:00
Mauro Doglio ee37f56f49 Add log4j configuration file
Fixes #18
2017-02-20 13:53:00 +00:00
Mauro Doglio a18475aece Add failOnDataLoss option to ErrorAggregator
This is useful when we want to recover from an expected data loss.
Fixes #25
2017-02-20 13:49:53 +00:00
Mauro Doglio 417127b2fc Increase kafka consumers cache size
Fixes #21
2017-02-17 13:57:59 +00:00
Mauro Doglio 4410443787 Don't run tests when running sbt assembly
Fixes #23
2017-02-17 13:48:18 +00:00
Roberto Agostino Vitillo 93dceb687a Add coverage to README. 2017-02-13 12:19:07 +00:00
Roberto Agostino Vitillo 1525297043 Generate coverage report 2017-02-10 13:53:37 +00:00
Roberto Agostino Vitillo 06b875f1df Add build status to README. 2017-02-10 12:49:36 +00:00
Roberto Agostino Vitillo 1a7569d0e6 Create README.md 2017-02-10 12:49:36 +00:00
Mauro Doglio 89c822029a Bug 1338495 - Increase maximum partition fetch size
The kafka consumer has a setting that limits the maximum amount of data
per-partition that can be fetched from the server. This must be higher
than the maximum size of a ping, otherwise the job will raise an error.
The 8MB value was found empirically, we may have to further tune it in
the future.
2017-02-10 11:36:59 +00:00
Mauro Doglio 4fa48fc677 Bug 1337742 - Define merge strategy for duplicate files on assembly
This is to fix an error that arises due to duplicate files in the
library dependencies that get packaged when running sbt-assembly.
You need to tell sbt-assembly how to fix those in order to have a clean
packaged jar.
2017-02-10 11:33:02 +00:00
Mauro Doglio fed064964c Add com.artima repo resolver in plugins.sbt 2017-02-02 14:38:09 +00:00
Roberto Agostino Vitillo 496ed1f205 Generate aggregates by upstream ingestion time.
Ideally we would like to generate aggregates based on the upstream
ingestion time.  Unfortunately there aren't strong guarantees that
the data we receive is in order and even if it was one Kafka partition
could lag behind others.

Structured Streaming allows us to generate aggregates based on the
upstream ingestion time (aka event time), rather than Spark's processing
time.

Spark Structured streaming
2017-02-02 13:44:28 +00:00
Mauro Doglio 7eb20aaf33 Use sbt test to run tests on TravisCI 2017-01-27 19:08:27 +00:00
Mauro Doglio a30219fde6 Add resolver for supersafe repo 2017-01-27 19:08:27 +00:00
Mauro Doglio 04592301f6 Bug 1332684 - add error counts stats to ErrorAggregator 2017-01-27 19:08:27 +00:00
Mauro Doglio 8ea413a8dd Add raiseOnError CLI option to ErrorAggregator 2017-01-27 19:08:27 +00:00
Mauro Doglio f3d7a16849 Fix style issues 2017-01-27 19:08:27 +00:00
Mauro Doglio 4bb3f1eafe Remove outputBucket and saveLocally options in favor of outputPrefix
Using a generic outputPrefix let us set either a localpath or a s3
bucket.
2017-01-24 17:34:02 +00:00
Mauro Doglio 632de3a8fa Save ErrorStreaming output to S3
The bucket name can be provided as a CLI option.
2017-01-24 17:34:02 +00:00
Mauro Doglio 4f099dc823 Add .gitignore 2017-01-24 17:34:02 +00:00
Mauro Doglio 5a3c788c60 rename Aggregator --> ErrorAggregator 2017-01-24 17:34:02 +00:00
Mauro Doglio 7b6ded8e2e Setup TravisCI builds 2017-01-23 14:59:40 +00:00
Roberto Agostino Vitillo 7777b157a3 First commit. 2017-01-06 16:09:45 +00:00