The Sync Summary Dataset

The Sync Summary dataset is generated by src/main/scala/com/mozilla/telemetry/views/SyncView.scala.

Generating the dataset

The dataset can be generated as follows:

  1. SSH into a new Spark cluster
  2. cd /mnt && git clone https://github.com/mozilla/telemetry-batch-view.git && cd telemetry-batch-view
  3. sbt assembly
  4. spark-submit --master yarn --deploy-mode client --class com.mozilla.telemetry.views.SyncView target/scala-2.11/telemetry-batch-view-1.1.jar --bucket telemetry-test-bucket --from $DATE --to $DATE

The jar path in step 4 embeds the Scala version (2.11) and the artifact version (1.1); adjust both if the build produced by step 3 uses different versions.
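Because the jar path in step 4 hard-codes two version numbers, it can help to factor them into variables. The following is a minimal sketch, not part of the repository: the names `SCALA_VERSION`, `VIEW_VERSION`, `jar_path`, and `sync_view_cmd` are hypothetical, and the example date `20170101` assumes a `YYYYMMDD` format, which this document does not specify. The script only prints the command; it does not run it.

```shell
#!/bin/sh
# Sketch: build the spark-submit command from step 4 with the version
# numbers pulled out into variables. Defaults mirror the values in step 4.
# All variable and function names here are hypothetical helpers.
SCALA_VERSION="${SCALA_VERSION:-2.11}"
VIEW_VERSION="${VIEW_VERSION:-1.1}"

# Print the assembly jar path for a given Scala version ($1)
# and artifact version ($2).
jar_path() {
  printf 'target/scala-%s/telemetry-batch-view-%s.jar\n' "$1" "$2"
}

# Print (without running) the spark-submit command for a date range.
# $1 = value for --from, $2 = value for --to
sync_view_cmd() {
  printf 'spark-submit --master yarn --deploy-mode client --class com.mozilla.telemetry.views.SyncView %s --bucket telemetry-test-bucket --from %s --to %s\n' \
    "$(jar_path "$SCALA_VERSION" "$VIEW_VERSION")" "$1" "$2"
}

# Example: print the command for a single day.
# The date format is an assumption, not taken from this document.
sync_view_cmd 20170101 20170101
```

To target a different build, set the variables before running the script, for example `SCALA_VERSION=2.12 VIEW_VERSION=1.2 sh build_cmd.sh` (the script name is also hypothetical), then check the printed command against the jar that `sbt assembly` actually wrote under `target/`.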