mozilla-pipeline-schemas/README.shield.md

About Shield Schemas and Parquet files.

All Shield-related files are autogenerated by gregglind/shield-study-schemas.

DO NOT EDIT THEM BY HAND.

Please file bugs at: gregglind/shield-study-schemas.

Data collection process explained

  • at CLIENT
    • a Shield add-on generates a PAYLOAD, which must validate against the payload validator.

    • TelemetryController wraps this payload into a TELEMETRY PING and sends it to the right S3 bucket for the doc_type.

      // Ask Telemetry to attach the client ID and environment data to the ping.
      const telOptions = {addClientId: true, addEnvironment: true};
      TelemetryController.submitExternalPing(bucket, payload, telOptions);
      
  • at COLLECTOR (simplified view)
    • the collector validates the packet using the JSON Schema in this directory. Only the parts of the packet that we want to end up in parquet are validated. The validation schema is autogenerated.
    • Valid packets are pushed to parquet according to the parquet schemas. These parquet schemas are autogenerated to match the collector jsonschema.
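
The collector step above can be sketched roughly as follows. This is an illustration only, not the collector's actual code: the schema, the field names (`clientId`, `payload`, `extra`), and the helpers `validatePacket` and `toParquetRow` are all hypothetical, and the hand-rolled type check stands in for a real JSON Schema validator. It shows the idea that only the schema-listed parts of a packet are checked and carried forward.

```javascript
// Hypothetical, heavily simplified schema: it names only the fields that
// should end up in parquet. The real schemas in this directory are
// autogenerated and much richer.
const collectorSchema = {
  required: ["clientId", "payload"],
  properties: {
    clientId: "string",
    payload: "object",
  },
};

// Report a JSON-style type name ("null" and "array" are distinguished
// from "object", unlike plain typeof).
function typeOf(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

// Stand-in for a JSON Schema validator: checks required keys and types.
// Fields that the schema does not mention are ignored, not rejected.
function validatePacket(packet, schema) {
  const errors = [];
  for (const key of schema.required) {
    if (!(key in packet)) {
      errors.push(`missing required field: ${key}`);
    }
  }
  for (const [key, expectedType] of Object.entries(schema.properties)) {
    if (key in packet && typeOf(packet[key]) !== expectedType) {
      errors.push(`field ${key} should be of type ${expectedType}`);
    }
  }
  return { valid: errors.length === 0, errors };
}

// Keep only the schema-listed fields, i.e. the part destined for parquet.
function toParquetRow(packet, schema) {
  const row = {};
  for (const key of Object.keys(schema.properties)) {
    if (key in packet) row[key] = packet[key];
  }
  return row;
}

// Usage: a packet with an extra, unvalidated field still passes,
// but the extra field is not carried into the row.
const packet = { clientId: "abc", payload: { study: "demo" }, extra: 42 };
const result = validatePacket(packet, collectorSchema);
if (result.valid) {
  console.log(toParquetRow(packet, collectorSchema));
} else {
  console.log(result.errors);
}
```

Because the parquet schemas are generated to match the collector schema, a packet that passes validation should always map cleanly onto a row; the sketch mirrors that by deriving both the check and the row from the same schema object.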