Commit Graph

273 Commits

Author SHA1 Message Date
Frank Bertsch 8bb49e0e08 Make build_id insertions add histograms correctly
Bug 1403994

Previously, old build_id histograms would merge with new ones
by just adding; but instead we want to first add the histograms,
then the sums and counts separately.

Unfortunately this means that [0] did not fix historical build_id
aggregates. Those are broken in the db for good.

[0] https://github.com/mozilla/python_mozaggregator/pull/57
2018-01-12 13:36:25 -06:00
Frank Bertsch 7dd4259ba6 Bump version 2018-01-10 15:39:08 -06:00
Frank Bertsch cba42043d7 Aggregate histograms of different lengths correctly
This separates aggregating the histogram buckets (keys 0:n-2)
from the sum and ping counts (keys n-1 and n). This way we can
add new buckets without the new values polluting those last
two buckets.

See bug 1403994
2018-01-10 15:39:08 -06:00
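
To illustrate the commit above, here is a minimal Python sketch of merging two aggregates of different lengths. It assumes each aggregate is a flat list laid out as `[bucket_0, ..., bucket_k, sum, count]`; the function name `merge_aggregates` is hypothetical and this is not the project's actual implementation.

```python
def merge_aggregates(old, new):
    """Merge two aggregates that may have different bucket counts.

    Assumed layout (illustrative only): [bucket_0, ..., bucket_k, sum, count].
    Buckets are added position by position; the trailing sum and count are
    added to each other separately, so extra buckets never leak into them.
    """
    if len(old) < 2 or len(new) < 2:
        raise ValueError("each aggregate needs at least a sum and a count")

    old_buckets, old_tail = list(old[:-2]), list(old[-2:])
    new_buckets, new_tail = list(new[:-2]), list(new[-2:])

    # Pad the shorter bucket list with zeros so both have the same width.
    width = max(len(old_buckets), len(new_buckets))
    old_buckets += [0] * (width - len(old_buckets))
    new_buckets += [0] * (width - len(new_buckets))

    buckets = [a + b for a, b in zip(old_buckets, new_buckets)]
    tail = [a + b for a, b in zip(old_tail, new_tail)]
    return buckets + tail
```

For example, merging `[1, 2, 10, 3]` with `[1, 2, 3, 20, 6]` keeps the sums (10 and 20) and counts (3 and 6) out of the bucket positions, which a plain position-by-position addition would not.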
Frank Bertsch 304df733b8 bump version 2017-12-18 20:16:38 -06:00
Frank Bertsch 1bde537dd0 Ignore NULL main histogram representations
Previously, an edge case for a NULL histogram caused the job to
fail.

See bug 1425464
2017-12-18 14:55:54 -06:00
Frank Bertsch dc09968003 Fix build
* Fix travis issue https://github.com/travis-ci/travis-ci/issues/7940

* Remove expired histogram usage

The new keyed histogram should never expire
2017-12-18 14:10:38 -06:00
Frank Bertsch ab515cb50f Bump version 2017-05-22 11:26:16 -05:00
Frank Bertsch 3c286a59c1 Ignore stream describe throttling 2017-05-22 11:26:16 -05:00
Frank Bertsch 34a1cca582 Bump version 2017-05-16 16:07:18 +01:00
Frank Bertsch 0859301595 Ignore cloudwatch log upload failures
Also fixes a bug where the response from cloudwatch logs is
never set.
2017-05-15 18:35:47 +01:00
Frank Bertsch 4c1594acb8 Increase version number 2017-03-14 07:49:05 +00:00
Frank Bertsch cdf5e74001 Add read replica for service 2017-03-14 07:49:05 +00:00
Frank Bertsch f1ebb95213 Increase version number 2017-03-08 11:20:04 -05:00
Frank Bertsch 817c6b8f1f Use moztelemetry Scalar implementation 2017-03-08 11:20:04 -05:00
Frank Bertsch d75bdeee24 Increase version number 2017-03-07 11:59:06 -05:00
Frank Bertsch 88b781ab5d Add Cache Control and ETag headers
This change implements both cache control and ETag headers.

For Cache-Control:
For all requests but submission-date aggregates, the max-age
is set until the data is kicked from the local cache
(we know the response won't change until then).

For submission-date aggregates, max-age is always set to 24h.

ETags:
ETags are not set on any requests but submission-date aggregates.
The ETags are the same for all values, since submission-date
aggregates will never change, unless we do a backfill.

Thus, the single ETag value can be updated, invalidating all
previous ETags. This should only be done after a backfill.
2017-03-07 11:59:06 -05:00
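
The header policy described in the commit above could be sketched roughly as follows. The function name, its parameters, and the ETag value are all assumptions made for illustration; the service's real code may be organized differently.

```python
# Hypothetical shared ETag value; per the commit message it would only be
# changed after a backfill, which invalidates every previously issued ETag.
SUBMISSION_DATE_ETAG = "submission-date-v1"
DAY_SECONDS = 24 * 60 * 60


def cache_headers(is_submission_date, seconds_until_cache_eviction):
    """Build response headers following the policy in the commit message.

    Both parameters are illustrative: the first says whether the request is
    for submission-date aggregates, the second is how long the response will
    remain in the local cache.
    """
    if is_submission_date:
        # Submission-date aggregates: fixed 24h max-age plus the shared ETag.
        return {
            "Cache-Control": "max-age={}".format(DAY_SECONDS),
            "ETag": '"{}"'.format(SUBMISSION_DATE_ETAG),
        }
    # Everything else: cacheable until the local cache drops the data, no ETag.
    max_age = max(0, int(seconds_until_cache_eviction))
    return {"Cache-Control": "max-age={}".format(max_age)}
```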
Roberto Agostino Vitillo d8e663df8e Run tests against Spark 2.0.2
See https://bugzilla.mozilla.org/show_bug.cgi?id=1344020.
2017-03-07 11:16:53 -05:00
Frank Bertsch 889a3a573d Aggregate process and gpu scalars 2017-03-07 05:26:47 -05:00
Frank Bertsch 1db5c88b42 Bump version number 2017-02-23 13:08:07 -06:00
Frank Bertsch 0931f319a6 Enable content/parent process types
In https://github.com/mozilla/python_mozaggregator/pull/29, we
added the new processes to the get_filter_options, but not to
get_metrics (so they weren't correctly transformed for querying
the database). This change transforms content => true and
parent => false for child GET params.
2017-02-23 12:38:32 -06:00
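
As a rough sketch of the transformation described in the commit above, the `child` GET parameter could be normalized like this. The helper name and the mapping constant are hypothetical, and the real `get_metrics` handler likely does more than this.

```python
# Assumed mapping, taken from the commit message: content => true, parent => false.
CHILD_PARAM_ALIASES = {"content": "true", "parent": "false"}


def normalize_child_param(params):
    """Return a copy of the GET params with `child` rewritten for querying.

    Values without an alias (for example "gpu") pass through unchanged.
    """
    normalized = dict(params)
    child = normalized.get("child")
    if child in CHILD_PARAM_ALIASES:
        normalized["child"] = CHILD_PARAM_ALIASES[child]
    return normalized
```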
Frank Bertsch 70c7fd59ff Change logging to include Origin
Previously, logging was done for referer URL and referer.
Unfortunately, the data was not properly accessed, so the fields
were never present. This change fixes that by properly retrieving
HTTP Referer, and also using HTTP Origin.
2017-02-23 13:38:02 +00:00
Frank Bertsch ee4ffc30ab Fix retrieval of child/parent histograms
With the change to add gpu histograms, the values for "child"
went from true/false to "true"/"false"/"gpu". This means that
querying for {"child": True} only returns data from before
we made that change [0].

With this change, we now query for both {"child": True}
and for {"child": "true"}. The database function is
also backwards compatible, just in case anyone is running
an older version of the service.

[0] https://github.com/mozilla/python_mozaggregator/pull/29
2017-02-22 09:41:12 -06:00
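
One way to picture the fix above: for a requested `child` value, collect every stored representation that should match, so rows written before and after the gpu change are both found. This is only a sketch with a hypothetical helper name; the actual change lives in the service and its database function.

```python
def child_values_for_query(child):
    """Return all stored representations that should match `child`.

    Older rows stored booleans (True/False); newer rows store strings
    ("true"/"false"/"gpu"). The helper name is illustrative only.
    """
    legacy = {"true": True, "false": False}
    if isinstance(child, bool):
        # Accept the legacy boolean form as input too.
        child = "true" if child else "false"
    values = [child]
    if child in legacy:
        # Also match rows written before the string representation existed.
        values.append(legacy[child])
    return values
```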
Frank Bertsch fa50c834e3 Increase version number 2017-02-21 16:40:58 -06:00
Frank Bertsch 580e593c7a Aggregate Fennec saved-session pings 2017-02-21 16:40:58 -06:00
Roberto Agostino Vitillo 381b1ec625 Don't install telemetry-tools
The telemetry-tools package has been deprecated and its functionality
has been embedded in python_moztelemetry.
2017-02-21 21:42:33 +00:00
Frank Bertsch 1073d73494 Replace papertrail logging with CloudWatch Logs 2017-02-17 17:50:57 +00:00
Frank Bertsch 4b88fbbd22 Increase version number
This version includes aggregating and returning gpu processes
2017-02-06 15:51:49 +00:00
George Wright e1d897fe98 bug 1314227 - Support payloads with GPU process Histograms
Original implementation by :gw280, credit to him. (I just shined it up a tad)

GPU processes are coming online for various clients, so we'd like to see their
measurements in places like tmo.

Add child=(gpu, content, parent) values to filter sets, with content and parent
mapping to true and false, respectively, to maintain backward compatibility.

Needed to increase the number of pings per dimension in the tests so we can
test that two new-style pings can aggregate the gpu process information
properly.

Also, properly check that each process_type has the appropriate counts in the
tests. Previously we were getting away with just checking that the entire sum
was correct, when we could have done better.
2017-02-03 07:29:25 -06:00
Frank Bertsch 5f77586190 Increase version number 2017-01-11 08:49:46 -06:00
Frank Bertsch 3a466853f8 Make prefixes and labels constant 2017-01-11 08:49:46 -06:00
Frank Bertsch 02f01af61e Get scalar descriptions from Scalars.yml
The class that is added here will eventually be moved to
python_moztelemetry, and imported.
2017-01-11 08:49:46 -06:00
Frank Bertsch a406c2e876 Ignore "browser.engagement.navigation" scalars
Business development requires that these scalars not be made public
2017-01-11 08:49:46 -06:00
Frank Bertsch a20c7825ab Aggregate keyed scalars 2017-01-11 08:49:46 -06:00
Frank Bertsch 23724cee51 Aggregate numeric scalars
This change does not include keyed numeric scalars.
2017-01-11 08:49:46 -06:00
Roberto Agostino Vitillo 4e8aaabbb5 Merge pull request #26 from fbertsch/add_logging
Bug 1299840 - Log requests against aggregator service
2016-12-30 15:36:33 +01:00
Frank Bertsch ba1aa97621 Increase version number 2016-12-30 06:11:58 -06:00
Frank Bertsch 31aea0445b Log requests against the service
We want to be able to track how the service is being used. This analysis
will aid in future iterations of the aggregator.
2016-12-30 06:11:58 -06:00
Roberto Agostino Vitillo 0cd72c6f94 Merge pull request #25 from fbertsch/use_main_pings
Bug 1286868 - Use main pings instead of saved_session
2016-12-15 19:21:59 +00:00
Frank Bertsch e592e1464f Increase version number 2016-12-15 10:54:08 -06:00
Frank Bertsch e8dad504e9 Use main pings instead of saved_session
Saved_session pings are being deprecated.
However, main pings are both opt-in and opt-out, while saved_session
were just opt-in. For this reason we are filtering to include only
opt-in users, so the results should be similar.

Note that results will not be identical, since saved_session pings
often lag main pings, due to the main ping submission on date split.
2016-12-15 10:22:24 -06:00
Roberto Agostino Vitillo 62e0a03b5c Merge pull request #24 from fbertsch/handle_null
Handle null unicode char in labels
2016-12-08 15:46:23 -10:00
Frank Bertsch eb968c5ca3 Increase version number 2016-12-08 14:54:02 -06:00
Frank Bertsch e821905701 Handle null unicode char in labels
Postgres doesn't handle char \u0000, so we'll have to ignore it.
We can't just use printable chars, since there aren't any requirements
for keyed histograms labels to be limited to those.
2016-12-08 14:54:02 -06:00
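
A minimal sketch of one way to handle this, assuming the null character is simply stripped from the label before it is written to Postgres. The commit only says the character is ignored, so the original may instead skip such labels entirely; the function name is hypothetical.

```python
def strip_null_chars(label):
    """Remove U+0000 from a label while leaving all other characters intact.

    Non-printable characters other than the null char are kept on purpose,
    since keyed histogram labels are not restricted to printable text.
    """
    return label.replace(u"\u0000", u"")
```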
Roberto Agostino Vitillo 2ef938d4d1 Revert "Use main pings instead of saved_session"
This reverts commit da1f17a72e. The
change has caused catastrophic failures of the aggregation job.
2016-11-25 07:32:22 +00:00
Roberto Agostino Vitillo 74b55127cf Merge pull request #23 from mozilla/fix
Deal with unexpected types when checking if telemetry is enabled.
2016-11-24 15:34:09 +00:00
Roberto Agostino Vitillo 7db8ca2767 Deal with unexpected types when checking if telemetry is enabled. 2016-11-24 14:13:51 +00:00
Frank Bertsch bea994e030 Bump version number 2016-11-23 15:36:38 +00:00
Frank Bertsch da1f17a72e Use main pings instead of saved_session
Saved_session pings are being deprecated.
However, main pings are both opt-in and opt-out, while saved_session
were just opt-in. For this reason we are filtering to include only
opt-in users, so the results should be similar.

Note that results will not be identical, since saved_session pings
often lag main pings, due to the main ping submission on date split.
2016-11-23 15:36:38 +00:00
Roberto Agostino Vitillo da7fc00833 Merge pull request #20 from fbertsch/use_dataset_api
Bug 1316721 - Use Dataset API
2016-11-21 15:54:48 +00:00
Frank Bertsch 5787dc5e00 Increase version number 2016-11-21 09:35:41 -06:00