This fixes a crashing bug, that probably started being an issue since
March 2020 / Glean v26, when we removed finalizers from the Kotlin
library[1]. Depending on how GC runs it was theoretically possible to
hit that even with finalizers in the SDK.
Contrary to the commit message in [1] we do have one more place where we
dynamically construct metrics: Labeled Metrics!
It used to be the case that whenever you access the submetric of a
labeled metric, that is `Metrics.Category.Name["label"]`, we created a
new Rust object, put that into the internal ConcurrentHandleMap and
returned a handle to it.
A ConcurrentHandleMap does a lot of work to ensure that you can't misuse
handles against the wrong map, when the item was already replaced, etc.
It also sets an upper limit of entries it can handle[2].
That limit is high enough for the static use of Rust (you would have to
use 32767+ metrics of a single type), but quickly[3] breaks down
The solution is to not re-create labeled submetrics on every access.
We cache those ourselves now, identified by the parent's metric handle
and the label.
Now accessing the same label 32768+ times is fine, you get the same
Rust object every time (as long as it exists).
There's one last piece: We do expose a way to destroy metrics, even if
that shouldn't happen for any statically defined ones.
This is not happening on Kotlin, where we removed finalizers some time
ago [1].
We're now removing those finalizers on Swift and Python as well.
We could try to get smart and do some double-checks that the handle is
not stale when the submetric is accessed.
But that breaks down quickly, because the metric could be destroyed
between the access and the recording, still leading to issues.
The only way to handle that would be some reference counting on the
object itself, but the current implementation doens't easily allow that.
So for now we just don't destroy boolean/counter/string metrics, ever.
The included tests fail without the corresponding library changes,
giving us a bit more confidence that the fix works.
Now there's still one way to trigger a crash, in theory:
Keep accessing previously-unused labels 32768+ times.
If you do that you're really on your own.
[1]: 25bf9ea0b2
[2]: 0fdc22a8df/src/handle_map.rs (L160-L166)
[3]: Well, "quickly". Reproducing that on get.webgl.org takes 8+ minutes.
* Support Python 3.10
This also uses the "next gen" Python Docker images from CircleCI.
Also removes some `persist_to_workspace` that is no longer needed.
* Fix YAML syntax
* Don't use sudo
* Use legacy Docker images for Windows tests
* Use next gen images for Windows
* Use wine-stable
* Remove comment
This relies on the background operation capabilities that are built into URLSession, which we were already using wrapped in an `Operation`. This removes the wrapping, using just the URLOperation capabilities.
Update changelog
Notable changes:
* Reserve the default ping name. It can't be used as a ping name, but it can be used in send_in_pings
* New lint: Check for redundant words in ping names
Because of the native code shipped it's essential that all downstream
users rely on the same Glean version.
Gradle can be overeager in chosing a single version when multiple are in
the dependency tree, always picking the latest, even across potentially
breaking-change (that is: major) releases.
Gradle has a way to influence dependency resolution and can even
disallow differing versions using `failOnVersionConflict`.
But that applies to all dependencies and there's no native way to
enforce that for a single dependency.
So the recommended solution[1] is to "manually" check if a dependency was
resolved from multiple versions and raise an issue.
For the ease of consumption by users, such as Fenix, we can apply this
in the Gradle plugin, because that's guaranteed to execute as part of
the build (because if not how are they even using Glean?)
[1]: https://github.com/gradle/gradle/issues/8813
There's a chance that the app is killed at any point.
If that happens while setting core metrics, on next start a ping might
be sent, which is then missing data.
Some of this data is required; and if missing the pipeline rejects the
ping during validation. By moving these metrics first we increase the
chance for them to be set.
This is a very desperate attempt on mitigating missing data.
* bug 1672951 - Add a Metrics Ping Scheduler to glean-core
Mostly a translation from other LBs' MPSes, but with a few changes.
Most notably a split of scheduling from ping submission that makes for
easier-to-proxy operations that are easier to test.
Also notable is the use of a condvar for cancellable tasks.
* add new Configuration parameter, fix RLB ping tests
* add unit to comment
Co-authored-by: Jan-Erik Rediger <badboy@archlinux.us>
Co-authored-by: Jan-Erik Rediger <badboy@archlinux.us>