Allows mach commands to define their own glean metrics with the `metrics_path` @CommandProvider parameter.
When `metrics_path` is defined:
* A `metrics` kwarg is provided to the decorated class. This `metrics` handle is a Glean instance, so Glean documentation should be consulted for usage information.
* When `mach doc telemetry` is run, metrics docs will be generated from all the registered metrics files.
Note: there was some consideration between making `metrics_path` a @CommandProvider or @Command parameter.
In the end, @CommandProvider seemed like a better fit because:
* Metrics seem to be more associated with the entire class than a specific command/method. This is because a class represents a "domain", and that domain may have different commands that have overlapping metrics. Accordingly, all the metrics should be defined once as available to the entire class.
* Currently, @Command methods only take parameters that map one-to-one with CLI arguments. It could seem inconsistent to have one exception: the metrics handle
Differential Revision: https://phabricator.services.mozilla.com/D85953
In the patch for bug 1656993, the case in which
get_command was being set was removed.
Accordingly, its usage in CommandAction will always be evaluated to
`False`, and it can be deleted.
Differential Revision: https://phabricator.services.mozilla.com/D90198
In addition to the existing build telemetry, also gather the stats and
report with Glean. This new telemetry is reported in tandem with the existing
telemetry to allow testing and confidence before a full roll-out.
Additionally, Glean isn't compatible with Python 2, so the new telemetry only runs
on Python 3 mach commands.
Differential Revision: https://phabricator.services.mozilla.com/D83572
This, hopefully, begins to address an ongoing global problem where we have few, if any, insights into the performance of individual build tasks (compilations, calls into Python scripts, etc.) At most we have aggregated statistics about how long tiers last, combined with `sccache` aggregates across the entire build (which don't cover non-compilation tasks). This has a few implications:
1. It's impossible to identify bottlenecks, except by going out of your way to notice and reproduce them. e.g. no one, to my knowledge, was aware that `make_dafsa.py` was a bottleneck until someone happened to notice and report it in bug 1629337. We could have systems that automatically detect this sort of thing, or at least that make it easier to do so than by CTRL-C'ing in the middle of the build several times to try to reproduce the problem.
2. It's impossible to detect regressions, unless the regression is so pronounced and severe that it has an immediate impact on the overall build time and triggers build time alerts.
3. It's impossible to identify that you have *fixed* regressions, except by doing ad-hoc timing measurements by building individual `make` targets. This is error-prone and annoying.
Here we propose a low-friction system wherein individual build tasks log their build own perf info. For now, that's a write to `stdout` consisting of the string `BUILDTASK ` followed by a simple JSON object with a start time, end time, the `argv` of the task, and an additional `"context"` key (I anticipate this could be used to annotate the task with relevant per-task for later aggregation, for example: was this an `sccache` cache hit or not? For now, it's empty everywhere). The build controller then collects this data, validates it, and writes out the entire list of build tasks as a JSON file after the build has completed, similarly to what we already do with `build_resources.json`. We already parse some `make` output to do stuff like tracking when we switch tiers, so this isn't a huge architectural shift or anything.
In my opinion this "should" happen at the build system, or `make`, level, but `make` doesn't expose anything resembling this information to my knowledge, so this has to be implemented outside of `make`. One could implement something like this at the `sccache` level but that doesn't touch anything but C/C++/Rust compilation tasks; an ideal solution would support other generic build tasks. We could also fork `make` to add this feature ourselves, but for several reasons I don't think that's tractable. :)
Of course, this approach has downsides:
1. We depend on parsing the `stdout` of `make`, and processes can unfortunately sometimes trample on each other, leading to data loss for individual build tasks occasionally. This is a necessary limitation of the model to my knowledge, and I don't know that it can be fixed generally. In my testing, not much data tends to be lost usually.
2. Dumping arbitrary data to `stdout` isn't always possible or desirable. If you're not careful about it this can also result in noisier-than-necessary tasks, especially when those tasks are not invoked by a parent process that knows how to handle the special `BUILDTASK` lines.
3. This data is raw enough where aggregation is not completely trivial.
4. This functionality has to be added for any new kind of build task whose performance we'd like to track; it doesn't come "for free" due to not being able to be implemented at the build system level.
5. The data isn't awfully small due to the `argv`'s (at this point, not nearly big enough where we need to be concerned about it IMO, but maybe that will change in the future?)
One can imagine a couple other architectures that could avoid the first two problems, namely: 1) we could use a "real" database that would not dump info to `stdout` and wouldn't lose data, like `sqlite3`; or, 2) we could set up another server, similar to `sccache`, that collects this data from subprocesses and aggregates it, making sure not to lose any along the way. Both of these have enough overhead, in terms of engineering effort or actual impact on latency, where I dont know that they make any sense to even attempt implementing. The remaining continue to be real issues, however.
After this is landed there are a few ways forward. We can start uploading these files as build artifacts in CI to allow us to reason about performance impacts of changes in `central`. We can easily add this functionality to the `sccache` client to start tracking those builds as well. We already have a very simple visualization of build tier timing in `mach resource-usage`; we could join that data against the `BUILDTASK` data to produce a very clear visualization of build bottlenecks, i.e., "why is the `export` tier taking so long", etc.
Differential Revision: https://phabricator.services.mozilla.com/D80284
In two different places we've been encountering issues regarding 1) how we configure the system Python environment and 2) how the system Python environment relates to the `virtualenv`s that we use for building, testing, and other dev tasks. Specifically:
1. With the push to use `glean` for telemetry in `mach`, we are requiring (or rather, strongly encouraging) the `glean_sdk` Python package to be installed with bug 1651424. `mach bootstrap` upgrades the library using your system Python 3 in bug 1654607. We can't vendor it due to the package containing native code. Since we generally vendor all code required for `mach` to function, requiring that the system Python be configured with a certain version of `glean` is an unfortunate change.
2. The build uses the vendored `glean_parser` for a number of build tasks. Since the vendored `glean_parser` conflicts with the globally-installed `glean_sdk` package, we had to add special ad-hoc handling to allow us to circumvent this conflict in bug 1655781.
3. We begin to rely more and more on the `zstandard` package during build tasks, this package again being one that we can't vendor due to containing native code. Bug 1654994 contained more ad-hoc code which subprocesses out from the build system's `virtualenv` to the SYSTEM `python3` binary, assuming that the system `python3` has `zstandard` installed.
As we rely more on `glean_sdk`, `zstandard`, and other packages that are not vendorable, we need to settle on a standard model for how `mach`, the build process, and other `mach` commands that may make their own `virtualenv`s work in the presence of unvendorable packages.
With that in mind, this patch does all the following:
1. Separate out the `mach` `virtualenv_packages` from the in-build `virtualenv_packages`. Refactor the common stuff into `common_virtualenv_packages.txt`. Add functionality to the `virtualenv_packages` manifest parsing to allow the build `virtualenv` to "inherit" from the parent by pointing to the parent's `site-packages`. The `in-virtualenv` feature from bug 1655781 is no longer necessary, so delete it.
2. Add code to `bootstrap`, as well as a new `mach` command `create-mach-environment` to create `virtualenv`s in `~/.mozbuild`.
3. Add code to `mach` to dispatch either to the in-`~/.mozbuild` `virtualenv`s (or to the system Python 3 for commands which cannot run in the `virtualenv`s, namely `bootstrap` and `create-mach-environment`).
4. Remove the "add global argument" feature from `mach`. It isn't used and conflicts with (3).
5. Remove the `--print-command` feature from `mach` which is obsoleted by these changes.
This has the effect of allowing us to install packages that cannot be vendored into a "common" place (namely the global `~/.mozbuild` `virtualenv`s) and use those from the build without requiring us to hit the network. Miscellaneous implementation notes:
1. We allow users to force running `mach` with the system Python if they like. For now it doesn't make any sense to require 100% of people to create these `virtualenv`s when they're allowed to continue on with the old behavior if they like. We also skip this in CI.
2. We needed to duplicate the global-argument logic into the `mach` script to allow for the dispatch behavior. This is something we avoided with the Python 2 -> Python 3 migration with the `--print-command` feature, justifying its use by saying it was only temporarily required until all `mach` commands were running with Python 3. With this change, we'll need to be able to determine the `mach` command from the shell script for the forseeable future, and committing to this forever with the cost that `--print-command` incurs (namely `mach` startup time, an additional .4s on my machine) didn't seem worth it to me. It's not a ton of duplicated code.
Differential Revision: https://phabricator.services.mozilla.com/D85916
Now you can pass the `virtualenv_name` kwarg to the `Command` decorator which will configure the `_virtualenv_manager` accordingly.
Differential Revision: https://phabricator.services.mozilla.com/D86256
Now you can pass the `virtualenv_name` kwarg to the `Command` decorator which will configure the `_virtualenv_manager` accordingly.
Differential Revision: https://phabricator.services.mozilla.com/D86256
Today we don't require that `mach` `CommandProvider`s subclass from any particular parent class and we're very lax about the requirements they must meet. While that's convenient in certain circumstances, it has some unfortunate implications for feature development.
Today the only requirements that we have for `CommandProvider`s are that they have an `__init__()` method that takes either 1 or 2 arguments, the second of which must be called `context` and is populated with the `mach` `CommandContext`. Again, while this flexibility is occasionally convenient, it is limiting. As we add features to `mach`, having a better idea what the shape of our `CommandProvider`s are and how we can instantiate them and use them is increasingly important, and this gives us additional control when having `mach` configure `CommandProvider`s based on data that is only available at the `mach` level. In particular, we plan to leverage this in bugs 985141 and 1654074.
Here we add validation to the `CommandProvider` decorator to ensure all classes inherit from `MachCommandBase`, update all `CommandProvider`s in-tree to inherit from `MachCommandBase`, and update source and test code accordingly.
Follow-up work: we now require (de facto) that the `context` be populated with a `topdir` attribute by the `populate_context_handler` function, since instantiating the `MachCommandBase` requires a `topdir` be provided. This is fine for now in the interest of keeping this patch reasonably sized, but some additional refactoring could make this cleaner.
Differential Revision: https://phabricator.services.mozilla.com/D86255
These tests depend on the `mach uuid` command which was deleted with bug 1639509, and now that `mach uuid` is gone it's broken unconditionally. We could replace the reference to `uuid` with a new no-op `mach` command, but we're in the process of replacing our telemetry code with use of the `glean` API; and the new telemetry code won't have the same semantics (namely, we are unlikely to want to continue to guarantee that sub-`mach` invocations aren't covered by telemetry), so this test might as well just be deleted now.
Differential Revision: https://phabricator.services.mozilla.com/D85911
@CommandProvider does parameter validation and collects information (such
as "pass_context") that will be needed by Registrar.
However, rather than just checking parameter length, we can make it more
specific and assert that the specific expected parameter ("context") exists.
Differential Revision: https://phabricator.services.mozilla.com/D85482
Sentry is initialized globally, but it's not clear to consumers when this actually happens.
For example, an unwary developer may call report_exception() within check_and_get_mach(),
not knowing that Sentry hasn't been initialized yet.
Using a class should make the dependency on register_sentry() more verbose.
Depends on D80913
Differential Revision: https://phabricator.services.mozilla.com/D80918
Sentry needs to be able to send data in a minimal environment, and intelligently determining
the "topobjdir" (without leaning on Mach) is tough.
Instead, since the "topobjdir" usually falls under the "topsrcdir", just leaning on <topsrcdir>
patching should be sufficient.
Differential Revision: https://phabricator.services.mozilla.com/D80122
To avoid sending identifying information, common absolute paths are patched with placeholder values. For example, devs
may place their Firefox repository within their home dir, so absolute paths are doctored to be prefixed with
"<topsrcdir"> instead.
Additionally, any paths including the user's home directory are patched to instead be a relate path from "~".
Differential Revision: https://phabricator.services.mozilla.com/D78962
To avoid sending identifying information, common absolute paths are patched with placeholder values. For example, devs
may place their Firefox repository within their home dir, so absolute paths are doctored to be prefixed with
"<topsrcdir"> instead.
Additionally, any paths including the user's home directory are patched to instead be a relate path from "~".
Differential Revision: https://phabricator.services.mozilla.com/D78962
It is possible to have both a default command (with positional arguments) and
sub-commands (with arguments) in mach. If the subcommand exists, it
is dispatched to; if it doesn't the default command is called the positional
argument filled in.
However, when you run ./mach command --help, it will detect the subcommands
and only print out their help section. If the default command has arguments,
they were not printed out. Now they are.
Small papercuts in this patch are that the Default Command Arguments are
printed after the subcommands and that subcommand help without default
arguments have an extra newline after them. Both of these seem small
enough that the refactoring necessary to abate them is undesirable.
Differential Revision: https://phabricator.services.mozilla.com/D76505
These variables were renamed in bug 1641992, and furthermore they were changed to "templates" (with inline `%s` for string formatting), so fix the tests with that in mind.
Differential Revision: https://phabricator.services.mozilla.com/D77697
I misinterpreted one of the conditionals when working on bug 1641991 (the "not X or Y" formulation confused me :( ). The correct fix here is to remove this conditional which isn't relevant any more after we removed all the old pre-`mochitest` mach commands.
Differential Revision: https://phabricator.services.mozilla.com/D77692
With the addition of this change, a lone `mach busted file` will throw an error (since this should only ever be used by actual humans, and not in automation or anything, the backwards-compatibility breakage isn't a huge deal). Now it's expected to pass in `mach busted file $COMMAND`, where $COMMAND is the mach command that you were running when the error occurred, and we'll figure out which bugzilla component to file the bug against for you instead of directing it to `Firefox Build System :: General` regardless of whether the issue has *anything* to do with the build system. We preserve `mach busted file general` as a backup command that doesn't do any of this heavy logic.
Differential Revision: https://phabricator.services.mozilla.com/D77538
These commands were removed 5 years ago (with the exception of robocop, which was removed about a year ago). We're way past the point where anyone would glean useful info from keeping the stubs here with an error message.
Differential Revision: https://phabricator.services.mozilla.com/D77536
There are `conditions` in tree that are callables but which don't have a `__name__` attribute; for example, `functools.partial` instances don't have a `__name__` since they're effectively anonymous functions. If you get to this branch and one of your `conditions` are that kind of object then you'll get a confusing error message instead of the understandable one we're trying to produce here, so account for that possibility.
Differential Revision: https://phabricator.services.mozilla.com/D72957
The existence of this environment variable breaks the Python shell on Windows, so make sure it's unset (but only in this case to avoid regressing bug 1627873).
Differential Revision: https://phabricator.services.mozilla.com/D70542
This solves an edge case where tab completing a Python 2 command with global
arguments was using the wrong Python version.
Differential Revision: https://phabricator.services.mozilla.com/D56445
--HG--
extra : moz-landing-system : lando
When the build is stalled for some random reason, and mach executed a
python subcommand (this may or may not be limited to python 3, I'm not
sure, and it doesn't really matter since it's a problem on python 3,
which matters most), the subcommand may not have actually sent its last
bits of output before the stall because its output is a pipe and in that
case python uses buffered outputs.
Now, when your build is completely stalled and you ctrl+C, you end up
without these bits of output, and in some cases, those bits of output
can contain actual information, like... tracebacks.
A real life example of this is bug 1624670 when running mach build or
mach configure with the patches from bug 1627163 applied, and configure
stalls without printing out the ValueError message at all.
Differential Revision: https://phabricator.services.mozilla.com/D69925
--HG--
extra : moz-landing-system : lando