зеркало из https://github.com/mozilla/gecko-dev.git
218 строки
8.5 KiB
ReStructuredText
218 строки
8.5 KiB
ReStructuredText
Transforms
|
|
==========
|
|
|
|
Many task kinds generate tasks by a process of transforming job descriptions
|
|
into task definitions. The basic operation is simple, although the sequence of
|
|
transforms applied for a particular kind may not be!
|
|
|
|
Overview
|
|
--------
|
|
|
|
To begin, a kind implementation generates a collection of items; see
|
|
:doc:`loading`. The items are simply Python dictionaries, and describe
|
|
"semantically" what the resulting task or tasks should do.
|
|
|
|
The kind also defines a sequence of transformations. These are applied, in
|
|
order, to each item. Early transforms might apply default values or break
|
|
items up into smaller items (for example, chunking a test suite). Later
|
|
transforms rewrite the items entirely, with the final result being a task
|
|
definition.
|
|
|
|
Transform Functions
|
|
...................
|
|
|
|
Each transformation looks like this:
|
|
|
|
.. code-block:: python
|
|
|
|
@transforms.add
|
|
def transform_an_item(config, items):
|
|
"""This transform ...""" # always a docstring!
|
|
for item in items:
|
|
# ..
|
|
yield item
|
|
|
|
The ``config`` argument is a Python object containing useful configuration for
|
|
the kind, and is a subclass of
|
|
:class:`taskgraph.transforms.base.TransformConfig`, which specifies a few of
|
|
its attributes. Kinds may subclass and add additional attributes if necessary.
|
|
|
|
While most transforms yield one item for each item consumed, this is not always
|
|
true: items that are not yielded are effectively filtered out. Yielding
|
|
multiple items for each consumed item implements item duplication; this is how
|
|
test chunking is accomplished, for example.
|
|
|
|
The ``transforms`` object is an instance of
|
|
:class:`taskgraph.transforms.base.TransformSequence`, which serves as a simple
|
|
mechanism to combine a sequence of transforms into one.
|
|
|
|
Schemas
|
|
.......
|
|
|
|
The items used in transforms are validated against some simple schemas at
|
|
various points in the transformation process. These schemas accomplish two
|
|
things: they provide a place to add comments about the meaning of each field,
|
|
and they enforce that the fields are actually used in the documented fashion.
|
|
|
|
Keyed By
|
|
........
|
|
|
|
Several fields in the input items can be "keyed by" another value in the item.
|
|
For example, a test description's chunks may be keyed by ``test-platform``.
|
|
In the item, this looks like:
|
|
|
|
.. code-block:: yaml
|
|
|
|
chunks:
|
|
by-test-platform:
|
|
linux64/debug: 12
|
|
linux64/opt: 8
|
|
android.*: 14
|
|
default: 10
|
|
|
|
This is a simple but powerful way to encode business rules in the items
|
|
provided as input to the transforms, rather than expressing those rules in the
|
|
transforms themselves. If you are implementing a new business rule, prefer
|
|
this mode where possible. The structure is easily resolved to a single value
|
|
using :func:`taskgraph.transform.base.resolve_keyed_by`.
|
|
|
|
Exact matches are used immediately. If no exact matches are found, each
|
|
alternative is treated as a regular expression, matched against the whole
|
|
value. Thus ``android.*`` would match ``android-arm/debug``. If nothing
|
|
matches as a regular expression, but there is a ``default`` alternative, it is
|
|
used. Otherwise, an exception is raised and graph generation stops.
|
|
|
|
Organization
|
|
-------------
|
|
|
|
Task creation operates broadly in a few phases, with the interfaces of those
|
|
stages defined by schemas. The process begins with the raw data structures
|
|
parsed from the YAML files in the kind configuration. This data can processed
|
|
by kind-specific transforms resulting, for test jobs, in a "test description".
|
|
For non-test jobs, the next step is a "job description". These transformations
|
|
may also "duplicate" tasks, for example to implement chunking or several
|
|
variations of the same task.
|
|
|
|
In any case, shared transforms then convert this into a "task description",
|
|
which the task-generation transforms then convert into a task definition
|
|
suitable for ``queue.createTask``.
|
|
|
|
Test Descriptions
|
|
-----------------
|
|
|
|
Test descriptions specify how to run a unittest or talos run. They aim to
|
|
describe this abstractly, although in many cases the unique nature of
|
|
invocation on different platforms leaves a lot of specific behavior in the test
|
|
description, divided by ``by-test-platform``.
|
|
|
|
Test descriptions are validated to conform to the schema in
|
|
``taskcluster/gecko_taskgraph/transforms/test/__init__.py``. This schema is
|
|
extensively documented and is a the primary reference for anyone modifying
|
|
tests.
|
|
|
|
The output of ``tests.py`` is a task description. Test dependencies are
|
|
produced in the form of a dictionary mapping dependency name to task label.
|
|
|
|
Job Descriptions
|
|
----------------
|
|
|
|
A job description says what to run in the task. It is a combination of a
|
|
``run`` section and all of the fields from a task description. The run section
|
|
has a ``using`` property that defines how this task should be run; for example,
|
|
``mozharness`` to run a mozharness script, or ``mach`` to run a mach command.
|
|
The remainder of the run section is specific to the run-using implementation.
|
|
|
|
The effect of a job description is to say "run this thing on this worker". The
|
|
job description must contain enough information about the worker to identify
|
|
the workerType and the implementation (docker-worker, generic-worker, etc.).
|
|
Alternatively, job descriptions can specify the ``platforms`` field in
|
|
conjunction with the ``by-platform`` key to specify multiple workerTypes and
|
|
implementations. Any other task-description information is passed along
|
|
verbatim, although it is augmented by the run-using implementation.
|
|
|
|
The run-using implementations are all located in
|
|
``taskcluster/gecko_taskgraph/transforms/job``, along with the schemas for their
|
|
implementations. Those well-commented source files are the canonical
|
|
documentation for what constitutes a job description, and should be considered
|
|
part of the documentation.
|
|
|
|
following ``run-using`` are available
|
|
|
|
* ``hazard``
|
|
* ``mach``
|
|
* ``mozharness``
|
|
* ``mozharness-test``
|
|
* ``run-task``
|
|
* ``spidermonkey`` or ``spidermonkey-package``
|
|
* ``debian-package``
|
|
* ``ubuntu-package``
|
|
* ``toolchain-script``
|
|
* ``always-optimized``
|
|
* ``fetch-url``
|
|
* ``python-test``
|
|
|
|
|
|
Task Descriptions
|
|
-----------------
|
|
|
|
Every kind needs to create tasks, and all of those tasks have some things in
|
|
common. They all run on one of a small set of worker implementations, each
|
|
with their own idiosyncrasies. And they all report to TreeHerder in a similar
|
|
way.
|
|
|
|
The transforms in ``taskcluster/gecko_taskgraph/transforms/task.py`` implement
|
|
this common functionality. They expect a "task description", and produce a
|
|
task definition. The schema for a task description is defined at the top of
|
|
``task.py``, with copious comments. Go forth and read it now!
|
|
|
|
In general, the task-description transforms handle functionality that is common
|
|
to all Gecko tasks. While the schema is the definitive reference, the
|
|
functionality includes:
|
|
|
|
* TreeHerder metadata
|
|
|
|
* Build index routes
|
|
|
|
* Information about the projects on which this task should run
|
|
|
|
* Optimizations
|
|
|
|
* Defaults for ``expires-after`` and and ``deadline-after``, based on project
|
|
|
|
* Worker configuration
|
|
|
|
The parts of the task description that are specific to a worker implementation
|
|
are isolated in a ``task_description['worker']`` object which has an
|
|
``implementation`` property naming the worker implementation. Each worker
|
|
implementation has its own section of the schema describing the fields it
|
|
expects. Thus the transforms that produce a task description must be aware of
|
|
the worker implementation to be used, but need not be aware of the details of
|
|
its payload format.
|
|
|
|
The ``task.py`` file also contains a dictionary mapping treeherder groups to
|
|
group names using an internal list of group names. Feel free to add additional
|
|
groups to this list as necessary.
|
|
|
|
Signing Descriptions
|
|
--------------------
|
|
|
|
Signing kinds are passed a single dependent job (from its kind dependency) to act
|
|
on.
|
|
|
|
The transforms in ``taskcluster/gecko_taskgraph/transforms/signing.py`` implement
|
|
this common functionality. They expect a "signing description", and produce a
|
|
task definition. The schema for a signing description is defined at the top of
|
|
``signing.py``, with copious comments.
|
|
|
|
In particular you define a set of upstream artifact urls (that point at the
|
|
dependent task) and can optionally provide a dependent name (defaults to build)
|
|
for use in ``task-reference``/``artifact-reference``. You also need to provide
|
|
the signing formats to use.
|
|
|
|
More Detail
|
|
-----------
|
|
|
|
The source files provide lots of additional detail, both in the code itself and
|
|
in the comments and docstrings. For the next level of detail beyond this file,
|
|
consult the transform source under ``taskcluster/gecko_taskgraph/transforms``.
|