Add an action to rebuild pipeline toolchains and docker images (#798)

* Add an action to rebuild pipeline toolchains

* Rename action to rebuild-docker-images-and-toolchains and include docker-image and fetch tasks

* add documentation on how to manually rebuild cached tasks

---------

Co-authored-by: Ben Hearsum <ben@mozilla.com>
This commit is contained in:
Gabriel Bustamante 2024-09-04 13:35:29 -05:00 committed by GitHub
Parent 7b672cf8d8
Commit 68aa0a7377
No key found matching this signature
GPG Key ID: B5690EEEBB952194
3 changed files with 43 additions and 0 deletions

View File

@@ -115,6 +115,12 @@ previous_group_ids: ["SsGpi3TGShaDT-h93fHL-g"]
Note: This feature should _never_ be used for production training, as it completely bypasses all caching mechanisms, and you will most likely end up with invalid or useless models.
## Dealing with expired upstream tasks
All tasks eventually expire, and have their artifacts and metadata deleted from Taskcluster, typically 1 year after creation. This can cause problems if it happens while partway through a training session. This happens most commonly with tasks that are shared across multiple training runs, such as `toolchain` and `docker-image` tasks. When this happens you can use the "Rebuild Docker Images and Toolchains" action to rebuild these, and add the task group they are rebuilt in to the `previous_group_ids` when kicking off a training run.
You may also use this action immediately before starting training for a new language pair to ensure that it uses fresh toolchains and docker images, which will typically avoid this problem altogether.
## Interactive Tasks
Taskcluster allows authorized users to run so-called [interactive tasks](https://docs.taskcluster.net/docs/reference/workers/docker-worker/features#feature-interactive). These tasks allow users to gain a shell in the same environment that a pipeline step runs in. This can often be useful for quicker debugging or testing of ideas.

View File

@@ -5,6 +5,7 @@ def register(graph_config):
    _import_modules(
        [
            "actions.train",
            "actions.rebuild_docker_images_and_toolchains",
            "parameters",
            "target_tasks",
        ]

View File

@@ -0,0 +1,36 @@
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.

from taskgraph.actions.registry import register_callback_action
from taskgraph.actions.util import create_tasks, fetch_graph_and_labels


@register_callback_action(
    name="rebuild-docker-images-and-toolchains",
    title="Rebuild Docker Images and Toolchains",
    symbol="images-and-toolchains",
    description="Create docker-image and toolchain tasks to rebuild their artifacts.",
    order=1000,
    context=[],
)
def rebuild_docker_images_and_toolchains_action(
    parameters, graph_config, input, task_group_id, task_id
):
    decision_task_id, full_task_graph, label_to_task_id = fetch_graph_and_labels(
        parameters, graph_config, task_group_id=task_group_id
    )
    tasks_to_create = [
        label
        for label, task in full_task_graph.tasks.items()
        if task.kind == "docker-image" or task.kind == "fetch" or task.kind == "toolchain"
    ]
    if tasks_to_create:
        create_tasks(
            graph_config,
            tasks_to_create,
            full_task_graph,
            label_to_task_id,
            parameters,
            decision_task_id,
        )
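
Review note: the label selection in the new action can be tried in isolation, without a Taskcluster connection. The sketch below is not part of the commit. `FakeTask`, `select_labels_to_rebuild`, `REBUILD_KINDS`, and the example labels are hypothetical stand-ins; only the three kind names come from the diff above.

```python
from dataclasses import dataclass

# Kinds the action rebuilds, copied from the condition in the diff above.
REBUILD_KINDS = {"docker-image", "fetch", "toolchain"}


@dataclass
class FakeTask:
    """Hypothetical stand-in for a taskgraph task; only `kind` is modelled."""

    kind: str


def select_labels_to_rebuild(tasks):
    """Return the labels whose task kind is in REBUILD_KINDS, in input order."""
    return [label for label, task in tasks.items() if task.kind in REBUILD_KINDS]


# Made-up labels, for illustration only.
example_tasks = {
    "docker-image-base": FakeTask("docker-image"),
    "fetch-example": FakeTask("fetch"),
    "toolchain-example": FakeTask("toolchain"),
    "train-example": FakeTask("train"),
}

selected = select_labels_to_rebuild(example_tasks)
print(selected)
```

The set membership test is equivalent to the chained `or` comparison in the action. Note that `fetch` tasks are selected as well, although the action's `description` string mentions only docker-image and toolchain tasks.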