Commit graph

35 commits

Author SHA1 Message Date
Valentin Rigal c0a9585a34
Actualize documentation about experiment tracking on Weight & Biases (#861)
* Update existing documentation for the tracking module

* Add doc for Weight & Biases

* Suggestions & Nits

---------

Co-authored-by: Evgeny Pavlov <epavlov@mozilla.com>
2024-10-01 11:09:26 -07:00
Greg Tatum 9d355d82fe
Rewrite train.sh to train.py (#842)
* Add a run_pipeline utility

* Add more tests for training

* Rewrite train.sh into train.py

* Add the pipeline to the PYTHONPATH

* Ensure that the W&B tracker throws errors in CI

* Add the Taskcluster environment variables so test-fast works on the train test

* Address review comments
2024-09-18 09:04:48 -05:00
Ben Hearsum (he/him) 3e921d2a8d
restrict github-push taskcluster events to `main` (#777)
* restrict github-push taskcluster events to `main`, `release*`, and `dev*`

In https://bugzilla.mozilla.org/show_bug.cgi?id=1907217 we're becoming more explicit about scopes we grant to branches in Github, which means branches that do not show up in the explicit list in fxci-config (https://github.com/mozilla-releng/fxci-config/blob/main/projects.yml) will not be able to start tasks.

* Update documentation and training helper script to take into account supported branches for Taskcluster
2024-09-06 13:53:31 -04:00
Gabriel Bustamante 68aa0a7377
Add an action to rebuild pipeline toolchains and docker images (#798)
* Add an action to rebuild pipeline toolchains

* Rename action to rebuild-docker-images-and-toolchains and include docker-image and fetch tasks

* add documentation on how to manually rebuild cached tasks

---------

Co-authored-by: Ben Hearsum <ben@mozilla.com>
2024-09-04 14:35:29 -04:00
Greg Tatum 43d5680620
Add a train task (#812)
* Add a task for triggering training

* Update the training guide
2024-08-28 14:46:10 -05:00
Valentin Rigal b20c6247c0
Parse stalled validation data (#637)
* Add missing validation metrics

* Allow validation entries missing stalled value

* Update tests

* Support learning rate

* Update test fixtures for W&B

* Suggestion

---------

Co-authored-by: Evgeny Pavlov <epavlov@mozilla.com>
2024-05-30 10:45:50 -07:00
Evgeny Pavlov 25a33add45
Custom cleaning (#547)
* Update default config

* Pre-download fast text model

* Add custom filters

* Add unit tests for config generation

* Make using custom filtering configs configurable

* Fix substitution
2024-05-16 14:27:27 -07:00
Evgeny Pavlov 63e273c0b3
Add ability to switch to a one-stage teacher training (#596)
* Add ability to switch to a one-stage teacher training

* Fix type name

* Update more configs

* Add extra argument
2024-05-15 12:17:51 -07:00
Greg Tatum 157fc5e8ae
Update the training continuation docs (#540) 2024-04-29 14:36:14 -05:00
Greg Tatum e8c6f2e8d3
Remove the Makefile and replace it with a Taskfile (#510) 2024-04-09 16:11:13 -05:00
Greg Tatum f60c657596
Update the docs for training continuation to use yaml (#516) 2024-04-09 13:54:18 -05:00
Evgeny Pavlov a3bb87c069
Update docs for OpusTrainer and alignments (#504)
* Update OpusTrainer docs with inline noise

* Update and refactor documentation for pipeline steps
2024-03-28 18:18:23 -07:00
Greg Tatum 65ca580a16
Add support for custom corpora through remote URLs (#420) 2024-03-06 13:03:40 -06:00
Valentin Rigal 8012cd30cf
Update parser documentation (#462)
Co-authored-by: Evgeny Pavlov <epavlov@mozilla.com>
2024-02-29 14:57:22 -08:00
Evgeny Pavlov 58cce071ef
Support typos and noise modifiers (#428)
* Update opustrainer

* Adjust configs

* Add evaluation modifiers

* Reduce noise

* Add tests for typos and noise

* Fix typos augmenter

* Fix linting issues

* Update docs

* Update opustrainer

* Adjust configs

* Add evaluation modifiers

* Reduce noise

* Add tests for typos and noise

* Fix typos augmenter

* Fix linting issues

* Update docs

* Fix test

* Update opus trainer

* Remove noise parameters from config

* Update opustrainer with fixes

* Run linter

* Fix tests after merge

* Disable noise for student

* Update lockfile

* Fix formatting

* Disable typos for student

* Rename assert functions

* Switch back to faster validation

* Document decision on using augmentations

* Fix typo
2024-02-15 15:33:24 -08:00
Ben Hearsum (he/him) 70fede467f
Add the ability to run starting from a specific task (fixes #227) (#377)
* Add the ability to run starting from a specific task (fixes #227)

A couple of example runs with this:
* https://firefox-ci-tc.services.mozilla.com/tasks/groups/YHAr0HzwSSe4pe5Yh9dIlg uses https://firefox-ci-tc.services.mozilla.com/tasks/groups/JjNp3KcyTUObUtOA9BgK5g as its `previous-group-id` with `start-stage: train-backwards` and `target-stage: train-teacher` - and ends up running `train-backwards`, `translate-mono-trg`, `collect-mono-trg`, and `train-teacher`.
* https://firefox-ci-tc.services.mozilla.com/tasks/groups/Sm0YV_8LQP-EOE8Nz6G5Lw uses the above group as its `previous-group-id` with `start-stage: train-teacher` and `target-stage: all`. Note that it ended up depending on tasks from both the above group and the one that it was based on, and ended up scheduling `train-teacher` and everything after it (I didn't bother letting them all run - I think the scheduling is enough to verify this).
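
The stage selection described in the two example runs above can be illustrated with a minimal sketch. It is not the taskgraph implementation: the dependency edges, the placeholder stages `some-earlier-stage` and `some-later-stage`, and the helper names `reachable` and `stages_to_run` are all assumptions made for illustration. Only the four stage names from the first example run come from the commit message. The sketch assumes a run covers every stage that is both downstream of `start-stage` and upstream of `target-stage`; it does not model `target-stage: all` or the reuse of tasks from `previous-group-id`.

```python
from collections import deque

# Hypothetical dependency graph: each stage maps to the stages it depends on.
# The edges and the two "some-*" stages are assumptions for this sketch.
DEPENDENCIES = {
    "some-earlier-stage": [],
    "train-backwards": ["some-earlier-stage"],
    "translate-mono-trg": ["train-backwards"],
    "collect-mono-trg": ["translate-mono-trg"],
    "train-teacher": ["collect-mono-trg"],
    "some-later-stage": ["train-teacher"],
}


def reachable(start, edges):
    """Return every node reachable from `start` (inclusive) by following `edges`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def stages_to_run(dependencies, start_stage, target_stage):
    """Select stages that lie between start_stage and target_stage (inclusive)."""
    # Invert the graph so we can walk from a stage to the stages that depend on it.
    dependents = {}
    for stage, deps in dependencies.items():
        dependents.setdefault(stage, [])
        for dep in deps:
            dependents.setdefault(dep, []).append(stage)

    downstream_of_start = reachable(start_stage, dependents)
    upstream_of_target = reachable(target_stage, dependencies)
    selected = downstream_of_start & upstream_of_target

    # Preserve the declaration order of the dependency mapping.
    return [stage for stage in dependencies if stage in selected]


if __name__ == "__main__":
    print(stages_to_run(DEPENDENCIES, "train-backwards", "train-teacher"))
```

Under these assumed edges, the first example run's parameters select the same four stages the commit message lists, while the placeholder stages before and after them are left out.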

Big thanks to @gabrielBusta for suggesting this implementation!

* Update poetry dependencies to pull in newer taskgraph version
2024-02-14 09:07:07 -05:00
Evgeny Pavlov 31311927ef
Move snakemake to a separate folder (#431)
* Move snakemake code to a separate folder

* Small fixes

* Run linter

* Revert formatting

* Fix readme
2024-02-09 09:46:52 -08:00
Evgeny Pavlov afad4f4cad
Tune sentencepiece alphas (#421)
* Increase sp alpha and move to configs

* Add docs

* Update docs/training-guide.md

Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>

* Update docs/training-guide.md

Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>

* Update docs/training-guide.md

Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>

---------

Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>
2024-02-06 12:23:18 -08:00
Ben Hearsum (he/him) 437ceac078
Add documentation on how to monitor CPU, GPU, etc. on training instances (#398) 2024-01-29 11:35:22 -05:00
Gabriel Bustamante 4218387361
[skip ci] add docs on `pretrained-models` configuration parameter (#349) 2024-01-25 13:52:07 -08:00
Greg Tatum 7f43bd0c7d
Point to the docs for marian args (#381) 2024-01-25 07:31:58 -06:00
Greg Tatum 7ccb5eba7d
Add some docs to what dedupe is (#379) 2024-01-25 07:21:10 -06:00
Evgeny Pavlov 99f2397ebf
Always use Bicleaner AI (#367)
* Use only bicleaner ai

* Remove test command

* Disable hard rules for multilingual model

* Change taskcluster kinds

* Remove bicleaner

* Fix bicleaner model step

* remove bicleaner

* Fix find upstream

* Add toolchain

* Fix arg type

* Don't delete tmp dir

* Fix artefacts

* Fix artifacts

* Fix linter issue

* Fix path

* Rename pack dir

* Add tests

* Fix typo

* Replace rename to move

* Bump max run time

* Remove expiration

* Fix docs and clarify caching strategy

* Fix doc

* Revert order

* Small fixes

* Fix typo

* Use data dir fixture

* Fix comment

* Remove unused item
2024-01-24 11:46:44 -08:00
Evgeny Pavlov 61f5ec2761
Fix unstable training (#352)
* Adjust opus trainer settings

* Fix optimizer delay

* Use default learning rate

* Enable back translations

* Report learning rate for teacher

* Remove old link

* Match validation and save frequency

* Roll back learning rate

* Disable snakemake dry run

* Add a note about optimizer delay

* Add a link to opus trainer paper
2024-01-17 11:52:25 -08:00
Valentin Rigal d35f28e542
Add publication package (#309)
* Add documentation

* Move publication parser prototype

From https://github.com/mozilla/translations-experiment-tracking/pull/4
Commit a06886e0

* Update parser package for translations main repo

* Remove pre-commit rules

* Apply black

* Update parser code

* Remove package and pin requirements

* Nits/Fixes

* Fix taskcluster naming

* Move parser to 'tracking' root folder

* Switch to pyproject.toml + pinned dependencies

* Add a sample for experiments structure

* Update metrics parser

* Add speed metrics

* Only publish metrics in a bar chart

* Publish fake run at last

* Linting and small fixes

* Merge .gitignore

* Handle pushing metrics when no logs are available

* Add tests

* Fix tests for CI job

* rename Taskcluster sample file

* Suggestions

* Add type hints + parser refactoring

* Improve typing + run static checker (Mypy)

* Suggestions

* Update tests

* Invert metrics data order (bleu_detok, chrf)

* Update CI tests task

* Fix lint

* Update poetry.lock

* Fix tests in CI

* Fix hardcoded path
* Add missing experiments/logs folder (ignored by git)

* Group experiments to analyze by alphabetic order

---------

Co-authored-by: Bastien Abadie <bastien@nextcairn.com>
Co-authored-by: Evgeny Pavlov <epavlov@mozilla.com>
2024-01-11 13:25:53 -08:00
Ben Hearsum (he/him) 8e99d1c2bb
Update interactive task documentation to work with docker-worker & generic-worker (#329) 2023-12-22 14:05:14 -05:00
Greg Tatum 742fb8f999
Add documentation to various parts of the scripts and pipeline (#298) 2023-12-15 13:34:05 -06:00
Greg Tatum 1273a0ef3f
Fix the docs building by using the remote theme (#310) 2023-12-15 11:29:56 -06:00
Greg Tatum 4a7d718415
Add a serve-docs make command (#302) 2023-12-15 09:15:08 -06:00
Ben Hearsum (he/him) e3740ebef0
Add documentation on how to run an interactive task (#307) 2023-12-14 17:11:59 -05:00
Evgeny Pavlov 0e757b0070
Integrate OpusTrainer (#219)
* integrated OpusTrainer in train.sh
* added dataset importer that can augment datasets for evaluation
* removed teacher fine-tuning step. The pre-training and fine-tuning are now done in one step
* removed merge-augmented step
* adjusted pipeline settings to work with a higher amount of data
* modified the Snakemake pipeline accordingly but didn't test
* updated browsermt marian
* added docs
* added unit tests
2023-11-17 16:59:02 -08:00
Evgeny Pavlov 2df0a3a905
Update the training guide (#239)
* Update training guide

* Fix docs

* Add index file

* Remove header

* Fix docs link

* Remove tensorboard section

* Add theme

* Update navigation

* Add logo

* Use absolute links

* Fix code links

* Fix code links

* Fix link

* Clarify what config is

* Fix note for bicleaner

Co-authored-by: Marco Castelluccio <mcastelluccio@mozilla.com>

* Fix typo

Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>

* Fix link

* Fix mentioning of Marian

Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>

* Remove "my"

* Make note about snakemake more visible

* Fix phrasing

* Add link to bicleaner paper

* Add clarifications

* Add links to default training configs

* Add reference to bicleaner section

* Small fixes

---------

Co-authored-by: Marco Castelluccio <mcastelluccio@mozilla.com>
Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>
2023-11-06 10:03:17 -08:00
Evgeny Pavlov d79162cdd2
Add tensorboard util (#233)
* Add tensorboard util

* Use python TC lib for artifact downloading
2023-10-25 12:49:40 -07:00
Evgeny Pavlov 83d43bfcf6
Update docs (#224)
* Update docs

* Fix typos

* Fix TC docs

* Fix relative links
2023-10-16 16:33:29 -07:00
Evgeny Pavlov e9102a37ef
Integrate OpusCleaner (#163)
* Initial integration of opus cleaner

* Support custom filters

* Use opus cleaner in pipeline

* Fix env

* Fix filter generation

* Add more rules

* Fix elrc filter

* Fix env

* Fix frequent patterns filter

* Switch to reading from stdin

* Add a feature flag for opus cleaner

* Fix condition

* Add extra test for non empty files

* Integrate with TC

* Run linter

* Fix step config

* Fix step config

* Fix step config

* Fix step config

* Fix command

* Fix path

* Update OpusCleaner

* Remove warning

* Log filtered length

* Add opuscleaner logs

* Add comments

* Fix using custom filters

* Extract function

* Change the CI target back

* Fix file path

* Replace conda with poetry

* Add doc

* Add more comments

* Rename example filter

* Test corpus

* Fix filter name

* Use opus dataset instead of mtdata

* Make CI faster

* Add sections to makefile

* Fix custom filter search

* Redirect stderr to stdout

* Fix usage of custom config

* Fix config name

* Change back to all
2023-09-26 15:29:07 -07:00