Commit graph

31 commits

Author SHA1 Message Date
evgeny pavlov 233ea51a55 Relock poetry after conflicts 2024-08-29 13:03:01 -07:00
Greg Tatum 43d5680620
Add a train task (#812)
* Add a task for triggering training

* Update the training guide
2024-08-28 14:46:10 -05:00
Ben Hearsum (he/him) f66a7b67fa
feat: add scaffolding and basic tests for taskgraph generation (#776)
This is prep work for https://github.com/mozilla/firefox-translations-training/issues/628, where I'd like to add some tests to avoid regressing that again in the future.

The fixtures here are based on similar tests from Gecko: https://searchfox.org/mozilla-central/source/taskcluster/test. There's a bit of a terrible hack to make optimized task graphs testable, described more in the comments.
2024-08-07 13:13:07 -04:00
Ben Hearsum (he/him) c01034ee78
chore: bump taskgraph to 9.2.0 (#738)
* chore: bump taskgraph to 10.0.1

This picks up some fixes that are expected to fix #680.

I'm picking up other dependency updates as well, most notably to redo (2.x -> 3.x). That major bump is just because it's dropping Python 2.x support, which doesn't affect us.

* fix: ensure mkdir /builds always succeeds in base docker image

This may be implicitly done because it is referenced in a `VOLUME`. See https://taskcluster-taskgraph.readthedocs.io/en/latest/reference/migrations.html#x-10-x.

* fix: don't try to decompress fetched python wheels or npz files
2024-07-24 19:25:02 -04:00
Valentin Rigal 794bdb2240
Rebase on main @5d35e4a3 (#696) 2024-07-02 12:04:49 -07:00
Evgeny Pavlov 61a2704711
Fix poetry lock (#706)
* Revert "Use pip-compile for tracking dependencies (#695)"

This reverts commit 24748d0608.

* Fix numpy issue
2024-06-27 10:34:19 -07:00
Ben Hearsum (he/him) 2b9e53e0c5
chore: upgrade to Taskgraph 9 (#665)
This is primarily to pick up https://github.com/taskcluster/taskgraph/pull/514, which will be needed for #466.
2024-06-19 10:15:13 -04:00
Ben Hearsum (he/him) a5dd406ff4
Bump taskcluster-taskgraph to 8.2.0 in poetry (#672)
* Bump taskcluster-taskgraph to 8.2.0 in poetry

* fix: run tests when taskcluster configs changes

Some of these files influence test outcomes
2024-06-19 09:11:46 -04:00
Greg Tatum 56040c94b9
Automatically generate training config files with the `task config-generator` (#620)
* Create a util to automatically generate configs

* Add the generated configs

* Update the config generation script

* Update the configs

* Update the configs

* Address review comments for the config generator

* Fix find_corpus test
2024-05-24 16:09:05 -05:00
Greg Tatum da880da8be
Use a virtual environment per requirements.txt file in run_task (#568)
* Add missing mtdata dependency

* Remove pipeline python dependencies from pyproject.toml

* Use the requirements.txt file for run_task

* Add venv support to the CI Dockerfile for the testing image

* Add timing information to the taskgraph generation and a flag to disable the generation
2024-05-13 12:31:51 -05:00
Greg Tatum e8c6f2e8d3
Remove the Makefile and replace it with a Taskfile (#510) 2024-04-09 16:11:13 -05:00
Evgeny Pavlov fab87a7a70
Add support of inline noise data augmentation (#502)
* Add eflomal based aligner

* Use new aligner for shortlist

* Remove old aligner

* Add Taskcluster steps for whitespace tokenized alignments

* Move file to a renamed directory

* Use Tags modifier in training

* Update tests for alignments and shortlist

* Add support of inline noise augmentation in data importer

* Do not use slow inline noise augmentation in devset on CI

* Remove the old alignments task

* Add a test for student alignments

* Fix alignments in training tests

* Return matplotlib module after merge

* Rename functions

* Add more comments in the code

* Remove compression env

* Relock poetry
2024-03-28 18:10:02 -07:00
Evgeny Pavlov 3774779cb7
Add Marian server for model testing (#492)
* Compile marian server

* Add Marian server for testing

* Reformat

* Update utils/marian_client.py

Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>

* Make port configurable

* Relock poetry

---------

Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>
2024-03-28 15:53:16 -07:00
Greg Tatum 78977402e0
Analysis task that provides the word distribution (#477) 2024-03-26 13:28:43 -05:00
Valentin Rigal 3f135aa115
Taskcluster task group publication (#406)
* Base taskcluster task group publication

* Move tag parser to utils module

* Support metrics

* Support multiple teacher training

* Fix parsing for evaluation folder

* Generic group logs parser

* Parse extra evaluation tasks and publish group_logs fake run

* Publish Marian config on runs

* Publish marian config on runs instead of experiment config

* Rebase vrigal:publish-experiment-config

* Publish experiment config on group_logs
2024-02-16 09:05:01 -08:00
Evgeny Pavlov 58cce071ef
Support typos and noise modifiers (#428)
* Update opustrainer

* Adjust configs

* Add evaluation modifiers

* Reduce noise

* Add tests for typos and noise

* Fix typos augmenter

* Fix linting issues

* Update docs

* Update opustrainer

* Adjust configs

* Add evaluation modifiers

* Reduce noise

* Add tests for typos and noise

* Fix typos augmenter

* Fix linting issues

* Update docs

* Fix test

* Update opus trainer

* Remove noise parameters from config

* Update opustrainer with fixes

* Run linter

* Fix tests after merge

* Disable noise for student

* Update lockfile

* Fix formatting

* Disable typos for student

* Rename assert functions

* Switch back to faster validation

* Document decision on using augmentations

* Fix typo
2024-02-15 15:33:24 -08:00
Evgeny Pavlov 190358a923
Fix linting for tracking (#441)
* Fix linting pythonpath for tracking

* Add pythonpath to the rest of the commands

* Remove pythonpath

* Update lockfile

* Fix wandb directory in tests
2024-02-15 09:21:33 -08:00
Ben Hearsum (he/him) 70fede467f
Add the ability to run starting from a specific task (fixes #227) (#377)
* Add the ability to run starting from a specific task (fixes #227)

A couple of example runs with this:
* https://firefox-ci-tc.services.mozilla.com/tasks/groups/YHAr0HzwSSe4pe5Yh9dIlg uses https://firefox-ci-tc.services.mozilla.com/tasks/groups/JjNp3KcyTUObUtOA9BgK5g as its `previous-group-id` with `start-stage: train-backwards` and `target-stage: train-teacher` - and ends up running `train-backwards`, `translate-mono-trg`, `collect-mono-trg`, and `train-teacher`.
* https://firefox-ci-tc.services.mozilla.com/tasks/groups/Sm0YV_8LQP-EOE8Nz6G5Lw uses the above group as its `previous-group-id` with `start-stage: train-teacher` and `target-stage: all`. Note that it ended up depending on tasks from both the above group and the one that it was based on, and ended up scheduling `train-teacher` and everything after it (I didn't bother letting them all run - I think the scheduling is enough to verify this).

Big thanks to @gabrielBusta for suggesting this implementation!

* Update poetry dependencies to pull in newer taskgraph version
2024-02-14 09:07:07 -05:00
Greg Tatum f4ded7d07f
Add huggingface to the find_corpus (#397) 2024-01-26 15:01:19 -06:00
Greg Tatum 673916fbf5
Make CI happy again (#362)
* Add --no-root to fix linting issue

* Add version number

* Update ruff

* Remove pytest clarity

* Switch tests/test_tracking_cli.py to assert on unordered sets
2024-01-13 12:06:07 -06:00
Greg Tatum f583322300
Add a preflight check utility (#353) 2024-01-12 09:35:18 -06:00
Valentin Rigal d35f28e542
Add publication package (#309)
* Add documentation

* Move publication parser prototype

From https://github.com/mozilla/translations-experiment-tracking/pull/4
Commit a06886e0

* Update parser package for translations main repo

* Remove pre-commit rules

* Apply black

* Update parser code

* Remove package and pin requirements

* Nits/Fixes

* Fix taskcluster naming

* Move parser to 'tracking' root folder

* Switch to pyproject.toml + pinned dependencies

* Add a sample for experiments structure

* Update metrics parser

* Add speed metrics

* Only publish metrics in a bar chart

* Publish fake run at last

* Linting and small fixes

* Merge .gitignore

* Handle pushing metrics when no logs are available

* Add tests

* Fix tests for CI job

* rename Taskcluster sample file

* Suggestions

* Add type hints + parser refactoring

* Improve typing + run static checker (Mypy)

* Suggestions

* Update tests

* Invert metrics data order (bleu_detok, chrf)

* Update CI tests task

* Fix lint

* Update poetry.lock

* Fix tests in CI

* Fix hardcoded path

* Add missing experiments/logs folder (ignored by git)

* Group experiments to analyze by alphabetic order

---------

Co-authored-by: Bastien Abadie <bastien@nextcairn.com>
Co-authored-by: Evgeny Pavlov <epavlov@mozilla.com>
2024-01-11 13:25:53 -08:00
Evgeny Pavlov b253a1ce6b
Fix install opuscleaner (#350)
* Update and enable opuscleaner

* Remove comment
2024-01-10 12:02:22 -08:00
Greg Tatum e48440fc2c
Fix the vocab training script for Taskcluster (#326) 2023-12-22 09:40:46 -06:00
Evgeny Pavlov 2d4530d0f5
Always split corpus to a fixed number of parts (#308)
* Always split corpus to a fixed number of parts

* Fix splitting

* Rewrite corpus splitting in Python

* Replace in taskcluster

* Add tests

* Unify compression tool with Taskcluster

* Move zstd installation to docker image

* Disable opuscleaner in CI

* Compress chunks

* Fix file names

* Remove zeros from file index

* Start file index with 1

* Fix corpus splitting

* Add a link to an issue

* Generate script description from doc

* Use new test dir

* Use new test dir

* Test command line args

* Clarify expected files

* Add logging
2023-12-19 15:25:33 -08:00
Greg Tatum d1be2bca4a
Update the find corpus tool to provide more information (#280)
* Add pytest-clarity for better text diffs in tests

* Add requests_mock for tests

* Add the test_data artifact to the .gitignore

* Use an underscore with find_corpus.py

* Update the find corpus tool to provide more information

* Add humanize to the dependency list
2023-12-12 15:08:59 -06:00
Evgeny Pavlov 0e757b0070
Integrate OpusTrainer (#219)
* integrated OpusTrainer in train.sh

* added dataset importer that can augment datasets for evaluation

* removed teacher fine-tuning step. The pre-training and fine-tuning are now done in one step

* removed merge-augmented step

* adjusted pipeline settings to work with a higher amount of data

* modified the Snakemake pipeline accordingly but didn't test

* updated browsermt marian

* added docs

* added unit tests
2023-11-17 16:59:02 -08:00
Evgeny Pavlov d79162cdd2
Add tensorboard util (#233)
* Add tensorboard util

* Use python TC lib for artifact downloading
2023-10-25 12:49:40 -07:00
Evgeny Pavlov e9102a37ef
Integrate OpusCleaner (#163)
* Initial integration of opus cleaner

* Support custom filters

* Use opus cleaner in pipeline

* Fix env

* Fix filter generation

* Add more rules

* Fix elrc filter

* Fix env

* Fix frequent patterns filter

* Switch to reading from stdin

* Add a feature flag for opus cleaner

* Fix condition

* Add extra test for non empty files

* Integrate with TC

* Run linter

* Fix step config

* Fix step config

* Fix step config

* Fix step config

* Fix command

* Fix path

* Update OpusCleaner

* Remove warning

* Log filtered length

* Add opuscleaner logs

* Add comments

* Fix using custom filters

* Extract function

* Change the CI target back

* Fix file path

* Replace conda with poetry

* Add doc

* Add more comments

* Rename example filter

* Test corpus

* Fix filter name

* Use opus dataset instead of mtdata

* Make CI faster

* Add sections to makefile

* Fix custom filter search

* Redirect stderr to stdout

* Fix usage of custom config

* Fix config name

* Change back to all
2023-09-26 15:29:07 -07:00
Evgeny Pavlov 299d41c34b
Add TC test run to CI (#195)
* Add snakemake test run to CI

* Add toolchain

* Add docker image

* Reduce datasets

* Move ci to a separate config

* Add utils to poetry

* Fix config

* Fix config

* Disable docker

* Use test docker image

* Fix artifacts dir

* Fix tests

* Fix profile setting

* Fix root dir

* Faster translation

* Expose artifacts

* Change default TC config

* Fix default TC config

* Disable snakemake run

* Enable running on PR

* Fix ci config

* Add vocab size argument

* Retrigger CI

* Add a comment on snakemake run

* Use a smaller teacher model for CI

* Try to retrigger downloading

* Use the same year for mono src and trg

* Revert changes [skip ci]

* Revert test config [skip ci]

* Fix comment [skip ci]
2023-09-20 09:40:30 -07:00
Greg Tatum 96b9b3f16b
Add ruff and black linting to the CI (#187)
* Add python's black formatter

* Apply black formatting

* Add the ruff linter

* Run make lint-fix

* Suppress or fix lint issues

* Add a fix-all make command
2023-09-08 09:50:24 -05:00