Commit graph

6 commits

Author SHA1 Message Date
Greg Tatum 9d355d82fe
Rewrite train.sh to train.py (#842)
* Add a run_pipeline utility

* Add more tests for training

* Rewrite train.sh into train.py

* Add the pipeline to the PYTHONPATH

* Ensure that the W&B tracker throws errors in CI

* Add the Taskcluster environment variables so test-fast works on the train test

* Address review comments
2024-09-18 09:04:48 -05:00
Evgeny Pavlov 63e273c0b3
Add ability to switch to a one-stage teacher training (#596)
* Add ability to switch to a one-stage teacher training

* Fix type name

* Update more configs

* Add extra argument
2024-05-15 12:17:51 -07:00
Evgeny Pavlov a3bb87c069
Update docs for OpusTrainer and alignments (#504)
* Update OpusTrainer docs with inline noise

* Update and refactor documentation for pipeline steps
2024-03-28 18:18:23 -07:00
Evgeny Pavlov 58cce071ef
Support typos and noise modifiers (#428)
* Update opustrainer

* Adjust configs

* Add evaluation modifiers

* Reduce noise

* Add tests for typos and noise

* Fix typos augmenter

* Fix linting issues

* Update docs

* Update opustrainer

* Adjust configs

* Add evaluation modifiers

* Reduce noise

* Add tests for typos and noise

* Fix typos augmenter

* Fix linting issues

* Update docs

* Fix test

* Update opus trainer

* Remove noise parameters from config

* Update opustrainer with fixes

* Run linter

* Fix tests after merge

* Disable noise for student

* Update lockfile

* Fix formatting

* Disable typos for student

* Rename assert functions

* Switch back to faster validation

* Document decision on using augmentations

* Fix typo
2024-02-15 15:33:24 -08:00
Evgeny Pavlov 61f5ec2761
Fix unstable training (#352)
* Adjust opus trainer settings

* Fix optimizer delay

* Use default learning rate

* Enable back translations

* Report learning rate for teacher

* Remove old link

* Match validation and save frequency

* Roll back learning rate

* Disable snakemake dry run

* Add a note about optimizer delay

* Add a link to opus trainer paper
2024-01-17 11:52:25 -08:00
Evgeny Pavlov 0e757b0070
Integrate OpusTrainer (#219)
* integrated OpusTrainer in train.sh

* added dataset importer that can augment datasets for evaluation

* removed teacher fine-tuning step. The pre-training and fine-tuning are now done in one step

* removed merge-augmented step

* adjusted pipeline settings to work with a higher amount of data

* modified the Snakemake pipeline accordingly but didn't test

* updated browsermt marian

* added docs

* added unit tests
2023-11-17 16:59:02 -08:00