firefox-translations-training/3rd_party
Evgeny Pavlov 0e757b0070
Integrate OpusTrainer (#219)
    integrated OpusTrainer in train.sh
    added a dataset importer that can augment datasets for evaluation
    removed the teacher fine-tuning step; pre-training and fine-tuning are now done in one step
    removed the merge-augmented step
    adjusted pipeline settings to work with a larger amount of data
    modified the Snakemake pipeline accordingly, but did not test it
    updated browsermt marian
    added docs
    added unit tests
2023-11-17 16:59:02 -08:00
browsermt-marian-dev@11c6ae7c46 Integrate OpusTrainer (#219) 2023-11-17 16:59:02 -08:00
extract-lex@42fa605b53 Initial pipeline (#1) 2021-06-17 15:39:15 -07:00
fast_align@cab1e9aac8 Initial pipeline (#1) 2021-06-17 15:39:15 -07:00
kenlm@bbf4fc5112 Bicleaner support + fixes (#13) 2021-07-26 10:00:49 -07:00
marian-dev@e8a1a2530f Bugfix and optimization (#41) 2022-01-05 13:24:05 -08:00
preprocess@64307314b4 Integrate deduplication in the pipeline (#70) 2022-02-11 16:50:41 -08:00