Evgeny Pavlov
1e9799c9bd
Add a link to W&B dashboard ( #759 )
2024-07-19 14:13:28 -07:00
Evgeny Pavlov
31311927ef
Move snakemake to a separate folder ( #431 )
* Move snakemake code to a separate folder
* Small fixes
* Run linter
* Revert formatting
* Fix readme
2024-02-09 09:46:52 -08:00
Valentin Rigal
d35f28e542
Add publication package ( #309 )
* Add documentation
* Move publication parser prototype
From https://github.com/mozilla/translations-experiment-tracking/pull/4
Commit a06886e0
* Update parser package for translations main repo
* Remove pre-commit rules
* Apply black
* Update parser code
* Remove package and pin requirements
* Nits/Fixes
* Fix taskcluster naming
* Move parser to 'tracking' root folder
* Switch to pyproject.toml + pinned dependencies
* Add a sample for experiments structure
* Update metrics parser
* Add speed metrics
* Only publish metrics in a bar chart
* Publish fake run at last
* Linting and small fixes
* Merge .gitignore
* Handle pushing metrics when no logs are available
* Add tests
* Fix tests for CI job
* rename Taskcluster sample file
* Suggestions
* Add type hints + parser refactoring
* Improve typing + run static checker (Mypy)
* Suggestions
* Update tests
* Invert metrics data order (bleu_detok, chrf)
* Update CI tests task
* Fix lint
* Update poetry.lock
* Fix tests in CI
* Fix hardcoded path
* Add missing experiments/logs folder (ignored by git)
* Group experiments to analyze by alphabetic order
---------
Co-authored-by: Bastien Abadie <bastien@nextcairn.com>
Co-authored-by: Evgeny Pavlov <epavlov@mozilla.com>
2024-01-11 13:25:53 -08:00
Evgeny Pavlov
2df0a3a905
Update the training guide ( #239 )
* Update training guide
* Fix docs
* Add index file
* Remove header
* Fix docs link
* Remove tensorboard section
* Add theme
* Update navigation
* Add logo
* Use absolute links
* Fix code links
* Fix code links
* Fix link
* Clarify what config is
* Fix note for bicleaner
Co-authored-by: Marco Castelluccio <mcastelluccio@mozilla.com>
* Fix typo
Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>
* Fix link
* Fix mentioning of Marian
Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>
* Remove "my"
* Make note about snakemake more visible
* Fix phrasing
* Add link to bicleaner paper
* Add clarifications
* Add links to default training configs
* Add reference to bicleaner section
* Small fixes
---------
Co-authored-by: Marco Castelluccio <mcastelluccio@mozilla.com>
Co-authored-by: Greg Tatum <gregtatum@users.noreply.github.com>
2023-11-06 10:03:17 -08:00
Evgeny Pavlov
83d43bfcf6
Update docs ( #224 )
* Update docs
* Fix typos
* Fix TC docs
* Fix relative links
2023-10-16 16:33:29 -07:00
Evgeny Pavlov
ac9ceec855
Add link to blog post
2022-07-18 16:51:15 -07:00
Evgeny Pavlov
fc2b3b64f3
Add link to training guide
2022-07-18 15:31:46 -07:00
Evgeny Pavlov
7c58f6558b
Move configuration to profiles ( #96 )
* Move configuration settings to profiles
* Use relative paths
* Fix output formatting
* Update dag
* Update docs
2022-06-17 10:56:07 -07:00
Evgeny Pavlov
03a2ddaa3f
Update README.md
2022-04-26 17:30:01 -07:00
Evgeny Pavlov
355d9b958e
Add train vocab step
2022-04-22 12:50:19 -07:00
Evgeny Pavlov
551aeb5ea0
Add more references to publications
2022-04-22 12:44:27 -07:00
Amit Moryossef
b97da19bbb
Update README.md ( #86 )
2022-04-21 11:11:24 -07:00
Evgeny Pavlov
9fb8e9e0f8
Update README.md
2022-04-20 11:57:21 -07:00
Evgeny Pavlov
22a3751a09
Add support of Mozilla slurm cluster ( #72 )
2022-02-22 17:48:21 -08:00
Evgeny Pavlov
174cceaa6f
Bugfix and optimization ( #41 )
- bugfix
- training and decoding optimization
- evaluation refactoring
- small usability improvements
- moved marian configurations overriding back to configs
2022-01-05 13:24:05 -08:00
Evgeny Pavlov
3b3f33bf25
Quality improvements ( #29 )
2021-12-06 15:03:35 -08:00
Evgeny Pavlov
a09b0ac7ac
Update README.md
2021-10-28 11:07:11 -07:00
Evgeny Pavlov
ef8928b454
Snakemake integration ( #24 )
- workflow management using Snakemake
- parallelization to run on a cluster
- Singularity containerization support
- Slurm support
- teacher ensemble support
2021-10-28 10:39:09 -07:00
Evgeny Pavlov
0f6e64cf19
Minor improvements ( #20 )
- Flores dataset importer
- custom dataset importer
- ability to use a pre-trained backward model
- save experiment config on start
- stubs for dataset caching ( decided to sync implementation with workflow manager integration )
- use best bleu models instead of best ce-mean-words
- fix linting warnings
2021-08-17 13:20:34 -07:00
Evgeny Pavlov
ec783cfbbb
Bicleaner support + fixes ( #13 )
SacreBLEU is a regular importer now and evaluation is not limited to sacrebleu datasets.
fixes
Added bicleaner-ai and bicleaner filtering (one or another based on available pretrained language packs).
fixes
Added script to find all datasets based on language pair and importer type, ready to use in config
fixes
Fixed conda environment activation to be reproducible on GCP
Other minor reproducibility fixes
2021-07-26 10:00:49 -07:00
Evgeny Pavlov
af2abbf525
Add reference to bergamot project
2021-06-21 14:58:02 -07:00
Evgeny Pavlov
4b12dee551
Fix readme after renaming
2021-06-21 14:38:07 -07:00
Evgeny Pavlov
2bcdef2b36
Rename repo
2021-06-21 14:33:31 -07:00
Evgeny Pavlov
3bea08bf4a
Initial pipeline ( #1 )
2021-06-17 15:39:15 -07:00
Evgeny Pavlov
8d11fb1e97
Initial commit
2021-04-30 15:36:49 -07:00