Adding Catalan to English models (#72)

* Adding Catalan to English models

* Update evaluation results [skip ci]

* Update model registry [skip ci]

---------

Co-authored-by: CircleCI evaluation job <ci-models-evaluation@firefox-translations>
Andre Natal 2023-04-20 16:34:52 -07:00 committed by GitHub
Parent c652b869d8
Commit 89ef02f585
No key found matching this signature
GPG key ID: 4AEE18F83AFDEB23
75 changed files: 1387 additions and 1163 deletions

View file

@ -56,92 +56,59 @@ Both absolute and relative differences in BLEU scores between Bergamot and other
## avg
| Translator/Dataset | en-ru | ru-en | en-nl | fa-en | uk-en | en-fa | is-en | nl-en | en-uk |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 29.44 | 33.69 | 27.30 | 28.70 | 35.93 | 17.30 | 23.40 | 29.65 | 26.30 |
| google | 34.49 (+5.05, +17.15%) | 38.20 (+4.51, +13.38%) | 29.30 (+2.00, +7.33%) | 40.85 (+12.15, +42.33%) | 42.43 (+6.50, +18.09%) | 27.80 (+10.50, +60.69%) | 38.90 (+15.50, +66.24%) | 33.05 (+3.40, +11.47%) | 32.63 (+6.33, +24.08%) |
| microsoft | 33.62 (+4.18, +14.21%) | 38.38 (+4.68, +13.90%) | 28.80 (+1.50, +5.49%) | 36.15 (+7.45, +25.96%) | 42.30 (+6.37, +17.72%) | 20.50 (+3.20, +18.50%) | 38.17 (+14.77, +63.11%) | 32.60 (+2.95, +9.95%) | 32.03 (+5.73, +21.80%) |
| Translator/Dataset | ru-en | en-nl | en-ru | en-fa | nl-en | uk-en | fa-en | ca-en | en-uk | is-en |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 33.69 | 27.30 | 29.44 | 17.30 | 29.65 | 35.93 | 28.70 | 38.00 | 26.30 | 23.40 |
| google | 38.20 (+4.51, +13.38%) | 29.30 (+2.00, +7.33%) | 34.49 (+5.05, +17.15%) | 27.80 (+10.50, +60.69%) | 33.05 (+3.40, +11.47%) | 42.43 (+6.50, +18.09%) | 40.85 (+12.15, +42.33%) | 48.95 (+10.95, +28.82%) | 32.63 (+6.33, +24.08%) | 38.90 (+15.50, +66.24%) |
| microsoft | 38.38 (+4.68, +13.90%) | 28.80 (+1.50, +5.49%) | 33.62 (+4.18, +14.21%) | 20.50 (+3.20, +18.50%) | 32.60 (+2.95, +9.95%) | 42.30 (+6.37, +17.72%) | 36.15 (+7.45, +25.96%) | 46.50 (+8.50, +22.37%) | 32.03 (+5.73, +21.80%) | 38.17 (+14.77, +63.11%) |
![Results](img/avg-bleu.png)
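
The parenthesized values in these tables can be recomputed from the raw scores. The following is a minimal sketch, assuming the relative difference is taken with respect to the Bergamot score; the function name `score_delta` is hypothetical and not part of the evaluation scripts. Because the `avg` table shows rounded averages, a recomputed value may differ from the published one in the last digit.

```python
def score_delta(baseline: float, other: float) -> tuple[float, float]:
    """Return (absolute difference, relative difference in percent) of `other` vs `baseline`."""
    absolute = other - baseline
    relative = absolute / baseline * 100.0
    return absolute, relative


# en-ru averages from the BLEU table above: bergamot 29.44, google 34.49.
absolute, relative = score_delta(29.44, 34.49)

# Format the cell the same way the tables do: score (absolute, relative%).
cell = f"{34.49:.2f} ({absolute:+.2f}, {relative:+.2f}%)"
print(cell)
```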
---
## en-ru
| Translator/Dataset | wmt20 | wmt13 | flores-test | flores-dev | wmt21 | wmt19 | wmt17 | wmt16 | wmt15 | wmt14 | wmt22 | wmt18 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 22.00 | 26.20 | 29.20 | 29.90 | 25.50 | 31.40 | 33.60 | 30.90 | 31.40 | 38.20 | 26.50 | 28.50 |
| google | 27.20 (+5.20, +23.64%) | 28.00 (+1.80, +6.87%) | 34.40 (+5.20, +17.81%) | 34.90 (+5.00, +16.72%) | 30.00 (+4.50, +17.65%) | 32.90 (+1.50, +4.78%) | 38.90 (+5.30, +15.77%) | 35.00 (+4.10, +13.27%) | 36.90 (+5.50, +17.52%) | 45.70 (+7.50, +19.63%) | 35.00 (+8.50, +32.08%) | 35.00 (+6.50, +22.81%) |
| microsoft | 26.30 (+4.30, +19.55%) | 27.30 (+1.10, +4.20%) | 33.60 (+4.40, +15.07%) | 33.50 (+3.60, +12.04%) | 29.20 (+3.70, +14.51%) | 33.20 (+1.80, +5.73%) | 38.60 (+5.00, +14.88%) | 34.20 (+3.30, +10.68%) | 36.10 (+4.70, +14.97%) | 44.70 (+6.50, +17.02%) | 33.10 (+6.60, +24.91%) | 33.70 (+5.20, +18.25%) |
![Results](img/en-ru-bleu.png)
---
## ru-en
| Translator/Dataset | flores-dev | mtedx_test | wmt18 | wmt20 | wmt19 | wmt15 | wmt17 | wmt14 | wmt16 | wmt22 | wmt13 | flores-test | wmt21 |
| Translator/Dataset | mtedx_test | wmt19 | wmt17 | flores-dev | wmt22 | flores-test | wmt14 | wmt15 | wmt16 | wmt13 | wmt18 | wmt21 | wmt20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 31.90 | 24.00 | 31.90 | 35.00 | 39.10 | 33.50 | 37.60 | 37.80 | 33.00 | 38.50 | 29.30 | 31.00 | 35.40 |
| google | 38.40 (+6.50, +20.38%) | 25.10 (+1.10, +4.58%) | 37.30 (+5.40, +16.93%) | 38.40 (+3.40, +9.71%) | 42.80 (+3.70, +9.46%) | 38.60 (+5.10, +15.22%) | 42.70 (+5.10, +13.56%) | 42.70 (+4.90, +12.96%) | 37.60 (+4.60, +13.94%) | 43.70 (+5.20, +13.51%) | 32.20 (+2.90, +9.90%) | 37.30 (+6.30, +20.32%) | 39.80 (+4.40, +12.43%) |
| microsoft | 36.50 (+4.60, +14.42%) | 26.20 (+2.20, +9.17%) | 37.40 (+5.50, +17.24%) | 38.80 (+3.80, +10.86%) | 43.80 (+4.70, +12.02%) | 38.50 (+5.00, +14.93%) | 43.70 (+6.10, +16.22%) | 44.10 (+6.30, +16.67%) | 38.40 (+5.40, +16.36%) | 43.90 (+5.40, +14.03%) | 32.50 (+3.20, +10.92%) | 36.10 (+5.10, +16.45%) | 39.00 (+3.60, +10.17%) |
| bergamot | 24.00 | 39.10 | 37.60 | 31.90 | 38.50 | 31.00 | 37.80 | 33.50 | 33.00 | 29.30 | 31.90 | 35.40 | 35.00 |
| google | 25.10 (+1.10, +4.58%) | 42.80 (+3.70, +9.46%) | 42.70 (+5.10, +13.56%) | 38.40 (+6.50, +20.38%) | 43.70 (+5.20, +13.51%) | 37.30 (+6.30, +20.32%) | 42.70 (+4.90, +12.96%) | 38.60 (+5.10, +15.22%) | 37.60 (+4.60, +13.94%) | 32.20 (+2.90, +9.90%) | 37.30 (+5.40, +16.93%) | 39.80 (+4.40, +12.43%) | 38.40 (+3.40, +9.71%) |
| microsoft | 26.20 (+2.20, +9.17%) | 43.80 (+4.70, +12.02%) | 43.70 (+6.10, +16.22%) | 36.50 (+4.60, +14.42%) | 43.90 (+5.40, +14.03%) | 36.10 (+5.10, +16.45%) | 44.10 (+6.30, +16.67%) | 38.50 (+5.00, +14.93%) | 38.40 (+5.40, +16.36%) | 32.50 (+3.20, +10.92%) | 37.40 (+5.50, +17.24%) | 39.00 (+3.60, +10.17%) | 38.80 (+3.80, +10.86%) |
![Results](img/ru-en-bleu.png)
---
## en-nl
| Translator/Dataset | flores-test | flores-dev |
| Translator/Dataset | flores-dev | flores-test |
| --- | --- | --- |
| bergamot | 27.00 | 27.60 |
| google | 29.20 (+2.20, +8.15%) | 29.40 (+1.80, +6.52%) |
| microsoft | 28.60 (+1.60, +5.93%) | 29.00 (+1.40, +5.07%) |
| bergamot | 27.60 | 27.00 |
| google | 29.40 (+1.80, +6.52%) | 29.20 (+2.20, +8.15%) |
| microsoft | 29.00 (+1.40, +5.07%) | 28.60 (+1.60, +5.93%) |
![Results](img/en-nl-bleu.png)
---
## fa-en
## en-ru
| Translator/Dataset | flores-dev | flores-test |
| --- | --- | --- |
| bergamot | 29.10 | 28.30 |
| google | 42.00 (+12.90, +44.33%) | 39.70 (+11.40, +40.28%) |
| microsoft | 36.50 (+7.40, +25.43%) | 35.80 (+7.50, +26.50%) |
| Translator/Dataset | wmt16 | wmt15 | flores-dev | wmt22 | wmt18 | wmt14 | wmt17 | wmt20 | wmt13 | wmt21 | wmt19 | flores-test |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 30.90 | 31.40 | 29.90 | 26.50 | 28.50 | 38.20 | 33.60 | 22.00 | 26.20 | 25.50 | 31.40 | 29.20 |
| google | 35.00 (+4.10, +13.27%) | 36.90 (+5.50, +17.52%) | 34.90 (+5.00, +16.72%) | 35.00 (+8.50, +32.08%) | 35.00 (+6.50, +22.81%) | 45.70 (+7.50, +19.63%) | 38.90 (+5.30, +15.77%) | 27.20 (+5.20, +23.64%) | 28.00 (+1.80, +6.87%) | 30.00 (+4.50, +17.65%) | 32.90 (+1.50, +4.78%) | 34.40 (+5.20, +17.81%) |
| microsoft | 34.20 (+3.30, +10.68%) | 36.10 (+4.70, +14.97%) | 33.50 (+3.60, +12.04%) | 33.10 (+6.60, +24.91%) | 33.70 (+5.20, +18.25%) | 44.70 (+6.50, +17.02%) | 38.60 (+5.00, +14.88%) | 26.30 (+4.30, +19.55%) | 27.30 (+1.10, +4.20%) | 29.20 (+3.70, +14.51%) | 33.20 (+1.80, +5.73%) | 33.60 (+4.40, +15.07%) |
![Results](img/fa-en-bleu.png)
---
## uk-en
| Translator/Dataset | flores-dev | wmt22 | flores-test |
| --- | --- | --- | --- |
| bergamot | 35.60 | 36.60 | 35.60 |
| google | 43.10 (+7.50, +21.07%) | 41.60 (+5.00, +13.66%) | 42.60 (+7.00, +19.66%) |
| microsoft | 41.80 (+6.20, +17.42%) | 44.40 (+7.80, +21.31%) | 40.70 (+5.10, +14.33%) |
![Results](img/uk-en-bleu.png)
![Results](img/en-ru-bleu.png)
---
## en-fa
| Translator/Dataset | flores-dev | flores-test |
| Translator/Dataset | flores-test | flores-dev |
| --- | --- | --- |
| bergamot | 17.20 | 17.40 |
| google | 27.20 (+10.00, +58.14%) | 28.40 (+11.00, +63.22%) |
| microsoft | 19.90 (+2.70, +15.70%) | 21.10 (+3.70, +21.26%) |
| bergamot | 17.40 | 17.20 |
| google | 28.40 (+11.00, +63.22%) | 27.20 (+10.00, +58.14%) |
| microsoft | 21.10 (+3.70, +21.26%) | 19.90 (+2.70, +15.70%) |
![Results](img/en-fa-bleu.png)
---
## is-en
| Translator/Dataset | flores-dev | flores-test | wmt21 |
| --- | --- | --- | --- |
| bergamot | 23.60 | 23.40 | 23.20 |
| google | 39.40 (+15.80, +66.95%) | 38.60 (+15.20, +64.96%) | 38.70 (+15.50, +66.81%) |
| microsoft | 37.30 (+13.70, +58.05%) | 36.70 (+13.30, +56.84%) | 40.50 (+17.30, +74.57%) |
![Results](img/is-en-bleu.png)
---
## nl-en
| Translator/Dataset | flores-dev | flores-test |
@ -153,13 +120,57 @@ Both absolute and relative differences in BLEU scores between Bergamot and other
![Results](img/nl-en-bleu.png)
---
## uk-en
| Translator/Dataset | flores-dev | wmt22 | flores-test |
| --- | --- | --- | --- |
| bergamot | 35.60 | 36.60 | 35.60 |
| google | 43.10 (+7.50, +21.07%) | 41.60 (+5.00, +13.66%) | 42.60 (+7.00, +19.66%) |
| microsoft | 41.80 (+6.20, +17.42%) | 44.40 (+7.80, +21.31%) | 40.70 (+5.10, +14.33%) |
![Results](img/uk-en-bleu.png)
---
## fa-en
| Translator/Dataset | flores-dev | flores-test |
| --- | --- | --- |
| bergamot | 29.10 | 28.30 |
| google | 42.00 (+12.90, +44.33%) | 39.70 (+11.40, +40.28%) |
| microsoft | 36.50 (+7.40, +25.43%) | 35.80 (+7.50, +26.50%) |
![Results](img/fa-en-bleu.png)
---
## ca-en
| Translator/Dataset | flores-dev | flores-test |
| --- | --- | --- |
| bergamot | 38.70 | 37.30 |
| google | 49.60 (+10.90, +28.17%) | 48.30 (+11.00, +29.49%) |
| microsoft | 46.80 (+8.10, +20.93%) | 46.20 (+8.90, +23.86%) |
![Results](img/ca-en-bleu.png)
---
## en-uk
| Translator/Dataset | flores-test | wmt22 | flores-dev |
| Translator/Dataset | flores-dev | flores-test | wmt22 |
| --- | --- | --- | --- |
| bergamot | 28.20 | 22.80 | 27.90 |
| google | 33.10 (+4.90, +17.38%) | 32.00 (+9.20, +40.35%) | 32.80 (+4.90, +17.56%) |
| microsoft | 33.50 (+5.30, +18.79%) | 30.40 (+7.60, +33.33%) | 32.20 (+4.30, +15.41%) |
| bergamot | 27.90 | 28.20 | 22.80 |
| google | 32.80 (+4.90, +17.56%) | 33.10 (+4.90, +17.38%) | 32.00 (+9.20, +40.35%) |
| microsoft | 32.20 (+4.30, +15.41%) | 33.50 (+5.30, +18.79%) | 30.40 (+7.60, +33.33%) |
![Results](img/en-uk-bleu.png)
---
## is-en
| Translator/Dataset | flores-dev | flores-test | wmt21 |
| --- | --- | --- | --- |
| bergamot | 23.60 | 23.40 | 23.20 |
| google | 39.40 (+15.80, +66.95%) | 38.60 (+15.20, +64.96%) | 38.70 (+15.50, +66.81%) |
| microsoft | 37.30 (+13.70, +58.05%) | 36.70 (+13.30, +56.84%) | 40.50 (+17.30, +74.57%) |
![Results](img/is-en-bleu.png)
---

View file

@ -0,0 +1 @@
38.7

View file

@ -0,0 +1 @@
0.6699

View file

@ -0,0 +1,61 @@
==========================
x_name: flores-dev.bergamot.en
y_name: flores-dev.microsoft.en
Bootstrap Resampling Results:
x-mean: 0.6700
y-mean: 0.7980
ties (%): 0.0000
x_wins (%): 0.0000
y_wins (%): 1.0000
Paired T-Test Results:
statistic: -18.8769
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
flores-dev.microsoft.en outperforms flores-dev.bergamot.en.
==========================
x_name: flores-dev.bergamot.en
y_name: flores-dev.google.en
Bootstrap Resampling Results:
x-mean: 0.6700
y-mean: 0.8228
ties (%): 0.0000
x_wins (%): 0.0000
y_wins (%): 1.0000
Paired T-Test Results:
statistic: -21.5915
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
flores-dev.google.en outperforms flores-dev.bergamot.en.
==========================
x_name: flores-dev.microsoft.en
y_name: flores-dev.google.en
Bootstrap Resampling Results:
x-mean: 0.7980
y-mean: 0.8228
ties (%): 0.0000
x_wins (%): 0.0000
y_wins (%): 1.0000
Paired T-Test Results:
statistic: -6.7390
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
flores-dev.google.en outperforms flores-dev.microsoft.en.
Summary
If system_x is better than system_y then:
Null hypothesis rejected according to t-test with p_value=0.05.
Scores differ significantly across samples.
system_x \ system_y      flores-dev.bergamot.en    flores-dev.microsoft.en    flores-dev.google.en
-----------------------  ------------------------  -------------------------  ----------------------
flores-dev.bergamot.en                             False                      False
flores-dev.microsoft.en  True                                                 False
flores-dev.google.en     True                      True

View file

@ -0,0 +1 @@
49.6

View file

@ -0,0 +1 @@
0.8218

View file

@ -0,0 +1 @@
46.8

View file

@ -0,0 +1 @@
0.7979

View file

@ -0,0 +1 @@
37.3

View file

@ -0,0 +1 @@
0.6381

View file

@ -0,0 +1,61 @@
==========================
x_name: flores-test.bergamot.en
y_name: flores-test.microsoft.en
Bootstrap Resampling Results:
x-mean: 0.6383
y-mean: 0.7878
ties (%): 0.0000
x_wins (%): 0.0000
y_wins (%): 1.0000
Paired T-Test Results:
statistic: -17.4826
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
flores-test.microsoft.en outperforms flores-test.bergamot.en.
==========================
x_name: flores-test.bergamot.en
y_name: flores-test.google.en
Bootstrap Resampling Results:
x-mean: 0.6383
y-mean: 0.8105
ties (%): 0.0000
x_wins (%): 0.0000
y_wins (%): 1.0000
Paired T-Test Results:
statistic: -18.9692
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
flores-test.google.en outperforms flores-test.bergamot.en.
==========================
x_name: flores-test.microsoft.en
y_name: flores-test.google.en
Bootstrap Resampling Results:
x-mean: 0.7878
y-mean: 0.8105
ties (%): 0.0000
x_wins (%): 0.0000
y_wins (%): 1.0000
Paired T-Test Results:
statistic: -6.1132
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
flores-test.google.en outperforms flores-test.microsoft.en.
Summary
If system_x is better than system_y then:
Null hypothesis rejected according to t-test with p_value=0.05.
Scores differ significantly across samples.
system_x \ system_y       flores-test.bergamot.en    flores-test.microsoft.en    flores-test.google.en
------------------------  -------------------------  --------------------------  -----------------------
flores-test.bergamot.en                              False                       False
flores-test.microsoft.en  True                                                   False
flores-test.google.en     True                       True

View file

@ -0,0 +1 @@
48.3

View file

@ -0,0 +1 @@
0.8103

View file

@ -0,0 +1 @@
46.2

View file

@ -0,0 +1 @@
0.7877

View file

@ -6,6 +6,18 @@ Three models with different human judgments have been trained to showcase the fr
The models developed by COMET have achieved new state-of-the-art performance on the WMT 2019 Metrics shared task, demonstrating robustness to high-performing systems.
## Interpreting Scores:
When using COMET to evaluate machine translation, it's important to understand how to interpret the scores it produces.
In general, COMET models are trained to predict quality scores for translations. These scores are typically normalized using a z-score transformation to account for individual differences among annotators. While the raw score itself does not have a direct interpretation, it is useful for ranking translations and systems according to their quality.
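
To illustrate the z-score transformation mentioned above, here is a minimal sketch using only the Python standard library. The annotator names and raw scores are invented for illustration, and the helper `z_normalize` is hypothetical; this is not COMET's training code.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Standardize one annotator's scores to zero mean and unit variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # An annotator who gave identical scores carries no ranking signal.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


# Hypothetical raw scores: annotator_a is lenient, annotator_b is strict,
# but both rank the same three translations in the same order.
raw = {
    "annotator_a": [60.0, 70.0, 80.0],
    "annotator_b": [20.0, 30.0, 40.0],
}

normalized = {name: z_normalize(scores) for name, scores in raw.items()}
for name, scores in normalized.items():
    print(name, [round(s, 3) for s in scores])
```

After normalization the two annotators produce the same values, which is why the resulting scores are suited to ranking but have no direct interpretation on their own.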
However, for the latest COMET models like Unbabel/wmt22-comet-da, we have introduced a new training approach that scales the scores between 0 and 1. This makes it easier to interpret the scores: a score close to 1 indicates a high-quality translation, while a score close to 0 indicates a translation that is no better than random chance.
It's worth noting that when using COMET to compare the performance of two different translation systems, it's important to run the comet-compare command to obtain statistical significance measures. This command compares the output of two systems using a statistical hypothesis test, providing an estimate of the probability that the observed difference in scores between the systems is due to chance. This is an important step to ensure that any differences in scores between systems are statistically significant.
Overall, the added interpretability of scores in the latest COMET models, combined with the ability to assess statistical significance between systems using comet-compare, makes COMET a valuable tool for evaluating machine translation.
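
The kind of significance check described above can be sketched with the standard library alone. This is a simplified illustration of paired bootstrap resampling and a paired t-statistic over segment-level scores, not the `comet-compare` implementation; the function names, the sample count, and the toy scores are assumptions made for this example. The p-value lookup is left out; a library such as SciPy (`scipy.stats.ttest_rel`) would normally provide it.

```python
import math
import random
from statistics import mean, stdev


def paired_bootstrap(x_scores, y_scores, num_samples=300, seed=0):
    """Resample segment indices with replacement and count which system has the higher mean."""
    assert len(x_scores) == len(y_scores)
    rng = random.Random(seed)
    n = len(x_scores)
    x_wins = y_wins = ties = 0
    for _ in range(num_samples):
        # The same resampled indices are used for both systems (paired design).
        idx = [rng.randrange(n) for _ in range(n)]
        x_mean = mean(x_scores[i] for i in idx)
        y_mean = mean(y_scores[i] for i in idx)
        if x_mean > y_mean:
            x_wins += 1
        elif y_mean > x_mean:
            y_wins += 1
        else:
            ties += 1
    return {
        "x_wins": x_wins / num_samples,
        "y_wins": y_wins / num_samples,
        "ties": ties / num_samples,
    }


def paired_t_statistic(x_scores, y_scores):
    """t-statistic of the per-segment differences x - y."""
    diffs = [x - y for x, y in zip(x_scores, y_scores)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Toy segment-level scores (invented): system y is better on every segment.
x = [0.62, 0.70, 0.66, 0.71, 0.64, 0.69]
y = [0.80, 0.83, 0.78, 0.85, 0.79, 0.82]

result = paired_bootstrap(x, y)
t_stat = paired_t_statistic(x, y)
print(result, round(t_stat, 4))
```

A negative statistic together with a `y_wins` share near 1 mirrors the pattern in the `.cometcompare` reports in this commit, where the second system outperforms the first.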
Source: https://aclanthology.org/2020.emnlp-main.213.pdf
Tool: https://github.com/Unbabel/COMET
@ -36,61 +48,130 @@ We also compare the systems using the `comet-compare` tool that calculates the s
## avg
| Translator/Dataset | en-ru | ru-en | en-nl | fa-en | uk-en | en-fa | is-en | nl-en | en-uk |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 0.54 | 0.49 | 0.58 | 0.50 | 0.52 | 0.31 | 0.15 | 0.63 | 0.51 |
| google | 0.76 (+0.21, +39.38%) | 0.59 (+0.10, +20.83%) | 0.67 (+0.08, +14.30%) | 0.74 (+0.24, +48.00%) | 0.67 (+0.15, +28.26%) | 0.70 (+0.39, +126.54%) | 0.70 (+0.55, +370.91%) | 0.70 (+0.07, +10.71%) | 0.79 (+0.27, +53.31%) |
| microsoft | 0.72 (+0.18, +32.36%) | 0.60 (+0.11, +22.13%) | 0.65 (+0.06, +11.05%) | 0.66 (+0.16, +32.78%) | 0.64 (+0.12, +23.16%) | 0.41 (+0.10, +31.65%) | 0.67 (+0.52, +353.71%) | 0.69 (+0.06, +9.12%) | 0.75 (+0.23, +45.60%) |
| Translator/Dataset | ru-en | en-nl | en-ru | en-fa | nl-en | uk-en | fa-en | ca-en | en-uk | is-en |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 0.49 | 0.58 | 0.54 | 0.31 | 0.63 | 0.52 | 0.50 | 0.65 | 0.51 | 0.15 |
| google | 0.59 (+0.10, +20.83%) | 0.67 (+0.08, +14.30%) | 0.76 (+0.21, +39.38%) | 0.70 (+0.39, +126.54%) | 0.70 (+0.07, +10.71%) | 0.67 (+0.15, +28.26%) | 0.74 (+0.24, +48.00%) | 0.82 (+0.16, +24.78%) | 0.79 (+0.27, +53.31%) | 0.70 (+0.55, +370.91%) |
| microsoft | 0.60 (+0.11, +22.13%) | 0.65 (+0.06, +11.05%) | 0.72 (+0.18, +32.36%) | 0.41 (+0.10, +31.65%) | 0.69 (+0.06, +9.12%) | 0.64 (+0.12, +23.16%) | 0.66 (+0.16, +32.78%) | 0.79 (+0.14, +21.22%) | 0.75 (+0.23, +45.60%) | 0.67 (+0.52, +353.71%) |
![Results](img/avg-comet.png)
---
## ru-en
| Translator/Dataset | wmt17 | wmt22 | flores-test | wmt20 | mtedx_test | wmt15 | wmt18 | wmt14 | wmt16 | wmt19 | wmt21 | flores-dev | wmt13 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 0.53 | 0.47 | 0.58 | 0.53 | 0.19 | 0.50 | 0.47 | 0.56 | 0.49 | 0.48 | 0.52 | 0.58 | 0.44 |
| google | 0.64 (+0.11, +20.17%) | 0.61 (+0.14, +29.75%) | 0.67 (+0.09, +16.41%) | 0.61 (+0.08, +14.10%) | 0.30 (+0.10, +51.95%) | 0.61 (+0.11, +21.38%) | 0.60 (+0.12, +26.13%) | 0.67 (+0.11, +19.24%) | 0.59 (+0.11, +22.02%) | 0.56 (+0.08, +16.67%) | 0.61 (+0.09, +18.09%) | 0.67 (+0.10, +17.05%) | 0.53 (+0.09, +19.35%) |
| microsoft | 0.65 (+0.11, +21.50%) | 0.62 (+0.15, +31.74%) | 0.66 (+0.08, +14.34%) | 0.62 (+0.09, +16.50%) | 0.30 (+0.10, +53.96%) | 0.62 (+0.11, +22.51%) | 0.60 (+0.13, +27.21%) | 0.68 (+0.11, +19.91%) | 0.60 (+0.11, +22.55%) | 0.59 (+0.11, +22.70%) | 0.62 (+0.10, +20.00%) | 0.67 (+0.09, +15.54%) | 0.54 (+0.10, +22.66%) |
![Results](img/ru-en-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [wmt17.ru-en](ru-en/wmt17.ru-en.cometcompare)
- wmt17.microsoft.en outperforms wmt17.bergamot.en.
- wmt17.google.en outperforms wmt17.bergamot.en.
- wmt17.microsoft.en outperforms wmt17.google.en.
#### [wmt22.ru-en](ru-en/wmt22.ru-en.cometcompare)
- wmt22.microsoft.en outperforms wmt22.bergamot.en.
- wmt22.google.en outperforms wmt22.bergamot.en.
#### [flores-test.ru-en](ru-en/flores-test.ru-en.cometcompare)
- flores-test.microsoft.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.microsoft.en.
#### [wmt20.ru-en](ru-en/wmt20.ru-en.cometcompare)
- wmt20.microsoft.en outperforms wmt20.bergamot.en.
- wmt20.google.en outperforms wmt20.bergamot.en.
- wmt20.microsoft.en outperforms wmt20.google.en.
#### [mtedx_test.ru-en](ru-en/mtedx_test.ru-en.cometcompare)
- mtedx_test.microsoft.en outperforms mtedx_test.bergamot.en.
- mtedx_test.google.en outperforms mtedx_test.bergamot.en.
#### [wmt15.ru-en](ru-en/wmt15.ru-en.cometcompare)
- wmt15.microsoft.en outperforms wmt15.bergamot.en.
- wmt15.google.en outperforms wmt15.bergamot.en.
#### [wmt18.ru-en](ru-en/wmt18.ru-en.cometcompare)
- wmt18.microsoft.en outperforms wmt18.bergamot.en.
- wmt18.google.en outperforms wmt18.bergamot.en.
#### [wmt14.ru-en](ru-en/wmt14.ru-en.cometcompare)
- wmt14.microsoft.en outperforms wmt14.bergamot.en.
- wmt14.google.en outperforms wmt14.bergamot.en.
#### [wmt16.ru-en](ru-en/wmt16.ru-en.cometcompare)
- wmt16.microsoft.en outperforms wmt16.bergamot.en.
- wmt16.google.en outperforms wmt16.bergamot.en.
#### [wmt19.ru-en](ru-en/wmt19.ru-en.cometcompare)
- wmt19.microsoft.en outperforms wmt19.bergamot.en.
- wmt19.google.en outperforms wmt19.bergamot.en.
- wmt19.microsoft.en outperforms wmt19.google.en.
#### [wmt21.ru-en](ru-en/wmt21.ru-en.cometcompare)
- wmt21.microsoft.en outperforms wmt21.bergamot.en.
- wmt21.google.en outperforms wmt21.bergamot.en.
#### [flores-dev.ru-en](ru-en/flores-dev.ru-en.cometcompare)
- flores-dev.microsoft.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.microsoft.en.
#### [wmt13.ru-en](ru-en/wmt13.ru-en.cometcompare)
- wmt13.microsoft.en outperforms wmt13.bergamot.en.
- wmt13.google.en outperforms wmt13.bergamot.en.
- wmt13.microsoft.en outperforms wmt13.google.en.
---
## en-nl
| Translator/Dataset | flores-dev | flores-test |
| --- | --- | --- |
| bergamot | 0.59 | 0.58 |
| google | 0.67 (+0.08, +13.04%) | 0.67 (+0.09, +15.59%) |
| microsoft | 0.64 (+0.05, +8.90%) | 0.65 (+0.08, +13.25%) |
![Results](img/en-nl-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [flores-dev.en-nl](en-nl/flores-dev.en-nl.cometcompare)
- flores-dev.microsoft.nl outperforms flores-dev.bergamot.nl.
- flores-dev.google.nl outperforms flores-dev.bergamot.nl.
- flores-dev.google.nl outperforms flores-dev.microsoft.nl.
#### [flores-test.en-nl](en-nl/flores-test.en-nl.cometcompare)
- flores-test.microsoft.nl outperforms flores-test.bergamot.nl.
- flores-test.google.nl outperforms flores-test.bergamot.nl.
- flores-test.google.nl outperforms flores-test.microsoft.nl.
---
## en-ru
| Translator/Dataset | wmt18 | wmt21 | wmt20 | wmt16 | flores-test | wmt22 | wmt14 | wmt15 | wmt13 | flores-dev | wmt19 | wmt17 |
| Translator/Dataset | wmt19 | wmt21 | wmt15 | wmt13 | wmt20 | wmt16 | wmt14 | flores-dev | flores-test | wmt18 | wmt22 | wmt17 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 0.59 | 0.40 | 0.41 | 0.59 | 0.57 | 0.43 | 0.70 | 0.64 | 0.52 | 0.57 | 0.47 | 0.64 |
| google | 0.81 (+0.22, +37.57%) | 0.64 (+0.23, +57.93%) | 0.64 (+0.23, +57.66%) | 0.78 (+0.18, +31.02%) | 0.77 (+0.20, +34.81%) | 0.73 (+0.30, +70.07%) | 0.88 (+0.18, +26.42%) | 0.84 (+0.20, +31.50%) | 0.69 (+0.17, +32.58%) | 0.76 (+0.20, +34.95%) | 0.72 (+0.25, +53.03%) | 0.84 (+0.20, +30.61%) |
| microsoft | 0.77 (+0.18, +30.02%) | 0.59 (+0.19, +46.32%) | 0.59 (+0.18, +44.41%) | 0.74 (+0.15, +25.24%) | 0.73 (+0.16, +27.74%) | 0.67 (+0.25, +57.60%) | 0.86 (+0.16, +22.72%) | 0.81 (+0.17, +26.31%) | 0.67 (+0.14, +27.63%) | 0.72 (+0.15, +26.80%) | 0.70 (+0.22, +47.24%) | 0.81 (+0.17, +26.32%) |
| bergamot | 0.47 | 0.40 | 0.64 | 0.52 | 0.41 | 0.59 | 0.70 | 0.57 | 0.57 | 0.59 | 0.43 | 0.64 |
| google | 0.72 (+0.25, +53.03%) | 0.64 (+0.23, +57.93%) | 0.84 (+0.20, +31.50%) | 0.69 (+0.17, +32.58%) | 0.64 (+0.23, +57.66%) | 0.78 (+0.18, +31.02%) | 0.88 (+0.18, +26.42%) | 0.76 (+0.20, +34.95%) | 0.77 (+0.20, +34.81%) | 0.81 (+0.22, +37.57%) | 0.73 (+0.30, +70.07%) | 0.84 (+0.20, +30.61%) |
| microsoft | 0.70 (+0.22, +47.24%) | 0.59 (+0.19, +46.32%) | 0.81 (+0.17, +26.31%) | 0.67 (+0.14, +27.63%) | 0.59 (+0.18, +44.41%) | 0.74 (+0.15, +25.24%) | 0.86 (+0.16, +22.72%) | 0.72 (+0.15, +26.80%) | 0.73 (+0.16, +27.74%) | 0.77 (+0.18, +30.02%) | 0.67 (+0.25, +57.60%) | 0.81 (+0.17, +26.32%) |
![Results](img/en-ru-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [wmt18.en-ru](en-ru/wmt18.en-ru.cometcompare)
- wmt18.microsoft.ru outperforms wmt18.bergamot.ru.
- wmt18.google.ru outperforms wmt18.bergamot.ru.
- wmt18.google.ru outperforms wmt18.microsoft.ru.
#### [wmt19.en-ru](en-ru/wmt19.en-ru.cometcompare)
- wmt19.microsoft.ru outperforms wmt19.bergamot.ru.
- wmt19.google.ru outperforms wmt19.bergamot.ru.
- wmt19.google.ru outperforms wmt19.microsoft.ru.
#### [wmt21.en-ru](en-ru/wmt21.en-ru.cometcompare)
- wmt21.microsoft.ru outperforms wmt21.bergamot.ru.
- wmt21.google.ru outperforms wmt21.bergamot.ru.
- wmt21.google.ru outperforms wmt21.microsoft.ru.
#### [wmt20.en-ru](en-ru/wmt20.en-ru.cometcompare)
- wmt20.microsoft.ru outperforms wmt20.bergamot.ru.
- wmt20.google.ru outperforms wmt20.bergamot.ru.
- wmt20.google.ru outperforms wmt20.microsoft.ru.
#### [wmt16.en-ru](en-ru/wmt16.en-ru.cometcompare)
- wmt16.microsoft.ru outperforms wmt16.bergamot.ru.
- wmt16.google.ru outperforms wmt16.bergamot.ru.
- wmt16.google.ru outperforms wmt16.microsoft.ru.
#### [flores-test.en-ru](en-ru/flores-test.en-ru.cometcompare)
- flores-test.microsoft.ru outperforms flores-test.bergamot.ru.
- flores-test.google.ru outperforms flores-test.bergamot.ru.
- flores-test.google.ru outperforms flores-test.microsoft.ru.
#### [wmt22.en-ru](en-ru/wmt22.en-ru.cometcompare)
- wmt22.microsoft.ru outperforms wmt22.bergamot.ru.
- wmt22.google.ru outperforms wmt22.bergamot.ru.
- wmt22.google.ru outperforms wmt22.microsoft.ru.
#### [wmt14.en-ru](en-ru/wmt14.en-ru.cometcompare)
- wmt14.microsoft.ru outperforms wmt14.bergamot.ru.
- wmt14.google.ru outperforms wmt14.bergamot.ru.
- wmt14.google.ru outperforms wmt14.microsoft.ru.
#### [wmt15.en-ru](en-ru/wmt15.en-ru.cometcompare)
- wmt15.microsoft.ru outperforms wmt15.bergamot.ru.
- wmt15.google.ru outperforms wmt15.bergamot.ru.
@ -101,15 +182,40 @@ We also compare the systems using the `comet-compare` tool that calculates the s
- wmt13.google.ru outperforms wmt13.bergamot.ru.
- wmt13.google.ru outperforms wmt13.microsoft.ru.
#### [wmt20.en-ru](en-ru/wmt20.en-ru.cometcompare)
- wmt20.microsoft.ru outperforms wmt20.bergamot.ru.
- wmt20.google.ru outperforms wmt20.bergamot.ru.
- wmt20.google.ru outperforms wmt20.microsoft.ru.
#### [wmt16.en-ru](en-ru/wmt16.en-ru.cometcompare)
- wmt16.microsoft.ru outperforms wmt16.bergamot.ru.
- wmt16.google.ru outperforms wmt16.bergamot.ru.
- wmt16.google.ru outperforms wmt16.microsoft.ru.
#### [wmt14.en-ru](en-ru/wmt14.en-ru.cometcompare)
- wmt14.microsoft.ru outperforms wmt14.bergamot.ru.
- wmt14.google.ru outperforms wmt14.bergamot.ru.
- wmt14.google.ru outperforms wmt14.microsoft.ru.
#### [flores-dev.en-ru](en-ru/flores-dev.en-ru.cometcompare)
- flores-dev.microsoft.ru outperforms flores-dev.bergamot.ru.
- flores-dev.google.ru outperforms flores-dev.bergamot.ru.
- flores-dev.google.ru outperforms flores-dev.microsoft.ru.
#### [wmt19.en-ru](en-ru/wmt19.en-ru.cometcompare)
- wmt19.microsoft.ru outperforms wmt19.bergamot.ru.
- wmt19.google.ru outperforms wmt19.bergamot.ru.
- wmt19.google.ru outperforms wmt19.microsoft.ru.
#### [flores-test.en-ru](en-ru/flores-test.en-ru.cometcompare)
- flores-test.microsoft.ru outperforms flores-test.bergamot.ru.
- flores-test.google.ru outperforms flores-test.bergamot.ru.
- flores-test.google.ru outperforms flores-test.microsoft.ru.
#### [wmt18.en-ru](en-ru/wmt18.en-ru.cometcompare)
- wmt18.microsoft.ru outperforms wmt18.bergamot.ru.
- wmt18.google.ru outperforms wmt18.bergamot.ru.
- wmt18.google.ru outperforms wmt18.microsoft.ru.
#### [wmt22.en-ru](en-ru/wmt22.en-ru.cometcompare)
- wmt22.microsoft.ru outperforms wmt22.bergamot.ru.
- wmt22.google.ru outperforms wmt22.bergamot.ru.
- wmt22.google.ru outperforms wmt22.microsoft.ru.
#### [wmt17.en-ru](en-ru/wmt17.en-ru.cometcompare)
- wmt17.microsoft.ru outperforms wmt17.bergamot.ru.
@ -118,139 +224,63 @@ We also compare the systems using the `comet-compare` tool that calculates the s
---
## ru-en
| Translator/Dataset | wmt16 | wmt13 | wmt17 | wmt21 | wmt14 | wmt19 | flores-dev | wmt22 | wmt20 | flores-test | wmt15 | mtedx_test | wmt18 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 0.49 | 0.44 | 0.53 | 0.52 | 0.56 | 0.48 | 0.58 | 0.47 | 0.53 | 0.58 | 0.50 | 0.19 | 0.47 |
| google | 0.59 (+0.11, +22.02%) | 0.53 (+0.09, +19.35%) | 0.64 (+0.11, +20.17%) | 0.61 (+0.09, +18.09%) | 0.67 (+0.11, +19.24%) | 0.56 (+0.08, +16.67%) | 0.67 (+0.10, +17.05%) | 0.61 (+0.14, +29.75%) | 0.61 (+0.08, +14.10%) | 0.67 (+0.09, +16.41%) | 0.61 (+0.11, +21.38%) | 0.30 (+0.10, +51.95%) | 0.60 (+0.12, +26.13%) |
| microsoft | 0.60 (+0.11, +22.55%) | 0.54 (+0.10, +22.66%) | 0.65 (+0.11, +21.50%) | 0.62 (+0.10, +20.00%) | 0.68 (+0.11, +19.91%) | 0.59 (+0.11, +22.70%) | 0.67 (+0.09, +15.54%) | 0.62 (+0.15, +31.74%) | 0.62 (+0.09, +16.50%) | 0.66 (+0.08, +14.34%) | 0.62 (+0.11, +22.51%) | 0.30 (+0.10, +53.96%) | 0.60 (+0.13, +27.21%) |
![Results](img/ru-en-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [wmt16.ru-en](ru-en/wmt16.ru-en.cometcompare)
- wmt16.microsoft.en outperforms wmt16.bergamot.en.
- wmt16.google.en outperforms wmt16.bergamot.en.
#### [wmt13.ru-en](ru-en/wmt13.ru-en.cometcompare)
- wmt13.microsoft.en outperforms wmt13.bergamot.en.
- wmt13.google.en outperforms wmt13.bergamot.en.
- wmt13.microsoft.en outperforms wmt13.google.en.
#### [wmt17.ru-en](ru-en/wmt17.ru-en.cometcompare)
- wmt17.microsoft.en outperforms wmt17.bergamot.en.
- wmt17.google.en outperforms wmt17.bergamot.en.
- wmt17.microsoft.en outperforms wmt17.google.en.
#### [wmt21.ru-en](ru-en/wmt21.ru-en.cometcompare)
- wmt21.microsoft.en outperforms wmt21.bergamot.en.
- wmt21.google.en outperforms wmt21.bergamot.en.
#### [wmt14.ru-en](ru-en/wmt14.ru-en.cometcompare)
- wmt14.microsoft.en outperforms wmt14.bergamot.en.
- wmt14.google.en outperforms wmt14.bergamot.en.
#### [wmt19.ru-en](ru-en/wmt19.ru-en.cometcompare)
- wmt19.microsoft.en outperforms wmt19.bergamot.en.
- wmt19.google.en outperforms wmt19.bergamot.en.
- wmt19.microsoft.en outperforms wmt19.google.en.
#### [flores-dev.ru-en](ru-en/flores-dev.ru-en.cometcompare)
- flores-dev.microsoft.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.microsoft.en.
#### [wmt22.ru-en](ru-en/wmt22.ru-en.cometcompare)
- wmt22.microsoft.en outperforms wmt22.bergamot.en.
- wmt22.google.en outperforms wmt22.bergamot.en.
#### [wmt20.ru-en](ru-en/wmt20.ru-en.cometcompare)
- wmt20.microsoft.en outperforms wmt20.bergamot.en.
- wmt20.google.en outperforms wmt20.bergamot.en.
- wmt20.microsoft.en outperforms wmt20.google.en.
#### [flores-test.ru-en](ru-en/flores-test.ru-en.cometcompare)
- flores-test.microsoft.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.microsoft.en.
#### [wmt15.ru-en](ru-en/wmt15.ru-en.cometcompare)
- wmt15.microsoft.en outperforms wmt15.bergamot.en.
- wmt15.google.en outperforms wmt15.bergamot.en.
#### [mtedx_test.ru-en](ru-en/mtedx_test.ru-en.cometcompare)
- mtedx_test.microsoft.en outperforms mtedx_test.bergamot.en.
- mtedx_test.google.en outperforms mtedx_test.bergamot.en.
#### [wmt18.ru-en](ru-en/wmt18.ru-en.cometcompare)
- wmt18.microsoft.en outperforms wmt18.bergamot.en.
- wmt18.google.en outperforms wmt18.bergamot.en.
---
## en-nl
## en-fa
| Translator/Dataset | flores-test | flores-dev |
| --- | --- | --- |
| bergamot | 0.58 | 0.59 |
| google | 0.67 (+0.09, +15.59%) | 0.67 (+0.08, +13.04%) |
| microsoft | 0.65 (+0.08, +13.25%) | 0.64 (+0.05, +8.90%) |
| bergamot | 0.32 | 0.30 |
| google | 0.71 (+0.39, +119.58%) | 0.70 (+0.40, +134.13%) |
| microsoft | 0.42 (+0.10, +29.43%) | 0.40 (+0.10, +34.06%) |
![Results](img/en-nl-comet.png)
![Results](img/en-fa-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [flores-test.en-nl](en-nl/flores-test.en-nl.cometcompare)
- flores-test.microsoft.nl outperforms flores-test.bergamot.nl.
- flores-test.google.nl outperforms flores-test.bergamot.nl.
- flores-test.google.nl outperforms flores-test.microsoft.nl.
#### [flores-test.en-fa](en-fa/flores-test.en-fa.cometcompare)
- flores-test.microsoft.fa outperforms flores-test.bergamot.fa.
- flores-test.google.fa outperforms flores-test.bergamot.fa.
- flores-test.google.fa outperforms flores-test.microsoft.fa.
#### [flores-dev.en-nl](en-nl/flores-dev.en-nl.cometcompare)
- flores-dev.microsoft.nl outperforms flores-dev.bergamot.nl.
- flores-dev.google.nl outperforms flores-dev.bergamot.nl.
- flores-dev.google.nl outperforms flores-dev.microsoft.nl.
#### [flores-dev.en-fa](en-fa/flores-dev.en-fa.cometcompare)
- flores-dev.microsoft.fa outperforms flores-dev.bergamot.fa.
- flores-dev.google.fa outperforms flores-dev.bergamot.fa.
- flores-dev.google.fa outperforms flores-dev.microsoft.fa.
---
## fa-en
## nl-en
| Translator/Dataset | flores-dev | flores-test |
| Translator/Dataset | flores-test | flores-dev |
| --- | --- | --- |
| bergamot | 0.49 | 0.51 |
| google | 0.74 (+0.25, +50.08%) | 0.74 (+0.23, +45.97%) |
| microsoft | 0.66 (+0.16, +33.31%) | 0.67 (+0.16, +32.25%) |
| bergamot | 0.64 | 0.63 |
| google | 0.70 (+0.07, +10.23%) | 0.70 (+0.07, +11.21%) |
| microsoft | 0.69 (+0.06, +8.75%) | 0.69 (+0.06, +9.49%) |
![Results](img/fa-en-comet.png)
![Results](img/nl-en-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [flores-dev.fa-en](fa-en/flores-dev.fa-en.cometcompare)
- flores-dev.microsoft.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.microsoft.en.
#### [flores-test.fa-en](fa-en/flores-test.fa-en.cometcompare)
#### [flores-test.nl-en](nl-en/flores-test.nl-en.cometcompare)
- flores-test.microsoft.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.microsoft.en.
#### [flores-dev.nl-en](nl-en/flores-dev.nl-en.cometcompare)
- flores-dev.microsoft.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.microsoft.en.
---
## uk-en
| Translator/Dataset | flores-dev | wmt22 | flores-test |
| Translator/Dataset | wmt22 | flores-test | flores-dev |
| --- | --- | --- | --- |
| bergamot | 0.59 | 0.38 | 0.60 |
| google | 0.70 (+0.10, +17.39%) | 0.61 (+0.23, +60.58%) | 0.71 (+0.11, +18.39%) |
| microsoft | 0.68 (+0.09, +15.01%) | 0.56 (+0.18, +47.84%) | 0.69 (+0.09, +15.47%) |
| bergamot | 0.38 | 0.60 | 0.59 |
| google | 0.61 (+0.23, +60.58%) | 0.71 (+0.11, +18.39%) | 0.70 (+0.10, +17.39%) |
| microsoft | 0.56 (+0.18, +47.84%) | 0.69 (+0.09, +15.47%) | 0.68 (+0.09, +15.01%) |
![Results](img/uk-en-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [flores-dev.uk-en](uk-en/flores-dev.uk-en.cometcompare)
- flores-dev.microsoft.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.microsoft.en.
#### [wmt22.uk-en](uk-en/wmt22.uk-en.cometcompare)
- wmt22.microsoft.en outperforms wmt22.bergamot.en.
- wmt22.google.en outperforms wmt22.bergamot.en.
@@ -261,98 +291,70 @@ We also compare the systems using the `comet-compare` tool that calculates the s
- flores-test.google.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.microsoft.en.
---
## en-fa
| Translator/Dataset | flores-dev | flores-test |
| --- | --- | --- |
| bergamot | 0.30 | 0.32 |
| google | 0.70 (+0.40, +134.13%) | 0.71 (+0.39, +119.58%) |
| microsoft | 0.40 (+0.10, +34.06%) | 0.42 (+0.10, +29.43%) |
![Results](img/en-fa-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [flores-dev.en-fa](en-fa/flores-dev.en-fa.cometcompare)
- flores-dev.microsoft.fa outperforms flores-dev.bergamot.fa.
- flores-dev.google.fa outperforms flores-dev.bergamot.fa.
- flores-dev.google.fa outperforms flores-dev.microsoft.fa.
#### [flores-test.en-fa](en-fa/flores-test.en-fa.cometcompare)
- flores-test.microsoft.fa outperforms flores-test.bergamot.fa.
- flores-test.google.fa outperforms flores-test.bergamot.fa.
- flores-test.google.fa outperforms flores-test.microsoft.fa.
---
## is-en
| Translator/Dataset | wmt21 | flores-dev | flores-test |
| --- | --- | --- | --- |
| bergamot | 0.02 | 0.21 | 0.22 |
| google | 0.67 (+0.66, +4185.35%) | 0.71 (+0.50, +236.11%) | 0.71 (+0.49, +226.94%) |
| microsoft | 0.66 (+0.64, +4101.27%) | 0.68 (+0.47, +219.75%) | 0.68 (+0.46, +213.75%) |
![Results](img/is-en-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [wmt21.is-en](is-en/wmt21.is-en.cometcompare)
- wmt21.microsoft.en outperforms wmt21.bergamot.en.
- wmt21.google.en outperforms wmt21.bergamot.en.
- wmt21.google.en outperforms wmt21.microsoft.en.
#### [flores-dev.is-en](is-en/flores-dev.is-en.cometcompare)
#### [flores-dev.uk-en](uk-en/flores-dev.uk-en.cometcompare)
- flores-dev.microsoft.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.microsoft.en.
#### [flores-test.is-en](is-en/flores-test.is-en.cometcompare)
---
## fa-en
| Translator/Dataset | flores-test | flores-dev |
| --- | --- | --- |
| bergamot | 0.51 | 0.49 |
| google | 0.74 (+0.23, +45.97%) | 0.74 (+0.25, +50.08%) |
| microsoft | 0.67 (+0.16, +32.25%) | 0.66 (+0.16, +33.31%) |
![Results](img/fa-en-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [flores-test.fa-en](fa-en/flores-test.fa-en.cometcompare)
- flores-test.microsoft.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.microsoft.en.
---
## nl-en
| Translator/Dataset | flores-dev | flores-test |
| --- | --- | --- |
| bergamot | 0.63 | 0.64 |
| google | 0.70 (+0.07, +11.21%) | 0.70 (+0.07, +10.23%) |
| microsoft | 0.69 (+0.06, +9.49%) | 0.69 (+0.06, +8.75%) |
![Results](img/nl-en-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [flores-dev.nl-en](nl-en/flores-dev.nl-en.cometcompare)
#### [flores-dev.fa-en](fa-en/flores-dev.fa-en.cometcompare)
- flores-dev.microsoft.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.microsoft.en.
#### [flores-test.nl-en](nl-en/flores-test.nl-en.cometcompare)
---
## ca-en
| Translator/Dataset | flores-test | flores-dev |
| --- | --- | --- |
| bergamot | 0.64 | 0.67 |
| google | 0.81 (+0.17, +26.99%) | 0.82 (+0.15, +22.68%) |
| microsoft | 0.79 (+0.15, +23.44%) | 0.80 (+0.13, +19.11%) |
![Results](img/ca-en-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [flores-test.ca-en](ca-en/flores-test.ca-en.cometcompare)
- flores-test.microsoft.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.microsoft.en.
#### [flores-dev.ca-en](ca-en/flores-dev.ca-en.cometcompare)
- flores-dev.microsoft.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.microsoft.en.
---
## en-uk
| Translator/Dataset | wmt22 | flores-test | flores-dev |
| Translator/Dataset | flores-test | flores-dev | wmt22 |
| --- | --- | --- | --- |
| bergamot | 0.36 | 0.60 | 0.58 |
| google | 0.73 (+0.36, +99.31%) | 0.82 (+0.23, +38.00%) | 0.81 (+0.23, +40.14%) |
| microsoft | 0.67 (+0.31, +84.35%) | 0.79 (+0.20, +32.94%) | 0.78 (+0.20, +34.26%) |
| bergamot | 0.60 | 0.58 | 0.36 |
| google | 0.82 (+0.23, +38.00%) | 0.81 (+0.23, +40.14%) | 0.73 (+0.36, +99.31%) |
| microsoft | 0.79 (+0.20, +32.94%) | 0.78 (+0.20, +34.26%) | 0.67 (+0.31, +84.35%) |
![Results](img/en-uk-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [wmt22.en-uk](en-uk/wmt22.en-uk.cometcompare)
- wmt22.microsoft.uk outperforms wmt22.bergamot.uk.
- wmt22.google.uk outperforms wmt22.bergamot.uk.
- wmt22.google.uk outperforms wmt22.microsoft.uk.
#### [flores-test.en-uk](en-uk/flores-test.en-uk.cometcompare)
- flores-test.microsoft.uk outperforms flores-test.bergamot.uk.
- flores-test.google.uk outperforms flores-test.bergamot.uk.
@@ -363,4 +365,37 @@ We also compare the systems using the `comet-compare` tool that calculates the s
- flores-dev.google.uk outperforms flores-dev.bergamot.uk.
- flores-dev.google.uk outperforms flores-dev.microsoft.uk.
#### [wmt22.en-uk](en-uk/wmt22.en-uk.cometcompare)
- wmt22.microsoft.uk outperforms wmt22.bergamot.uk.
- wmt22.google.uk outperforms wmt22.bergamot.uk.
- wmt22.google.uk outperforms wmt22.microsoft.uk.
---
## is-en
| Translator/Dataset | flores-test | wmt21 | flores-dev |
| --- | --- | --- | --- |
| bergamot | 0.22 | 0.02 | 0.21 |
| google | 0.71 (+0.49, +226.94%) | 0.67 (+0.66, +4185.35%) | 0.71 (+0.50, +236.11%) |
| microsoft | 0.68 (+0.46, +213.75%) | 0.66 (+0.64, +4101.27%) | 0.68 (+0.47, +219.75%) |
![Results](img/is-en-comet.png)
### Comparisons between systems
*If a comparison is omitted, the systems have equal averages (tie). Click on the dataset for a complete report*
#### [flores-test.is-en](is-en/flores-test.is-en.cometcompare)
- flores-test.microsoft.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.bergamot.en.
- flores-test.google.en outperforms flores-test.microsoft.en.
#### [wmt21.is-en](is-en/wmt21.is-en.cometcompare)
- wmt21.microsoft.en outperforms wmt21.bergamot.en.
- wmt21.google.en outperforms wmt21.bergamot.en.
- wmt21.google.en outperforms wmt21.microsoft.en.
#### [flores-dev.is-en](is-en/flores-dev.is-en.cometcompare)
- flores-dev.microsoft.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.bergamot.en.
- flores-dev.google.en outperforms flores-dev.microsoft.en.
---

Binary files changed (contents not shown):

- evaluation/dev/img/avg-bleu.png: before 22 KiB, after 22 KiB
- evaluation/dev/img/avg-comet.png: before 24 KiB, after 25 KiB
- evaluation/dev/img/ca-en-bleu.png (Normal file): after 19 KiB
- evaluation/dev/img/ca-en-comet.png (Normal file): after 21 KiB
- evaluation/dev/img/en-fa-bleu.png: before 18 KiB, after 18 KiB
- evaluation/dev/img/en-fa-comet.png: before 21 KiB, after 21 KiB
- evaluation/dev/img/en-nl-bleu.png: before 19 KiB, after 19 KiB
- evaluation/dev/img/en-nl-comet.png: before 21 KiB, after 21 KiB
- evaluation/dev/img/en-ru-bleu.png: before 23 KiB, after 23 KiB
- evaluation/dev/img/en-ru-comet.png: before 24 KiB, after 24 KiB
- evaluation/dev/img/en-uk-bleu.png: before 21 KiB, after 21 KiB
- evaluation/dev/img/en-uk-comet.png: before 23 KiB, after 23 KiB
- evaluation/dev/img/fa-en-comet.png: before 21 KiB, after 21 KiB
- evaluation/dev/img/is-en-comet.png: before 22 KiB, after 22 KiB
- evaluation/dev/img/nl-en-comet.png: before 21 KiB, after 21 KiB
- evaluation/dev/img/ru-en-bleu.png: before 25 KiB, after 25 KiB
- evaluation/dev/img/ru-en-comet.png: before 27 KiB, after 27 KiB
- evaluation/dev/img/uk-en-comet.png: before 22 KiB, after 22 KiB


@@ -56,15 +56,59 @@ Both absolute and relative differences in BLEU scores between Bergamot and other
## avg
| Translator/Dataset | en-pt | pt-en | en-bg | nb-en | it-en | en-et | fr-en | en-de | es-en | en-it | en-pl | en-es | pl-en | en-fr | et-en | bg-en | de-en | en-cs | cs-en |
| Translator/Dataset | cs-en | en-et | en-it | fr-en | en-pt | et-en | nb-en | bg-en | en-es | en-bg | en-cs | de-en | it-en | pl-en | en-fr | en-pl | pt-en | es-en | en-de |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 49.85 | 44.87 | 42.10 | 37.60 | 32.67 | 25.50 | 35.43 | 32.01 | 32.38 | 29.77 | 22.27 | 32.41 | 27.87 | 36.01 | 32.37 | 38.50 | 33.16 | 24.76 | 31.07 |
| google | 53.75 (+3.90, +7.82%) | 46.60 (+1.73, +3.86%) | 44.60 (+2.50, +5.94%) | 42.05 (+4.45, +11.84%) | 34.50 (+1.83, +5.59%) | 28.60 (+3.10, +12.16%) | 37.81 (+2.38, +6.70%) | 33.16 (+1.14, +3.58%) | 33.64 (+1.27, +3.91%) | 28.97 (-0.80, -2.69%) | 25.50 (+3.23, +14.52%) | 34.74 (+2.32, +7.17%) | 31.23 (+3.37, +12.08%) | 29.47 (-6.54, -18.15%) | 35.80 (+3.43, +10.61%) | 41.30 (+2.80, +7.27%) | 35.65 (+2.49, +7.52%) | 27.72 (+2.96, +11.95%) | 33.36 (+2.29, +7.36%) |
| microsoft | 50.15 (+0.30, +0.60%) | 46.47 (+1.60, +3.57%) | 38.55 (-3.55, -8.43%) | 42.90 (+5.30, +14.10%) | 34.55 (+1.88, +5.74%) | 28.47 (+2.97, +11.63%) | 39.13 (+3.70, +10.44%) | 33.54 (+1.53, +4.79%) | 32.93 (+0.56, +1.72%) | 32.30 (+2.53, +8.51%) | 24.83 (+2.57, +11.53%) | 33.76 (+1.35, +4.17%) | 31.83 (+3.97, +14.23%) | 36.48 (+0.47, +1.31%) | 36.17 (+3.80, +11.74%) | 41.20 (+2.70, +7.01%) | 37.73 (+4.57, +13.79%) | 28.26 (+3.50, +14.14%) | 34.67 (+3.61, +11.61%) |
| bergamot | 31.07 | 25.50 | 29.77 | 35.43 | 49.85 | 32.37 | 37.60 | 38.50 | 32.41 | 42.10 | 24.76 | 33.16 | 32.67 | 27.87 | 36.01 | 22.27 | 44.87 | 32.38 | 32.01 |
| google | 33.36 (+2.29, +7.36%) | 28.60 (+3.10, +12.16%) | 28.97 (-0.80, -2.69%) | 37.81 (+2.38, +6.70%) | 53.75 (+3.90, +7.82%) | 35.80 (+3.43, +10.61%) | 42.05 (+4.45, +11.84%) | 41.30 (+2.80, +7.27%) | 34.74 (+2.32, +7.17%) | 44.60 (+2.50, +5.94%) | 27.72 (+2.96, +11.95%) | 35.65 (+2.49, +7.52%) | 34.50 (+1.83, +5.59%) | 31.23 (+3.37, +12.08%) | 29.47 (-6.54, -18.15%) | 25.50 (+3.23, +14.52%) | 46.60 (+1.73, +3.86%) | 33.64 (+1.27, +3.91%) | 33.16 (+1.14, +3.58%) |
| microsoft | 34.67 (+3.61, +11.61%) | 28.47 (+2.97, +11.63%) | 32.30 (+2.53, +8.51%) | 39.13 (+3.70, +10.44%) | 50.15 (+0.30, +0.60%) | 36.17 (+3.80, +11.74%) | 42.90 (+5.30, +14.10%) | 41.20 (+2.70, +7.01%) | 33.76 (+1.35, +4.17%) | 38.55 (-3.55, -8.43%) | 28.26 (+3.50, +14.14%) | 37.73 (+4.57, +13.79%) | 34.55 (+1.88, +5.74%) | 31.83 (+3.97, +14.23%) | 36.48 (+0.47, +1.31%) | 24.83 (+2.57, +11.53%) | 46.47 (+1.60, +3.57%) | 32.93 (+0.56, +1.72%) | 33.54 (+1.53, +4.79%) |
![Results](img/avg-bleu.png)
---
## cs-en
| Translator/Dataset | wmt08 | wmt17 | wmt10 | flores-dev | wmt22 | flores-test | wmt12 | wmt11 | wmt14 | wmt15 | wmt16 | wmt13 | wmt18 | wmt09 | wmt21 | wmt20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 24.50 | 30.20 | 28.20 | 35.30 | 44.50 | 35.30 | 26.50 | 28.10 | 35.00 | 32.00 | 33.40 | 30.30 | 31.30 | 27.60 | 27.90 | 27.00 |
| google | 26.30 (+1.80, +7.35%) | 31.20 (+1.00, +3.31%) | 30.50 (+2.30, +8.16%) | 38.60 (+3.30, +9.35%) | 49.40 (+4.90, +11.01%) | 39.00 (+3.70, +10.48%) | 28.60 (+2.10, +7.92%) | 30.20 (+2.10, +7.47%) | 38.00 (+3.00, +8.57%) | 33.60 (+1.60, +5.00%) | 34.80 (+1.40, +4.19%) | 32.40 (+2.10, +6.93%) | 32.10 (+0.80, +2.56%) | 29.90 (+2.30, +8.33%) | 30.70 (+2.80, +10.04%) | 28.40 (+1.40, +5.19%) |
| microsoft | 26.40 (+1.90, +7.76%) | 33.60 (+3.40, +11.26%) | 30.70 (+2.50, +8.87%) | 40.00 (+4.70, +13.31%) | 54.90 (+10.40, +23.37%) | 40.30 (+5.00, +14.16%) | 29.70 (+3.20, +12.08%) | 30.90 (+2.80, +9.96%) | 39.90 (+4.90, +14.00%) | 34.70 (+2.70, +8.44%) | 38.30 (+4.90, +14.67%) | 33.40 (+3.10, +10.23%) | 34.30 (+3.00, +9.58%) | 29.60 (+2.00, +7.25%) | 30.50 (+2.60, +9.32%) | 27.60 (+0.60, +2.22%) |
![Results](img/cs-en-bleu.png)
---
## en-et
| Translator/Dataset | flores-dev | flores-test | wmt18 |
| --- | --- | --- | --- |
| bergamot | 25.60 | 25.70 | 25.20 |
| google | 30.20 (+4.60, +17.97%) | 29.00 (+3.30, +12.84%) | 26.60 (+1.40, +5.56%) |
| microsoft | 28.60 (+3.00, +11.72%) | 29.20 (+3.50, +13.62%) | 27.60 (+2.40, +9.52%) |
![Results](img/en-et-bleu.png)
---
## en-it
| Translator/Dataset | flores-test | flores-dev | wmt09 |
| --- | --- | --- | --- |
| bergamot | 29.30 | 29.20 | 30.80 |
| google | 29.60 (+0.30, +1.02%) | 28.50 (-0.70, -2.40%) | 28.80 (-2.00, -6.49%) |
| microsoft | 32.10 (+2.80, +9.56%) | 31.10 (+1.90, +6.51%) | 33.70 (+2.90, +9.42%) |
![Results](img/en-it-bleu.png)
---
## fr-en
| Translator/Dataset | wmt08 | mtedx_test | iwslt17 | wmt10 | flores-dev | flores-test | wmt12 | wmt11 | wmt14 | wmt15 | wmt13 | wmt09 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 24.50 | 42.80 | 39.80 | 31.40 | 43.80 | 42.10 | 31.90 | 32.00 | 37.30 | 37.20 | 33.30 | 29.10 |
| google | 26.60 (+2.10, +8.57%) | 42.70 (-0.10, -0.23%) | 40.60 (+0.80, +2.01%) | 34.10 (+2.70, +8.60%) | 48.70 (+4.90, +11.19%) | 46.70 (+4.60, +10.93%) | 33.80 (+1.90, +5.96%) | 34.30 (+2.30, +7.19%) | 40.60 (+3.30, +8.85%) | 39.90 (+2.70, +7.26%) | 34.50 (+1.20, +3.60%) | 31.20 (+2.10, +7.22%) |
| microsoft | 27.40 (+2.90, +11.84%) | 46.40 (+3.60, +8.41%) | 41.80 (+2.00, +5.03%) | 35.00 (+3.60, +11.46%) | 48.90 (+5.10, +11.64%) | 47.00 (+4.90, +11.64%) | 34.60 (+2.70, +8.46%) | 35.20 (+3.20, +10.00%) | 42.30 (+5.00, +13.40%) | 42.70 (+5.50, +14.78%) | 36.10 (+2.80, +8.41%) | 32.20 (+3.10, +10.65%) |
![Results](img/fr-en-bleu.png)
---
## en-pt
| Translator/Dataset | flores-test | flores-dev |
@@ -76,26 +120,15 @@ Both absolute and relative differences in BLEU scores between Bergamot and other
![Results](img/en-pt-bleu.png)
---
## pt-en
## et-en
| Translator/Dataset | flores-dev | mtedx_test | flores-test |
| Translator/Dataset | flores-dev | flores-test | wmt18 |
| --- | --- | --- | --- |
| bergamot | 47.80 | 40.20 | 46.60 |
| google | 50.40 (+2.60, +5.44%) | 39.10 (-1.10, -2.74%) | 50.30 (+3.70, +7.94%) |
| microsoft | 49.80 (+2.00, +4.18%) | 41.00 (+0.80, +1.99%) | 48.60 (+2.00, +4.29%) |
| bergamot | 33.50 | 32.70 | 30.90 |
| google | 38.30 (+4.80, +14.33%) | 37.00 (+4.30, +13.15%) | 32.10 (+1.20, +3.88%) |
| microsoft | 37.40 (+3.90, +11.64%) | 37.00 (+4.30, +13.15%) | 34.10 (+3.20, +10.36%) |
![Results](img/pt-en-bleu.png)
---
## en-bg
| Translator/Dataset | flores-dev | flores-test |
| --- | --- | --- |
| bergamot | 42.00 | 42.20 |
| google | 44.10 (+2.10, +5.00%) | 45.10 (+2.90, +6.87%) |
| microsoft | 38.00 (-4.00, -9.52%) | 39.10 (-3.10, -7.35%) |
![Results](img/en-bg-bleu.png)
![Results](img/et-en-bleu.png)
---
## nb-en
@@ -109,127 +142,6 @@ Both absolute and relative differences in BLEU scores between Bergamot and other
![Results](img/nb-en-bleu.png)
---
## it-en
| Translator/Dataset | flores-dev | mtedx_test | wmt09 | flores-test |
| --- | --- | --- | --- | --- |
| bergamot | 31.10 | 35.70 | 33.50 | 30.40 |
| google | 33.40 (+2.30, +7.40%) | 35.90 (+0.20, +0.56%) | 35.40 (+1.90, +5.67%) | 33.30 (+2.90, +9.54%) |
| microsoft | 33.30 (+2.20, +7.07%) | 36.40 (+0.70, +1.96%) | 35.80 (+2.30, +6.87%) | 32.70 (+2.30, +7.57%) |
![Results](img/it-en-bleu.png)
---
## en-et
| Translator/Dataset | flores-test | wmt18 | flores-dev |
| --- | --- | --- | --- |
| bergamot | 25.70 | 25.20 | 25.60 |
| google | 29.00 (+3.30, +12.84%) | 26.60 (+1.40, +5.56%) | 30.20 (+4.60, +17.97%) |
| microsoft | 29.20 (+3.50, +13.62%) | 27.60 (+2.40, +9.52%) | 28.60 (+3.00, +11.72%) |
![Results](img/en-et-bleu.png)
---
## fr-en
| Translator/Dataset | flores-dev | iwslt17 | wmt10 | wmt08 | mtedx_test | wmt12 | wmt15 | wmt09 | wmt14 | wmt11 | wmt13 | flores-test |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 43.80 | 39.80 | 31.40 | 24.50 | 42.80 | 31.90 | 37.20 | 29.10 | 37.30 | 32.00 | 33.30 | 42.10 |
| google | 48.70 (+4.90, +11.19%) | 40.60 (+0.80, +2.01%) | 34.10 (+2.70, +8.60%) | 26.60 (+2.10, +8.57%) | 42.70 (-0.10, -0.23%) | 33.80 (+1.90, +5.96%) | 39.90 (+2.70, +7.26%) | 31.20 (+2.10, +7.22%) | 40.60 (+3.30, +8.85%) | 34.30 (+2.30, +7.19%) | 34.50 (+1.20, +3.60%) | 46.70 (+4.60, +10.93%) |
| microsoft | 48.90 (+5.10, +11.64%) | 41.80 (+2.00, +5.03%) | 35.00 (+3.60, +11.46%) | 27.40 (+2.90, +11.84%) | 46.40 (+3.60, +8.41%) | 34.60 (+2.70, +8.46%) | 42.70 (+5.50, +14.78%) | 32.20 (+3.10, +10.65%) | 42.30 (+5.00, +13.40%) | 35.20 (+3.20, +10.00%) | 36.10 (+2.80, +8.41%) | 47.00 (+4.90, +11.64%) |
![Results](img/fr-en-bleu.png)
---
## en-de
| Translator/Dataset | wmt17 | wmt18 | iwslt17 | wmt13 | flores-test | wmt10 | wmt11 | wmt19 | wmt14 | wmt09 | wmt20 | wmt08 | wmt15 | wmt12 | wmt21 | flores-dev | wmt22 | wmt16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 32.00 | 47.70 | 26.70 | 28.20 | 38.80 | 26.80 | 23.40 | 44.50 | 29.80 | 23.00 | 35.70 | 23.60 | 33.10 | 24.30 | 27.70 | 38.80 | 32.10 | 40.00 |
| google | 31.50 (-0.50, -1.56%) | 47.80 (+0.10, +0.21%) | 28.90 (+2.20, +8.24%) | 28.80 (+0.60, +2.13%) | 42.30 (+3.50, +9.02%) | 26.50 (-0.30, -1.12%) | 24.10 (+0.70, +2.99%) | 43.50 (-1.00, -2.25%) | 30.90 (+1.10, +3.69%) | 23.60 (+0.60, +2.61%) | 36.50 (+0.80, +2.24%) | 23.70 (+0.10, +0.42%) | 33.70 (+0.60, +1.81%) | 24.70 (+0.40, +1.65%) | 29.70 (+2.00, +7.22%) | 43.70 (+4.90, +12.63%) | 38.30 (+6.20, +19.31%) | 38.60 (-1.40, -3.50%) |
| microsoft | 33.10 (+1.10, +3.44%) | 48.70 (+1.00, +2.10%) | 28.20 (+1.50, +5.62%) | 28.80 (+0.60, +2.13%) | 42.90 (+4.10, +10.57%) | 27.20 (+0.40, +1.49%) | 23.70 (+0.30, +1.28%) | 43.80 (-0.70, -1.57%) | 32.20 (+2.40, +8.05%) | 23.90 (+0.90, +3.91%) | 36.10 (+0.40, +1.12%) | 24.00 (+0.40, +1.69%) | 34.30 (+1.20, +3.63%) | 25.30 (+1.00, +4.12%) | 29.80 (+2.10, +7.58%) | 44.00 (+5.20, +13.40%) | 37.30 (+5.20, +16.20%) | 40.50 (+0.50, +1.25%) |
![Results](img/en-de-bleu.png)
---
## es-en
| Translator/Dataset | flores-dev | wmt10 | wmt08 | mtedx_test | wmt12 | wmt09 | wmt11 | wmt13 | flores-test |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 27.50 | 35.80 | 27.30 | 36.80 | 38.30 | 29.40 | 34.30 | 35.20 | 26.80 |
| google | 30.50 (+3.00, +10.91%) | 37.00 (+1.20, +3.35%) | 28.30 (+1.00, +3.66%) | 35.40 (-1.40, -3.80%) | 38.80 (+0.50, +1.31%) | 31.60 (+2.20, +7.48%) | 35.20 (+0.90, +2.62%) | 35.70 (+0.50, +1.42%) | 30.30 (+3.50, +13.06%) |
| microsoft | 30.30 (+2.80, +10.18%) | 35.40 (-0.40, -1.12%) | 26.80 (-0.50, -1.83%) | 37.60 (+0.80, +2.17%) | 37.80 (-0.50, -1.31%) | 29.60 (+0.20, +0.68%) | 33.70 (-0.60, -1.75%) | 35.30 (+0.10, +0.28%) | 29.90 (+3.10, +11.57%) |
![Results](img/es-en-bleu.png)
---
## en-it
| Translator/Dataset | flores-test | wmt09 | flores-dev |
| --- | --- | --- | --- |
| bergamot | 29.30 | 30.80 | 29.20 |
| google | 29.60 (+0.30, +1.02%) | 28.80 (-2.00, -6.49%) | 28.50 (-0.70, -2.40%) |
| microsoft | 32.10 (+2.80, +9.56%) | 33.70 (+2.90, +9.42%) | 31.10 (+1.90, +6.51%) |
![Results](img/en-it-bleu.png)
---
## en-pl
| Translator/Dataset | flores-test | wmt20 | flores-dev |
| --- | --- | --- | --- |
| bergamot | 21.00 | 25.10 | 20.70 |
| google | 24.40 (+3.40, +16.19%) | 27.90 (+2.80, +11.16%) | 24.20 (+3.50, +16.91%) |
| microsoft | 23.80 (+2.80, +13.33%) | 27.70 (+2.60, +10.36%) | 23.00 (+2.30, +11.11%) |
![Results](img/en-pl-bleu.png)
---
## en-es
| Translator/Dataset | wmt11 | wmt08 | wmt12 | wmt09 | flores-dev | wmt13 | wmt10 | flores-test |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 37.90 | 29.00 | 38.90 | 29.90 | 25.90 | 34.80 | 36.70 | 26.20 |
| google | 39.90 (+2.00, +5.28%) | 30.00 (+1.00, +3.45%) | 40.50 (+1.60, +4.11%) | 30.90 (+1.00, +3.34%) | 30.50 (+4.60, +17.76%) | 36.90 (+2.10, +6.03%) | 38.80 (+2.10, +5.72%) | 30.40 (+4.20, +16.03%) |
| microsoft | 39.10 (+1.20, +3.17%) | 29.90 (+0.90, +3.10%) | 40.00 (+1.10, +2.83%) | 30.70 (+0.80, +2.68%) | 28.40 (+2.50, +9.65%) | 35.70 (+0.90, +2.59%) | 37.80 (+1.10, +3.00%) | 28.50 (+2.30, +8.78%) |
![Results](img/en-es-bleu.png)
---
## pl-en
| Translator/Dataset | flores-dev | wmt20 | flores-test |
| --- | --- | --- | --- |
| bergamot | 26.80 | 31.00 | 25.80 |
| google | 30.00 (+3.20, +11.94%) | 34.10 (+3.10, +10.00%) | 29.60 (+3.80, +14.73%) |
| microsoft | 30.10 (+3.30, +12.31%) | 35.50 (+4.50, +14.52%) | 29.90 (+4.10, +15.89%) |
![Results](img/pl-en-bleu.png)
---
## en-fr
| Translator/Dataset | wmt13 | wmt14 | flores-test | wmt08 | wmt09 | flores-dev | iwslt17 | wmt12 | wmt10 | wmt15 | wmt11 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 33.40 | 39.70 | 48.70 | 25.50 | 28.80 | 48.50 | 38.60 | 31.40 | 31.00 | 36.90 | 33.60 |
| google | 26.50 (-6.90, -20.66%) | 32.60 (-7.10, -17.88%) | 41.80 (-6.90, -14.17%) | 20.70 (-4.80, -18.82%) | 23.50 (-5.30, -18.40%) | 41.30 (-7.20, -14.85%) | 28.00 (-10.60, -27.46%) | 25.10 (-6.30, -20.06%) | 26.60 (-4.40, -14.19%) | 30.60 (-6.30, -17.07%) | 27.50 (-6.10, -18.15%) |
| microsoft | 31.50 (-1.90, -5.69%) | 40.40 (+0.70, +1.76%) | 52.70 (+4.00, +8.21%) | 25.10 (-0.40, -1.57%) | 28.20 (-0.60, -2.08%) | 52.50 (+4.00, +8.25%) | 36.50 (-2.10, -5.44%) | 29.60 (-1.80, -5.73%) | 33.00 (+2.00, +6.45%) | 39.70 (+2.80, +7.59%) | 32.10 (-1.50, -4.46%) |
![Results](img/en-fr-bleu.png)
---
## et-en
| Translator/Dataset | flores-dev | wmt18 | flores-test |
| --- | --- | --- | --- |
| bergamot | 33.50 | 30.90 | 32.70 |
| google | 38.30 (+4.80, +14.33%) | 32.10 (+1.20, +3.88%) | 37.00 (+4.30, +13.15%) |
| microsoft | 37.40 (+3.90, +11.64%) | 34.10 (+3.20, +10.36%) | 37.00 (+4.30, +13.15%) |
![Results](img/et-en-bleu.png)
---
## bg-en
| Translator/Dataset | flores-dev | flores-test |
@@ -241,35 +153,123 @@ Both absolute and relative differences in BLEU scores between Bergamot and other
![Results](img/bg-en-bleu.png)
---
## de-en
## en-es
| Translator/Dataset | flores-dev | iwslt17 | wmt10 | wmt08 | wmt18 | wmt20 | wmt19 | wmt12 | wmt15 | wmt09 | wmt17 | wmt14 | wmt11 | wmt16 | wmt22 | wmt13 | flores-test | wmt21 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 39.60 | 28.60 | 29.00 | 26.00 | 43.30 | 38.80 | 39.00 | 27.60 | 33.50 | 26.40 | 35.00 | 33.50 | 26.30 | 39.60 | 29.20 | 30.80 | 39.10 | 31.50 |
| google | 43.10 (+3.50, +8.84%) | 30.10 (+1.50, +5.24%) | 32.10 (+3.10, +10.69%) | 27.60 (+1.60, +6.15%) | 46.20 (+2.90, +6.70%) | 41.80 (+3.00, +7.73%) | 41.10 (+2.10, +5.38%) | 29.50 (+1.90, +6.88%) | 36.10 (+2.60, +7.76%) | 27.20 (+0.80, +3.03%) | 38.70 (+3.70, +10.57%) | 37.40 (+3.90, +11.64%) | 27.30 (+1.00, +3.80%) | 42.30 (+2.70, +6.82%) | 33.30 (+4.10, +14.04%) | 32.40 (+1.60, +5.19%) | 42.80 (+3.70, +9.46%) | 32.70 (+1.20, +3.81%) |
| microsoft | 44.90 (+5.30, +13.38%) | 32.50 (+3.90, +13.64%) | 33.40 (+4.40, +15.17%) | 29.40 (+3.40, +13.08%) | 49.60 (+6.30, +14.55%) | 43.60 (+4.80, +12.37%) | 43.80 (+4.80, +12.31%) | 31.30 (+3.70, +13.41%) | 38.10 (+4.60, +13.73%) | 29.10 (+2.70, +10.23%) | 40.80 (+5.80, +16.57%) | 39.20 (+5.70, +17.01%) | 29.20 (+2.90, +11.03%) | 46.30 (+6.70, +16.92%) | 33.50 (+4.30, +14.73%) | 34.30 (+3.50, +11.36%) | 45.80 (+6.70, +17.14%) | 34.30 (+2.80, +8.89%) |
| Translator/Dataset | wmt08 | wmt11 | wmt09 | flores-dev | wmt10 | wmt13 | wmt12 | flores-test |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 29.00 | 37.90 | 29.90 | 25.90 | 36.70 | 34.80 | 38.90 | 26.20 |
| google | 30.00 (+1.00, +3.45%) | 39.90 (+2.00, +5.28%) | 30.90 (+1.00, +3.34%) | 30.50 (+4.60, +17.76%) | 38.80 (+2.10, +5.72%) | 36.90 (+2.10, +6.03%) | 40.50 (+1.60, +4.11%) | 30.40 (+4.20, +16.03%) |
| microsoft | 29.90 (+0.90, +3.10%) | 39.10 (+1.20, +3.17%) | 30.70 (+0.80, +2.68%) | 28.40 (+2.50, +9.65%) | 37.80 (+1.10, +3.00%) | 35.70 (+0.90, +2.59%) | 40.00 (+1.10, +2.83%) | 28.50 (+2.30, +8.78%) |
![Results](img/de-en-bleu.png)
![Results](img/en-es-bleu.png)
---
## en-bg
| Translator/Dataset | flores-dev | flores-test |
| --- | --- | --- |
| bergamot | 42.00 | 42.20 |
| google | 44.10 (+2.10, +5.00%) | 45.10 (+2.90, +6.87%) |
| microsoft | 38.00 (-4.00, -9.52%) | 39.10 (-3.10, -7.35%) |
![Results](img/en-bg-bleu.png)
---
## en-cs
| Translator/Dataset | wmt21 | wmt11 | wmt09 | wmt19 | wmt16 | wmt20 | flores-dev | wmt13 | wmt08 | wmt15 | wmt18 | wmt10 | wmt12 | wmt22 | wmt14 | wmt17 | flores-test |
| Translator/Dataset | wmt12 | flores-dev | wmt19 | wmt21 | wmt14 | wmt17 | wmt20 | wmt15 | wmt22 | wmt08 | wmt11 | wmt18 | wmt09 | wmt10 | flores-test | wmt13 | wmt16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 19.50 | 20.70 | 20.80 | 27.10 | 25.80 | 32.70 | 30.10 | 23.40 | 19.00 | 25.40 | 22.70 | 20.90 | 18.80 | 31.70 | 28.70 | 23.40 | 30.20 |
| google | 21.80 (+2.30, +11.79%) | 23.00 (+2.30, +11.11%) | 22.60 (+1.80, +8.65%) | 27.20 (+0.10, +0.37%) | 28.30 (+2.50, +9.69%) | 35.50 (+2.80, +8.56%) | 34.10 (+4.00, +13.29%) | 25.20 (+1.80, +7.69%) | 20.50 (+1.50, +7.89%) | 26.80 (+1.40, +5.51%) | 24.40 (+1.70, +7.49%) | 22.40 (+1.50, +7.18%) | 20.70 (+1.90, +10.11%) | 48.40 (+16.70, +52.68%) | 31.20 (+2.50, +8.71%) | 24.70 (+1.30, +5.56%) | 34.40 (+4.20, +13.91%) |
| microsoft | 22.00 (+2.50, +12.82%) | 25.30 (+4.60, +22.22%) | 25.00 (+4.20, +20.19%) | 27.20 (+0.10, +0.37%) | 29.90 (+4.10, +15.89%) | 34.10 (+1.40, +4.28%) | 33.50 (+3.40, +11.30%) | 27.70 (+4.30, +18.38%) | 22.60 (+3.60, +18.95%) | 27.40 (+2.00, +7.87%) | 24.90 (+2.20, +9.69%) | 24.30 (+3.40, +16.27%) | 22.90 (+4.10, +21.81%) | 42.10 (+10.40, +32.81%) | 31.90 (+3.20, +11.15%) | 25.60 (+2.20, +9.40%) | 34.00 (+3.80, +12.58%) |
| bergamot | 18.80 | 30.10 | 27.10 | 19.50 | 28.70 | 23.40 | 32.70 | 25.40 | 31.70 | 19.00 | 20.70 | 22.70 | 20.80 | 20.90 | 30.20 | 23.40 | 25.80 |
| google | 20.70 (+1.90, +10.11%) | 34.10 (+4.00, +13.29%) | 27.20 (+0.10, +0.37%) | 21.80 (+2.30, +11.79%) | 31.20 (+2.50, +8.71%) | 24.70 (+1.30, +5.56%) | 35.50 (+2.80, +8.56%) | 26.80 (+1.40, +5.51%) | 48.40 (+16.70, +52.68%) | 20.50 (+1.50, +7.89%) | 23.00 (+2.30, +11.11%) | 24.40 (+1.70, +7.49%) | 22.60 (+1.80, +8.65%) | 22.40 (+1.50, +7.18%) | 34.40 (+4.20, +13.91%) | 25.20 (+1.80, +7.69%) | 28.30 (+2.50, +9.69%) |
| microsoft | 22.90 (+4.10, +21.81%) | 33.50 (+3.40, +11.30%) | 27.20 (+0.10, +0.37%) | 22.00 (+2.50, +12.82%) | 31.90 (+3.20, +11.15%) | 25.60 (+2.20, +9.40%) | 34.10 (+1.40, +4.28%) | 27.40 (+2.00, +7.87%) | 42.10 (+10.40, +32.81%) | 22.60 (+3.60, +18.95%) | 25.30 (+4.60, +22.22%) | 24.90 (+2.20, +9.69%) | 25.00 (+4.20, +20.19%) | 24.30 (+3.40, +16.27%) | 34.00 (+3.80, +12.58%) | 27.70 (+4.30, +18.38%) | 29.90 (+4.10, +15.89%) |
![Results](img/en-cs-bleu.png)
---
## cs-en
## de-en
| Translator/Dataset | flores-dev | wmt10 | wmt08 | wmt18 | wmt20 | wmt12 | wmt15 | wmt09 | wmt17 | wmt14 | wmt11 | wmt16 | wmt22 | wmt13 | flores-test | wmt21 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 35.30 | 28.20 | 24.50 | 31.30 | 27.00 | 26.50 | 32.00 | 27.60 | 30.20 | 35.00 | 28.10 | 33.40 | 44.50 | 30.30 | 35.30 | 27.90 |
| google | 38.60 (+3.30, +9.35%) | 30.50 (+2.30, +8.16%) | 26.30 (+1.80, +7.35%) | 32.10 (+0.80, +2.56%) | 28.40 (+1.40, +5.19%) | 28.60 (+2.10, +7.92%) | 33.60 (+1.60, +5.00%) | 29.90 (+2.30, +8.33%) | 31.20 (+1.00, +3.31%) | 38.00 (+3.00, +8.57%) | 30.20 (+2.10, +7.47%) | 34.80 (+1.40, +4.19%) | 49.40 (+4.90, +11.01%) | 32.40 (+2.10, +6.93%) | 39.00 (+3.70, +10.48%) | 30.70 (+2.80, +10.04%) |
| microsoft | 40.00 (+4.70, +13.31%) | 30.70 (+2.50, +8.87%) | 26.40 (+1.90, +7.76%) | 34.30 (+3.00, +9.58%) | 27.60 (+0.60, +2.22%) | 29.70 (+3.20, +12.08%) | 34.70 (+2.70, +8.44%) | 29.60 (+2.00, +7.25%) | 33.60 (+3.40, +11.26%) | 39.90 (+4.90, +14.00%) | 30.90 (+2.80, +9.96%) | 38.30 (+4.90, +14.67%) | 54.90 (+10.40, +23.37%) | 33.40 (+3.10, +10.23%) | 40.30 (+5.00, +14.16%) | 30.50 (+2.60, +9.32%) |
| Translator/Dataset | wmt08 | wmt19 | wmt17 | iwslt17 | wmt10 | flores-dev | wmt22 | flores-test | wmt12 | wmt11 | wmt14 | wmt15 | wmt16 | wmt13 | wmt18 | wmt09 | wmt21 | wmt20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 26.00 | 39.00 | 35.00 | 28.60 | 29.00 | 39.60 | 29.20 | 39.10 | 27.60 | 26.30 | 33.50 | 33.50 | 39.60 | 30.80 | 43.30 | 26.40 | 31.50 | 38.80 |
| google | 27.60 (+1.60, +6.15%) | 41.10 (+2.10, +5.38%) | 38.70 (+3.70, +10.57%) | 30.10 (+1.50, +5.24%) | 32.10 (+3.10, +10.69%) | 43.10 (+3.50, +8.84%) | 33.30 (+4.10, +14.04%) | 42.80 (+3.70, +9.46%) | 29.50 (+1.90, +6.88%) | 27.30 (+1.00, +3.80%) | 37.40 (+3.90, +11.64%) | 36.10 (+2.60, +7.76%) | 42.30 (+2.70, +6.82%) | 32.40 (+1.60, +5.19%) | 46.20 (+2.90, +6.70%) | 27.20 (+0.80, +3.03%) | 32.70 (+1.20, +3.81%) | 41.80 (+3.00, +7.73%) |
| microsoft | 29.40 (+3.40, +13.08%) | 43.80 (+4.80, +12.31%) | 40.80 (+5.80, +16.57%) | 32.50 (+3.90, +13.64%) | 33.40 (+4.40, +15.17%) | 44.90 (+5.30, +13.38%) | 33.50 (+4.30, +14.73%) | 45.80 (+6.70, +17.14%) | 31.30 (+3.70, +13.41%) | 29.20 (+2.90, +11.03%) | 39.20 (+5.70, +17.01%) | 38.10 (+4.60, +13.73%) | 46.30 (+6.70, +16.92%) | 34.30 (+3.50, +11.36%) | 49.60 (+6.30, +14.55%) | 29.10 (+2.70, +10.23%) | 34.30 (+2.80, +8.89%) | 43.60 (+4.80, +12.37%) |
![Results](img/cs-en-bleu.png)
![Results](img/de-en-bleu.png)
---
## it-en
| Translator/Dataset | mtedx_test | flores-dev | flores-test | wmt09 |
| --- | --- | --- | --- | --- |
| bergamot | 35.70 | 31.10 | 30.40 | 33.50 |
| google | 35.90 (+0.20, +0.56%) | 33.40 (+2.30, +7.40%) | 33.30 (+2.90, +9.54%) | 35.40 (+1.90, +5.67%) |
| microsoft | 36.40 (+0.70, +1.96%) | 33.30 (+2.20, +7.07%) | 32.70 (+2.30, +7.57%) | 35.80 (+2.30, +6.87%) |
![Results](img/it-en-bleu.png)
---
## pl-en
| Translator/Dataset | flores-dev | flores-test | wmt20 |
| --- | --- | --- | --- |
| bergamot | 26.80 | 25.80 | 31.00 |
| google | 30.00 (+3.20, +11.94%) | 29.60 (+3.80, +14.73%) | 34.10 (+3.10, +10.00%) |
| microsoft | 30.10 (+3.30, +12.31%) | 29.90 (+4.10, +15.89%) | 35.50 (+4.50, +14.52%) |
![Results](img/pl-en-bleu.png)
---
## en-fr
| Translator/Dataset | iwslt17 | wmt13 | wmt09 | wmt11 | wmt10 | wmt08 | wmt12 | flores-test | wmt14 | wmt15 | flores-dev |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 38.60 | 33.40 | 28.80 | 33.60 | 31.00 | 25.50 | 31.40 | 48.70 | 39.70 | 36.90 | 48.50 |
| google | 28.00 (-10.60, -27.46%) | 26.50 (-6.90, -20.66%) | 23.50 (-5.30, -18.40%) | 27.50 (-6.10, -18.15%) | 26.60 (-4.40, -14.19%) | 20.70 (-4.80, -18.82%) | 25.10 (-6.30, -20.06%) | 41.80 (-6.90, -14.17%) | 32.60 (-7.10, -17.88%) | 30.60 (-6.30, -17.07%) | 41.30 (-7.20, -14.85%) |
| microsoft | 36.50 (-2.10, -5.44%) | 31.50 (-1.90, -5.69%) | 28.20 (-0.60, -2.08%) | 32.10 (-1.50, -4.46%) | 33.00 (+2.00, +6.45%) | 25.10 (-0.40, -1.57%) | 29.60 (-1.80, -5.73%) | 52.70 (+4.00, +8.21%) | 40.40 (+0.70, +1.76%) | 39.70 (+2.80, +7.59%) | 52.50 (+4.00, +8.25%) |
![Results](img/en-fr-bleu.png)
---
## en-pl
| Translator/Dataset | wmt20 | flores-dev | flores-test |
| --- | --- | --- | --- |
| bergamot | 25.10 | 20.70 | 21.00 |
| google | 27.90 (+2.80, +11.16%) | 24.20 (+3.50, +16.91%) | 24.40 (+3.40, +16.19%) |
| microsoft | 27.70 (+2.60, +10.36%) | 23.00 (+2.30, +11.11%) | 23.80 (+2.80, +13.33%) |
![Results](img/en-pl-bleu.png)
---
## pt-en
| Translator/Dataset | mtedx_test | flores-dev | flores-test |
| --- | --- | --- | --- |
| bergamot | 40.20 | 47.80 | 46.60 |
| google | 39.10 (-1.10, -2.74%) | 50.40 (+2.60, +5.44%) | 50.30 (+3.70, +7.94%) |
| microsoft | 41.00 (+0.80, +1.99%) | 49.80 (+2.00, +4.18%) | 48.60 (+2.00, +4.29%) |
![Results](img/pt-en-bleu.png)
---
## es-en
| Translator/Dataset | wmt08 | mtedx_test | wmt10 | flores-dev | flores-test | wmt12 | wmt11 | wmt13 | wmt09 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 27.30 | 36.80 | 35.80 | 27.50 | 26.80 | 38.30 | 34.30 | 35.20 | 29.40 |
| google | 28.30 (+1.00, +3.66%) | 35.40 (-1.40, -3.80%) | 37.00 (+1.20, +3.35%) | 30.50 (+3.00, +10.91%) | 30.30 (+3.50, +13.06%) | 38.80 (+0.50, +1.31%) | 35.20 (+0.90, +2.62%) | 35.70 (+0.50, +1.42%) | 31.60 (+2.20, +7.48%) |
| microsoft | 26.80 (-0.50, -1.83%) | 37.60 (+0.80, +2.17%) | 35.40 (-0.40, -1.12%) | 30.30 (+2.80, +10.18%) | 29.90 (+3.10, +11.57%) | 37.80 (-0.50, -1.31%) | 33.70 (-0.60, -1.75%) | 35.30 (+0.10, +0.28%) | 29.60 (+0.20, +0.68%) |
![Results](img/es-en-bleu.png)
---
## en-de
| Translator/Dataset | flores-test | wmt22 | iwslt17 | wmt20 | wmt08 | wmt12 | wmt19 | wmt10 | wmt11 | flores-dev | wmt09 | wmt13 | wmt17 | wmt16 | wmt18 | wmt14 | wmt21 | wmt15 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bergamot | 38.80 | 32.10 | 26.70 | 35.70 | 23.60 | 24.30 | 44.50 | 26.80 | 23.40 | 38.80 | 23.00 | 28.20 | 32.00 | 40.00 | 47.70 | 29.80 | 27.70 | 33.10 |
| google | 42.30 (+3.50, +9.02%) | 38.30 (+6.20, +19.31%) | 28.90 (+2.20, +8.24%) | 36.50 (+0.80, +2.24%) | 23.70 (+0.10, +0.42%) | 24.70 (+0.40, +1.65%) | 43.50 (-1.00, -2.25%) | 26.50 (-0.30, -1.12%) | 24.10 (+0.70, +2.99%) | 43.70 (+4.90, +12.63%) | 23.60 (+0.60, +2.61%) | 28.80 (+0.60, +2.13%) | 31.50 (-0.50, -1.56%) | 38.60 (-1.40, -3.50%) | 47.80 (+0.10, +0.21%) | 30.90 (+1.10, +3.69%) | 29.70 (+2.00, +7.22%) | 33.70 (+0.60, +1.81%) |
| microsoft | 42.90 (+4.10, +10.57%) | 37.30 (+5.20, +16.20%) | 28.20 (+1.50, +5.62%) | 36.10 (+0.40, +1.12%) | 24.00 (+0.40, +1.69%) | 25.30 (+1.00, +4.12%) | 43.80 (-0.70, -1.57%) | 27.20 (+0.40, +1.49%) | 23.70 (+0.30, +1.28%) | 44.00 (+5.20, +13.40%) | 23.90 (+0.90, +3.91%) | 28.80 (+0.60, +2.13%) | 33.10 (+1.10, +3.44%) | 40.50 (+0.50, +1.25%) | 48.70 (+1.00, +2.10%) | 32.20 (+2.40, +8.05%) | 29.80 (+2.10, +7.58%) | 34.30 (+1.20, +3.63%) |
![Results](img/en-de-bleu.png)
---
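
Each `google` and `microsoft` cell in the tables above pairs a BLEU score with its absolute and relative difference from the Bergamot score on the same dataset. A minimal Python sketch of that arithmetic follows, assuming the relative difference is a percentage of the Bergamot score. The helper name `format_cell` is hypothetical and is not taken from the evaluation scripts.

```python
def format_cell(bergamot: float, other: float) -> str:
    """Render a score as 'other (abs_diff, rel_diff%)' against the Bergamot baseline."""
    abs_diff = other - bergamot
    rel_diff = abs_diff / bergamot * 100  # assumption: percent of the Bergamot score
    return f"{other:.2f} ({abs_diff:+.2f}, {rel_diff:+.2f}%)"


# en-fr / iwslt17 values from the table above: bergamot 38.60, google 28.00
print(format_cell(38.60, 28.00))  # 28.00 (-10.60, -27.46%)
```

A negative pair, as in most of the en-fr `google` row, means Bergamot scored higher on that dataset.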

File diff suppressed because it is too large.

Binary file not shown: evaluation/prod/img/avg-bleu.png (before: 26 KiB, after: 26 KiB)
Binary file not shown: evaluation/prod/img/avg-comet.png (before: 27 KiB, after: 27 KiB)
Binary file not shown: evaluation/prod/img/bg-en-comet.png (before: 21 KiB, after: 21 KiB)
Binary file not shown: evaluation/prod/img/cs-en-bleu.png (before: 25 KiB, after: 25 KiB)
Binary file not shown: evaluation/prod/img/cs-en-comet.png (before: 28 KiB, after: 28 KiB)
Binary file not shown: evaluation/prod/img/de-en-bleu.png (before: 26 KiB, after: 26 KiB)
Binary file not shown: evaluation/prod/img/de-en-comet.png (before: 29 KiB, after: 29 KiB)
Binary file not shown: evaluation/prod/img/en-cs-bleu.png (before: 25 KiB, after: 25 KiB)
Binary file not shown: evaluation/prod/img/en-cs-comet.png (before: 26 KiB, after: 26 KiB)
Binary file not shown: evaluation/prod/img/en-de-bleu.png (before: 26 KiB, after: 26 KiB)
Binary file not shown: evaluation/prod/img/en-de-comet.png (before: 28 KiB, after: 28 KiB)
Binary file not shown: evaluation/prod/img/en-es-bleu.png (before: 23 KiB, after: 23 KiB)
Binary file not shown: evaluation/prod/img/en-es-comet.png (before: 25 KiB, after: 25 KiB)
Binary file not shown: evaluation/prod/img/en-et-bleu.png (before: 20 KiB, after: 20 KiB)
Binary file not shown: evaluation/prod/img/en-et-comet.png (before: 21 KiB, after: 21 KiB)
Binary file not shown: evaluation/prod/img/en-fr-bleu.png (before: 24 KiB, after: 24 KiB)
Binary file not shown: evaluation/prod/img/en-fr-comet.png (before: 26 KiB, after: 26 KiB)
Binary file not shown: evaluation/prod/img/en-it-bleu.png (before: 21 KiB, after: 21 KiB)
Binary file not shown: evaluation/prod/img/en-it-comet.png (before: 23 KiB, after: 23 KiB)
Binary file not shown: evaluation/prod/img/en-pl-bleu.png (before: 20 KiB, after: 20 KiB)
Binary file not shown: evaluation/prod/img/en-pl-comet.png (before: 23 KiB, after: 23 KiB)
Binary file not shown: evaluation/prod/img/en-pt-comet.png (before: 19 KiB, after: 19 KiB)
Binary file not shown: evaluation/prod/img/es-en-bleu.png (before: 25 KiB, after: 25 KiB)
Binary file not shown: evaluation/prod/img/es-en-comet.png (before: 26 KiB, after: 26 KiB)
Binary file not shown: evaluation/prod/img/et-en-bleu.png (before: 21 KiB, after: 21 KiB)
Binary file not shown: evaluation/prod/img/et-en-comet.png (before: 23 KiB, after: 23 KiB)
Binary file not shown: evaluation/prod/img/fr-en-bleu.png (before: 26 KiB, after: 26 KiB)
Binary file not shown: evaluation/prod/img/fr-en-comet.png (before: 28 KiB, after: 28 KiB)
Binary file not shown: evaluation/prod/img/it-en-bleu.png (before: 23 KiB, after: 23 KiB)
Binary file not shown: evaluation/prod/img/it-en-comet.png (before: 24 KiB, after: 24 KiB)
Binary file not shown: evaluation/prod/img/nb-en-comet.png (before: 22 KiB, after: 22 KiB)
Binary file not shown: evaluation/prod/img/pl-en-bleu.png (before: 21 KiB, after: 21 KiB)
Binary file not shown: evaluation/prod/img/pl-en-comet.png (before: 22 KiB, after: 22 KiB)
Binary file not shown: evaluation/prod/img/pt-en-bleu.png (before: 21 KiB, after: 21 KiB)
Binary file not shown: evaluation/prod/img/pt-en-comet.png (before: 23 KiB, after: 23 KiB)


@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e8f1b9054453a0bced8600d67150b4a468c6607b7ba69600a64d29112b998823
size 2735538


@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d7a76467335bc45389b695465996eecca6a5613ce8a9f3e4db865b2afad1672b
size 12825781


@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a34751849d5cce1430bafb374105252e2b2f6ace8d7e7e70d1a8c5279644702e
size 413033
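
The three added files above are Git LFS pointers: each records the spec version, a SHA-256 object ID, and the byte size of the real file held in LFS storage. The sketch below parses a pointer and checks a local file against it, assuming the real object has already been fetched to disk. `parse_pointer` and `matches_pointer` are hypothetical helper names, not part of Git LFS or of this repository.

```python
import hashlib
from pathlib import Path


def parse_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    if pointer["algo"] != "sha256" or path.stat().st_size != pointer["size"]:
        return False
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Hash in 1 MiB chunks so large model files are not read into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


pointer = parse_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:a34751849d5cce1430bafb374105252e2b2f6ace8d7e7e70d1a8c5279644702e
size 413033
""")
print(pointer["size"])  # 413033
```

The pointer sizes (2735538, 12825781, 413033) equal the `estimatedCompressedSize` values of the `caen` registry entry later in this diff. That suggests the LFS objects are the compressed artifacts, but this is an inference from the numbers, not something the diff states.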


@ -457,6 +457,29 @@
      "modelType": "prod"
    }
  },
  "caen": {
    "model": {
      "name": "model.caen.intgemm.alphas.bin",
      "size": 17140899,
      "estimatedCompressedSize": 12825781,
      "expectedSha256Hash": "3a315266490d87f72adf9e5387ee567b2fb76a30018e51586b882b1d87bf5aed",
      "modelType": "dev"
    },
    "lex": {
      "name": "lex.50.50.caen.s2t.bin",
      "size": 5244644,
      "estimatedCompressedSize": 2735538,
      "expectedSha256Hash": "a648be17d6f008feee687b455d00dbfaedba2ead8bee32658783c4325a8d3ece",
      "modelType": "dev"
    },
    "vocab": {
      "name": "vocab.caen.spm",
      "size": 811443,
      "estimatedCompressedSize": 413033,
      "expectedSha256Hash": "10a1f25e5640f596b547190082f87ba4994f8714693904c82a35d965b9cc7470",
      "modelType": "dev"
    }
  },
  "enfa": {
    "model": {
      "name": "model.enfa.intgemm.alphas.bin",