==========================
x_name: wmt16.bergamot.fi
y_name: wmt16.microsoft.fi

Bootstrap Resampling Results:
x-mean: 0.8815
y-mean: 0.9169
ties (%): 0.0000
x_wins (%): 0.0000
y_wins (%): 1.0000

Paired T-Test Results:
statistic: -28.5455
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.microsoft.fi outperforms wmt16.bergamot.fi.
==========================
x_name: wmt16.bergamot.fi
y_name: wmt16.google.fi

Bootstrap Resampling Results:
x-mean: 0.8815
y-mean: 0.9096
ties (%): 0.0000
x_wins (%): 0.0000
y_wins (%): 1.0000

Paired T-Test Results:
statistic: -21.9931
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.google.fi outperforms wmt16.bergamot.fi.
==========================
x_name: wmt16.bergamot.fi
y_name: wmt16.argos.fi

Bootstrap Resampling Results:
x-mean: 0.8815
y-mean: 0.8470
ties (%): 0.0000
x_wins (%): 1.0000
y_wins (%): 0.0000

Paired T-Test Results:
statistic: 20.8611
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.bergamot.fi outperforms wmt16.argos.fi.
==========================
x_name: wmt16.bergamot.fi
y_name: wmt16.nllb.fi

Bootstrap Resampling Results:
x-mean: 0.8815
y-mean: 0.8492
ties (%): 0.0000
x_wins (%): 1.0000
y_wins (%): 0.0000

Paired T-Test Results:
statistic: 18.0039
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.bergamot.fi outperforms wmt16.nllb.fi.
==========================
x_name: wmt16.bergamot.fi
y_name: wmt16.opusmt.fi

Bootstrap Resampling Results:
x-mean: 0.8815
y-mean: 0.9086
ties (%): 0.0000
x_wins (%): 0.0000
y_wins (%): 1.0000

Paired T-Test Results:
statistic: -21.5593
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.opusmt.fi outperforms wmt16.bergamot.fi.
==========================
x_name: wmt16.microsoft.fi
y_name: wmt16.google.fi

Bootstrap Resampling Results:
x-mean: 0.9169
y-mean: 0.9096
ties (%): 0.0000
x_wins (%): 1.0000
y_wins (%): 0.0000

Paired T-Test Results:
statistic: 8.8037
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.microsoft.fi outperforms wmt16.google.fi.
==========================
x_name: wmt16.microsoft.fi
y_name: wmt16.argos.fi

Bootstrap Resampling Results:
x-mean: 0.9169
y-mean: 0.8470
ties (%): 0.0000
x_wins (%): 1.0000
y_wins (%): 0.0000

Paired T-Test Results:
statistic: 41.5030
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.microsoft.fi outperforms wmt16.argos.fi.
==========================
x_name: wmt16.microsoft.fi
y_name: wmt16.nllb.fi

Bootstrap Resampling Results:
x-mean: 0.9169
y-mean: 0.8492
ties (%): 0.0000
x_wins (%): 1.0000
y_wins (%): 0.0000

Paired T-Test Results:
statistic: 39.0803
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.microsoft.fi outperforms wmt16.nllb.fi.
==========================
x_name: wmt16.microsoft.fi
y_name: wmt16.opusmt.fi

Bootstrap Resampling Results:
x-mean: 0.9169
y-mean: 0.9086
ties (%): 0.0000
x_wins (%): 1.0000
y_wins (%): 0.0000

Paired T-Test Results:
statistic: 8.6536
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.microsoft.fi outperforms wmt16.opusmt.fi.
==========================
x_name: wmt16.google.fi
y_name: wmt16.argos.fi

Bootstrap Resampling Results:
x-mean: 0.9096
y-mean: 0.8470
ties (%): 0.0000
x_wins (%): 1.0000
y_wins (%): 0.0000

Paired T-Test Results:
statistic: 37.5897
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.google.fi outperforms wmt16.argos.fi.
==========================
x_name: wmt16.google.fi
y_name: wmt16.nllb.fi

Bootstrap Resampling Results:
x-mean: 0.9096
y-mean: 0.8492
ties (%): 0.0000
x_wins (%): 1.0000
y_wins (%): 0.0000

Paired T-Test Results:
statistic: 35.4453
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.google.fi outperforms wmt16.nllb.fi.
==========================
x_name: wmt16.google.fi
y_name: wmt16.opusmt.fi

Bootstrap Resampling Results:
x-mean: 0.9096
y-mean: 0.9086
ties (%): 0.4200
x_wins (%): 0.4967
y_wins (%): 0.0833

Paired T-Test Results:
statistic: 0.8847
p_value: 0.3764
Null hypothesis can't be rejected.
Both systems have equal averages.
==========================
x_name: wmt16.argos.fi
y_name: wmt16.nllb.fi

Bootstrap Resampling Results:
x-mean: 0.8470
y-mean: 0.8492
ties (%): 0.2267
x_wins (%): 0.1133
y_wins (%): 0.6600

Paired T-Test Results:
statistic: -1.2776
p_value: 0.2015
Null hypothesis can't be rejected.
Both systems have equal averages.
==========================
x_name: wmt16.argos.fi
y_name: wmt16.opusmt.fi

Bootstrap Resampling Results:
x-mean: 0.8470
y-mean: 0.9086
ties (%): 0.0000
x_wins (%): 0.0000
y_wins (%): 1.0000

Paired T-Test Results:
statistic: -37.5091
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.opusmt.fi outperforms wmt16.argos.fi.
==========================
x_name: wmt16.nllb.fi
y_name: wmt16.opusmt.fi

Bootstrap Resampling Results:
x-mean: 0.8492
y-mean: 0.9086
ties (%): 0.0000
x_wins (%): 0.0000
y_wins (%): 1.0000

Paired T-Test Results:
statistic: -34.7892
p_value: 0.0000
Null hypothesis rejected according to t-test.
Scores differ significantly across samples.
wmt16.opusmt.fi outperforms wmt16.nllb.fi.

Summary
If system_x is better than system_y then:
Null hypothesis rejected according to t-test with p_value=0.05.
Scores differ significantly across samples.
system_x \ system_y    wmt16.bergamot.fi    wmt16.microsoft.fi    wmt16.google.fi    wmt16.argos.fi    wmt16.nllb.fi    wmt16.opusmt.fi
---------------------  -------------------  --------------------  -----------------  ----------------  ---------------  -----------------
wmt16.bergamot.fi                           False                 False              True              True             False
wmt16.microsoft.fi     True                                       True               True              True             True
wmt16.google.fi        True                 False                                    True              True             False
wmt16.argos.fi         False                False                 False                                False            False
wmt16.nllb.fi          False                False                 False              False                              False
wmt16.opusmt.fi        True                 False                 False              True              True
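
For readers who want to reproduce numbers of this kind, the following is a minimal stdlib-only Python sketch of the two procedures named in the log: a paired t-test and a paired bootstrap resampling. It is not the code of the tool that produced this output. All function names, the parameters `num_splits`, `sample_ratio`, `tie_eps` and `alpha`, the tie rule, and the normal approximation used for the p-value are assumptions made for illustration.

```python
import math
import random
import statistics
from statistics import NormalDist


def paired_t_test(x_scores, y_scores):
    """Return (t statistic, two-sided p-value) for paired segment scores.

    Assumption: the p-value uses a normal approximation to the
    t distribution, which is only reasonable for large samples.
    """
    diffs = [x - y for x, y in zip(x_scores, y_scores)]
    mean_diff = statistics.fmean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    t_stat = mean_diff / std_err
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


def bootstrap_win_ratios(x_scores, y_scores, num_splits=300,
                         sample_ratio=0.4, tie_eps=0.0, seed=0):
    """Resample segment indices with replacement and count which system
    has the higher mean on each resample.

    Assumption: a resample is a tie when the two means differ by at
    most `tie_eps` (hypothetical rule; the real tool may differ).
    """
    rng = random.Random(seed)
    n = len(x_scores)
    k = max(1, int(n * sample_ratio))
    x_wins = y_wins = ties = 0
    for _ in range(num_splits):
        idx = [rng.randrange(n) for _ in range(k)]
        x_mean = statistics.fmean([x_scores[i] for i in idx])
        y_mean = statistics.fmean([y_scores[i] for i in idx])
        delta = x_mean - y_mean
        if abs(delta) <= tie_eps:
            ties += 1
        elif delta > 0:
            x_wins += 1
        else:
            y_wins += 1
    return {
        "ties": ties / num_splits,
        "x_wins": x_wins / num_splits,
        "y_wins": y_wins / num_splits,
    }


def significantly_better(x_scores, y_scores, alpha=0.05):
    """One cell of a summary matrix: True only if x scores higher than y
    on average and the paired t-test rejects the null at level alpha."""
    t_stat, p_value = paired_t_test(x_scores, y_scores)
    return p_value < alpha and t_stat > 0
```

Under these assumptions, calling `significantly_better(x_scores, y_scores)` for every ordered pair of systems would fill a True/False matrix shaped like the Summary table, with the diagonal left empty.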