mirror of https://github.com/mozilla/gecko-dev.git
52a6fe2427
* Properly take into account non-ASCII bytes at word boundaries for windows-1252. (Especially relevant for Italian, Catalan, Icelandic, and Faroese.)
* Move Estonian from the Baltic model to the Western model. This improves overall Estonian detection but causes š and ž encoded as windows-1257, ISO-8859-13, or ISO-8859-4 to get misdecoded. (It would be possible to add a post-processing step to adjust for š and ž, but this would cause reloads given the way chardetng is integrated with Firefox.)
* Improve Thai accuracy a lot.
* Improve Vietnamese, Lithuanian, and Latvian accuracy a bit.
* Improve accuracy for most Central European languages a bit.
* Regress accuracy for some Central European languages a bit (as side effect of fixing Italian and Catalan).
* Properly classify letters that ISO-8859-4 has but windows-1257 doesn't have in order to avoid misdetecting non-ISO-8859-4 input as ISO-8859-4.
* Improve character classification of windows-1254.
* Avoid classifying byte 0xA1 or above as space-like to avoid misdetection.
* Reduce binary size.

Differential Revision: https://phabricator.services.mozilla.com/D63197

--HG-- extra : moz-landing-system : lando

||
---|---|---|
.. | ||
aom | ||
dav1d | ||
msgpack | ||
prio | ||
python | ||
rlbox | ||
rust | ||
sqlite3 | ||
webkit/PerformanceTests | ||
moz.build |
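
The commit message above notes that moving Estonian to the Western model causes š and ž in windows-1257, ISO-8859-13, or ISO-8859-4 to be misdecoded. The following is a minimal sketch of that effect, assuming the detector's guess for such input is windows-1252. It uses only Python's standard codecs and is not chardetng code; the helper name `misdecode` is hypothetical.

```python
def misdecode(text: str, actual: str, assumed: str = "cp1252") -> str:
    """Encode text in its real encoding, then decode it with the assumed one."""
    raw = text.encode(actual)
    # errors="replace" keeps the sketch from raising on bytes that are
    # unassigned in the assumed encoding.
    return raw.decode(assumed, errors="replace")


if __name__ == "__main__":
    # Python codec names for windows-1257, ISO-8859-13, and ISO-8859-4.
    for label in ("cp1257", "iso8859_13", "iso8859_4"):
        print(label, repr(misdecode("šž", label)))
```

In windows-1257 and ISO-8859-13 the two letters occupy bytes 0xF0 and 0xFE, which windows-1252 maps to ð and þ, so the text stays readable apart from those two letters.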