gecko-dev

Commit graph

Author	SHA1	Message	Date
Henri Sivonen	c55405f18e	Bug 1678175 - Avoid detecting windows-1252 euro sign as GBK. r=m_kato Differential Revision: https://phabricator.services.mozilla.com/D98005	2020-11-29 08:07:45 +00:00
Henri Sivonen	c9cae42014	Bug 1631983 - Update chardetng to 0.1.9. r=m_kato * Avoid misdetecting windows-1252 English as windows-1254. * Avoid misdetecting windows-1252 English as IBM866. * Avoid misdetecting windows-1252 English as GBK or EUC-KR. * Improve Chinese and Japanese detection by not giving single-byte encodings score for letter next to digit. * Improve Italian, Portuguese, Castilian, Catalan, and Galician detection by taking into account ordinal indicator use. * Reduce lookup table size. Differential Revision: https://phabricator.services.mozilla.com/D73237	2020-05-12 13:56:29 +00:00
Henri Sivonen	52a6fe2427	Bug 1615836 - Update chardetng to 0.1.6. r=emk * Properly take into account non-ASCII bytes at word boundaries for windows-1252. (Especially relevant for Italian, Catalan, Icelandic, and Faroese.) * Move Estonian from the Baltic model to the Western model. This improves overall Estonian detection but causes š and ž encoded as windows-1257, ISO-8859-13, or ISO-8859-4 to get misdecoded. (It would be possible to add a post-processing step to adjust for š and ž, but this would cause reloads given the way chardetng is integrated with Firefox.) * Improve Thai accuracy a lot. * Improve Vietnamese, Lithuanian, and Latvian accuracy a bit. * Improve accuracy for most Central European languages a bit. * Regress accuracy for some Central European languages a bit (as side effect of fixing Italian and Catalan). * Properly classify letters that ISO-8859-4 has but windows-1257 doesn't have in order to avoid misdetecting non-ISO-8859-4 input as ISO-8859-4. * Improve character classification of windows-1254. * Avoid classifying byte 0xA1 or above as space-like to avoid misdetection. * Reduce binary size. Differential Revision: https://phabricator.services.mozilla.com/D63197 --HG-- extra : moz-landing-system : lando	2020-02-18 22:31:00 +00:00
Henri Sivonen	5c2bad25ab	Bug 1551276 - Autodetect legacy encodings on unlabeled pages. r=emk Differential Revision: https://phabricator.services.mozilla.com/D56362 --HG-- extra : moz-landing-system : lando	2019-12-12 17:50:19 +00:00
Oana Pop Rus	df78d6011c	Backed out changeset 0810ad586986 (bug 1551276) for wpt failures in ar-ISO-8859-6-late.tentative.html on a CLOSED TREE	2019-12-12 16:38:54 +02:00
Henri Sivonen	07527a83c9	Bug 1551276 - Autodetect legacy encodings on unlabeled pages. r=emk Differential Revision: https://phabricator.services.mozilla.com/D56362 --HG-- extra : moz-landing-system : lando	2019-12-12 12:59:47 +00:00

6 commits