ruby/enc
duerst 3628eae2e7 implement special behavior for Georgian for String#capitalize
The modern Georgian script is special in that it has an 'uppercase'
variant called MTAVRULI which can be used for emphasis of whole words,
for screamy headlines, and so on. However, in contrast to all other
bicameral scripts, there is no practice of capitalizing the first
letter of a word or a sentence. Words with mixed capitalization are
not used at all.

We therefore implement special behavior for String#capitalize. Formally,
we define String#capitalize as first applying String#downcase for the
whole string, then using titlecase on the first letter. Because Georgian
defines titlecase as the identity function both for MTAVRULI ('uppercase')
and Mkhedruli (lowercase), this results in String#capitalize being
equivalent to String#downcase for Georgian. This avoids undesirable
mixed case.
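
The equivalence described above can be checked with a short sketch. It
assumes a Ruby that includes this change and Unicode 11 or later data;
the helper name capitalize_equals_downcase? and the sample words are
made up for illustration and are not part of the commit.

```ruby
# Three letters (an, ban, gan) in both Georgian variants, written as
# escapes so the code points are visible.
mkhedruli = "\u10D0\u10D1\u10D2"  # lowercase
mtavruli  = "\u1C90\u1C91\u1C92"  # 'uppercase'

# Hypothetical helper for this sketch: does capitalize agree with downcase?
def capitalize_equals_downcase?(str)
  str.capitalize == str.downcase
end

# A nonsensical mixed-case word: MTAVRULI first letter, Mkhedruli rest.
mixed = mtavruli[0] + mkhedruli[1..-1]

georgian_results = [mkhedruli, mtavruli, mixed].map do |word|
  capitalize_equals_downcase?(word)
end

# For contrast, a Latin word keeps the usual first-letter behavior,
# so capitalize and downcase differ.
latin_result = capitalize_equals_downcase?("hello")
```

For Georgian input every entry of georgian_results should be true,
while latin_result is false because "hello".capitalize yields "Hello".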

* enc/unicode.c: Actual implementation

* string.c: Mention this special case in the documentation

* test/ruby/enc/test_case_mapping.rb: Add two tests, a general one
  that uses String#capitalize on some (including nonsensical)
  combinations of MTAVRULI and Mkhedruli, and a canary test to
  detect the potential assignment of characters to the currently
  open slots (holes) at U+1CBB and U+1CBC.

* test/ruby/enc/test_case_comprehensive.rb: Tweak generation of
  expectation data.
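
The idea behind the canary test can be sketched as follows. This is not
the test added to test_case_mapping.rb, only an illustration of how one
might detect that the holes have been filled; the helper assigned? is
hypothetical and relies on the regexp property \p{Cn} (unassigned code
points) as known to the running Ruby's Unicode data.

```ruby
# Hypothetical helper for this sketch: true if the code point is assigned
# in the Unicode version of the running Ruby. \p{Cn} matches unassigned
# code points.
def assigned?(codepoint)
  !codepoint.chr(Encoding::UTF_8).match?(/\p{Cn}/)
end

holes      = [0x1CBB, 0x1CBC]  # open slots in the MTAVRULI range
neighbours = [0x1CBA, 0x1CBD]  # adjacent MTAVRULI letters

# Map each hole to its current status, keyed by its U+XXXX name.
hole_status = holes.map { |cp| [format("U+%04X", cp), assigned?(cp)] }.to_h

# A canary would flag any hole whose status has become true, signalling
# that the Georgian special case needs to be reexamined.
filled = hole_status.select { |_name, status| status }.keys
```

If filled is ever non-empty, a future Unicode version has assigned one
of the slots and the special case should be reviewed.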

Together with r65933, this closes issue #14839.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-09 23:14:29 +00:00
jis gperf.sed: static declarations 2017-12-15 14:42:43 +00:00
trans replace copyrights by explanatory text in data files for GB2312/GB12345 mappings 2017-09-01 10:22:09 +00:00
unicode delete Unicode 10.0.0 related files, no longer needed [#14802] 2018-12-09 02:02:45 +00:00
Makefile.in common.mk: directory timestamps 2016-07-15 21:26:02 +00:00
ascii.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
big5.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
cp949.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
depend ruby tool/update-deps --fix 2018-07-05 12:48:45 +00:00
ebcdic.h enc/ebcdic.h, enc/trans/ebcdic.trans, 2015-12-15 08:57:58 +00:00
emacs_mule.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
encdb.c
encinit.c.erb
euc_jp.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
euc_kr.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
euc_tw.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
gb2312.c
gb18030.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
gbk.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_2022_jp.h * enc/iso_2022_jp.h: fix typos. 2015-12-14 02:50:21 +00:00
iso_8859.h iso_8859.h: SHARP_s 2016-06-11 02:24:38 +00:00
iso_8859_1.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_2.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_3.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_4.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_5.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_6.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_7.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_8.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_9.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_10.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_11.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_13.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_14.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_15.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
iso_8859_16.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
koi8_r.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
koi8_u.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
make_encmake.rb Refactor ERB version checking for keyword arguments 2018-02-27 11:12:23 +00:00
mktable.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
shift_jis.c Update to Onigmo 6.1.3-669ac9997619954c298da971fcfacccf36909d05. 2017-12-01 13:50:13 +00:00
shift_jis.h Add missing file 2017-12-01 14:08:13 +00:00
unicode.c implement special behavior for Georgian for String#capitalize 2018-12-09 23:14:29 +00:00
us_ascii.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
utf_7.h * enc/iso_2022_jp.h: fix typos. 2015-12-14 02:50:21 +00:00
utf_8.c Change max byte length of UTF-8 to 4 bytes 2017-05-30 05:43:41 +00:00
utf_16_32.h * enc/iso_2022_jp.h: fix typos. 2015-12-14 02:50:21 +00:00
utf_16be.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
utf_16le.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
utf_32be.c fix UTF-32 valid_encoding? 2017-03-09 02:04:10 +00:00
utf_32le.c fix UTF-32 valid_encoding? 2017-03-09 02:04:10 +00:00
windows_31j.c Update to Onigmo 6.1.3-669ac9997619954c298da971fcfacccf36909d05. 2017-12-01 13:50:13 +00:00
windows_1250.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
windows_1251.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
windows_1252.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
windows_1253.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
windows_1254.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
windows_1257.c Merge Onigmo 6.0.0 2016-12-10 17:47:04 +00:00
x_emoji.h * enc/x_emoji.h: fix dead-link. 2015-12-27 11:00:36 +00:00