github/ruby - ruby

Commit Graph

Author	SHA1	Message	Date
duerst	7fe64d17d3	update to Unicode Version 12.1.0 (beta) Unicode Version 12.1.0 adds one single character, U+32FF SQUARE ERA NAME REIWA, for the new Japanese era starting on May 1st. 12.1.0 will be finalized only on May 7th, so we go with the beta version because further changes in the data we need are highly unlikely, and we want to make sure Ruby is ready for the new era. * common.mk: change UNICODE_VERSION to 12.1.0, UNICODE_BETA to YES * enc/unicode/12.1.0, enc/unicode/12.1.0/casefold.h, enc/unicode/12.1.0/name2ctype.h: add directory and generated data files for new version * lib/unicode_normalize/tables.rb: update for new character * test/ruby/test_regexp.rb: add test for character property age=12.1 * test/test_unicode_normalize.rb: add test for NFKC decomposition of new character This (mostly) completes issue #15195. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67441 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2019-04-05 00:58:51 +00:00
duerst	f831ca6764	delete directory and files related to Unicode version 11.0.0 this completes and closes feature #15321 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67174 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2019-03-06 03:19:10 +00:00
duerst	cff7eefa07	update Unicode version (and Emoji version) to 12.0.0 - common.mk: set UNICODE_VERSION and UNICODE_EMOJI_VERSION to 12.0.0 - lib/unicode_normalize/tables.rb: update table data to Unicode version 12.0.0 - enc/unicode/12.0.0/casefold.h, enc/unicode/12.0.0/name2ctype.h: add generated files for Unicode version 12.0.0 This is the main commit for #15321. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67169 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2019-03-06 01:55:19 +00:00
nobu	4fc656a2f3	Removed duplicate dependents git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2019-02-14 05:40:08 +00:00
nobu	b2f4241509	Update dependencies, internal.h includes ruby.h git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67057 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2019-02-12 12:31:55 +00:00
duerst	3628eae2e7	implement special behavior for Georgian for String#capitalize The modern Georgian script is special in that it has an 'uppercase' variant called MTAVRULI which can be used for emphasis of whole words, for screamy headlines, and so on. However, in contrast to all other bicameral scripts, there is no usage of capitalizing the first letter in a word or a sentence. Words with mixed capitalization are not used at all. We therefore implement special behavior for String#capitalize. Formally, we define String#capitalize as first applying String#downcase for the whole string, then using titlecase on the first letter. Because Georgian defines titlecase as the identity function both for MTAVRULI ('uppercase') and Mkhedruli (lowercase), this results in String#capitalize being equivalent to String#downcase for Georgian. This avoids undesirable mixed case. * enc/unicode.c: Actual implementation * string.c: Add mention of this special case for documentation * test/ruby/enc/test_case_mapping.rb: Add two tests, a general one that uses String#capitalize on some (including nonsensical) combinations of MTAVRULI and Mkhedruli, and a canary test to detect the potential assignment of characters to the currently open slots (holes) at U+1CBB and U+1CBC. * test/ruby/enc/test_case_comprehensive.rb: Tweak generation of expectation data. Together with r65933, this closes issue #14839. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-12-09 23:14:29 +00:00
duerst	c2d8078e3d	delete Unicode 10.0.0 related files, no longer needed [#14802 ] This line, and those below, will be ignored-- D enc/unicode/10.0.0 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-12-09 02:02:45 +00:00
duerst	e824e21beb	remove obsolete data from unicode.c * unicode.c: Remove the arrays onigenc_unicode_GCB_ranges_GAZ, onigenc_unicode_GCB_ranges_E_Base, and onigenc_unicode_GCB_ranges_Emoji, because they are not needed anymore for Unicode 11.0.0. * regparse.c: Remove external declarations for above arrays. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66232 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-12-06 00:05:08 +00:00
duerst	66a6073859	update to Unicode 11.0.0 (main step, not complete yet) - common.mk: Change Unicode version to 11.0.0, and Emoji version to 11.0 - test/ruby/enc/test_emoji_breaks.rb: update hard-coded Emoji version - enc/unicode/11.0.0, enc/unicode/11.0.0/casefold.h, enc/unicode/name2ctype.h: Add generated files. Files for Unicode 10.0.0 will be removed once we are sure 11.0.0 works. - lib/unicode_normalize/tables.rb: Updated table. - regparse.c: Almost completely reimplement grapheme cluster detection in function node_extended_grapheme_cluster(). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66213 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-12-05 08:10:24 +00:00
duerst	a96a594f99	solve the genie/zombie/wrestlers bug enc/unicode.c: - Add U+1F93C (WRESTLERS), U+1F9DE (GENIE), and U+1F9DF to onigenc_unicode_GCB_ranges_E_Base. - Add comments with character names. test/ruby/enc/test_emoji_breaks.rb: Activate tests for genie/zombie/wrestlers. This closes issue #15343. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66133 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-12-02 10:07:42 +00:00
nobu	26771cadc0	Added words in the comment at r65088 [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66103 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-30 07:19:49 +00:00
nobu	7aaf5b2878	Embed the Emoji version git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66023 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-27 06:44:02 +00:00
duerst	fc6243a6a6	deal with ONIGENC_CASE_IS_TITLECASE flag on lowercase characters In the function onigenc_unicode_case_map() in enc/unicode.c, deal with the case that the ONIGENC_CASE_IS_TITLECASE flag is set on lowercase characters. This is in preparation for Georgian Mtavruli, which are uppercase but not titlecase, in Unicode 11.0.0. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65971 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-25 10:12:45 +00:00
duerst	2d5b57d63c	prepare for Unicode 11.0.0 update - enc/unicode/case-folding.rb: - Convert unpredicted case to actual flag setting - Eliminate an unused variable - Change a variable name to avoid a warning git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65933 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-23 06:45:26 +00:00
nobu	34cc6fef83	Make some internal functions static git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-16 06:52:00 +00:00
shyouhei	6732423b5e	enc/unicode.c: 'a' is bigger than 'A' In ASCII, 'a' is bigger than 'A'. Which means 'A' - 'a' is a negative number (-32, to be precise). In C, the types of 'a' and 'A' are signed int (cf: ISO/IEC 9899:1990 section 6.1.3.4). So 'A' - 'a' is also a signed int. It is `(signed int)-32`. The problem is, OnigCodePoint is unsigned int. Adding a negative number to a variable of type OnigCodePoint (`code` here) introduces an unintentional cast of `(unsigned)(signed)-32`, which is 4,294,967,264. Adding this value to code then overflows, and the result eventually becomes a normal codepoint. This series of operations is not a serious problem, but because `code >= 'a'` holds, we can write `(code - 'a') + 'A'` to avoid it. See also: https://github.com/k-takata/Onigmo/pull/107 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65752 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-16 02:34:00 +00:00
duerst	a5818630f8	revert r65091, r65090 because ci fails git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65093 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-16 07:53:37 +00:00
duerst	33b5c610a6	update to Unicode 11.0.0 (basic step, not complete yet) - common.mk: Change Unicode version to 11.0.0 - enc/unicode/case-folding.rb, enc/unicode.c: Initial changes to deal with Georgian Mtavruli. This should bring us up to the same level as e.g. Python 3.7, by following the Unicode tables exactly. But it will produce undesirable (mixed-case) results for String#capitalize. This will be addressed in a later commit. - enc/unicode/11.0.0, enc/unicode/11.0.0/casefold.h, enc/unicode/name2ctype.h: Add generated files. - lib/unicode_normalize/tables.rb: Updated table. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-16 07:01:55 +00:00
duerst	7223582866	add some comments to enc/unicode/case-folding.rb [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65090 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-16 06:41:47 +00:00
nobu	0814870fed	Removed data for old Unicode [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-16 05:14:59 +00:00
nobu	e07c5baf66	unicode.c: moved additional GCB ranges * enc/unicode.c: moved additional Grapheme Cluster Break ranges which depend on the Unicode version. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-15 13:48:20 +00:00
nobu	179045acaf	regparse.c: Suppress duplicated range warning by mere \X * regparse.c (node_extended_grapheme_cluster): as Unicode 10 has added Grapheme_Cluster_Break properties to some characters, remove duplicated ranges for Unicode 9. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-15 12:31:25 +00:00
naruse	1d74de37e3	ruby tool/update-deps --fix git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63860 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-07-05 12:48:45 +00:00
k0kubun	3406c5d613	Refactor ERB version checking for keyword arguments Improving code like r62590. See r62529 for details. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62594 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-02-27 11:12:23 +00:00
naruse	b81538d259	Judge ERB version not RUBY_VERSION but ERB.version On cross compilation, ruby command uses fake RUBY_VERSION. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62560 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-02-24 02:58:41 +00:00
k0kubun	cc777d09f4	erb.rb: deprecate safe_level of ERB.new Also, as it's in the middle of the list of 4 arguments, 3rd and 4th arguments (trim_mode, eoutvar) are changed to keyword arguments. Old ways to specify arguments are deprecated and warned now. bin/erb: deprecate -S option. We'll remove all of deprecated ones at Ruby 2.7+. enc/make_encmake.rb: stopped using deprecated interface ext/etc/mkconstants.rb: ditto ext/socket/mkconstants.rb: ditto sample/ripper/ruby2html.rb: ditto spec/ruby/library/erb/defmethod/def_erb_method_spec.rb: ditto spec/ruby/library/erb/new_spec.rb: ditto test/erb/test_erb.rb: ditto test/erb/test_erb_command.rb: ditto tool/generic_erb.rb: ditto tool/ruby_vm/helpers/dumper.rb: ditto tool/transcode-tblgen.rb: ditto lib/rdoc/erbio.rb: ditto lib/rdoc/generator/darkfish.rb: ditto [Feature #14256] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62529 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-02-22 13:28:25 +00:00
nobu	982e27e95d	Update dependencies * common.mk: enc/unicode.$(OBJEXT) depends on onigmo.h via oniguruma.h. * common.mk: dependencies of prelude.$(OBJEXT) are defined for each generated C sources. enc/depend: casefold.h and name2ctype.h are located under $(UNICODE_HDR_DIR). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61800 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-13 09:23:40 +00:00
kazu	e02e13d723	Update dependencies using `tool/update-deps` git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61799 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-13 00:30:01 +00:00
nobu	aaa18bae72	update dependencies git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61715 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-09 06:55:55 +00:00
nobu	7c4306e6e9	gperf.sed: static declarations * tool/gperf.sed: comment out arguments part only, to keep the following declarations static. [Feature #13883] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61282 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-15 14:42:43 +00:00
nobu	a4804fbdf5	support gperf 3.1 * tool/gperf.sed: extracted sed commands to a script. ANSI-C code produced by gperf 3.1 declares length arguments as `size_t`. it causes conflict with existing declarations, and needs casts for a local variable and return statements. [Feature #13883] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61076 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-08 05:51:19 +00:00
nobu	01830719f6	fix for emoji-data.txt * common.mk: download emoji-data.txt. As emoji data files are located in a separate directory on the Unicode.org site, rearranged the Unicode data file directories to match the site. * tool/enc-unicode.rb (get_file): search emoji data files in the second argument path. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60977 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-02 03:12:51 +00:00
naruse	9ba2aa7644	Add missing file git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60967 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-01 14:08:13 +00:00
naruse	31796f17d3	Update to Onigmo 6.1.3-669ac9997619954c298da971fcfacccf36909d05. [Bug #13892] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60966 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-01 13:50:13 +00:00
duerst	df155f092c	remove Unicode 9.0.0-related files We don't need these files anymore because we upgraded to Unicode 10.0.0. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59760 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-06 08:12:02 +00:00
duerst	04547c7dc0	update Ruby to Unicode 10.0.0 - In common.mk, set UNICODE_VERSION to 10.0.0 - Generate and add enc/unicode/10.0.0/casefold.h and enc/unicode/10.0.0/name2ctype.h - Update lib/unicode_normalize/tables.rb git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59759 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-06 07:56:41 +00:00
duerst	3d46d51c45	replace copyrights by explanatory text in data files for GB2312/GB12345 mappings Replace the copyrights and explanatory texts in the data files used for mapping GB2312/GB12345 to/from Unicode with short explanatory texts. [Bug #12598] [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59715 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-01 10:22:09 +00:00
duerst	11954049fa	Change max byte length of UTF-8 to 4 bytes In enc/utf_8.c, change maximum byte length of UTF-8 to 4 bytes (from 6) to conform to definition of UTF-8. This closes issue #13590. (This is a retry of r58954, after issue #13590 has been addressed.) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58965 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-30 05:43:41 +00:00
duerst	e07bff3ce3	revert r58954 temporarily Revert change to maximum of 4 bytes for UTF-8 characters at r58954 temporarily. This failed spec at https://travis-ci.org/ruby/ruby/builds/237086017, but it is totally unclear why. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58955 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-29 08:59:41 +00:00
duerst	a03690ae73	Change max byte length of UTF-8 to 4 bytes In enc/utf_8.c, change maximum byte length of UTF-8 to 4 bytes (from 6) to conform to definition of UTF-8. This closes issue #13590. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58954 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-29 08:41:23 +00:00
duerst	f86766970f	delete enc/prelude.rb, because no longer needed git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58579 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-06 04:42:58 +00:00
nobu	d97f700938	clean autogenerated files * enc/depend (clean, clean-srcs): fix path of name2ctype.h, and remove casefold.h too. * enc/jis/props.h: autogenerated file. [ruby-core:80823] [Bug #13493] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58438 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-04-21 23:16:43 +00:00
nobu	ae73dc367a	enc/depend: remove Unicode versions * enc/depend (enc/unicode.o): remove hardcoded Unicode versions. this object file must be compiled by toplevel make. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58386 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-04-18 02:58:44 +00:00
normal	d6a6971212	fix ext/-test-/struct/ dependencies I started writing a template for auto-generation and let "tool/update-deps --fix" fill in the rest. Hopefully this fixes problems with some CI builds after r58359. Further changes to other ext/-test-/ files should probably add or update "depend" files, too. * ext/-test-/struct/depend: new file * enc/depend: auto-updated with unicode 9.0.0 headers (side-effect) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58364 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-04-15 07:13:05 +00:00
nobu	8083a359d0	enc-unicode.rb: uniname2ctype_offset git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58065 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-23 07:59:56 +00:00
nobu	12b8058661	update name2ctype.h * enc/unicode/9.0.0/name2ctype.h: update due to merger of Onigmo 6.0.0. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58064 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-23 07:53:35 +00:00
shyouhei	20c72dc89d	ruby tool/update-deps --fix Onigmo 6 (r57045) introduced a new onigmo.h header file, which is required from pretty much everywhere. This commit adds necessary dependencies. Note: ruby/oniguruma.h now includes onigmo.h, ruby/io.h includes oniguruma.h, ruby/encoding.h also includes oniguruma.h, and internal.h includes encoding.h. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58054 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-22 06:00:18 +00:00
nobu	4171ed6c21	fix UTF-32 valid_encoding? * enc/utf_32be.c (utf32be_mbc_enc_len): check arguments precisely. [ruby-core:79966] [Bug #13292] * enc/utf_32le.c (utf32le_mbc_enc_len): ditto. * regenc.h (UNICODE_VALID_CODEPOINT_P): predicate for valid Unicode codepoints. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57816 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-09 02:04:10 +00:00
naruse	2873edeafb	Merge Onigmo 6.0.0 * https://github.com/k-takata/Onigmo/blob/Onigmo-6.0.0/HISTORY * fix for ruby 2.4: https://github.com/k-takata/Onigmo/pull/78 * suppress warning: https://github.com/k-takata/Onigmo/pull/79 * include/ruby/oniguruma.h: include onigmo.h. * template/encdb.h.tmpl: ignore duplicated definition of EUC-CN in enc/euc_kr.c. It is defined in enc/gb2313.c with CRuby macro. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57045 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-10 17:47:04 +00:00
duerst	8baa73be48	remove special processing for U+03B9/U+03BC/U+A64B * enc/unicode.c: Remove special processing for U+03B9/U+03BC/U+A64B (GREEK SMALL LETTERs IOTA/MU, CYRILLIC SMALL LETTER MONOGRAPH UK) from onigenc_unicode_case_map and simplify code. * enc/unicode/case-folding.rb: Remove check for U+03B9/U+03BC/U+A64B. This and the previous few related commits make sure that we won't hit the equivalent of bug #12990 anymore for future updates of Unicode versions. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56976 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-04 01:58:54 +00:00
duerst	31fb4e3ec3	Reorder codepoints in some entries of CaseUnfold_11_Table * enc/unicode/case-folding.rb: Reorder codepoints so that the upper-case mapping comes first. * enc/unicode/9.0.0/casefold.h: Codepoints reordered, upper-case mapping flag added. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56975 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-04 01:17:34 +00:00
nobu	671c929f0a	Use offsetof macro and shrink table size git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56952 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-01 00:34:42 +00:00
nobu	4f7c3d3583	constify CaseMappingSpecials git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56951 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-01 00:34:41 +00:00
naruse	c11e648799	Regexp supports Unicode 9.0.0's \X * meta character \X matches Unicode 9.0.0 characters with some workarounds for UTR #51 Unicode Emoji, Version 4.0 emoji zwj sequences. [Feature #12831] [ruby-core:77586] The term "character" can have many meanings: bytes, codepoints, combined characters, and so on. "Grapheme cluster" is the highest-level of these terms and means user-perceived characters. Unicode Standard Annex #29 UNICODE TEXT SEGMENTATION specifies how to handle grapheme clusters (extended grapheme cluster). But some specs aren't updated to the current situation because Unicode Emoji is being extended rapidly without a clear definition. This breaks the precondition of UTR#29, "Grapheme cluster boundaries can be easily tested by looking at immediately adjacent characters". (the sentence will be removed in the next version) Some of the details are described in Unicode Technical Report #51 UNICODE EMOJI, but it is not merged into UTR#29 yet. http://unicode.org/reports/tr29/ http://unicode.org/reports/tr51/ http://unicode.org/Public/emoji/4.0/ git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-30 17:29:19 +00:00
duerst	87b937bdfd	fix uppercasing for U+A64B, CYRILLIC SMALL LETTER MONOGRAPH UK * enc/unicode.c: Add U+A64B to the special cases 03B9 and 03BC at the end of onigenc_unicode_case_map (Bug #12990). * enc/unicode/case-folding.rb: Add U+A64B to the special cases 03B9 and 03BC. Add a comment pointing to enc/unicode.c. Change warnings to exceptions for unpredicted cases, because this would have been more easily noticed (the warning was not noticed when upgrading to Unicode 9.0.0). * test/ruby/enc/test_case_comprehensive.rb: Remove temporary exclusion of U+A64B from testing. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56941 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-30 08:25:46 +00:00
duerst	2959b5aa16	* enc/windows_1254.c: Fix typo. Reported by k-takata at https://github.com/k-takata/Onigmo/commit/ceb59cc. Thanks! git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56523 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-29 21:39:37 +00:00
nobu	b24e093296	Update windows-1255 table * enc/trans/windows-1255-tbl.rb: update mapping from 0xCA to U+05BA. [Feature #12877] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56516 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-28 15:14:32 +00:00
nobu	4f7a051eee	enc/depend: downcase * enc/depend: downcase table file names. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56515 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-28 14:25:57 +00:00
nobu	8027dfafd7	enc/depend: extract transcode_tblgen * enc/depend: extract transcode_tblgen method calls for libraries loaded by dynamically generated names, in single_byte.trans. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56514 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-28 14:22:34 +00:00
nobu	06711fd1e3	single_byte.trans: dead code * enc/trans/single_byte.trans (transcode_tblgen_singlebyte): remove useless code. returned value is not used. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56513 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-28 14:18:52 +00:00
duerst	64b62f40a5	* enc/windows_1254.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for Windows-1254. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56433 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-16 06:09:08 +00:00
duerst	c0f48f2385	* unicode/8.0.0/casefold.h, name2ctype.h, unicode/data/8.0.0: removing directories/files related to Unicode version 8.0.0 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56090 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-07 08:35:39 +00:00
duerst	d25e478e91	* common.mk: Updated Unicode version to 9.0.0 [Feature #12513 ] * unicode/9.0.0/casefold.h, name2ctype.h, unicode/data/9.0.0: new directories/files for Unicode version 9.0.0 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-07 08:13:08 +00:00
nobu	7b664abad1	common.mk: separate unicode headers * common.mk (UNICODE_HDR_DIR): separate unicode header files from unicode data files. [ruby-core:76879] [Bug #12677] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55942 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-08-16 12:04:34 +00:00
nobu	a44caf8067	common.mk: UNICODE_HDR_DIR * common.mk (UNICODE_HDR_DIR): directory for unicode headers. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55933 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-08-16 08:53:49 +00:00
nobu	bfb5b0f84b	iso_8859_2.c: dedent [ci skip] * enc/iso_8859_2.c: remove unnecessary indent. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55780 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-30 10:32:06 +00:00
duerst	4abdd6c5aa	* enc/iso_8859_2.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for ISO-8859-2, by Yushiro Ishii. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55775 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-30 03:00:09 +00:00
duerst	55378a9eb6	* enc/windows_1253.c: Remove dead code found by Coverity Scan. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55763 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-27 01:33:01 +00:00
duerst	7b2b2869c9	* enc/windows_1257.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for Windows-1257, by Sho Koike. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55752 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-26 07:33:18 +00:00
duerst	14dd8a17e8	* enc/windows_1250.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for Windows-1250, by Sho Koike. * ChangeLog: Fixed order of previous two entries. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55751 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-26 07:19:43 +00:00
duerst	aec1ac6e51	* enc/windows_1251.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for Windows-1251, by Shunsuke Sato. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55750 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-26 06:54:18 +00:00
duerst	c8a1d8b33b	* enc/windows_1251.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for Windows-1251, by Shunsuke Sato. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55749 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-26 06:30:39 +00:00
duerst	6ed393ad89	* regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, emacs_mule.c, euc_jp.c, euc_kr.c, euc_tw.c, gb18030.c, gbk.c, iso_8859_1\|2\|3\|4\|5\|6\|7\|8\|9\|10\|11\|13\|14\|15\|16.c, koi8_r.c, koi8_u.c, shift_jis.c, unicode.c, us_ascii.c, utf_16\|32be\|le.c, utf_8.c, windows_1250\|51\|52\|53\|54\|57.c, windows_31j.c, unicode.c: Remove conditional compilation macro ONIG_CASE_MAPPING. [Feature #12386]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55740 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-24 07:33:15 +00:00
nobu	af2d3c9866	Move generated headers to unicode data directory * common.mk, enc/depend (casefold.h, name2ctype.h): move to unicode data directory per version. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55701 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-17 11:59:26 +00:00
nobu	d54856c1d4	common.mk: directory timestamps * common.mk, enc/Makefile.in: moved timestamp files for directories under the specific directory, to get rid of match with files under the source directory. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55696 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-15 21:26:02 +00:00
usa	251d87583b	Revert r55693 because it broke building on all platforms (and had no ChangeLog). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55694 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-15 20:24:01 +00:00
nobu	2ace43ba57	common.mk: directory timestamps * common.mk, enc/Makefile.in: moved timestamp files for directories under the specific directory, to get rid of match with files under the source directory. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55693 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-15 14:08:20 +00:00
nobu	e827c334c3	enc/unicode: check Unicode versions * enc/unicode/case-folding.rb, tool/enc-unicode.rb: check if Unicode versions are consistent with each other. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55687 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-15 00:53:50 +00:00
nobu	2f87f9e63b	common.mk: update enc/unicode/name2ctype.h * Makefile.in (enc/unicode/name2ctype.h): remove stale recipe, which did not support Unicode age properties. * common.mk (enc/unicode/name2ctype.h): update by --header option of tool/enc-unicode.rb. enc/unicode/name2ctype.kwd file has not been used. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55678 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-14 08:26:04 +00:00
kazu	9632e413a7	Fix file name in comment again git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55670 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 14:22:22 +00:00
duerst	2ac58e6891	* enc/iso_8859_9.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for ISO-8859-9, by Kazuki Iijima. * enc/iso_8859_9.c: Exclude dotless i/I with dot from case-insensitive matching because they are not a case pair. * test/ruby/enc/test_iso_8859.rb: Make test coverage for ISO-8859-9 a bit more complete. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55666 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 09:09:47 +00:00
duerst	9f74ae4cf5	* enc/windows_1252.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for Windows-1252, by Serina Tai. * test/ruby/enc/test_case_comprehensive.rb: Fix order of encodings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55665 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 08:21:29 +00:00
duerst	6a52a5488a	* enc/iso_8859_7.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for ISO-8859-7, by Kosuke Kurihara. * test/ruby/enc/test_case_comprehensive.rb: Fix order of encodings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55664 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 07:19:25 +00:00
nobu	fc4b2d9228	Fix file names in comments git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55661 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 06:32:37 +00:00
duerst	b9cd6920d2	* enc/iso_8859_1.c, enc/iso_8859_4.c: Avoid setting modification flag if there is no modification. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55660 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 06:19:07 +00:00
duerst	e3600eaca1	* enc/iso_8859_5.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for ISO-8859-5, by Masaru Onodera. * test/ruby/enc/test_case_comprehensive.rb: Fix order of encodings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55658 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 05:40:12 +00:00
duerst	c5682ac490	* enc/windows_1254.c: Adjust variable/macro names. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55654 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 05:15:28 +00:00
duerst	93c1109c19	* enc/iso_8859_9.c, enc/windows_1254.c: Split Windows-1254 from ISO-8859-9 to be able to implement different case conversions. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55653 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 04:19:17 +00:00
duerst	cbc947885a	* enc/iso_8859_7.c, enc/windows_1253.c: Split Windows-1253 from ISO-8859-7 to be able to implement different case conversions. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55652 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 04:08:36 +00:00
duerst	336b6b1980	* enc/iso_8859_13.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for ISO-8859-13, by Kanon Shindo. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55651 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 01:50:17 +00:00
duerst	0f3d197da1	* enc/iso_8859_13.c, enc/windows_1257.c: Split Windows-1257 from ISO-8859-13 to be able to implement different case conversions. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55650 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 01:31:44 +00:00
duerst	19b5e818dd	* enc/iso_8859_3.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for ISO-8859-3, by Takuya Miyamoto. * test/ruby/enc/test_case_comprehensive.rb: Extend special treatment for Turkic. * enc/iso_8859_3.c: Exclude dotless i/I with dot from case-insensitive matching because they are not a case pair. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55648 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-13 00:02:34 +00:00
duerst	7c0cb4351a	* revert r55642 (previous commit) because of test failure at https://travis-ci.org/ruby/ruby/builds/144148780 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55643 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-12 12:59:46 +00:00
duerst	7b66f0bae9	* enc/iso_8859_3.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for ISO-8859-3, by Takuya Miyamoto. * test/ruby/enc/test_case_comprehensive.rb: Extend special treatment for Turkic. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55642 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-12 12:33:17 +00:00
duerst	7253570a83	* enc/iso_8859_1.c: Moved test for lowercase characters without uppercase equivalent. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55632 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-11 09:05:53 +00:00
duerst	b5d869a89d	* enc/iso_8859_4.c, enc/iso_8859_10.c, enc/iso_8859_14.c, enc/iso_8859_15.c, enc/iso_8859_16.c: Replace case-by-case code with lookup in ENC_ISO_8859_xx_TO_LOWER_CASE table. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55631 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-11 08:49:38 +00:00
nobu	a00ec4cf3a	enc/iso_8859_4.c: adjust indent [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55628 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-10 14:28:25 +00:00
duerst	07ac66ccec	* enc/iso_8859_10.c, test/ruby/enc/test_case_comprehensive.rb: Implement non-ASCII case conversion for ISO-8859-10, by Toya Hosokawa. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55627 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-10 10:53:45 +00:00
duerst	8f0b58d36a	* test/ruby/enc/test_case_comprehensive.rb: Changed testing logic in to catch unintended modifications of characters that do not have a case equivalent in the respective encoding. * enc/iso_8859_1.c, enc/iso_8859_15.c: Fixed unintended modifications of micro sign and y with diaeresis. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55626 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-10 10:33:51 +00:00
svn	40a34feb4c	* remove trailing spaces. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55625 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-10 08:05:40 +00:00
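
The Georgian special case described in commit 3628eae2e7 (String#capitalize behaving like String#downcase for Georgian) can be checked from Ruby with a short sketch. It assumes a Ruby that includes the Unicode 11.0.0 data and that commit, which should mean Ruby 2.6 or later. The constant names MTAVRULI and MKHEDRULI are chosen here for illustration only and are not part of Ruby.

```ruby
# Georgian letters AN, BAN, GAN, written as escapes so the example
# does not depend on editor fonts or file encoding.
MTAVRULI  = "\u1C90\u1C91\u1C92"  # 'uppercase' (Mtavruli) forms
MKHEDRULI = "\u10D0\u10D1\u10D2"  # lowercase (Mkhedruli) forms

# A nonsensical mixed-case word: Mtavruli first letter, Mkhedruli rest.
mixed = MTAVRULI[0] + MKHEDRULI[1..-1]

# Assumption: Ruby 2.6+ behavior as described in the commit message.
checks = {
  "upcase maps Mkhedruli to Mtavruli"        => MKHEDRULI.upcase == MTAVRULI,
  "downcase maps Mtavruli to Mkhedruli"      => MTAVRULI.downcase == MKHEDRULI,
  "capitalize of Mtavruli equals downcase"   => MTAVRULI.capitalize == MKHEDRULI,
  "capitalize of mixed case equals downcase" => mixed.capitalize == MKHEDRULI,
}

checks.each { |label, ok| puts "#{label}: #{ok}" }
```

On a Ruby with this change, none of the capitalize results should contain mixed Mtavruli/Mkhedruli text, which is the outcome the commit message sets out to guarantee.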

668 Commits