github/ruby - ruby

Commit Graph

Author	SHA1	Message	Date
duerst	50eea44a27	remove Unicode 12.0.0 related directory and generated files This completes issue #15195. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67453 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2019-04-05 23:52:15 +00:00
duerst	7fe64d17d3	update to Unicode Version 12.1.0 (beta) Unicode Version 12.1.0 adds one single character, U+32FF SQUARE ERA NAME REIWA, for the new Japanese era starting on May 1st. 12.1.0 will be finalized only on May 7th, so we go with the beta version because further changes in the data we need are highly unlikely, and we want to make sure Ruby is ready for the new era. * common.mk: change UNICODE_VERSION to 12.1.0, UNICODE_BETA to YES * enc/unicode/12.1.0, enc/unicode/12.1.0/casefold.h, enc/unicode/12.1.0/name2ctype.h: add directory and generated data files for new version * lib/unicode_normalize/tables.rb: update for new character * test/ruby/test_regexp.rb: add test for character property age=12.1 * test/test_unicode_normalize.rb: add test for NFKC decomposition of new character This (mostly) completes issue #15195. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67441 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2019-04-05 00:58:51 +00:00
duerst	f831ca6764	delete directory and files related to Unicode version 11.0.0 this completes and closes feature #15321 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67174 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2019-03-06 03:19:10 +00:00
duerst	cff7eefa07	update Unicode version (and Emoji version) to 12.0.0 - common.mk: set UNICODE_VERSION and UNICODE_EMOJI_VERSION to 12.0.0 - lib/unicode_normalize/tables.rb: update table data to Unicode version 12.0.0 - enc/unicode/12.0.0/casefold.h, enc/unicode/12.0.0/name2ctype.h: add generated files for Unicode version 12.0.0 This is the main commit for #15321. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67169 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2019-03-06 01:55:19 +00:00
duerst	c2d8078e3d	delete Unicode 10.0.0 related files, no longer needed [#14802 ] This line, and those below, will be ignored-- D enc/unicode/10.0.0 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-12-09 02:02:45 +00:00
duerst	66a6073859	update to Unicode 11.0.0 (main step, not complete yet) - common.mk: Change Unicode version to 11.0.0, and Emoji version to 11.0 - test/ruby/enc/test_emoji_breaks.rb: update hard-coded Emoji version - enc/unicode/11.0.0, enc/unicode/11.0.0/casefold.h, enc/unicode/name2ctype.h: Add generated files. Files for Unicode 10.0.0 will be removed once we are sure 11.0.0 works. - lib/unicode_normalize/tables.rb: Updated table. - regparse.c: Almost completely reimplement grapheme cluster detection in function node_extended_grapheme_cluster(). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66213 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-12-05 08:10:24 +00:00
nobu	7aaf5b2878	Embed the Emoji version git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66023 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-27 06:44:02 +00:00
duerst	2d5b57d63c	prepare for Unicode 11.0.0 update - enc/unicode/case-folding.rb: - Convert unpredicted case to actual flag setting - Eliminate an unused variable - Change a variable name to avoid a warning git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65933 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-23 06:45:26 +00:00
nobu	34cc6fef83	Make some internal functions static git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-16 06:52:00 +00:00
duerst	a5818630f8	revert r65091, r65090 because ci fails git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65093 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-16 07:53:37 +00:00
duerst	33b5c610a6	update to Unicode 11.0.0 (basic step, not complete yet) - common.mk: Change Unicode version to 11.0.0 - enc/unicode/case-folding.rb, enc/unicode.c: Initial changes to deal with Gregorian Mtavruli. This should bring us up to the same level as e.g. Python 3.7, by following the Unicode tables exactly. But it will produce undesirable (mixed-case) results for String#capitalize. This will be addressed in a later commit. - enc/unicode/11.0.0, enc/unicode/11.0.0/casefold.h, enc/unicode/name2ctype.h: Add generated files. - lib/unicode_normalize/tables.rb: Updated table. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-16 07:01:55 +00:00
duerst	7223582866	add some comments to enc/unicode/case-folding.rb [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65090 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-16 06:41:47 +00:00
nobu	a4804fbdf5	support gperf 3.1 * tool/gperf.sed: extracted sed commands to a script. ANSI-C code produced by gperf 3.1 declares length arguments as `size_t`. it causes conflict with existing declarations, and needs casts for a local variable and return statements. [Feature #13883] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61076 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-08 05:51:19 +00:00
nobu	01830719f6	fix for emoji-data.txt * common.mk: download emoji-data.txt. As emoji data files are located in a separate directory in Unicode.org site, reearranged Unicode data files directories same as the site. * tool/enc-unicode.rb (get_file): search emoji data files in the second argument path. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60977 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-02 03:12:51 +00:00
duerst	df155f092c	remove Unicode 9.0.0-related files We don't need these files anymore because we upgraded to Unicode 10.0.0. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59760 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-06 08:12:02 +00:00
duerst	04547c7dc0	update Ruby to Unicode 10.0.0 - In common.mk, set UNICODE_VERSION to 10.0.0 - Generate and add enc/unicode/10.0.0/casefold.h and enc/unicode/10.0.0/name2ctype.h - Update lib/unicode_normalize/tables.rb git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59759 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-06 07:56:41 +00:00
nobu	8083a359d0	enc-unicode.rb: uniname2ctype_offset git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58065 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-23 07:59:56 +00:00
nobu	12b8058661	update name2ctype.h * enc/unicode/9.0.0/name2ctype.h: update due to merger of Onigmo 6.0.0. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58064 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-23 07:53:35 +00:00
duerst	8baa73be48	remove special processing for U+03B9/U+03BC/U+A64B * enc/unicode.c: Remove special processing for U+03B9/U+03BC/U+A64B (GREEK SMALL LETTERs IOTA/MU, CYRILLIC SMALL LETTER MONOGRAPH UK) from onigenc_unicode_case_map and simplify code. * enc/unicode/case-folding.rb: Remove check for U+03B9/U+03BC/U+A64B. This and the previous few related commits make sure that we won't hit the equivalent of bug #12990 anymore for future updates of Unicode versions. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56976 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-04 01:58:54 +00:00
duerst	31fb4e3ec3	Reorder codepoints in some entries of CaseUnfold_11_Table * enc/unicode/case-folding.rb: Reorder codepoints so that the upper-case mapping comes first. * enc/unicode/9.0.0/casefold.h: Codepoints reordered, upper-case mapping flag added. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56975 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-04 01:17:34 +00:00
nobu	671c929f0a	Use offsetof macro and shrink table size git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56952 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-01 00:34:42 +00:00
nobu	4f7c3d3583	constify CaseMappingSpecials git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56951 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-01 00:34:41 +00:00
naruse	c11e648799	Regexp supports Unicoe 9.0.0's \X * meta character \X matches Unicode 9.0.0 characters with some workarounds for UTR #51 Unicode Emoji, Version 4.0 emoji zwj sequences. [Feature #12831] [ruby-core:77586] The term "character" can have many meanings bytes, codepoints, combined characters, and so on. "grapheme cluster" is highest one of such words, which means user-perceived characters. Unicode Standard Annex #29 UNICODE TEXT SEGMENTATION specifies how to handle grapheme clusters (extended grapheme cluster). But some specs aren't updated to current situation because Unicode Emoji is rapidly extended without well definition. It breaks the precondition of UTR#29 "Grapheme cluster boundaries can be easily tested by looking at immediately adjacent characters". (the sentence will be removed in the next version) Though some of its detail are described in Unicode Technical Report #51 UNICODE EMOJI but it is not merged into UTR#29 yet. http://unicode.org/reports/tr29/ http://unicode.org/reports/tr51/ http://unicode.org/Public/emoji/4.0/ git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-30 17:29:19 +00:00
duerst	87b937bdfd	fix uppercasing for U+A64B, CYRILLIC SMALL LETTER MONOGRAPH UK * enc/unicode.c: Add U+A64B to the special cases 03B9 and 03BC at the end of onigenc_unicode_case_map (Bug #12990). * enc/unicode/case-folding.rb: Add U+A64B to the special cases 03B9 and 03BC. Add a comment pointing to enc/unicode.c. Change warnings to exceptions for unpredicted cases, because this would have been more easily noticed (the warning was not noticed when upgrading to Unicode 9.0.0). * test/ruby/enc/test_case_comprehensive.rb: Remove temporary exclusion of U+A64B from testing. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56941 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-30 08:25:46 +00:00
duerst	c0f48f2385	* unicode/8.0.0/casefold.h, name2ctype.h, unicode/data/8.0.0: removing directories/files related to Unicode version 8.0.0 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56090 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-07 08:35:39 +00:00
duerst	d25e478e91	* common.mk: Updated Unicode version to 9.0.0 [Feature #12513 ] * unicode/9.0.0/casefold.h, name2ctype.h, unicode/data/9.0.0: new directories/files for Unicode version 9.0.0 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-07 08:13:08 +00:00
nobu	7b664abad1	common.mk: separate unicode headers * common.mk (UNICODE_HDR_DIR): separate unicode header files from unicode data files. [ruby-core:76879] [Bug #12677] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55942 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-08-16 12:04:34 +00:00
nobu	af2d3c9866	Move generated headers to unicode data directory * common.mk, enc/depend (casefold.h, name2ctype.h): move to unicode data directory per version. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55701 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-17 11:59:26 +00:00
nobu	e827c334c3	enc/unicode: check Unicode versions * enc/unicode/case-folding.rb, tool/enc-unicode.rb: check if Unicode versions are consistent with each other. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55687 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-15 00:53:50 +00:00
nobu	2f87f9e63b	common.mk: update enc/unicode/name2ctype.h * Makefile.in (enc/unicode/name2ctype.h): remove stale recipe, which did not support Unicode age properties. * common.mk (enc/unicode/name2ctype.h): update by --header option of tool/enc-unicode.rb. enc/unicode/name2ctype.kwd file has not been used. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55678 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-14 08:26:04 +00:00
nobu	893bb61bcb	case-folding.rb: define version numbers * enc/unicode/case-folding.rb: define Unicode version numbers. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55546 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-30 08:24:11 +00:00
nobu	753ce99eac	case-folding.rb: check version numbers * enc/unicode/case-folding.rb: check if version numbers in each data files match. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55545 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-30 08:13:28 +00:00
naruse	0e585b37ec	Revert "Use gperf 3.0.4" It is wrong commit. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55518 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-28 04:38:32 +00:00
naruse	4b31485ad8	Use gperf 3.0.4 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55514 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-27 18:30:12 +00:00
nobu	656c458665	Read CaseFolding.txt in binary mode * enc/unicode/case-folding.rb (CaseFolding#load): read in binary mode to deal with non-ASCII charater in CaseFolding.txt. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55496 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-24 05:29:28 +00:00
nobu	eff6873363	touch * enc/unicode/case-folding.rb: touch the destination file. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55494 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-24 00:23:46 +00:00
nobu	d1e2c50a0c	Updating casefold.h * common.mk (lib/unicode_normalize/tables.rb): should not depend on Unicode data files unless ALWAYS_UPDATE_UNICODE=yes, to get rid of downloading Unicode data unnecessary. [ruby-dev:49681] * common.mk (enc/unicode/casefold.h): update Unicode files in a sub-make, not to let the header depend on the files always. * enc/unicode/case-folding.rb: if gperf is not usable, assume the existing file is OK. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55492 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-24 00:17:17 +00:00
duerst	5e9d33ad49	* enc/unicode/case-folding.rb, casefold.h: Data generation to implement swapcase functionality for titlecase characters. Swapcase isn't defined by Unicode, because the purpose/usage of swapcase is unclear anyway. The implementation follows a proposal from Nobu, swaping the case of each component of a titlecase character individually. This means that the titlecase characters have to be decomposed. * enc/unicode.c: Code using the above data. * test/ruby/enc/test_case_mapping.rb: Tests for the above. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-04-01 11:58:47 +00:00
duerst	78f540019a	* enc/unicode/case-folding.rb, casefold.h: Tweaked handling of 6 special cases in CaseUnfold_11_Table. * enc/unicode.c: Adjustments for above. * test/ruby/enc/test_case_mapping.rb: Tests for the above: Some tests in test_titlecase activated; test_greek added. A test in test_cherokee fixed. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-03-29 07:53:43 +00:00
duerst	0e6f8b166d	* enc/unicode/case-folding.rb, casefold.h: Removing data for idempotent titlecasing. * enc/unicode.c: Adjust code to data removal. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54347 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-03-29 04:24:55 +00:00
svn	d864828fb4	* remove trailing spaces. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54230 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-03-22 12:08:31 +00:00
duerst	2f455ceca4	* include/ruby/oniguruma.h: Additional flag for characters that are titlecase. * enc/unicode/case-folding.rb, casefold.h: Using above flag in data. * enc/unicode.c: Marking capitalized character as unmodified if it is already titlecase. * test/ruby/enc/test_case_mapping.rb: Tests for above functionality. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54229 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-03-22 12:08:30 +00:00
duerst	59766643db	* enc/unicode/case-folding.rb, casefold.h: Streamlining approach to case mapping data not available from case folding by unifying all three cases (special title, special upper, special lower). * enc/unicode.c: Adjust macro names for above (macros are currently inactive). (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54085 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-03-11 07:11:27 +00:00
duerst	c4e6964141	* enc/unicode/case-folding.rb, casefold.h: Reducing size of TitleCase table by eliminating duplicates. (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53957 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-27 08:06:17 +00:00
duerst	7feb182a08	* enc/unicode/case-folding.rb: Adding possibility for debugging output for TitleCase table in casefold.h. (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53930 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-25 10:04:59 +00:00
duerst	1cc579cb00	* enc/unicode/case-folding.rb, casefold.h: Outputting actual titlecase data (new table, with indices from other tables). * enc/unicode.c: Ignoring titlecase data indices for the moment. (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53906 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-23 12:53:10 +00:00
duerst	8aa8847b7c	* enc/unicode/case-folding.rb, casefold.h: Reading casing data from SpecialCasing.txt. (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53904 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-23 06:21:55 +00:00
duerst	4ca9138bac	* enc/unicode/case-folding.rb, casefold.h: Adding flag for title-case, not yet operational. (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53891 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-22 09:34:34 +00:00
duerst	5470ce8206	* enc/unicode/case-folding.rb, casefold.h: Fixed bug that avoided inclusion of compatibility characters in uppper-/lower-case mappings. (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53890 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-22 09:17:43 +00:00
duerst	6a808bda64	* enc/unicode/case-folding.rb, casefold.h: Used only first element (rather than all) of target in CaseUnfold_11 array. (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53843 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-16 10:10:37 +00:00

97 Commits