github/ruby - ruby

Commit graph

Author	SHA1	Message	Date
Nobuyoshi Nakada	4909747e5a	Use the hexdigit character class	2023-10-02 00:49:05 +09:00
Nobuyoshi Nakada	be09c8370b	tool/enc-unicode.rb: make the condition concice with flip-flop And regexps are not necessary here.	2023-10-01 22:33:31 +09:00
Janosch Müller	08b3fb1152	[Bug #19728 ] Auto-generate unicode property docs https://bugs.ruby-lang.org/issues/19728	2023-07-01 23:22:17 +09:00
Janosch Müller	b742fb029d	[DOC] Update how to run tool/enc-unicode.rb	2023-05-12 14:15:47 +09:00
Nobuyoshi Nakada	eab1f1ef18	Avoid diffutils 3.8 bug#61193 [ci skip]	2023-04-14 13:51:59 +09:00
Hiroshi SHIBATA	db0a4c8923	Prefer to use File.foreach instead of IO.foreach	2023-02-27 18:49:18 +09:00
卜部昌平	45741918e1	reserved_word: just use gperf 3.1 declaration The reason why this was commented out was because of gperf 3.0 vs 3.1 differences (see [Feature #13883]). Five years passed, I am pretty confident that we can drop support of old versions here. Ditto for uniname2ctype_p(), onig_jis_property(), and zonetab().	2022-09-21 11:44:09 +09:00
Nobuyoshi Nakada	03ce48dac7	Emoji files header changed at 15.0 again	2022-09-17 12:37:48 +09:00
Nobuyoshi Nakada	76c0056505	Follow emoji data files header change The header of emoji data files in UCD, which were moved at 13.0.0, has been changed since 14.0.0. It seems to be the same as other files in UCD.	2022-09-17 12:37:48 +09:00
Martin Dürst	94fc4b1869	Adjust tool/enc-unicode.rb to deal with new location of some emoji files - Change location of file emoji-data.txt - Change range of files in emoji directory ([stz] is for emoji-sequences.txt, emoji-test.txt, and emoji-zwj-sequences.txt) - Make sure that version of all emoji files is checked against Emoji version	2021-07-08 14:45:03 +09:00
nobu	7aaf5b2878	Embed the Emoji version git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66023 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-27 06:44:02 +00:00
nobu	34cc6fef83	Make some internal functions static git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-16 06:52:00 +00:00
nobu	04a353fe02	tool/enc-unicode.rb: rewrote without flip-flop git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64814 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-09-22 20:39:35 +00:00
nobu	36bc8c0b28	tool: removed unused variables git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63459 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-05-18 00:38:00 +00:00
nobu	a4804fbdf5	support gperf 3.1 * tool/gperf.sed: extracted sed commands to a script. ANSI-C code produced by gperf 3.1 declares length arguments as `size_t`. it causes conflict with existing declarations, and needs casts for a local variable and return statements. [Feature #13883] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61076 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-08 05:51:19 +00:00
nobu	01830719f6	fix for emoji-data.txt * common.mk: download emoji-data.txt. As emoji data files are located in a separate directory in Unicode.org site, reearranged Unicode data files directories same as the site. * tool/enc-unicode.rb (get_file): search emoji data files in the second argument path. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60977 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-02 03:12:51 +00:00
nobu	8b180dd74e	enc-unicode.rb: for gperf 3.1 * tool/enc-unicode.rb: support for gperf 3.1, which defines length arguments as `size_t` but a local variable as `unsigned int`. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60976 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-02 03:12:50 +00:00
naruse	31796f17d3	Update to Onigmo 6.1.3-669ac9997619954c298da971fcfacccf36909d05. [Bug #13892] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60966 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-01 13:50:13 +00:00
naruse	9cf7985893	Merge Onigmo 6.1.2 `1364ae3488` git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58768 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-17 05:38:37 +00:00
nobu	e1e5857c08	enc-unicode.rb: fix version matching * tool/enc-unicode.rb (data_foreach): version comments do not include sub directory names. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58070 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-23 15:55:00 +00:00
nobu	81ab413288	fix GraphemeBreakProperty.txt * tool/downloader.rb: download to the file given in ARGV. * tool/enc-unicode.rb (parse_GraphemeBreakProperty): fix data file path as $(UNICODE_PROPERTY_FILES) in common.mk. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58069 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-23 15:49:10 +00:00
nobu	d77214e8a3	enc-unicode.rb: ifdef blocks * tool/enc-unicode.rb (Unifdef#ifdef): enclose conditional blocks in blocks. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58066 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-23 07:59:57 +00:00
nobu	8083a359d0	enc-unicode.rb: uniname2ctype_offset git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58065 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-23 07:59:56 +00:00
naruse	2873edeafb	Merge Onigmo 6.0.0 * https://github.com/k-takata/Onigmo/blob/Onigmo-6.0.0/HISTORY * fix for ruby 2.4: https://github.com/k-takata/Onigmo/pull/78 * suppress warning: https://github.com/k-takata/Onigmo/pull/79 * include/ruby/oniguruma.h: include onigmo.h. * template/encdb.h.tmpl: ignore duplicated definition of EUC-CN in enc/euc_kr.c. It is defined in enc/gb2313.c with CRuby macro. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57045 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-10 17:47:04 +00:00
nobu	671c929f0a	Use offsetof macro and shrink table size git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56952 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-01 00:34:42 +00:00
naruse	c11e648799	Regexp supports Unicoe 9.0.0's \X * meta character \X matches Unicode 9.0.0 characters with some workarounds for UTR #51 Unicode Emoji, Version 4.0 emoji zwj sequences. [Feature #12831] [ruby-core:77586] The term "character" can have many meanings bytes, codepoints, combined characters, and so on. "grapheme cluster" is highest one of such words, which means user-perceived characters. Unicode Standard Annex #29 UNICODE TEXT SEGMENTATION specifies how to handle grapheme clusters (extended grapheme cluster). But some specs aren't updated to current situation because Unicode Emoji is rapidly extended without well definition. It breaks the precondition of UTR#29 "Grapheme cluster boundaries can be easily tested by looking at immediately adjacent characters". (the sentence will be removed in the next version) Though some of its detail are described in Unicode Technical Report #51 UNICODE EMOJI but it is not merged into UTR#29 yet. http://unicode.org/reports/tr29/ http://unicode.org/reports/tr51/ http://unicode.org/Public/emoji/4.0/ git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-30 17:29:19 +00:00
nobu	2ae5e54e62	open Unicode data in binary mode * tool/enc-unicode.rb (data_foreach): open in binary mode because Unicode 9.0.0 contains non-ascii characters. * template/unicode_norm_gen.tmpl: ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55945 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-08-16 13:01:30 +00:00
nobu	e827c334c3	enc/unicode: check Unicode versions * enc/unicode/case-folding.rb, tool/enc-unicode.rb: check if Unicode versions are consistent with each other. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55687 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-15 00:53:50 +00:00
nobu	0fd7666d57	enc-unicode.rb: check Unicode version * tool/enc-unicode.rb (data_foreach): check Unicode version in data files, and yield each lines. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55685 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-14 16:32:13 +00:00
normal	f8d0bdedf1	tool: add descriptions and fix typos * tool/asm_parse.rb: add description * tool/change_maker.rb: ditto * tool/downloader.rb: ditto * tool/eval.rb: ditto * tool/expand-config.rb: ditto * tool/extlibs.rb: ditto * tool/fake.rb: ditto * tool/file2lastrev.rb: ditto * tool/gem-unpack.rb: ditto * tool/gen_dummy_probes.rb: ditto * tool/gen_ruby_tapset.rb: ditto * tool/generic_erb.rb: ditto * tool/id2token.rb: ditto * tool/ifchange: ditto * tool/insns2vm.rb: ditto * tool/instruction.rb: ditto * tool/jisx0208.rb: ditto * tool/merger.rb: ditto * tool/mkrunnable.rb: ditto * tool/node_name.rb: ditto * tool/parse.rb: ditto * tool/rbinstall.rb: ditto * tool/rbuninstall.rb: ditto * tool/rmdirs: ditto * tool/runruby.rb: ditto * tool/strip-rdoc.rb: ditto * tool/vcs.rb: ditto * tool/vtlh.rb: ditto * tool/ytab.sed: ditto * tool/enc-unicode.rb: fix typo * tool/mk_call_iseq_optimized.rb: ditto * tool/update-deps: ditto [ruby-core:76215] [Bug #12539] by Noah Gibbs <the.codefolio.guy@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55564 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-02 21:01:04 +00:00
nobu	2fac41a63e	enc-unicode.rb: --header * tool/enc-unicode.rb: add --header option to emit name2ctype.h directly. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52681 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2015-11-20 04:03:10 +00:00
naruse	64c81e40d4	* regcomp.c: Merge Onigmo 5.14.1 25a8a69fc05ae3b56a09. this includes Support for Unicode 7.0 [Bug #9092]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46831 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2014-07-16 03:27:25 +00:00
naruse	407bcb4bc6	* Merge Onigmo d4bad41e16e3eccd97ccce6f1f96712e557c4518. fix lookbehind assertion fails with /m mode enabled. [Bug #8023] fix \Z matches where it shouldn't. [Bug #8001] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@39718 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2013-03-11 03:46:55 +00:00
naruse	78dbaa1648	* Merge Onigmo 0fe387da2fee089254f6b04990541c731a26757f v5.13.3 [Bug#7972] [Bug#7974] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@39547 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2013-03-01 16:36:37 +00:00
naruse	06d483006c	* Makefile.in: don't remove macros. now name2ctype uses macros. * tool/enc-unicode.rb: add comment why it uses Hash#index. * enc/unicode/{name2ctype.kwd,name2ctype.src,name2ctype.h.blt}: update to follow the current name2ctype.h. FYI current Unicode version is 6.1. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@36070 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2012-06-13 17:54:14 +00:00
naruse	6227690604	* tool/enc-unicode.rb: don't use 1.9 feature on tools. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34671 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2012-02-18 02:47:41 +00:00
naruse	0424e152c6	* Merge Onigmo-5.13.1. [ruby-dev:45057] [Feature #5820 ] https://github.com/k-takata/Onigmo cp reg{comp,enc,error,exec,parse,syntax}.c reg{enc,int,parse}.h cp oniguruma.h cp tool/enc-unicode.rb cp -r enc/ git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34663 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2012-02-17 07:42:23 +00:00
naruse	375fd3152f	* tool/transcode-tblgen.rb (import_ucm): don't use \h because the script should work with ruby 1.8. * tool/enc-unicode.rb: ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34650 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2012-02-17 00:53:13 +00:00
naruse	a0265b0662	* tool/enc-unicode.rb, enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: Add Age property to regexp. [ruby-core:33019] patched by Ammar Ali, tested by Run Paint Run Run git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29717 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2010-11-08 05:32:45 +00:00
naruse	f85b841a01	* tool/enc-unicode.rb, enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: Add 'Unknown' Script. patched by Run Paint Run Run. [ruby-core:32937] #3998 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29626 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2010-10-29 01:03:21 +00:00
naruse	fc9176ac0e	* tool/enc-unicode.rb, enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: Update Oniguruma for Unicode 6. patched by Run Paint Run Run. [ruby-core:32923] #3989 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29620 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2010-10-28 11:14:05 +00:00
nobu	b238a3f3fd	* tool/enc-unicode.rb: get rid of lots of warnings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29489 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2010-10-13 14:16:49 +00:00
naruse	d5537936ab	* tool/enc-unicode.rb, enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: use UTS#18 for POSIX character class. http://rubyspec.org/issues/show/161 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25338 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2009-10-14 16:51:52 +00:00
naruse	181eb7d5c1	Add derived core and binary property and aliases. * tool/enc-unicode.rb, enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: Add DerivedCoreProperties, PropList (Binary Property), PropertyAlias and PropertyValueAlias. Now users of tool/enc-unicode.rb should specify the directory of UCD files. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25324 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2009-10-13 12:27:00 +00:00
naruse	5a4ce608e2	* tool/enc-unicode.rb: optimized. * enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: U+100000-U+10FFFD is assigned, not Cn. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2009-10-08 18:07:08 +00:00
naruse	866c79e2de	* tool/enc-unicode.rb: parse range notation of UnicodeData.txt. * enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: follow above change. [ruby-dev:39444] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25260 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2009-10-08 02:49:11 +00:00
naruse	ee4b59a419	* unicode.c (onigenc_unicode_property_name_to_ctype): ignore case of properties. * tool/enc-unicode.rb: downcase properties list. * enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: follow above. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24836 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2009-09-10 22:54:01 +00:00
naruse	f1eff95745	Update Oniguruma's UnicodeData to 5.1. * tool/enc-unicode.rb: added for generate name2ctype.kwd. contributed by Run Paint Run Run [ruby-core:24775] use like following: ruby19 tool/enc-unicode.rb enc/unicode/UnicodeData.txt \ enc/unicode/Scripts.txt > enc/unicode/name2ctype.kwd * enc/unicode.c (CodeRanges): move definitions to name2ctype.h. * enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: updated to v5.1. * enc/unicode/UnicodeData.txt, enc/unicode/Scripts.txt: added v5.1. * Makefile.in: add rule to generate name2ctype.kwd from UnicodeData.txt and Scripts.txt. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24651 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2009-08-25 16:15:38 +00:00

48 commits