Граф коммитов

286 Коммитов

Автор SHA1 Сообщение Дата
naruse f8d97b0026 * enc/iso_2022_jp.h: add CP50220.
* enc/trans/iso2022.trans: add converter for CP50220.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27860 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-05-17 06:28:16 +00:00
naruse 90be970018 * enc/utf_8.c: Add new alias UTF-8-HFS for UTF8-MAC.
http://www.gnu.org/software/emacs/NEWS.23.2

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27736 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-05-11 06:15:53 +00:00
nobu a0136f4f27 * properties.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27658 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-05-07 13:29:44 +00:00
naruse afd64aafd1 * enc/trans/iso2022.trans: CP50221 supports 8bit JIS.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27149 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-04-01 08:18:38 +00:00
nobu 7e3e79d083 * enc/utf_16{be,le}.c (utf16{be,le}_mbc_to_code): simplified.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27143 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-04-01 05:30:25 +00:00
muraken e4d8dc5c46 * bignum.c, node.h, strftime.c, enc/trans/utf8_mac.trans: added explicit casts for supplessing warnings.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27040 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-25 03:08:28 +00:00
akr 49d993729f * tool/transcode-tblgen.rb (transcode_compile_tree): make
valid_encoding mandatory unless from_encoding is registered in
  ValidEncoding.
  (transcode_tbl_only): ditto.
  (transcode_tblgen): ditto.
  (ValidEncoding): new function.

* enc/trans/escape.trans: specify valid_encoding.

* enc/trans/emoji_sjis_docomo.trans: ditto.

* enc/trans/emoji.trans: ditto.

* enc/trans/emoji_iso2022_kddi.trans: ditto.

* enc/trans/big5.trans: ditto.

* enc/trans/emoji_sjis_softbank.trans: ditto.

* enc/trans/emoji_sjis_kddi.trans: ditto.

* enc/trans/chinese.trans: use ValidEncoding() instead of
  ValidEncoding[].


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26995 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-21 03:38:58 +00:00
muraken 04d90693dc * enc/trans/emoji.trans: added codepoints leading 0xf4 into nomap_table.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26955 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-16 11:18:03 +00:00
akr a73374bb57 * tool/transcode-tblgen.rb (transcode_tblgen): add valid_encoding
optional argument.

* enc/trans/single_byte.trans use valid_encoding argument for
  transcode_tblgen.

* enc/trans/chinese.trans: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26941 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-15 12:25:20 +00:00
akr ff39d22c33 * enc/trans/emoji.trans: fix nomap_table.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-14 06:49:22 +00:00
akr fa37ab769f * tool/transcode-tblgen.rb: reject ambiguous mapping.
* enc/trans/single_byte.trans: remove ambiguous maping such as
  \xD6 -> U+05F2 and \xD6\xC7 -> U+FB1F in Windows-1255


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26912 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-13 17:54:43 +00:00
muraken 9eb49ff8d7 * enc/x_emoji.h: renamed from enc/x-emoji.c.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26863 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-10 03:12:17 +00:00
muraken 62f8df2d3c * enc/trans/EMOJI/*.src, enc/trans/emoji*, enc/x-emoji.c, test/ruby/enc/test_emoji.rb, tool/enc-emoji-citrus-gen.rb, tool/enc-emoji4unicode.rb, tool/jisx0208.rb, tool/test/test_jisx0208.rb: new encodings to support emoji charsets, which are used by Japanese mobile phones [ruby-dev:40528]. Thanks Yoji Shidara for a lot of contribution.
* tool/transcode-tblgen.rb: modified for enc-emoji4unicode.rb.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26856 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-09 09:15:42 +00:00
matz db37773e13 * include/ruby/oniguruma.h: updated to follow Oniguruma 5.9.2.
* re.c (make_regexp): use onig_new() instead of onig_alloc_init().

* re.c (rb_reg_to_s): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26791 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-01 21:54:59 +00:00
naruse 6899b6ff80 * enc/trans/utf8_mac.trans (buf_shift_char): don't see uninitialised
value. [ruby-dev:40233]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26464 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-01-29 00:56:10 +00:00
duerst b32ee85f97 * transcode_data.h, transcode.c, tool/transcode-tblgen.rb: Added
support for new transcoding instruction FUNsio (with Tatsuya Mizuno)

* enc/trans/gb18030.trans: Significantly reduced GB18030 conversion
  table footprint using FUNsio and differences (with Tatsuya Mizuno)

* test/ruby/test_transcode.rb: Minor name fix (from Tatsuya Mizuno)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26065 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-12-10 11:59:12 +00:00
duerst 9998481d4e * enc/trans/gb18030-tbl.rb: Fix omission of C1 region in code table
(from Tatsuya Mizuno)

* test/ruby/test_transcode.rb: Added test for converting full range of
  Unicode codepoints from/to GB18030 (from Tatsuya Mizuno)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25980 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-12-03 11:29:33 +00:00
akr cc128e3ecf * enc/trans/newline.trans (fun_so_universal_newline): generate \n
after \r\n detection instead of just after \r.
  [ruby-list:45988] [ruby-core:25881] [ruby-core:26788]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25883 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-11-22 19:15:55 +00:00
duerst e0436c54c2 * enc/big5.c, enc/trans/big5.trans, enc/trans/big5-uao-tbl.rb,
test/ruby/test-transcode.rb: Added Encoding 'Big5-UAO' and transcoding
  for it (from Tatsuya Mizuno) (see Bug #1784)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25822 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-11-17 08:56:11 +00:00
naruse d5537936ab * tool/enc-unicode.rb,
enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  use UTS#18 for POSIX character class.
  http://rubyspec.org/issues/show/161

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25338 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-10-14 16:51:52 +00:00
naruse 181eb7d5c1 Add derived core and binary property and aliases.
* tool/enc-unicode.rb,
  enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  Add DerivedCoreProperties, PropList (Binary Property),
  PropertyAlias and PropertyValueAlias.
  Now users of tool/enc-unicode.rb should specify
  the directory of UCD files.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25324 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-10-13 12:27:00 +00:00
nobu 7081875aa8 * enc/unicode/name2ctype.h: update.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25275 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-10-10 03:20:49 +00:00
naruse 5a4ce608e2 * tool/enc-unicode.rb: optimized.
* enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  U+100000-U+10FFFD is assigned, not Cn.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-10-08 18:07:08 +00:00
naruse 866c79e2de * tool/enc-unicode.rb: parse range notation of UnicodeData.txt.
* enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  follow above change. [ruby-dev:39444]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25260 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-10-08 02:49:11 +00:00
naruse 8d4ebdc8fe * enc/unicode/name2ctype.h: Updated to Unicode 5.2.0.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25195 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-10-02 16:03:20 +00:00
naruse 48eafcbc49 Updated to Unicode 5.2.0.
* enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd,
  enc/unicode/name2ctype.src: Updated to Unicode 5.2.0.
  NOTE: when you update these data, download UnicodeData.txt
  and Scripts.txt from http://www.unicode.org/Public/UNIDATA/
  and run
  ruby1.9 tool/enc-unicode.rb UnicodeData.txt Scripts.txt \
  > enc/unicode/name2ctype.kwd

* enc/unicode/Scripts.txt: removed.

* enc/unicode/UnicodeData.txt: removed.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25190 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-10-02 13:37:41 +00:00
naruse ee4b59a419 * unicode.c (onigenc_unicode_property_name_to_ctype):
ignore case of properties.

* tool/enc-unicode.rb: downcase properties list.

* enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  follow above.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24836 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-10 22:54:01 +00:00
nobu 31b7ae00c0 * include/ruby/st.h (st_hash_func): use st_index_t.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24792 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-08 13:10:04 +00:00
naruse 6e6a28183a * unicode.c (PROPERTY_NAME_MAX_SIZE): use MAX_WORD_LENGTH.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24677 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-26 17:01:10 +00:00
nobu 1af43ae867 * enc/unicode.c (onigenc_unicode_mbc_case_fold): balanced braces.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24658 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-26 00:48:49 +00:00
nobu 1fd7f2e57d * enc/unicode/name2ctype.h: updated.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24657 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-25 21:54:03 +00:00
naruse f1eff95745 Update Oniguruma's UnicodeData to 5.1.
* tool/enc-unicode.rb: added for generate name2ctype.kwd.
  contributed by Run Paint Run Run [ruby-core:24775]
  use like following:
    ruby19 tool/enc-unicode.rb enc/unicode/UnicodeData.txt \
      enc/unicode/Scripts.txt > enc/unicode/name2ctype.kwd

* enc/unicode.c (CodeRanges): move definitions to name2ctype.h.

* enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd,
  enc/unicode/name2ctype.src: updated to v5.1.

* enc/unicode/UnicodeData.txt, enc/unicode/Scripts.txt: added v5.1.

* Makefile.in: add rule to generate name2ctype.kwd from
  UnicodeData.txt and Scripts.txt.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24651 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-25 16:15:38 +00:00
nobu a7b920686a * enc/unicode/name2ctype.h: split from enc/unicode.c and made a
perfect hash.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24613 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-21 08:01:09 +00:00
nobu a606038c6a * enc/utf_8.c (code_to_mbc): suppressed a warning.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24607 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-21 06:37:36 +00:00
nobu e1c9ac6bd9 * enc/unicode.c (CodeRanges): initialized statically.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-19 02:32:49 +00:00
naruse 2b91cbbf11 * enc/Makefile.in (MKDIRS): revert r24525.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24538 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-14 08:20:13 +00:00
nobu 24c783e95e * configure.in, Makefile.in (MAKEDIRS): used MKDIR_P instead of
as_mkdir_p.  [ruby-dev:39063]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24525 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-13 07:20:16 +00:00
naruse 38107457a3 * enc/encdb.c (ENC_SET_BASE): fix typo. patch by ujihisa [ruby-dev:39004]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24386 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-04 03:42:11 +00:00
naruse 8c658137d9 More strict for Big5 series.
* enc/big5.c (EncLen_Big5): back to original Big5 table.
  (EncLen_Big5_HKSCS): for Big5-HKSCS.
  (trans): add the lead byte table for Big5-HKSCS.
  (big5_mbc_enc_len): abstract function for Big5 series.
  (big5_mbc_enc_len): for Big5.
  (big5_hkscs_mbc_enc_len): for Big5-HKSCS.
  (BIG5_HKSCS_P): added.
  (BIG5_ISMB_FIRST): add routine for Big5-HKSCS.
  (big5_hkscs): add for Big5-HKSCS.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24384 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-04 00:51:22 +00:00
naruse b3d7273dc1 Add functions and macros for second encoding definitions.
* encoding.c (rb_enc_set_base): Add for setting base encoding
  with their names. this is internal function.

* template/encdb.h.tmpl: specify ENC_SET_BASE for second
  encodings in each encoding files.

* enc/encdb.c (rb_enc_set_base): add a declaration.
  (ENC_SET_BASE): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-04 00:50:59 +00:00
nobu 4fd615943e * enc/big5.c: not executable.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24269 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-07-25 04:42:48 +00:00
naruse a8951a5b3a * enc/big5.c: Fix EncLen_BIG5 for Big5-HKSCS. see [ruby-core:24390]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24267 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-07-24 16:15:53 +00:00
duerst 2886207584 * enc/trans/big5.trans, big5-hkscs-tbl.rb:
new Chinese BIG5-HKSCS transcoding (with Tatsuya Mizuno)

* test/ruby/test_transcode.rb: added tests for the above
  (with Tatsuya Mizuno)

* enc/big5.c: Added BIG5-HKSCS as a replicate encoding of BIG5
  (short term solution, needs more work; with Tatsuya Mizuno)

* tool/transcode-tblgen.rb: made 'pat' directly accessible in
  class StrSet


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24264 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-07-24 10:26:18 +00:00
nobu c030cf1975 * ruby.c (process_options), enc/prelude.rb: encdb and transdb are
extension libraries.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23813 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-06-22 05:41:51 +00:00
naruse d9cf0f822f * enc/trans/utf8_mac.trans: remove wrong optimization.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23686 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-06-13 18:55:55 +00:00
naruse 3abca796f4 Fix: DON'T move in_p because before in_p is replaced by buffered data.
* transcode.c: NOMAP is now multibyte direct map.

* transcode.c: remove ASIS.

* transcode_data.h: ditto.

* tool/transcode-tb (ActionMap#generate_info): remove :asis.

* tool/transcode-tb (ActionMap#generate_info): add :nomap0.

* enc/trans/utf8_mac.trans: replace :asis by :nomap0.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23344 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-05-05 00:05:11 +00:00
naruse f207f9fd51 * enc/trans/utf8_mac-tbl.rb: don't use Unicode escape.
* enc/trans/utf8_mac.trans: follow above.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23325 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-05-02 01:38:27 +00:00
nobu 8543ecee53 * enc/trans/utf8_mac.trans: get rid of a 1.9 feature for cross
compile.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23309 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-04-30 06:27:51 +00:00
naruse 80705b9fbf Add new transcoder: CP51932 <-> CP50221.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23307 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-04-29 12:50:57 +00:00
naruse d0a4f8ada9 * enc/trans/utf8_mac.trans: Add converter for UTF8-MAC.
* enc/trans/utf8_mac-tbl.rb: ditto.

* test/ruby/test_econv.rb: tests for above.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23296 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-04-26 14:21:43 +00:00