Commit graph

114 commits

Author SHA1 Message Date
mame 65670f9400 * enc/iso_8859_5.c: Large omicron should lowercase to small omicron.
* test/ruby/test_big5.rb, test/ruby/test_cp949.rb,
  test/ruby/test_euc_jp.rb, test/ruby/test_euc_kr.rb,
  test/ruby/test_euc_tw.rb, test/ruby/test_gb18030.rb,
  test/ruby/test_gbk.rb, test/ruby/test_iso_8859.rb,
  test/ruby/test_koi8.rb, test/ruby/test_shift_jis.rb,
  test/ruby/test_windows_1251.rb: new tests for encoding.

* test/ruby/test_utf16.rb, test/ruby/test_utf32.rb,
  test/ruby/test_regexp.rb: add tests.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16759 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-02 13:30:38 +00:00
naruse 5397015d2e * enc/gb18030.c (gb18030_code_to_mbc): add 0x80000000
for 4bytes character.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16739 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-01 19:36:28 +00:00
naruse 9c13fc7d89 * enc/gb18030.c (gb18030_mbc_to_code): mask by 0x7FFFFFFF
because OnigCodePoint will be used as 32bit signed int.
  Masking by 0x7FFFFFFF is ok on GB18030;
  Minimum 4bytes character is 0x81308130.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16737 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-01 18:29:08 +00:00
naruse 0682fab6a2 * enc/utf_16{be,le}.c (utf16{be,le}_code_to_mbc):
fix codepoint to bytes.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16717 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-31 10:14:38 +00:00
nobu 6a734c810c * common.mk (prelude.c): simply depends on PREP. [ruby-dev:34877]
* enc/make_encdb.rb, enc/trans/make_transdb.rb: ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16703 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-30 03:18:45 +00:00
naruse d6025a3be4 * enc/utf_8.c: add UTF8-MAC (UTF-8-MAC).
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16697 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-29 22:12:57 +00:00
usa 22088e3423 * enc/trans/japanese.c (to_SHIFT_JIS_EF_infos): typo.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16661 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-28 06:54:18 +00:00
naruse 3ab17047f5 * enc/trans/japanese.c: add workaround for Unicode to CP932.
U+2015->0x815C, U+2225->0x8161, U+FF0D->0x817C, U+FF3C->0x815F,
  U+FF5E->0x8160, U+FFE0->0x8191, U+FFE1->0x8192, U+FFE2->0x81CA

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16657 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-28 04:35:59 +00:00
mame 0a2053713b * enc/trans/utf_16_32.c (fun_so_to_utf_16be, fun_so_to_utf_16le): add
parentheses to remove warnings of gcc.

* io.c (rb_io_getc): remove unused variables.

* compile.c (NODE_NEXT, NODE_REDO): remove unused labels.

* ext/nkf/nkf.c (rb_nkf_convert): remove unused variables.

* ext/syck/rubyext.c (syck_resolver_initialize,
  syck_resolver_detect_implicit, syck_emitter_emit): remove unused
  variables.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16061 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-17 13:22:40 +00:00
nobu 7dc26509c6 * common.mk (prelude.c): depends on enc/prelude.rb.
* enc/prelude.rb: fixed initial library names.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15922 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-08 01:30:31 +00:00
nobu 1369cfd16e * encoding.c (enc_init_db): moved to enc/encdb.c.
* transcode.c (init_transcoder_table): moved to enc/trans/transdb.c.

* enc/depend (enc/encdb.o enc/trans/transdb.o): depend on
  corresponding headers.

* common.mk (COMMONOBJS): moved transcode.o from OBJS


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-07 06:51:33 +00:00
duerst 2e7815dd80 Sun Mar 16 18:07:07 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* enc/trans/utf_16_32.c: bug fix (some invalid UTF-8 sequences
	  were legal)

	* test/ruby/test_transcode.rb: test for above bug



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15786 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-16 09:09:53 +00:00
nobu 9d014dc254 * ext/extmk.rb, enc/make_encmake.rb: load current mkmf.rb even if
cross-compiling.

* ext/extmk.rb, enc/make_encmake.rb, lib/mkmf.rb: need to be 1.8
  compatible for cross-compiling.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15616 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-26 18:56:00 +00:00
nobu 80e81d283d * enc/{depend,make_encdb.rb,trans/make_transdb.rb}: sort in alpha-numeric order.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15567 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-22 00:07:23 +00:00
naruse 8984fa6742 * enc/{euc_jp.c,gbk.c,iso_8859_1.c,iso_8859_11.c,iso_8859_13.c,
iso_8859_2.c,iso_8859_6.c,iso_8859_7.c,iso_8859_8.c,iso_8859_9.c,
  shift_jis.c,windows_1251.c}: add document about encodings.

* enc/cp949.c: divided into new file.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15516 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 03:21:20 +00:00
naruse a2d85d61bd * enc/iso_8859_{4,13}.c: Windows-1257 is replica of ISO-8859-13.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15495 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-15 20:55:27 +00:00
naruse a8739621cf * lib/uri/generic.rb: revert r15442. 2nd argument of String#sub parse
escapes. [ruby-dev:33726]

* bootstraptest/test_method.rb enc/depend instruby.rb lib/mkmf.rb
  mkconfig.rb: revert r15443. ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15456 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-13 07:26:52 +00:00
usa f6628871b5 * enc/depend: fix typo.
* lib/mkmf.rb: revert r15443. "\\1#{sep}\\2" is wrong if sep is ended
	  with "\\".



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15455 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-13 02:21:25 +00:00
naruse a10ded3ba0 * bootstraptest/runner.rb, bootstraptest/test_method.rb, enc/depend,
instruby.rb, lib/mkmf.rb, lib/test/unit/util/procwrapper.rb,
mkconfig.rb, sample/test.rb, template/vm.inc.tmpl,
test/ruby/test_stringchar.rb: fixes around String#gsub.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15443 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-12 06:28:23 +00:00
naruse e22ff0c9b6 * enc/trans/korean.c: add support for CP949 by Park Ji-In. [ruby-dev:33626]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15393 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-07 06:05:32 +00:00
naruse 4f0083e45f * enc/trans/korean.c: add EUC-KR conversion support by Park Ji-In.
[ruby-dev:33621]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15385 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-06 19:40:11 +00:00
naruse 9ac5a0ca4d * enc/*.c: add GB12345, UCS-{2,4}{BE,LE}.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15341 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-30 08:35:03 +00:00
akr 44cfd58dc5 * enc/utf_16be.c (UTF16_IS_SURROGATE_FIRST): avoid branch.
(UTF16_IS_SURROGATE_SECOND): ditto.
  (UTF16_IS_SURROGATE): defined.
  (utf16be_mbc_enc_len): validation implemented.

* enc/utf_16le.c (UTF16_IS_SURROGATE_FIRST): avoid branch.
  (UTF16_IS_SURROGATE_SECOND): ditto.
  (UTF16_IS_SURROGATE): defined.
  (utf16le_mbc_enc_len): validation implemented.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15338 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-30 03:49:54 +00:00
akr 0ba09d829c fix state definition.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15333 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 15:35:37 +00:00
akr 12e8b588ac * enc/euc_tw.c (euctw_mbc_enc_len): validation implemented.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15331 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 15:10:50 +00:00
akr 6e3391c866 * enc/euc_tw.c (euctw_islead): 0x8e is a leading byte.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15323 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 13:01:27 +00:00
naruse b9821b02a0 * enc/trans/make_transdb.rb: add for make transdb.h.
* dmytranscode.c: add for miniruby.

* enc/gbk.c (gbk_left_adjust_char_head, gbk_is_allowed_reverse_match):
  fix odd regexp match. [ruby-dev:33502]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15321 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 11:44:08 +00:00
naruse 7a8c02cd47 * add enc/trans/make_transdb.rb, dmytranscode.c
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15320 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 11:18:22 +00:00
naruse 74b254e833 * enc/trans/japanese.c (rb_to_Windows_31J): to 'Windows-31J'.
* common.mk: add rules for transdb.h.

* transcode.c (init_transcoder_table): use transdb.h.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15317 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 10:05:39 +00:00
naruse 19d9380b3d * enc/gbk.c (EncLen_gbk): too short. [ruby-dev:33497]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15315 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 08:55:19 +00:00
akr 86a9215bbf * enc/gb18030.c (gb18030_mbc_enc_len): validation implemented.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15313 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 08:38:21 +00:00
naruse 00a3c40c37 * enc/euc_kr.c: remove CP949.
* enc/euc_cn.c: remove CP936 and rename to gb2312.c

* enc/gb2312.c: GB2312 is preferred MIME name.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15309 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 04:41:41 +00:00
naruse fe15b86b9d * enc/gbk.c: add GBK, CP936 and CP949.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 04:26:30 +00:00
naruse a2b03f10dc * enc/gbk.c: add GBK, CP936 and CP949.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15307 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 03:55:45 +00:00
naruse 2f961c1f37 * enc/utf_7.h: add dummy encoding UTF-7 and its alias CP65000.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15291 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-28 08:57:40 +00:00
usa fee57bb8c8 * enc/utf_8.c: add alias CP65001.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15290 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-28 08:41:49 +00:00
akr ffbf8ab367 * enc/big5.c (big5_mbc_enc_len): validation implemented.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15289 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-28 06:33:57 +00:00
akr 5f9bc1779e * enc/euc_kr.c (euckr_mbc_enc_len): validation implemented.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15288 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-28 04:02:39 +00:00
nobu b2c5814afc * enc/trans/japanese.c (rb_from_Windows_31J, rb_to_Windows_31J):
provisional workaround for Windows-31J.  [ruby-dev:33320]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15188 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-23 13:14:31 +00:00
duerst ef3fdbca15 Tue Jan 22 17:52:52 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* enc/trans/utf_16_32.c: Streamline parentheses, add more
	  'static' qualifiers.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15169 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-22 08:52:02 +00:00
duerst 38321fc0eb Mon Jan 21 19:42:42 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* transcode.c, enc/trans/utf_16_32.c, test/ruby/test_transcode.rb:
	  added UTF-32BE and UTF-32LE conversions.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15156 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-21 10:41:59 +00:00
nobu 463af63468 * transcode.c (transcode_loop, str_transcoding_resize): use unsigned
char.  [ruby-dev:33232]

* transcode_data.h (rb_transcoding, rb_transcoder): removed callback
  parameters.

* enc/trans/japanese.c: ditto.

* enc/trans/utf_16_32.c: parenthesized bit-or operands.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15150 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-21 03:35:05 +00:00
nobu a8969e999a * transcode.c (transcode_dispatch): constified return value.
* transcode_data.h (rb_transcoding): include pointer to rb_transcoder
  and auxiliary data.

* transcode_data.h (rb_transcoder): all callback functions should have
  their own parameters.

* enc/trans/{japanese,single_byte}.c: constified.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15148 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-20 21:40:08 +00:00
duerst a9b15a4e0c Sun Jan 20 20:00:20 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* transcode.c, enc/trans/utf_16_32.c, test/ruby/test_transcode.rb:
	  added UTF-16LE conversions.

	* fixed changelog for last commit



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15144 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-20 11:00:24 +00:00
duerst 3d0c7bea4d Sun Jan 20 15:08:08 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* enc/trans/utf_16_32.c: new file, currently implementing
	  UTF-16BE conversions only.

	* test/ruby/test_transcode.rb: Added tests for UTF-16BE;
	  made check_both_ways() use force_encoding differently.

	* transcode_data.h, transcode.c: Support for more conversion
	  functions.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15142 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-20 06:12:48 +00:00
naruse 9a1d7e4d01 * enc/make_encdb.rb: fix duplication check.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15135 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-19 20:15:13 +00:00
naruse 7b3781c60c * ascii.c: remove definition of replica KOI8-U.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15134 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-19 20:04:35 +00:00
naruse 6e1c3a0f54 * enc/koi8_u.c: added.
* regenc.c, enc/utf_8.c, enc/unicode.c, enc/gb18030.c: add ARG_UNUSED.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15130 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-19 15:37:06 +00:00
nobu 8b112c580c * enc/euc_cn.c: split from enc/euc_kr.c.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-17 15:17:21 +00:00
nobu 0052259d5e * common.mk (encdb.h): give output file name to make_encdb.rb.
* encoding.c (enc_table): simplified.

* encoding.c (enc_register_at): lazy loading.  [ruby-dev:33013]

* regenc.h (ENC_DUMMY): added.

* enc/make_encdb.rb: now emits macros only.

* enc/iso_2022_jp.h: split from encoding.c.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-17 14:56:22 +00:00