Commit graph

581 commits

Author SHA1 Message Date
nobu e668e36b49 * template/{encdb,transdb}.h.tmpl: moved enc/make_encdb.rb and
enc/trans/make_transdb.rb using tool/generic_erb.rb.

* common.mk (encdb.h, transdb.h): generated from the above template.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21490 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-13 09:05:29 +00:00
nobu 4cb8d3316a * enc/trans/make_transdb.rb (converters): should not depend on the
hash order for cross compile.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21489 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-13 08:28:14 +00:00
duerst deeade6f3e * enc/trans/gbk.trans, gbk-tbl.rb:
new Chinese GBK transcoding (from Yoshihiro Kambayashi)

* test/ruby/test_transcode.rb: added tests for the above
  (from Yoshihiro Kambayashi)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21315 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-04 09:12:14 +00:00
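For illustration only: a minimal sketch of what a GBK transcoder such as the one added in this entry makes possible. It is written against the `String#encode` API of current Ruby releases, which may differ from what was available at this revision. The helper name `gbk_round_trip` is hypothetical.

```ruby
# Hypothetical helper (not part of the commit): convert UTF-8 text to GBK
# and back again, returning both strings.
def gbk_round_trip(text)
  gbk = text.encode("GBK")       # UTF-8 -> GBK
  [gbk, gbk.encode("UTF-8")]     # GBK -> UTF-8
end

gbk, back = gbk_round_trip("中文")
p gbk.encoding                               # encoding of the transcoded string
p gbk.bytes.map { |b| format("%02X", b) }    # raw GBK bytes in hex
p back == "中文"                             # did the round trip preserve the text?
```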
duerst fecce9e5e5 * test/ruby/test_transcode.rb: added tests for GB2312
(from Yoshihiro Kambayashi)

* enc/trans/chinese.trans: set valid byte patterns for
  GB2312 and GB12345


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21314 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-04 08:55:04 +00:00
duerst 3bc30f0b73 * enc/trans/big5.trans, big5-tbl.rb:
new Chinese Big5 transcoding (from Yoshihiro Kambayashi)

* test/ruby/test_transcode.rb: added tests for the above
  (from Yoshihiro Kambayashi)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21313 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-04 08:40:26 +00:00
naruse 1240916075 change encoding name.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21285 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-03 14:12:39 +00:00
naruse 2920aaa2d1 * enc/trans/chinese.trans: added for transcoding EUC-CN and GB12345.
* enc/trans/GB/: ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21283 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-03 06:31:05 +00:00
duerst a28fdecda7 * enc/trans/single_byte.trans, cp850-tbl.rb, cp852-tbl.rb,
cp855-tbl.rb, koi8-r-tbl.rb, koi8-u-tbl.rb, tis-620-tbl.rb:
  new single-byte transcodings (from Yoshihiro Kambayashi)

* test/ruby/test_transcode.rb: added tests for the above
  (from Yoshihiro Kambayashi), small cosmetic fixes


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20599 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-12-09 09:39:25 +00:00
nobu a6d8d84a9e * enc/depend (clean-srcs): split out from clean.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-12-08 15:17:52 +00:00
nobu 8e6ad88737 * enc/depend (LIBS): fixed for disable-shared. [ruby-dev:37103]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20241 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-11-17 09:05:19 +00:00
duerst 831e804388 * enc/trans/single_byte.trans, macgreek-tbl.rb, macroman-tbl.rb,
macromania-tbl.rb, macturkish-tbl.rb, macukraine-tbl.rb,
  ibm437-tbl.rb, ibm852-tbl.rb, ibm855-tbl.rb, ibm857-tbl.rb,
  ibm860-tbl.rb, ibm861-tbl.rb, ibm862-tbl.rb, ibm863-tbl.rb,
  ibm865-tbl.rb, ibm866-tbl.rb, ibm869-tbl.rb, ibm775-tbl.rb:
  new single-byte transcodings (from Yoshihiro Kambayashi)

* test/ruby/test_transcode.rb: added tests for the above
  (from Yoshihiro Kambayashi)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20178 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-11-11 05:26:20 +00:00
duerst d37df9fb13 * enc/trans/single_byte.trans, maccroatioan-tbl.rb,
maccyrillic-tbl.rb, maciceland-tbl.rb: new single-byte
  transcodings (from Yoshihiro Kambayashi)

* test/ruby/test_transcode.rb: added tests for the above
  (from Yoshihiro Kambayashi)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20075 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-31 09:07:21 +00:00
duerst 6fd14ccae5 * enc/trans/single_byte.trans: refactoring to make it easier
to add more transcodings (with Yoshihiro Kambayashi)

* enc/trans/iso-8859-1-tbl.rb: new file to avoid having to
  treat ISO-8859-1 as special


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20054 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-30 05:47:01 +00:00
nobu da6300e8f8 * enc/us_ascii.c (us_ascii_mbc_enc_len): made static. a patch by
Tadashi Saito <shiba AT mail2.accsnet.ne.jp> at [ruby-dev:36916]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19929 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-24 19:00:35 +00:00
duerst b014f1bc02 * enc/trans/single_byte.trans: adding WINDOWS-wwww encodings
(wwww = 874/1250/1251/1253/1254/1255/1256/1257)
  (contributed by Yoshihiro Kambayashi)

* enc/trans/windows-wwww-tbl.rb: 8 new files
  (contributed by Yoshihiro Kambayashi)

* test/ruby/test_transcode.rb: added test_windows_wwww
  (contributed by Yoshihiro Kambayashi)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19846 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-19 09:15:37 +00:00
duerst 7adbfbb793 * tool/transcode-tblgen.rb: added set_valid_byte_pattern
to reduce coupling between table generation script and
  specific encodings.

* enc/trans/single_byte.trans: using set_valid_byte_pattern


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19831 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-18 08:10:57 +00:00
nobu 7485e91f76 * common.mk, enc/depend (enc, trans): targets for sources.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19799 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-16 05:34:25 +00:00
akr b968fa97f6 * enc/trans/single_byte.trans (transcode_tblgen_singlebyte): renamed
from transcode_tblgen_windows.
  (transcode_tblgen_iso8859): use transcode_tblgen_singlebyte.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19780 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-14 11:33:17 +00:00
duerst 48a303c027 * enc/trans/single_byte.trans: added windows-1252
* enc/trans/windows-1252-tbl.rb: new file
  (contributed by Yoshihiro Kambayashi)

* tool/transcode-tblgen.rb: listed windows-1252 as '1byte'

* test/ruby/test_transcode.rb: added test_windows_1252
  (contributed by Yoshihiro Kambayashi)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19778 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-14 04:37:10 +00:00
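For illustration only: a short sketch of a single-byte transcoding to windows-1252, using `String#encode` as found in current Ruby releases (an assumption about today's API, not a description of this revision). The helper name `cp1252_bytes` is hypothetical.

```ruby
# Hypothetical helper (not part of the commit): byte values of a string
# after transcoding from its current encoding to Windows-1252.
def cp1252_bytes(text)
  text.encode("Windows-1252").bytes
end

p cp1252_bytes("€")   # the euro sign occupies a single byte in Windows-1252
p cp1252_bytes("é")   # so does e-acute, as in ISO-8859-1
```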
akr 081c802cb9 * grapheme cluster implementation reverted. [ruby-dev:36375]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19417 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-18 12:53:25 +00:00
akr b3d772643e fix typos.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19390 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-16 16:59:08 +00:00
akr a67d4fa01c * include/ruby/oniguruma.h (OnigEncodingTypeST): add precise_ret
argument for mbc_to_code.
  (ONIGENC_MBC_TO_CODE): provide NULL for precise_ret.
  (ONIGENC_MBC_PRECISE_CODEPOINT): defined.

* include/ruby/encoding.h (rb_enc_mbc_precise_codepoint): defined.

* regenc.h (onigenc_single_byte_mbc_to_code): precise_ret argument
  added.
  (onigenc_mbn_mbc_to_code): ditto.

* regenc.c (onigenc_single_byte_mbc_to_code): precise_ret argument
  added.
  (onigenc_mbn_mbc_to_code): ditto.

* string.c (count_utf8_lead_bytes_with_word): removed.
  (str_utf8_nth): removed.
  (str_utf8_offset): removed.
  (str_strlen): UTF-8 codepoint oriented optimization removed.
  (rb_str_substr): ditto.
  (enc_succ_char): use rb_enc_mbc_precise_codepoint.
  (enc_pred_char): ditto.
  (rb_str_succ): ditto.

* encoding.c (rb_enc_ascget): check length with
  rb_enc_mbc_precise_codepoint.
  (rb_enc_codepoint): use rb_enc_mbc_precise_codepoint.

* regexec.c (string_cmp_ic): add text_end argument.
  (match_at): check end of character after exact string matches.

* enc/utf_8.c (graphme_table): defined for extended grapheme cluster
  boundary.
  (grapheme_cmp): defined.
  (get_grapheme_properties): defined.
  (grapheme_boundary_p): defined.
  (MAX_BYTES_LENGTH): defined.
  (comb_char_enc_len): defined.
  (mbc_to_code0): extracted from mbc_to_code.
  (mbc_to_code): use mbc_to_code0.
  (left_adjust_combchar_head): defined.
  (utf_8): use an extended grapheme cluster as a unit.

* enc/unicode.c (onigenc_unicode_mbc_case_fold): use
  ONIGENC_MBC_PRECISE_CODEPOINT to extract codepoints.
  (onigenc_unicode_get_case_fold_codes_by_str): ditto.

* enc/euc_jp.c (mbc_to_code): follow mbc_to_code field change.
  use onigenc_mbn_mbc_to_code.

* enc/shift_jis.c (mbc_to_code): ditto.

* enc/emacs_mule.c (mbc_to_code): ditto.

* enc/gbk.c (gbk_mbc_to_code): follow mbc_to_code field and
  onigenc_mbn_mbc_to_code change.

* enc/cp949.c (cp949_mbc_to_code): ditto.

* enc/big5.c (big5_mbc_to_code): ditto.

* enc/euc_tw.c (euctw_mbc_to_code): ditto.

* enc/euc_kr.c (euckr_mbc_to_code): ditto.

* enc/gb18030.c (gb18030_mbc_to_code): ditto.

* enc/utf_32be.c (utf32be_mbc_to_code): follow mbc_to_code field
  change.

* enc/utf_16be.c (utf16be_mbc_to_code): ditto.

* enc/utf_32le.c (utf32le_mbc_to_code): ditto.

* enc/utf_16le.c (utf16le_mbc_to_code): ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19389 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-16 16:48:05 +00:00
akr 0675246ba6 * transcode_data.h (rb_transcoder): resetsize_func and resetstate_func
also returns ssize_t.

* enc/trans/iso2022.trans: follow the type change.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19354 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-15 02:11:50 +00:00
akr c767be3039 * transcode_data.h: return output functions ssize_t.
* transcode.c (transcode_restartable0): don't need to cast the result
  of output functions.

* enc/trans/newline.trans: follow the type change.

* enc/trans/escape.trans: ditto.

* enc/trans/utf_16_32.trans: ditto.

* enc/trans/iso2022.trans: ditto.

* enc/trans/japanese.trans: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19351 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-14 18:35:17 +00:00
akr a3c8c0adec * transcode_data.h: output function takes output buffer size.
* transcode.c: give output buffer size for output functions.

* enc/trans/newline.trans: follow the type change.

* enc/trans/escape.trans: ditto.

* enc/trans/utf_16_32.trans: ditto.

* enc/trans/iso2022.trans: ditto.

* enc/trans/japanese.trans: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19350 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-14 18:06:20 +00:00
akr 19416601a0 * include/ruby/oniguruma.h (OnigEncodingTypeST): add end argument for
left_adjust_char_head.
  (ONIGENC_LEFT_ADJUST_CHAR_HEAD): add end argument.
  (onigenc_get_left_adjust_char_head): ditto.

* include/ruby/encoding.h (rb_enc_left_char_head): add end argument.

* regenc.h (onigenc_single_byte_left_adjust_char_head): ditto.

* regenc.c (onigenc_get_right_adjust_char_head): follow the interface
  change.
  (onigenc_get_right_adjust_char_head_with_prev): ditto.
  (onigenc_get_prev_char_head): ditto.
  (onigenc_step_back): ditto.
  (onigenc_get_left_adjust_char_head): ditto.
  (onigenc_single_byte_code_to_mbc): ditto.

* re.c: ditto.

* string.c: ditto.

* io.c: ditto.

* regexec.c: ditto.

* enc/euc_jp.c: ditto.

* enc/cp949.c: ditto.

* enc/shift_jis.c: ditto.

* enc/gbk.c: ditto.

* enc/big5.c: ditto.

* enc/euc_tw.c: ditto.

* enc/euc_kr.c: ditto.

* enc/emacs_mule.c: ditto.

* enc/gb18030.c: ditto.

* enc/utf_8.c: ditto.

* enc/utf_16le.c: ditto.

* enc/utf_16be.c: ditto.

* enc/utf_32le.c: ditto.

* enc/utf_32be.c: ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19334 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-13 19:23:52 +00:00
akr 41d3a01486 * enc/trans/escape.trans: transcoder name renamed to use underscore.
* transcode.c: follow the renaming.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19311 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-12 17:30:07 +00:00
naruse d51b061565 * include/ruby/oniguruma.h (OnigCodePoint): unsigned long to unsigned int.
* include/ruby/encoding.h (rb_enc_codepoint): ditto.

* encoding.c (rb_enc_codepoint): signed int to unsigned int.

* encoding.c (rb_enc_ascget): ditto.

* string.c (rb_str_casecmp): ditto.

* string.c (enc_succ_alnum_char): ditto.

* string.c (rb_str_inspect): ditto.

* string.c (rb_str_upcase_bang): ditto.

* string.c (rb_str_downcase_bang): ditto.

* string.c (rb_str_capitalize_bang): ditto.

* string.c (rb_str_swapcase_bang): ditto.

* string.c (struct tr): ditto.

* string.c (trnext): ditto.

* string.c (tr_trans): ditto.

* string.c (tr_setup_table): ditto.

* string.c (tr_find): ditto.

* string.c (rb_str_delete_bang): ditto.

* string.c (rb_str_squeeze_bang): ditto.

* string.c (rb_str_count): ditto.

* string.c (rb_str_split_m): ditto.

* string.c (rb_str_each_line): ditto.

* string.c (rb_str_lstrip_bang): ditto.

* string.c (rb_str_rstrip_bang): ditto.

* string.c (rb_str_intern): ditto.

* dir.c (char_casecmp): ditto.

* sprintf.c (rb_str_format): ditto.

* enc/emacs_mule.c (mbc_to_code): to be 32bit clean.

* enc/emacs_mule.c (code_to_mbc): ditto.

* enc/gb18030.c (mbc_to_code): ditto.

* enc/gb18030.c (code_to_mbc): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-11 10:34:59 +00:00
akr 817a623d13 * enc/trans/newline.trans (rb_universal_newline): swap src_encoding
and dst_encoding.

* transcode.c (rb_econv_decorate_at): call get_transcoder_entry only
  once.
  (rb_econv_binmode): follow universal_newline change.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19276 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-09 16:06:54 +00:00
akr 6270ad5b7f * include/ruby/encoding.h (rb_econv_asciicompat_encoding): renamed
from rb_econv_stateless_encoding to apply stateless ASCII
  incompatible encodings such as UTF-16BE.

* io.c (make_writeconv): use rb_econv_asciicompat_encoding.

* transcode_data.h (rb_transcoder_asciicompat_type_t): renamed from
  rb_transcoder_stateful_type_t.
  (rb_transcoder): use rb_transcoder_asciicompat_type_t.

* transcode.c: follow the type change.
  (asciicompat_encoding_i): renamed from stateless_encoding_i.
  (rb_econv_asciicompat_encoding): renamed from
  rb_econv_stateless_encoding.
  (econv_s_asciicompat_encoding): method renamed.

* tool/transcode-tblgen.rb: follow the type change.

* enc/trans/utf_16_32.trans: follow the type change.
  rb_from_UTF_16BE to UTF-8 is asciicompat_decoder.
  rb_from_UTF_16LE to UTF-8 is asciicompat_decoder.
  rb_from_UTF_32BE to UTF-8 is asciicompat_decoder.
  rb_from_UTF_32LE to UTF-8 is asciicompat_decoder.
  UTF-8 to rb_to_UTF_16BE is asciicompat_encoder.
  UTF-8 to rb_to_UTF_16LE is asciicompat_encoder.
  UTF-8 to rb_to_UTF_32BE is asciicompat_encoder.
  UTF-8 to rb_to_UTF_32LE is asciicompat_encoder.

* enc/trans/newline.trans: follow the type change.  universal newline
  decoder is asciicompat_converter.

* enc/trans/escape.trans: follow the type change.

* enc/trans/iso2022.trans: ditto.

* enc/trans/japanese.trans: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19249 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-08 14:33:17 +00:00
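For illustration only: the method renamed in this entry is exposed in current Ruby releases as `Encoding::Converter.asciicompat_encoding`. The sketch below assumes that modern API. The wrapper name `asciicompat_for` is hypothetical.

```ruby
# Hypothetical wrapper (not part of the commit): look up the ASCII-compatible
# counterpart of an ASCII-incompatible encoding. Returns nil when the given
# encoding has no such counterpart.
def asciicompat_for(name)
  Encoding::Converter.asciicompat_encoding(name)
end

p asciicompat_for("UTF-16BE")      # a stateless ASCII-incompatible encoding
p asciicompat_for("ISO-2022-JP")   # a stateful encoding
p asciicompat_for("UTF-8")         # already ASCII compatible
```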
akr 7de9b10819 * enc/trans/iso2022.trans: upcase to iso-2022-jp.
* enc/emacs_mule.c: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19221 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-07 15:05:12 +00:00
akr 78350a0cc6 * enc/trans/iso2022.trans: stateless-iso-2022-jp is defined to avoid
undefined conversion error between iso-2022-jp and the corresponding
  stateless encoding.

* enc/emacs_mule.c: replicate emacs-mule as stateless-iso-2022-jp.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19220 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-07 14:49:05 +00:00
akr a3cfd3233e * enc/trans/escape.trans (hexstr): renamed from str1.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19218 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-07 11:03:24 +00:00
akr faa1387484 * enc/trans/escape.trans: use transcode_tblgen.
* tool/transcode-tblgen.rb: generate an empty line after str1.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19217 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-07 10:46:30 +00:00
akr c0bec2fae1 * transcode_data.h (STR1): defined for a string up to 255 bytes.
(STR1_BYTEINDEX): defined.
  (makeSTR1): defined.

* tool/transcode-tblgen.rb: generate STR1.

* transcode.c (transcode_restartable0): interpret STR1.

* enc/trans/escape.trans (fun_so_escape_xml_chref): removed.  STR1 is
  used instead.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19214 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-07 09:13:16 +00:00
akr 020e681eec * include/ruby/encoding.h (ECONV_XML_ATTR_CONTENT_ENCODER): defined.
(ECONV_STATEFUL_ENCODER_MASK): defined.
  (ECONV_XML_ATTR_QUOTE_ENCODER): defined.
  (ECONV_XML_ATTR_ENCODER): removed.

* enc/trans/escape.trans (rb_escape_xml_attr_content): defined.
  (rb_escape_xml_attr_quote): defined.
  (rb_escape_xml_attr): removed.

* io.c (NEED_WRITECONV): writeconv is required if supplemental
  converter is used.
  (make_writeconv): apply stateful encoder in writeconv.

* transcode.c: follow the constant change.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19209 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-07 03:13:29 +00:00
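For illustration only: in current Ruby releases the XML escape converters are reachable through the `xml:` option of `String#encode`. The sketch assumes that modern option. The helper names are hypothetical, and the C-level constants named in this entry are not used directly.

```ruby
# Hypothetical helpers (not part of the commit).

# Escape text content: &, < and > become character entity references.
def escape_xml_text(s)
  s.encode(xml: :text)
end

# Escape an attribute value: as above, plus double quotes are escaped and
# the whole result is wrapped in double quotes.
def escape_xml_attr(s)
  s.encode(xml: :attr)
end

puts escape_xml_text("a < b & c")
puts escape_xml_attr('say "hi"')
```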
akr 76b3063022 * include/ruby/encoding.h (ECONV_XML_TEXT_ENCODER): renamed from
ECONV_HTML_TEXT_ENCODER.
  (ECONV_XML_ATTR_ENCODER): renamed from ECONV_HTML_ATTR_ENCODER.

* enc/trans/escape.trans: follow the renaming.

* transcode.c: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19191 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-06 14:05:10 +00:00
akr 9f687d7be9 * enc/trans/escape.trans (fun_so_escape_html_attr): fix return type.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19177 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-06 04:43:39 +00:00
akr 091171a286 * enc/trans/escape.trans (escape_html_attr_init): new function.
(fun_so_escape_html_attr): new function.
  (escape_html_attr_finish): new function.
  (rb_escape_html_attr): use them to quote the converted result.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19173 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-06 03:20:51 +00:00
akr a10a5ddaac * enc/trans/escape.trans: new file.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19165 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-05 21:29:12 +00:00
akr debeb7db77 * enc/trans/newline.trans (universal_newline_finish): new function.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19152 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-05 11:16:28 +00:00
akr 64b633f069 * enc/trans/newline.trans: record newline types met in universal
newline decoder.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19136 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-04 14:20:14 +00:00
akr f6441bf61c * transcode_data.h (rb_transcoding): remove stateful field.
add state field.
  (TRANSCODING_STATE): defined.
  (rb_transcoder): add fields: state_size, state_init_func,
  state_fini_func.
  change rb_transcoding* argument to void*.

* transcode.c (transcode_restartable0): use TRANSCODING_STATE for
  first arguments of transcoder functions.
  (rb_transcoding_open_by_transcoder): initialize state field.
  (rb_transcoding_close): finalize state field.

* tool/transcode-tblgen.rb: provide state size/init/fini.

* enc/trans/newline.trans (universal_newline_init): defined.
  (fun_so_universal_newline): take void* as a state pointer.
  (rb_universal_newline): provide state size/init/fini.
  (rb_crlf_newline): ditto.
  (rb_cr_newline): ditto.

* enc/trans/iso2022.trans (iso2022jp_init): defined.
  (fun_si_iso2022jp_to_eucjp): take void* as a state pointer.
  (fun_so_iso2022jp_to_eucjp): ditto.
  (fun_so_eucjp_to_iso2022jp): ditto.
  (iso2022jp_reset_sequence_size): ditto.
  (finish_eucjp_to_iso2022jp): ditto.
  (rb_ISO_2022_JP_to_EUC_JP): provide state size/init/fini.
  (rb_EUC_JP_to_ISO_2022_JP): ditto.

* enc/trans/utf_16_32.trans (fun_so_from_utf_16be): take void* as a
  state pointer.
  (fun_so_to_utf_16be): ditto.
  (fun_so_from_utf_16le): ditto.
  (fun_so_to_utf_16le): ditto.
  (fun_so_from_utf_32be): ditto.
  (fun_so_to_utf_32be): ditto.
  (fun_so_from_utf_32le): ditto.
  (fun_so_to_utf_32le): ditto.
  (rb_from_UTF_16BE): provide state size/init/fini.
  (rb_to_UTF_16BE): ditto.
  (rb_from_UTF_16LE): ditto.
  (rb_to_UTF_16LE): ditto.
  (rb_from_UTF_32BE): ditto.
  (rb_to_UTF_32BE): ditto.
  (rb_from_UTF_32LE): ditto.
  (rb_to_UTF_32LE): ditto.

* enc/trans/japanese.trans (fun_so_eucjp2sjis): take void* as a state
  pointer.
  (fun_so_sjis2eucjp): ditto.
  (rb_eucjp2sjis): provide state size/init/fini.
  (rb_sjis2eucjp): provide state size/init/fini.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19096 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-03 14:12:06 +00:00
akr 4406629bd6 * enc/trans/japanese.trans: new file.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-03 11:09:25 +00:00
naruse 42a48c1e9d * enc/trans/make_transdb.rb: check $(srcdir)/enc/trans before
enc/trans.

* enc/trans/make_transdb.rb: keep names_t.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19081 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-03 08:57:18 +00:00
akr 5cea1b07f4 * enc/trans/make_transdb.rb: check foo.c only if foo.trans exists.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19068 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-02 11:57:49 +00:00
akr 797faf92d9 * enc/trans/make_transdb.rb: error message improved.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19067 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-02 10:20:11 +00:00
akr 298d66296d revert last commit.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19066 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-02 10:12:14 +00:00
akr 01d7d19077 * enc/trans/make_transdb.rb: error message improved.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19065 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-02 09:35:38 +00:00
usa 7f8e24d43f * enc/trans/utf_16_32.trans (from_UTF_8): rename from to_UTF_16BE
because it was not correct.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19063 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-02 05:06:05 +00:00
yugui 2e094d086a * enc/emacs_mule.c (svn:executable): dropped executable bit.
* enc/make_encdb.rb (svn:executable): ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19062 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-02 05:04:30 +00:00
akr a2ada174a5 * enc/trans/japanese_euc.trans: split from japanese.trans to avoid
compiler limitation.  reported by usa.

* enc/trans/japanese_sjis.trans: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19058 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-02 03:39:11 +00:00
akr c5759bfb5b * tool/transcode-tblgen.rb: define TRANSCODE_TABLE_INFO in generated
code.  use it in rb_transcoder.

* enc/trans/newline.trans: use TRANSCODE_TABLE_INFO.

* enc/trans/iso2022.trans: ditto.

* enc/trans/utf_16_32.trans: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19046 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-01 18:18:50 +00:00
akr 7908180df1 * tool/transcode-tblgen.rb: record offsets array as index of
byte_array to avoid relocation.

* transcode.c (transcode_restartable0): add byte_array to get offsets
  array.

* transcode_data.h (BYTE_LOOKUP_BASE): change return type to
  uintptr_t.
  (rb_transcoder): add fields: byte_array, word_array and word_size.

* enc/trans/newline.trans: follow rb_transcoder change.

* enc/trans/iso2022.trans: ditto.

* enc/trans/utf_16_32.trans: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19043 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-01 17:40:32 +00:00
akr 752e053a1d * transcode_data.h (BYTE_LOOKUP): change to uintptr_t array.
(BYTE_LOOKUP_BASE): follow the type change.
  (BYTE_LOOKUP_INFO): ditto.
  (PType): ditto.
  (rb_transcoding): ditto.

* tool/transcode-tblgen.rb: follow the type change.

* transcode.c: ditto.

* enc/trans/newline.trans: ditto.

* enc/trans/iso2022.trans: ditto.

* enc/trans/utf_16_32.trans: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19038 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-01 16:22:49 +00:00
naruse 3960c2f7a7 * enc/euc_jp.c (euc-jp-ms): euc-jp-ms is not EUC-JP not an alias of
eucJP-ms.

* enc/trans/japanese.trans (eucJP-ms): eucJP-ms is the correct
  name of the encoding in Ruby.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-01 08:24:55 +00:00
naruse a7cdd20204 * enc/trans/japanese.trans: fix mapping priority.
IBM extended takes priority over NEC selected IBM.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19019 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-01 06:16:42 +00:00
nobu c6c4ce81c2 * enc/depend: transdb.c may not present.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19015 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-01 05:37:04 +00:00
naruse 7ea3d11e22 * enc/trans/japanese.trans: fix Ruby 1.8 compatibility.
* enc/trans/japanese.trans: fix mapping priority. [ruby-dev:36068]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19014 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-01 05:27:48 +00:00
akr 8841969485 * tool/transcode-tblgen.rb (transcode_generated_code): defined for
generating table at once.
  (transcode_tblgen): returns an empty string.
  (transcode_generate_node): ditto.

* enc/trans/newline.trans: use transcode_generated_code.

* enc/trans/iso2022.trans: ditto.

* enc/trans/single_byte.trans: ditto.

* enc/trans/utf_16_32.trans: ditto.

* enc/trans/japanese.trans: ditto.

* enc/trans/korean.trans: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19006 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-31 17:35:00 +00:00
naruse 3d8043d062 * enc/trans/eucjp-tbl.rb: replace by previous Citrus maps.
* enc/trans/sjis-tble.rb: ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19004 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-31 16:25:38 +00:00
naruse 1893e19865 * tool/transcode-tblgen.rb: add table generator from Citrus maps.
* enc/trans/japanese.trans: use Citrus maps.

* enc/trans/CP: add maps from Citrus.

* enc/trans/JIS: ditto.

* test/ruby/test_transcode.rb: Shift_JIS and EUC-JP don't support
  IBM extended characters.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19003 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-31 16:23:04 +00:00
akr d1429c3cc6 * enc/trans/single_byte.trans (us_ascii_map): don't define 8bit bytes.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18982 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-31 07:37:10 +00:00
naruse 12e61c83cb * enc/emacs_mule.c: fix ctype.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18827 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-24 22:10:57 +00:00
akr 2c78899641 * configure.in (BUILTIN_TRANSSRCS): defined.
(BUILTIN_TRANSOBJS): defined.

* enc/Makefile.in (BUILTIN_TRANSES): defined.

* enc/make_encmake.rb (BUILTIN_TRANSES): defined.

* enc/depend: don't generate rules for builtin transcoders.

* common.mk (COMMONOBJS): add BUILTIN_TRANSOBJS.
  (enc.mk): pass BUILTIN_TRANSOBJS.
  (newline.c): new rule.
  (newline.$(OBJEXT)): new rule.
  (srcs): newline.c added.

* Makefile.in (BUILTIN_TRANSSRCS): defined.
  (BUILTIN_TRANSOBJS): defined.

* transcode.c (Init_transcode): call Init_newline.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18826 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-24 15:58:43 +00:00
naruse cc98d3e440 * enc/emacs_mule.c: support Emacs/Mule internal encoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18800 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-24 01:58:18 +00:00
akr c0d3881e0e * include/ruby/io.h (FMODE_TEXTMODE): defined.
* include/ruby/encoding.h (rb_econv_t): new field: flags.
  (rb_econv_binmode): declared.

* io.c (io_unread): text mode hack removed.
  (NEED_NEWLINE_DECODER): defined.
  (NEED_NEWLINE_ENCODER): defined.
  (NEED_READCONV): defined.
  (NEED_WRITECONV): defined.
  (TEXTMODE_NEWLINE_ENCODER): defined for windows.
  (make_writeconv): setup converter with TEXTMODE_NEWLINE_ENCODER for
  text mode.
  (io_fwrite): use NEED_WRITECONV.  character code conversion is
  disabled if fptr->writeconv_stateless is nil.
  (make_readconv): setup converter with
  ECONV_UNIVERSAL_NEWLINE_DECODER for text mode.
  (read_all): use NEED_READCONV.
  (appendline): use NEED_READCONV.
  (rb_io_getline_1): use NEED_READCONV.
  (io_getc): use NEED_READCONV.
  (rb_io_ungetc): use NEED_READCONV.
  (rb_io_binmode): OS-level text mode test removed.  call
  rb_econv_binmode.
  (rb_io_binmode_m): call rb_io_binmode_m with write_io as well.
  (rb_io_flags_mode): return mode string including "t".
  (rb_io_mode_flags): detect "t" for text mode.
  (rb_sysopen): always specify O_BINARY.

* transcode.c (rb_econv_open_by_transcoder_entries): initialize flags.
  (rb_econv_open): if source and destination encoding is
  both empty string, open newline converter.  last_tc will be NULL in
  this case.
  (rb_econv_encoding_to_insert_output): last_tc may be NULL now.
  (rb_econv_string): ditto.
  (output_replacement_character): ditto.
  (transcode_loop): ditto.
  (econv_init): ditto.
  (econv_inspect): ditto.
  (rb_econv_binmode): new function.




git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18780 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-22 16:44:00 +00:00
akr 7a0bea4fd3 * include/ruby/encoding.h (rb_econv_t): add fields: in_buf_start,
in_data_start, in_data_end, in_buf_end and last_trans_index.
  (rb_econv_output): removed.
  (rb_econv_insert_output): declared.
  (rb_econv_encoding_to_insert_output): declared.

* enc/trans/newline.trans (rb_universal_newline): stateful_type
  changed.

* transcode.c (transcode_restartable0): initialize inchar_start,
  tc->recognized_len and next_table at beginning of the loop.
  (rb_econv_open_by_transcoder_entries): initialize new fields.
  (rb_econv_open): setup last_trans_index.
  (trans_sweep): last out_buf_start can be non-NULL now.
  (rb_econv_convert): check last out_buf_start and in_buf_start at
  first.
  (rb_econv_output_with_destination_encoding): removed.
  (econv_just_convert): removed.
  (rb_econv_output): removed.
  (econv_primitive_output): method removed.
  (rb_econv_encoding_to_insert_output): new function.
  (allocate_converted_string): new function.
  (rb_econv_insert_output): new function.
  (econv_primitive_insert_output): new method.
  (output_replacement_character): use rb_econv_insert_output.  unused
  arguments removed.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18654 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-16 05:32:42 +00:00
akr a55e167b68 * transcode_data.h (rb_transcoder_stateful_type_t): defined.
(rb_transcoder): add field: stateful_type.

* tool/transcode-tblgen.rb: generate stateful_type field as
  stateless_converter.

* enc/trans/iso2022.trans: follow rb_transcoder change.

* enc/trans/newline.trans: ditto.

* enc/trans/utf_16_32.trans: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18650 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-15 23:13:01 +00:00
akr 74a36d5d1f * include/ruby/encoding.h (rb_econv_output): declared.
* transcode_data.h (rb_transcoder): add resetsize_func field.

* enc/trans/iso2022.trans (iso2022jp_reset_sequence_size): defined.
  (rb_EUC_JP_to_ISO_2022_JP): provide resetsize_func.

* tool/transcode-tblgen.rb: set NULL for resetsize_func.

* transcode.c (rb_econv_output): new function for inserting output.
  (output_replacement_character): use rb_econv_output.
  (transcode_loop): check return value of
  output_replacement_character.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18628 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-14 15:56:39 +00:00
akr 279e612973 fix variable name.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18558 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-13 05:53:20 +00:00
akr a14d5eb09b * enc/trans/newline.trans (rb_crlf_newline): new transcoder.
(rb_cr_newline): new transcoder.

* transcode.c (trans_open_i): one more extra room for input newline
  converter.
  (rb_trans_open): crlf newline and cr newline implemented.
  (Init_transcode): Encoding::Converter::CRLF_NEWLINE and
  Encoding::Converter::LF_NEWLINE defined.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18557 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-13 05:48:57 +00:00
akr 74a2a7bdbf * enc/trans/newline.trans: new file.
* transcode_data.h (rb_trans_t): add last_tc field.

* transcode.c (UNIVERSAL_NEWLINE): defined.
  (CRLF_NEWLINE): defined.
  (CR_NEWLINE): defined.
  (rb_trans_open_by_transcoder_entries): initialize last_tc.
  (trans_open_i): allocate one more room for newline converter.
  (rb_trans_open): universal newline implemented.
  (more_output_buffer): take max_output argument instead of ts.
  (output_replacement_character): take tc argument instead of ts.
  (transcode_loop): use last_tc field.
  (econv_init): add flags argument for rb_trans_open.
  (Init_transcode): Encoding::Converter::UNIVERSAL_NEWLINE defined.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18556 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-13 05:30:42 +00:00
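The universal newline conversion mentioned in the entry above has to carry state between input chunks, because a CR at the end of one chunk may be followed by an LF at the start of the next. The following is a minimal Python sketch of that idea only; the class name `UniversalNewline` and its `convert` method are hypothetical and do not mirror the C code in transcode.c or newline.trans.

```python
class UniversalNewline:
    """Hypothetical streaming converter: CRLF and lone CR both become LF."""

    def __init__(self):
        # True when the previous byte was CR and an LF has already been emitted for it.
        self.pending_cr = False

    def convert(self, chunk: bytes) -> bytes:
        out = bytearray()
        for b in chunk:
            if self.pending_cr:
                self.pending_cr = False
                if b == 0x0A:
                    # LF directly after CR: the pair was already output as one LF.
                    continue
            if b == 0x0D:
                out.append(0x0A)
                self.pending_cr = True
            else:
                out.append(b)
        return bytes(out)


conv = UniversalNewline()
joined = conv.convert(b"a\r") + conv.convert(b"\nb\rc\n")
```

Emitting the LF as soon as the CR is seen keeps the converter free of buffered output; the only state carried across calls is the single `pending_cr` flag.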
akr 79cd32fe3b * enc/trans/make_transdb.rb: *.erb.c is not used anymore.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18532 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-12 09:43:44 +00:00
usa 6d8c1483d6 * enc/depend: (transvpath_prefix): prefix has no extension, so replace
%s with "".



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18517 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-12 06:19:00 +00:00
nobu 13f1418d54 * enc/Makefile.in (.SUFFIXES): renamed to .trans.
* enc/make_encmake.rb: added --encs and --no-encs options.

* enc/depend (TRANSVPATH): fix for nmake.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18513 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-12 05:25:14 +00:00
akr 94ca2d94de * transcode_data.h (rb_transcoder): add resetstate_func field for
resetting a state of stateful encoding.

* enc/trans/iso2022.trans (rb_EUC_JP_to_ISO_2022_JP): specify
  finish_eucjp_to_iso2022jp for resetstate_func.

* tool/transcode-tblgen.rb: specify NULL for resetstate_func.

* transcode.c (output_replacement_character): call resetstate_func
  before appending the replacement character.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18503 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-11 22:44:23 +00:00
akr 2897632f16 * enc/trans/iso2022.trans: renamed from iso2022.erb.c.
* enc/trans/single_byte.trans: ditto.

* enc/trans/utf_16_32.trans: ditto.

* enc/trans/korean.trans: ditto.

* enc/trans/japanese.trans: ditto.

* enc/depend: follow the renaming.

* tool/build-transcode: ditto.




git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18488 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-11 07:39:52 +00:00
nobu e3d9fc76e6 * enc/Makefile.in (make-workdir): use MAKEDIRS.
* enc/depend: makes target directory before compile/link.

* tool/transcode-tblgen.rb: creates target directory.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-10 02:45:18 +00:00
akr 139234e1a0 * transcode_data.h (rb_transcoding): add fields for restartable
transcoding.
  (rb_transcoder): add max_input field.
  from_unit_length field is renamed to input_unit_length.

* tool/transcode-tblgen.rb: generate max_input field.

* enc/trans/iso2022.erb.c: follow rb_transcoder change.

* enc/trans/utf_16_32.erb.c: ditto.

* transcode.c (PARTIAL_INPUT): new constant.
  (transcode_char_start): new function.
  (transcode_result_t): new type.
  (transcode_restartable): new function.
  (more_output_buffer): new function.
  (transcode_loop): use transcode_restartable.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18452 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-09 06:02:01 +00:00
nobu 1727355a22 * enc/make_encdb.rb, enc/trans/make_transdb.rb: skip nonexistent
directory.  [ruby-dev:35802]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18449 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-08 21:37:43 +00:00
akr 6db9e77e64 * enc/trans/utf_16_32.erb.c (fun_so_from_utf_32le): implemented.
(fun_so_to_utf_32le): implemented.
  [ruby-dev:35777]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18447 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-08 16:08:50 +00:00
akr 2833d9f95d * transcode_data.h (rb_transcoder): from_unit_length field added.
from_utf8 field removed.

* tool/transcode-tblgen.rb: generate offsets range.
  follow rb_transcoder change.

* transcode.c (transcode_loop): don't use from_utf8.
  make invalid region from_unit_length wise.

* enc/trans/iso2022.erb.c: follow rb_transcoder and 
  transcode_generate_node change.

* enc/trans/utf_16_32.erb.c: follow rb_transcoder and
  transcode_generate_node change.
  explicit :invalid map removed.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18445 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-08 15:48:17 +00:00
nobu a456f022fc * enc/depend (TRANSCSRCS): needs rule_subst to apply.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18444 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-08 14:30:05 +00:00
nobu 2293171034 * common.mk (srcs-enc): renamed from transcodes.
* enc/Makefile.in (make-workdir): creates object directories.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18437 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-08 07:44:42 +00:00
nobu b917553e68 * common.mk (encdb.h): see both $(srcdir)/enc and enc.
* enc/make_encdb.rb: ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18435 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-08 07:15:52 +00:00
nobu 2f37e03736 * enc/trans/make_transdb.rb: fix for the case no transdirs are given.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18434 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-08 07:09:03 +00:00
nobu 73d3ff0074 * enc/trans/make_transdb.rb: converts only one transcoder for each
basename.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18433 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-08 07:03:35 +00:00
naruse b41a687fe6 * common.mk: see both $(srcdir)/enc/trans and enc/trans.
* enc/trans/make_transdb.rb: ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18422 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-07 19:24:47 +00:00
akr 1504652373 * transcode_data.h (rb_transcoding): new field "stateful".
(rb_transcoder): preprocessor and postprocessor field removed.
  change arguments of func_ii, func_si, func_io and func_so.
  new field "finish_func".

* tool/transcode-tblgen.rb: make FUNii, FUNsi and FUNio
  generatable.

* transcode.c (transcoder_lib_table): removed.
  (transcoder_table): change structure.
  (transcoder_key): removed because of the above structure change.
  (make_transcoder_entry): new function.
  (get_transcoder_entry): ditto.
  (rb_register_transcoder): follow the structure change.
  (declare_transcoder): ditto.
  (transcode_search_path): new function for breadth first search to
  find a list of converters.
  (transcode_search_path_i): new function.
  (transcode_dispatch_cb): ditto.
  (transcode_dispatch): use transcode_search_path.
  (transcode_loop): follow the argument change.
  (str_transcode): preprocessor and postprocessor stuff removed.

* enc/trans/iso2022.erb.c: new file.  ISO-2022-JP conversion
  re-implemented.

* enc/trans/japanese.erb.c: ISO-2022-JP stuff removed.

* enc/trans/utf_16_32.erb.c: follow argument change of FUNso.

[ruby-dev:35798]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18419 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-07 14:53:30 +00:00
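The entry above describes transcode_search_path as a breadth first search that finds a list of converters. As an illustration only, that search can be sketched in Python as follows; the function name `search_path` and the `CONVERTERS` table are hypothetical and are not taken from transcode.c.

```python
from collections import deque


def search_path(edges, source, target):
    """Return the shortest list of (from, to) conversion steps, or None if unreachable."""
    if source == target:
        return []
    previous = {source: None}  # encoding -> encoding it was reached from
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, ()):
            if nxt in previous:
                continue  # already reached by a path that is at least as short
            previous[nxt] = current
            if nxt == target:
                # Walk back from the target to rebuild the chain of steps.
                path = []
                node = target
                while previous[node] is not None:
                    path.append((previous[node], node))
                    node = previous[node]
                path.reverse()
                return path
            queue.append(nxt)
    return None


# Hypothetical table of directly available converters (assumption for illustration).
CONVERTERS = {
    "EUC-JP": ["UTF-8", "ISO-2022-JP"],
    "UTF-8": ["EUC-JP", "Shift_JIS", "UTF-16BE"],
    "Shift_JIS": ["UTF-8"],
    "UTF-16BE": ["UTF-8"],
}

steps = search_path(CONVERTERS, "Shift_JIS", "ISO-2022-JP")
```

Breadth first order means the first time the target is reached, the chain uses the fewest converters, which is why no cost bookkeeping is needed in this sketch.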
nobu 380e558f33 * enc/depend: add transdb.c.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18412 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-07 08:47:58 +00:00
nobu fa3283c7ba * enc/depend: removed needless explicit commands.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18410 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-07 05:17:34 +00:00
naruse ec85af4955 * enc/depend: enc/*.c is source but enc/trans/*.c is generated.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18400 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-06 20:19:25 +00:00
naruse 508b170dcd * enc/depend: for build in other than srcdir.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18396 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-06 19:44:50 +00:00
akr e22b3b773d * tool/transcode-tblgen.rb (transcode_generate_node): code
argument removed.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18395 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-06 13:42:21 +00:00
akr a016b30844 * enc/depend: transcode table generation depends on
tool/transcode-tblgen.rb.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18393 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-06 11:57:32 +00:00
akr fc841ddc66 useless comment removed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18389 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-06 11:44:08 +00:00
nobu 8db17b30d6 * common.mk (transdb.h): requires transcoders.
* enc/depend (srcs): target for transcoders.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18387 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-06 05:40:13 +00:00
usa ce074dcf42 * enc/depend: replace not only $(<:...) but also $<.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18386 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-06 05:04:36 +00:00
usa 19e2cdbd47 * win32/Makefile.sub (config.status): export BASERUBY.
* enc/depend: avoid GNU make'ism.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18384 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-06 04:55:39 +00:00
akr e90dd02529 * tool/transcode-tblgen.rb: show generating tables in verbose mode.
(transcode_generate_node): call ActionMap#generate_node with showing
  table name.

* enc/trans/utf_16_32.erb.c: use transcode_generate_node.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18382 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-05 22:47:44 +00:00
nobu 0031e09170 * common.mk (transcodes), tool/build-transcode: generates transcode
sources.

* enc/trans/{japanese,korean,single_byte,utf_16_32}.c: to be
  autogenerated now.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18376 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-05 18:56:42 +00:00
nobu 82e89f1237 * enc/depend: added rules for .c from .erb.c.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18374 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-05 18:15:52 +00:00
akr f694ec83e8 * tool/build-transcode: new file.
* tool/transcode-tblgen.rb: new file.

* enc/trans/make_transdb.rb: exclude *.erb.c.

* enc/depend: exclude *.erb.c.

* enc/trans/utf_16_32.erb.c: new file.

* enc/trans/single_byte.erb.c: new file.

* enc/trans/japanese.erb.c: new file.

* enc/trans/korean.erb.c: new file.

* enc/trans/iso-8859-2-tbl.rb: new file.

* enc/trans/iso-8859-3-tbl.rb: new file.

* enc/trans/iso-8859-4-tbl.rb: new file.

* enc/trans/iso-8859-5-tbl.rb: new file.

* enc/trans/iso-8859-6-tbl.rb: new file.

* enc/trans/iso-8859-7-tbl.rb: new file.

* enc/trans/iso-8859-8-tbl.rb: new file.

* enc/trans/iso-8859-9-tbl.rb: new file.

* enc/trans/iso-8859-10-tbl.rb: new file.

* enc/trans/iso-8859-11-tbl.rb: new file.

* enc/trans/iso-8859-13-tbl.rb: new file.

* enc/trans/iso-8859-14-tbl.rb: new file.

* enc/trans/iso-8859-15-tbl.rb: new file.

* enc/trans/eucjp-tbl.rb: new file.

* enc/trans/sjis-tbl.rb: new file.

* enc/trans/euckr-tbl.rb: new file.

* enc/trans/utf_16_32.c: regenerated.

* enc/trans/single_byte.c: regenerated.

* enc/trans/japanese.c: regenerated.

* enc/trans/korean.c: regenerated.

[ruby-dev:35730]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18373 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-05 12:32:13 +00:00
nobu b84d31c524 * transcode_data.h (TRANSCODE_ERROR): common transcode failure
exception, would be changed later.

* enc/trans/japanese.c (UNSUPPORTED_MODE): unsupported mode transition
  exception.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18363 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-05 03:34:52 +00:00
naruse 4de6475690 * enc/trans/japanese.c: add U+FF5E to EUC-JP.
[ruby-dev:35720] [ruby-dev:35722]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18315 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-01 16:26:23 +00:00
naruse 4fc3ef70e1 * enc/trans/japanese.c: add support for CP51932,
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-01 14:17:21 +00:00
naruse 35bac2083c * enc/trans/japanese.c: add U+FF0C,
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18309 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-01 14:03:14 +00:00
naruse 143b3339e6 * enc/trans/japanese.c (to_SHIFT_JIS_EF_BF_offsets): add U+FFF3,
U+FFF4, U+FFF5.

* enc/trans/japanese.c (to_SHIFT_JIS_EF_BF_infos): ditto.

* enc/trans/japanese.c (to_EUC_JP_EF_BF_infos): added.

* enc/trans/japanese.c (to_EUC_JP_EF_BF): added.

* enc/trans/japanese.c (to_EUC_JP_EF_infos): change size.
  [ruby-dev:35714]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18305 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-01 12:26:56 +00:00
nobu 94ed51b281 * transcode.c (transcode_loop): constified.
* transcode.c (str_transcode): rb_str_set_len() sets a delimiter.

* transcode_data.h (rb_transcoder): constified preprocessor and
  postprocessor input.

* enc/trans/japanese.c: ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18066 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-14 09:47:33 +00:00
naruse 5b7dfb7c13 * enc/shift_jis.c (code_is_ctype): HALF WIDTH KATAKANA is
a character.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17774 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-01 10:05:48 +00:00
shyouhei 0ef21e44e7 forgot to commit
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17773 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-01 09:50:30 +00:00
usa 40612c5bb5 * enc/make_encdb.h: always add ';' at the end of line.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17771 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-01 08:51:33 +00:00
shyouhei 53cffba8cb * enc/ascii.c: ISO C does not allow extra ';' outside of a
function

* enc/us_ascii.c: ditto.

* enc/utf_8.c: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17768 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-01 08:39:10 +00:00
mame d7baa40cac * enc/euc_jp.c (property_name_to_ctype): core dumped when sizeof(int)
differs from sizeof(long).

* enc/shift_jis.c (property_name_to_ctype): ditto.

* enc/unicode.c (onigenc_unicode_property_name_to_ctype): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17381 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-17 13:06:34 +00:00
nobu fc0babb4c2 * enc/depend (clean): remove build directories.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17060 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-09 12:17:29 +00:00
nobu ce29c17877 * lib/mkmf.rb (configuration): set flags.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17058 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-09 12:14:39 +00:00
mame 65670f9400 * enc/iso_8859_5.c: Large omicron should lowercase to small omicron.
* test/ruby/test_big5.rb, test/ruby/test_cp949.rb,
  test/ruby/test_euc_jp.rb, test/ruby/test_euc_kr.rb,
  test/ruby/test_euc_tw.rb, test/ruby/test_gb18030.rb,
  test/ruby/test_gbk.rb, test/ruby/test_iso_8859.rb,
  test/ruby/test_koi8.rb, test/ruby/test_shift_jis.rb,
  test/ruby/test_windows_1251.rb: new tests for encoding.

* test/ruby/test_utf16.rb, test/ruby/test_utf32.rb,
  test/ruby/test_regexp.rb: add tests.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16759 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-02 13:30:38 +00:00
naruse 5397015d2e * enc/gb18030.c (gb18030_code_to_mbc): add 0x80000000
for 4bytes character.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16739 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-01 19:36:28 +00:00
naruse 9c13fc7d89 * enc/gb18030.c (gb18030_mbc_to_code): mask by 0x7FFFFFFF
because OnigCodePoint will be used as 32bit signed int.
  Masking by 0x7FFFFFFF is ok on GB18030;
  Minimum 4bytes character is 0x81308130.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16737 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-01 18:29:08 +00:00
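The two gb18030.c entries above rely on the fact that every 4-byte GB18030 sequence starts with a byte of at least 0x81, so the top bit of the packed 32-bit value is always set. It can therefore be dropped when the value must fit a signed 32-bit integer and restored when converting back. A minimal Python sketch of that round trip, covering only the 4-byte case; the function names are hypothetical and the real C functions also handle 1-byte and 2-byte characters:

```python
SIGN_MASK = 0x7FFFFFFF   # keeps the low 31 bits
TOP_BIT = 0x80000000     # bit restored when converting back


def four_byte_to_code(data: bytes) -> int:
    """Pack a 4-byte sequence into a code point that fits a signed 32-bit int."""
    if len(data) != 4:
        raise ValueError("this sketch handles 4-byte sequences only")
    return int.from_bytes(data, "big") & SIGN_MASK


def code_to_four_byte(code: int) -> bytes:
    """Inverse of four_byte_to_code: put the top bit back and unpack."""
    return (code | TOP_BIT).to_bytes(4, "big")


smallest = bytes.fromhex("81308130")
code = four_byte_to_code(smallest)
```

Because the masked bit is known to be 1 for every 4-byte sequence, no information is lost by the mask.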
naruse 0682fab6a2 * enc/utf_16{be,le}.c (utf16{be,le}_code_to_mbc):
fix codepoint to bytes.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16717 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-31 10:14:38 +00:00
nobu 6a734c810c * common.mk (prelude.c): simply depends on PREP. [ruby-dev:34877]
* enc/make_encdb.rb, enc/trans/make_transdb.rb: ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16703 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-30 03:18:45 +00:00
naruse d6025a3be4 * enc/utf_8.c: add UTF8-MAC (UTF-8-MAC).
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16697 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-29 22:12:57 +00:00
usa 22088e3423 * enc/trans/japanese.c (to_SHIFT_JIS_EF_infos): typo.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16661 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-28 06:54:18 +00:00
naruse 3ab17047f5 * enc/trans/japanese.c: add workaround for Unicode to CP932.
U+2015->0x815C, U+2225->0x8161, U+FF0D->0x817C, U+FF3C->0x815F,
  U+FF5E->0x8160, U+FFE0->0x8191, U+FFE1->0x8192, U+FFE2->0x81CA

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16657 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-28 04:35:59 +00:00
mame 0a2053713b * enc/trans/utf_16_32.c (fun_so_to_utf_16be, fun_so_to_utf_16le): add
parentheses to remove warnings of gcc.

* io.c (rb_io_getc): remove unused variables.

* compile.c (NODE_NEXT, NODE_REDO): remove unused labels.

* ext/nkf/nkf.c (rb_nkf_convert): remove unused variables.

* ext/syck/rubyext.c (syck_resolver_initialize,
  syck_resolver_detect_implicit, syck_emitter_emit): remove unused
  variables.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16061 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-17 13:22:40 +00:00
nobu 7dc26509c6 * common.mk (prelude.c): depends on enc/prelude.rb.
* enc/prelude.rb: fixed initial library names.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15922 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-08 01:30:31 +00:00
nobu 1369cfd16e * encoding.c (enc_init_db): moved to enc/encdb.c.
* transcode.c (init_transcoder_table): moved to enc/trans/transdb.c.

* enc/depend (enc/encdb.o enc/trans/transdb.o): depend on
  corresponding headers.

* common.mk (COMMONOBJS): moved transcode.o from OBJS


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-07 06:51:33 +00:00
duerst 2e7815dd80 Sun Mar 16 18:07:07 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* enc/trans/utf_16_32.c: bug fix (some invalid UTF-8 sequences
  were legal)

* test/ruby/test_transcode.rb: test for above bug



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15786 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-16 09:09:53 +00:00
nobu 9d014dc254 * ext/extmk.rb, enc/make_encmake.rb: load current mkmf.rb even if
cross-compiling.

* ext/extmk.rb, enc/make_encmake.rb, lib/mkmf.rb: need to be 1.8
  compatible for cross-compiling.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15616 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-26 18:56:00 +00:00
nobu 80e81d283d * enc/{depend,make_encdb.rb,trans/make_transdb.rb}: sort in alpha-numeric order.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15567 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-22 00:07:23 +00:00
naruse 8984fa6742 * enc/{euc_jp.c,gbk.c,iso_8859_1.c,iso_8859_11.c,iso_8859_13.c,
iso_8859_2.c,iso_8859_6.c,iso_8859_7.c,iso_8859_8.c,iso_8859_9.c,
  shift_jis.c,windows_1251.c}: add document about encodings.

* enc/cp949.c: divided into new file.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15516 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 03:21:20 +00:00
naruse a2d85d61bd * enc/iso_8859_{4,13}.c: Windows-1257 is replica of ISO-8859-13.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15495 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-15 20:55:27 +00:00
naruse a8739621cf * lib/uri/generic.rb: revert r15442. 2nd argument of String#sub parses
escapes. [ruby-dev:33726]

* bootstraptest/test_method.rb enc/depend instruby.rb lib/mkmf.rb
  mkconfig.rb: revert r15443. ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15456 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-13 07:26:52 +00:00
usa f6628871b5 * enc/depend: fix typo.
* lib/mkmf.rb: revert r15443. "\\1#{sep}\\2" is wrong if sep is ended
  with "\\".



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15455 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-13 02:21:25 +00:00
naruse a10ded3ba0 * bootstraptest/runner.rb, bootstraptest/test_method.rb, enc/depend,
instruby.rb, lib/mkmf.rb, lib/test/unit/util/procwrapper.rb,
mkconfig.rb, sample/test.rb, template/vm.inc.tmpl,
test/ruby/test_stringchar.rb: fixes around String#gsub.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15443 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-12 06:28:23 +00:00
naruse e22ff0c9b6 * enc/trans/korean.c: add support for CP949 by Park Ji-In. [ruby-dev:33626]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15393 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-07 06:05:32 +00:00
naruse 4f0083e45f * enc/trans/korean.c: add EUC-KR conversion support by Park Ji-In.
[ruby-dev:33621]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15385 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-06 19:40:11 +00:00
naruse 9ac5a0ca4d * enc/*.c: add GB12345, UCS-{2,4}{BE,LE}.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15341 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-30 08:35:03 +00:00
akr 44cfd58dc5 * enc/utf_16be.c (UTF16_IS_SURROGATE_FIRST): avoid branch.
(UTF16_IS_SURROGATE_SECOND): ditto.
  (UTF16_IS_SURROGATE): defined.
  (utf16be_mbc_enc_len): validation implemented.

* enc/utf_16le.c (UTF16_IS_SURROGATE_FIRST): avoid branch.
  (UTF16_IS_SURROGATE_SECOND): ditto.
  (UTF16_IS_SURROGATE): defined.
  (utf16le_mbc_enc_len): validation implemented.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15338 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-30 03:49:54 +00:00
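The "avoid branch" change in the entry above refers to testing the high byte of a UTF-16 code unit with a single mask-and-compare instead of a two-sided range check. The sketch below shows the idea in Python; the masks are the standard ones for the surrogate ranges 0xD800-0xDBFF and 0xDC00-0xDFFF, while the function names and the simplified length check are assumptions for illustration and not the actual macros or utf16be_mbc_enc_len.

```python
def is_surrogate_first(high_byte: int) -> bool:
    # 0xD8..0xDB: leading (high) surrogate
    return (high_byte & 0xFC) == 0xD8


def is_surrogate_second(high_byte: int) -> bool:
    # 0xDC..0xDF: trailing (low) surrogate
    return (high_byte & 0xFC) == 0xDC


def is_surrogate(high_byte: int) -> bool:
    # 0xD8..0xDF: any surrogate
    return (high_byte & 0xF8) == 0xD8


def utf16be_char_len(data: bytes):
    """Byte length of the first character, or None if invalid or truncated.

    Simplification: a real validator would report "needs more input"
    separately from "invalid".
    """
    if len(data) < 2:
        return None
    if not is_surrogate(data[0]):
        return 2
    if is_surrogate_first(data[0]) and len(data) >= 4 and is_surrogate_second(data[2]):
        return 4
    return None
```

Each mask test agrees with the corresponding range check for all 256 possible byte values, so the two forms are interchangeable.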
akr 0ba09d829c fix state definition.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15333 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 15:35:37 +00:00
akr 12e8b588ac * enc/euc_tw.c (euctw_mbc_enc_len): validation implemented.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15331 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 15:10:50 +00:00
akr 6e3391c866 * enc/euc_tw.c (euctw_islead): 0x8e is a leading byte.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15323 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 13:01:27 +00:00
naruse b9821b02a0 * enc/trans/make_transdb.rb: add for make transdb.h.
* dmytranscode.c: add for miniruby.

* enc/gbk.c (gbk_left_adjust_char_head, gbk_is_allowed_reverse_match):
  fix odd regexp match. [ruby-dev:33502]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15321 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 11:44:08 +00:00
naruse 7a8c02cd47 * add enc/trans/make_transdb.rb, dmytranscode.c
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15320 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 11:18:22 +00:00
naruse 74b254e833 * enc/trans/japanese.c (rb_to_Windows_31J): to 'Windows-31J'.
* common.mk: add rules for transdb.h.

* transcode.c (init_transcoder_table): use transdb.h.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15317 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 10:05:39 +00:00
naruse 19d9380b3d * enc/gbk.c (EncLen_gbk): too short. [ruby-dev:33497]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15315 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 08:55:19 +00:00
akr 86a9215bbf * enc/gb18030.c (gb18030_mbc_enc_len): validation implemented.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15313 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 08:38:21 +00:00
naruse 00a3c40c37 * enc/euc_kr.c: remove CP949.
* enc/euc_cn.c: remove CP936 and rename to gb2312.c

* enc/gb2312.c: GB2312 is preferred MIME name.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15309 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 04:41:41 +00:00
naruse fe15b86b9d * enc/gbk.c: add GBK, CP936 and CP949.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 04:26:30 +00:00
naruse a2b03f10dc * enc/gbk.c: add GBK, CP936 and CP949.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15307 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 03:55:45 +00:00
naruse 2f961c1f37 * enc/utf_7.h: add dummy encoding UTF-7 and its alias CP65000.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15291 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-28 08:57:40 +00:00
usa fee57bb8c8 * enc/utf_8.c: add alias CP65001.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15290 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-28 08:41:49 +00:00
akr ffbf8ab367 * enc/big5.c (big5_mbc_enc_len): validation implemented.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15289 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-28 06:33:57 +00:00
akr 5f9bc1779e * enc/euc_kr.c (euckr_mbc_enc_len): validation implemented.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15288 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-28 04:02:39 +00:00
nobu b2c5814afc * enc/trans/japanese.c (rb_from_Windows_31J, rb_to_Windows_31J):
provisional workaround for Windows-31J.  [ruby-dev:33320]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15188 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-23 13:14:31 +00:00
duerst ef3fdbca15 Tue Jan 22 17:52:52 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* enc/trans/utf_16_32.c: Streamline parentheses, add more
  'static' qualifiers.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15169 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-22 08:52:02 +00:00
duerst 38321fc0eb Mon Jan 21 19:42:42 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* transcode.c, enc/trans/utf_16_32.c, test/ruby/test_transcode.rb:
  added UTF-32BE and UTF-32LE conversions.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15156 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-21 10:41:59 +00:00
nobu 463af63468 * transcode.c (transcode_loop, str_transcoding_resize): use unsigned
char.  [ruby-dev:33232]

* transcode_data.h (rb_transcoding, rb_transcoder): removed callback
  parameters.

* enc/trans/japanese.c: ditto.

* enc/trans/utf_16_32.c: parenthesized bit-or operands.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15150 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-21 03:35:05 +00:00
nobu a8969e999a * transcode.c (transcode_dispatch): constified return value.
* transcode_data.h (rb_transcoding): include pointer to rb_transcoder
  and auxiliary data.

* transcode_data.h (rb_transcoder): all callback functions should have
  their own parameters.

* enc/trans/{japanese,single_byte}.c: constified.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15148 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-20 21:40:08 +00:00
duerst a9b15a4e0c Sun Jan 20 20:00:20 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* transcode.c, enc/trans/utf_16_32.c, test/ruby/test_transcode.rb:
  added UTF-16LE conversions.

* fixed changelog for last commit



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15144 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-20 11:00:24 +00:00
duerst 3d0c7bea4d Sun Jan 20 15:08:08 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* enc/trans/utf_16_32.c: new file, currently implementing
  UTF-16BE conversions only.

* test/ruby/test_transcode.rb: Added tests for UTF-16BE;
  made check_both_ways() use force_encoding differently.

* transcode_data.h, transcode.c: Support for more conversion
  functions.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15142 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-20 06:12:48 +00:00
naruse 9a1d7e4d01 * enc/make_encdb.rb: fix duplication check.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15135 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-19 20:15:13 +00:00
naruse 7b3781c60c * ascii.c: remove definition of replica KOI8-U.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15134 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-19 20:04:35 +00:00
naruse 6e1c3a0f54 * enc/koi8_u.c: added.
* regenc.c, enc/utf_8.c, enc/unicode.c, enc/gb18030.c: add ARG_UNUSED.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15130 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-19 15:37:06 +00:00
nobu 8b112c580c * enc/euc_cn.c: split from enc/euc_kr.c.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-17 15:17:21 +00:00
nobu 0052259d5e * common.mk (encdb.h): give output file name to make_encdb.rb.
* encoding.c (enc_table): simplified.

* encoding.c (enc_register_at): lazy loading.  [ruby-dev:33013]

* regenc.h (ENC_DUMMY): added.

* enc/make_encdb.rb: now emits macros only.

* enc/iso_2022_jp.h: split from encoding.c.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-17 14:56:22 +00:00
nobu 85e6dff165 * enc/shift_jis.c: newline at EOF.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15082 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-17 08:08:08 +00:00
nobu 9c1bf098e0 * enc/windows_1251.c: newline at EOF.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15080 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-16 09:38:01 +00:00
naruse 0a640a9386 * enc/*: add ARG_UNUSED.
* enc/koi8_u.c: added.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15069 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-15 14:54:40 +00:00
naruse 904572d2e5 * enc/utf_{16,32}{be,le}.c: remove some ARG_UNUSED. replace struct
OnigEncodingST by OnigEncoding.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15068 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-15 14:02:32 +00:00
naruse edc61cf4c1 * encoding.c (ENC_REGISTER): use &OnigEncoding*.
(ENCINDEX_UTF_8): renamed from ENCINDEX_UTF8.
  (rb_enc_init): use ENC_REGISTER.

* include/ruby/oniguruma.h (OnigEncodingUTF8, ONIG_ENCODING_UTF8):
  removed.

* enc/*.c: remove use of &encoding_*; use enc argument instead.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15067 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-15 13:36:18 +00:00
matz d2a377d747 * enc/utf_8.c: remove use of ONIG_ENCODING_UTF8 altogether; use
enc argument instead.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15066 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-15 09:47:58 +00:00
usa 648c0f7c80 * enc/utf_8.c (ONIG_ENCODING_UTF8): reverted.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15065 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-15 09:06:03 +00:00
matz a34288d947 * enc/utf_8.c (OnigEncodingDefine): encoding name should be kept
unchanged.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15063 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-15 09:00:48 +00:00
nobu 68adb6193a * enc/Makefile.in: ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15062 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-15 08:53:07 +00:00
nobu ad73c8b348 * enc/utf_8.c: renamed as IANA name.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15061 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-15 08:26:54 +00:00
matz d9ff499bf3 * re.c (rb_char_to_option_kcode): use rb_enc_find_index() instead
of using fixed index value.

* enc/Makefile.in (encsrcdir): make US-ASCII built-in.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15047 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-14 13:49:29 +00:00
matz 4d034f3477 * enc/us_ascii.c: wrong alias name: ANSI_X3.4-1986.
* rubytest.rb: add -I#{srcdir} to load encoding DLL.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15045 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-14 12:11:06 +00:00
naruse 0605d15f6a * encoding.c (rb_locale_encoding): return US-ASCII when charmap is nil.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15039 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-14 09:08:45 +00:00
duerst 5f31c7b548 Mon Jan 14 10:45:45 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* enc/ascii.c: Exchanged order of arguments for one ENC_ALIAS




git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15031 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-14 01:45:52 +00:00
naruse 5b46f99ce1 * enc/*.c: add replicas and aliases.
* enc/make_encdb.h: add duplicate and undefined check.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15028 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-13 20:46:00 +00:00
naruse 50bbc4e6ae * define replica encoding "CP949".
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15027 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-13 17:21:47 +00:00
naruse 8f15b8128c * include/ruby/oniguruma.h: remove ONIG_ENCODING_* and OnigEncoding*
which are not builtin.

* regenc.{c,h} (onigenc_mb2_code_to_mbclen, onigenc_mb4_code_to_mbclen):
  fix prototype.

* enc/big5.c, enc/euc_kr.c, enc/euc_tw.c, enc/gb18030.c,
  enc/koi8_r.c, enc/windows_1251.c: imported from Oniguruma.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15026 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-13 17:16:09 +00:00
naruse 21671b558c * enc/make_encdb.h: sort encoding names by original name.
* encoding.c, enc/*.c: define replicas and aliases.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15025 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-13 14:29:12 +00:00
nobu bb8ddbe847 * encoding.c (Init_Encoding): moved initialization from encdb.h.
* enc/make_encdb.rb (enc_name_list): constified.

* enc/make_encdb.rb (enc_init_db): moved some functions to encoding.c.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15023 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-13 09:41:50 +00:00
naruse 513d0ca7f6 * encoding.c (ENCINDEX_EUC_JP, ENCINDEX_SJIS): removed.
(rb_enc_init): EUC-JP and Shift_JIS are not builtin now.

* enc/Makefile.in: ditto.

* common.mk: ditto.

* ruby.c (proc_options): ditto.

* enc/shift_jis.c, enc/euc_jp.c: fixes for removal from builtin.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15016 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-13 02:51:15 +00:00
nobu 00fb802284 * encoding.c (enc_table): packed all enc_table stuff.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15015 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-13 01:21:42 +00:00
naruse 80a569906d * encoding.c (rb_enc_init): revert removing SJIS.
* enc/sjis.c: move to enc/shift_jis.c, to make encoding name equal to
  filename for convenience of loading lib.

* enc/shift_jis.c: moved from enc/sjis.c.

* common.mk: follows enc/shift_jis.c.

* enc/Makefile.in: ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15014 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-13 01:15:32 +00:00
nobu 9bded8aae9 * enc/make_encdb.rb: set properties.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15011 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-13 00:10:00 +00:00
matz e699dda504 * enc/make_encdb.rb: should work on Ruby 1.8. [ruby-dev:33069]
* common.mk (encdb.h): pass enc dir from outside to make_encdb.rb.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15010 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-12 16:55:33 +00:00
naruse 5b9739a832 * enc/make_encdb.rb: added. search enc/*.c and make encoding database.
* regenc.h (ENC_REPLICATE, ENC_ALIAS): added for defining replica
  encoding and encoding alias.

* encoding.c (rb_enc_init): move alias definitions to enc/*.c.
  (rb_enc_find_index): search original of replica and alias when no
  encoding library.
  (rb_enc_name_list, rb_enc_aliases_enc_i, rb_enc_aliases_str_i,
   rb_enc_aliases, Encoding.name_list, Encoding.aliases): added.
  (Init_Encoding): init encdb.

* enc/ascii.c, enc/us_ascii.c, enc/euc_jp.c, enc/sjis.c:
  add replica encoding and encoding alias definitions.

* common.mk (dist-clean-local): add rule to remove encdb.h.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15007 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-12 16:03:51 +00:00
naruse fdeb4b1384 * enc/Makefile.in (BUILTIN_ENCS): UTF-{16,32}{BE,LE} are not builtin.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14958 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-08 16:58:31 +00:00
naruse ed540e8bdf * encoding.c, Makefile.in, include/ruby/oniguruma.h,
enc/Makefile.in: fix rules for UTF-{16,32}{BE,LE}.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14956 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-08 13:35:24 +00:00
nobu cc22700b90 * enc/utf_{16,32}{be,le}.c: renamed to match with encoding names.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-08 07:27:53 +00:00
nobu aab064f0dc * enc/utf_{16,32}{be,le}.c: renamed to match with encoding names.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14948 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-08 07:27:43 +00:00
usa ecf8b1c807 * enc/utf{16,32}_{be,le}.c: use &OnigEncodingName(*) instead of
ONIG_ENCODING_*.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14947 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-08 06:48:49 +00:00
nobu dca4de6838 * regenc.c (onigenc_strlen_null, onigenc_str_bytelen_null): suppressed
warnings.

* regenc.h, enc/unicode.c (onigenc_unicode_ctype_code_range): added
  encoding argument.

* enc/utf{16,32}_{be,le}.c: added init functions.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14946 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-08 06:40:33 +00:00
nobu 4cc42da33f * enc/utf{16,32}_{be,le}.c: imported from Oniguruma 5.9.1.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14945 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-08 06:27:22 +00:00
akr ed74723af4 * enc/euc_jp.c: remove eucjp_ prefix. breakpoint can be specified as
euc_jp.c:mbc_enc_len.  avoid needless conflict by merge.

* enc/sjis.c: remove sjis_ prefix.

* enc/utf8.c: remove utf8_ prefix.

* enc/iso_8859_1.c: remove iso_8859_1_ prefix.

* enc/iso_8859_2.c: remove iso_8859_2_ prefix.

* enc/iso_8859_3.c: remove iso_8859_3_ prefix.

* enc/iso_8859_4.c: remove iso_8859_4_ prefix.

* enc/iso_8859_5.c: remove iso_8859_5_ prefix.

* enc/iso_8859_6.c: remove iso_8859_6_ prefix.

* enc/iso_8859_7.c: remove iso_8859_7_ prefix.

* enc/iso_8859_8.c: remove iso_8859_8_ prefix.

* enc/iso_8859_9.c: remove iso_8859_9_ prefix.

* enc/iso_8859_10.c: remove iso_8859_10_ prefix.

* enc/iso_8859_11.c: remove iso_8859_11_ prefix.

* enc/iso_8859_13.c: remove iso_8859_13_ prefix.

* enc/iso_8859_14.c: remove iso_8859_14_ prefix.

* enc/iso_8859_15.c: remove iso_8859_15_ prefix.

* enc/iso_8859_16.c: remove iso_8859_16_ prefix.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14877 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-03 17:57:48 +00:00
matz 52ed8c4edd * include/ruby/oniguruma.h: Oniguruma 5.9.1 merged.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14874 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-03 15:55:04 +00:00
akr a13c1148a9 * enc/us_ascii.c: add us_ascii_ prefix for functions to ease
setting breakpoint when debugging.

* enc/euc_jp.c: add eucjp_ prefix.

* enc/sjis.c: add sjis_ prefix.

* enc/iso_8859_1.c: add iso_8859_1_ prefix.

* enc/iso_8859_2.c: add iso_8859_2_ prefix.

* enc/iso_8859_3.c: add iso_8859_3_ prefix.

* enc/iso_8859_4.c: add iso_8859_4_ prefix.

* enc/iso_8859_5.c: add iso_8859_5_ prefix.

* enc/iso_8859_6.c: add iso_8859_6_ prefix.

* enc/iso_8859_7.c: add iso_8859_7_ prefix.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14856 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-02 20:06:58 +00:00
akr 40871d401f * enc/depend: dependency updated.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14834 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-01 16:20:56 +00:00
naruse e73a962a65 * enc/depend: replace spaces by tab
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14792 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-30 02:06:48 +00:00
naruse 6c2849dd46 * configure.in: rm largefile.h.
* common.mk: clean golf, conf*, preludes, and so on.

* enc/depend: silent and ignore error for rm.

* enc/Makefile.in: should define prefix and exec_prefix.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14791 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-30 02:00:59 +00:00
nobu 1644d3f073 * enc/Makefile.in (DLDFLAGS): same as for extensions. [ruby-core:14567]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14785 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-29 16:43:59 +00:00
duerst 793e9423cd Fri Dec 28 01:55:04 2007 Martin Duerst <duerst@it.aoyama.ac.jp>
* transcode.c (transcode_dispatch): reverted some of the changes
          in r14746.

	* transcode.c, enc/trans/single_byte.c: Added conversions to/from
	  US-ASCII and ASCII-8BIT (using data tables).

	* enc/trans/single_byte.c: Some spacing/ordering changes due to
	  automatic data file generation.

	* transcode_data.h, transcode.c: Preliminary code for using
	  micro-conversion functions.

	* test/ruby/test_transcode.rb: Added some tests for US-ASCII and
	  ASCII-8BIT conversions.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14766 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-28 09:26:55 +00:00
akr 173f1e1563 * lib/weakref.rb, lib/irb/ruby-lex.rb, lib/irb/lc/error.rb, enc/trans/japanese.c:
change "illegal" to "invalid" in contexts that are not about
  breaking a law.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14736 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-27 08:58:03 +00:00
nobu 7489c4d93e * enc/trans/japanese.c (rb_{from,to}_{SHIFT_JIS,EUC_JP}): inverted
from_encoding and to_encoding.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14684 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-25 07:51:10 +00:00
nobu c90dbedbb1 * enc/trans/japanese.c (rb_to_EUC_JP): fixed typo.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14682 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-25 07:37:15 +00:00
usa ec579454bb * enc/trans/single_byte.c (Init_single_byte): renamed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14668 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-25 06:09:05 +00:00
nobu b7db9036be * common.mk (COMMONOBJS): transcode_data_*.c moved under enc/trans.
* transcode_data.h (rb_transcoding, rb_transcoder): prefixed.

* transcode.c (rb_register_transcoder, rb_declare_transcoder): split
  declaration and registration.  [ruby-dev:32704]

* transcode.c (transcode_dispatch): autoload pre-declared transcoder.

* transcode.c (str_transcode): use rb_define_dummy_encoding().

* transcode.c (Init_transcode): initialize transcoder tables.

* enc/trans/single_byte.c, enc/trans/japanese.c: moved from top.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14666 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-25 05:57:04 +00:00
nobu 8d292a08df * Makefile.in, configure.in, lib/mkmf.rb, */Makefile.sub: specify
compiled output file name explicitly.

* enc/Makefile.in, enc/depend: now makes the compiler put generated
  files under the directories corresponding to each source.
  enc/trans supported.

* enc/make_encmake.rb: evaluates depend file before Makefile.in so
  that the former can influence CONFIG.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14573 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-24 03:49:56 +00:00
nobu cd42707d86 * enc/depend, enc/make_encmake.rb: use erb.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14503 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-22 18:35:53 +00:00
akr bcb064eb0f * regenc.c (onigenc_ascii_is_code_ctype): moved from enc/ascii.c.
* regenc.h (onigenc_ascii_is_code_ctype): declared.

* enc/ascii.c: use onigenc_ascii_is_code_ctype.

* enc/us_ascii.c: new file for US-ASCII.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14463 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-22 05:38:33 +00:00
nobu 817a4e3c83 * common.mk (enc.mk): depends on $(RBCONFIG) instead of rbconfig.rb.
* encoding.c (Init_Encoding): ISO-8859-1 is no longer a replica.

* regenc.h (OnigEncodingDefine): names of extension and encoding can
  differ.

* enc/Makefile.in: always shared.

* enc/depend (deffile): should not upcase.

* enc/{ascii,euc_jp,sjis,utf8,iso_8859_{1..16}}.c: fix for Init.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14376 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 02:23:26 +00:00
nobu dc4d4b3923 * common.mk (enc.mk): depends on rbconfig.rb.
* regenc.h (OnigEncodingDefine): external encoding definition macro.

* enc/Makefile.in: fix for linking.

* enc/depend, enc/make_encmake.rb: fix for Windows.


* enc/{ascii,euc_jp,sjis,utf8,iso_8859_{1..16}}.c: renamed.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14358 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-20 08:07:56 +00:00
nobu e42fac7c06 * enc/iso_8859_{1..16}.c: renamed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14355 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-20 06:47:14 +00:00
nobu c677977267 * enc/iso8859_{1..16}.c: adjust for ruby.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14344 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-19 17:50:30 +00:00
nobu 42244c17f6 * enc/iso8859_{1..16}.c: imported from Oniguruma 5.9.0.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14342 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-19 17:28:32 +00:00
nobu 359115948a * enc/Makefile.in (RM): added.
* enc/depend (encs): sort in alpha-numeric order.

* enc/depend (clean, distclean): added.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14341 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-19 17:23:24 +00:00
nobu 874b367bdc * enc/depend: get rid of target expanded as empty for nmake.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14286 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-18 05:05:25 +00:00
nobu c611b6d0cc * configure.in (BUILTIN_ENCS): removed.
* common.mk (enc.mk): pass BUILTIN_ENCS from command line.

* enc/depend: ditto.

* enc/make_encmake.rb: ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14281 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 17:30:57 +00:00
nobu 6ed9bdd463 * common.mk (encs): added dependencies.
* enc/Makefile.in, enc/depend, enc/make_encmake.rb: moved serb code.

* lib/mkmf.rb (depend_rules): now takes content string, not file name.

* win32/enc-setup.mak: overrides default target.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14276 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 16:15:46 +00:00
nobu b2d9f1e9d0 * common.mk (encs): new target to compile external encodings.
* enc/Makefile.in: became a serb template.

* enc/make_encmake.rb: creates enc.mk from enc/Makefile.in using serb.

* lib/mkmf.rb (relative_from): moved from ext/extmk.rb.

* lib/mkmf.rb ($extmk): true if under the top source directory, not
  only ext.

* lib/mkmf.rb (depend_rules): extracted from create_makefile.

* tool/serb.rb (serb): split from tool/compile_prelude.rb.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14267 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 08:47:28 +00:00
nobu 5f431c49a4 * enc/depend: commit miss.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14265 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 08:19:29 +00:00
nobu 4cf13ffaef * configure.in (EXTERNAL_ENCOBJS, ENCSOS): removed.
* enc/Makefile.in (BUILTIN_ENCS): includes .c suffix.

* enc/depend: split from Makefile.in.

* {bcc32,win32,wince}/setup.mak (-encs-): extracts BUILTIN_ENCOBJS.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14264 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 08:17:50 +00:00
nobu f2bd108d8d * configure.in (enc/Makefile): add external encoding objects list.
* common.mk (BUILTIN_ENCOBJS): renamed from ENCOBJS.

* Makefile.in (BUILTIN_ENCOBJS): substituted by autoconf.

* enc/Makefile.in: new file to compile external encoding sources.

* encoding.c (rb_enc_find_index): auto-load external encoding objects
  as "ext/ENCODING_NAME".  [ruby-dev:32606]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14238 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-15 09:56:59 +00:00
akr 69406aad50 * encoding.c (rb_enc_precise_mbclen): new function for mbclen with
validation.

* include/ruby/encoding.h (rb_enc_precise_mbclen): declared.
  (MBCLEN_CHARFOUND): new macro.
  (MBCLEN_INVALID): new macro.
  (MBCLEN_NEEDMORE): new macro.

* include/ruby/oniguruma.h (OnigEncodingTypeST): replace mbc_enc_len
  by precise_mbc_enc_len.
  (ONIGENC_PRECISE_MBC_ENC_LEN): new macro.
  (ONIGENC_CONSTRUCT_MBCLEN_CHARFOUND): new macro.
  (ONIGENC_CONSTRUCT_MBCLEN_INVALID): new macro.
  (ONIGENC_CONSTRUCT_MBCLEN_NEEDMORE): new macro.
  (ONIGENC_MBCLEN_CHARFOUND): new macro.
  (ONIGENC_MBCLEN_INVALID): new macro.
  (ONIGENC_MBCLEN_NEEDMORE): new macro.
  (ONIGENC_MBC_ENC_LEN): use ONIGENC_PRECISE_MBC_ENC_LEN.

* enc/euc_jp.c: validation implemented.

* enc/sjis.c: ditto.

* enc/utf8.c: ditto.

* string.c (rb_str_inspect): use rb_enc_precise_mbclen for invalid
  encoding.
  (rb_str_valid_encoding_p): new method String#valid_encoding?.

* io.c (rb_io_getc): use rb_enc_precise_mbclen.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14119 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-06 09:28:26 +00:00
nobu 59609a4fba * enc/utf8.c (utf8_code_to_mbclen): 0xfe and 0xff are valid Unicode
code points, encoded as 2 bytes in UTF-8.  [ruby-core:12700]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13727 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 18:06:31 +00:00
nobu 72483cdcca * Makefile.in, */Makefile.sub (VPATH): add enc directory.
* common.mk (ENCOBJS): encoding objects.

* enc: directory for encodings.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13675 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-10 21:35:45 +00:00