naruse
6e6a28183a
* unicode.c (PROPERTY_NAME_MAX_SIZE): use MAX_WORD_LENGTH.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24677 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-26 17:01:10 +00:00
nobu
1af43ae867
* enc/unicode.c (onigenc_unicode_mbc_case_fold): balanced braces.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24658 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-26 00:48:49 +00:00
nobu
1fd7f2e57d
* enc/unicode/name2ctype.h: updated.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24657 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-25 21:54:03 +00:00
naruse
f1eff95745
Update Oniguruma's UnicodeData to 5.1.
...
* tool/enc-unicode.rb: added to generate name2ctype.kwd.
contributed by Run Paint Run Run [ruby-core:24775]
use as follows:
ruby19 tool/enc-unicode.rb enc/unicode/UnicodeData.txt \
enc/unicode/Scripts.txt > enc/unicode/name2ctype.kwd
* enc/unicode.c (CodeRanges): move definitions to name2ctype.h.
* enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd,
enc/unicode/name2ctype.src: updated to v5.1.
* enc/unicode/UnicodeData.txt, enc/unicode/Scripts.txt: added v5.1.
* Makefile.in: add rule to generate name2ctype.kwd from
UnicodeData.txt and Scripts.txt.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24651 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-25 16:15:38 +00:00
nobu
a7b920686a
* enc/unicode/name2ctype.h: split from enc/unicode.c and made a
...
perfect hash.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24613 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-21 08:01:09 +00:00
nobu
a606038c6a
* enc/utf_8.c (code_to_mbc): suppressed a warning.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24607 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-21 06:37:36 +00:00
nobu
e1c9ac6bd9
* enc/unicode.c (CodeRanges): initialized statically.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-19 02:32:49 +00:00
naruse
2b91cbbf11
* enc/Makefile.in (MKDIRS): revert r24525.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24538 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-14 08:20:13 +00:00
nobu
24c783e95e
* configure.in, Makefile.in (MAKEDIRS): used MKDIR_P instead of
...
as_mkdir_p. [ruby-dev:39063]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24525 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-13 07:20:16 +00:00
naruse
38107457a3
* enc/encdb.c (ENC_SET_BASE): fix typo. patch by ujihisa [ruby-dev:39004]
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24386 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-04 03:42:11 +00:00
naruse
8c658137d9
Be more strict for the Big5 series.
...
* enc/big5.c (EncLen_Big5): back to original Big5 table.
(EncLen_Big5_HKSCS): for Big5-HKSCS.
(trans): add the lead byte table for Big5-HKSCS.
(big5_mbc_enc_len): abstract function for Big5 series.
(big5_mbc_enc_len): for Big5.
(big5_hkscs_mbc_enc_len): for Big5-HKSCS.
(BIG5_HKSCS_P): added.
(BIG5_ISMB_FIRST): add routine for Big5-HKSCS.
(big5_hkscs): add for Big5-HKSCS.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24384 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-04 00:51:22 +00:00
naruse
b3d7273dc1
Add functions and macros for second encoding definitions.
...
* encoding.c (rb_enc_set_base): added for setting base encodings
by name. this is an internal function.
* template/encdb.h.tmpl: specify ENC_SET_BASE for second
encodings in each encoding file.
* enc/encdb.c (rb_enc_set_base): add a declaration.
(ENC_SET_BASE): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-04 00:50:59 +00:00
nobu
4fd615943e
* enc/big5.c: not executable.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24269 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-07-25 04:42:48 +00:00
naruse
a8951a5b3a
* enc/big5.c: Fix EncLen_BIG5 for Big5-HKSCS. see [ruby-core:24390]
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24267 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-07-24 16:15:53 +00:00
duerst
2886207584
* enc/trans/big5.trans, big5-hkscs-tbl.rb:
...
new Chinese BIG5-HKSCS transcoding (with Tatsuya Mizuno)
* test/ruby/test_transcode.rb: added tests for the above
(with Tatsuya Mizuno)
* enc/big5.c: Added BIG5-HKSCS as a replicate encoding of BIG5
(short term solution, needs more work; with Tatsuya Mizuno)
* tool/transcode-tblgen.rb: made 'pat' directly accessible in
class StrSet
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24264 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-07-24 10:26:18 +00:00
nobu
c030cf1975
* ruby.c (process_options), enc/prelude.rb: encdb and transdb are
...
extension libraries.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23813 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-06-22 05:41:51 +00:00
naruse
d9cf0f822f
* enc/trans/utf8_mac.trans: remove wrong optimization.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23686 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-06-13 18:55:55 +00:00
naruse
3abca796f4
Fix: DON'T move in_p because before in_p is replaced by buffered data.
...
* transcode.c: NOMAP is now a multibyte direct map.
* transcode.c: remove ASIS.
* transcode_data.h: ditto.
* tool/transcode-tb (ActionMap#generate_info): remove :asis.
* tool/transcode-tb (ActionMap#generate_info): add :nomap0.
* enc/trans/utf8_mac.trans: replace :asis by :nomap0.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23344 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-05-05 00:05:11 +00:00
naruse
f207f9fd51
* enc/trans/utf8_mac-tbl.rb: don't use Unicode escape.
...
* enc/trans/utf8_mac.trans: follow above.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23325 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-05-02 01:38:27 +00:00
nobu
8543ecee53
* enc/trans/utf8_mac.trans: get rid of a 1.9 feature for cross
...
compile.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23309 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-04-30 06:27:51 +00:00
naruse
80705b9fbf
Add new transcoder: CP51932 <-> CP50221.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23307 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-04-29 12:50:57 +00:00
naruse
d0a4f8ada9
* enc/trans/utf8_mac.trans: Add converter for UTF8-MAC.
...
* enc/trans/utf8_mac-tbl.rb: ditto.
* test/ruby/test_econv.rb: tests for above.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23296 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-04-26 14:21:43 +00:00
nobu
15265f8be6
* enc/depend (link_so): replaces $(TARGET) with basename of the
...
target. [ruby-talk:330286]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23035 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-03-22 21:51:18 +00:00
usa
39bc33d9a7
* enc/depend: extract compile rules to each target for VC++.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21892 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-30 05:13:22 +00:00
nobu
c938de20cd
* common.mk (distclean-enc, realclean-enc): do not call clean of
...
enc.mk twice or more.
* enc/depend (cleanobjs): added deffile.
* lib/mkmf.rb (create_makefile): removes deffile at clean instead
of distclean.
* win32/Makefile.sub (miniruby, LIBRUBY_SO): removes lib and exp
files.
* win32/Makefile.sub (clean, distclean): have moved to common.mk.
* win32/rmdirs.bat: omits `not empty' message.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21790 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-27 02:03:54 +00:00
nobu
e24346d6c6
* enc/trans/gb18030.trans: get rid of a 1.9 feature for cross
...
compile. [ruby-core:21345]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21512 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-14 15:06:19 +00:00
duerst
82c673d3a1
* enc/trans/gb18030.trans, gb18030-tbl.rb:
...
new Chinese GB18030 transcoding (from Yoshihiro Kambayashi)
* test/ruby/test_transcode.rb: added tests for the above
(from Yoshihiro Kambayashi)
* transcode_data.h, transcode.c, tool/transcode_tblgen.rb:
added support for GB18030-specific 4-byte sequences
(with Yoshihiro Kambayashi)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21509 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-14 11:12:30 +00:00
nobu
e668e36b49
* template/{encdb,transdb}.h.tmpl: moved enc/make_encdb.rb and
...
enc/trans/make_transdb.rb using tool/generic_erb.rb.
* common.mk (encdb.h, transdb.h): generated from the above templates.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21490 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-13 09:05:29 +00:00
nobu
4cb8d3316a
* enc/trans/make_transdb.rb (converters): should not depend on the
...
hash order for cross compile.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21489 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-13 08:28:14 +00:00
duerst
deeade6f3e
* enc/trans/gbk.trans, gbk-tbl.rb:
...
new Chinese GBK transcoding (from Yoshihiro Kambayashi)
* test/ruby/test_transcode.rb: added tests for the above
(from Yoshihiro Kambayashi)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21315 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-04 09:12:14 +00:00
duerst
fecce9e5e5
* test/ruby/test_transcode.rb: added tests for GB2312
...
(from Yoshihiro Kambayashi)
* enc/trans/chinese.trans: set valid byte patterns for
GB2312 and GB12345
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21314 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-04 08:55:04 +00:00
duerst
3bc30f0b73
* enc/trans/big5.trans, big5-tbl.rb:
...
new Chinese Big5 transcoding (from Yoshihiro Kambayashi)
* test/ruby/test_transcode.rb: added tests for the above
(from Yoshihiro Kambayashi)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21313 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-04 08:40:26 +00:00
naruse
1240916075
change encoding name.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21285 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-03 14:12:39 +00:00
naruse
2920aaa2d1
* enc/trans/chinese.trans: added for transcoding EUC-CN and GB12345.
...
* enc/trans/GB/: ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21283 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-01-03 06:31:05 +00:00
duerst
a28fdecda7
* enc/trans/single_byte.trans, cp850-tbl.rb, cp852-tbl.rb,
...
cp855-tbl.rb, koi8-r-tbl.rb, koi8-u-tbl.rb, tis-620-tbl.rb:
new single-byte transcodings (from Yoshihiro Kambayashi)
* test/ruby/test_transcode.rb: added tests for the above
(from Yoshihiro Kambayashi), small cosmetic fixes
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20599 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-12-09 09:39:25 +00:00
nobu
a6d8d84a9e
* enc/depend (clean-srcs): split out from clean.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-12-08 15:17:52 +00:00
nobu
8e6ad88737
* enc/depend (LIBS): fixed for disable-shared. [ruby-dev:37103]
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20241 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-11-17 09:05:19 +00:00
duerst
831e804388
* enc/trans/single_byte.trans, macgreek-tbl.rb, macroman-tbl.rb,
...
macromania-tbl.rb, macturkish-tbl.rb, macukraine-tbl.rb,
ibm437-tbl.rb, ibm852-tbl.rb, ibm855-tbl.rb, ibm857-tbl.rb,
ibm860-tbl.rb, ibm861-tbl.rb, ibm862-tbl.rb, ibm863-tbl.rb,
ibm865-tbl.rb, ibm866-tbl.rb, ibm869-tbl.rb, ibm775-tbl.rb:
new single-byte transcodings (from Yoshihiro Kambayashi)
* test/ruby/test_transcode.rb: added tests for the above
(from Yoshihiro Kambayashi)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20178 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-11-11 05:26:20 +00:00
duerst
d37df9fb13
* enc/trans/single_byte.trans, maccroatioan-tbl.rb,
...
maccyrillic-tbl.rb, maciceland-tbl.rb: new single-byte
transcodings (from Yoshihiro Kambayashi)
* test/ruby/test_transcode.rb: added tests for the above
(from Yoshihiro Kambayashi)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20075 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-31 09:07:21 +00:00
duerst
6fd14ccae5
* enc/trans/single_byte.trans: refactoring to make it easier
...
to add more transcodings (with Yoshihiro Kambayashi)
* enc/trans/iso-8859-1-tbl.rb: new file to avoid having to
treat ISO-8859-1 as special
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20054 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-30 05:47:01 +00:00
nobu
da6300e8f8
* enc/us_ascii.c (us_ascii_mbc_enc_len): made static. a patch by
...
Tadashi Saito <shiba AT mail2.accsnet.ne.jp> at [ruby-dev:36916]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19929 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-24 19:00:35 +00:00
duerst
b014f1bc02
* enc/trans/single_byte.trans: adding WINDOWS-wwww encodings
...
(wwww = 874/1250/1251/1253/1254/1255/1256/1257)
(contributed by Yoshihiro Kambayashi)
* enc/trans/windows-wwww-tbl.rb: 8 new files
(contributed by Yoshihiro Kambayashi)
* test/ruby/test_transcode.rb: added test_windows_wwww
(contributed by Yoshihiro Kambayashi)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19846 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-19 09:15:37 +00:00
duerst
7adbfbb793
* tool/transcode-tblgen.rb: added set_valid_byte_pattern
...
to reduce coupling between table generation script and
specific encodings.
* enc/trans/single_byte.trans: using set_valid_byte_pattern
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19831 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-18 08:10:57 +00:00
nobu
7485e91f76
* common.mk, enc/depend (enc, trans): targets for sources.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19799 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-16 05:34:25 +00:00
akr
b968fa97f6
* enc/trans/single_byte.trans (transcode_tblgen_singlebyte): renamed
...
from transcode_tblgen_windows.
(transcode_tblgen_iso8859): use transcode_tblgen_singlebyte.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19780 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-14 11:33:17 +00:00
duerst
48a303c027
* enc/trans/single_byte.trans: added windows-1252
...
* enc/trans/windows-1252-tbl.rb: new file
(contributed by Yoshihiro Kambayashi)
* tool/transcode-tblgen.rb: listed windows-1252 as '1byte'
* test/ruby/test_transcode.rb: added test_windows_1252
(contributed by Yoshihiro Kambayashi)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19778 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-14 04:37:10 +00:00
akr
081c802cb9
* grapheme cluster implementation reverted. [ruby-dev:36375]
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19417 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-18 12:53:25 +00:00
akr
b3d772643e
fix typos.
...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19390 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-16 16:59:08 +00:00
akr
a67d4fa01c
* include/ruby/oniguruma.h (OnigEncodingTypeST): add precise_ret
...
argument for mbc_to_code.
(ONIGENC_MBC_TO_CODE): provide NULL for precise_ret.
(ONIGENC_MBC_PRECISE_CODEPOINT): defined.
* include/ruby/encoding.h (rb_enc_mbc_precise_codepoint): defined.
* regenc.h (onigenc_single_byte_mbc_to_code): precise_ret argument
added.
(onigenc_mbn_mbc_to_code): ditto.
* regenc.c (onigenc_single_byte_mbc_to_code): precise_ret argument
added.
(onigenc_mbn_mbc_to_code): ditto.
* string.c (count_utf8_lead_bytes_with_word): removed.
(str_utf8_nth): removed.
(str_utf8_offset): removed.
(str_strlen): UTF-8 codepoint oriented optimization removed.
(rb_str_substr): ditto.
(enc_succ_char): use rb_enc_mbc_precise_codepoint.
(enc_pred_char): ditto.
(rb_str_succ): ditto.
* encoding.c (rb_enc_ascget): check length with
rb_enc_mbc_precise_codepoint.
(rb_enc_codepoint): use rb_enc_mbc_precise_codepoint.
* regexec.c (string_cmp_ic): add text_end argument.
(match_at): check end of character after exact string matches.
* enc/utf_8.c (graphme_table): defined for extended grapheme cluster
boundary.
(grapheme_cmp): defined.
(get_grapheme_properties): defined.
(grapheme_boundary_p): defined.
(MAX_BYTES_LENGTH): defined.
(comb_char_enc_len): defined.
(mbc_to_code0): extracted from mbc_to_code.
(mbc_to_code): use mbc_to_code0.
(left_adjust_combchar_head): defined.
(utf_8): use an extended grapheme cluster as a unit.
* enc/unicode.c (onigenc_unicode_mbc_case_fold): use
ONIGENC_MBC_PRECISE_CODEPOINT to extract codepoints.
(onigenc_unicode_get_case_fold_codes_by_str): ditto.
* enc/euc_jp.c (mbc_to_code): follow mbc_to_code field change.
use onigenc_mbn_mbc_to_code.
* enc/shift_jis.c (mbc_to_code): ditto.
* enc/emacs_mule.c (mbc_to_code): ditto.
* enc/gbk.c (gbk_mbc_to_code): follow mbc_to_code field and
onigenc_mbn_mbc_to_code change.
* enc/cp949.c (cp949_mbc_to_code): ditto.
* enc/big5.c (big5_mbc_to_code): ditto.
* enc/euc_tw.c (euctw_mbc_to_code): ditto.
* enc/euc_kr.c (euckr_mbc_to_code): ditto.
* enc/gb18030.c (gb18030_mbc_to_code): ditto.
* enc/utf_32be.c (utf32be_mbc_to_code): follow mbc_to_code field
change.
* enc/utf_16be.c (utf16be_mbc_to_code): ditto.
* enc/utf_32le.c (utf32le_mbc_to_code): ditto.
* enc/utf_16le.c (utf16le_mbc_to_code): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19389 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-16 16:48:05 +00:00
akr
0675246ba6
* transcode_data.h (rb_transcoder): resetsize_func and resetstate_func
...
also return ssize_t.
* enc/trans/iso2022.trans: follow the type change.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19354 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-15 02:11:50 +00:00