Commit graph

144 commits

Author SHA1 Message Date
naruse ed540e8bdf * encoding.c, Makefile.in, include/ruby/oniguruma.h,
enc/Makefile.in: fix rules for UTF-{16,32}{BE,LE}.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14956 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-08 13:35:24 +00:00
akr 4e4d4331ca * include/ruby/oniguruma.h (OnigEncodingType): new member
ruby_encoding_index to avoid linear search in rb_enc_to_index.

* include/ruby/encoding.h (rb_enc_to_index): macro defined to use
  ruby_encoding_index.

* encoding.c (rb_enc_to_index): removed.
  (enc_register_at): initialize ruby_encoding_index member.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14931 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-07 07:48:24 +00:00
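Illustration for the entry above (not Ruby's code): a minimal C sketch of why storing the index in the encoding structure removes the linear search. The names `struct encoding`, `enc_register`, `enc_to_index_linear` and `ENC_TO_INDEX` are hypothetical stand-ins, assumed here only to show the idea behind `ruby_encoding_index`.

```c
#include <stddef.h>

#define ENC_TABLE_MAX 16

/* Hypothetical stand-in for an encoding descriptor. */
struct encoding {
    const char *name;
    int index;          /* slot in enc_table, set at registration time */
};

static struct encoding *enc_table[ENC_TABLE_MAX];
static int enc_count = 0;

/* Register an encoding and remember its slot inside the struct itself.
 * Returns the index, or -1 if the table is full. */
static int enc_register(struct encoding *enc)
{
    if (enc_count >= ENC_TABLE_MAX) return -1;
    enc->index = enc_count;
    enc_table[enc_count] = enc;
    return enc_count++;
}

/* Before: O(n) scan of the table to find the slot. */
static int enc_to_index_linear(const struct encoding *enc)
{
    int i;
    for (i = 0; i < enc_count; i++) {
        if (enc_table[i] == enc) return i;
    }
    return -1;
}

/* After: O(1) lookup through the cached member. */
#define ENC_TO_INDEX(enc) ((enc)->index)
```

Both lookups return the same slot; the macro just skips the scan.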
akr 063beac343 * encoding.c (rb_enc_internal_get_index): extracted from
rb_enc_get_index.
  (rb_enc_internal_set_index): extracted from rb_enc_associate_index.

* include/ruby/encoding.h (ENCODING_SET): work over ENCODING_INLINE_MAX.
  (ENCODING_GET): ditto.
  (ENCODING_IS_ASCII8BIT): defined.
  (ENCODING_CODERANGE_SET): defined.

* re.c (rb_reg_fixed_encoding_p): use ENCODING_IS_ASCII8BIT.

* string.c (rb_enc_str_buf_cat): use ENCODING_IS_ASCII8BIT.

* parse.y (reg_fragment_setenc_gen): use ENCODING_IS_ASCII8BIT.

* marshal.c (has_ivars): use ENCODING_IS_ASCII8BIT.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14922 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-07 02:49:01 +00:00
akr 6cdef2dc7e * $Date$ keyword removed to avoid inclusion of locale dependent
string.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14912 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-06 15:49:38 +00:00
akr 8987b97ca9 * include/ruby/encoding.h (rb_enc_str_buf_cat): declared.
* string.c (coderange_scan): extracted from rb_enc_str_coderange.
  (rb_enc_str_coderange): use coderange_scan.
  (rb_str_shared_replace): copy encoding and coderange.
  (rb_enc_str_buf_cat): new function for linear complexity string
  accumulation with encoding.
  (rb_str_sub_bang): don't conflict substituted part and replacement.
  (str_gsub): use rb_enc_str_buf_cat.
  (rb_str_clear): clear coderange.

* re.c (rb_reg_regsub): use rb_enc_str_buf_cat.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14910 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-06 09:25:09 +00:00
akr 9eab58ee03 * include/ruby/ruby.h (rb_intern): memoize interned ID for constant
string, using gcc's __builtin_constant_p and statement expression.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14888 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-04 17:21:53 +00:00
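Illustration for the entry above (not the actual rb_intern macro): a rough GNU C sketch of caching a lookup result per call site when the argument is a compile-time constant. `slow_intern`, `lookup_count` and `INTERN` are hypothetical names, and the toy hash merely stands in for a real symbol table. The sketch assumes a compiler that supports `__builtin_constant_p` and statement expressions (GCC and compatible), and falls back to the plain call elsewhere.

```c
#include <stddef.h>

/* Counts how often the expensive lookup actually runs. */
static int lookup_count = 0;

/* Hypothetical expensive lookup: a toy string hash stands in for
 * a real symbol-table search. */
static unsigned long slow_intern(const char *name)
{
    unsigned long h = 5381;
    lookup_count++;
    while (*name) h = h * 33 + (unsigned char)*name++;
    return h;
}

#if defined(__GNUC__)
/* For a constant argument, each expansion gets its own static cache
 * variable, so the lookup runs at most once per call site (as long as
 * the result is non-zero). Other arguments take the slow path. */
#define INTERN(str) \
    (__builtin_constant_p(str) ? \
        __extension__ ({ \
            static unsigned long cached_id; \
            if (!cached_id) cached_id = slow_intern(str); \
            cached_id; \
        }) : \
        slow_intern(str))
#else
#define INTERN(str) slow_intern(str)
#endif
```

Whether a given argument counts as constant is up to the compiler, so the cache is an optimization only; the returned value is the same on both paths.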
nobu 8638ee26e7 * include/ruby/intern.h, re.c (rb_reg_new): keep interface same as
1.8.  [ruby-core:14583]

* include/ruby/intern.h, re.c (rb_reg_new_str): renamed, and defines
  HAVE_RB_REG_NEW_STR macro to tell if it is available.

* include/ruby/encoding.h (rb_enc_reg_new): added.

* insns.def (toregexp), marshal.c (r_object0): use rb_reg_new_str().

* re.c (rb_reg_regcomp, rb_reg_s_union): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14884 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-04 16:30:33 +00:00
akr e395327e8e parenthesize macro arguments.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14882 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-04 09:41:44 +00:00
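Illustration for the entry above: a small C example of the bug class that parenthesizing macro arguments prevents. `DOUBLE_BAD` and `DOUBLE_GOOD` are hypothetical macros, not taken from the Ruby headers.

```c
/* Argument not parenthesized: DOUBLE_BAD(1 + 2) expands to (1 + 2 * 2). */
#define DOUBLE_BAD(x)  (x * 2)

/* Argument parenthesized: DOUBLE_GOOD(1 + 2) expands to ((1 + 2) * 2). */
#define DOUBLE_GOOD(x) ((x) * 2)
```

With the argument `1 + 2`, the first macro yields 5 because of operator precedence, while the second yields the intended 6.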
nobu 97751bbd5a * win32.h: only VC6 needs extern "C++" for math.h. [ruby-talk:285660]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14875 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-03 16:20:29 +00:00
matz 52ed8c4edd * include/ruby/oniguruma.h: Oniguruma 1.9.1 merged.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14874 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-03 15:55:04 +00:00
akr b661bfe319 * include/ruby/ruby.h (st_strcasecmp): declared for STRCASECMP.
(st_strncasecmp): declared for STRNCASECMP.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14871 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-03 12:57:26 +00:00
akr b1d373308f * regenc.h (onigenc_ascii_is_code_ctype): put back.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14866 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-03 08:56:08 +00:00
akr 5f237d7903 * encoding.c (rb_isalnum): defined.
(rb_isalpha): ditto.
  (rb_isblank): ditto.
  (rb_iscntrl): ditto.
  (rb_isdigit): ditto.
  (rb_isgraph): ditto.
  (rb_islower): ditto.
  (rb_isprint): ditto.
  (rb_ispunct): ditto.
  (rb_isspace): ditto.
  (rb_isupper): ditto.
  (rb_isxdigit): ditto.
  (rb_tolower): ditto.
  (rb_toupper): ditto.

* include/ruby/ruby.h: don't include include/ruby/encoding.h.
  (rb_isascii): defined.
  (rb_isalnum): declared.
  (rb_isalpha): ditto.
  (rb_isblank): ditto.
  (rb_iscntrl): ditto.
  (rb_isdigit): ditto.
  (rb_isgraph): ditto.
  (rb_islower): ditto.
  (rb_isprint): ditto.
  (rb_ispunct): ditto.
  (rb_isspace): ditto.
  (rb_isupper): ditto.
  (rb_isxdigit): ditto.
  (rb_tolower): ditto.
  (rb_toupper): ditto.
  (ISASCII): simplified.
  (ISPRINT): ditto.
  (ISSPACE): ditto.
  (ISUPPER): ditto.
  (ISLOWER): ditto.
  (ISALNUM): ditto.
  (ISALPHA): ditto.
  (ISDIGIT): ditto.
  (ISXDIGIT): ditto.
  (TOUPPER): ditto.
  (TOLOWER): ditto.

* include/ruby/encoding.h (rb_isascii): removed.
  (rb_isalnum): ditto.
  (rb_isalpha): ditto.
  (rb_isblank): ditto.
  (rb_iscntrl): ditto.
  (rb_isdigit): ditto.
  (rb_isgraph): ditto.
  (rb_islower): ditto.
  (rb_isprint): ditto.
  (rb_ispunct): ditto.
  (rb_isspace): ditto.
  (rb_isupper): ditto.
  (rb_isxdigit): ditto.
  (rb_tolower): ditto.
  (rb_toupper): ditto.

* common.mk: dependency updated.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14865 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-03 08:44:01 +00:00
akr 155fda385e * include/ruby/encoding.h (rb_isascii): simplified.
(rb_isalnum): call onigenc_ascii_is_code_ctype without indirect call.
  (rb_isalpha): ditto.
  (rb_isblank): ditto.
  (rb_iscntrl): ditto.
  (rb_isdigit): ditto.
  (rb_isgraph): ditto.
  (rb_islower): ditto.
  (rb_isprint): ditto.
  (rb_ispunct): ditto.
  (rb_isspace): ditto.
  (rb_isupper): ditto.
  (rb_isxdigit): ditto.

* include/ruby/oniguruma.h (onigenc_ascii_is_code_ctype): declaration
  moved from regenc.h.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14864 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-03 06:13:31 +00:00
akr 0352d32f05 * util.c (ruby_strtoul): locale independent strtoul is implemented to
keep "i".to_i(36) from returning 0 under the tr_TR locale.
  This is newly implemented, not a copy of missing/strtoul.c.

* include/ruby/ruby.h (ruby_strtoul): declared.
  (STRTOUL): defined to use ruby_strtoul.

* bignum.c, pack.c, ext/socket/socket.c: use STRTOUL.

* configure.in (strtoul): don't check.

* missing/strtoul.c: removed.

* include/ruby/missing.h (strtoul): removed.

* common.mk (strtoul.o): removed.

* LEGAL (missing/strtoul.c): removed.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14850 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-02 06:24:27 +00:00
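Illustration for the entry above (not the ruby_strtoul implementation): a minimal C sketch of digit parsing that never consults the C locale, so "i" is always digit 18 in base 36. `digit_value` and `parse_ulong` are hypothetical names. The sketch assumes an ASCII-compatible execution character set and leaves out whitespace skipping, signs, prefixes and overflow detection.

```c
#include <stddef.h>

/* Map a character to its digit value using plain range checks
 * instead of locale-dependent toupper()/isalpha().
 * Returns -1 for characters that are not digits in any base up to 36. */
static int digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

/* Parse an unsigned number in the given base (2..36).
 * Stops at the first character that is not a valid digit and,
 * if end is non-NULL, stores a pointer to that character. */
static unsigned long parse_ulong(const char *s, int base, const char **end)
{
    unsigned long value = 0;

    if (base >= 2 && base <= 36) {
        for (; *s != '\0'; s++) {
            int d = digit_value(*s);
            if (d < 0 || d >= base) break;
            value = value * (unsigned long)base + (unsigned long)d;
        }
    }
    if (end) *end = s;
    return value;
}
```

Because the digit table is fixed, `parse_ulong("i", 36, NULL)` gives 18 regardless of `setlocale`.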
akr aac5220c66 * include/ruby/missing.h (strcasecmp): removed.
(strncasecmp): removed.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14849 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-02 05:41:47 +00:00
akr 041e829127 * include/ruby/encoding.h (rb_isascii): defined.
(rb_isalnum): ditto.
  (rb_isalpha): ditto.
  (rb_isblank): ditto.
  (rb_iscntrl): ditto.
  (rb_isdigit): ditto.
  (rb_isgraph): ditto.
  (rb_islower): ditto.
  (rb_isprint): ditto.
  (rb_ispunct): ditto.
  (rb_isspace): ditto.
  (rb_isupper): ditto.
  (rb_isxdigit): ditto.
  (rb_tolower): ditto.
  (rb_toupper): ditto.

* include/ruby/st.h (st_strcasecmp): declared.
  (st_strncasecmp): ditto.

* st.c (type_strcasehash): use st_strcasecmp instead of strcasecmp.
  (st_strcasecmp): defined.
  (st_strncasecmp): ditto.

* include/ruby/ruby.h: include include/ruby/encoding.h.
  (ISASCII): use rb_isascii.
  (ISPRINT): use rb_isprint.
  (ISSPACE): use rb_isspace.
  (ISUPPER): use rb_isupper.
  (ISLOWER): use rb_islower.
  (ISALNUM): use rb_isalnum.
  (ISALPHA): use rb_isalpha.
  (ISDIGIT): use rb_isdigit.
  (ISXDIGIT): use rb_isxdigit.
  (TOUPPER): defined.
  (TOLOWER): ditto.
  (STRCASECMP): ditto.
  (STRNCASECMP): ditto.

* dir.c, encoding.c, file.c, hash.c, process.c, ruby.c, time.c,
  transcode.c, ext/readline/readline.c: use locale insensitive
  functions.  [ruby-core:14662]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14829 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-01 12:24:04 +00:00
nobu 3d0260cc94 * include/ruby/encoding.h (rb_enc_sprintf, rb_enc_vsprintf): prototyped.
* sprintf.c (rb_enc_sprintf, rb_enc_vsprintf): new functions to format
  arguments with encoding.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14806 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-30 21:08:36 +00:00
akr 1c0416e6ee * ext/strscan/strscan.c (str_new): new function to allocate a string
with encoding propagation.
  (extract_range): use str_new.
  (extract_beg_len): ditto.
  (strscan_peek): ditto.
  (strscan_rest): ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14772 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-28 14:55:43 +00:00
akr 371977ff3d * encoding.c (rb_locale_encoding): defined.
* include/ruby/encoding.h (rb_locale_encoding): declared.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14768 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-28 10:12:13 +00:00
akr ce2b982cd2 * encoding.c (rb_enc_codelen): show codepoint in error message.
* include/ruby/encoding.h (rb_enc_codelen): add a comment that it returns
  a positive integer.

* string.c (rb_str_concat): rb_enc_codelen doesn't return 0.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14733 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-27 06:27:39 +00:00
nobu 0ee5a49dd4 * encoding.h (rb_enc_mbc_to_codepoint): wrapper for
ONIGENC_MBC_TO_CODE().

* string.c (rb_str_succ): deal with invalid sequence as binary.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14692 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-25 10:01:06 +00:00
shugo 8e4c6d8482 * vm.c (rb_frame_method_id_and_class): new function to get the
method id and class of the current frame.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14686 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-25 08:49:09 +00:00
ko1 b79c868296 * include/ruby/ruby.h, thread.c: rename is_ruby_native_thread() to
ruby_native_thread_p().
* ext/tk/tcltklib.c: apply it.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14680 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-25 07:28:51 +00:00
matz a04a812ed0 * include/ruby/encoding.h (rb_enc_left_char_head): new utility macro.
* include/ruby/encoding.h (rb_enc_right_char_head): ditto.

* io.c (appendline): does multibyte RS search in the function.

* io.c (prepare_getline_args): RS may be nil.

* io.c (rb_io_getc): should process character based on external
  encoding, when transcoding required.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14619 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-24 16:36:14 +00:00
akr 30f1eb1856 * include/ruby/intern.h, random.c, array.c:
change exported name.
  genrand_int32 -> rb_genrand_int32.
  genrand_real -> rb_genrand_real.
  [ruby-core:14335]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14588 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-24 08:19:28 +00:00
nobu efed292c43 * load.c (rb_feature_p): returns loading path name too.
* load.c (search_required): returns path too if feature is being
  loaded.  [ruby-dev:32048]  [TODO: refactoring]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14586 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-24 08:06:16 +00:00
akr 80e38ad66b comment updated.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14526 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-23 15:19:23 +00:00
akr cf36df97fb * encoding.c (rb_enc_codepoint): implemented to raise invalid
encoding.

* include/ruby/encoding.h (rb_enc_codepoint): macro is replaced as a
  declaration.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14524 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-23 14:06:00 +00:00
akr 5b809a28f8 * include/ruby/encoding.h, encoding.c, re.c, io.c, parse.y, numeric.c,
ruby.c, transcode.c: rename rb_ascii_encoding to
  rb_ascii8bit_encoding.  rb_ascii_encoding is ambiguous between
  ASCII-8BIT and US-ASCII.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14504 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-22 23:47:18 +00:00
davidflanagan b83cbb0c7c * io.c, io.h: temporary patch to partially implement transcode-on-read and transcode-on-write
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14497 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-22 16:21:09 +00:00
nobu aefc34a041 * common.mk (encs, ext/ripper/ripper.c): needs MFLAGS.
* configure.in (STRINGIZE): stringizing macro.

* include/ruby/defines.h (STRINGIZE): fallback.

* tool/make-snapshot: new file.

* version.c (ruby_description, ruby_copyright): string constants for
  -v option.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-22 06:14:50 +00:00
matz d7cc14d436 * encoding.c (rb_ascii_encoding): renamed from previous
rb_default_encoding().

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14443 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 18:55:30 +00:00
nobu 7c2ae80d70 * include/ruby/ruby.h (rb_catch_obj, rb_throw_obj): prototyped.
* include/ruby/intern.h (rb_fiber_alive_p): prototyped.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14431 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 11:13:53 +00:00
nobu 5f2e5c07a7 * encoding.c (rb_enc_replicate): now creates first class encoding.
* encoding.c (rb_define_dummy_encoding): always based on the default
  encoding.

* encoding.c (rb_enc_dummy_p): check if dummy.

* encoding.c (enc_inspect): shows if dummy.

* encoding.c (Init_Encoding): added dummy? method.

* include/ruby/encoding.h (ENCODING_INLINE_MAX): increased.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14429 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 11:00:04 +00:00
nobu e791259e00 * enumerator.c (enumerator_iter_i): adjusted for rb_block_call_func.
* include/ruby/ruby.h (rb_block_call_func): function to be called back
  as block.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14416 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 07:33:31 +00:00
nobu 12df6cf7ce * encoding.c (rb_enc_init): use enc_register_at() directly.
* encoding.c (rb_utf8_encoding): returns utf-8 encoding.

* include/ruby/encoding.h (rb_utf8_encoding): prototyped.

* parse.y (UTF8_ENC): uses rb_utf8_encoding().


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14410 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 06:59:48 +00:00
akr 0530cf9ff8 * encoding.c: include locale.h
(rb_locale_charmap): new method Encoding.locale_charmap for
  nl_langinfo(CODESET).

* include/ruby/encoding.h (rb_locale_charmap): declared.

* main.c (main): call setlocale with LC_CTYPE.

* ruby.c (locale_encoding): use rb_locale_charmap.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14380 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 02:52:23 +00:00
matz 2521b33ed7 * object.c (rb_obj_freeze): preserve frozen state of immediate
values in internal hash table, a la generic_ivar.

* object.c (rb_obj_frozen_p): check immediate values too.

* variable.c (generic_ivar_set): add frozen check for immediate
  values.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14294 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-18 08:28:39 +00:00
akr 3f07e548fc * include/ruby/encoding.h (ENC_CODERANGE_VALID): rename from
ENC_CODERANGE_8BIT.

* string.c (rb_enc_str_coderange): follow the renaming.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14257 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 05:07:37 +00:00
matz 266d186bcf * include/ruby/io.h (MakeOpenFile): fptr->enc should be
initialized to zero.  [ruby-dev:32569]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14207 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-13 04:10:24 +00:00
akr 646f27b822 * encoding.c (rb_enc_ascget): renamed from rb_enc_get_ascii.
* include/ruby/encoding.h: follow the renaming.

* re.c: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14195 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-11 07:39:16 +00:00
akr 5802768b40 * encoding.c (rb_enc_get_ascii): add an argument to provide the
length of the returned character.

* include/ruby/encoding.h (rb_enc_get_ascii): add the argument.

* re.c (rb_reg_expr_str): modify rb_enc_get_ascii call.
  (rb_reg_quote): ditto.
  (rb_reg_regsub): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14190 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-11 03:08:50 +00:00
akr 9ee1ab0e28 * include/ruby/oniguruma.h (ONIGENC_CONSTRUCT_MBCLEN_NEEDMORE):
parenthesize an argument.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14189 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-11 00:41:56 +00:00
nobu 6d7999c132 * transcode.c (str_transcode): allow non-registered encodings.
[ruby-dev:32520]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14182 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-10 12:47:55 +00:00
matz 9d8075b99c * parse.y (expr): redefinable not (!) operator.
* parse.y (arg): ditto.

* object.c (rb_obj_not): new method "!".

* object.c (rb_obj_not_equal): new method "!=".

* object.c (rb_obj_not_match): new method "!~".

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14162 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 16:39:49 +00:00
akr 3a8f7f1d7f * include/ruby/ruby.h (FilePathStringValue): defined. similar to
FilePathValue but no taint check.

* file.c (rb_get_path_no_checksafe): implementation of
  FilePathStringValue.
  (rb_file_s_basename): use FilePathStringValue.
  (rb_file_s_dirname): ditto.
  (rb_file_s_extname): ditto.
  (rb_file_s_split): ditto.
  (rb_file_join): ditto.

* dir.c (file_s_fnmatch): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14155 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 05:12:31 +00:00
akr f1b7e60cb9 * encoding.c (rb_enc_mbclen): make it never fail.
(rb_enc_nth): don't check the return value of rb_enc_mbclen.
  (rb_enc_strlen): ditto.
  (rb_enc_precise_mbclen): return needmore(1) if e <= p.
  (rb_enc_get_ascii): new function for extracting ASCII character.

* include/ruby/encoding.h (rb_enc_get_ascii): declared.

* include/ruby/regex.h (ismbchar): removed.

* re.c (rb_reg_expr_str): use rb_enc_get_ascii.
  (unescape_escaped_nonascii): use rb_enc_precise_mbclen to determine
  the termination of escaped non-ASCII character.
  (unescape_nonascii): use rb_enc_precise_mbclen.
  (rb_reg_quote): use rb_enc_get_ascii.
  (rb_reg_regsub): use rb_enc_get_ascii.

* string.c (rb_str_reverse): don't check the return value of
  rb_enc_mbclen.
  (rb_str_split_m): don't call rb_enc_mbclen with e <= p.

* parse.y (is_identchar): use ISASCII.
  (parser_ismbchar): removed.
  (parser_precise_mbclen): new macro.
  (parser_isascii): new macro.
  (parser_tokadd_mbchar): use parser_precise_mbclen to check invalid
  character precisely.
  (parser_tokadd_string): use parser_isascii.
  (parser_yylex): ditto.
  (is_special_global_name): don't call is_identchar with e <= p.
  (rb_enc_symname_p): ditto.

  [ruby-dev:32455]

* ext/tk/sample/tkextlib/vu/canvSticker2.rb: remove coding cookie
  because the encoding is not UTF-8.  [ruby-dev:32475]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-08 02:50:43 +00:00
akr 69406aad50 * encoding.c (rb_enc_precise_mbclen): new function for mbclen with
validation.

* include/ruby/encoding.h (rb_enc_precise_mbclen): declared.
  (MBCLEN_CHARFOUND): new macro.
  (MBCLEN_INVALID): new macro.
  (MBCLEN_NEEDMORE): new macro.

* include/ruby/oniguruma.h (OnigEncodingTypeST): replace mbc_enc_len
  by precise_mbc_enc_len.
  (ONIGENC_PRECISE_MBC_ENC_LEN): new macro.
  (ONIGENC_CONSTRUCT_MBCLEN_CHARFOUND): new macro.
  (ONIGENC_CONSTRUCT_MBCLEN_INVALID): new macro.
  (ONIGENC_CONSTRUCT_MBCLEN_NEEDMORE): new macro.
  (ONIGENC_MBCLEN_CHARFOUND): new macro.
  (ONIGENC_MBCLEN_INVALID): new macro.
  (ONIGENC_MBCLEN_NEEDMORE): new macro.
  (ONIGENC_MBC_ENC_LEN): use ONIGENC_PRECISE_MBC_ENC_LEN.

* enc/euc_jp.c: validation implemented.

* enc/sjis.c: ditto.

* enc/utf8.c: ditto.

* string.c (rb_str_inspect): use rb_enc_precise_mbclen for invalid
  encoding.
  (rb_str_valid_encoding_p): new method String#valid_encoding?.

* io.c (rb_io_getc): use rb_enc_precise_mbclen.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14119 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-06 09:28:26 +00:00
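Illustration for the entry above (not the code in enc/utf8.c): a simplified C sketch of a "precise" character-length check that separates the three outcomes named by MBCLEN_CHARFOUND, MBCLEN_INVALID and MBCLEN_NEEDMORE. The types `mbclen_status` and `mbclen_result` and the function `utf8_precise_mbclen` are hypothetical. The sketch checks only the lead byte and the continuation-byte pattern; stricter UTF-8 rules (overlong forms, surrogates, the upper code point limit) are left out.

```c
#include <stddef.h>

enum mbclen_status { CHAR_FOUND, CHAR_INVALID, CHAR_NEEDMORE };

struct mbclen_result {
    enum mbclen_status status;
    size_t len;  /* CHAR_FOUND: bytes in the character;
                    CHAR_NEEDMORE: bytes still missing; otherwise 0 */
};

/* Classify the byte sequence starting at p and ending before e. */
static struct mbclen_result
utf8_precise_mbclen(const unsigned char *p, const unsigned char *e)
{
    struct mbclen_result r = { CHAR_INVALID, 0 };
    size_t need, i;

    /* Empty input: at least one more byte is needed. */
    if (p >= e) { r.status = CHAR_NEEDMORE; r.len = 1; return r; }

    /* The lead byte determines the expected sequence length. */
    if (p[0] < 0x80) need = 1;
    else if (p[0] >= 0xC2 && p[0] <= 0xDF) need = 2;
    else if (p[0] >= 0xE0 && p[0] <= 0xEF) need = 3;
    else if (p[0] >= 0xF0 && p[0] <= 0xF4) need = 4;
    else return r;  /* stray continuation byte or invalid lead byte */

    for (i = 1; i < need; i++) {
        if (p + i >= e) {
            /* Input ended inside the character. */
            r.status = CHAR_NEEDMORE;
            r.len = need - i;
            return r;
        }
        if ((p[i] & 0xC0) != 0x80) return r;  /* not a continuation byte */
    }

    r.status = CHAR_FOUND;
    r.len = need;
    return r;
}
```

A caller reading from a stream can use the NEEDMORE case to fetch more bytes before deciding, instead of treating a truncated character as invalid.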
akr 7ff702406a * include/ruby/intern.h (rb_uv_to_utf8): declared.
* re.c (rb_reg_preprocess): new function for dynamic regexp with
  \u{} such as Regexp.new("\\u{6666}").
  (rb_reg_prepare_re): preprocess regexp for recompiling.
  (read_escaped_byte): new function.
  (unescape_escaped_nonascii): new function.
  (append_utf8): new function.
  (unescape_unicode_list): new function.
  (unescape_unicode_bmp): new function.
  (unescape_nonascii): new function.
  (rb_reg_initialize): preprocess regexp.

* pack.c (rb_uv_to_utf8): renamed from uv_to_utf8.

* parse.y (STR_NEW3): take func instead of has8 and hasmb.
  (parser_str_new): use default coderange mechanism except for regexp.
  (parser_tokadd_utf8): copy regexp source as-is.
  (parser_read_escape): UTF-8 stuff removed.
  (parser_tokadd_escape): has8bit and hasmb removed.
  (parser_tokadd_string): fix 8-bit single byte character with \u.
  (parser_parse_string): has8bit and hasmb removed.
  (parser_here_document): has8bit and hasmb removed.
  (parser_yylex): call parser_tokadd_utf8 instead of read_escape for
  UTF-8 character.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-01 16:56:19 +00:00