some strings are allocated before rb_cString is created.
This prevents a "called on terminated object" error by
ObjectSpace.each_object(Module) {|m| p m.name }.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14631 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
ruby.c, transcode.c: rename rb_ascii_encoding. to
rb_ascii8bit_encoding. rb_ascii_encoding is ambiguous with
ASCII-8BIT and US-ASCII.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14504 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
not be reserved word. a patch from Keita Yamaguchi
<keita.yamaguchi AT gmail.com> in [ruby-dev:32675].
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14450 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
literal is used.
(reg_named_capture_assign_gen): assign the result of named capture
into local variables.
[ruby-dev:32588]
* re.c: document the assignment by named captures.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14297 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_enc_nth): don't check the return value of rb_enc_mbclen.
(rb_enc_strlen): ditto.
(rb_enc_precise_mbclen): return needmore(1) if e <= p.
(rb_enc_get_ascii): new function for extracting ASCII character.
* include/ruby/encoding.h (rb_enc_get_ascii): declared.
* include/ruby/regex.h (ismbchar): removed.
* re.c (rb_reg_expr_str): use rb_enc_get_ascii.
(unescape_escaped_nonascii): use rb_enc_precise_mbclen to determine
the termination of escaped non-ASCII character.
(unescape_nonascii): use rb_enc_precise_mbclen.
(rb_reg_quote): use rb_enc_get_ascii.
(rb_reg_regsub): use rb_enc_get_ascii.
* string.c (rb_str_reverse) don't check the return value of
rb_enc_mbclen.
(rb_str_split_m): don't call rb_enc_mbclen with e <= p.
* parse.y (is_identchar): use ISASCII.
(parser_ismbchar): removed.
(parser_precise_mbclen): new macro.
(parser_isascii): new macro.
(parser_tokadd_mbchar): use parser_precise_mbclen to check invalid
character precisely.
(parser_tokadd_string): use parser_isascii.
(parser_yylex): ditto.
(is_special_global_name): don't call is_identchar with e <= p.
(rb_enc_symname_p): ditto.
[ruby-dev:32455]
* ext/tk/sample/tkextlib/vu/canvSticker2.rb: remove coding cookie
because the encoding is not UTF-8. [ruby-dev:32475]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
[ruby-dev:31351]
* thread.c (ruby_suppress_tracing): added a new parameter, which
directs to call func always.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14104 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* re.c (rb_reg_preprocess): new function for dynamic regexp with
\u{} such as Regexp.new("\\u{6666}").
(rb_reg_prepare_re): preprocess regexp for recompiling.
(read_escaped_byte): new function.
(unescape_escaped_nonascii): new function.
(append_utf8): new function.
(unescape_unicode_list): new function.
(unescape_unicode_bmp): new function.
(unescape_nonascii): new function.
(rb_reg_initialize): preprocess regexp.
* pack.c (rb_uv_to_utf8): renamed from uv_to_utf8.
* parse.y (STR_NEW3): take func instead of has8 and hasmb.
(parser_str_new): use default coderange mechanism except for regexp.
(parser_tokadd_utf8): copy regexp source as-is.
(parser_read_escape): UTF-8 stuff removed.
(parser_tokadd_escape): has8bit and hasmb removed.
(parser_tokadd_string): fix 8-bit single byte character with \u.
(parser_parse_string): has8bit and hasmb removed.
(parser_here_document): has8bit and hasmb removed.
(parser_yylex): call parser_tokadd_utf8 instead of read_escape for
UTF-8 character.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rename ENC_CODERANGE_SINGLE to ENC_CODERANGE_7BIT.
rename ENC_CODERANGE_MULTI to ENC_CODERANGE_8BIT.
Because single byte 8bit character, such as Shift_JIS 1byte katakana,
is represented by ENC_CODERANGE_MULTI even if it is not multi byte.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14027 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
[ruby-dev:32250]
* parse.y: remove NEED_ASSOC that break test_parser_events.
* parse.y (parser_yylex): should not decrement line numbers at the
end of file.
* file.c (rb_find_file_ext): search .rb files first through in the
loadpath.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13966 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
bison with sed. [ruby-dev:32204]
* ruby.c (proc_options): use yydebug in cmdline_options.
* ruby.c (process_options): set yydebug flag of parser.
* parse.y (yydebug): moved into struct parser_params.
* parse.y (rb_parser_get_yydebug, rb_parser_set_yydebug): parser
generic methods.
* */Makefile.sub (parse.c): moved to common.mk.
* tool/ytab.sed: comment out yydebug definition, and substitute
yyerror with parser_yyerror.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13908 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* parse.y (parser_yylex): adjust line number for fluent interface.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13858 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
also modifications for better coderange detection
* test/ruby/test_unicode_escapes.rb: test cases
* test/ruby/test_mixed_unicode_escapes.rb: mixed encoding test cases
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13836 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
correct interning of multi-byte symbols. Without this patch
:x==:x is false when x is a multi-byte character.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13835 b2dd03c8-39d4-4d8f-98ff-823fe69b080e