Commit graph

395 commits

Author SHA1 Message Date
matz 880a96c795 * re.c (rb_reg_prepare_enc): error condition was updated for non
ASCII compatible strings.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16423 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-15 10:45:58 +00:00
matz ab24f2b077 * re.c (rb_reg_prepare_re): made non static with small refactoring.
* ext/strscan/strscan.c (strscan_do_scan): should adjust encoding
  before regex searching.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16387 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-12 06:09:53 +00:00
matz f34a75657d * re.c (Init_Regexp): remove MatchData#select. [ruby-dev:34563]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16264 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-02 04:57:19 +00:00
nobu cc88283bad * re.c (rb_reg_search): use local variable. a patch from wanabe
<s.wanabe AT gmail.com> in [ruby-dev:34537].  [ruby-dev:34492]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16239 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-30 08:47:23 +00:00
nobu 89c408704b * enumerator.c (enumerator_each, enumerator_with_index): suppress
warnings.

* pack.c (pack_unpack): ditto.

* process.c (rb_syswait): ditto.

* re.c (rb_reg_prepare_enc, rb_reg_prepare_re,
  rb_reg_adjust_startpos): ditto.

* regparse.c (onig_name_to_group_numbers): ditto.

* missing/vsnprintf.c (BSD_vfprintf): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16156 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-22 13:49:43 +00:00
matz fee4ed204f * re.c (rb_reg_search): make search reentrant. [ruby-dev:34223]
* test/ruby/test_parse.rb (TestParse::test_global_variable):
  should preserve $& variable.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-14 15:47:51 +00:00
matz 1dcbd6921e * re.c (rb_reg_quote): should always copy the quoting string.
[ruby-core:16235]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15925 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-08 01:53:35 +00:00
naruse 3467a1754c * re.c (rb_memsearch_qs): wrong boundary condition.
* re.c (rb_memsearch_qs_utf8): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15903 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-04 14:26:19 +00:00
matz 2b8af7d624 * re.c (rb_memsearch_qs): wrong boundary condition. a patch from
wanabe <s.wanabe AT gmail.com> in [ruby-dev:34248].

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15902 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-04 05:13:06 +00:00
naruse e58adeae0f * re.c (rb_memsearch_ss): simple shift search.
* re.c (rb_memsearch_qs): quick search.

* re.c (rb_memsearch_qs_utf8): quick search for UTF-8 string.

* re.c (rb_memsearch_qs_utf8_hash): hash functions for above.

* re.c (rb_memsearch): use above functions.

* string.c (rb_str_index): give enc to rb_memsearch.

* include/ruby/intern.h (rb_memsearch): move to encoding.h.

* include/ruby/encoding.h (rb_memsearch): move from intern.h.

* common.mk (PREP): add dependency.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15792 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-17 19:04:29 +00:00
akr 861219ce4a fix doc.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15734 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-09 01:04:46 +00:00
matz 39787ea14d * numeric.c (fix_to_s): avoid rb_scan_args() when no argument
given. 
* bignum.c (rb_big_to_s): ditto.
* enum.c (enum_first): ditto.
* eval_jump.c (rb_f_catch): ditto.
* io.c (rb_obj_display): ditto.
* class.c (rb_obj_singleton_methods): ditto.
* object.c (rb_class_initialize): ditto.
* random.c (rb_f_srand): ditto.
* range.c (range_step): ditto.
* re.c (rb_reg_s_last_match): ditto.
* string.c (rb_str_to_i): ditto.
* string.c (rb_str_each_line): ditto.
* string.c (rb_str_chomp_bang): ditto.
* string.c (rb_str_sum): ditto.

* string.c (str_modifiable): declare inline.
* string.c (str_independent): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15691 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-05 05:22:17 +00:00
matz bbc2f80a32 * re.c (rb_reg_regsub): remove too strict encoding check.
[ruby-dev:33966]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15673 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-03 08:22:18 +00:00
matz daa622aed0 * time.c (time_strftime): format should be ascii compatible.
* parse.y (rb_intern3): non ASCII compatible symbols.

* re.c (rb_reg_regsub): add encoding check.

* string.c (rb_str_chomp_bang): ditto.

* test/ruby/test_utf16.rb (TestUTF16::test_chomp): raises exception.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15640 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-29 09:19:15 +00:00
akr d77ddf33ae add tests for sub/gsub with hash.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15535 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-18 03:51:34 +00:00
akr 1783b7aacc typo fix.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15534 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-18 03:43:11 +00:00
akr a74c11cd4a * re.c (re_warn): defined to restore warnings for /[a-c-e]/, etc.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15532 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-18 02:52:10 +00:00
akr 583a4b1774 * re.c (rb_reg_regsub): don't repeat repl twice with
"X".sub!(/./, sprintf("\\%c", 255)).


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15527 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 15:35:09 +00:00
akr b8fd2fabbe * re.c (rb_reg_prepare_re): add enable_warning parameter.
(rb_reg_adjust_startpos): disable warning by rb_reg_prepare_re.
  (rb_reg_search): follow rb_reg_prepare_re parameter change.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15524 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 12:54:17 +00:00
akr 0f4199fb56 * re.c (rb_reg_quote): return US-ASCII string consistently.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15515 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 02:00:05 +00:00
akr 71c5e48598 * include/ruby/re.h (struct rmatch_offset): new struct for character
offsets.
  (struct rmatch): new struct.
  (struct RMatch): reference struct rmatch.
  (RMATCH_REGS): new macro.

* re.c (match_alloc): initialize struct rmatch.
  (pair_byte_cmp): new function.
  (update_char_offset): update character offsets.
  (match_init_copy): copy regexp and character offsets.
  (match_sublen): removed.
  (match_offset): use update_char_offset.
  (match_begin): ditto.
  (match_end): ditto.
  (rb_reg_search): make character offset updated flag false.
  (match_size): use RMATCH_REGS.
  (match_backref_number): ditto.
  (rb_reg_nth_defined): ditto.
  (rb_reg_nth_match): ditto.
  (rb_reg_match_pre): ditto.
  (rb_reg_match_post): ditto.
  (rb_reg_match_last): ditto.
  (match_array): ditto.
  (match_aref): ditto.
  (match_values_at): ditto.
  (match_inspect): ditto.

* string.c (rb_str_subpat_set): use RMATCH_REGS.
  (rb_str_sub_bang): ditto.
  (str_gsub): ditto.
  (rb_str_split_m): ditto.
  (scan_once): ditto.

* gc.c (obj_free): free character offsets.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15513 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 20:08:35 +00:00
akr 60fa63b819 * re.c (match_inspect): avoid SEGV with MatchData.allocate.inspect.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15509 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 11:13:47 +00:00
nobu 17fb1248af * re.c (rb_reg_quote): set US-ASCII for ASCII-only string.
[ruby-dev:33785]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15481 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-15 01:35:56 +00:00
akr ec4756f633 * re.c (rb_reg_preprocess_dregexp): use non-preprocessed regexp source
for result.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15465 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-14 03:34:12 +00:00
akr d5c8ad5359 * insns.def (toregexp): generate a regexp from strings instead of one
string.

* re.c (rb_reg_new_ary): defined for toregexp.  it concatenates
  strings after each string is preprocessed. 

* compile.c (compile_dstr_fragments): split from compile_dstr.
  (compile_dstr): call compile_dstr_fragments.
  (compile_dregx): defined for dynamic regexp.
  (iseq_compile_each): use compile_dregx for dynamic regexp.

  [ruby-dev:33400]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15311 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 08:03:51 +00:00
naruse 3c6969ec11 * string.c, parse.y, re.c: use rb_ascii8bit_encoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15292 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-28 09:03:09 +00:00
akr fc208c1bd5 * include/ruby/oniguruma.h: precise mbclen API redesigned to avoid
inline functions.
  (onigenc_mbclen_charfound): removed.
  (onigenc_mbclen_needmore): removed.
  (onigenc_mbclen_recover): removed.
  (ONIGENC_MBCLEN_CHARFOUND): removed.
  (ONIGENC_MBCLEN_CHARFOUND_P): defined.
  (ONIGENC_MBCLEN_CHARFOUND_LEN): defined.
  (ONIGENC_MBCLEN_INVALID): removed.
  (ONIGENC_MBCLEN_INVALID_P): defined.
  (ONIGENC_MBCLEN_NEEDMORE): removed.
  (ONIGENC_MBCLEN_NEEDMORE_P): defined.
  (ONIGENC_MBCLEN_NEEDMORE_LEN): defined.
  (ONIGENC_MBC_ENC_LEN): use onigenc_mbclen_approximate.

* regenc.c (onigenc_mbclen_approximate): defined.

* include/ruby/encoding.h (MBCLEN_CHARFOUND): removed.
  (MBCLEN_INVALID): removed.
  (MBCLEN_NEEDMORE): removed.
  (MBCLEN_CHARFOUND_P): defined.
  (MBCLEN_INVALID_P): defined.
  (MBCLEN_NEEDMORE_P): defined.
  (MBCLEN_CHARFOUND_LEN): defined.
  (MBCLEN_NEEDMORE_LEN): defined.

* encoding.c: use new API.

* re.c: ditto.

* string.c: ditto.

* parse.y: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15280 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-27 14:27:07 +00:00
naruse f3fe101d55 * re.c (rb_reg_source): set encoding as regexp encoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15265 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-27 07:26:51 +00:00
akr b9c18bdcdd * re.c (rb_reg_preprocess): force fixed encoding when ASCII
incompatible source string.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15260 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-26 21:01:52 +00:00
akr 1e41069754 * include/ruby/intern.h (rb_str_buf_cat_ascii): declared.
* string.c (rb_str_buf_cat_ascii): defined.

* re.c (rb_reg_s_union): use rb_str_buf_cat_ascii to support ASCII
  incompatible encoding.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15232 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-25 07:35:27 +00:00
usa b1257d4d20 * re.c (rb_reg_fixed_encoding_p): no need to treat ASCII-8BIT specially.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15213 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-24 09:15:03 +00:00
usa fbe52683e6 * re.c (rb_reg_initialize): 7bit clean regexp should be US-ASCII.
[ruby-dev:33346]



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15212 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-24 07:56:12 +00:00
akr 3766eac339 * re.c (rb_reg_prepare_re): fix SEGV by
/a/ =~ "aa".force_encoding("utf-16be").


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15178 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-23 04:40:43 +00:00
usa 61fd7dbf6d * re.c (rb_char_to_option_kcode): Regexp switch `s' should mean
Windows-31J, as well as `-Ks'.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15101 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-18 00:44:15 +00:00
nobu a0029e3adc * re.c (rb_char_to_option_kcode): fixed typo.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15085 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-17 12:48:23 +00:00
matz d9ff499bf3 * re.c (rb_char_to_option_kcode): use rb_enc_find_index() instead
of using fixed index value.

* enc/Makefile.in (encsrcdir): make US-ASCII built-in.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15047 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-14 13:49:29 +00:00
akr a31e2da12c * re.c (rb_reg_prepare_re): initialize error message buffer.
(rb_reg_search): ditto.
  (rb_reg_check_preprocess): ditto.
  (rb_reg_new_str): ditto.
  (rb_enc_reg_new): ditto.
  (rb_reg_compile): ditto.
  (rb_reg_initialize_m): ditto.
  (rb_reg_s_union_m): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15034 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-14 04:51:10 +00:00
akr 238c59842c * re.c (rb_reg_preprocess): fix fixed_enc condition.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14924 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-07 04:55:26 +00:00
akr 063beac343 * encoding.c (rb_enc_internal_get_index): extracted from
rb_enc_get_index.
  (rb_enc_internal_set_index): extracted from rb_enc_associate_index

* include/ruby/encoding.h (ENCODING_SET): work over ENCODING_INLINE_MAX.
  (ENCODING_GET): ditto.
  (ENCODING_IS_ASCII8BIT): defined.
  (ENCODING_CODERANGE_SET): defined.

* re.c (rb_reg_fixed_encoding_p): use ENCODING_IS_ASCII8BIT.

* string.c (rb_enc_str_buf_cat): use ENCODING_IS_ASCII8BIT.

* parse.y (reg_fragment_setenc_gen): use ENCODING_IS_ASCII8BIT.

* marshal.c (has_ivars): use ENCODING_IS_ASCII8BIT.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14922 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-07 02:49:01 +00:00
akr f38cc001a7 * re.c (rb_reg_initialize_str): forbid raw non ASCII character
for ASCII-8BIT regexp in non ASCII-8BIT script.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14911 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-06 12:15:48 +00:00
akr 8987b97ca9 * include/ruby/encoding.h (rb_enc_str_buf_cat): declared.
* string.c (coderange_scan): extracted from rb_enc_str_coderange.
  (rb_enc_str_coderange): use coderange_scan.
  (rb_str_shared_replace): copy encoding and coderange.
  (rb_enc_str_buf_cat): new function for linear complexity string
  accumulation with encoding.
  (rb_str_sub_bang): don't conflict substituted part and replacement.
  (str_gsub): use rb_enc_str_buf_cat.
  (rb_str_clear): clear coderange.

* re.c (rb_reg_regsub): use rb_enc_str_buf_cat.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14910 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-06 09:25:09 +00:00
akr da42c102c1 * re.c (rb_reg_initialize_str): /\x80/n is not an error even if script
encoding is EUC-JP.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14899 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-05 16:39:38 +00:00
nobu 8638ee26e7 * include/ruby/intern.h, re.c (rb_reg_new): keep interface same as
1.8.  [ruby-core:14583]

* include/ruby/intern.h, re.c (rb_reg_new_str): renamed, and defines
  HAVE_RB_REG_NEW_STR macro to tell if it is available.

* include/ruby/encoding.h (rb_enc_reg_new): added.

* insns.def (toregexp), marshal.c (r_object0): use rb_reg_new_str().

* re.c (rb_reg_regcomp, rb_reg_s_union): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14884 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-04 16:30:33 +00:00
akr f780cdec75 * re.c (rb_reg_prepare_re): check string encoding. Oniguruma doesn't
support invalid encoding.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14880 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-04 05:01:58 +00:00
akr 7d98c90ef2 unused variable removed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14879 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-04 03:13:53 +00:00
matz 22e7258275 * re.c (rb_reg_search): avoid inner loop for reverse search.
* regexec.c: unset USE_MATCH_RANGE_MUST_BE_INSIDE_OF_SPECIFIED_RANGE
  which is turned on since oniguruma 5.9.1.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14878 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-04 01:24:12 +00:00
akr 52f9c1d2e1 * re.c (rb_reg_search): iterate onig_match for reverse mode.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14876 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-03 17:48:06 +00:00
akr e21907e0f8 fix typos.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14810 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-31 05:52:59 +00:00
nobu 5ee7f4b0b5 * re.c (rb_reg_regsub): returns the given string itself if nothing
changed.

* string.c (rb_str_sub_bang): keeps code-range as possible.

* string.c (str_gsub): adjusts code-range.  [ruby-core:14566]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14782 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-29 13:44:32 +00:00
akr fd640aec82 * re.c (rb_reg_s_union): show encodings in error message.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14734 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-27 07:38:23 +00:00
akr b910bb7761 * re.c (rb_reg_prepare_re): show regexp encoding in the error message.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14597 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-24 09:38:20 +00:00
akr 5b809a28f8 * include/ruby/encoding.h, encoding.c, re.c, io.c, parse.y, numeric.c,
ruby.c, transcode.c: rename rb_ascii_encoding to
  rb_ascii8bit_encoding.  rb_ascii_encoding is ambiguous with 
  ASCII-8BIT and US-ASCII.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14504 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-22 23:47:18 +00:00
akr fa3d06c738 refine error message.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14475 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-22 07:14:07 +00:00
matz d7cc14d436 * encoding.c (rb_ascii_encoding): renamed from previous
rb_default_encoding().

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14443 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 18:55:30 +00:00
matz b36c642a85 * re.c (rb_reg_prepare_re): stop ENCODING_NONE warning if the
encoding of the str is ASCII-8BIT.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14442 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 18:21:41 +00:00
akr b82a05989e * re.c (ARG_ENCODING_NONE): defined for /.../n option.
(REG_ENCODING_NONE): ditto.
  (rb_char_to_option_kcode): return ARG_ENCODING_NONE for n.
  (rb_reg_prepare_re): warn /ascii/n =~ "non-ascii".
  (rb_reg_initialize): set REG_ENCODING_NONE from ARG_ENCODING_NONE.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14438 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 16:39:36 +00:00
akr e667720bd4 * re.c (append_utf8): use rb_utf8_encoding() instead of
rb_enc_find("utf-8").


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14412 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 07:07:21 +00:00
matz 668bd7d992 * test/ruby/test_system.rb (TestSystem::valid_syntax): apply
ASCII-8BIT encoding explicitly.

* re.c (rb_reg_prepare_re): add encoding name in the message.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14402 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 05:03:14 +00:00
akr 59dca19910 * re.c: change "character encodings differ" error messages.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14401 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 04:54:54 +00:00
matz 77629d2cbe * string.c (rb_str_rindex_m): too much adjustment.
* re.c (reg_match_pos): pos adjustment should be based on
  characters.

* test/ruby/test_m17n.rb (TestM17N::test_str_insert): test updated
  to check negative offset behavior.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14340 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-19 17:02:29 +00:00
nobu 474a88f041 * re.c (rb_reg_regsub): should set checked encoding.
* string.c (rb_str_sub_bang): applied r14212 too.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14333 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-19 12:42:19 +00:00
akr 2d01290cfd * parse.y (arg tMATCH arg): call reg_named_capture_assign_gen if regexp
literal is used.
  (reg_named_capture_assign_gen): assign the result of named capture
  into local variables.
  [ruby-dev:32588]

* re.c: document the assignment by named captures.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14297 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-18 11:26:24 +00:00
matz ebfcc5d933 * re.c (rb_reg_initialize): raise error if non-Unicode fixed
encoding option is specified for regexp literals with \u{}
  escapes.

* string.c (rb_str_squeeze_bang): should squeeze multibyte
  characters as well.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14275 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 16:06:21 +00:00
matz d6a70c4bb7 * string.c (scan_once): need no encoding compatibility check.
it's done inside of rb_reg_search().

* string.c (rb_str_split_m): ditto.

* re.c (rb_reg_regsub): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14269 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 09:44:06 +00:00
matz 7bb3ea6afa * re.c (rb_reg_initialize): embedded string may override encoding
of the regular expression.

* re.c (rb_reg_initialize): fix encoding of regular expression if
  embedded string has its own encoding specified.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14218 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-13 16:09:53 +00:00
matz a648fc802b * encoding.c (rb_enc_compatible): encoding should never fall back
to ASCII-8BIT unless both encodings are ASCII-8BIT.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14217 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-13 13:44:02 +00:00
akr b92cee1ddb * re.c, regerror.c, string.c, parse.y, ruby.c, file.c:
use capital letter for \xHH notation.  [ruby-dev:32511]



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14202 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-12 14:30:54 +00:00
nobu ad72efa269 * re.c (rb_reg_regsub): should copy encoding.
* string.c (rb_str_sub_bang, str_gsub): should check and copy encoding
  to be replaced.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-12 03:11:44 +00:00
akr 646f27b822 * encoding.c (rb_enc_ascget): renamed from rb_enc_get_ascii.
* include/ruby/encoding.h: follow the renaming.

* re.c: ditto. 



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14195 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-11 07:39:16 +00:00
akr 5802768b40 * encoding.c (rb_enc_get_ascii): add an argument to provide the
length of the returned character.

* include/ruby/encoding.h (rb_enc_get_ascii): add the argument.

* re.c (rb_reg_expr_str): modify rb_enc_get_ascii call.
  (rb_reg_quote): ditto.
  (rb_reg_regsub): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14190 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-11 03:08:50 +00:00
matz f6a9c859be * re.c (rb_reg_match): should calculate offset by converted
operand.  [ruby-cvs:21416]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14180 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-10 10:03:48 +00:00
nobu 38a24d73c8 * re.c (rb_reg_search): return byte offset. [ruby-dev:32452]
* re.c (rb_reg_match, rb_reg_match2, rb_reg_match_m): convert byte
  offset to char index.

* string.c (rb_str_index): return byte offset.  [ruby-dev:32472]

* string.c (rb_str_split_m): calculate in byte offset.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14171 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-10 04:50:35 +00:00
akr f4592d7bb0 * re.c (rb_reg_expr_str): use \xHH instead of \OOO.
* regerror.c (to_ascii): ditto.
  (onig_snprintf_with_pattern): ditto.
  (onig_snprintf_with_pattern): ditto.

* string.c (rb_str_inspect): ditto.
  (rb_str_dump): ditto.

* parse.y (parser_yylex): ditto.

* ruby.c (proc_options): ditto.

* file.c (rb_f_test): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14164 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 21:48:05 +00:00
akr 08eb58d3dd * re.c (rb_reg_names): new method Regexp#names.
(rb_reg_named_captures): new method Regexp#named_captures
  (match_regexp): new method MatchData#regexp.
  (match_names): new method MatchData#names.

* lib/pp.rb (MatchData#pretty_print): show names of named captures.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14163 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 21:44:19 +00:00
akr e56e8c758d * re.c (rb_reg_s_last_match): accept named capture's name.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14161 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 13:35:38 +00:00
akr 2d101f0a87 Regexp#fixed_encoding? documented.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14160 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 11:57:06 +00:00
akr 7a7c26be73 document named capture of MatchData#{offset,begin,end,inspect}.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14159 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 07:35:54 +00:00
akr 18d8fbac54 * re.c (match_backref_number): new function for converting a backref
name/number to an integer.
  (match_offset): use match_backref_number.
  (match_begin): ditto.
  (match_end): ditto.
  (name_to_backref_number): raise IndexError instead of RuntimeError.
  (match_inspect): show capture index.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14158 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 07:12:44 +00:00
akr 5a1c2b2677 * re.c (append_utf8): check unicode range.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 03:50:11 +00:00
akr b12bb50149 * re.c (rb_reg_check_preprocess): new function for validating regexp
fragment.

* parse.y (regexp): invoke reg_fragment_check.
  (reg_fragment_check): defined.
  (reg_fragment_check_gen): defined.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14133 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-08 07:21:05 +00:00
akr f1b7e60cb9 * encoding.c (rb_enc_mbclen): make it never fail.
(rb_enc_nth): don't check the return value of rb_enc_mbclen.
  (rb_enc_strlen): ditto.
  (rb_enc_precise_mbclen): return needmore(1) if e <= p.
  (rb_enc_get_ascii): new function for extracting ASCII character.

* include/ruby/encoding.h (rb_enc_get_ascii): declared.

* include/ruby/regex.h (ismbchar): removed.

* re.c (rb_reg_expr_str): use rb_enc_get_ascii.
  (unescape_escaped_nonascii): use rb_enc_precise_mbclen to determine
  the termination of escaped non-ASCII character.
  (unescape_nonascii): use rb_enc_precise_mbclen.
  (rb_reg_quote): use rb_enc_get_ascii.
  (rb_reg_regsub): use rb_enc_get_ascii.

* string.c (rb_str_reverse): don't check the return value of
  rb_enc_mbclen.
  (rb_str_split_m): don't call rb_enc_mbclen with e <= p.

* parse.y (is_identchar): use ISASCII.
  (parser_ismbchar): removed.
  (parser_precise_mbclen): new macro.
  (parser_isascii): new macro.
  (parser_tokadd_mbchar): use parser_precise_mbclen to check invalid
  character precisely.
  (parser_tokadd_string): use parser_isascii.
  (parser_yylex): ditto.
  (is_special_global_name): don't call is_identchar with e <= p.
  (rb_enc_symname_p): ditto.

  [ruby-dev:32455]

* ext/tk/sample/tkextlib/vu/canvSticker2.rb: remove coding cookie
  because the encoding is not UTF-8.  [ruby-dev:32475]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-08 02:50:43 +00:00
akr 6af5227ec0 fix Regexp#inspect document.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02 15:46:21 +00:00
akr 7f65110b53 document MatchData#inspect.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02 14:42:05 +00:00
akr c650096adf * re.c (unescape_escaped_nonascii): fix mbclen argument.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14084 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02 11:45:02 +00:00
akr 9bd11f24b3 s/unicode/Unicode/ in error messages.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14078 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02 02:53:46 +00:00
akr 7ff702406a * include/ruby/intern.h (rb_uv_to_utf8): declared.
* re.c (rb_reg_preprocess): new function for dynamic regexp with
  \u{} such as Regexp.new("\\u{6666}").
  (rb_reg_prepare_re): preprocess regexp for recompiling.
  (read_escaped_byte): new function.
  (unescape_escaped_nonascii): new function.
  (append_utf8): new function.
  (unescape_unicode_list): new function.
  (unescape_unicode_bmp): new function.
  (unescape_nonascii): new function.
  (rb_reg_initialize): preprocess regexp.

* pack.c (rb_uv_to_utf8): renamed from uv_to_utf8.

* parse.y (STR_NEW3): take func instead of has8 and hasmb.
  (parser_str_new): use default coderange mechanism except for regexp.
  (parser_tokadd_utf8): copy regexp source as-is.
  (parser_read_escape): UTF-8 stuff removed.
  (parser_tokadd_escape): has8bit and hasmb removed.
  (parser_tokadd_string): fix 8-bit single byte character with \u.
  (parser_parse_string): has8bit and hasmb removed.
  (parser_here_document): has8bit and hasmb removed.
  (parser_yylex): call parser_tokadd_utf8 instead of read_escape for
  UTF-8 character.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-01 16:56:19 +00:00
akr f5ee0fd521 * include/ruby/encoding.h, encoding.c, re.c, string.c, parse.y:
rename ENC_CODERANGE_SINGLE to ENC_CODERANGE_7BIT.
  rename ENC_CODERANGE_MULTI to ENC_CODERANGE_8BIT.
  Because single byte 8bit character, such as Shift_JIS 1byte katakana,
  is represented by ENC_CODERANGE_MULTI even if it is not multi byte.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14027 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-27 02:21:17 +00:00
akr cbd72b86da * re.c (Init_Regexp): new method Regexp#fixed_encoding?
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-26 08:33:11 +00:00
akr 0dc9be63a5 * re.c (rb_reg_fixed_encoding_p): extracted from rb_reg_prepare_re and
rb_reg_s_union.
  (rb_reg_s_union): refactored.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-26 02:27:59 +00:00
akr b2e60b2ce7 * include/ruby/encoding.h (rb_enc_str_asciionly_p): declared.
(rb_enc_str_asciicompat_p): defined.

* re.c (rb_reg_initialize_str): use rb_enc_str_asciionly_p.
  (rb_reg_quote): return ascii-8bit string if the argument is
  ascii-only to generate encoding generic regexp if possible.
  (rb_reg_s_union): fix encoding handling.  [ruby-dev:32094]

* string.c (rb_enc_str_asciionly_p): defined.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14013 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-25 13:25:34 +00:00
akr 2109a52503 * re.c (REG_CASESTATE): unused macro removed.
(rb_reg_prepare_re): check encoding difference.
  (rb_reg_initialize): check 8bit byte.

* parse.y (parser_tokadd_escape): fix has8bit.

  [ruby-dev:32113]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-23 06:30:26 +00:00
matz d73f08d56d * re.c (match_begin): should return offset by character.
[ruby-dev:32331]

* re.c (match_end): ditto.

* re.c (rb_reg_search): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13999 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-23 02:10:44 +00:00
akr af9c868eae * re.c (rb_reg_quote): quote \v as well.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13818 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-04 15:03:31 +00:00
akr 794fc684e8 * re.c (rb_reg_initialize_m): use StringValuePtr instead of
StringValueCStr because \0 exists when Regexp.new("\0").


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13817 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-04 14:53:36 +00:00
nobu c7697aba34 * parse.y (parser_regx_options, reg_compile_gen): relaxed encoding
matching rule.

* re.c (rb_reg_initialize): always set encoding of Regexp.

* re.c (rb_reg_initialize_str): fix encoding for non 7bit-clean
  strings.

* re.c (rb_reg_initialize_m): use ascii encoding for 'n' option.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13743 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-19 07:41:03 +00:00
matz 05737c3500 * re.c (rb_reg_s_union): the last check was not complete.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13733 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-17 05:21:10 +00:00
nobu 2d1d6c4705 * encoding.c (rb_enc_from_encoding, rb_enc_register): associate index
to self.

* encoding.c (enc_capable): Encoding objects are encoding capable.

* re.c (rb_reg_s_union): check if encoding matching by exact encoding
  objects.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13732 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-17 02:30:57 +00:00
nobu b06a606278 * re.c (rb_reg_desc): set encoding.
* re.c (rb_reg_s_union): check encodings.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13728 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 18:37:09 +00:00
nobu 81ed881511 * re.c (rb_reg_initialize_m): allow binary encoding option.
[ruby-dev:32083]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13725 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 16:57:08 +00:00
nobu 5d8ba5a43f * re.c (rb_reg_s_union): check for encoding of original object.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13723 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 10:48:02 +00:00
nobu 676dc908b6 * parse.y (parser_regx_options): check if regexp encoding option
matches the current encoding.

* re.c (char_to_option, rb_char_to_option_kcode): 'n' is not kcode
  option now.

* re.c (rb_reg_to_s, rb_reg_error_desc): copy encoding rather than
  append as an option.

* re.c (make_regexp, rb_reg_prepare_re): use encoding of Regexp and
  String instead of kcode.

* re.c (rb_reg_initialize): set fixed option if none is set.

* re.c (rb_reg_regcomp): ditto.

* re.c (rb_reg_equal): check if encodings are equal.

* re.c (rb_reg_initialize_m): encoding option is obsolete.

* re.c (rb_kcode, rb_get_kcode, rb_set_kcode): removed.

* re.c (Init_Regexp): removed Regexp#kcode method.

* ruby.c (proc_options): allow long encoding name.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13717 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 05:48:40 +00:00
matz 9f00119776 * re.c (rb_reg_s_union): encoding of all regexp objects should
match.  [ruby-dev:32076]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13716 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 05:06:30 +00:00
matz ba9eb2c929 * re.c (match_values_at): make #select to be alias to #values_at
to adapt RDoc description.  [ruby-core:12588]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13683 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-12 14:35:26 +00:00
matz 79a202433c * re.c (rb_reg_s_quote): no longer takes optional second argument
that has never been documented.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13671 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-10 14:34:42 +00:00
akr cf9bdd01d8 fix rdoc position of Regexp.union.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13658 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-08 16:03:53 +00:00
akr d751dad12a add an example for Regexp.union document.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13642 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-06 05:40:45 +00:00
nobu 6845578c92 * insns.def (opt_eq): get rid of gcc bug.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13641 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-06 05:32:37 +00:00
matz bd00bb3ef7 * include/ruby/defines.h: no longer provide DEFAULT_KCODE.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13640 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-05 17:39:59 +00:00
akr dea669cf4e * re.c (rb_reg_s_union_m): Regexp.union accepts single argument which
is an array of patterns. 


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13638 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-05 12:26:35 +00:00
matz 1d758debe0 replace rb_memcicmp()
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13624 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 09:59:56 +00:00
matz c953283d7e revert rb_memcmp() change to pacify GCC optimizer
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13623 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 09:54:53 +00:00
matz 1677425e9d * re.c (rb_memcmp): no longer useful without ruby_ignorecase.
* re.c (rb_reg_prepare_re): revert recompile condition.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13622 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 09:24:00 +00:00
matz 506cdbf64a * re.c (kcode_setter): restore erroneously removed setter.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13621 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 08:34:06 +00:00
matz dbcc539602 * re.c (ignorecase_setter): change warning message.
* re.c (ignorecase_getter): now gives warning.

* string.c (rb_str_cmp_m): update RDoc document.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13620 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 08:09:06 +00:00
matz 9a2a45cd69 * re.c (Init_Regexp): remove obsolete const alias: MatchingData.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 07:54:53 +00:00
matz 1c9a2e1154 * re.c (kcode_setter): Perl-ish global variable `$=' no longer
effective.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13616 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 07:31:50 +00:00
nobu 19dee8af57 * encoding.c (rb_obj_encoding): returns encoding of the given object.
* re.c (Init_Regexp): new method Regexp#encoding.

* string.c (str_encoding): moved to encoding.c


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13613 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 06:57:19 +00:00
akr 910b0709ed * re.c (Init_Regexp): test DEFAULT_KCODE in C code because
KCODE_EUC, etc are enum.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13571 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-09-29 19:06:40 +00:00
matz 5376745fb6 * re.c (rb_reg_match_m): evaluate a block if match. it would make
condition statement much shorter, if no else clause is needed.

* string.c (rb_str_match_m): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13475 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-09-20 17:14:01 +00:00
matz edd7c787ad * array.c (rb_ary_cycle): typo in rdoc. a patch from Yugui
<yugui@yugui.sakura.ne.jp>.  [ruby-dev:31748]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13348 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-09-06 12:33:45 +00:00
matz 3d7f8c2320 * string.c (str_gsub): should not use mbclen2() which has broken API.
* re.c: remove rb_reg_mbclen2().

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-29 19:16:02 +00:00
nobu 69099d3e69 * re.c (rb_reg_mbclen2): suppress a warning.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-29 03:19:00 +00:00
matz 51b4cc11d1 * string.c (rb_str_subseq): retrieve substring based on byte offset.
* string.c (rb_str_rindex_m): was confusing character offset and
  byte offset.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-28 06:45:32 +00:00
nobu c456863bd6 * parse.y, re.c: re-applied revision 13092.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13267 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-25 07:06:47 +00:00
matz a25fbe3b3e * encoding.c: provide basic features for M17N.
* parse.y: encoding aware parsing.

* parse.y (pragma_encoding): encoding specification pragma.

* parse.y (rb_intern3): encoding specified symbols.

* string.c (rb_str_length): length based on characters.  
  for older behavior, bytesize method added.

* string.c (rb_str_index_m): index based on characters.  rindex as
  well.

* string.c (succ_char): encoding aware succeeding string.

* string.c (rb_str_reverse): reverse based on characters.

* string.c (rb_str_inspect): encoding aware string description.

* string.c (rb_str_upcase_bang): encoding aware case conversion.
  downcase, capitalize, swapcase as well.

* string.c (rb_str_tr_bang): tr based on characters.  delete,
  squeeze, tr_s, count as well.

* string.c (rb_str_split_m): split based on characters.

* string.c (rb_str_each_line): encoding aware each_line.

* string.c (rb_str_each_char): added.  iteration based on
  characters.

* string.c (rb_str_strip_bang): encoding aware whitespace
  stripping.  lstrip, rstrip as well.

* string.c (rb_str_justify): encoding aware justifying (ljust,
  rjust, center).

* string.c (str_encoding): get encoding attribute from a string. 

* re.c (rb_reg_initialize): encoding aware regular expression

* sprintf.c (rb_str_format): formatting (i.e. length count) based
  on characters.

* io.c (rb_io_getc): getc to return one-character string.
  for older behavior, getbyte method added.

* ext/stringio/stringio.c (strio_getc): ditto.

* io.c (rb_io_ungetc): allow pushing arbitrary string at the
  current reading point.

* ext/stringio/stringio.c (strio_ungetc): ditto.

* ext/strscan/strscan.c: encoding support.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13261 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-25 03:29:39 +00:00
matz bdf32ff14f * array.c (rb_ary_s_try_convert): more document description.
* re.c (rb_reg_s_try_convert): typo fixed.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13256 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-25 00:43:13 +00:00
matz 5e1c401ff5 * array.c (rb_ary_s_try_convert): a new class method to convert
object or nil if it's not target-type.  this mechanism is used
  to convert types in the C implemented methods.

* hash.c (rb_hash_s_try_convert): ditto.

* io.c (rb_io_s_try_convert): ditto.

* re.c (rb_reg_s_try_convert): ditto.

* string.c (rb_str_s_try_convert): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13251 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-24 17:47:09 +00:00
nobu 3f025d2078 * parse.y (reg_compile_gen): obtain error info from errinfo.
* re.c (rb_reg_error_desc): make RegexpError for initialization error.

* re.c (rb_reg_compile): return nil and set errinfo if error.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13092 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-18 05:05:36 +00:00
nobu 11e1e96f4b * re.c (option_to_str, arg_kcode, opt_kcode): options conversion
between int and string.

* re.c (rb_reg_compile): append regexp options to error message.
  [ruby-dev:31334]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12863 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-02 14:42:59 +00:00
nobu d9274e7d6b * parse.y (reg_compile_gen): set error if failed to compile regexp
literal.  [ruby-dev:31336]

* re.c (rb_reg_compile): should not use regexp which could not get
  initialized.  [ruby-dev:31333]
  return error message to let the parser know it.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12862 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-02 14:36:25 +00:00
nobu 46603a78af * include/ruby/{intern,ruby}.h, compile.[ch], error.c, eval.c,
eval_load.c, gc.c, iseq.c, main.c, parse.y, re.c, ruby.c,
  yarvcore.[ch] (ruby_eval_tree, ruby_sourcefile, ruby_sourceline,
  ruby_nerrs): purge global variables.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12700 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-07-05 08:12:18 +00:00
akr fe377d3b8e update document to follow MatchData#inspect implementation.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12589 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-06-23 08:34:21 +00:00
akr 18ee945174 * re.c (match_inspect): MatchData#inspect implemented.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12588 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-06-23 08:26:08 +00:00
nobu 2b592580bf * include/ruby: moved public headers.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12501 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-06-10 03:06:15 +00:00
nobu 99d65b14b4 * compile.c, dir.c, eval.c, eval_jump.h, eval_method.h, numeric.c,
pack.c, parse.y, re.c, thread.c, vm.c, vm_dump.c, call_cfunc.ci,
  thread_pthread.ci, thread_win32.ci: fixed indentation.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12431 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-06-05 04:25:10 +00:00
matz 6ee2e54239 * oniguruma.h: updated to Oniguruma 5.7.0.
* regsyntax.c, unicode.c: new files along with Oniguruma 5.x.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12376 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-05-23 01:32:08 +00:00
matz 3098d80818 * re.c (reg_operand): allow symbols to be operands for regular
expression matches.

* string.c (Init_String): allow Symbol#===.

* lib/date/format.rb (Date::Format::Bag::to_hash): string
  added prefixes.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@11723 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-02-14 04:57:25 +00:00
matz 6bf30a90ef * dir.c (dir_s_glob): restore GC protection volatile variable.
[ruby-dev:29588]

* re.c (rb_reg_regcomp): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10960 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-09-18 06:33:03 +00:00
matz 749df1d0fd * dir.c (dir_s_glob): remove unused variable.
* math.c (math_log): ditto.

* re.c (rb_reg_regcomp): ditto.

* eval.c (break_jump): ditto.

* eval.c (rb_thread_yield_0): remove unused function.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10957 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-09-18 01:59:00 +00:00
matz 54af80844f * ruby.h (struct RString): embed small strings.
(RSTRING_LEN): defined for accessing string members.
  (RSTRING_PTR): ditto.

* string.c: use RSTRING_LEN and RSTRING_PTR.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10810 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-08-31 10:47:44 +00:00
matz c09cea5e1b * object.c (rb_mod_attr): make Module#attr to be an alias to
attr_reader.  [RCR#331]

* ruby.h: export classes/modules to implement sandbox.
  [ruby-core:08283]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10577 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-07-20 17:36:36 +00:00
matz 1b7465e893 * eval.c, file.c, etc.: code-cleanup patch from Stefan Huehner
<stefan at huehner.org>.  [ruby-core:08029]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10351 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-06-20 18:02:17 +00:00
matz 9b383bd6cf * sprintf.c (rb_str_format): allow %c to print one character
string (e.g. ?x).

* lib/tempfile.rb (Tempfile::make_tmpname): put dot between
  basename and pid.  [ruby-talk:196272]
* parse.y (do_block): remove -> style block.

* parse.y (parser_yylex): remove tLAMBDA_ARG.

* eval.c (rb_call0): binding for the return event hook should have
  consistent scope.  [ruby-core:07928]

* eval.c (proc_invoke): return behavior should depend on whether it
  is surrounded by a lambda or a mere block.

* eval.c (formal_assign): handles post splat arguments.

* eval.c (rb_call0): ditto.

* st.c (strhash): use FNV-1a hash.

* parse.y (parser_yylex): removed experimental ';;' terminator.

* eval.c (rb_node_arity): should be aware of post splat arguments.

* eval.c (rb_proc_arity): ditto.

* parse.y (f_args): syntax rule enhanced to support arguments
  after the splat.

* parse.y (block_param): ditto for block parameters.

* parse.y (f_post_arg): mandatory formal arguments after the splat
  argument.

* parse.y (new_args_gen): generate nodes for mandatory formal
  arguments after the splat argument.

* eval.c (rb_eval): dispatch mandatory formal arguments after the
  splat argument.

* parse.y (args): allow more than one splat in the argument list.

* parse.y (method_call): allow aref [] to accept all kinds of
  method arguments, including assocs, splat, and block argument.

* eval.c (SETUP_ARGS0): prepare block argument as well.

* lib/mathn.rb (Integer): remove Integer#gcd2. [ruby-core:07931]

* eval.c (error_line): print receivers true/false/nil specially.

* eval.c (rb_proc_yield): handles parameters in yield semantics.

* eval.c (nil_yield): gives LocalJumpError to denote no block
  error.

* io.c (rb_io_getc): now takes one-character string.

* string.c (rb_str_hash): use FNV-1a hash from Fowler/Noll/Vo
  hashing algorithm.

* string.c (rb_str_aref): str[0] now returns 1 character string,
  instead of a fixnum.	[Ruby2]

* parse.y (parser_yylex): ?c now returns 1 character string,
  instead of a fixnum.	[Ruby2]

* string.c (rb_str_aset): no longer support fixnum insertion.

* eval.c (umethod_bind): should not update original class.
  [ruby-dev:28636]

* eval.c (ev_const_get): should support constant access from
  within instance_eval().  [ruby-dev:28327]

* time.c (time_timeval): should round for usec floating
  number.  [ruby-core:07896]

* time.c (time_add): ditto.

* dir.c (sys_warning): should not call a vararg function
  rb_sys_warning() indirectly.	[ruby-core:07886]

* numeric.c (flo_divmod): the first element of Float#divmod should
  be an integer. [ruby-dev:28589]

* test/ruby/test_float.rb: add tests for divmod, div, modulo and remainder.

* re.c (rb_reg_initialize): should not allow modifying literal
  regexps.  frozen check moved from rb_reg_initialize_m as well.

* re.c (rb_reg_initialize): should not modify untainted objects in
  safe levels higher than 3.

* re.c (rb_memcmp): type change from char* to const void*.

* dir.c (dir_close): should not close untainted dir stream.

* dir.c (GetDIR): add tainted/frozen check for each dir operation.

* lib/rdoc/parsers/parse_rb.rb (RDoc::RubyParser::parse_symbol_arg):
  typo fixed.  a patch from Florian Gross <florg at florg.net>.

* eval.c (EXEC_EVENT_HOOK): trace_func may remove itself from
  event_hooks.	no guarantee for arbitrary hook deletion.
  [ruby-dev:28632]

* util.c (ruby_strtod): defer addition to minimize error.
  [ruby-dev:28619]

* util.c (ruby_strtod): should not raise ERANGE when the input
  string does not have any digits.  [ruby-dev:28629]

* eval.c (proc_invoke): should restore old ruby_frame->block.
  thanks to ts <decoux at moulon.inra.fr>.  [ruby-core:07833]
  also fix [ruby-dev:28614] as well.

* signal.c (trap): sig should be less than NSIG.  Coverity found
  this bug.  a patch from Kevin Tew <tewk at tewk.com>.
  [ruby-core:07823]

* math.c (math_log2): add new method inspired by
  [ruby-talk:191237].

* math.c (math_log): add optional base argument to Math::log().
  [ruby-talk:191308]

* ext/syck/emitter.c (syck_scan_scalar): avoid accessing
  uninitialized array element.	a patch from Pat Eyler
  <rubypate at gmail.com>.  [ruby-core:07809]

* array.c (rb_ary_fill): initialize local variables first.  a
  patch from Pat Eyler <rubypate at gmail.com>.	 [ruby-core:07810]

* ext/syck/yaml2byte.c (syck_yaml2byte_handler): need to free
  type_tag.  a patch from Pat Eyler <rubypate at gmail.com>.
  [ruby-core:07808]

* ext/socket/socket.c (make_hostent_internal): accept ai_family
  check from Sam Roberts <sroberts at uniserve.com>.
  [ruby-core:07691]

* util.c (ruby_strtod): should not cut off 18 digits for no
  reason.  [ruby-core:07796]

* array.c (rb_ary_fill): internalize local variable "beg" to
  pacify Coverity.  [ruby-core:07770]

* pack.c (pack_unpack): now supports CRLF newlines.  a patch from
  <tommy at tmtm.org>.	[ruby-dev:28601]

* applied code clean-up patch from Stefan Huehner
  <stefan at huehner.org>.  [ruby-core:07764]

* lib/jcode.rb (String::tr_s): should have translated non
  squeezing character sequence (i.e. a character) as well.  thanks
  to Hiroshi Ichikawa <gimite at gimite.ddo.jp> [ruby-list:42090]

* ext/socket/socket.c: document update patch from Sam Roberts
  <sroberts at uniserve.com>.  [ruby-core:07701]

* lib/mathn.rb (Integer): need not remove gcd2.  a patch from
  NARUSE, Yui <naruse at airemix.com>.	[ruby-dev:28570]

* parse.y (arg): too much NEW_LIST()

* eval.c (SETUP_ARGS0): remove unnecessary access to nd_alen.

* eval.c (rb_eval): use ARGSCAT for NODE_OP_ASGN1.
  [ruby-dev:28585]

* parse.y (arg): use NODE_ARGSCAT for placeholder.

* lib/getoptlong.rb (GetoptLong::get): RDoc update patch from
  mathew <meta at pobox.com>.  [ruby-core:07738]

* variable.c (rb_const_set): raise error when no target klass is
  supplied.  [ruby-dev:28582]

* prec.c (prec_prec_f): documentation patch from
  <gerardo.santana at gmail.com>.  [ruby-core:07689]

* bignum.c (rb_big_pow): second operand may be too big even if
  it's a Fixnum.  [ruby-talk:187984]

* README.EXT: update symbol description.  [ruby-talk:188104]

* COPYING: explicitly note GPLv2.  [ruby-talk:187922]

* parse.y: remove some obsolete syntax rules (unparenthesized
  method calls in argument list).

* eval.c (rb_call0): insecure calling should be checked for non
  NODE_SCOPE method invocations too.

* eval.c (rb_alias): should preserve the current safe level as
  well as method definition.

* process.c (rb_f_sleep): remove RDoc description about SIGALRM
  which is not valid on the current implementation. [ruby-dev:28464]

 Thu Mar 23 21:40:47 2006  K.Kosako  <sndgk393 AT ybb.ne.jp>

* eval.c (method_missing): should support argument splat in
  super.  a bug in combination of super, splat and
  method_missing.  [ruby-talk:185438]

* configure.in: Solaris SunPro compiler -rpath patch from
  <kuwa at labs.fujitsu.com>.  [ruby-dev:28443]

* configure.in: remove enable_rpath=no for Solaris.
  [ruby-dev:28440]

* ext/win32ole/win32ole.c (ole_val2olevariantdata): change behavior
  of converting  OLE Variant object with VT_ARRAY|VT_UI1 and Ruby
  String object.

* ruby.1: a clarification patch from David Lutterkort
  <dlutter at redhat.com>.  [ruby-core:7508]

* lib/rdoc/ri/ri_paths.rb (RI::Paths): adding paths from rubygems
  directories.	a patch from Eric Hodel <drbrain at segment7.net>.
  [ruby-core:07423]

* eval.c (rb_clear_cache_by_class): clearing wrong cache.

* ext/extmk.rb: use :remove_destination to install extension libraries
  to avoid SEGV.  [ruby-dev:28417]

* eval.c (rb_thread_fd_writable): should not re-schedule output
  from KILLED thread (must be error printing).

* array.c (rb_ary_flatten_bang): allow specifying recursion
  level.  [ruby-talk:182170]

* array.c (rb_ary_flatten): ditto.

* gc.c (add_heap): a heap_slots may overflow.  a patch from Stefan
  Weil <weil at mail.berlios.de>.

* eval.c (rb_call): use separate cache for fcall/vcall
  invocation.

* eval.c (rb_eval): NODE_FCALL, NODE_VCALL can call local
  functions.

* eval.c (rb_mod_local): a new method to specify newly added
  visibility "local".

* eval.c (search_method): search for local methods which are
  visible only from the current class.

* class.c (rb_class_local_methods): a method to list local methods.

* object.c (Init_Object): add BasicObject class as a top level
  BlankSlate class.

* ruby.h (SYM2ID): should not cast to signed long.
  [ruby-core:07414]

* class.c (rb_include_module): allow module duplication.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10235 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-06-09 21:20:17 +00:00
kosako 8c5f9ef2ea don't use onig_recompile()
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10155 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-05-15 12:39:25 +00:00
kosako 6e154b9743 refactoring for options
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10054 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-03-26 13:04:13 +00:00
kosako 2025346077 RDoc description updated
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10053 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-03-24 12:14:18 +00:00
kosako 50091c990d prohibit number backref in replaced string for named pattern
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10050 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-03-23 12:43:51 +00:00
kosako 0fa1086760 add back reference by name in replace string
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10046 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-03-22 15:03:40 +00:00
kosako a350e9c6e2 add String/Symbol argument to MatchData[x]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10045 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-03-21 13:16:21 +00:00
akr 51f4d6645b * re.c (rb_reg_regcomp): fix a GC problem on x86_64 with
gcc 3.3.5 (Debian 1:3.3.5-13).


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@9684 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-12-13 03:27:51 +00:00