Commit graph

543 commits

Author SHA1 Message Date
zzak 450b9bb6cb * re.c (rb_reg_eqq): doc: #=== is not a synonym for #=~, added example
[ruby-dev:46746] [Bug #7571]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@38567 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-12-23 05:52:50 +00:00
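The distinction documented in 450b9bb6cb can be illustrated with a short Ruby sketch. It is not taken from the commit itself: the sample operands are made up for illustration, and the behaviour shown is what current Ruby releases are expected to do.

```ruby
# Regexp#=== answers "does it match?" with true or false,
# while Regexp#=~ returns the position of the match (or nil).
pattern = /a/

eqq_result   = pattern === "cat"   # true
match_result = pattern =~ "cat"    # 1, the index where the match starts

# === returns false for operands that are neither strings nor symbols...
eqq_on_integer = pattern === 1     # false

# ...whereas =~ raises TypeError for the same operand.
raised = begin
  pattern =~ 1
  false
rescue TypeError
  true
end
```

This is why `===` is safe to use in `case`/`when` against arbitrary objects, while `=~` is not.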
knu 61e21e82ad Apply performance improvement to short byte array search.
* re.c (rb_memsearch_ss): Apply performance improvement to short
  byte array search for platforms without memmem(3).
  [Feature #6311] [ruby-dev:45530]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@37793 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-11-22 05:23:12 +00:00
glass c5b19cf01c * re.c (rb_memsearch_ss): performance improvement by using memmem(3) if
possible. [ruby-dev:45530] [Feature #6311]

* configure.in: check existence of memmem(3) and that it is not broken.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@37634 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-11-13 02:12:40 +00:00
glass 9f9ebe4eba * re.c (rb_memsearch): performance improvement by using memchr().
[ruby-dev:45397] [Feature #6173]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@37564 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-11-08 13:40:33 +00:00
nari c3a46d6aca * include/ruby/ruby.h: add C APIs.
VALUE rb_newobj_of(VALUE klass, VALUE flags)
  #define NEWOBJ_OF(obj,type,klass,flags)
  These allow changing the allocation strategy depending on klass
  or flags.

* gc.c: ditto

* array.c: use new C API.
* bignum.c: ditto
* class.c: ditto
* complex.c: ditto
* ext/socket/ancdata.c: ditto
* ext/socket/option.c: ditto
* hash.c: ditto
* io.c: ditto
* marshal.c: ditto
* numeric.c: ditto
* object.c: ditto
* random.c: ditto
* range.c: ditto
* rational.c: ditto
* re.c: ditto
* string.c: ditto
* struct.c: ditto
  [Feature #7177][Feature #7047]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@37275 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-10-20 06:57:51 +00:00
drbrain 6ac1d39ace * re.c (rb_reg_initialize_m): Forgot to update output for or'd-options
example.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@36742 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-08-20 20:26:06 +00:00
drbrain decaaf845e * re.c (rb_reg_initialize_m): Update example to show that regexp
options use | and not || to avoid confusion.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@36740 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-08-20 20:19:21 +00:00
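The point of decaaf845e (and its follow-up 6ac1d39ace) can be sketched as below. This is an illustrative example, not the text added to the rdoc; the pattern and variable names are assumptions.

```ruby
# Option constants are bit flags, so they are combined with bitwise OR.
combined = Regexp::IGNORECASE | Regexp::MULTILINE
re = Regexp.new("a.c", combined)

# Logical || simply returns its first truthy operand, so MULTILINE is lost.
mistaken = Regexp::IGNORECASE || Regexp::MULTILINE

# Both flags are present on the compiled regexp.
both_flags_set = (re.options & combined) == combined

# With both options, "." matches a newline and case is ignored.
matches_across_newline = !(re =~ "A\nC").nil?
```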
drbrain a10f6137cc * re.c (rb_reg_s_last_match): Update $~ to reference Regexp
documentation about "special global variables".  [Bug #6723]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@36526 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-07-24 21:49:31 +00:00
nobu 2073258a7d obj_init_copy
* object.c (rb_obj_init_copy): should check if trusted too.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@35922 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-06-05 11:13:18 +00:00
nobu b0dd250dc9 use RB_TYPE_P() instead of comparison of TYPE()
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@35763 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-05-23 07:13:21 +00:00
drbrain 2dece928e0 * re.c (rb_reg_equal): Removed incorrect example for Regexp#== with
"n" option.  [ruby-talk - Bug #6415]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@35600 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-05-08 21:33:31 +00:00
drbrain da39d32f60 * encoding.c (rb_enc_codepoint_len): Use UNREACHABLE to avoid "control
reaches end of non-void function" warnings.  [ruby-trunk - Bug #6066]
* re.c (name_to_backref_number):  ditto.
* object.c (rb_Float):  ditto.
* io.c (io_readpartial):  ditto.
* io.c (io_read_nonblock):  ditto.
* pack.c (rb_uv_to_utf8):  ditto.
* proc.c (rb_method_entry_arity):  ditto.
* vm_method.c (rb_f_notimplement):  ditto.
* struct.c (rb_struct_aset_id):  ditto.
* class.c (rb_scan_args):  ditto.
* process.c (rlimit_resource_type):  ditto.
* process.c (rlimit_resource_value):  ditto.
* process.c (p_uid_switch):  ditto.
* process.c (p_gid_switch):  ditto.
* ext/digest/digest.c (rb_digest_instance_update):  ditto.
* ext/digest/digest.c (rb_digest_instance_finish):  ditto.
* ext/digest/digest.c (rb_digest_instance_reset):  ditto.
* ext/digest/digest.c (rb_digest_instance_block_length):  ditto.
* ext/bigdecimal/bigdecimal.c (BigDecimalCmp):  ditto.
* ext/dl/handle.c (rb_dlhandle_close):  ditto.
* ext/tk/tcltklib.c (pending_exception_check0):  ditto.
* ext/tk/tcltklib.c (pending_exception_check1):  ditto.
* ext/tk/tcltklib.c (ip_cancel_eval_core):  ditto.
* ext/tk/tcltklib.c (lib_get_reltype_name):  ditto.
* ext/tk/tcltklib.c (create_dummy_encoding_for_tk_core):  ditto.
* ext/tk/tkutil/tkutil.c (tk_hash_kv):  ditto.
* ext/openssl/ossl_ssl.c (ossl_ssl_session_reused):  ditto.
* ext/openssl/ossl_pkey_ec.c (ossl_ec_key_dsa_verify_asn1):  ditto.
* ext/openssl/ossl_pkey_ec.c (ossl_ec_point_is_at_infinit):  ditto.
* ext/openssl/ossl_pkey_ec.c (ossl_ec_point_is_on_curve):  ditto.
* ext/fiddle/conversions.c (generic_to_value):  ditto.
* ext/socket/raddrinfo.c (rsock_io_socket_addrinfo):  ditto.
* ext/socket/socket.c (sock_s_getnameinfo):  ditto.
* ext/ripper/eventids2.c (ripper_token2eventid):  ditto.
* cont.c (return_fiber):  ditto.
* dmydln.c (dln_load):  ditto.
* vm_insnhelper.c (vm_search_normal_superclass):  ditto.
* bignum.c (big_fdiv):  ditto.
* marshal.c (r_symlink):  ditto.
* marshal.c (r_symbol):  ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@35321 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-04-13 23:45:37 +00:00
marcandre 7316302483 * include/ruby/intern.h: Add rb_check_arity, rb_error_arity [#6085]
* array.c: Use rb_check_arity / rb_error_arity

* class.c: ditto

* enumerator.c: ditto

* eval.c: ditto

* file.c: ditto

* hash.c: ditto

* numeric.c: ditto

* proc.c: ditto

* process.c: ditto

* random.c: ditto

* re.c: ditto

* signal.c: ditto

* string.c: ditto

* struct.c: ditto

* transcode.c: ditto

* vm_eval.c: ditto

* vm_insnhelper.c: ditto & implementation of rb_error_arity

* test/ruby/test_arity.rb: tests for above

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@35024 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-03-14 21:10:34 +00:00
naruse 88b16cebc8 * gc.c (rb_objspace_free): global_List is allocated with xmalloc.
patched by Sokolov Yura.  https://github.com/ruby/ruby/pull/78

* dln_find.c: remove useless replacement of free.

* ext/readline/readline.c (readline_attempted_completion_function):
  strings for readline must be allocated with malloc.

* process.c (run_exec_dup2): use free; see also r20950.

* re.c (onig_new_with_source): use malloc for oniguruma.

* vm.c (ruby_vm_destruct): use free for VMs.

* vm.c (thread_free): use free for threads.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34238 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-01-08 21:02:08 +00:00
nobu afea9046a9 * re.c (rb_reg_initialize): fix indent.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@33799 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-11-20 14:05:44 +00:00
drbrain 430f4da042 * re.c (match_aref): Use <code> around indexing examples to prevent
hyperlinks.  [ruby-talk:389396]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@33522 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-10-24 21:35:05 +00:00
nobu 8e6e8e6288 * use RB_TYPE_P which is optimized for constant types, instead of
comparison with TYPE.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@33357 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-09-29 11:07:45 +00:00
akr e7996eb3cc * internal.h: declare internal functions here.
* node.h: declare NODE dependent internal functions here.

* iseq.h: declare rb_iseq_t dependent internal functions here.

* vm_core.h: declare rb_thread_t dependent internal functions here.

* bignum.c, class.c, compile.c, complex.c, cont.c, dir.c, encoding.c,
  enumerator.c, error.c, eval.c, file.c, gc.c, hash.c, inits.c, io.c,
  iseq.c, load.c, marshal.c, math.c, numeric.c, object.c, parse.y,
  proc.c, process.c, range.c, rational.c, re.c, ruby.c, string.c,
  thread.c, time.c, transcode.c, variable.c, vm.c,
  tool/compile_prelude.rb: don't declare internal functions declared
  in above headers.  include above headers if required.

  Note that rb_thread_mark() was declared as
  void rb_thread_mark(rb_thread_t *th) in cont.c but defined as
  void rb_thread_mark(void *ptr) in vm.c.  Now it is declared as
  the latter in internal.h.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@32156 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-06-17 22:43:38 +00:00
naruse f7b046987d * re.c (rb_reg_match): fix rdoc of Regexp#=~.
patched by Tsuyoshi Sawada. [Bug #4781]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@31781 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-05-29 14:18:34 +00:00
drbrain e2b3183fc2 * re.c (Init_Regexp): Document option constants. Patch by Vincent
Batts.  [Ruby 1.9 - Bug #4677]
	* lib/uri/common.rb (module URI):  Documentation for URI.  Patch by
	  Vincent Batts.  [Ruby 1.9- Bug #4677]
	* lib/uri/ftp.rb (module URI):  ditto
	* lib/uri/generic.rb (module URI):  ditto
	* lib/uri/http.rb (module URI):  ditto
	* lib/uri/https.rb (module URI):  ditto
	* lib/uri/ldap.rb (module URI):  ditto
	* lib/uri/ldaps.rb (module URI):  ditto
	* lib/uri/mailto.rb (module URI):  ditto
	* process.c (Init_process):  Document Process constants.  Patch by
	  Vincent Batts.  [Ruby 1.9- Bug #4677]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@31536 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-05-12 20:39:11 +00:00
tenderlove 89ef6628eb * re.c (Init_Regexp): added a constant for ARG_ENCODING_NONE
[ruby-core:35054]
* test/ruby/test_regexp.rb: corresponding test.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@30765 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-02-02 22:18:14 +00:00
kosaki 58da04b398 * re.c (rb_reg_raise): add GC guard to prevent intermediate
variable from GC.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@30684 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-01-27 17:37:30 +00:00
usa 183bbd8b69 Sorry, commit miss of r30412.
* re.c (rb_reg_expr_str): need to escape if the coderange is invalid.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@30418 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-12-29 09:24:37 +00:00
akr 195992f032 * re.c: parenthesize macro arguments.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@30403 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-12-27 09:27:43 +00:00
naruse 3010758245 Revert "* re.c (rb_reg_initialize): don't set US-ASCII to regexp"
This reverts commit r30058.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@30059 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-12-02 19:19:44 +00:00
naruse 4e788aa17e * re.c (rb_reg_initialize): don't set US-ASCII to regexp
when the parser initially compiles a regexp.
  Usually a regexp is used with the same encoding as its script.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@30058 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-12-02 17:40:13 +00:00
usa ca8405298f * re.c (rb_reg_initialize_str): should inherit the taint status from
the origin. [ruby-core:33338]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29932 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-11-26 00:50:42 +00:00
naruse b259e449d1 * random.c (rand_init): remove useless assignment.
* re.c (update_char_offset): remove unused variable.

* re.c (read_escaped_byte): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29408 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-10-04 01:23:58 +00:00
naruse d078f51f57 * re.c (rb_reg_search): fix: 4th argument should be regexp
object. patched by shintaro kuwamoto [ruby-dev:41667] #3459

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29074 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-08-23 03:32:58 +00:00
nobu 8044ac7b43 * re.c (rb_reg_expr_str): fixed out-of-boundary access at invalid
multibyte characters.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@28728 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-07-23 00:02:51 +00:00
naruse 203ebcbb92 * re.c (rb_reg_expr_str): fix broken Regexp#inspect when it
is ASCII-8BIT and non-ASCII character.
  The length of character should be from original byte string.
  [ruby-core:31431]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@28715 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-07-22 07:29:32 +00:00
naruse 3a80743ccf * re.c (rb_reg_expr_str): ASCII incompatible strings
must always be escaped or converted.

* re.c (rb_reg_expr_str): use rb_str_buf_cat_escaped_char
  when resenc is given: for Regexp#inspect or error message.

  * re.c (rb_reg_desc): add 'n' for ENCODING_NONE.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@28177 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-06-05 11:32:05 +00:00
naruse 4c897fdde1 * re.c (unescape_nonascii): \P{FOO} is also Unicode regexp. [ruby-core:30540]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@28120 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-06-01 14:20:59 +00:00
mame 268f95bdc6 * re.c (rb_reg_s_union_m): update rdoc. [ruby-dev:41354]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27929 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-05-20 13:52:18 +00:00
marcandre 914efd0b60 * proc.c (proc_lambda, unnamed_parameters): Small documentation fixes.
* re.c: ditto

* string.c: ditto

* time.c: ditto

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27867 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-05-17 21:07:46 +00:00
marcandre 7729de4d91 * array.c: Documentation: change => in call-seq to ->.
Harmonize "#=>" in examples. [ruby-core:30206]

* bignum.c: ditto

* class.c: ditto

* compar.c: ditto

* cont.c: ditto

* dir.c: ditto

* encoding.c: ditto

* enum.c: ditto

* enumerator.c: ditto

* error.c: ditto

* eval.c: ditto

* file.c: ditto

* gc.c: ditto

* io.c: ditto

* load.c: ditto

* marshal.c: ditto

* math.c: ditto

* numeric.c: ditto

* object.c: ditto

* pack.c: ditto

* proc.c: ditto

* process.c: ditto

* random.c: ditto

* range.c: ditto

* re.c: ditto

* ruby.c: ditto

* signal.c: ditto

* sprintf.c: ditto

* string.c: ditto

* struct.c: ditto

* thread.c: ditto

* time.c: ditto

* transcode.c: ditto

* variable.c: ditto

* vm_eval.c: ditto

* vm_method.c: ditto

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27865 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-05-17 21:07:33 +00:00
naruse 0e586b35b8 * re.c (rb_reg_initialize_m): fix wrong index for the lang
option's value 'N'. reported by Masaya TARUI via IRC.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27743 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-05-12 01:06:24 +00:00
naruse f65aac7a90 Add description about Regexp(str, opt, lang). [ruby-core:29893]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27738 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-05-11 13:06:54 +00:00
marcandre 1dee5e34a3 * error.c: RDoc for subclasses of Exception. [ruby-core:28394]
* cont.c: ditto

* enumerator.c: ditto

* io.c: ditto

* math.c: ditto

* numeric.c: ditto

* proc.c: ditto

* re.c: ditto

* thread.c: ditto

* transcode.c: ditto. Thanks to Run Paint for some of the documentation.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27671 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-05-08 04:50:09 +00:00
marcandre 478c3e080b * eval.c (make_exception, rb_obj_extend): Fix error messages in case of wrong
number of arguments

* file.c (rb_f_test, rb_file_s_umask): ditto

* numeric.c (int_chr, num_step): ditto

* process.c (rb_f_sleep): ditto

* re.c (rb_reg_initialize_m): ditto

* signal.c (rb_f_kill, sig_trap): ditto

* string.c (rb_str_aref_m, rb_str_aset_m, rb_str_count, rb_str_delete_bang,
  rb_str_slice_bang, rb_str_sub_bang, str_gsub): ditto

* proc.c (curry): rdoc fix

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27558 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-04-30 02:40:57 +00:00
drbrain 1325437297 * lib/rdoc: Import RDoc 2.5.2
* lib/rdoc/parser/ruby.rb (RDoc::Parser::Ruby): Don't parse rdoc
  files, reverts r24976 in favor of include directive support in C
  parser.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27283 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-04-10 06:36:13 +00:00
naruse 500c78c610 * re.c (make_regexp): use onig_new_with_source to keep
sourcefile and sourceline.

* re.c (onig_new_with_source): copied from onig_new in
  regcomp.c for keep sourcefile and sourceline.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27225 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-04-05 10:57:38 +00:00
naruse 10317605f0 * re.c (rb_reg_to_s): remove unused variable.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26854 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-09 05:56:39 +00:00
matz db37773e13 * include/ruby/oniguruma.h: updated to follow Oniguruma 5.9.2.
* re.c (make_regexp): use onig_new() instead of onig_alloc_init().

* re.c (rb_reg_to_s): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26791 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-01 21:54:59 +00:00
nobu 27e492bec7 * marshal.c (r_object0): register regexp object before encoding
name.  [ruby-dev:40414]

* re.c (rb_reg_alloc, rb_reg_init_str): split from rb_reg_new_str.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26661 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-02-13 19:45:35 +00:00
nobu be0197054c * re.c (match_aref): fixed indent.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26660 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-02-13 19:31:42 +00:00
nobu 4d786d21e3 * removed spaces just before tabs.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25930 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-11-26 05:25:08 +00:00
naruse 331fdbe822 * re.c (last_match): add "thread and method" to the scope.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25168 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-30 01:08:44 +00:00
marcandre b59109a844 * re.c (last_match): Added note to the doc that last_match is local to current scope [ruby-core:25833]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25165 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-29 16:38:00 +00:00
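The scoping rule described in b59109a844 and 331fdbe822 can be demonstrated with a small sketch. The method names here are hypothetical and exist only for this illustration.

```ruby
# Regexp.last_match (like $~) is local to the current method and thread.
def inner_match
  "hello" =~ /ll/
  Regexp.last_match   # MatchData for "ll", visible inside this method
end

def outer_scope
  inner_match
  Regexp.last_match   # nil: the match made in inner_match does not leak out
end

inner = inner_match
outer = outer_scope
```

The caller only sees a match if it performed one itself, so a helper that needs to hand back match data should return it explicitly, as `inner_match` does.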
naruse 63c7ca40d8 * doc/re.rb: New document for Ruby's fork of Oniguruma.
written by Run Paint Run Run [ruby-core:25420]

* re.c: import document in doc/re.rb.

* .document: add doc/re.rb.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24973 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-17 00:40:49 +00:00
naruse e13ca98198 * parse.y (rb_char_to_option_kcode): ASCII-8BIT should also delay.
* re.c (parser_regx_options): return rb_ascii8bit_encindex on
  ASCII-8BIT. [ruby-dev:39300]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24832 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-10 17:07:38 +00:00
nobu bbd9c406d6 * re.c (rb_reg_hash): must calculate hash.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24793 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-08 13:11:32 +00:00
nobu 31b7ae00c0 * include/ruby/st.h (st_hash_func): use st_index_t.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24792 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-08 13:10:04 +00:00
nobu 7b9024f740 * re.c (Init_Regexp): new methods. [ruby-core:24748]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24755 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-05 00:33:54 +00:00
nobu 605e7d4a60 * re.c (update_char_offset): position should be long.
* re.c (match_hash, match_equal): new methods.  [ruby-core:24748]

* re.c (reg_match_pos, rb_reg_eqq, rb_reg_s_quote): get rid of use
  VALUE as int.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24754 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-04 23:51:44 +00:00
nobu 7633eb4c51 * re.c (update_char_offset):
* re.c (rb_reg_equal):
* re.c (reg_match_pos):
* re.c (rb_reg_eqq):
* re.c (static VALUE):
* re.c (Init_Regexp):
[ruby-core:24748]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24753 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-04 23:49:18 +00:00
naruse 6ab36c6e19 * regparse.c (CC_DUP_WARN): use rb_compile_warn if ScanEnv has source
information. [ruby-dev:39105]

* re.c (rb_reg_compile): add sourcefile and sourceline to the arguments.

* re.c (make_regexp): ditto.

* re.c (rb_reg_initialize): ditto.

* re.c (rb_reg_initialize_str): ditto.

* re.c (rb_reg_compile): ditto.

* regcomp.c (onig_compile): ditto.

* regint.h (onig_compile): ditto.

* re.c (reg_compile_gen): follow above.

* re.c (rb_reg_to_s): ditto.

* re.c (make_regexp): ditto.

* re.c (rb_reg_initialize): ditto.

* re.c (rb_reg_initialize_str): ditto.

* re.c (rb_reg_new_str): ditto.

* re.c (rb_enc_reg_new): ditto.

* re.c (rb_reg_initialize_m): ditto.

* re.c (rb_reg_init_copy): ditto.

* regcomp.c (onig_new): ditto.

* regcomp.c (onig_compile): set sourcefile and sourceline to scan_env.

* regparse.h (ScanEnv): add sourcefile and sourceline.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24716 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-30 08:00:31 +00:00
naruse a20bd463a8 * re.c (rb_reg_preprocess_dregexp): set encoding as ASCII-8BIT
when /n is specified and the embedded string is escaped text.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24683 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-27 06:10:30 +00:00
naruse 4155811dc1 * re.c (rb_reg_preprocess_dregexp): change Exception class to
RegexpError.

* test/ruby/test_m17n.rb (test_regexp_usascii): follow above.

* test/ruby/test_m17n.rb (test_regexp_embed): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24539 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-14 09:05:44 +00:00
naruse 88c0b8fec9 Fix error message of /.../n with embedded non ASCII-8BIT string.
* re.c (rb_reg_preprocess_dregexp): add options to arguments.

* re.c (rb_reg_new_ary): follow above.

* re.c (rb_reg_preprocess_dregexp): change error message when
  /.../n has a non escaped non ASCII character in non ASCII-8BIT
  script. [ruby-dev:38524]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24398 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-05 01:38:36 +00:00
naruse acbb181219 use rb_enc_get to get the encoding of a Regexp object.
* re.c (reg_enc_error): use rb_enc_get to get the encoding of
  a Regexp object. REGEXP(re)->ptr->enc is the encoding of the
  regexp engine for patterns and target strings.
  [ruby-core:23208]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-07-18 15:00:50 +00:00
matz 5d7a215f6e * re.c (reg_match_pos): adjust offset based on characters, not
bytes.  [ruby-dev:38722]

* string.c (rb_str_offset): new function.

* string.c (rb_str_index_m): no call to rb_reg_adjust_startpos().

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23916 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-06-30 09:06:48 +00:00
nobu 23a32d6444 * include/ruby/oniguruma.h, include/ruby/re.h, re.c, regcomp.c,
regenc.c, regerror.c, regexec.c, regint.h, regparse.c: use long.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23907 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-06-30 02:08:54 +00:00
nobu 22cde7b682 * dir.c, dln.c, parse.y, re.c, ruby.c, sprintf.c, strftime.c,
string.c, util.c, variable.c: use strlcpy, memcpy and snprintf
  instead of strcpy, strncpy and sprintf.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@22984 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-03-17 01:29:17 +00:00
akr f414bd65ae * string.c (rb_str_subpat): accept capture name.
(rb_str_aref): follow above change.
  (rb_str_aref_m): pass the 2nd argument to rb_str_subpat.
  (rb_str_subpat_set): accept capture name.
  (rb_str_aset): follow above change.
  (rb_str_partition): ditto.
  (rb_str_aset_m): pass the 2nd argument to rb_str_subpat_set.

* include/ruby/intern.h (rb_reg_backref_number): declared.

* re.c (rb_reg_backref_number): defined.

  [ruby-core:21057]



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@22959 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-03-14 18:04:21 +00:00
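From the Ruby side, the capture-name support added to rb_str_subpat and rb_str_subpat_set in f414bd65ae looks roughly like the sketch below. The sample string and group names are assumptions chosen for illustration.

```ruby
text = "John Smith"
pattern = /(?<first>\w+) (?<last>\w+)/

# String#[] accepts a capture name as the second argument.
first_name = text[pattern, "first"]   # "John"
last_name  = text[pattern, "last"]    # "Smith"

# The assignment form accepts a capture name as well.
renamed = text.dup
renamed[pattern, "first"] = "Jane"    # renamed is now "Jane Smith"
```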
nobu 4de12b6ae9 * util.c (ruby_scan_oct, ruby_scan_hex): use size_t.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@22957 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-03-14 09:25:20 +00:00
nobu 12d2c8ba41 stripped trailing spaces.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@22552 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-02-22 14:23:33 +00:00
akr af7d8584c5 * re.c (Init_Regexp): define Regexp::FIXEDENCODING. [ruby-dev:38066]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@22506 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-02-22 06:12:21 +00:00
matz 1c400db1d5 * re.c (match_array): replace match_check().
* re.c (match_values_at): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21999 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-02-03 05:20:27 +00:00
akr ce17decdfb * re.c: use strlcpy for error messages.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20792 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-12-16 10:44:36 +00:00
matz 3060c7438d * re.c (reg_enc_error): raise EncodingCompatibilityError for
encoding incompatibility.  [ruby-core:18600]

* re.c (rb_reg_prepare_enc): more consistent error message.
  [ruby-core:18611]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20626 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-12-11 04:40:08 +00:00
naruse d433a70b5d * re.c (rb_reg_initialize): raise RegexpError when encoding
is dummy encoding. [ruby-dev:37091]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20603 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-12-10 02:29:05 +00:00
matz 4ccfa1e9f8 * re.c (rb_reg_desc): re might be NULL.
* regerror.c (onig_error_code_to_format): message updated.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20243 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-11-18 06:51:19 +00:00
nobu 316b78a56d * re.c (rb_reg_regsub): returns -1 unless ascii as well as
rb_enc_ascget().  [ruby-dev:37097]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20237 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-11-15 13:46:52 +00:00
matz 1d38a821ea * re.c (unescape_escaped_nonascii): back out the last change on
the function.  [ruby-dev:36818]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19884 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-22 04:27:32 +00:00
matz fcce99c52d * re.c (rb_reg_initialize_m): specify ARG_ENCODING_NONE instead of
ARG_ENCODING_FIXED for Regexp.new("", nil, "n").  [ruby-dev:36761]

* test/ruby/test_regexp.rb (TestRegexp#test_initialize): test
  updated.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19832 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-18 10:29:06 +00:00
matz 98e6f9a79c * re.c (rb_reg_initialize_m): changed the message to clarify the
third option argument is now ignored.  [ruby-dev:36753]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19813 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-16 22:39:04 +00:00
matz 1e8bbf3154 * .gdbinit (rp): REGEXP handling fixed.
* string.c (rb_str_rindex_m): need not to call rb_enc_check on
  regexp.

* re.c (unescape_escaped_nonascii): try ASCII-8BIT encoding for
  broken strings.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19812 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-16 22:21:42 +00:00
akr de7845773a rdoc update.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19757 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-10-11 09:56:07 +00:00
naruse 00cdba732f * re.c (rb_reg_desc): Regexps of ASCII Compatible encoding may
contain non-ASCII characters. So in that case its encoding
  must keep original encoding.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19433 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-19 23:07:22 +00:00
naruse 025bd642a7 * re.c (rb_reg_desc): Regexp#inspect should be US-ASCII.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19384 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-16 13:16:12 +00:00
akr 2e0a116dd5 * re.c (rb_reg_quote): use rb_enc_mbcput to generate ASCII
incompatible characters properly.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19369 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-15 16:01:08 +00:00
akr 19416601a0 * include/ruby/oniguruma.h (OnigEncodingTypeST): add end argument for
left_adjust_char_head.
  (ONIGENC_LEFT_ADJUST_CHAR_HEAD): add end argument.
  (onigenc_get_left_adjust_char_head): ditto.

* include/ruby/encoding.h (rb_enc_left_char_head): add end argument.

* regenc.h (onigenc_single_byte_left_adjust_char_head): ditto.

* regenc.c (onigenc_get_right_adjust_char_head): follow the interface
  change.
  (onigenc_get_right_adjust_char_head_with_prev): ditto.
  (onigenc_get_prev_char_head): ditto.
  (onigenc_step_back): ditto.
  (onigenc_get_left_adjust_char_head): ditto.
  (onigenc_single_byte_code_to_mbc): ditto.

* re.c: ditto.

* string.c: ditto.

* io.c: ditto.

* regexec.c: ditto.

* enc/euc_jp.c: ditto.

* enc/cp949.c: ditto.

* enc/shift_jis.c: ditto.

* enc/gbk.c: ditto.

* enc/big5.c: ditto.

* enc/euc_tw.c: ditto.

* enc/euc_kr.c: ditto.

* enc/emacs_mule.c: ditto.

* enc/gb18030.c: ditto.

* enc/utf_8.c: ditto.

* enc/utf_16le.c: ditto.

* enc/utf_16be.c: ditto.

* enc/utf_32le.c: ditto.

* enc/utf_32be.c: ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19334 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-13 19:23:52 +00:00
akr c965010582 * include/ruby/oniguruma.h (onigenc_get_right_adjust_char_head): add
end argument.

* include/ruby/encoding.h (rb_enc_right_char_head): add end argument.

* regenc.c (onigenc_get_right_adjust_char_head): use end argument.

* re.c (rb_reg_adjust_startpos): follow the interface change.

* string.c (rb_str_index): ditto.

* regexec.c (backward_search_range): ditto.
  (onig_search): ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19330 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-13 16:40:31 +00:00
akr 933eb07996 * vm.c (rb_mRubyVMFrozenCore): registered for GC.
* re.c (rb_reg_preprocess_dregexp): fix GC problem on MacOS X with
  powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc.
  build 5367).



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19241 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-08 09:14:59 +00:00
akr 93ad576b05 * re.c (rb_reg_inspect): don't raise for uninitialized Regexp.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18697 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-18 15:56:38 +00:00
shugo f433d710d0 * object.c (rb_obj_untrusted): new method Object#untrusted?.
(rb_obj_untrust): new method Object#untrust.
  (rb_obj_trust): new method Object#trust.
* array.c, debug.c, time.c, include/ruby/ruby.h, re.c, variable.c,
  string.c, io.c, dir.c, vm_method.c, struct.c, class.c, hash.c,
  ruby.c, marshal.c: fixes for Object#untrusted?.
* test/ruby/test_module.rb, test/ruby/test_array.rb,
  test/ruby/test_object.rb, test/ruby/test_string.rb,
  test/ruby/test_marshal.rb, test/ruby/test_hash.rb: added tests for
  Object#untrusted?.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18568 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-13 07:25:05 +00:00
nobu 0acca9a826 * compile.c (insn_data_to_s_detail), file.c (rb_stat_inspect),
iseq.c (ruby_iseq_disasm_insn, ruby_iseq_disasm),
  process.c (pst_message), re.c (match_inspect): use rb_str_catf.

* dir.c (dir_inspect), iseq.c (iseq_inspect, insn_operand_intern): use
  rb_sprintf.

* error.c (rb_name_error, rb_raise, rb_loaderror, rb_fatal): use
  rb_vsprintf.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18158 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-22 08:53:34 +00:00
akr fe80d63d68 * re.c (rb_reg_s_union): useless rb_enc_get call removed to prevent
SEGV by Regexp.union("", nil).


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18137 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-20 19:44:05 +00:00
akr 1a32af4e7a * re.c (unescape_nonascii): add has_property argument not to
raise error by /\p{Hiragana}\u{3042}/ in EUC-JP script.
  (rb_reg_preprocess): use has_property argument to make regexp
  encoding fixed.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17884 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-04 23:50:33 +00:00
akr 54c984a898 * re.c (unescape_nonascii): make regexp fixed_encoding if \p is used.
fixed [ruby-core:17279].


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17882 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-04 23:33:04 +00:00
akr 340cd503a7 * include/ruby/ruby.h (struct RRegexp): new field usecnt. replace
str and len by src.

* gc.c (gc_mark_children): mark src field of regexp.
  (obj_free): don't free str field.

* re.c (REG_BUSY): removed.
  (rb_reg_initialize): prohibit re-initialize regexp.
  (rb_reg_search): use usecnt to prevent freeing a regexp that is
  currently in use.  this prevents SEGV by:
    r = /\A((a.)*(a.)*)*b/
    r =~ "ab" + "\xc2\xa1".force_encoding("euc-jp")
    t = Thread.new { r =~ "ab"*8 + "\xc2\xa1".force_encoding("utf-8")}
    sleep 0.2
    r =~ "ab"*8 + "\xc2\xa1".force_encoding("euc-jp")



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17635 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-28 12:25:45 +00:00
ko1 72ba13aa8e * array.c, bignum.c, cont.c, dir.c, dln.c, encoding.c, enumerator.c,
enumerator.c (enumerator_allocate), eval_jump.c, file.c, hash.c,
  io.c, load.c, pack.c, proc.c, random.c, re.c, ruby.c, st.c,
  string.c, thread.c, thread_pthread.c, time.c, util.c, variable.c,
  vm.c, gc.c:
  memory objects allocated by xmalloc (ruby_xmalloc) should be
  freed by xfree (ruby_xfree).
* ext/curses/curses.c, ext/dbm/dbm.c, ext/digest/digest.c,
  ext/gdbm/gdbm.c, ext/json/ext/parser/parser.c,
  ext/json/ext/parser/unicode.c, ext/openssl/ossl_cipher.c,
  ext/openssl/ossl_hmac.c, ext/openssl/ossl_pkey_ec.c,
  ext/sdbm/init.c, ext/strscan/strscan.c, ext/zlib/zlib.c:
  ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17017 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-08 10:01:40 +00:00
nobu 0455e8ea9a * io.c (rb_f_open), re.c (rb_reg_search), transcode.c (str_transcode):
suppress warnings.

* util.c (quorem, rv_alloc, nrv_alloc): only used in dtoa().


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16873 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-07 12:38:03 +00:00
mame 7eb625425c * re.c: fix SEGV by Regexp.allocate.names, Match.allocate.names, etc.
* test/ruby/test_regexp.rb: add tests for above.

* io.c: fix SEGV by IO.allocate.print, etc.

* test/ruby/test_io.rb: add tests for above.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16757 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-02 12:45:42 +00:00
nobu 075530a685 * suppress warnings with -Wwrite-string.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16716 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-31 09:28:20 +00:00
matz 44cd8e457b * regparse.c (PINC): use optimized enclen() instead of
ONIGENC_MBC_ENC_LEN().

* regparse.c (PFETCH): ditto.

* regparse.c (PFETCH): small optimization.

* regexec.c (slow_search): single byte encoding optimization.

* regenc.h (enclen): avoid calling function when encoding's
  min_len == max_len.

* re.c (rb_reg_regsub): rb_enc_ascget() optimization for single
  byte encoding.

* re.c (rb_reg_search): avoid allocating new re_registers if we
  already have MatchData.

* re.c (match_init_copy): avoid unnecessary onig_region_free()
  before onig_region_copy. 

* encoding.c (rb_enc_get_index): remove implicit enc_capable check
  each time.

* encoding.c (rb_enc_set_index): ditto.

* encoding.c (enc_compatible_p): small refactoring.

* include/ruby/encoding.h (rb_enc_dummy_p): inline
  rb_enc_dummy_p() and export related code.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16477 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-19 08:25:03 +00:00
matz c39e8c6e85 * array.c (rb_ary_sort_bang): stop memory leak. [ruby-dev:34726]
* re.c (rb_reg_search): need to free allocated buffer in re_register.

* regexec.c (onig_region_new): more pedantic malloc check.

* regexec.c (onig_region_resize): ditto.

* regexec.c (STATE_CHECK_BUFF_INIT): ditto.

* regexec.c (onig_region_copy): use onig_region_resize.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16437 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-16 18:27:01 +00:00
matz 880a96c795 * re.c (rb_reg_prepare_enc): error condition was updated for non
ASCII compatible strings.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16423 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-15 10:45:58 +00:00
matz ab24f2b077 * re.c (rb_reg_prepare_re): made non static with small refactoring.
* ext/strscan/strscan.c (strscan_do_scan): should adjust encoding
  before regex searching.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16387 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-12 06:09:53 +00:00
matz f34a75657d * re.c (Init_Regexp): remove MatchData#select. [ruby-dev:34563]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16264 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-02 04:57:19 +00:00
nobu cc88283bad * re.c (rb_reg_search): use local variable. a patch from wanabe
<s.wanabe AT gmail.com> in [ruby-dev:34537].  [ruby-dev:34492]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16239 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-30 08:47:23 +00:00
nobu 89c408704b * enumerator.c (enumerator_each, enumerator_with_index): suppress
warnings.

* pack.c (pack_unpack): ditto.

* process.c (rb_syswait): ditto.

* re.c (rb_reg_prepare_enc, rb_reg_prepare_re,
  rb_reg_adjust_startpos): ditto.

* regparse.c (onig_name_to_group_numbers): ditto.

* missing/vsnprintf.c (BSD_vfprintf): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16156 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-22 13:49:43 +00:00
matz fee4ed204f * re.c (rb_reg_search): make search reentrant. [ruby-dev:34223]
* test/ruby/test_parse.rb (TestParse::test_global_variable):
  should preserve $& variable.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-14 15:47:51 +00:00
matz 1dcbd6921e * re.c (rb_reg_quote): should always copy the quoting string.
[ruby-core:16235]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15925 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-08 01:53:35 +00:00
naruse 3467a1754c * re.c (rb_memsearch_qs): wrong boundary condition.
* re.c (rb_memsearch_qs_utf8): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15903 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-04 14:26:19 +00:00
matz 2b8af7d624 * re.c (rb_memsearch_qs): wrong boundary condition. a patch from
wanabe <s.wanabe AT gmail.com> in [ruby-dev:34248].

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15902 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-04-04 05:13:06 +00:00
naruse e58adeae0f * re.c (rb_memsearch_ss): simple shift search.
* re.c (rb_memsearch_qs): quick search.

* re.c (rb_memsearch_qs_utf8): quick search for UTF-8 string.

* re.c (rb_memsearch_qs_utf8_hash): hash functions for above.

* re.c (rb_memsearch): use above functions.

* string.c (rb_str_index): give enc to rb_memsearch.

* include/ruby/intern.h (rb_memsearch): move to encoding.h.

* include/ruby/encoding.h (rb_memsearch): move from intern.h.

* common.mk (PREP): add dependency.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15792 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-17 19:04:29 +00:00
akr 861219ce4a fix doc.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15734 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-09 01:04:46 +00:00
matz 39787ea14d * numeric.c (fix_to_s): avoid rb_scan_args() when no argument
given. 
* bignum.c (rb_big_to_s): ditto.
* enum.c (enum_first): ditto.
* eval_jump.c (rb_f_catch): ditto.
* io.c (rb_obj_display): ditto.
* class.c (rb_obj_singleton_methods): ditto.
* object.c (rb_class_initialize): ditto.
* random.c (rb_f_srand): ditto.
* range.c (range_step): ditto.
* re.c (rb_reg_s_last_match): ditto.
* string.c (rb_str_to_i): ditto.
* string.c (rb_str_each_line): ditto.
* string.c (rb_str_chomp_bang): ditto.
* string.c (rb_str_sum): ditto.

* string.c (str_modifiable): declare inline.
* string.c (str_independent): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15691 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-05 05:22:17 +00:00
matz bbc2f80a32 * re.c (rb_reg_regsub): remove too strict encoding check.
[ruby-dev:33966]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15673 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-03 08:22:18 +00:00
matz daa622aed0 * time.c (time_strftime): format should be ascii compatible.
* parse.y (rb_intern3): non ASCII compatible symbols.

* re.c (rb_reg_regsub): add encoding check.

* string.c (rb_str_chomp_bang): ditto.

* test/ruby/test_utf16.rb (TestUTF16::test_chomp): raises exception.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15640 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-29 09:19:15 +00:00
akr d77ddf33ae add tests for sub/gsub with hash.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15535 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-18 03:51:34 +00:00
akr 1783b7aacc typo fix.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15534 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-18 03:43:11 +00:00
akr a74c11cd4a * re.c (re_warn): defined to restore warnings for /[a-c-e]/, etc.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15532 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-18 02:52:10 +00:00
akr 583a4b1774 * re.c (rb_reg_regsub): don't repeat repl twice with
"X".sub!(/./, sprintf("\\%c", 255)).


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15527 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 15:35:09 +00:00
akr b8fd2fabbe * re.c (rb_reg_prepare_re): add enable_warning parameter.
(rb_reg_adjust_startpos): disable warning by rb_reg_prepare_re.
  (rb_reg_search): follow rb_reg_prepare_re parameter change.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15524 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 12:54:17 +00:00
akr 0f4199fb56 * re.c (rb_reg_quote): return US-ASCII string consistently.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15515 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 02:00:05 +00:00
akr 71c5e48598 * include/ruby/re.h (struct rmatch_offset): new struct for character
offsets.
  (struct rmatch): new struct.
  (struct RMatch): reference struct rmatch.
  (RMATCH_REGS): new macro.

* re.c (match_alloc): initialize struct rmatch.
  (pair_byte_cmp): new function.
  (update_char_offset): update character offsets.
  (match_init_copy): copy regexp and character offsets.
  (match_sublen): removed.
  (match_offset): use update_char_offset.
  (match_begin): ditto.
  (match_end): ditto.
  (rb_reg_search): make character offset updated flag false.
  (match_size): use RMATCH_REGS.
  (match_backref_number): ditto.
  (rb_reg_nth_defined): ditto.
  (rb_reg_nth_match): ditto.
  (rb_reg_match_pre): ditto.
  (rb_reg_match_post): ditto.
  (rb_reg_match_last): ditto.
  (match_array): ditto.
  (match_aref): ditto.
  (match_values_at): ditto.
  (match_inspect): ditto.

* string.c (rb_str_subpat_set): use RMATCH_REGS.
  (rb_str_sub_bang): ditto.
  (str_gsub): ditto.
  (rb_str_split_m): ditto.
  (scan_once): ditto.

* gc.c (obj_free): free character offsets.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15513 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 20:08:35 +00:00
akr 60fa63b819 * re.c (match_inspect): avoid SEGV with MatchData.allocate.inspect.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15509 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 11:13:47 +00:00
nobu 17fb1248af * re.c (rb_reg_quote): set US-ASCII for ASCII-only string.
[ruby-dev:33785]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15481 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-15 01:35:56 +00:00
akr ec4756f633 * re.c (rb_reg_preprocess_dregexp): use non-preprocessed regexp source
for result.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15465 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-14 03:34:12 +00:00
akr d5c8ad5359 * insns.def (toregexp): generate a regexp from strings instead of one
string.

* re.c (rb_reg_new_ary): defined for toregexp.  it concatenates
  strings after each string is preprocessed. 

* compile.c (compile_dstr_fragments): split from compile_dstr.
  (compile_dstr): call compile_dstr_fragments.
  (compile_dregx): defined for dynamic regexp.
  (iseq_compile_each): use compile_dregx for dynamic regexp.

  [ruby-dev:33400]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15311 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-29 08:03:51 +00:00
naruse 3c6969ec11 * string.c, parse.y, re.c: use rb_ascii8bit_encoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15292 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-28 09:03:09 +00:00
akr fc208c1bd5 * include/ruby/oniguruma.h: precise mbclen API redesigned to avoid
inline functions.
  (onigenc_mbclen_charfound): removed.
  (onigenc_mbclen_needmore): removed.
  (onigenc_mbclen_recover): removed.
  (ONIGENC_MBCLEN_CHARFOUND): removed.
  (ONIGENC_MBCLEN_CHARFOUND_P): defined.
  (ONIGENC_MBCLEN_CHARFOUND_LEN): defined.
  (ONIGENC_MBCLEN_INVALID): removed.
  (ONIGENC_MBCLEN_INVALID_P): defined.
  (ONIGENC_MBCLEN_NEEDMORE): removed.
  (ONIGENC_MBCLEN_NEEDMORE_P): defined.
  (ONIGENC_MBCLEN_NEEDMORE_LEN): defined.
  (ONIGENC_MBC_ENC_LEN): use onigenc_mbclen_approximate.

* regenc.c (onigenc_mbclen_approximate): defined.

* include/ruby/encoding.h (MBCLEN_CHARFOUND): removed.
  (MBCLEN_INVALID): removed.
  (MBCLEN_NEEDMORE): removed.
  (MBCLEN_CHARFOUND_P): defined.
  (MBCLEN_INVALID_P): defined.
  (MBCLEN_NEEDMORE_P): defined.
  (MBCLEN_CHARFOUND_LEN): defined.
  (MBCLEN_NEEDMORE_LEN): defined.

* encoding.c: use new API.

* re.c: ditto.

* string.c: ditto.

* parse.y: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15280 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-27 14:27:07 +00:00
naruse f3fe101d55 * re.c (rb_reg_source): set encoding as regexp encoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15265 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-27 07:26:51 +00:00
akr b9c18bdcdd * re.c (rb_reg_preprocess): force fixed encoding when the
source string is ASCII-incompatible.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15260 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-26 21:01:52 +00:00
akr 1e41069754 * include/ruby/intern.h (rb_str_buf_cat_ascii): declared.
* string.c (rb_str_buf_cat_ascii): defined.

* re.c (rb_reg_s_union): use rb_str_buf_cat_ascii to support ASCII
  incompatible encoding.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15232 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-25 07:35:27 +00:00
usa b1257d4d20 * re.c (rb_reg_fixed_encoding_p): no need to treat ASCII-8BIT specially.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15213 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-24 09:15:03 +00:00
usa fbe52683e6 * re.c (rb_reg_initialize): 7bit clean regexp should be US-ASCII.
[ruby-dev:33346]



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15212 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-24 07:56:12 +00:00
akr 3766eac339 * re.c (rb_reg_prepare_re): fix SEGV by
/a/ =~ "aa".force_encoding("utf-16be").


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15178 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-23 04:40:43 +00:00
usa 61fd7dbf6d * re.c (rb_char_to_option_kcode): Regexp switch `s' should mean
Windows-31J, as well as `-Ks'.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15101 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-18 00:44:15 +00:00
nobu a0029e3adc * re.c (rb_char_to_option_kcode): fixed typo.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15085 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-17 12:48:23 +00:00
matz d9ff499bf3 * re.c (rb_char_to_option_kcode): use rb_enc_find_index() instead
of using fixed index value.

* enc/Makefile.in (encsrcdir): make US-ASCII built-in.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15047 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-14 13:49:29 +00:00
akr a31e2da12c * re.c (rb_reg_prepare_re): initialize error message buffer.
(rb_reg_search): ditto.
  (rb_reg_check_preprocess): ditto.
  (rb_reg_new_str): ditto.
  (rb_enc_reg_new): ditto.
  (rb_reg_compile): ditto.
  (rb_reg_initialize_m): ditto.
  (rb_reg_s_union_m): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15034 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-14 04:51:10 +00:00
akr 238c59842c * re.c (rb_reg_preprocess): fix fixed_enc condition.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14924 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-07 04:55:26 +00:00
akr 063beac343 * encoding.c (rb_enc_internal_get_index): extracted from
rb_enc_get_index.
  (rb_enc_internal_set_index): extracted from rb_enc_associate_index

* include/ruby/encoding.h (ENCODING_SET): work over ENCODING_INLINE_MAX.
  (ENCODING_GET): ditto.
  (ENCODING_IS_ASCII8BIT): defined.
  (ENCODING_CODERANGE_SET): defined.

* re.c (rb_reg_fixed_encoding_p): use ENCODING_IS_ASCII8BIT.

* string.c (rb_enc_str_buf_cat): use ENCODING_IS_ASCII8BIT.

* parse.y (reg_fragment_setenc_gen): use ENCODING_IS_ASCII8BIT.

* marshal.c (has_ivars): use ENCODING_IS_ASCII8BIT.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14922 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-07 02:49:01 +00:00
akr f38cc001a7 * re.c (rb_reg_initialize_str): forbid raw non ASCII character
for ASCII-8BIT regexp in non ASCII-8BIT script.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14911 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-06 12:15:48 +00:00
akr 8987b97ca9 * include/ruby/encoding.h (rb_enc_str_buf_cat): declared.
* string.c (coderange_scan): extracted from rb_enc_str_coderange.
  (rb_enc_str_coderange): use coderange_scan.
  (rb_str_shared_replace): copy encoding and coderange.
  (rb_enc_str_buf_cat): new function for linear complexity string
  accumulation with encoding.
  (rb_str_sub_bang): don't conflict substituted part and replacement.
  (str_gsub): use rb_enc_str_buf_cat.
  (rb_str_clear): clear coderange.

* re.c (rb_reg_regsub): use rb_enc_str_buf_cat.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14910 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-06 09:25:09 +00:00
akr da42c102c1 * re.c (rb_reg_initialize_str): /\x80/n is not an error even if script
encoding is EUC-JP.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14899 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-05 16:39:38 +00:00
nobu 8638ee26e7 * include/ruby/intern.h, re.c (rb_reg_new): keep interface same as
1.8.  [ruby-core:14583]

* include/ruby/intern.h, re.c (rb_reg_new_str): renamed, and defines
  HAVE_RB_REG_NEW_STR macro to tell if it is available.

* include/ruby/encoding.h (rb_enc_reg_new): added.

* insns.def (toregexp), marshal.c (r_object0): use rb_reg_new_str().

* re.c (rb_reg_regcomp, rb_reg_s_union): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14884 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-04 16:30:33 +00:00
akr f780cdec75 * re.c (rb_reg_prepare_re): check string encoding. Oniguruma doesn't
support invalid encoding.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14880 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-04 05:01:58 +00:00
akr 7d98c90ef2 unused variable removed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14879 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-04 03:13:53 +00:00
matz 22e7258275 * re.c (rb_reg_search): avoid inner loop for reverse search.
* regexec.c: unset USE_MATCH_RANGE_MUST_BE_INSIDE_OF_SPECIFIED_RANGE
  which is turned on since oniguruma 5.9.1.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14878 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-04 01:24:12 +00:00
akr 52f9c1d2e1 * re.c (rb_reg_search): iterate onig_match for reverse mode.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14876 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-03 17:48:06 +00:00
akr e21907e0f8 fix typos.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14810 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-31 05:52:59 +00:00
nobu 5ee7f4b0b5 * re.c (rb_reg_regsub): returns the given string itself if nothing
changed.

* string.c (rb_str_sub_bang): keeps code-range as possible.

* string.c (str_gsub): adjusts code-range.  [ruby-core:14566]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14782 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-29 13:44:32 +00:00
akr fd640aec82 * re.c (rb_reg_s_union): show encodings in error message.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14734 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-27 07:38:23 +00:00
akr b910bb7761 * re.c (rb_reg_prepare_re): show regexp encoding in the error message.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14597 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-24 09:38:20 +00:00
akr 5b809a28f8 * include/ruby/encoding.h, encoding.c, re.c, io.c, parse.y, numeric.c,
ruby.c, transcode.c: rename rb_ascii_encoding to
  rb_ascii8bit_encoding.  rb_ascii_encoding is ambiguous between
  ASCII-8BIT and US-ASCII.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14504 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-22 23:47:18 +00:00
akr fa3d06c738 refine error message.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14475 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-22 07:14:07 +00:00
matz d7cc14d436 * encoding.c (rb_ascii_encoding): renamed from previous
rb_default_encoding().

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14443 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 18:55:30 +00:00
matz b36c642a85 * re.c (rb_reg_prepare_re): stop ENCODING_NONE warning if the
encoding of the str is ASCII-8BIT.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14442 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 18:21:41 +00:00
akr b82a05989e * re.c (ARG_ENCODING_NONE): defined for /.../n option.
(REG_ENCODING_NONE): ditto.
  (rb_char_to_option_kcode): return ARG_ENCODING_NONE for n.
  (rb_reg_prepare_re): warn /ascii/n =~ "non-ascii".
  (rb_reg_initialize): set REG_ENCODING_NONE from ARG_ENCODING_NONE.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14438 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 16:39:36 +00:00
akr e667720bd4 * re.c (append_utf8): use rb_utf8_encoding() instead of
rb_enc_find("utf-8").


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14412 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 07:07:21 +00:00
matz 668bd7d992 * test/ruby/test_system.rb (TestSystem::valid_syntax): apply
ASCII-8BIT encoding explicitly.

* re.c (rb_reg_prepare_re): add encoding name in the message.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14402 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 05:03:14 +00:00
akr 59dca19910 * re.c: change "character encodings differ" error messages.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14401 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-21 04:54:54 +00:00
matz 77629d2cbe * string.c (rb_str_rindex_m): too much adjustment.
* re.c (reg_match_pos): pos adjustment should be based on
  characters.

* test/ruby/test_m17n.rb (TestM17N::test_str_insert): test updated
  to check negative offset behavior.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14340 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-19 17:02:29 +00:00
nobu 474a88f041 * re.c (rb_reg_regsub): should set checked encoding.
* string.c (rb_str_sub_bang): applied r14212 too.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14333 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-19 12:42:19 +00:00
akr 2d01290cfd * parse.y (arg tMATCH arg): call reg_named_capture_assign_gen if regexp
literal is used.
  (reg_named_capture_assign_gen): assign the result of named capture
  into local variables.
  [ruby-dev:32588]

* re.c: document the assignment by named captures.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14297 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-18 11:26:24 +00:00
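A minimal Ruby sketch of the named-capture assignment described in the commit above, assuming a Ruby version that includes this feature. The helper name `parse_date` and the pattern are illustrative and not taken from the commit. The assignment happens only when a regexp literal is the left operand of `=~`.

```ruby
# Hypothetical helper: with a regexp literal on the left of =~,
# each named group is assigned to a local variable of the same name.
def parse_date(str)
  if /(?<year>\d+)-(?<month>\d+)/ =~ str
    [year, month] # locals filled in from the named captures
  end
end
```

When the match fails, the locals are set to nil and the helper returns nil.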
matz ebfcc5d933 * re.c (rb_reg_initialize): raise error if non-Unicode fixed
encoding option is specified for regexp literals with \u{}
  escapes.

* string.c (rb_str_squeeze_bang): should squeeze multibyte
  characters as well.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14275 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 16:06:21 +00:00
matz d6a70c4bb7 * string.c (scan_once): need no encoding compatibility check.
it's done inside of rb_reg_search().

* string.c (rb_str_split_m): ditto.

* re.c (rb_reg_regsub): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14269 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 09:44:06 +00:00
matz 7bb3ea6afa * re.c (rb_reg_initialize): embedded string may override encoding
of the regular expression.

* re.c (rb_reg_initialize): fix encoding of regular expression if
  embedded string has its own encoding specified.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14218 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-13 16:09:53 +00:00
matz a648fc802b * encoding.c (rb_enc_compatible): encoding should never fall back
to ASCII-8BIT unless both encodings are ASCII-8BIT.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14217 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-13 13:44:02 +00:00
akr b92cee1ddb * re.c, regerror.c, string.c, parse.y, ruby.c, file.c:
use capital letter for \xHH notation.  [ruby-dev:32511]



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14202 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-12 14:30:54 +00:00
nobu ad72efa269 * re.c (rb_reg_regsub): should copy encoding.
* string.c (rb_str_sub_bang, str_gsub): should check and copy encoding
  to be replaced.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-12 03:11:44 +00:00
akr 646f27b822 * encoding.c (rb_enc_ascget): renamed from rb_enc_get_ascii.
* include/ruby/encoding.h: follow the renaming.

* re.c: ditto. 



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14195 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-11 07:39:16 +00:00
akr 5802768b40 * encoding.c (rb_enc_get_ascii): add an argument to provide the
length of the returned character.

* include/ruby/encoding.h (rb_enc_get_ascii): add the argument.

* re.c (rb_reg_expr_str): modify rb_enc_get_ascii call.
  (rb_reg_quote): ditto.
  (rb_reg_regsub): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14190 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-11 03:08:50 +00:00
matz f6a9c859be * re.c (rb_reg_match): should calculate offset by converted
operand.  [ruby-cvs:21416]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14180 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-10 10:03:48 +00:00
nobu 38a24d73c8 * re.c (rb_reg_search): return byte offset. [ruby-dev:32452]
* re.c (rb_reg_match, rb_reg_match2, rb_reg_match_m): convert byte
  offset to char index.

* string.c (rb_str_index): return byte offset.  [ruby-dev:32472]

* string.c (rb_str_split_m): calculate in byte offset.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14171 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-10 04:50:35 +00:00
akr f4592d7bb0 * re.c (rb_reg_expr_str): use \xHH instead of \OOO.
* regerror.c (to_ascii): ditto.
  (onig_snprintf_with_pattern): ditto.
  (onig_snprintf_with_pattern): ditto.

* string.c (rb_str_inspect): ditto.
  (rb_str_dump): ditto.

* parse.y (parser_yylex): ditto.

* ruby.c (proc_options): ditto.

* file.c (rb_f_test): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14164 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 21:48:05 +00:00
akr 08eb58d3dd * re.c (rb_reg_names): new method Regexp#names.
(rb_reg_named_captures): new method Regexp#named_captures
  (match_regexp): new method MatchData#regexp.
  (match_names): new method MatchData#names.

* lib/pp.rb (MatchData#pretty_print): show names of named captures.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14163 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 21:44:19 +00:00
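A short Ruby sketch of the four methods added in the commit above, as they behave in current Ruby releases. The constant `DATE_RE` and the helper `describe_match` are illustrative names, not part of the commit.

```ruby
# Illustrative pattern; the group names are assumptions for this example.
DATE_RE = /(?<year>\d{4})-(?<month>\d{2})/

# Hypothetical helper that collects the results of the new methods.
def describe_match(str)
  m = DATE_RE.match(str)
  return nil unless m
  {
    regexp_names: DATE_RE.names,             # group names, in order
    named_captures: DATE_RE.named_captures,  # name => list of group indexes
    match_names: m.names,                    # same names, via the MatchData
    same_regexp: m.regexp == DATE_RE         # MatchData#regexp returns the pattern
  }
end
```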
akr e56e8c758d * re.c (rb_reg_s_last_match): accept named capture's name.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14161 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 13:35:38 +00:00
akr 2d101f0a87 Regexp#fixed_encoding? documented.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14160 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 11:57:06 +00:00
akr 7a7c26be73 document named capture of MatchData#{offset,begin,end,inspect}.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14159 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 07:35:54 +00:00
akr 18d8fbac54 * re.c (match_backref_number): new function for converting a backref
name/number to an integer.
  (match_offset): use match_backref_number.
  (match_begin): ditto.
  (match_end): ditto.
  (name_to_backref_number): raise IndexError instead of RuntimeError.
  (match_inspect): show capture index.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14158 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 07:12:44 +00:00
akr 5a1c2b2677 * re.c (append_utf8): check unicode range.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 03:50:11 +00:00
akr b12bb50149 * re.c (rb_reg_check_preprocess): new function for validating regexp
fragment.

* parse.y (regexp): invoke reg_fragment_check.
  (reg_fragment_check): defined.
  (reg_fragment_check_gen): defined.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14133 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-08 07:21:05 +00:00
akr f1b7e60cb9 * encoding.c (rb_enc_mbclen): make it never fail.
(rb_enc_nth): don't check the return value of rb_enc_mbclen.
  (rb_enc_strlen): ditto.
  (rb_enc_precise_mbclen): return needmore(1) if e <= p.
  (rb_enc_get_ascii): new function for extracting ASCII character.

* include/ruby/encoding.h (rb_enc_get_ascii): declared.

* include/ruby/regex.h (ismbchar): removed.

* re.c (rb_reg_expr_str): use rb_enc_get_ascii.
  (unescape_escaped_nonascii): use rb_enc_precise_mbclen to determine
  the termination of escaped non-ASCII character.
  (unescape_nonascii): use rb_enc_precise_mbclen.
  (rb_reg_quote): use rb_enc_get_ascii.
  (rb_reg_regsub): use rb_enc_get_ascii.

* string.c (rb_str_reverse): don't check the return value of
  rb_enc_mbclen.
  (rb_str_split_m): don't call rb_enc_mbclen with e <= p.

* parse.y (is_identchar): use ISASCII.
  (parser_ismbchar): removed.
  (parser_precise_mbclen): new macro.
  (parser_isascii): new macro.
  (parser_tokadd_mbchar): use parser_precise_mbclen to check invalid
  character precisely.
  (parser_tokadd_string): use parser_isascii.
  (parser_yylex): ditto.
  (is_special_global_name): don't call is_identchar with e <= p.
  (rb_enc_symname_p): ditto.

  [ruby-dev:32455]

* ext/tk/sample/tkextlib/vu/canvSticker2.rb: remove coding cookie
  because the encoding is not UTF-8.  [ruby-dev:32475]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-08 02:50:43 +00:00
akr 6af5227ec0 fix Regexp#inspect document.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02 15:46:21 +00:00
akr 7f65110b53 document MatchData#inspect.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02 14:42:05 +00:00
akr c650096adf * re.c (unescape_escaped_nonascii): fix mbclen argument.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14084 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02 11:45:02 +00:00
akr 9bd11f24b3 s/unicode/Unicode/ in error messages.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14078 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02 02:53:46 +00:00
akr 7ff702406a * include/ruby/intern.h (rb_uv_to_utf8): declared.
* re.c (rb_reg_preprocess): new function for dynamic regexp with
  \u{} such as Regexp.new("\\u{6666}").
  (rb_reg_prepare_re): preprocess regexp for recompiling.
  (read_escaped_byte): new function.
  (unescape_escaped_nonascii): new function.
  (append_utf8): new function.
  (unescape_unicode_list): new function.
  (unescape_unicode_bmp): new function.
  (unescape_nonascii): new function.
  (rb_reg_initialize): preprocess regexp.

* pack.c (rb_uv_to_utf8): renamed from uv_to_utf8.

* parse.y (STR_NEW3): take func instead of has8 and hasmb.
  (parser_str_new): use default coderange mechanism except for regexp.
  (parser_tokadd_utf8): copy regexp source as-is.
  (parser_read_escape): UTF-8 stuff removed.
  (parser_tokadd_escape): has8bit and hasmb removed.
  (parser_tokadd_string): fix 8-bit single byte character with \u.
  (parser_parse_string): has8bit and hasmb removed.
  (parser_here_document): has8bit and hasmb removed.
  (parser_yylex): call parser_tokadd_utf8 instead of read_escape for
  UTF-8 character.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-01 16:56:19 +00:00
akr f5ee0fd521 * include/ruby/encoding.h, encoding.c, re.c, string.c, parse.y:
rename ENC_CODERANGE_SINGLE to ENC_CODERANGE_7BIT.
  rename ENC_CODERANGE_MULTI to ENC_CODERANGE_8BIT.
  This is because a single-byte 8-bit character, such as Shift_JIS 1-byte
  katakana, is represented by ENC_CODERANGE_MULTI even though it is not multibyte.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14027 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-27 02:21:17 +00:00
akr cbd72b86da * re.c (Init_Regexp): new method Regexp#fixed_encoding?
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-26 08:33:11 +00:00
akr 0dc9be63a5 * re.c (rb_reg_fixed_encoding_p): extracted from rb_reg_prepare_re and
rb_reg_s_union.
  (rb_reg_s_union): refactored.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-26 02:27:59 +00:00
akr b2e60b2ce7 * include/ruby/encoding.h (rb_enc_str_asciionly_p): declared.
(rb_enc_str_asciicompat_p): defined.

* re.c (rb_reg_initialize_str): use rb_enc_str_asciionly_p.
  (rb_reg_quote): return ascii-8bit string if the argument is
  ascii-only to generate encoding generic regexp if possible.
  (rb_reg_s_union): fix encoding handling.  [ruby-dev:32094]

* string.c (rb_enc_str_asciionly_p): defined.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14013 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-25 13:25:34 +00:00
akr 2109a52503 * re.c (REG_CASESTATE): unused macro removed.
(rb_reg_prepare_re): check encoding difference.
  (rb_reg_initialize): check 8bit byte.

* parse.y (parser_tokadd_escape): fix has8bit.

  [ruby-dev:32113]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-23 06:30:26 +00:00
matz d73f08d56d * re.c (match_begin): should return the offset in characters.
[ruby-dev:32331]

* re.c (match_end): ditto.

* re.c (rb_reg_search): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13999 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-23 02:10:44 +00:00
akr af9c868eae * re.c (rb_reg_quote): quote \v as well.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13818 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-04 15:03:31 +00:00
akr 794fc684e8 * re.c (rb_reg_initialize_m): use StringValuePtr instead of
StringValueCStr because the string can contain \0, as in Regexp.new("\0").


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13817 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-04 14:53:36 +00:00
nobu c7697aba34 * parse.y (parser_regx_options, reg_compile_gen): relaxed encoding
matching rule.

* re.c (rb_reg_initialize): always set encoding of Regexp.

* re.c (rb_reg_initialize_str): fix encoding for non-7bit-clean
  strings.

* re.c (rb_reg_initialize_m): use ascii encoding for 'n' option.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13743 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-19 07:41:03 +00:00
matz 05737c3500 * re.c (rb_reg_s_union): the last check was not complete.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13733 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-17 05:21:10 +00:00
nobu 2d1d6c4705 * encoding.c (rb_enc_from_encoding, rb_enc_register): associate index
to self.

* encoding.c (enc_capable): Encoding objects are encoding capable.

* re.c (rb_reg_s_union): check whether encodings match by comparing
  exact encoding objects.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13732 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-17 02:30:57 +00:00
nobu b06a606278 * re.c (rb_reg_desc): set encoding.
* re.c (rb_reg_s_union): check encodings.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13728 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 18:37:09 +00:00
nobu 81ed881511 * re.c (rb_reg_initialize_m): allow binary encoding option.
[ruby-dev:32083]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13725 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 16:57:08 +00:00
nobu 5d8ba5a43f * re.c (rb_reg_s_union): check for encoding of original object.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13723 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 10:48:02 +00:00
nobu 676dc908b6 * parse.y (parser_regx_options): check if regexp encoding option
matches the current encoding.

* re.c (char_to_option, rb_char_to_option_kcode): 'n' is not kcode
  option now.

* re.c (rb_reg_to_s, rb_reg_error_desc): copy encoding rather than
  append as an option.

* re.c (make_regexp, rb_reg_prepare_re): use encoding of Regexp and
  String instead of kcode.

* re.c (rb_reg_initialize): set fixed option if none is set.

* re.c (rb_reg_regcomp): ditto.

* re.c (rb_reg_equal): check if encodings are equal.

* re.c (rb_reg_initialize_m): encoding option is obsolete.

* re.c (rb_kcode, rb_get_kcode, rb_set_kcode): removed.

* re.c (Init_Regexp): removed Regexp#kcode method.

* ruby.c (proc_options): allow long encoding name.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13717 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 05:48:40 +00:00
matz 9f00119776 * re.c (rb_reg_s_union): encoding of all regexp objects should
match.  [ruby-dev:32076]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13716 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 05:06:30 +00:00