Commit Graph

496 Commits

Author SHA1 Message Date
naruse a532dcafe6 * string.c (rb_str_inspect): string of ascii incompatible encoding
should be escaped and returned as US-ASCII encoding.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15572 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-22 03:16:52 +00:00
naruse 7a9cf391cd * string.c (rb_str_substr): copy encoding although empty string.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15571 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-22 02:18:23 +00:00
naruse b62df564a6 * string.c (rb_str_times): empty string's coderange is CODERANGE_7BIT.
* string.c (rb_str_substr): ditto.

* encoding.c (rb_enc_compatible): empty string is compatible with not
  only nonasciicompatible strings. [ruby-dev:33895]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15566 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-21 19:54:48 +00:00
naruse 3ce61d2a63 * string.c: replace rb_enc_copy by rb_enc_cr_str_copy or
rb_enc_cr_str_exact_copy.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15560 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-21 02:42:51 +00:00
naruse 492f431a46 * string.c (rb_enc_str_copy): added for wrapper for rb_enc_copy.
this also copy coderange when ptr and len is equal.

* string.c (rb_enc_cr_str_copy): added for wrapper for rb_enc_copy.
  this always copy coderange.

* string.c (str_replace_shared): use rb_enc_str_copy.

* string.c (str_new3): don't rb_enc_copy because encoding is copied
  at str_replace_shared.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15553 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-20 10:20:43 +00:00
naruse f1c975b87a * string.c (rb_enc_strlen_cr): get length with coderange scan.
* string.c (str_strlen): use rb_enc_strlen_cr. [ruby-dev:33849]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15550 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-19 12:18:03 +00:00
akr 8efc7ea9ad * string.c (rb_str_each_line): fix newline size.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15539 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-18 06:18:31 +00:00
naruse 8dd8dfce21 * encoding.c (ENC_CODERANGE_AND): fix broken case. [ruby-dev:33826]
* string.c (rb_str_times): fix broken case. [ruby-dev:33826]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15525 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 13:01:52 +00:00
naruse 7a257b0110 * encoding.c (ENC_CODERANGE_AND): added.
* string.c (rb_str_plus, rb_str_times): keep coderange.

* parse.y (STR_NEW0) use rb_usascii_str_new.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15519 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 06:49:11 +00:00
akr a906fce838 * string.c (str_strlen): rb_enc_strlen doesn't fail.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15518 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 05:08:43 +00:00
akr bf2d82b280 * string.c (str_sublen): use rb_enc_strlen.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15517 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 04:04:14 +00:00
akr 35cb0f807b * string.c (rb_str_times): reduce loop overhead.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15514 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-17 00:18:16 +00:00
akr 71c5e48598 * include/ruby/re.h (struct rmatch_offset): new struct for character
offsets.
  (struct rmatch): new struct.
  (struct RMatch): reference struct rmatch.
  (RMATCH_REGS): new macro.

* re.c (match_alloc): initialize struct rmatch.
  (pair_byte_cmp): new function.
  (update_char_offset): update character offsets.
  (match_init_copy): copy regexp and character offsets.
  (match_sublen): removed.
  (match_offset): use update_char_offset.
  (match_begin): ditto.
  (match_end): ditto.
  (rb_reg_search): make character offset updated flag false.
  (match_size): use RMATCH_REGS.
  (match_backref_number): ditto.
  (rb_reg_nth_defined): ditto.
  (rb_reg_nth_match): ditto.
  (rb_reg_match_pre): ditto.
  (rb_reg_match_post): ditto.
  (rb_reg_match_last): ditto.
  (match_array): ditto.
  (match_aref): ditto.
  (match_values_at): ditto.
  (match_inspect): ditto.

* string.c (rb_str_subpat_set): use RMATCH_REGS.
  (rb_str_sub_bang): ditto.
  (str_gsub): ditto.
  (rb_str_split_m): ditto.
  (scan_once): ditto.

* gc.c (obj_free): free character offsets.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15513 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 20:08:35 +00:00
naruse 66583d9663 * string.c (rb_str_substr): optimized for UTF-8.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15511 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 11:53:04 +00:00
naruse bb831578c5 * string.c (str_strlen): revert r15507. [ruby-dev:33810]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15508 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 10:06:15 +00:00
naruse 0ad3d7ce2d * string.c (str_strlen): little more optimize.
(rb_enc_nth): remove needless variable 'c'.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15507 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 09:30:03 +00:00
akr 7eeba5f440 * encoding.c (rb_enc_compatible): empty strings are always compatible.
* string.c (rb_enc_cr_str_buf_cat): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15506 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 09:02:12 +00:00
akr a47e8e776c * string.c (rb_enc_strlen): UTF-8 character count moved to str_strlen.
(str_strlen): UTF-8 character count is only applicable for valid
  UTF-8 string.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15504 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 07:16:36 +00:00
akr 9b3ab49b5d * string.c (rb_str_sub_bang): stringize replacing hash values.
(str_gsub): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15500 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 04:17:52 +00:00
naruse 327673a43b * string.c (rb_enc_strlen): add search_nonascii like character
counter for UTF-8.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15499 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 04:05:58 +00:00
akr af75cc01bc * encoding.c (rb_enc_strlen): moved to string.c.
* string.c (rb_enc_strlen): use search_nonascii.
  (str_strlen): don't use search_nonascii.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15498 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-16 02:55:08 +00:00
naruse 132e3f54f2 * string.c (single_byte_optimizable): rb_enc_mbminlen must be 1
when rb_enc_mbmaxlen is 1.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15493 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-15 20:41:29 +00:00
akr 0831222a91 * encoding.c (rb_enc_nth): moved to string.c.
* string.c (rb_enc_nth): moved from encoding.c.  use search_nonascii
  for ASCII compatible string.
  (str_nth): wrong optimization removed to fix
  "a".force_encoding("EUC-JP").slice!(0,10) returns
  "a\x00\x00\x00\x00\x00\x00\x00\x00\x00"


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15492 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-15 18:48:27 +00:00
nobu a05337f14d * string.c (rb_str_sub_bang, str_gsub): allows hash for replacement.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15487 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-15 09:23:55 +00:00
matz 8b09f7015a * string.c (str_strlen): use search_nonascii() for performance.
* string.c (str_nth): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15486 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-15 08:14:40 +00:00
akr 12b1578cab * string.c (rb_str_getbyte): new method.
(rb_str_setbyte): new method.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15484 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-15 06:35:20 +00:00
matz 38694016bc * string.c (rb_str_hash_cmp): lighter version of rb_str_cmp() for
hash comparison function.

* hash.c (rb_any_cmp): use rb_str_hash_cmp().

* string.c (rb_str_casecmp): should return nil for incompatible
  comparison.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15441 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-12 03:17:43 +00:00
matz 0472db84c5 * range.c (range_include): specialize single character string
case (e.g. (?a ..?z).include(?x)) for performance.
  [ruby-core:15481]

* string.c (rb_str_upto): specialize single character case.

* string.c (rb_str_hash): omit coderange scan for performance.

* object.c (rb_check_to_integer): check Fixnum first.

* object.c (rb_to_integer): ditto.

* string.c (rb_str_equal): inline memcmp to avoid unnecessary
  rb_str_comparable(). 

* parse.y (rb_intern2): use US-ASCII encoding.

* parse.y (rb_intern_str): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15433 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-11 17:46:52 +00:00
akr 8f9fb1a820 * string.c (rb_str_new4): copy encoding from orig, instead of shared
one.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15410 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-09 01:04:29 +00:00
nobu 1809782c3e * string.c (rb_str_replace): makes frozen shared string before
sharing.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15398 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-07 10:11:40 +00:00
nobu fb506c3000 * string.c (rb_str_dup): reverted unneeded change. [ruby-dev:33634]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15397 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-07 09:57:06 +00:00
nobu 89941dffb5 * string.c (str_replace_shared): replaces string with sharing.
* string.c (rb_str_new4, rb_str_associate, rb_str_associated): allows
  associated strings shared.

* string.c (rb_str_dup, rb_str_substr, rb_str_replace): shares memory.
  [ruby-core:15400]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15395 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-07 07:33:50 +00:00
nobu 6c6ae98663 * string.c (rb_str_end_with): compares with the suffix.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15394 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-07 06:42:44 +00:00
akr 84fe384383 * string.c (rb_str_succ): use wrapped character as a carry for
ASCII incompatible encoding.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15339 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-30 05:29:37 +00:00
naruse 3c6969ec11 * string.c, parse.y, re.c: use rb_ascii8bit_encoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15292 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-28 09:03:09 +00:00
akr fc208c1bd5 * include/ruby/oniguruma.h: precise mbclen API redesigned to avoid
inline functions.
  (onigenc_mbclen_charfound): removed.
  (onigenc_mbclen_needmore): removed.
  (onigenc_mbclen_recover): removed.
  (ONIGENC_MBCLEN_CHARFOUND): removed.
  (ONIGENC_MBCLEN_CHARFOUND_P): defined.
  (ONIGENC_MBCLEN_CHARFOUND_LEN): defined.
  (ONIGENC_MBCLEN_INVALID): removed.
  (ONIGENC_MBCLEN_INVALID_P): defined.
  (ONIGENC_MBCLEN_NEEDMORE): removed.
  (ONIGENC_MBCLEN_NEEDMORE_P): defined.
  (ONIGENC_MBCLEN_NEEDMORE_LEN): defined.
  (ONIGENC_MBC_ENC_LEN): use onigenc_mbclen_approximate.

* regenc.c (onigenc_mbclen_approximate): defined.

* include/ruby/encoding.h (MBCLEN_CHARFOUND): removed.
  (MBCLEN_INVALID): removed.
  (MBCLEN_NEEDMORE): removed.
  (MBCLEN_CHARFOUND_P): defined.
  (MBCLEN_INVALID_P): defined.
  (MBCLEN_NEEDMORE_P): defined.
  (MBCLEN_CHARFOUND_LEN): defined.
  (MBCLEN_NEEDMORE_LEN): defined.

* encoding.c: use new API.

* re.c: ditto.

* string.c: ditto.

* parse.y: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15280 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-27 14:27:07 +00:00
akr b8b0f6fd46 * string.c (rb_str_inspect): avoid exception by
"\#\xa1".force_encoding("euc-jp").inspect.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15272 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-27 11:09:41 +00:00
akr 36b4d1a1dc * string.c (rb_str_succ): warning suppressed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-27 10:08:58 +00:00
akr b1e6c052cd * string.c (rb_str_succ): don't increment/decrement codepoint.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15268 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-27 08:21:24 +00:00
naruse df17bd4313 * string.c (rb_str_new): set US-ASCII and ENC_CODERANGE_7BIT when
empty string (len == 0).


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15247 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-26 00:18:50 +00:00
naruse 42dcda08ae * string.c (rb_str_usascii_new{,2}): defined.
(rb_str_new): set US-ASCII and ENC_CODERANGE_7BIT when empty
  string.

* encoding.c (rb_usascii_encoding, rb_usascii_encindex): defined.
  (rb_enc_inspect, enc_name, rb_locale_charmap, rb_enc_name_list_i):
  use rb_str_ascii_new.

* array.c (recursive_join, inspect_ary): ditto.

* object.c (nil_to_s, nil_inspect, true_to_s, false_to_s,
  rb_mod_to_s): ditto.

* hash.c (inspect_hash, rb_hash_inspect, rb_f_getenv, env_fetch,
  env_clear, env_to_s, env_inspect): ditto.

* numeric.c (flo_to_s, int_chr, rb_fix2str): ditto.

* bignum.c (rb_big2str): ditto.

* file.c (rb_file_ftype, rb_file_s_dirname, rb_file_s_extname,
  file_inspect_join, Init_file): ditto.

* test/ruby/test_ruby_m17n.rb: add checks for encoding of string.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15244 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-25 16:40:02 +00:00
akr 3a783ba707 * string.c (rb_str_buf_cat_ascii): use rb_enc_cr_str_buf_cat.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15237 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-25 12:50:12 +00:00
akr 1e41069754 * include/ruby/intern.h (rb_str_buf_cat_ascii): declared.
* string.c (rb_str_buf_cat_ascii): defined.

* re.c (rb_reg_s_union): use rb_str_buf_cat_ascii to support ASCII
  incompatible encoding.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15232 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-25 07:35:27 +00:00
akr 968e404220 * string.c (rb_enc_cr_str_buf_cat): ASCII incompatible encoding is
not compatible with any other encoding.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15202 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-24 02:17:26 +00:00
matz 9580a9ca91 * string.c (rb_str_each_line): use memchr(3) for faster newline
search.

* io.c (appendline): remove unused arguments

* io.c (rb_io_getline_fast): make much simpler (and faster).

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15199 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-23 18:43:51 +00:00
nobu e94ece76d8 * string.c (str_make_independent): should set length.
* string.c (rb_str_associate): hide associated array from ObjectSpace.

* string.c (rb_str_associated): return associated array with freezing
  instead of false.  [ruby-dev:33282]

* string.c (rb_str_freeze): freeze associated array together.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15180 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-23 06:04:13 +00:00
nobu 0c8106ded6 * string.c (str_mod_check, str_nth, str_offset): constified.
* string.c (rb_str_dump): dump in ASCII-8BIT always.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15177 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-23 02:53:28 +00:00
matz 65a8185eb2 * configure.in (MINIRUBY): remove -I$(EXTOUT)/$(arch) from
MINIRUBY since miniruby might not be able to load DLL.

* test/ruby/test_m17n.rb: move tests from bootstrap test.

* encoding.c (enc_find): should check name if ASCII compatible.

* string.c (rb_str_end_with): should check character boundary.

* encoding.c (rb_enc_compatible): encoding must be ASCII
  compatible before checking ENC_CODERANGE_7BIT.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15167 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-22 03:59:53 +00:00
nobu 157664b9f3 * string.c (rb_str_each_char): iterates over a shadow.
[ruby-dev:33243]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15165 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-22 00:26:49 +00:00
matz 56be84e293 * parse.y (rb_intern3): do not call rb_enc_mbclen() if *m is
ASCII.  [ruby-talk:287225]

* string.c (rb_str_each_line): use rb_enc_is_newline() to gain
  performance if the record separator ($/) is not modified.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15163 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-21 19:47:26 +00:00
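
Several entries in this log describe changes that are visible from Ruby code: Hash replacements in sub/gsub (r15487, r15500), getbyte/setbyte (r15484), the slice! fix for EUC-JP strings (r15492), and casecmp returning nil for incompatible strings (r15441). The following is a minimal sketch of those behaviors, assuming a current Ruby rather than the 2008 trunk these commits were made against. The variable names are illustrative and do not come from the commits.

```ruby
# Sketch only: exercises a few String behaviors named in the log above.
# Assumes a current Ruby; variable names are illustrative.

# r15487 / r15500: gsub accepts a Hash as the replacement, and
# non-String hash values are converted to strings.
replaced = "hello".gsub(/[eo]/, { "e" => 3, "o" => "*" })

# r15484: getbyte / setbyte read and write a single byte.
buf = String.new("abc")
first_byte = buf.getbyte(0)
buf.setbyte(0, 65) # 65 is "A" in ASCII

# r15492: slicing past the end of a short EUC-JP string must not
# pad the result with NUL bytes.
sliced = String.new("a").force_encoding("EUC-JP").slice!(0, 10)

# r15441: casecmp returns nil when the two strings are not
# encoding-compatible (non-ASCII ISO-8859-1 vs. non-ASCII UTF-8).
latin1 = "\u{e4 f6 fc}".encode("ISO-8859-1")
incompatible = latin1.casecmp("\u{c4 d6 dc}")

p replaced, first_byte, buf, sliced, incompatible
```

String.new is used for the strings that are mutated so the sketch does not depend on whether string literals are frozen.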