github/ruby - ruby

Commit Graph

Author	SHA1	Message	Date
akr	238c59842c	* re.c (rb_reg_preprocess): fix fixed_enc condition. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14924 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2008-01-07 04:55:26 +00:00
akr	063beac343	* encoding.c (rb_enc_internal_get_index): extracted from rb_enc_get_index. (rb_enc_internal_set_index): extracted from rb_enc_associate_index * include/ruby/encoding.h (ENCODING_SET): work over ENCODING_INLINE_MAX. (ENCODING_GET): ditto. (ENCODING_IS_ASCII8BIT): defined. (ENCODING_CODERANGE_SET): defined. * re.c (rb_reg_fixed_encoding_p): use ENCODING_IS_ASCII8BIT. * string.c (rb_enc_str_buf_cat): use ENCODING_IS_ASCII8BIT. * parse.y (reg_fragment_setenc_gen): use ENCODING_IS_ASCII8BIT. * marshal.c (has_ivars): use ENCODING_IS_ASCII8BIT. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14922 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2008-01-07 02:49:01 +00:00
akr	f38cc001a7	* re.c (rb_reg_initialize_str): forbid raw non ASCII character for ASCII-8BIT regexp in non ASCII-8BIT script. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14911 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2008-01-06 12:15:48 +00:00
akr	8987b97ca9	* include/ruby/encoding.h (rb_enc_str_buf_cat): declared. * string.c (coderange_scan): extracted from rb_enc_str_coderange. (rb_enc_str_coderange): use coderange_scan. (rb_str_shared_replace): copy encoding and coderange. (rb_enc_str_buf_cat): new function for linear complexity string accumulation with encoding. (rb_str_sub_bang): don't conflict substituted part and replacement. (str_gsub): use rb_enc_str_buf_cat. (rb_str_clear): clear coderange. * re.c (rb_reg_regsub): use rb_enc_str_buf_cat. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14910 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2008-01-06 09:25:09 +00:00
akr	da42c102c1	* re.c (rb_reg_initialize_str): /\x80/n is not an error even if script encoding is EUC-JP. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14899 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2008-01-05 16:39:38 +00:00
nobu	8638ee26e7	* include/ruby/intern.h, re.c (rb_reg_new): keep interface same as 1.8. [ruby-core:14583] * include/ruby/intern.h, re.c (rb_reg_new_str): renamed, and defines HAVE_RB_REG_NEW_STR macro to tell if it is available. * include/ruby/encoding.h (rb_enc_reg_new): added. * insns.def (toregexp), marshal.c (r_object0): use rb_reg_new_str(). * re.c (rb_reg_regcomp, rb_reg_s_union): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14884 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2008-01-04 16:30:33 +00:00
akr	f780cdec75	* re.c (rb_reg_prepare_re): check string encoding. Oniguruma doesn't support invalid encoding. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14880 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2008-01-04 05:01:58 +00:00
akr	7d98c90ef2	unused variable removed. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14879 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2008-01-04 03:13:53 +00:00
matz	22e7258275	* re.c (rb_reg_search): avoid inner loop for reverse search. * regexec.c: unset USE_MATCH_RANGE_MUST_BE_INSIDE_OF_SPECIFIED_RANGE which is turned on since oniguruma 5.9.1. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14878 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2008-01-04 01:24:12 +00:00
akr	52f9c1d2e1	* re.c (rb_reg_search): iterate onig_match for reverse mode. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14876 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2008-01-03 17:48:06 +00:00
akr	e21907e0f8	fix typos. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14810 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-31 05:52:59 +00:00
nobu	5ee7f4b0b5	* re.c (rb_reg_regsub): returns the given string itself if nothing changed. * string.c (rb_str_sub_bang): keeps code-range as possible. * string.c (str_gsub): adjusts code-range. [ruby-core:14566] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14782 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-29 13:44:32 +00:00
akr	fd640aec82	* re.c (rb_reg_s_union): show encodings in error message. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14734 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-27 07:38:23 +00:00
akr	b910bb7761	* re.c (rb_reg_prepare_re): show regexp encoding in the error message. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14597 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-24 09:38:20 +00:00
akr	5b809a28f8	* include/ruby/encoding.h, encoding.c, re.c, io.c, parse.y, numeric.c, ruby.c, transcode.c: rename rb_ascii_encoding. to rb_ascii8bit_encoding. rb_ascii_encoding is ambiguous with ASCII-8BIT and US-ASCII. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14504 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-22 23:47:18 +00:00
akr	fa3d06c738	refine error message. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14475 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-22 07:14:07 +00:00
matz	d7cc14d436	* encoding.c (rb_ascii_encoding): renamed from previous rb_default_encoding(). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14443 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-21 18:55:30 +00:00
matz	b36c642a85	* re.c (rb_reg_prepare_re): stop ENCODING_NONE warning if the encoding of the str is ASCII-8BIT. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14442 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-21 18:21:41 +00:00
akr	b82a05989e	* re.c (ARG_ENCODING_NONE): defined for /.../n option. (REG_ENCODING_NONE): ditto. (rb_char_to_option_kcode): return ARG_ENCODING_NONE for n. (rb_reg_prepare_re): warn /ascii/n =~ "non-ascii". (rb_reg_initialize): set REG_ENCODING_NONE from ARG_ENCODING_NONE. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14438 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-21 16:39:36 +00:00
akr	e667720bd4	* re.c (append_utf8): use rb_utf8_encoding() instead of rb_enc_find("utf-8"). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14412 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-21 07:07:21 +00:00
matz	668bd7d992	* test/ruby/test_system.rb (TestSystem::valid_syntax): apply ASCII-8BIT encoding explicitly. * re.c (rb_reg_prepare_re): add encoding name in the message. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14402 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-21 05:03:14 +00:00
akr	59dca19910	* re.c: change "character encodings differ" error messages. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14401 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-21 04:54:54 +00:00
matz	77629d2cbe	* string.c (rb_str_rindex_m): too much adjustment. * re.c (reg_match_pos): pos adjustment should be based on characters. * test/ruby/test_m17n.rb (TestM17N::test_str_insert): test updated to check negative offset behavior. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14340 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-19 17:02:29 +00:00
nobu	474a88f041	* re.c (rb_reg_regsub): should set checked encoding. * string.c (rb_str_sub_bang): applied r14212 too. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14333 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-19 12:42:19 +00:00
akr	2d01290cfd	* parse.y (arg tMATCH arg): call reg_named_capture_assign_gen if regexp literal is used. (reg_named_capture_assign_gen): assign the result of named capture into local variables. [ruby-dev:32588] * re.c: document the assignment by named captures. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14297 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-18 11:26:24 +00:00
matz	ebfcc5d933	* re.c (rb_reg_initialize): raise error if non-Unicode fixed encoding option is specified for regexp literals with \u{} escapes. * string.c (rb_str_squeeze_bang): should squeeze multibyte characters as well. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14275 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-17 16:06:21 +00:00
matz	d6a70c4bb7	* string.c (scan_once): need no encoding compatibility check. it's done inside of re_reg_seach(). * string.c (rb_str_split_m): ditto. * re.c (rb_reg_regsub): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14269 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-17 09:44:06 +00:00
matz	7bb3ea6afa	* re.c (rb_reg_initialize): embedded string may override encoding of the regular expression. * re.c (rb_reg_initialize): fix encoding of regular expression if embedded string has its own encoding specified. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14218 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-13 16:09:53 +00:00
matz	a648fc802b	* encoding.c (rb_enc_compatible): encoding should never fall back to ASCII-8BIT unless both encodings are ASCII-8BIT. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14217 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-13 13:44:02 +00:00
akr	b92cee1ddb	* re.c, regerror.c, string.c, parse.y, ruby.c, file.c: use capital letter for \xHH notation. [ruby-dev:32511] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14202 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-12 14:30:54 +00:00
nobu	ad72efa269	* re.c (rb_reg_regsub): should copy encoding. * string.c (rb_str_sub_bang, str_gsub): should check and copy encoding to be replaced. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-12 03:11:44 +00:00
akr	646f27b822	* encoding.c (rb_enc_ascget): renamed from rb_enc_get_ascii. * include/ruby/encoding.h: follow the renaming. * re.c: ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14195 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-11 07:39:16 +00:00
akr	5802768b40	* encoding.c (rb_enc_get_ascii): add an argument to provide the length of the returned character. * include/ruby/encoding.h (rb_enc_get_ascii): add the argument. * re.c (rb_reg_expr_str): modify rb_enc_get_ascii call. (rb_reg_quote): ditto. (rb_reg_regsub): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14190 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-11 03:08:50 +00:00
matz	f6a9c859be	* re.c (rb_reg_match): should calculate offset by converted operand. [ruby-cvs:21416] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14180 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-10 10:03:48 +00:00
nobu	38a24d73c8	* re.c (rb_reg_search): return byte offset. [ruby-dev:32452] * re.c (rb_reg_match, rb_reg_match2, rb_reg_match_m): convert byte offset to char index. * string.c (rb_str_index): return byte offset. [ruby-dev:32472] * string.c (rb_str_split_m): calculate in byte offset. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14171 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-10 04:50:35 +00:00
akr	f4592d7bb0	* re.c (rb_reg_expr_str): use \xHH instead of \OOO. * regerror.c (to_ascii): ditto. (onig_snprintf_with_pattern): ditto. (onig_snprintf_with_pattern): ditto. * string.c (rb_str_inspect): ditto. (rb_str_dump): ditto. * parse.y (parser_yylex): ditto. * ruby.c (proc_options): ditto. * file.c (rb_f_test): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14164 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-09 21:48:05 +00:00
akr	08eb58d3dd	* re.c (rb_reg_names): new method Regexp#names. (rb_reg_named_captures): new method Regexp#named_captures (match_regexp): new method MatchData#regexp. (match_names): new method MatchData#names. * lib/pp.rb (MatchData#pretty_print): show names of named captures. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14163 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-09 21:44:19 +00:00
akr	e56e8c758d	* re.c (rb_reg_s_last_match): accept named capture's name. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14161 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-09 13:35:38 +00:00
akr	2d101f0a87	Regexp#fixed_encoding? documented. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14160 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-09 11:57:06 +00:00
akr	7a7c26be73	document named capture of MatchData#{offset,begin,end,inspect}. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14159 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-09 07:35:54 +00:00
akr	18d8fbac54	* re.c (match_backref_number): new function for converting a backref name/number to an integer. (match_offset): use match_backref_number. (match_begin): ditto. (match_end): ditto. (name_to_backref_number): raise IndexError instead of RuntimeError. (match_inspect): show capture index. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14158 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-09 07:12:44 +00:00
akr	5a1c2b2677	* re.c (append_utf8): check unicode range. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-09 03:50:11 +00:00
akr	b12bb50149	* re.c (rb_reg_check_preprocess): new function for validating regexp fragment. * parse.y (regexp): invoke reg_fragment_check. (reg_fragment_check): defined. (reg_fragment_check_gen): defined. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14133 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-08 07:21:05 +00:00
akr	f1b7e60cb9	* encoding.c (rb_enc_mbclen): make it never fail. (rb_enc_nth): don't check the return value of rb_enc_mbclen. (rb_enc_strlen): ditto. (rb_enc_precise_mbclen): return needmore(1) if e <= p. (rb_enc_get_ascii): new function for extracting ASCII character. * include/ruby/encoding.h (rb_enc_get_ascii): declared. * include/ruby/regex.h (ismbchar): removed. * re.c (rb_reg_expr_str): use rb_enc_get_ascii. (unescape_escaped_nonascii): use rb_enc_precise_mbclen to determine the termination of escaped non-ASCII character. (unescape_nonascii): use rb_enc_precise_mbclen. (rb_reg_quote): use rb_enc_get_ascii. (rb_reg_regsub): use rb_enc_get_ascii. * string.c (rb_str_reverse) don't check the return value of rb_enc_mbclen. (rb_str_split_m): don't call rb_enc_mbclen with e <= p. * parse.y (is_identchar): use ISASCII. (parser_ismbchar): removed. (parser_precise_mbclen): new macro. (parser_isascii): new macro. (parser_tokadd_mbchar): use parser_precise_mbclen to check invalid character precisely. (parser_tokadd_string): use parser_isascii. (parser_yylex): ditto. (is_special_global_name): don't call is_identchar with e <= p. (rb_enc_symname_p): ditto. [ruby-dev:32455] * ext/tk/sample/tkextlib/vu/canvSticker2.rb: remove coding cookie because the encoding is not UTF-8. [ruby-dev:32475] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-08 02:50:43 +00:00
akr	6af5227ec0	fix Regexp#inspect document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-02 15:46:21 +00:00
akr	7f65110b53	document MatchData#inspect. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-02 14:42:05 +00:00
akr	c650096adf	* re.c (unescape_escaped_nonascii): fix mbclen argument. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14084 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-02 11:45:02 +00:00
akr	9bd11f24b3	s/unicode/Unicode/ in error messages. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14078 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-02 02:53:46 +00:00
akr	7ff702406a	* include/ruby/intern.h (rb_uv_to_utf8): declared. * re.c (rb_reg_preprocess): new function for dynamic regexp with \u{} such as Regexp.new("\\u{6666}"). (rb_reg_prepare_re): preprocess regexp for recompiling. (read_escaped_byte): new function. (unescape_escaped_nonascii): new function. (append_utf8): new function. (unescape_unicode_list): new function. (unescape_unicode_bmp): new function. (unescape_nonascii): new function. (rb_reg_initialize): preprocess regexp. * pack.c (rb_uv_to_utf8): renamed from uv_to_utf8. * parse.y (STR_NEW3): take func instead of has8 and hasmb. (parser_str_new): use default coderange mechanism except for regexp. (parser_tokadd_utf8): copy regexp source as-is. (parser_read_escape): UTF-8 stuff removed. (parser_tokadd_escape): has8bit and hasmb removed. (parser_tokadd_string): fix 8-bit single byte character with \u. (parser_parse_string): has8bit and hasmb removed. (parser_here_document): has8bit and hasmb removed. (parser_yylex): call parser_tokadd_utf8 instead of read_escape for UTF-8 character. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-01 16:56:19 +00:00
akr	f5ee0fd521	* include/ruby/encoding.h, encoding.c, re.c, string.c, parse.y: rename ENC_CODERANGE_SINGLE to ENC_CODERANGE_7BIT. rename ENC_CODERANGE_MULTI to ENC_CODERANGE_8BIT. Because single byte 8bit character, such as Shift_JIS 1byte katakana, is represented by ENC_CODERANGE_MULTI even if it is not multi byte. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14027 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-11-27 02:21:17 +00:00
akr	cbd72b86da	* re.c (Init_Regexp): new method Regexp#fixed_encoding? git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-11-26 08:33:11 +00:00
akr	0dc9be63a5	* re.c (rb_reg_fixed_encoding_p): extracted from rb_reg_prepare_re and rb_reg_s_union. (rb_reg_s_union): refactored. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-11-26 02:27:59 +00:00
akr	b2e60b2ce7	* include/ruby/encoding.h (rb_enc_str_asciionly_p): declared. (rb_enc_str_asciicompat_p): defined. * re.c (rb_reg_initialize_str): use rb_enc_str_asciionly_p. (rb_reg_quote): return ascii-8bit string if the argument is ascii-only to generate encoding generic regexp if possible. (rb_reg_s_union): fix encoding handling. [ruby-dev:32094] * string.c (rb_enc_str_asciionly_p): defined. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14013 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-11-25 13:25:34 +00:00
akr	2109a52503	* re.c (REG_CASESTATE): unused macro removed. (rb_reg_prepare_re): check encoding difference. (rb_reg_initialize): check 8bit byte. * parse.y (parser_tokadd_escape): fix has8bit. [ruby-dev:32113] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-11-23 06:30:26 +00:00
matz	d73f08d56d	* re.c (match_begin): should return offset by character. [ruby-dev:32331] * re.c (match_end): ditto. * re.c (rb_reg_search): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13999 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-11-23 02:10:44 +00:00
akr	af9c868eae	* re.c (rb_reg_quote): quote \v as well. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13818 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-11-04 15:03:31 +00:00
akr	794fc684e8	* re.c (rb_reg_initialize_m): use StringValuePtr instead of StringValueCStr because \0 exists when Regexp.new("\0"). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13817 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-11-04 14:53:36 +00:00
nobu	c7697aba34	* parse.y (parser_regx_options, reg_compile_gen): relaxened encoding matching rule. * re.c (rb_reg_initialize): always set encoding of Regexp. * re.c (rb_reg_initialize_str): fix enconding for non 7bit-clean strings. * re.c (rb_reg_initialize_m): use ascii encoding for 'n' option. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13743 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-19 07:41:03 +00:00
matz	05737c3500	* re.c (rb_reg_s_union): the last check was not complete. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13733 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-17 05:21:10 +00:00
nobu	2d1d6c4705	* encoding.c (rb_enc_from_encoding, rb_enc_register): associate index to self. * encoding.c (enc_capable): Encoding objects are encoding capable. * re.c (rb_reg_s_union): check if encoding matching by exact encoding objects. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13732 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-17 02:30:57 +00:00
nobu	b06a606278	* re.c (rb_reg_desc): set encoding. * re.c (rb_reg_s_union): check encodings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13728 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-16 18:37:09 +00:00
nobu	81ed881511	* re.c (rb_reg_initialize_m): allow binary encoding option. [ruby-dev:32083] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13725 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-16 16:57:08 +00:00
nobu	5d8ba5a43f	* re.c (rb_reg_s_union): check for encoding of original object. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13723 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-16 10:48:02 +00:00
nobu	676dc908b6	* parse.y (parser_regx_options): check if regexp encoding option matches to current encoding. * re.c (char_to_option, rb_char_to_option_kcode): 'n' is not kcode option now. * re.c (rb_reg_to_s, rb_reg_error_desc): copy encoding rather than append as an option. * re.c (make_regexp, rb_reg_prepare_re): use encoding of Regexp and String instead of kcode. * re.c (rb_reg_initialize): set fixed option if none is set. * re.c (rb_reg_regcomp): ditto. * re.c (rb_reg_equal): check if encodings are equal. * re.c (rb_reg_initialize_m): encoding option is obsolete. * re.c (rb_kcode, rb_get_kcode, rb_set_kcode): removed. * re.c (Init_Regexp): removed Regexp#kcode method. * ruby.c (proc_options): allow long encoding name. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13717 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-16 05:48:40 +00:00
matz	9f00119776	* re.c (rb_reg_s_union): encoding of all regexp objects should match. [ruby-dev:32076] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13716 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-16 05:06:30 +00:00
matz	ba9eb2c929	* re.c (match_values_at): make #select to be alias to #values_at to adapt RDoc description. [ruby-core:12588] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13683 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-12 14:35:26 +00:00
matz	79a202433c	* re.c (rb_reg_s_quote): no longer takes optional second argument that has never been documented. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13671 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-10 14:34:42 +00:00
akr	cf9bdd01d8	fix rdoc position of Regexp.union. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13658 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-08 16:03:53 +00:00
akr	d751dad12a	add an example for Regexp.union document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13642 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-06 05:40:45 +00:00
nobu	6845578c92	* insns.def (opt_eq): get rid of gcc bug. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13641 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-06 05:32:37 +00:00
matz	bd00bb3ef7	* include/ruby/defines.h: no longer provide DEFAULT_KCODE. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13640 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-05 17:39:59 +00:00
akr	dea669cf4e	* re.c (rb_reg_s_union_m): Regexp.union accepts single argument which is an array of patterns. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13638 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-05 12:26:35 +00:00
matz	1d758debe0	replace rb_memcicmp() git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13624 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-04 09:59:56 +00:00
matz	c953283d7e	revert rb_memcmp() change to pacify GCC optimizer git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13623 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-04 09:54:53 +00:00
matz	1677425e9d	* re.c (rb_memcmp): no longer useful without ruby_ignorecase. * re.c (rb_reg_prepare_re): revert recompile condition. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13622 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-04 09:24:00 +00:00
matz	506cdbf64a	* re.c (kcode_setter): restore erroneously removed setter. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13621 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-04 08:34:06 +00:00
matz	dbcc539602	* re.c (ignorecase_setter): change warning message. * re.c (ignorecase_getter): now gives warning. * string.c (rb_str_cmp_m): update RDoc document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13620 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-04 08:09:06 +00:00
matz	9a2a45cd69	* re.c (Init_Regexp): remove obsolete const alias: MatchingData. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-04 07:54:53 +00:00
matz	1c9a2e1154	* re.c (kcode_setter): Perl-ish global variable `$=' no longer effective. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13616 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-04 07:31:50 +00:00
nobu	19dee8af57	* encoding.c (rb_obj_encoding): returns encoding of the given object. * re.c (Init_Regexp): new method Regexp#encoding. * string.c (str_encoding): moved to encoding.c git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13613 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-10-04 06:57:19 +00:00
akr	910b0709ed	* re.c (Init_Regexp): test DEFAULT_KCODE in C code because KCODE_EUC, etc are enum. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13571 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-09-29 19:06:40 +00:00
matz	5376745fb6	* re.c (rb_reg_match_m): evaluate a block if match. it would make condition statement much shorter, if no else clause is needed. * string.c (rb_str_match_m): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13475 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-09-20 17:14:01 +00:00
matz	edd7c787ad	* array.c (rb_ary_cycle): typo in rdoc. a patch from Yugui <yugui@yugui.sakura.ne.jp>. [ruby-dev:31748] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13348 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-09-06 12:33:45 +00:00
matz	3d7f8c2320	* string.c (str_gsub): should not use mbclen2() which has broken API. * re.c: remove rb_reg_mbclen2(). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-08-29 19:16:02 +00:00
nobu	69099d3e69	* re.c (rb_reg_mbclen2): suppress a warning. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-08-29 03:19:00 +00:00
matz	51b4cc11d1	* string.c (rb_str_subseq): retrieve substring based on byte offset. * string.c (rb_str_rindex_m): was confusing character offset and byte offset. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-08-28 06:45:32 +00:00
nobu	c456863bd6	* parse.y, re.c: re-applied revision 13092. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13267 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-08-25 07:06:47 +00:00
matz	a25fbe3b3e	* encoding.c: provide basic features for M17N. * parse.y: encoding aware parsing. * parse.y (pragma_encoding): encoding specification pragma. * parse.y (rb_intern3): encoding specified symbols. * string.c (rb_str_length): length based on characters. for older behavior, bytesize method added. * string.c (rb_str_index_m): index based on characters. rindex as well. * string.c (succ_char): encoding aware succeeding string. * string.c (rb_str_reverse): reverse based on characters. * string.c (rb_str_inspect): encoding aware string description. * string.c (rb_str_upcase_bang): encoding aware case conversion. downcase, capitalize, swapcase as well. * string.c (rb_str_tr_bang): tr based on characters. delete, squeeze, tr_s, count as well. * string.c (rb_str_split_m): split based on characters. * string.c (rb_str_each_line): encoding aware each_line. * string.c (rb_str_each_char): added. iteration based on characters. * string.c (rb_str_strip_bang): encoding aware whitespace stripping. lstrip, rstrip as well. * string.c (rb_str_justify): encoding aware justifying (ljust, rjust, center). * string.c (str_encoding): get encoding attribute from a string. * re.c (rb_reg_initialize): encoding aware regular expression * sprintf.c (rb_str_format): formatting (i.e. length count) based on characters. * io.c (rb_io_getc): getc to return one-character string. for older behavior, getbyte method added. * ext/stringio/stringio.c (strio_getc): ditto. * io.c (rb_io_ungetc): allow pushing arbitrary string at the current reading point. * ext/stringio/stringio.c (strio_ungetc): ditto. * ext/strscan/strscan.c: encoding support. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13261 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-08-25 03:29:39 +00:00
matz	bdf32ff14f	* array.c (rb_ary_s_try_convert): more document description. * re.c (rb_reg_s_try_convert): typo fixed. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13256 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-08-25 00:43:13 +00:00
matz	5e1c401ff5	* array.c (rb_ary_s_try_convert): a new class method to convert object or nil if it's not target-type. this mechanism is used to convert types in the C implemented methods. * hash.c (rb_hash_s_try_convert): ditto. * io.c (rb_io_s_try_convert): ditto. * re.c (rb_reg_s_try_convert): ditto. * string.c (rb_str_s_try_convert): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13251 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-08-24 17:47:09 +00:00
nobu	3f025d2078	* parse.y (reg_compile_gen): obtain error info from errinfo. * re.c (rb_reg_error_desc): make RegexpError for initialization error. * re.c (rb_reg_compile): return nil and set errinfo if error. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13092 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-08-18 05:05:36 +00:00
nobu	11e1e96f4b	* re.c (option_to_str, arg_kcode, opt_kcode): options conversion between int and string. * re.c (rb_reg_compile): append regexp options to error message. [ruby-dev:31334] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12863 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-08-02 14:42:59 +00:00
nobu	d9274e7d6b	* parse.y (reg_compile_gen): set error if failed to compile regexp literal. [ruby-dev:31336] * re.c (rb_reg_compile): should not use regexp which could not get initialized. [ruby-dev:31333] return error message to let the parser know it. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12862 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-08-02 14:36:25 +00:00
nobu	46603a78af	* include/ruby/{intern,ruby}.h, compile.[ch], error.c, eval.c, eval_load.c, gc.c, iseq.c, main.c, parse.y, re.c, ruby.c, yarvcore.[ch] (ruby_eval_tree, ruby_sourcefile, ruby_sourceline, ruby_nerrs): purge global variables. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12700 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-07-05 08:12:18 +00:00
akr	fe377d3b8e	update document to follow MatchData#inspect implementation. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12589 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-06-23 08:34:21 +00:00
akr	18ee945174	* re.c (match_inspect): MatchData#inspect implemented. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12588 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-06-23 08:26:08 +00:00
nobu	2b592580bf	* include/ruby: moved public headers. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12501 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-06-10 03:06:15 +00:00
nobu	99d65b14b4	* compile.c, dir.c, eval.c, eval_jump.h, eval_method.h, numeric.c, pack.c, parse.y, re.c, thread.c, vm.c, vm_dump.c, call_cfunc.ci, thread_pthread.ci, thread_win32.ci: fixed indentation. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12431 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-06-05 04:25:10 +00:00
matz	6ee2e54239	* oniguruma.h: updated to Oniguruma 5.7.0. * regsyntax.c, unicode.c: new files along with Oniguruma 5.x. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12376 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-05-23 01:32:08 +00:00
matz	3098d80818	* re.c (reg_operand): allow symbols to be operands for regular expression matches. * string.c (Init_String): allow Symbol#===. * lib/date/format.rb (Date::Format::Bag::to_hash): string added prefixes. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@11723 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-02-14 04:57:25 +00:00
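
Several commits in this log add user-visible Regexp and MatchData behaviour: Regexp#names, Regexp#named_captures, MatchData#names and MatchData#regexp (r14163), name arguments for MatchData#begin (r14158), local-variable assignment from named captures (r14297), Regexp.union with a single array argument (r13638), and Regexp#fixed_encoding? (r14021). The sketch below shows how these behave in a current Ruby release; it is an illustration rather than code from the repository, the names DATE_RE and parse_date are hypothetical, and the 2007-era trunk may have differed in details.

```ruby
# Illustrative sketch only: DATE_RE and parse_date are hypothetical names,
# and the behaviour shown is that of a current Ruby release.
DATE_RE = /(?<year>\d+)-(?<month>\d+)/

# Regexp#names and Regexp#named_captures list the capture names and
# the group indexes each name refers to.
p DATE_RE.names
p DATE_RE.named_captures

# MatchData#names and MatchData#regexp expose the same information from a match;
# MatchData#begin accepts a capture name as well as a number.
m = DATE_RE.match("2008-01")
p m.names
p(m.regexp == DATE_RE)
p m.begin("month")

# A regexp literal on the left of =~ assigns named captures to local variables.
def parse_date(text)
  if /(?<year>\d+)-(?<month>\d+)/ =~ text
    [year, month]
  end
end
p parse_date("2008-01")

# Regexp.union accepts a single array of patterns.
p Regexp.union(%w[cat dog]).source

# Regexp#fixed_encoding? reports whether the regexp is tied to one encoding.
p(/a/.fixed_encoding?)
p(/a/u.fixed_encoding?)
```

The named-capture assignment only applies when the regexp is written as a literal on the left-hand side of `=~`, which is why parse_date repeats the pattern instead of reusing DATE_RE.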


308 Commits