github/ruby - ruby

Commit graph

Author	SHA1	Message	Date
Alan Wu	5a570421a5	[DOC] Regexp.last_match returns `$~`, not `$!`	2024-08-09 16:02:36 -04:00
Peter Zhu	7464514ca5	Fix memory leak in String#start_with? when regexp times out [Bug #20653] This commit refactors how Onigmo handles timeout. Instead of raising a timeout error, onig_search will return a ONIGERR_TIMEOUT which the caller can free memory, and then raise a timeout error. This fixes a memory leak in String#start_with when the regexp times out. For example: regex = Regexp.new("^#{"(a*)*" * 10_000}x$", timeout: 0.000001) str = "a" * 1000000 + "x" 10.times do 100.times do str.start_with?(regex) rescue end puts `ps -o rss= -p #{$$}` end Before: 33216 51936 71152 81728 97152 103248 120384 133392 133520 133616 After: 14912 15376 15824 15824 16128 16128 16144 16144 16160 16160	2024-07-26 08:42:38 -04:00
Shugo Maeda	e048a073a3	Add MatchData#bytebegin and MatchData#byteend These methods return the byte-based offset of the beginning or end of the specified match. [Feature #20576]	2024-07-16 14:48:06 +09:00
Jean Boussier	3a7846b1aa	Add a hint of `ASCII-8BIT` being `BINARY` [Feature #18576] Since outright renaming `ASCII-8BIT` is deemed too backward incompatible, the next best thing would be to only change its `#inspect`, particularly in exception messages.	2024-04-18 10:17:26 +02:00
Peter Zhu	01bfd1a2bf	Fix memory leak in OnigRegion when match raises [Bug #20228] rb_reg_onig_match can raise a Regexp::TimeoutError, which would cause the OnigRegion to leak.	2024-02-02 10:39:42 -05:00
Peter Zhu	1c120efe02	Fix memory leak in stk_base when Regexp timeout [Bug #20228] If rb_reg_check_timeout raises a Regexp::TimeoutError, then the stk_base will leak.	2024-02-02 10:39:42 -05:00
git	5b6167c252	* expand tabs. [ci skip] Please consider using misc/expand_tabs.rb as a pre-commit hook.	2024-01-07 15:50:59 +00:00
Nobuyoshi Nakada	c30b8ae947	Adjust styles and indents [ci skip]	2024-01-08 00:50:41 +09:00
Luke Gruber	e12d4c654e	Don't create T_MATCH object if /regexp/.match(string) doesn't match Fixes [Bug #20104]	2024-01-01 13:28:26 -08:00
Peter Zhu	f0efeddd41	Fix Regexp#inspect for GC compaction rb_reg_desc was not safe for GC compaction because it took in the C string and length but not the backing String object so it get moved during compaction. This commit changes rb_reg_desc to use the string from the Regexp object. The test fails when RGENGC_CHECK_MODE is turned on: TestRegexp#test_inspect_under_gc_compact_stress [test/ruby/test_regexp.rb:474]: <"(?-mix:\\/)|"> expected but was <"/\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00/">.	2023-12-24 11:04:41 -05:00
Peter Zhu	42442ed789	Fix Regexp#match for GC compaction The test fails when RGENGC_CHECK_MODE is turned on: TestRegexp#test_match_under_gc_compact_stress: NoMethodError: undefined method `match' for nil test_regexp.rb:878:in `block in test_match_under_gc_compact_stress'	2023-12-24 09:03:55 -05:00
Peter Zhu	fadda88903	Fix Regexp#to_s for GC compaction The test fails when RGENGC_CHECK_MODE is turned on: TestRegexp#test_to_s_under_gc_compact_stress = 13.46 s 1) Failure: TestRegexp#test_to_s_under_gc_compact_stress [test/ruby/test_regexp.rb:81]: <"(?-mix:abcd\u3042)"> expected but was <"(?-mix:\u5C78\u3030\u5C78\u3030\u5C78\u3030\u5C78\u3030\u5C78\u3030)">.	2023-12-23 16:52:05 -05:00
Nobuyoshi Nakada	dee45ac231	[DOC] State MatchData#[] when multiple captures with the same name	2023-12-19 13:48:51 +09:00
Victor Shepelev	570d7b2c3e	[DOC] Adjust some new features wording/examples. (#9183) * Reword Range#overlap? docs last paragraph. * Docs: add explanation about Queue#freeze * Docs: Add :rescue event docs for TracePoint * Docs: Enhance Module#set_temporary_name documentation * Docs: Slightly expand Process::Status deprecations * Fix MatchData#named_captures rendering glitch * Improve Dir.fchdir examples * Adjust Refinement#target docs	2023-12-14 23:01:48 +02:00
Dustin Brown	d89280e8bf	Copy encoding flags when copying a regex [Bug #20039] * 🐛 Fixes [Bug #20039](https://bugs.ruby-lang.org/issues/20039) When a Regexp is initialized with another Regexp, we simply copy the properties from the original. However, the flags on the original were not being copied correctly. This caused an issue when the original had multibyte characters and was being compared with an ASCII string. Without the forced encoding flag (`KCODE_FIXED`) transferred on to the new Regexp, the comparison would fail. See the included test for an example. Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>	2023-12-06 19:25:29 -08:00
Nobuyoshi Nakada	caa9881fde	[DOC] Fix doc/regexp.rdoc links - Rename regexp.rdoc to exclude from "Pages". This file is meant to be included in the "class Regexp" document, but it also appeared as a duplicate separate page. - Fix links on case-sensitive filesystems. - Fix to use rdoc-ref instead of converted HTML page names.	2023-11-14 15:56:57 +09:00
Herwin	8b3d044004	[DOC] Indentation fix in comments of MatchData#inspect The old version did not add syntax highlighting to the code block, and included the "Related:" line in the code block as well.	2023-10-20 18:26:37 +09:00
Herwin	3467355450	[DOC] Fix typo in docs of Regexp#deconstruct_keys of => if	2023-10-20 07:18:03 +09:00
Peter Zhu	d42b9ffb20	Reuse Regexp ptr when recompiling When matching an incompatible encoding, the Regexp needs to recompile. If `usecnt == 0`, then we can reuse the `ptr` because nothing else is using it. This avoids allocating another `regex_t`. This speeds up matches that switch to incompatible encodings by 15%. Branch: ``` Regex#match? with different encoding 1.431M (± 1.3%) i/s - 7.264M in 5.076153s Regex#match? with same encoding 16.858M (± 1.1%) i/s - 85.347M in 5.063279s ``` Base: ``` Regex#match? with different encoding 1.248M (± 2.0%) i/s - 6.342M in 5.083151s Regex#match? with same encoding 16.377M (± 1.1%) i/s - 82.519M in 5.039504s ``` Script: ``` regex = /foo/ str1 = "日本語" str2 = "English".force_encoding("ASCII-8BIT") Benchmark.ips do |x| x.report("Regex#match? with different encoding") do |times| i = 0 while i < times regex.match?(str1) regex.match?(str2) i += 1 end end x.report("Regex#match? with same encoding") do |times| i = 0 while i < times regex.match?(str1) i += 1 end end end ```	2023-07-31 09:17:18 -04:00
Takashi Kokubun	9721972175	Resurrect rb_reg_prepare_re C API Existing strscan releases rely on this C API. It means that the current Ruby master doesn't work if your Gemfile.lock has strscan unless it's locked to 3.0.7, which is not released yet. To fix it, let's not remove the C API we've exposed to users.	2023-07-27 15:30:10 -07:00
Peter Zhu	69b20d1196	Don't load RREGEXP_PTR twice	2023-07-27 14:41:12 -04:00
Peter Zhu	511c51e116	Refactor err string in rb_reg_prepare_re	2023-07-27 14:04:02 -04:00
Peter Zhu	7193b404a1	Add function rb_reg_onig_match rb_reg_onig_match performs preparation, error handling, and cleanup for matching a regex against a string. This reduces repetitive code and removes the need for StringScanner to access internal data of regex.	2023-07-27 13:33:40 -04:00
Kunshan Wang	639aa76e82	Embed struct rmatch into GC slot (#8097)	2023-07-20 14:17:38 -04:00
Nobuyoshi Nakada	913e01e80e	Stop allocating unused backref strings at `defined?`	2023-06-27 23:14:10 +09:00
Nobuyoshi Nakada	df5ae0a550	Use `rb_reg_nth_defined` instead of `rb_match_nth_defined`	2023-06-27 22:39:15 +09:00
Burdette Lamar	932dd9f10e	[DOC] Regexp doc (#7923)	2023-06-20 09:28:21 -04:00
git	d7300038e4	* expand tabs. [ci skip] Please consider using misc/expand_tabs.rb as a pre-commit hook.	2023-06-09 12:45:58 +00:00
Nobuyoshi Nakada	ab6eb3786c	Optimize `Regexp#dup` and `Regexp.new(/RE/)` When copying from another regexp, copy already built `regex_t` instead of re-compiling its source.	2023-06-09 20:22:30 +09:00
Jeremy Evans	a8ba1ddd78	Use UTF-8 encoding for literal extended regexps with UTF-8 characters in comments Fixes [Bug #19455]	2023-04-23 19:27:58 -07:00
Vladimir Dementyev	b09f5c7bf7	MatchData#named_captures: add optional symbolize_names keyword (#6952)	2023-04-19 11:19:31 +12:00
Matt Valentine-House	026321c5b9	[Feature #19474] Refactor NEWOBJ macros NEWOBJ_OF is now our canonical newobj macro. It takes an optional ec	2023-04-06 11:07:16 +01:00
Takashi Kokubun	233ddfac54	Stop exporting symbols for MJIT	2023-03-06 21:59:23 -08:00
Nobuyoshi Nakada	a5310e609d	[DOC] Fix options of `Regexp#initialize` `Integer#|` is bit-wise OR operator, not logical OR.	2023-03-06 13:57:17 +09:00
Nobuyoshi Nakada	8ee604b9d4	`rb_scan_args` never fills optional arguments with `Qundef`	2023-03-06 13:57:17 +09:00
Nobuyoshi Nakada	680bd9027f	[Bug #19471] `Regexp.compile` should handle keyword arguments As well as `Regexp.new`, it should pass keyword arguments to the `Regexp#initialize` method.	2023-03-03 15:27:37 +09:00
Jeremy Evans	04cfb26bd3	Remove support for the Regexp.new 3rd argument This was deprecated in Ruby 3.2. Fixes [Bug #18797]	2023-03-01 23:42:47 -08:00
Nobuyoshi Nakada	ef00c6da88	Adjust `else` style to be consistent in each files [ci skip]	2023-02-26 13:20:43 +09:00
BurdetteLamar	3b239d2480	Remove (newly unneeded) remarks about aliases	2023-02-19 14:26:34 -08:00
Jean Boussier	46298955e4	Implement Write Barrier for RMatch objects They only have two references.	2023-02-10 16:12:22 +01:00
OKURA Masafumi	11e0f62148	[DOC] Fix typo in document of regexp [ci skip]	2023-02-10 18:32:21 +09:00
Nobuyoshi Nakada	b49cd84311	Remove `REG_LITERAL` flag All `Regexp` literals are frozen now.	2023-02-09 19:21:24 +09:00
Jeremy Evans	eccfc978fd	Fix parsing of regexps that toggle extended mode on/off inside regexp This was broken in `ec3542229b`. That commit didn't handle cases where extended mode was turned on/off inside the regexp. There are two ways to turn extended mode on/off: ``` /(?-x:#y)#z /x =~ '#y' /(?-x)#y(?x)#z /x =~ '#y' ``` These can be nested inside the same regexp: ``` /(?-x:(?x)#x (?-x)#y)#z /x =~ '#y' ``` As you can probably imagine, this makes handling these regexps somewhat complex. Due to the nesting inside portions of regexps, the unassign_nonascii function needs to be recursive. In recursive mode, it needs to track both opening and closing parentheses, similar to how it already tracked opening and closing brackets for character classes. When scanning the regexp and coming to `(?` not followed by `#`, scan for options, and use `x` and `i` to determine whether to turn on or off extended mode. For `:`, indicating only the current regexp section should have the extended mode switched, recurse with the extended mode set or unset. For `)`, indicating the remainder of the regexp (or current regexp portion if already recursing) should turn extended mode on or off, just change the extended mode flag and keep scanning. While testing this, I noticed that `a`, `d`, and `u` are accepted as options, in addition to `i`, `m`, and `x`, but I can't see where those options are documented. I'm not sure whether or not handling `a`, `d`, and `u` as options is a bug. Fixes [Bug #19379]	2023-01-30 08:51:12 -08:00
Burdette Lamar	30bd2a32fa	[DOC] Correction to RDoc for Regexp.new (#7130) Correction to RDoc for Regexp.new	2023-01-16 11:02:23 -06:00
Jeremy Evans	7e8fa06022	Always issue deprecation warning when calling Regexp.new with 3rd positional argument Previously, only certain values of the 3rd argument triggered a deprecation warning. First step for fix for bug #18797. Support for the 3rd argument will be removed after the release of Ruby 3.2. Fix minor fallout discovered by the tests. Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>	2022-12-22 11:50:26 -08:00
Nobuyoshi Nakada	e61e4ae60b	Refactor `reg_extract_args` to return regexp if given	2022-12-22 19:27:27 +09:00
Nobuyoshi Nakada	454c00723a	Share argument parsing in `Regexp#initialize` and `Regexp.linear_time?`	2022-12-22 15:51:00 +09:00
卜部昌平	34d43ed9f5	typo in doc [ci skip]	2022-12-19 11:20:55 +09:00
卜部昌平	47a6e7b518	Note about Regexp.linear_time? [ci skip]	2022-12-19 11:05:55 +09:00
TSUYUSATO Kitsune	fbedadb61f	Add `Regexp.linear_time?` (#6901)	2022-12-14 12:57:14 +09:00
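
Several of the commits listed above touch the Regexp timeout machinery and `Regexp.linear_time?`. The following is a minimal sketch of how those user-facing APIs fit together, assuming Ruby 3.2 or later. The helper name `match_or_nil` and the sample patterns are hypothetical and do not come from any of the commits.

```ruby
# Regexp.linear_time? (added in fbedadb61f) reports whether a pattern
# can be matched in time linear to the input length.
plain = /re/
puts Regexp.linear_time?(plain)

# A per-regexp timeout in seconds; Regexp#timeout reads the value back.
guarded = Regexp.new("^(a+)+x$", timeout: 0.5)
puts guarded.timeout

# Hypothetical helper: treat a timed-out match as "no answer" (nil)
# instead of letting Regexp::TimeoutError propagate to the caller.
def match_or_nil(regexp, string)
  regexp.match?(string)
rescue Regexp::TimeoutError
  nil
end

puts match_or_nil(guarded, "aaax")

# Regexp.last_match returns the same MatchData as $~ (see 5a570421a5).
"abc" =~ /b/
puts Regexp.last_match == $~
```

Whether a given pattern actually hits its timeout depends on the input and on the engine's optimizations, so the sketch only shows the rescue path and does not try to force a timeout.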


649 commits