Commit graph

87 commits

Author SHA1 Message Date
Peter Zhu 7464514ca5 Fix memory leak in String#start_with? when regexp times out
[Bug #20653]

This commit refactors how Onigmo handles timeouts. Instead of raising a
timeout error, onig_search now returns ONIGERR_TIMEOUT, so the caller can
free memory and then raise a timeout error.

This fixes a memory leak in String#start_with? when the regexp times out.
For example:

    regex = Regexp.new("^#{"(a*)" * 10_000}x$", timeout: 0.000001)
    str = "a" * 1000000 + "x"

    10.times do
      100.times do
        str.start_with?(regex)
      rescue
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    33216
    51936
    71152
    81728
    97152
    103248
    120384
    133392
    133520
    133616

After:

    14912
    15376
    15824
    15824
    16128
    16128
    16144
    16144
    16160
    16160
2024-07-26 08:42:38 -04:00
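The control flow this commit describes (return a status code, let the caller clean up, then raise) can be sketched in Ruby. This is only an illustration under stated assumptions: `ScratchBuffer`, `search_with_budget`, `start_with_sketch`, and `SketchTimeoutError` are hypothetical stand-ins, not Onigmo or Ruby APIs, and the "budget" replaces a real clock-based timeout.

```ruby
# Hypothetical error class standing in for the real timeout error.
class SketchTimeoutError < StandardError; end

# Hypothetical stand-in for a manually managed allocation.
class ScratchBuffer
  attr_reader :freed

  def initialize
    @freed = false
  end

  def free
    @freed = true
  end
end

# Returns a status instead of raising, so the caller keeps control
# of cleanup. The step budget is an assumption made for this sketch.
def search_with_budget(steps_needed, budget)
  steps_needed > budget ? :timeout : :match
end

def start_with_sketch(buffer, steps_needed, budget)
  status = search_with_budget(steps_needed, budget)
  buffer.free # cleanup runs on every path, including timeout
  raise SketchTimeoutError, "regexp match timeout" if status == :timeout

  status == :match
end

buf = ScratchBuffer.new
begin
  start_with_sketch(buf, 10, 5)
rescue SketchTimeoutError
  puts "timed out, buffer freed: #{buf.freed}"
end
```

If the search raised directly, control would leave `start_with_sketch` before `buffer.free` ran, which is the shape of the leak the commit message reports.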
Peter Zhu 1c120efe02 Fix memory leak in stk_base when Regexp timeout
[Bug #20228]

If rb_reg_check_timeout raises a Regexp::TimeoutError, then the stk_base
will leak.
2024-02-02 10:39:42 -05:00
Hiroya Fujinami 34cb174800
Optimize regexp matching for look-around and atomic groups (#7931) 2023-10-30 13:10:42 +09:00
TSUYUSATO Kitsune ac730d3e75
Delay start of the match cache optimization (#7738) 2023-05-04 13:15:51 +09:00
TSUYUSATO Kitsune a1c2c274ee
Refactor `Regexp#match` cache implementation (#7724)
* Refactor Regexp#match cache implementation

Improved variable and function names
Fixed [Bug #19537] (Maybe fixed in https://github.com/ruby/ruby/pull/7694)

* Add a comment of the glossary for "match cache"

* Skip to reset match cache when no cache point on null check
2023-04-19 13:08:28 +09:00
Nobuyoshi Nakada fac814c2dc
Fix `PLATFORM_GET_INC`
On platforms where unaligned word access is not allowed, and if
`sizeof(val)` and `sizeof(type)` differ:

- `sizeof(val)` > `sizeof(type)`: `val` will be garbage.
- `sizeof(val)` < `sizeof(type)`: memory outside `val` will be clobbered.
2023-04-16 17:45:27 +09:00
TSUYUSATO Kitsune 36ff0521c1 Use long instead of int 2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune a1c1fc558a Revert "Refactor field names"
This reverts commit 1e6673d6bbd2adbf555d82c7c0906ceb148ed6ee.
2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune 22294731a8 Refactor field names 2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune f25bb291b4 Support OP_REPEAT and OP_REPEAT_INC 2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune cbabba9c82 Add index to the latest NULL_CHECK_STACK for fast matching 2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune f07dea16e3 Keep cache optimization info to MatchArg for global matching 2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune 881bf9a0b8 Implement cache optimization for regexp matching 2022-11-09 23:21:26 +09:00
Sergey Fedorov 567725ed30
Fix and improve coroutines for Darwin (macOS) ppc/ppc64. (#5975) 2022-10-19 23:49:45 +13:00
Yusuke Endoh ffc3b37f96 re.c: Add Regexp.timeout= and Regexp.timeout
[Feature #17837]
2022-03-30 16:50:46 +09:00
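A minimal usage sketch of the accessors this commit adds, assuming a Ruby build that includes the feature. `with_regexp_timeout` is a hypothetical helper written for this example, not part of Ruby; it falls back to running the block unchanged where `Regexp.timeout=` is unavailable.

```ruby
# Hypothetical helper: run a block with a temporary process-wide
# regexp timeout, then restore the previous setting.
def with_regexp_timeout(seconds)
  # Regexp.timeout= only exists on Rubies that include this feature.
  return yield unless Regexp.respond_to?(:timeout=)

  previous = Regexp.timeout
  Regexp.timeout = seconds
  begin
    yield
  ensure
    Regexp.timeout = previous # restore the earlier global setting
  end
end

matched = with_regexp_timeout(1.0) { "foobar".match?(/\Afoo/) }
puts matched
```

Because the setting is global, restoring it in `ensure` keeps the change from affecting unrelated code. A match that exceeds the limit raises `Regexp::TimeoutError`, which callers can rescue.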
Yusuke Endoh 9112cf4ae7 regint.h: Reduce the frequency of rb_thread_check_ints
edc8576a65 checks interrupt at every
backtrack, which brought significant overhead.

This change makes the check only once every 128 backtracks.
2022-03-24 09:47:22 +09:00
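The throttling idea in this commit can be sketched as a counter that runs an expensive check only on every 128th call. The interval comes from the commit message; `ThrottledCheck`, `CHECK_INTERVAL`, and `tick` are hypothetical names for this illustration and are not the macros used in regint.h.

```ruby
# Interval taken from the commit message; the constant name is hypothetical.
CHECK_INTERVAL = 128

class ThrottledCheck
  attr_reader :checks

  def initialize(interval = CHECK_INTERVAL, &check)
    @interval = interval
    @check = check
    @counter = 0
    @checks = 0
  end

  # Call once per backtrack; the expensive check runs only when the
  # counter reaches the interval, after which the counter resets.
  def tick
    @counter += 1
    return if @counter < @interval

    @counter = 0
    @checks += 1
    @check&.call
  end
end

throttle = ThrottledCheck.new { } # an empty block stands in for the interrupt check
1000.times { throttle.tick }
puts throttle.checks # 7, since 1000 / 128 rounds down to 7
```

The trade-off is latency: an interrupt may be noticed up to 127 backtracks later than with a per-backtrack check, in exchange for much less overhead per backtrack.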
Nobuyoshi Nakada efa0c31ce5
Add printf-style format attribute to oniguruma functions
Also make the format string compatible with literal strings which
are const arrays of "plain" chars.
2021-09-27 19:02:45 +09:00
AGSaidi 511b55bcef
Enable arm64 optimizations that exist for power/x86 (#3393)
* Enable unaligned accesses on arm64

64-bit Arm platforms support unaligned accesses.

Running the string benchmarks, this change improves performance
by an average of 1.04x (min 0.96x, max 1.21x, median 1.01x).

* arm64 enable gc optimizations

Similar to x86 and powerpc optimizations.

|       |compare-ruby|built-ruby|
|:------|-----------:|---------:|
|hash1  |       0.225|     0.237|
|       |           -|     1.05x|
|hash2  |       0.110|     0.110|
|       |       1.00x|         -|

* vm_exec.c: improve performance for arm64

|                               |compare-ruby|built-ruby|
|:------------------------------|-----------:|---------:|
|vm_array                       |     26.501M|   27.959M|
|                               |           -|     1.06x|
|vm_attr_ivar                   |     21.606M|   31.429M|
|                               |           -|     1.45x|
|vm_attr_ivar_set               |     21.178M|   26.113M|
|                               |           -|     1.23x|
|vm_backtrace                   |       6.621|     6.668|
|                               |           -|     1.01x|
|vm_bigarray                    |     26.205M|   29.958M|
|                               |           -|     1.14x|
|vm_bighash                     |    504.155k|  479.306k|
|                               |       1.05x|         -|
|vm_block                       |     16.692M|   21.315M|
|                               |           -|     1.28x|
|block_handler_type_iseq        |       5.083|     7.004|
|                               |           -|     1.38x|
2020-08-14 02:15:54 +09:00
naruse 6b1c6e0e55 Merge Onigmo 6.1.1
* Support absent operator https://github.com/k-takata/Onigmo/issues/82
* https://github.com/k-takata/Onigmo/blob/Onigmo-6.1.1/HISTORY

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57603 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-11 15:08:33 +00:00
nobu c56182a0b0 regint.h: version for secure functions
* regint.h (xvsnprintf): secure version functions are not
  supported on old VC.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57175 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-24 13:22:53 +00:00
naruse 2873edeafb Merge Onigmo 6.0.0
* https://github.com/k-takata/Onigmo/blob/Onigmo-6.0.0/HISTORY
* fix for ruby 2.4: https://github.com/k-takata/Onigmo/pull/78
* suppress warning: https://github.com/k-takata/Onigmo/pull/79
* include/ruby/oniguruma.h: include onigmo.h.
* template/encdb.h.tmpl: ignore duplicated definition of EUC-CN in
  enc/euc_kr.c. It is defined in enc/gb2312.c with CRuby macro.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57045 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-10 17:47:04 +00:00
naruse b888ee25ff revert UNALIGNED_WORD_ACCESS for GCC6
Released GCC 6.0 fixed the issue.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69291
[ruby-core:72211] [Bug #11831] [Bug #11979]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54855 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-04-30 15:39:03 +00:00
nobu 04c55c0783 disable unaligned word access
* regint.h: disable unaligned word access with gcc 6 or later.
  [Bug #11831]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53545 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-15 13:12:17 +00:00
ngoto 5df321fb1d * regint.h (PLATFORM_UNALIGNED_WORD_ACCESS): The value of
UNALIGNED_WORD_ACCESS should be used to determine whether
  unaligned word access is allowed or not. After this commit,
  ./configure CPPFLAGS="-DUNALIGNED_WORD_ACCESS=0" disables
  unaligned word access even on platforms that support the feature.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53543 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-15 11:25:29 +00:00
nobu b3377eaa13 Revert r52995
revert slow atomic operations.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52997 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-09 07:30:44 +00:00
nobu 0599d7de58 use atomic operations
* regcomp.c (onig_chain_link_add): use atomic operation instead of
  mutex.
* regint.h (ONIG_STATE_{INC,DEC}_THREAD): ditto.
* regparse.c (PopFreeNode, node_recycle): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52995 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-09 06:48:15 +00:00
naruse 9ed1d63f41 * regcomp.c, regenc.c, regexec.c, regint.h, enc/unicode.c:
Merge Onigmo 58fa099ed1a34367de67fb3d06dd48d076839692
  + https://github.com/k-takata/Onigmo/pull/52

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52756 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-11-26 08:31:27 +00:00
akr 1efb3c31b7 * Avoid undefined behaviors found by gcc -fsanitize=undefined.
gcc (Debian 4.9.1-16) 4.9.1

* include/ruby/ruby.h (INT2FIX): Avoid undefined behavior.

* node.h (nd_set_line): Ditto.

* pack.c (encodes): Ditto.
  (pack_unpack): Ditto.

* regint.h (BIT_STATUS_AT): Ditto.
  (BS_BIT): Ditto.

* time.c (time_mdump): Ditto.
  (time_mload): Ditto.

* vm_core.h (VM_FRAME_MAGIC_MASK): Ditto.

* vm_trace.c (recalc_add_ruby_vm_event_flags): Ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@47996 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-10-17 08:50:01 +00:00
naruse d2a5354255 * reg*.c: Merge Onigmo 5.15.0 38a870960aa7370051a3544
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@47598 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-09-15 16:18:41 +00:00
nobu 63360be4d2 UNALIGNED_WORD_ACCESS on ppc64
* include/ruby/defines.h, siphash.c, st.c (UNALIGNED_WORD_ACCESS):
  add PowerPC64 too, which is capable of accessing unaligned words.
  patched by Gustavo Frederico Temple Pedrosa in [ruby-core:63937].
  [Feature #10081]
* regint.h (PLATFORM_UNALIGNED_WORD_ACCESS): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-07-23 14:50:12 +00:00
naruse 64c81e40d4 * regcomp.c: Merge Onigmo 5.14.1 25a8a69fc05ae3b56a09.
this includes Support for Unicode 7.0 [Bug #9092].

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46831 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-07-16 03:27:25 +00:00
naruse 536a3274c9 * Merge Onigmo 5.13.4 f22cf2e566712cace60d17f84d63119d7c5764ee.
[bug] fix problem with optimization of \z (Issue #16) [Bug #8210]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@40276 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-04-13 11:30:35 +00:00
naruse 29f347af11 * regint.h: fix typo: _M_AMD86 -> _M_AMD64.
* siphash.c: ditto.

* st.c: ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@40220 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-04-10 21:16:44 +00:00
nobu bff997e02b defines.h: RUBY_SYMBOL_EXPORT_{BEGIN,END}
* include/ruby/defines.h (RUBY_SYMBOL_EXPORT_{BEGIN,END}): visibility
  control macros.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@40122 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-04-05 10:29:38 +00:00
naruse 78dbaa1648 * Merge Onigmo 0fe387da2fee089254f6b04990541c731a26757f
v5.13.3 [Bug#7972] [Bug#7974]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@39547 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-03-01 16:36:37 +00:00
nobu d28c18966d * regint.h (BITS_IN_ROOM, BS_ROOM, BS_BIT): suppress warnings.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@35107 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-03-22 09:37:44 +00:00
naruse 0424e152c6 * Merge Onigmo-5.13.1. [ruby-dev:45057] [Feature #5820]
https://github.com/k-takata/Onigmo
  cp reg{comp,enc,error,exec,parse,syntax}.c reg{enc,int,parse}.h
  cp oniguruma.h
  cp tool/enc-unicode.rb
  cp -r enc/

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34663 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-02-17 07:42:23 +00:00
nobu 84dcc38273 * regcomp.c (onig_region_memsize): implemented for memsize_of().
* ext/objspace/objspace.c (memsize_of): use it.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34050 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-12-15 04:15:54 +00:00
nobu e8ab94ede0 * regint.h (PLATFORM_UNALIGNED_WORD_ACCESS): Power PC does not
allow unaligned word access.
* st.c (UNALIGNED_WORD_ACCESS): x86_64 allows unaligned word
  access as well as i386.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@32544 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-07-14 16:51:29 +00:00
naruse 48de1e292a * addr2line.c: suppressed shorten-64-to-32 warnings.
* regcomp.c: ditto.
* regexec.c: ditto.
* regint.h: ditto.
* regparse.c: ditto.
* regparse.h: ditto.
* time.c: ditto.
* variable.c: ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@30740 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-01-31 03:44:57 +00:00
naruse 8e2a1cba36 * regint.h (OnigOpInfoType): constify name.
* regcomp.c (op2name): constify return value.

* regcomp.c (onig_print_compiled_byte_code): use PRIuPTR and
  uintptr_t to clean warnings.

* regcomp.c (print_indent_tree): use PRIxPTR and intptr_t.

* regexec.c (match_at): use PRIdPTR and intptr_t.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29811 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-11-17 06:21:11 +00:00
naruse b3545895d1 * regint.h (OnigStackIndex): the type should be intptr_t.
Original Oniguruma assumes the size of long and that of void *
  are equal, but it's not true on LLP64 platform: mswin64.
  originally patched by shintaro kuwamoto [ruby-dev:42133]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29102 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-08-26 01:50:07 +00:00
nobu a979eb4548 * regcomp.c (onig_memsize): constified.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@28980 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-08-14 06:17:38 +00:00
nobu 0bd71ff354 * configure.in (XCFLAGS): use -fvisibility=hidden if possible.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@28709 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-07-21 21:38:25 +00:00
matz db37773e13 * include/ruby/oniguruma.h: updated to follow Oniguruma 5.9.2.
* re.c (make_regexp): use onig_new() instead of onig_alloc_init().

* re.c (rb_reg_to_s): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26791 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-01 21:54:59 +00:00
naruse de341005c8 * regcomp.c (print_enc_string): follow enclen's change.
* regcomp.c (onig_print_compiled_byte_code): ditto.

* regcomp.c (onig_print_compiled_byte_code): change prototype.

* regint.c (onig_print_compiled_byte_code): comment out.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26138 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-12-21 01:11:15 +00:00
nobu 7fd7e4a98b * regint.h: commit miss.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25039 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-22 12:16:19 +00:00
naruse 6ab36c6e19 *regparse.c (CC_DUP_WARN): use rb_compile_warn if ScanEnv has source
information. [ruby-dev:39105]

*re.c (rb_reg_compile): add sourcefile and sourceline to the arguments.

*re.c (make_regexp): ditto.

*re.c (rb_reg_initialize): ditto.

*re.c (rb_reg_initialize_str): ditto.

*re.c (rb_reg_compile): ditto.

*regcomp.c (onig_compile): ditto.

*regint.h (onig_compile): ditto.

*re.c (reg_compile_gen): follow above.

*re.c (rb_reg_to_s): ditto.

*re.c (make_regexp): ditto.

*re.c (rb_reg_initialize): ditto.

*re.c (rb_reg_initialize_str): ditto.

*re.c (rb_reg_new_str): ditto.

*re.c (rb_enc_reg_new): ditto.

*re.c (rb_reg_initialize_m): ditto.

*re.c (rb_reg_init_copy): ditto.

*regcomp.c (onig_new): ditto.

*regcomp.c (onig_compile): set sourcefile and sourceline to scan_env.

*regparse.h (ScanEnv): add sourcefile and sourceline.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24716 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-30 08:00:31 +00:00
nobu 23a32d6444 * include/ruby/oniguruma.h, include/ruby/re.h, re.c, regcomp.c,
regenc.c, regerror.c, regexec.c, regint.h, regparse.c: use long.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23907 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-06-30 02:08:54 +00:00
ko1 8f1e7a37e5 * thread.c (rb_thread_check_ints): added. please note that
this function may cause ruby's thread switching.
* include/ruby/intern.h: ditto.
* regint.h: use rb_thread_check_ints() instead of
  RUBY_CHECK_INTS() directly.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18572 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-13 08:21:24 +00:00