A code pattern `p + enclen(enc, p, pend)` may lead to a buffer overrun
if an incomplete UTF-8 byte sequence is placed at the end of a
string. Because this pattern is used in several places in Onigmo,
this change fixes the issue on the `enclen` side: the function should
never return a number larger than `pend - p`.
Co-Authored-By: Nobuyoshi Nakada <nobu@ruby-lang.org>
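The clamping idea above can be modelled in a few lines of Ruby. This is only a sketch, not the C code in Onigmo: `expected_len` and `clamped_enclen` are hypothetical names, and the lead-byte table is a simplified assumption.

```ruby
# Simplified model of a UTF-8 length lookup keyed on the lead byte.
# Assumption for this sketch: invalid lead bytes count as length 1.
def expected_len(lead_byte)
  case lead_byte
  when 0xC2..0xDF then 2
  when 0xE0..0xEF then 3
  when 0xF0..0xF4 then 4
  else 1
  end
end

# Hypothetical clamped variant: never report more bytes than remain,
# which mirrors "do not return a number larger than pend - p".
def clamped_enclen(bytes, pos)
  remaining = bytes.size - pos
  [expected_len(bytes[pos]), remaining].min
end

truncated = "a\xE6".b.bytes        # 0xE6 starts a 3-byte sequence, only 1 byte is left
complete  = "\xE6\x97\xA5".b.bytes # a full 3-byte sequence

puts clamped_enclen(truncated, 1)
puts clamped_enclen(complete, 0)
```

With the clamp in place, `pos + clamped_enclen(bytes, pos)` can never move past the end of the buffer, so callers using the pattern need no changes.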
Use `Enumerable#find` to iterate over the candidates, not `Enumerable#each`.
(this makes the code more functional, and - IMO - slightly more idiomatic,
as it avoids setting the "global" (by which I mean: non-local) `tmp`
variable from inside the block)
https://github.com/ruby/tmpdir/commit/d1f20ad694
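A minimal sketch of the `each`-versus-`find` difference described above. The helper names and the candidate list are hypothetical and are not taken from the tmpdir source.

```ruby
# Hypothetical predicate: a candidate is usable if it names an existing directory.
def usable_dir?(dir)
  !dir.nil? && !dir.empty? && File.directory?(dir)
end

# `each` style: the block assigns to `tmp`, which lives outside the block.
def first_usable_with_each(candidates)
  tmp = nil
  candidates.each do |dir|
    next unless usable_dir?(dir)
    tmp = dir
    break
  end
  tmp
end

# `find` style: the first matching element is the return value,
# so no outer variable is needed.
def first_usable_with_find(candidates)
  candidates.find { |dir| usable_dir?(dir) }
end

candidates = [nil, "", File.join(Dir.pwd, "no-such-dir-for-this-example"), "."]
puts first_usable_with_each(candidates)
puts first_usable_with_find(candidates)
```

Both methods return the same element; the `find` version simply avoids the assignment from inside the block.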
A regexp that ends with an escape following an incomplete UTF-8 char
might cause a buffer overrun. Found by OSS-Fuzz.
```
$ valgrind ./miniruby -e 'Regexp.new("\\u2d73\\0\\0\\0\\0 \\\xE6".b)'
==296213== Memcheck, a memory error detector
==296213== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==296213== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==296213== Command: ./miniruby -e Regexp.new("\\\\u2d73\\\\0\\\\0\\\\0\\\\0\ \ \ \ \ \ \ \ \ \ \\\\\\xE6".b)
==296213==
==296213== Warning: client switching stacks? SP change: 0x1ffe8020e0 --> 0x1ffeffff10
==296213== to suppress, use: --max-stackframe=8379952 or greater
==296213== Invalid read of size 1
==296213== at 0x484EA10: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==296213== by 0x339568: memcpy (string_fortified.h:29)
==296213== by 0x339568: onig_strcpy (regparse.c:271)
==296213== by 0x339568: onig_node_str_cat (regparse.c:1413)
==296213== by 0x33CBA0: parse_exp (regparse.c:6198)
==296213== by 0x33EDE4: parse_branch (regparse.c:6511)
==296213== by 0x33EEA2: parse_subexp (regparse.c:6544)
==296213== by 0x34019C: parse_regexp (regparse.c:6593)
==296213== by 0x34019C: onig_parse_make_tree (regparse.c:6638)
==296213== by 0x32782D: onig_compile_ruby (regcomp.c:5779)
==296213== by 0x313EFA: onig_new_with_source (re.c:876)
==296213== by 0x313EFA: make_regexp (re.c:900)
==296213== by 0x313EFA: rb_reg_initialize (re.c:3136)
==296213== by 0x318555: rb_reg_initialize_str (re.c:3170)
==296213== by 0x318555: rb_reg_init_str (re.c:3205)
==296213== by 0x31A669: rb_reg_initialize_m (re.c:3856)
==296213== by 0x3E5165: vm_call0_cfunc_with_frame (vm_eval.c:150)
==296213== by 0x3E5165: vm_call0_cfunc (vm_eval.c:164)
==296213== by 0x3E5165: vm_call0_body (vm_eval.c:210)
==296213== by 0x3E89BD: vm_call0_cc (vm_eval.c:87)
==296213== by 0x3E89BD: rb_call0 (vm_eval.c:551)
==296213== Address 0x9d45b10 is 0 bytes after a block of size 32 alloc'd
==296213== at 0x4844899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==296213== by 0x20FA7B: objspace_xmalloc0 (gc.c:12146)
==296213== by 0x35F8C9: str_buf_cat4.part.0 (string.c:3132)
==296213== by 0x31359D: unescape_escaped_nonascii (re.c:2690)
==296213== by 0x313A9D: unescape_nonascii (re.c:2869)
==296213== by 0x313A9D: rb_reg_preprocess (re.c:2992)
==296213== by 0x313DFC: rb_reg_initialize (re.c:3109)
==296213== by 0x318555: rb_reg_initialize_str (re.c:3170)
==296213== by 0x318555: rb_reg_init_str (re.c:3205)
==296213== by 0x31A669: rb_reg_initialize_m (re.c:3856)
==296213== by 0x3E5165: vm_call0_cfunc_with_frame (vm_eval.c:150)
==296213== by 0x3E5165: vm_call0_cfunc (vm_eval.c:164)
==296213== by 0x3E5165: vm_call0_body (vm_eval.c:210)
==296213== by 0x3E89BD: vm_call0_cc (vm_eval.c:87)
==296213== by 0x3E89BD: rb_call0 (vm_eval.c:551)
==296213== by 0x3E957B: rb_call (vm_eval.c:877)
==296213== by 0x3E957B: rb_funcallv_kw (vm_eval.c:1074)
==296213== by 0x2A4123: rb_class_new_instance_pass_kw (object.c:1991)
==296213==
==296213==
==296213== HEAP SUMMARY:
==296213== in use at exit: 35,476,538 bytes in 9,489 blocks
==296213== total heap usage: 14,893 allocs, 5,404 frees, 37,517,821 bytes allocated
==296213==
==296213== LEAK SUMMARY:
==296213== definitely lost: 316,081 bytes in 2,989 blocks
==296213== indirectly lost: 136,808 bytes in 2,361 blocks
==296213== possibly lost: 1,048,624 bytes in 3 blocks
==296213== still reachable: 33,975,025 bytes in 4,136 blocks
==296213== suppressed: 0 bytes in 0 blocks
==296213== Rerun with --leak-check=full to see details of leaked memory
==296213==
==296213== For lists of detected and suppressed errors, rerun with: -s
==296213== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
```
SHOW_DOC_DIALOG is called every time the corresponding key is pressed,
but rdoc only needs to be required once. So ideally the require is put
outside of the proc.
And because the entire proc is nonfunctional when rdoc is not available,
we can skip registering SHOW_DOC_DIALOG if requiring rdoc fails.
https://github.com/ruby/irb/commit/b1278b7320
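The "require once, register only on success" pattern above could look roughly like this. It is a sketch under assumptions: `register_doc_dialog`, the `dialogs` hash, and the `:show_doc` key are hypothetical stand-ins, not IRB's actual API.

```ruby
# Hypothetical registration helper. The require happens once, up front,
# instead of on every invocation of the proc.
def register_doc_dialog(dialogs, feature)
  begin
    require feature
  rescue LoadError
    # The proc would be nonfunctional without the library, so skip registration.
    return false
  end

  # The proc itself no longer needs to require anything.
  dialogs[:show_doc] = ->(name) { "documentation for #{name}" }
  true
end

dialogs = {}
puts register_doc_dialog(dialogs, "rbconfig")                        # library is available
puts register_doc_dialog({}, "no_such_feature_for_this_example")     # library is missing
puts dialogs[:show_doc].call("String#upcase")
```

The design choice is that a missing dependency is detected at registration time, so the key binding is simply absent rather than bound to a proc that cannot work.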
Fix per-instance Regexp timeout
This makes it follow what was decided in [Bug #19055]:
* `Regexp.new(str, timeout: nil)` should respect the global timeout
* `Regexp.new(str, timeout: huge_val)` should use the maximum value that
can be represented in the internal representation
* `Regexp.new(str, timeout: 0 or negative value)` should raise an error
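The three cases listed above can be probed with a small helper. This sketch assumes a Ruby version that supports the `timeout:` keyword (3.2 or later); `timeout_outcome` is a hypothetical name introduced only for illustration.

```ruby
# Returns the per-instance timeout, or :argument_error if construction is rejected.
# Assumes Regexp.new accepts the `timeout:` keyword (Ruby 3.2+).
def timeout_outcome(value)
  Regexp.new("a", timeout: value).timeout
rescue ArgumentError
  :argument_error
end

# Example calls (left as comments so the block loads on older Rubies too):
#   timeout_outcome(nil)              # no per-instance value; the global timeout applies
#   timeout_outcome(2.0)              # per-instance timeout in seconds
#   timeout_outcome(Float::INFINITY)  # clamped to the largest representable value
#   timeout_outcome(0)                # rejected
#   timeout_outcome(-1)               # rejected
```

A `nil` result from `Regexp#timeout` means the instance carries no timeout of its own, which is how the first bullet ends up respecting `Regexp.timeout`.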
So different timestamps for different paths will be used. Extension
paths in bundled gems contain `ruby_version`, which includes the ABI
version, and the same timestamp file for different paths resulted in
build failures when it changed.
Not only powerpc64le but also s390x and arm32 seem to be failing. These
failures are probably caused by filesystem settings on Travis and are
unrelated to the CPUs.
`Complex.polar` accepts Complex values as arguments for the polar form as long
as the value of the complex has no imaginary part (i.e. it is 'real'). In
`f_complex_polar` this is handled by extracting the real part of the arguments.
However, when `polar` is called with only a single argument, the absolute
value (abs), the Complex is created without applying a check on the type
of abs, meaning it is possible to create a Complex where the real part is itself
an instance of a Complex. This change removes the short circuit for the
single-argument case, so the real part extraction is performed correctly
(by f_complex_polar).
Also adds an example to `spec/ruby/core/complex/polar_spec.rb` to check that
the real part of a complex argument is correctly extracted and used in the
resulting Complex real and imaginary parts.
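The intended behaviour can be modelled in plain Ruby. This is a sketch of the extraction step only, not the C implementation: `real_part_of` and `polar_model` are hypothetical names, and the "real" check is simplified to "imaginary part is zero".

```ruby
# Extract the real part of a Complex that has no imaginary component.
# Simplified assumption: any zero imaginary part counts as 'real'.
def real_part_of(value)
  return value unless value.is_a?(Complex)
  raise TypeError, "not a real" unless value.imaginary.zero?

  value.real
end

# Model of the fixed behaviour: the extraction is applied in the
# single-argument case as well, so the result never nests a Complex.
def polar_model(abs, arg = 0)
  Complex.polar(real_part_of(abs), real_part_of(arg))
end

result = polar_model(Complex(2, 0))
puts result.real.is_a?(Complex)
puts result == Complex(2, 0)
```

Because both arguments are plain reals by the time `Complex.polar` is called, the single-argument and two-argument paths behave the same way in this model.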
jruby-head (which will be JRuby 9.4.0.0) can now properly process
the keywords to Kernel#warn. I cannot think of any capability-based
test for this, so I constrained it using a version guard. Only JRuby
will ever hit the version guard.
https://github.com/rubygems/rubygems/commit/cd468c7e0f
`iv_count` is a misleading name because when IVs are unset, the new
shape doesn't decrement this value. `next_iv_count` is a more accurate
and more descriptive name.
Before object shapes, we were using class serial to invalidate
inline caches. Now that we use shape_id for inline cache keys,
the class serial is unnecessary.
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Previously, we found the current page by rounding the current pointer down to
the nearest multiple of the page size. This is incorrect because pages are
relative to the start of the address we reserve. For example, if the
starting address is 12KiB modulo the 16KiB page size, once we have more
than 4KiB of code, calculating with the address would incorrectly give
us page 1 when we're actually still on page 0.
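The arithmetic in this example can be checked with a short Ruby sketch. The method names are hypothetical, and the concrete addresses are chosen only to match the 12KiB / 16KiB case described above.

```ruby
KIB = 1024

# Old approach: derive the page from the absolute address, then express it
# relative to the page that the reserved region starts in.
def page_by_address(addr, base, page_size)
  (addr / page_size) - (base / page_size)
end

# Fixed approach: pages are counted from the start of the reserved region.
def page_by_offset(addr, base, page_size)
  (addr - base) / page_size
end

page_size = 16 * KIB
base      = 12 * KIB          # region starts 12KiB into a 16KiB-aligned block
addr      = base + 5 * KIB    # a little more than 4KiB of code has been written

puts page_by_address(addr, base, page_size)
puts page_by_offset(addr, base, page_size)
```

Only 5KiB of a 16KiB page has been used, so the offset-based result is the correct one; the address-based calculation crosses an alignment boundary that has nothing to do with where the region begins.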
Previously, I could reproduce crashes with:
make btest RUN_OPTS=--yjit-code-page-size=32
on ARM64 macOS, where system page sizes are 16KiB.
This patch makes sure that we're not accidentally reading rb_num_t
instruction arguments as VALUE and accidentally baking them into
code and marking them. Some of these are simply moving the cast earlier,
but some of these avoid potential problems for flag and ID arguments.
Follow-up for 39f7eddec4.