Ref: https://github.com/ruby/ruby/pull/10872
These should be the last internal uses of the old `Data` API
inside Ruby itself. Some use remain in a couple default gems.
In cases where `rb_ary_sort_bang` is called with a block and
tmp is an embedded array, we need to account for the block
potentially impacting the capacity of ary.
ex:
```
var_0 = (1..70).to_a
var_0.sort! do |var_0_block_129, var_1_block_129|
var_0.pop
var_1_block_129 <=> var_0_block_129
end.shift(3)
```
The above example can put the array into a corrupted state
resulting in a heap buffer overflow and possible segfault:
```
ERROR: AddressSanitizer: heap-buffer-overflow on address [...]
WRITE of size 560 at 0x60b0000034f0 thread T0 [...]
```
This commit adds a conditional to determine when the capacity
of ary has been modified by the provided block. If this is
the case, ensure that the capacity of ary is adjusted to
handle at minimum the len of tmp.
I discovered the problem in `compile.c` from a failing
TestIseqLoad#test_stressful_roundtrip test with ASAN enabled. The other
two changes in array.c and string.c I found by auditing similar usages
of DATA_PTR in the codebase.
[Bug #20402]
assert does not print the bug report, only the file and line number of
the assertion that failed. RUBY_ASSERT prints the full bug report, which
makes it much easier to debug.
ary could change embeddedness due to compaction, so we should only get
the pointer after allocations.
The included test was crashing with:
TestArray#test_slice_gc_compact_stress
ruby/lib/pp.rb:192: [BUG] Segmentation fault at 0x0000000000000038
We don't need to remove the write-barrier protected status after
splicing an array. We can simply add it to the rememberset for marking
during the next GC.
The benchmark illustrates the performance impact on minor GC:
```
require "benchmark"
arys = 1_000_000.times.map do
ary = Array.new(50)
ary.insert(1, 3)
ary
end
4.times { GC.start }
puts(Benchmark.measure do
1000.times do
GC.start(full_mark: false)
end
end)
```
This branch:
```
1.309910 0.004342 1.314252 ( 1.314580)
```
Master branch:
```
54.376091 0.219037 54.595128 ( 54.742996)
```
This commit introduces a new instruction `opt_newarray_send` which is
used when there is an array literal followed by either the `hash`,
`min`, or `max` method.
```
[a, b, c].hash
```
Will emit an `opt_newarray_send` instruction. This instruction falls
back to a method call if the "interested" method has been monkey
patched.
Here are some examples of the instructions generated:
```
$ ./miniruby --dump=insns -e '[@a, @b].max'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
0000 getinstancevariable :@a, <is:0> ( 1)[Li]
0003 getinstancevariable :@b, <is:1>
0006 opt_newarray_send 2, :max
0009 leave
$ ./miniruby --dump=insns -e '[@a, @b].min'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
0000 getinstancevariable :@a, <is:0> ( 1)[Li]
0003 getinstancevariable :@b, <is:1>
0006 opt_newarray_send 2, :min
0009 leave
$ ./miniruby --dump=insns -e '[@a, @b].hash'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
0000 getinstancevariable :@a, <is:0> ( 1)[Li]
0003 getinstancevariable :@b, <is:1>
0006 opt_newarray_send 2, :hash
0009 leave
```
[Feature #18897] [ruby-core:109147]
Co-authored-by: John Hawthorn <jhawthorn@github.com>
Remove !USE_RVARGC code
[Feature #19579]
The Variable Width Allocation feature was turned on by default in Ruby
3.2. Since then, we haven't received bug reports or backports to the
non-Variable Width Allocation code paths, so we assume that nobody is
using it. We also don't plan on maintaining the non-Variable Width
Allocation code, so we are going to remove it.
Sharing an array will cause it to be WB unprotected due to the use
of `RARRAY_PTR`. We don't need to WB unprotect the array because we're
not writing to the buffer of the array.
The following script demonstrates this issue:
```
ary = [1] * 1000
shared = ary[10..20]
puts ObjectSpace.dump(ary)
```
Freeing the memory of a Hash should be done by the garbage collector
and not by array functions. This could potentially leak memory if
ary_recycle_hash was not implemented properly.