Previously, because opt_aref and opt_aset don't push a frame, when they
would call rb_hash to determine the hash value of the key, the initial
level of recursion would incorrectly use the method id at the top of the
stack instead of "hash".
This commit replaces rb_exec_recursive_outer with
rb_exec_recursive_outer_mid, which takes an explicit method id, so that
we can make the hash calculation behave consistently.
rb_exec_recursive_outer was documented as being internal, so I believe
this should be okay to change.
[Feature #18683]
This allows parsers and similar libraries to create Hashes of
a certain capacity in advance. It's useful when the key and values
are streamed, hence `bulk_insert()` can't be used.
Method references is not only able to be marked up as code, also
reflects `--show-hash` option.
The bug that prevented the old rdoc from correctly parsing these
methods was fixed last month.
I used this regex:
([A-Za-z]+)\.html#(?:class|module)-[A-Za-z]+-label-([A-Za-z0-9\-\+]+)
And performed a global find & replace for this:
rdoc-ref:$1@$2
Before this change the write barrier was executed before the key and
value were actually reachable via the Hash. This could cause
inconsistencies in object coloration which would lead to accidental
collection of dup'd keys.
Example:
1. Object O is grey, Object P is white.
2. Write barrier fires O -> P
3. Write barrier does nothing
4. Malloc happens, which starts GC
5. GC colors O black
6. P is written in to O (now we have O -> P reference)
7. P is now accidentally treated as garbage
We found that we need to make Ruby objects while locking the environ
to ENV operation atomically, so we decided to use `RB_VM_LOCK_ENTER()`
instead of `env_lock`.
From the documentation of rb_obj_hash:
> Certain core classes such as Integer use built-in hash calculations and
> do not call the #hash method when used as a hash key.
So if you override, say, Integer#hash it won't be used from rb_hash_aref
and similar. This avoids method lookups in many common cases.
This commit uses the same optimization in rb_hash, a method used
internally and in the C API to get the hash value of an object. Usually
this is used to build the hash of an object based on its elements.
Previously it would always do a method lookup for 'hash'.
This is primarily intended to speed up hashing of Arrays and Hashes,
which call rb_hash for each element.
compare-ruby: ruby 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x86_64-linux]
built-ruby: ruby 3.1.0dev (2021-09-29T02:13:24Z fast_hash d670bf88b2) [x86_64-linux]
# Iteration per second (i/s)
| |compare-ruby|built-ruby|
|:----------------|-----------:|---------:|
|hash_aref_array | 1.008| 1.769|
| | -| 1.76x|
* As the "doc/" prefix is specified by the `--page-dir` option,
remove from the rdoc references.
* Refer to the original .rdoc instead of the converted .html.
Instead of looking for Object::ENV (which can be overwritten),
directly look for the envtbl variable. As that is static in hash.c,
and the lookup code is in process.c, add a couple non-static
functions that will return envtbl (or envtbl#to_hash).
Fixes [Bug #18164]
We don't need to increment/decrement iteration level for frozen Hash
because frozen Hash can't be modified. We can assume that nobody
changes the target Hash while calling #each family.
How to reproduce:
a = {}
100.times do |i|
a[i] = true
end
Ractor.make_shareable(a)
4.times.collect do
Ractor.new(a) do |b|
100.times do
b.each_value do
end
end
end
end.each(&:take)
Example output:
internal:ractor>:267: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.
#<Thread:0x00007fcfb087bb30 run> terminated with exception (report_on_exception is true):
#<Thread:0x00007fcfb087b8d8 run> terminated with exception (report_on_exception is true):
#<Thread:0x00007fcfb088d678 run> terminated with exception (report_on_exception is true):
#<Thread:0x00007fcfb087bd88 run> terminated with exception (report_on_exception is true):
/tmp/h.rb:10:in `each_value'/tmp/h.rb:10:in `each_value': : /tmp/h.rb:10:in `each_value'no implicit conversion from nil to integer/tmp/h.rb:10:in `each_value'no implicit conversion from nil to integer (: : (TypeErrorTypeError)no implicit conversion from nil to integer)no implicit conversion from nil to integer (
(TypeErrorTypeError from /tmp/h.rb:10:in `block (3 levels) in <main>'
from /tmp/h.rb:10:in `block (3 levels) in <main>'
)) from /tmp/h.rb:9:in `times'
from /tmp/h.rb:9:in `times'
from /tmp/h.rb:9:in `block (2 levels) in <main>'
from /tmp/h.rb:10:in `block (3 levels) in <main>'
from /tmp/h.rb:10:in `block (3 levels) in <main>'
from /tmp/h.rb:9:in `block (2 levels) in <main>'
from /tmp/h.rb:9:in `times'
from /tmp/h.rb:9:in `times'
from /tmp/h.rb:9:in `block (2 levels) in <main>'
from /tmp/h.rb:9:in `block (2 levels) in <main>'
<internal:ractor>:694:in `take': thrown by remote Ractor. (Ractor::RemoteError)
from /tmp/h.rb:14:in `each'
from /tmp/h.rb:14:in `<main>'
/tmp/h.rb:10:in `each_value': no implicit conversion from nil to integer (TypeError)
from /tmp/h.rb:10:in `block (3 levels) in <main>'
from /tmp/h.rb:9:in `times'
from /tmp/h.rb:9:in `block (2 levels) in <main>'
This makes the compare_by_identity setting always copied
for the following methods:
* except
* merge
* reject
* select
* slice
* transform_values
Some of these methods did not copy the setting, or only
copied the setting if the receiver was not empty.
Fixes [Bug #17757]
Co-authored-by: Kenichi Kamiya <kachick1@gmail.com>