This commit replaces the hand-written binary search with `bsearch`
from the C standard library.
Because `is_pointer_to_heap` will only return true if the pointer
being searched for is a valid slot starting address within the heap
page body, we've extracted the bsearch call site into a more general
function so we can use it elsewhere.
The new function `heap_page_for_ptr` returns the heap page for any heap
page pointer, regardless of whether that is at the start of a slot or
in the middle of one.
We then use this function as the basis of `is_pointer_to_heap`.
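The split between the two functions can be sketched as follows. This is a minimal illustration, not the code in gc.c: `struct page`, `page_for_ptr` and `is_slot_start` are hypothetical, simplified stand-ins for `struct heap_page`, `heap_page_for_ptr` and `is_pointer_to_heap`, and the page array is assumed to be sorted by start address with no overlaps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified page descriptor (not the real struct heap_page). */
struct page {
    uintptr_t start;    /* address of the first slot in the page body */
    size_t slot_size;   /* bytes per slot */
    size_t slot_count;  /* number of slots in the page body */
};

/* bsearch comparator: the key is an address, the element is a page.
 * Returns 0 when the address lies anywhere inside the page body. */
static int
cmp_ptr_to_page(const void *key, const void *elem)
{
    uintptr_t ptr = *(const uintptr_t *)key;
    const struct page *pg = elem;

    if (ptr < pg->start) return -1;
    if (ptr >= pg->start + pg->slot_size * pg->slot_count) return 1;
    return 0;
}

/* General lookup: the page containing ptr, or NULL if there is none.
 * Works for slot starts and for addresses in the middle of a slot. */
static const struct page *
page_for_ptr(const struct page *pages, size_t n, uintptr_t ptr)
{
    return bsearch(&ptr, pages, n, sizeof(*pages), cmp_ptr_to_page);
}

/* Stricter check built on the lookup: ptr must be a slot starting address. */
static bool
is_slot_start(const struct page *pages, size_t n, uintptr_t ptr)
{
    const struct page *pg = page_for_ptr(pages, n, ptr);
    if (pg == NULL) return false;
    assert(pg->slot_size > 0);
    return (ptr - pg->start) % pg->slot_size == 0;
}
```

With pages of 40-byte slots starting at `0x1000`, `page_for_ptr` finds the page for both `0x1000 + 40` and `0x1000 + 41`, while `is_slot_start` accepts only the former.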
Some callable method entries (cme) can be keys of `overloaded_cme_table`,
and those keys must be pinned because the table is a numtable (the VALUE
itself is the key).
Before this patch, the GC checked whether a cme was in `overloaded_cme_table`
by looking it up in the table, which requires the VM lock.
That works in normal GC marking, which is protected by the VM lock,
but not in `rb_objspace_reachable_objects_from`, which does not take
the VM lock.
Since the number of target cmes is small, this patch pins all
possible cmes instead of looking them up in the table.
`overloaded_cme_table` keeps cme -> monly_cme pairs to manage the
`monly_cme` corresponding to each `cme`. The lifetime of the `monly_cme`
should be longer than that of the `cme`, but the previous patch lost the
reference to the living `monly_cme`.
Now `overloaded_cme_table` values are always roots (keys are only weak
references), which means a `monly_cme` is not freed until the corresponding
`cme` is invalidated.
To make this easier to manage, move `overloaded_cme_table` to `rb_vm_t`.
`def` (`rb_method_definition_t`) is shared by multiple callable
method entries (cme, `rb_callable_method_entry_t`).
There are two issues:
* old -> young reference: `cme1->def->mandatory_only_cme = monly_cme`.
  If `cme1` is young and `monly_cme` is young, there is no problem.
  However, another old `cme2` can refer to the same `def`; in that case
  the old `cme2` points to the young `monly_cme`, which violates the
  generational GC assumption.
* A cme can have a different `defined_class`, but `monly_cme` has only
  one `defined_class`. That does not make sense; `monly_cme`
  should be created per cme (not per `def`).
To solve these issues, this patch allocates a `monly_cme` per `cme`.
A `cme` has no room to store a pointer to its `monly_cme`,
so this patch introduces `overloaded_cme_table`, a weak-key map
`[cme] -> [monly_cme]`.
`def::body::iseqptr::monly_cme` is deleted.
The first issue was reported by Alan Wu.
When using `rp(obj)` for debugging during development, it may be
useful to know that an object is soon to be swept. Add a new letter to
the object dump for whether the object is garbage. It's easy to forget
about lazy sweep.
Except on Windows and MinGW, we can only use compaction on systems that
use mmap (only systems that use mmap can use the read barrier that
compaction requires). We don't need to separately detect whether we can
support compaction or not.
* Lazily create singletons on instance_{exec,eval}
Previously when instance_exec or instance_eval was called on an object,
that object would be given a singleton class so that method
definitions inside the block would be added to the object rather than
its class.
This commit aims to improve performance by delaying the creation of the
singleton class unless/until one is needed for method definition. Most
of the time instance_eval is used without any method definition.
This was implemented by adding a flag to the cref indicating that it
represents a singleton of the object rather than a class itself. In this
case CREF_CLASS returns the object's existing class, except when
we are defining a method (either via definemethod or
VM_SPECIAL_OBJECT_CBASE, which is used for undef and alias), in which
case the singleton class is created on demand.
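The lazy creation can be sketched roughly as below. All types and function names here (`struct klass`, `struct object`, `struct cref`, `cref_class`, `cref_class_for_definition`) are hypothetical simplifications for illustration; the real VM works with `VALUE`, `rb_cref_t` and the `CREF_*` macros, and its details differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the VM's class and object representations. */
struct klass  { bool is_singleton; struct klass *super; };
struct object { struct klass *klass; };

struct cref {
    struct object *self;  /* receiver of instance_eval / instance_exec */
    bool singleton;       /* flag: this cref stands for self's singleton class */
};

/* Lookup path (e.g. constant access): never allocates, just returns
 * whatever class the object currently has. */
static struct klass *
cref_class(const struct cref *cref)
{
    return cref->self->klass;
}

/* Method-definition path: the singleton class is created only here,
 * the first time it is actually needed. */
static struct klass *
cref_class_for_definition(struct cref *cref)
{
    struct object *obj = cref->self;

    if (cref->singleton && !obj->klass->is_singleton) {
        struct klass *s = malloc(sizeof(*s));
        assert(s != NULL);
        s->is_singleton = true;
        s->super = obj->klass;
        obj->klass = s;
    }
    return obj->klass;
}
```

In this sketch an `instance_eval` block that only reads constants or calls methods goes through `cref_class` and never allocates; only a `def` inside the block reaches `cref_class_for_definition`.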
This also happens to fix what I believe is a bug. Previously
instance_eval behaved differently with regards to constant access for
true/false/nil than for all other objects. I don't think this was
intentional.
    String::Foo = "foo"
    "".instance_eval("Foo")   # => "foo"
    Integer::Foo = "foo"
    123.instance_eval("Foo")  # => "foo"
    TrueClass::Foo = "foo"
    true.instance_eval("Foo") # NameError: uninitialized constant Foo
This also slightly changes the error message when trying to define a method
through instance_eval on an object which can't have a singleton class.
Before:
    $ ruby -e '123.instance_eval { def foo; end }'
    -e:1:in `block in <main>': no class/module to add method (TypeError)
After:
    $ ./ruby -e '123.instance_eval { def foo; end }'
    -e:1:in `block in <main>': can't define singleton (TypeError)
IMO this error is a small improvement on the original and better matches
the (both old and new) message when defining a method using `def self.`:
    $ ruby -e '123.instance_eval{ def self.foo; end }'
    -e:1:in `block in <main>': can't define singleton (TypeError)
Co-authored-by: Matthew Draper <matthew@trebex.net>
* Remove "under" argument from yield_under
* Move CREF_SINGLETON_SET into vm_cref_new
* Simplify vm_get_const_base
* Fix leaf VM_SPECIAL_OBJECT_CONST_BASE
Co-authored-by: Matthew Draper <matthew@trebex.net>
suseconds_t, the type of tv_usec, may be defined as a wider type than
the type of tv_nsec (long), so the usec-to-nsec conversion needs
an explicit cast.
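A minimal sketch of such a conversion, assuming a POSIX system where `<sys/time.h>` provides `suseconds_t`; the helper name `usec_to_nsec` is hypothetical and the exact expression in the patch may differ:

```c
#include <assert.h>
#include <sys/time.h>

/* Convert the microsecond field of a struct timeval to nanoseconds.
 * suseconds_t may be wider than long, so the narrowing is spelled out
 * with an explicit cast instead of being left implicit. */
static long
usec_to_nsec(suseconds_t usec)
{
    /* tv_usec is expected to be in [0, 999999], so the result
     * (at most 999999000) fits even in a 32-bit long. */
    assert(usec >= 0 && usec < 1000000);
    return (long)usec * 1000;
}
```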
This commit adds a Ractor cache for every size pool. Previously, all VWA
allocated objects used the slowpath and locked the VM.
On a micro-benchmark that benchmarks String allocation:
VWA turned off:
29.196591 0.889709 30.086300 ( 9.434059)
VWA before this commit:
29.279486 41.477869 70.757355 ( 12.527379)
VWA after this commit:
16.782903 0.557117 17.340020 ( 4.255603)
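The idea behind the speedup can be sketched as a per-Ractor free list for each size pool, consulted before falling back to the locked slow path. Everything below is a hypothetical simplification: `SIZE_POOL_COUNT`, `struct slot`, `struct ractor_cache`, `alloc_slot` and `alloc_slowpath` are illustrative names, and the slow path is a stub that only counts calls.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE_POOL_COUNT 4  /* assumed number of size pools */

struct slot { struct slot *next; };

/* Hypothetical per-Ractor allocation cache: one free list per size pool. */
struct ractor_cache {
    struct slot *freelist[SIZE_POOL_COUNT];
};

/* Counts how often the slow path is taken (for illustration only). */
static int slowpath_calls;

/* Stub for the slow path. In a real GC this is where the VM lock would be
 * taken and the cache refilled from a heap page. */
static struct slot *
alloc_slowpath(struct ractor_cache *cache, size_t pool_id)
{
    (void)cache;
    (void)pool_id;
    slowpath_calls++;
    return NULL;
}

/* Fast path: pop from the Ractor-local free list of the matching size
 * pool without taking any lock; fall back only when the list is empty. */
static struct slot *
alloc_slot(struct ractor_cache *cache, size_t pool_id)
{
    assert(pool_id < SIZE_POOL_COUNT);

    struct slot *s = cache->freelist[pool_id];
    if (s != NULL) {
        cache->freelist[pool_id] = s->next;
        return s;
    }
    return alloc_slowpath(cache, pool_id);
}
```

If only one pool has a cache, every allocation from the other pools lands in `alloc_slowpath`, which is consistent with the large system time in the "before" measurement above.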
Updating RCLASS_PARENT_SUBCLASSES and RCLASS_MODULE_SUBCLASSES while
compacting can trigger the read barrier. This commit makes
RCLASS_SUBCLASSES a doubly linked list with a dedicated head object so
that we can add and remove entries from the list without having to touch
an object in the Ruby heap.
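A minimal sketch of a doubly linked list with a dedicated head node, assuming the nodes live in malloc'ed memory. The names (`struct subclass_entry`, `subclass_list_new`, `subclass_list_add`, `subclass_list_remove`) are hypothetical and do not mirror the real definitions:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node. Nodes are malloc'ed, so linking and unlinking
 * writes only to malloc memory, never to the class objects themselves. */
struct subclass_entry {
    void *klass;                  /* the subclass; NULL in the head node */
    struct subclass_entry *next;
    struct subclass_entry *prev;
};

/* Allocate the dedicated head, a sentinel that holds no class. */
static struct subclass_entry *
subclass_list_new(void)
{
    struct subclass_entry *head = calloc(1, sizeof(*head));
    assert(head != NULL);
    return head;
}

/* Insert a new entry directly after the head and return it. */
static struct subclass_entry *
subclass_list_add(struct subclass_entry *head, void *klass)
{
    struct subclass_entry *e = malloc(sizeof(*e));
    assert(e != NULL);
    e->klass = klass;
    e->prev = head;
    e->next = head->next;
    if (head->next != NULL) head->next->prev = e;
    head->next = e;
    return e;
}

/* Unlink an entry using only its neighbours. Because a head node always
 * exists, e->prev is never NULL and the owning class is not touched. */
static void
subclass_list_remove(struct subclass_entry *e)
{
    assert(e->prev != NULL);
    e->prev->next = e->next;
    if (e->next != NULL) e->next->prev = e->prev;
    free(e);
}
```

The point of the sentinel is that removing the first real entry updates `head->next` rather than a field stored inside a heap object.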
* `GC.measure_total_time = true` enables total time measurement (default: true).
* `GC.measure_total_time` returns the current flag.
* `GC.total_time` returns the measured total time in nanoseconds.
* `GC.stat(:time)` (and the Hash returned by `GC.stat`) returns the measured total time in milliseconds.
Compared with C methods, a built-in method written in Ruby is
slower when only mandatory parameters are given, because it needs to
check the arguments and fill in default values for optional and keyword
parameters (C methods can check the number of parameters with `argc`,
so there is no such overhead). Passing only mandatory arguments is common
(optional arguments are the exception in many cases), so it is important
to provide a fast path for this common case.
`Primitive.mandatory_only?` is a special builtin function used with
an `if` expression like this:
```ruby
def self.at(time, subsec = false, unit = :microsecond, in: nil)
  if Primitive.mandatory_only?
    Primitive.time_s_at1(time)
  else
    Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in))
  end
end
```
and it makes two ISeqs, (1) and (2):
```ruby
# (1) the full ISeq
def self.at(time, subsec = false, unit = :microsecond, in: nil)
  Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in))
end
# (2) the mandatory-only ISeq
def self.at(time)
  Primitive.time_s_at1(time)
end
```
and (2) is pointed to by (1). Note that `Primitive.mandatory_only?`
should be used only as the condition of an `if` statement, and that
`if` statement must be the entire method body (you cannot
put any expression before or after the `if` statement).
A method entry with `mandatory_only?` (`Time.at` in the example above)
is marked as `iseq_overload`. When the method is dispatched with only
mandatory arguments (`Time.at(0)` for example), another method entry is
made with ISeq (2) as the mandatory-only method entry, and it
is cached in an inline method cache.
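The dispatch decision can be sketched as below. This is a hedged simplification: `struct method_entry`, its fields and `select_entry` are hypothetical, the mandatory-only entry is assumed to exist already (the commit creates it on demand), and inline caching is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified method entry. */
struct method_entry {
    int required;                      /* number of mandatory parameters */
    bool iseq_overload;                /* body used Primitive.mandatory_only? */
    const struct method_entry *monly;  /* entry for ISeq (2), or NULL */
};

/* Pick the entry to run for a call with argc positional arguments.
 * The mandatory-only entry is chosen only when exactly the mandatory
 * arguments are passed and no keyword arguments are present. */
static const struct method_entry *
select_entry(const struct method_entry *me, int argc, bool has_kwargs)
{
    assert(me != NULL);

    if (me->iseq_overload && me->monly != NULL &&
        !has_kwargs && argc == me->required) {
        return me->monly;
    }
    return me;
}
```

For a `Time.at`-like entry with one mandatory parameter, a call with one argument selects the mandatory-only entry, while a call with two arguments or with keywords keeps the full entry.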
The idea is similar to the one discussed in https://bugs.ruby-lang.org/issues/16254,
but it only distinguishes mandatory parameters from more, because in many
cases only mandatory parameters are given. If we find other cases (optional
or keyword parameters are used frequently and it hurts performance),
we can extend the feature.
With RVARGC we always store the rb_classext_t in the same slot as the
RClass struct that refers to it. So we don't need to store the pointer
or access through the pointer anymore and can switch the RCLASS_EXT
macro to use an offset.
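An offset-based accessor of this kind might look like the following sketch. The `my_` prefixed structs and the `MY_RCLASS_EXT` macro are hypothetical, cut-down stand-ins; the real `struct RClass` and `rb_classext_t` have many more fields, and the real slot comes from the GC rather than `calloc`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, cut-down layouts. */
struct my_classext { void *super; void *subclasses; };
struct my_rclass   { unsigned long flags; void *klass; };

/* The ext struct sits directly behind the class struct in the same slot,
 * so its address is a fixed offset from the class pointer and no stored
 * pointer is needed. */
#define MY_RCLASS_EXT(c) \
    ((struct my_classext *)((char *)(c) + sizeof(struct my_rclass)))

/* The offset must leave the ext struct suitably aligned. */
_Static_assert(sizeof(struct my_rclass) % _Alignof(struct my_classext) == 0,
               "classext would be misaligned");

/* Allocate one zeroed slot large enough for both structs. */
static struct my_rclass *
class_slot_alloc(void)
{
    struct my_rclass *c =
        calloc(1, sizeof(struct my_rclass) + sizeof(struct my_classext));
    assert(c != NULL);
    return c;
}
```

Compared with a stored pointer, this saves one word per class and one memory load per access, at the cost of requiring both structs to share a slot.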
This commit fixes a memory leak introduced in an early part of the
variable width allocation project that would prevent the rb_classext_t
struct from being free'd when the class is swept.
This commit deprecates rb_gc_force_recycle and converts it to a no-op
function. It also removes invalidate_mark_stack_chunk, since only
rb_gc_force_recycle uses it.
```
../gc.c:2342:45: warning: comparison of integers of different signs: 'short' and 'size_t' (aka 'unsigned long') [-Wsign-compare]
GC_ASSERT(size_pools[pool_id].slot_size == slot_size);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
```
Add a cast to short, because the `GC_ASSERT`s in `size_pool_for_size`
already cast to short.
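The pattern can be sketched as follows, assuming a pool whose `slot_size` field is a `short` as the warning indicates. `struct my_size_pool` and `slot_size_matches` are hypothetical names used only for this illustration:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool descriptor with a short slot_size field. */
struct my_size_pool { short slot_size; };

/* Compare a short field against a size_t value. Without the cast the
 * short is converted to an unsigned type, which triggers -Wsign-compare;
 * casting the size_t operand keeps both sides signed. */
static bool
slot_size_matches(const struct my_size_pool *pool, size_t slot_size)
{
    assert(slot_size <= (size_t)SHRT_MAX);  /* the cast must not truncate */
    return pool->slot_size == (short)slot_size;
}
```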
```
../gc.c:2342:25: warning: array subscript is of type 'char' [-Wchar-subscripts]
GC_ASSERT(size_pools[pool_id].slot_size == slot_size);
^~~~~~~~
```
It seems like `gc_verify_compaction_references` is not protected in case
alignment is wrong. This commit pushes the alignment check down to
`gc_start_internal` so that anyone trying to compact will check page
alignment.
I think this method may be getting called on PowerPC and the alignment
might be wrong.
http://rubyci.s3.amazonaws.com/ppc64le/ruby-master/log/20211021T190006Z.fail.html.gz
I'm looking through the places where YJIT needs notifications. It looks
like these changes to gc.c and vm_callinfo.h have become unnecessary
since 84ab77ba59. This commit just makes the diff against upstream
smaller, but otherwise shouldn't change any behavior.
Added UJIT_CHECK_MODE. Set to 1 to double check method dispatch in
generated code.
It's surprising to me that we need to watch both cc and cme. There might
be opportunities to simplify there.
Runtime assertion for the argument declared as non-null.
This macro does nothing if `RBIMPL_ATTR_NONNULL` is effective,
otherwise asserts that the argument is non-null.
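A macro pair with this behaviour could be sketched as below. The `MY_` prefixed macros and `name_length` are hypothetical and only illustrate the technique; the compiler detection is deliberately crude (GCC and Clang only) and is not how the real headers decide whether the attribute is effective.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__) || defined(__clang__)
/* The attribute is effective: the compiler checks the declaration,
 * so the runtime check expands to nothing. */
# define MY_ATTR_NONNULL(list) __attribute__((__nonnull__ list))
# define MY_NONNULL_ARG(arg)   ((void)0)
#else
/* No attribute support: fall back to a runtime assertion. */
# define MY_ATTR_NONNULL(list)
# define MY_NONNULL_ARG(arg)   assert((arg) != NULL)
#endif

/* Parameter 1 is declared non-null. */
MY_ATTR_NONNULL((1))
static size_t
name_length(const char *name)
{
    MY_NONNULL_ARG(name);
    return strlen(name);
}
```

Skipping the assertion when the attribute is active also avoids compiler warnings about comparing a non-null parameter against NULL.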
Commit 123eeb1c1a added incremental GC,
which moved the resetting of malloc_increase and oldmalloc_increase to
before marking. However, during_minor_gc is not set until gc_marks_start,
so the value will be from the last GC run rather than the current one.
Before the incremental GC commit, this code was in gc_before_sweep
which ran before sweep (after marking) so the value during_minor_gc
was correct.