RGENGC_CHECK_MODE >=3 fails with an incinsistency in the old object
count during ec_finalization.
This is due to inconsistency introduced to the object graph using T_DATA
finalizers.
This is explained in commit 79df14c04b,
which disabled gc during finalization to work around this.
```
/* prohibit GC because force T_DATA finalizers can break an object graph consistency */
dont_gc_on()
```
This object graph inconsistency also seems to break RGENGC_CHECK_MODE >=
3, when it attempt to verify the object age relationships during
finalization at VM shutdown (gc_enter is called during finalization).
This commit stops the internal consistency check during gc_enter only
when RGENGC_CHECK_MODE >= 3 and when gc is disabled.
This fixes `make btest` with `-DRGENGC_CHECK_MODE=3`
This PR moves `rb_copy_wb_protected_attribute` and
`rb_gc_copy_finalizer` into a single function called
`rb_gc_copy_attributes` to be called by `init_copy`. This reduces the
surface area of the GC API.
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
Currently, we check the values on the machine stack & register state to
see if they're actually a pointer to an ASAN fake stack, and mark the
values on the fake stack too if required. However, we are only doing
that for the _current_ thread (the one actually running the GC), not for
any other thread in the program.
Make rb_gc_mark_machine_context (which is called for marking non-current
threads) perform the same ASAN fake stack handling that
mark_current_machine_context performs.
[Bug #20310]
This `st_table` is used to both mark and pin classes
defined from the C API. But `vm->mark_object_ary` already
does both much more efficiently.
Currently a Ruby process starts with 252 rooted classes,
which uses `7224B` in an `st_table` or `2016B` in an `RArray`.
So a baseline of 5kB saved, but since `mark_object_ary` is
preallocated with `1024` slots but only use `405` of them,
it's a net `7kB` save.
`vm->mark_object_ary` is also being refactored.
Prior to this changes, `mark_object_ary` was a regular `RArray`, but
since this allows for references to be moved, it was marked a second
time from `rb_vm_mark()` to pin these objects.
This has the detrimental effect of marking these references on every
minors even though it's a mostly append only list.
But using a custom TypedData we can save from having to mark
all the references on minor GC runs.
Addtionally, immediate values are now ignored and not appended
to `vm->mark_object_ary` as it's just wasted space.
This frees FL_USER0 on both T_MODULE and T_CLASS.
Note: prior to this, FL_SINGLETON was never set on T_MODULE,
so checking for `FL_SINGLETON` without first checking that
`FL_TYPE` was `T_CLASS` was valid. That's no longer the case.
is_markable_object is called by rb_objspace_markable_object_p, which
may pass a T_NONE object. check_rvalue_consistency will fail if a T_NONE
object is passed in.
The name is misleading, as it seems like the function checks whether the
object is swept or not. But the function only checks whether the page is
before or after sweeping.
The FL_FINALIZE flag is set when there are finalizers for the object. We
can improver performance by not looking up in the table if the flag is
not set.
Using the following C extension:
#include "ruby/ruby.h"
static void data_free(void *_ptr) {}
static const rb_data_type_t data_type = {
"my_type",
{
NULL,
data_free,
},
0, 0, 0
};
static VALUE data_alloc(VALUE klass) {
return TypedData_Wrap_Struct(klass, &data_type, (void *)1);
}
void Init_myext(void) {
VALUE my_klass = rb_define_class("MyClass", rb_cObject);
rb_define_alloc_func(my_klass, data_alloc);
}
And the following benchmark:
require "benchmark"
final_objs = 1_000_000.times.map do
o = Object.new
ObjectSpace.define_finalizer(o, proc {})
o
end
puts(Benchmark.measure do
100_000_000.times do
MyClass.new
end
end)
Before:
10.974190 0.355037 11.329227 ( 11.416772)
After:
7.664310 0.347598 8.011908 ( 8.268969)
Rather than exposing that an imemo has a flag and four fields, this
changes the implementation to only expose one field (the klass) and
fills the rest with 0. The type will have to fill in the values themselves.
The switch statement is not exhaustive, meaning the "unreachable"
comment was not correct. This commit fixes it by making the list
exhaustive and adding an rb_bug in the default case.
Previously every call to vm_ci_new (when the CI was not packable) would
result in a different callinfo being returned this meant that every
kwarg callsite had its own CI.
When calling, different CIs result in different CCs. These CIs and CCs
both end up persisted on the T_CLASS inside cc_tbl. So in an eval loop
this resulted in a memory leak of both types of object. This also likely
resulted in extra memory used, and extra time searching, in non-eval
cases.
For simplicity in this commit I always allocate a CI object inside
rb_vm_ci_lookup, but ideally we would lazily allocate it only when
needed. I hope to do that as a follow up in the future.
rb_objspace_call_finalizer didn't free fibers and neither did
rb_objspace_free_objects, which caused fibers to be reported as leaked
when using RUBY_FREE_AT_EXIT. This commit changes rb_objspace_free_objects
to free all remaining Ruby objects.
This returns whether or not _any_ piece of memory in the range is
poisoned, not if _all_ of it is. That means that currently, with ASAN
enabled, pages which contain a single poisoned object are skipped
entirely from being iterated with objspace_each* family of functions.
[Bug #20220]
ASAN leaves a pointer to the fake frame on the stack; we can use the
__asan_addr_is_in_fake_stack API to work out the extent of the fake
stack and thus mark any VALUEs contained therein.
[Bug #20001]
T_DATA without a pointer or free function may still have ivars set on
them that need to be freed. The following leaked generic ivars for
example:
converter = Encoding::Converter.allocate
converter.instance_variable_set(:@foo, 1)
STACK OF 1 INSTANCE OF 'ROOT LEAK: <malloc in objspace_xmalloc0>':
<snip>
12 miniruby 0x10286ec50 ivar_set + 140 variable.c:1850
11 miniruby 0x102876afc generic_ivar_set + 136 variable.c:1668
ASAN leaves a pointer to the fake frame on the stack; we can use the
__asan_addr_is_in_fake_stack API to work out the extent of the fake
stack and thus mark any VALUEs contained therein.
[Bug #20001]
I was trying to debug an (unrelated) issue in the GC, and wanted to turn
on the trace-level GC output by compiling it with -DRGENGC_DEBUG=5.
Unfortunately, this actually causes a crash in newobj_init() because the
code there tries to log the obj_info() of the newly created object.
However, the object is not actually sufficiently set up for some of the
things that obj_info() tries to do:
* The instance variable table for a class is not yet initialized, and
when using variable-length RVALUES, said ivar table is embedded in
as-yet unitialized memory after the struct RValue. Attempting to read
this, as obj_info() does, causes a crash.
* T_DATA variables need to dereference their ->type field to print out
the underlying C type name, which is not set up until newobj_fill() is
called.
To fix this, create a new method `obj_info_basic`, which dumps out only
the parts of the object that are valid before the object is fully
initialized.
[Fixes#18795]
The for loops for marking and reference updating declaratively marked
TypedData objects did not mark/reference update the very last element.
When RGENGC_CHECK_MODE is turned on, this caused the test in Enumerator
to fail with:
tool/lib/test/unit/testcase.rb:173:in `rescue in run': failed to allocate memory (NoMemoryError)