[Bug #19584]
Some C extensions pass a pointer to a global variable to
rb_gc_register_address. However, if a GC is triggered inside of
rb_gc_register_address, then the object could get swept since it does
not exist on the stack.
[Bug #19580]
The real-world scenario motivating this change is libxml2's pthread
code which uses `pthread_key_create` to set up a destructor that is
called at thread exit to free thread-local storage.
There is a small window of time -- after ruby_vm_destruct but before
the process exits -- in which a pthread may exit and the destructor is
called, leading to a segfault.
Please note that this window of time may be relatively large if
`atexit` is being used.
Remove !USE_RVARGC code
[Feature #19579]
The Variable Width Allocation feature was turned on by default in Ruby
3.2. Since then, we haven't received bug reports or backports to the
non-Variable Width Allocation code paths, so we assume that nobody is
using it. We also don't plan on maintaining the non-Variable Width
Allocation code, so we are going to remove it.
Make sure the transient heap is in the right mode when we finish warming
the heap. Also ensure the GC isn't allowed to run while we iterate and
mutate the heap.
[Feature #18885]
For now, the optimizations performed are:
- Run a major GC
- Compact the heap
- Promote all surviving objects to oldgen
Other optimizations may follow.
[Bug #19550]
If !RCLASS_EXT_EMBEDDED (e.g. 32 bit systems) then the rb_classext_t is
allocated throug malloc so it must be freed.
The issue can be seen in the following script:
```
20.times do
100_000.times do
mod = Module.new
Class.new do
include mod
end
end
# Output the Resident Set Size (memory usage, in KB) of the current Ruby process
puts `ps -o rss= -p #{$$}`
end
```
Before this fix, the max RSS is 280MB, while after this change, it's
30MB.
st tables will maintain insertion order so we can marshal dump / load
objects with instance variables in the same order they were set on that
particular instance
[ruby-core:112926] [Bug #19535]
Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
When using rb_data_type_struct to wrap a C struct, that C struct can
contain VALUE references to other Ruby objects.
If this is the case then one must also define dmark and optionally
dcompact callbacks in order to allow these objects to be correctly
handled by the GC. This is suboptimal as it requires GC related logic to
be implemented by extension developers. This can be a cause of subtle
bugs when references are not marked of updated correctly inside these
callbacks.
This commit provides an alternative approach, useful in the simple case
where the C struct contains VALUE members (ie. there isn't any
conditional logic, or data structure manipulation required to traverse
these references).
In this case references can be defined using a declarative syntax
as a list of edges (or, pointers to references).
A flag can be set on the rb_data_type_struct to notify the GC that
declarative references are being used, and a list of those references
can be assigned to the dmark pointer instead of a function callback, on
the rb_data_type_struct.
Macros are also provided for simple declaration of the reference list,
and building edges.
To avoid having to also find space in the struct to define a length for
the references list, I've chosed to always terminate the references list
with RUBY_REF_END - defined as UINTPTR_MAX. My assumption is that no
single struct will ever be large enough that UINTPTR_MAX is actually a
valid reference.
These classes don't belong in gc.c as they're not actually part of the
GC. This commit refactors the code by moving all the code into a
weakmap.c file.
This makes the behavior of classes and modules when there are too many instance variables match the behavior of objects with too many instance variables.
When a Ractor is created whilst a tracepoint for
RUBY_INTERNAL_EVENT_NEWOBJ is active, the interpreter crashes. This is
because during the early setup of the Ractor, the stdio objects are
created, which allocates Ruby objects, which fires the tracepoint.
However, the tracepoint machinery tries to dereference the control frame
(ec->cfp->pc), which isn't set up yet and so crashes with a null pointer
dereference.
Fix this by not firing GC tracepoints if cfp isn't yet set up.
We need to zero out the whole slot when running the newobj hook for a
newly allocated class because the slot could be filled with garbage,
which would cause a crash if a GC runs inside of the newobj hook.
For example, the following script crashes:
```
require "objspace"
GC.stress = true
ObjectSpace.trace_object_allocations {
100.times do
Class.new
end
}
```
[Bug #19482]
This commit adds a function rb_data_free used by obj_free and
rb_objspace_call_finalizer to free T_DATA objects. This change also
means that RUBY_TYPED_FREE_IMMEDIATELY objects can be freed immediately
in rb_objspace_call_finalizer rather than being created into a zombie.
This feature was introduced in commit 2ccf6e5, but I realized that
using rb_warn is a bad idea because it allocates objects, which causes
a different crash ("object allocation during garbage collection phase").
We should just hard crash here instead.
If the previous instruction is not a leaf instruction, then the PC was
incremented before the instruction was ran (meaning the currently
executing instruction is actually the previous instruction), so we
should not increment the PC otherwise we will calculate the source
line for the next instruction.
This bug can be reproduced in the following script:
```
require "objspace"
ObjectSpace.trace_object_allocations_start
a =
1.0 / 0.0
p [ObjectSpace.allocation_sourceline(a), ObjectSpace.allocation_sourcefile(a)]
```
Which outputs: [4, "test.rb"]
This is incorrect because the object was allocated on line 10 and not
line 4. The behaviour is correct when we use a leaf instruction (e.g.
if we replaced `1.0 / 0.0` with `"hello"`), then the output is:
[10, "test.rb"].
[Bug #19456]
We shouldn't run gc_verify_internal_consistency after every GC step
when RGENGC_CHECK_MODE >= 2, only when GC has finished. Running it
on every GC step makes it too slow.
There is a `time` key in GC.stat that gives us the total time spent in
GC. However, we don't know what proportion of the time is spent between
marking and sweeping. This makes it difficult to tune the GC as we're
not sure where to focus our efforts on.
This PR adds keys `marking_time` and `sweeping_time` to GC.stat for the
time spent marking and sweeping, in milliseconds.
[Feature #19437]
This macro is broken when set to anything other than 0. And has had a
comment saying that it's broken for 3 years.
This commit deletes it and the associated logging code. It's clearly
not being used.
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
Given that signleton classes don't have an allocator,
we can re-use these bytes to store the attached object
in `rb_classext_struct` without making it larger.
The old RUBY_GC_HEAP_INIT_SLOTS isn't really usable anymore as
it initalize all the pools by the same factor, but it's unlikely
that pools will need similar sizes.
In production our 40B pool is 5 to 6 times bigger than our 80B pool.