Various things can cause GC to occur when compaction is running, for
example resizing the object identity map:
```
frame #24: 0x000000010c784a10 ruby`gc_grey [inlined] push_mark_stack(stack=<unavailable>, data=<unavailable>) at gc.c:4311:42
frame #25: 0x000000010c7849ff ruby`gc_grey(objspace=0x00007fc56c804400, obj=140485906037400) at gc.c:4907
frame #26: 0x000000010c78f881 ruby`gc_start at gc.c:6464:8
frame #27: 0x000000010c78f5d1 ruby`gc_start [inlined] gc_marks_start(objspace=0x00007fc56c804400, full_mark=<unavailable>) at gc.c:6009
frame #28: 0x000000010c78f3c0 ruby`gc_start at gc.c:6291
frame #29: 0x000000010c78f399 ruby`gc_start(objspace=0x00007fc56c804400, reason=<unavailable>) at gc.c:7104
frame #30: 0x000000010c78930c ruby`objspace_xmalloc0 [inlined] objspace_malloc_fixup(objspace=<unavailable>, mem=0x000000011372a000, size=<unavailable>) at gc.c:9665:5
frame #31: 0x000000010c7892f5 ruby`objspace_xmalloc0(objspace=0x00007fc56c804400, size=12582912) at gc.c:9707
frame #32: 0x000000010c89bc13 ruby`st_init_table_with_size(type=<unavailable>, size=<unavailable>) at st.c:605:39
frame #33: 0x000000010c89c5e2 ruby`rebuild_table_if_necessary [inlined] rebuild_table(tab=0x00007fc56c40b250) at st.c:780:19
frame #34: 0x000000010c89c5ac ruby`rebuild_table_if_necessary(tab=0x00007fc56c40b250) at st.c:1142
frame #35: 0x000000010c89c379 ruby`st_insert(tab=0x00007fc56c40b250, key=140486132605040, value=140485922918920) at st.c:1161:5
frame #36: 0x000000010c794a16 ruby`gc_compact_heap [inlined] gc_move(objspace=0x00007fc56c804400, scan=<unavailable>, free=<unavailable>, moved_list=140485922918960) at gc.c:7441:9
frame #37: 0x000000010c794917 ruby`gc_compact_heap(objspace=0x00007fc56c804400, comparator=<unavailable>) at gc.c:7695
frame #38: 0x000000010c79410d ruby`gc_compact [inlined] gc_compact_after_gc(objspace=0x00007fc56c804400, use_toward_empty=1, use_double_pages=<unavailable>, use_verifier=1) at gc.c:0:22
```
We *definitely* need the heap to be in a consistent state during
compaction, so this commit sets the current state to "during_gc" so that
nothing will trigger a GC until the heap finishes compacting.
This fixes the bug we saw when running the tests for https://github.com/ruby/ruby/pull/2264
pointers.
Instead of 4fe908c164, just locking the MJIT
worker may be fine for this case. And also we might have the same issue
in all `gc_compact_after_gc` calls.
rb_gc_finalize_deferred() is remained for compatibility with
C-extensions. However, this function is no longer working
from Ruby 2.4 (crash with SEGV immediately).
So remove it completely.
`heap_pages_sorted` includes eden and tomb pages, so we should not
use tomb pages for GC.compact (or we should move all of tomb pages
into eden pages). Now, I choose only eden pages. If we allow to
move Zombie objects (objects waiting for finalizers), we should
use both type of pages (TODO).
Fix debug output to dump more useful information on GC.compact
debugging.
check_rvalue_consistency_force() now accepts `terminate` flag
to terminate a program with rb_bug() or only print error message.
GC.verify_internal_consistency use this flag (== FALSE) to dump
all of debug output.
is_pointer_to_heap(obj) checks this obj belong to a heap page.
However, this function returns TRUE even if the page is tomb page.
This is re-commit of [712c027524].
heap_page_add_freeobj() should not use is_pointer_to_heap(), but
should check more explicitly.
GC.verify_internal_consistency() checks health of each RVALUE with
check_rvalue_consistency(). However, this function is enabled
only on debug environment (RGENGC_CHECK_MODE>1). So introduce
new function check_rvalue_consistency_force() and use it
in GC.verify_internal_consistency.
is_pointer_to_heap() checks if it is in valid pointer to the
RVALUE in any heap_page_body. However, it returns true if it
points tomb pages. This patch check it points to eden pages.
rb_gc() kicks gc_finalize_deferred(), which invokes finalizers.
This means that any Ruby program can be run this point and
it may be thread switching points and so on.
However, it is difficult to think it invokes any Ruby programs.
For example, `GC.compact` use `rb_gc()` to implement it, howver,
any Ruby program must not be run on this timing.
For this reason (it is difficult to image it run any Ruby program),
I removed `gc_finalize_deferred()` line in rb_gc().
This patch solves GC.compact issue.
[Bug #15809] and re-enable GC.compact test.
* variable.c: make the hidden ivars `classpath` and `tmp_classpath` the source
of truth for module and constant names. Assign to them when modules are bind
to constants.
* variable.c: remove references to module name cache, as what used to be the cache
is now the source of truth. Remove rb_class_path_no_cache().
* variable.c: remove the hidden ivar `classid`. This existed for the purposes of
module name search, which is now replaced. Also, remove the associated
rb_name_class().
* class.c: use rb_set_class_path_string to set the name of Object during boot.
Must use a fstring as this runs before rb_cString is initialized and
creating a normal string leads to a VALUE without a class.
* spec/ruby/core/module/name_spec.rb: add a few specs to specify what happens
to Module#name across multiple operations. These specs pass without other
code changes in this commit.
[Feature #15765]
If a dynamic symbol has been converted to a static symbol, it gets added
to the global ID list and should no longer move. C extensions can pass
symbols to rb_sym2id and those symbols should no longer be movable.
When the symbol is passed to rb_sym2id, the `id` member is set, so we
can use its existence to prevent movement.
`transient_heap_evacuate()` disables GC using `rb_gc_disable()`
to prohibt GC invocation because of new allocation for evacuated
memory. However, `rb_gc_disable()` sweep all rest of unswept pages.
We don't need to cancel lazy sweep so this patch introduce
`rb_gc_disable_no_rest()` which doesn't cancel lazy sweep.
This commit adds an alternative packing strategy for compaction.
Instead of packing towards "most pinned" pages, we can pack towards
"most empty" pages. The idea is that we can double the heap size, then
pack all objects towards the empty side of the heap. This will ensure
maximum chaos for testing / verification.
GC is required for pinning / marking objects. If the compactor runs
without pinning everything, then it will blow up, so just return early
if the GC is disabled.
Objects in the finalizer table stay pinned for now. In some cases, the
key could move which would cause a miss when removing the object from
the table (leading to a T_MOVED reference staying in the table).
`obj_info` will look at references of objects in some cases (for example
it will try to access path information on ISeq objects). But during the
sweep phase, if the referenced object is collected before `obj_info` is
called, then it could be a bad ref and a segv will occur.
For example:
A -> B
Sweep phase:
1. obj_info(B)
2. Sweep and free B
3. obj_info(A); A tries to read B
4. SEGV
This commit simply removes the call to `obj_info` during the sweep
phase.
`GC::Profiler.enable; GC::Profiler.clear` tries to clear
objspace->profile.records but it has never been allocated before.
Thus the MEMCPY took NULL argument before this changeset.
The objspace->profile.records is allocated appropriately elsewhere.
Why not juts free it if any? That should work.
Depending on architectures, setjmp might not fully fill a jmp_buf.
On such machines the union can contain wobbly bits. They are then
scanned during mark_locations_array(). This is bad.
These assertions check if a newly allocated object (which is marked
as an uninitialized memory region in MSAN) is in fact a T_NONE.
Thus they intentionally read uninitialized memory regions, which do
not interface well with MSAN. Just disalbe them.
Before this commit, classes and modules would be registered with the
VM's `defined_module_hash`. The key was the ID of the class, but that
meant that it was possible for hash collisions to occur. The compactor
doesn't allow classes in the `defined_module_hash` to move, but if there
is a conflict, then it's possible a class would be removed from the hash
and not get pined.
This commit changes the key / value of the hash just to be the class
itself, thus preventing movement.
Remembered objects can move between pages, so we need to make sure the
flags on the page are set correctly.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67640 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Some classes don't have a superclass, so we should check to see if it's
there before marking.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67610 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
For some reason symbols (or classes) are being overridden in trunk
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67598 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This commit adds the new method `GC.compact` and compacting GC support.
Please see this issue for caveats:
https://bugs.ruby-lang.org/issues/15626
[Feature #15626]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67576 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Because hard to specify commits related to r67479 only.
So please commit again.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67499 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Weak map references can't move because the st_table needs their address
as a key. But, we also need to remove T_NONE from the map so they
aren't reused.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67485 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This commit adds the new method `GC.compact` and compacting GC support.
Please see this issue for caveats:
https://bugs.ruby-lang.org/issues/15626
[Feature #15626]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67479 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* gc.c (obj_memsize_of): T_RATIONAL and T_COMPLEX cannot be an
imemo.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67464 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This code was trying to access memory before unpoisoning it.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67417 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This commit just adds poisoning around the freelist to help debugging.
Also verify that the freelist only contains T_NONE objects when checking
the heap integrity
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67416 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
verify_internal_consistency_i and gc_verify_heap_page would walk the
heap, reading data from each slot, but would not unpoison the object
before reading. This commit unpoisons the slot before reading so that
we won't get ASAN errors
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67408 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
for types Float, Bignum and Symbol as they do not have references
and singleton classes.
[Fix GH-2091]
From: Lourens Naudé <lourens@bearmetal.eu>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67195 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* internal.h: move ar_table def to hash.c because other files
don't need to know implementation of ar_table.
* hash.c (rb_hash_ar_table_size): added because gc.c needs to know
the size_of(ar_table).
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66638 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* gc.c (gc_mark_ptr): this check was introduced by accidentaly
(this is why message is "...", crazy simple) for debugging.
However, this check is useful because if there is T_NONE
object here, we can't know which object points to it.
For debugging reason, I remain this checking code and
set reasonable error message.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66514 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* internal.h: rename the following names:
* li_table -> ar_table. "li" means linear (from linear search),
but we use the word "array" (from data layout).
* RHASH_ARRAY -> RHASH_AR_TABLE. AR_TABLE is more clear.
* rb_hash_array_* -> rb_hash_ar_table_*.
* RHASH_TABLE_P() -> RHASH_ST_TABLE_P(). more clear.
* RHASH_CLEAR() -> RHASH_ST_CLEAR().
* hash.c: rename "linear_" prefix functions to "ar_" prefix.
* hash.c (linear_init_table): rename to ar_alloc_table.
* debug_counter.h: rename obj_hash_array to obj_hash_ar.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66390 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Especially over checking argc then calling rb_scan_args just to
raise an ArgumentError.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66238 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
The ruby setting was renamed to HEAP_PAGE_ALIGN_LOG, but the
configure.in (now configure.ac) file was not updated, so the
setting had no effect. The configure setting is unnecessary
after OpenBSD 5.2 and MirOS has been discontinued (with the last
release being over 10 years ago), so it is better to just remove
the related configure setting.
Fix [Bug #13438]
From: Jeremy Evans <code@jeremyevans.net>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* vm_trace.c (rb_tracepoint_enable_for_target): support targetting
TracePoint. [Feature #15289]
Tragetting TracePoint is only enabled on specified method, proc
and so on, example: `tp.enable(target: code)`.
`code` should be consisted of InstructionSeuqnece (iseq)
(RubyVM::InstructionSeuqnece.of(code) should not return nil)
If code is a tree of iseq, TracePoint is enabled on all of
iseqs in a tree.
Enabled tragetting TracePoints can not enabled again with
and without target.
* vm_core.h (rb_iseq_t): introduce `rb_iseq_t::local_hooks`
to store local hooks.
`rb_iseq_t::aux::trace_events` is renamed to
`global_trace_events` to contrast with `local_hooks`.
* vm_core.h (rb_hook_list_t): add `rb_hook_list_t::running`
to represent how many Threads/Fibers are used this list.
If this field is 0, nobody using this hooks and we can
delete it.
This is why we can remove code from cont.c.
* vm_core.h (rb_vm_t): because of above change, we can eliminate
`rb_vm_t::trace_running` field.
Also renamed from `rb_vm_t::event_hooks` to `global_hooks`.
* vm_core.h, vm.c (ruby_vm_event_enabled_global_flags): renamed
from `ruby_vm_event_enabled_flags.
* vm_core.h, vm.c (ruby_vm_event_local_num): added to count
enabled targetting TracePoints.
* vm_core.h, vm_trace.c (rb_exec_event_hooks): accepts
hook list.
* vm_core.h (rb_vm_global_hooks): added for convinience.
* method.h (rb_method_bmethod_t): added to maintain Proc
and `rb_hook_list_t` for bmethod (defined by define_method).
* prelude.rb (TracePoint#enable): extracet a keyword parameter
(because it is easy than writing in C).
It calls `TracePoint#__enable` internal method written in C.
* vm_insnhelper.c (vm_trace): check also iseq->local_hooks.
* vm.c (invoke_bmethod): check def->body.bmethod.hooks.
* vm.c (hook_before_rewind): check iseq->local_hooks
and def->body.bmethod.hooks before rewind by exception.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66003 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
These APIs are much like <valgrind/memcheck.h>. Use them to
fine-grain annotate the usage of our memory.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65573 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
In these functions we are intentionally reading memory address
not owned by us. These reads should not be diagnosed.
See also [Bug #8680]
See also https://travis-ci.org/ruby/ruby/jobs/451202718
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65564 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* gc.c (rb_raw_obj_info): fix type mismatch specified by the following
build log: https://travis-ci.org/ruby/ruby/jobs/448634481
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65463 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c, internal.h: support theap for small Hash.
Introduce RHASH_ARRAY (li_table) besides st_table and small Hash
(<=8 entries) are managed by an array data structure.
This array data can be managed by theap.
If st_table is needed, then converting array data to st_table data.
For st_table using code, we prepare "stlike" APIs which accepts hash value
and are very similar to st_ APIs.
This work is based on the GSoC achievement
by tacinight <tacingiht@gmail.com> and refined by ko1.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65454 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c: now instance variable space has theap supports.
obj_ivar_heap_alloc() tries to acquire memory from theap.
* debug_counter.h: add some counters for theap.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65451 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
theap is designed for Ruby's object system. theap is like Eden heap
on generational GC terminology. theap allocation is very fast because
it only needs to bump up pointer and deallocation is also fast because
we don't do anything. However we need to evacuate (Copy GC terminology)
if theap memory is long-lived. Evacuation logic is needed for each type.
See [Bug #14858] for details.
* array.c: Now, theap for T_ARRAY is supported.
ary_heap_alloc() tries to allocate memory area from theap. If this trial
sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
We don't need to free theap ptr.
* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that
if ary is allocated at theap, force evacuation to malloc'ed memory.
It makes programs slow, but very compatible with current code because
theap memory can be evacuated (theap memory will be recycled).
If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
will occur, use RARRAY_CONST_PTR().
(re-commit of r65444)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65449 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c: now instance variable space has theap supports.
obj_ivar_heap_alloc() tries to acquire memory from theap.
* debug_counter.h: add some counters for theap.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65446 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
theap is designed for Ruby's object system. theap is like Eden heap
on generational GC terminology. theap allocation is very fast because
it only needs to bump up pointer and deallocation is also fast because
we don't do anything. However we need to evacuate (Copy GC terminology)
if theap memory is long-lived. Evacuation logic is needed for each type.
See [Bug #14858] for details.
* array.c: Now, theap for T_ARRAY is supported.
ary_heap_alloc() tries to allocate memory area from theap. If this trial
sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
We don't need to free theap ptr.
* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that
if ary is allocated at theap, force evacuation to malloc'ed memory.
It makes programs slow, but very compatible with current code because
theap memory can be evacuated (theap memory will be recycled).
If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
will occur, use RARRAY_CONST_PTR().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65444 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* class.c (rb_keyword_error_new): use RARRAY_AREF() because
RARRAY_CONST_PTR() can introduce additional overhead in a futre.
Same fixes for other files.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65430 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* debug_counter.h: add the following counters.
* frame_push: control frame counts (total counts).
* frame_push_*: control frame counts per every frame type.
* obj_*: add free'ed counts for each type.
* gc.c: ditto.
* vm_insnhelper.c (vm_push_frame): ditto.
* debug_counter.c (rb_debug_counter_show_results): widen counts field
to show >10G numbers.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64867 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* gc.c (obj_free): a table can be accessed for debug counters.
[Bug #15165] [Fix GH-1964]
A patch from Joe Truba <jtruba@meraki.com>
Also check USE_DEBUG_COUNTER macro.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64857 b2dd03c8-39d4-4d8f-98ff-823fe69b080e