ruby/gc.rb

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

329 строки
12 KiB
Ruby
Исходник Обычный вид История

2019-11-08 09:32:01 +03:00
# for gc.c
2023-08-14 21:04:39 +03:00
# The \GC module provides an interface to Ruby's mark and
2019-11-08 09:32:01 +03:00
# sweep garbage collection mechanism.
#
# Some of the underlying methods are also available via the ObjectSpace
# module.
#
# You may obtain information about the operation of the \GC through
2019-11-08 09:32:01 +03:00
# GC::Profiler.
module GC
2023-08-14 21:04:55 +03:00
# Initiates garbage collection, even if manually disabled.
2019-11-08 09:32:01 +03:00
#
2023-08-14 21:04:55 +03:00
# The +full_mark+ keyword argument determines whether or not to perform a
# major garbage collection cycle. When set to +true+, a major garbage
# collection cycle is ran, meaning all objects are marked. When set to
# +false+, a minor garbage collection cycle is ran, meaning only young
# objects are marked.
2019-11-08 09:32:01 +03:00
#
2023-08-14 21:04:55 +03:00
# The +immediate_mark+ keyword argument determines whether or not to perform
# incremental marking. When set to +true+, marking is completed during the
# call to this method. When set to +false+, marking is performed in steps
# that is interleaved with future Ruby code execution, so marking might not
# be completed during this method call. Note that if +full_mark+ is +false+
# then marking will always be immediate, regardless of the value of
# +immediate_mark+.
2019-11-08 09:32:01 +03:00
#
2023-08-14 21:04:55 +03:00
# The +immedate_sweep+ keyword argument determines whether or not to defer
# sweeping (using lazy sweep). When set to +true+, sweeping is performed in
# steps that is interleaved with future Ruby code execution, so sweeping might
# not be completed during this method call. When set to +false+, sweeping is
# completed during the call to this method.
2019-11-08 09:32:01 +03:00
#
2023-08-14 21:04:55 +03:00
# Note: These keyword arguments are implementation and version dependent. They
# are not guaranteed to be future-compatible, and may be ignored if the
# underlying implementation does not support them.
Revert "Combine sweeping and moving" This reverts commit 02b216e5a70235f42f537e895d6f1afd05d8916a. This reverts commit 9b8825b6f94696c9659f93f5da9bf02644625f67. I found that combining sweep and move is not safe. I don't think that we can do compaction concurrently with _anything_ unless there is a read barrier installed. Here is a simple example. A class object is freed, and during it's free step, it tries to remove itself from its parent's subclass list. However, during the sweep step, the parent class was moved and the "currently being freed" class didn't have references updated yet. So we get a segv like this: ``` (lldb) bt * thread #1, name = 'ruby', stop reason = signal SIGSEGV * frame #0: 0x0000560763e344cb ruby`rb_st_lookup at st.c:320:43 frame #1: 0x0000560763e344cb ruby`rb_st_lookup(tab=0x2f7469672f6e6f72, key=3809, value=0x0000560765bf2270) at st.c:1010 frame #2: 0x0000560763e8f16a ruby`rb_search_class_path at variable.c:99:9 frame #3: 0x0000560763e8f141 ruby`rb_search_class_path at variable.c:145 frame #4: 0x0000560763e8f141 ruby`rb_search_class_path(klass=94589785585880) at variable.c:191 frame #5: 0x0000560763ec744e ruby`rb_vm_bugreport at vm_dump.c:996:17 frame #6: 0x0000560763f5b958 ruby`rb_bug_for_fatal_signal at error.c:675:5 frame #7: 0x0000560763e27dad ruby`sigsegv(sig=<unavailable>, info=<unavailable>, ctx=<unavailable>) at signal.c:955:5 frame #8: 0x00007f8b891d33c0 libpthread.so.0`___lldb_unnamed_symbol1$$libpthread.so.0 + 1 frame #9: 0x0000560763efa8bb ruby`rb_class_remove_from_super_subclasses(klass=94589790314280) at class.c:93:56 frame #10: 0x0000560763d10cb7 ruby`gc_sweep_step at gc.c:2674:2 frame #11: 0x0000560763d1187b ruby`gc_sweep at gc.c:4540:2 frame #12: 0x0000560763d101f0 ruby`gc_start at gc.c:6797:6 frame #13: 0x0000560763d15153 ruby`rb_gc_compact at gc.c:7479:12 frame #14: 0x0000560763eb4eb8 ruby`vm_exec_core at vm_insnhelper.c:5183:13 frame #15: 0x0000560763ea9bae ruby`rb_vm_exec at vm.c:1953:22 frame #16: 0x0000560763eac08d ruby`rb_yield at vm.c:1132:9 frame #17: 0x0000560763edb4f2 ruby`rb_ary_collect at array.c:3186:9 frame #18: 0x0000560763e9ee15 ruby`vm_call_cfunc_with_frame at vm_insnhelper.c:2575:12 frame #19: 0x0000560763eb2e66 ruby`vm_exec_core at vm_insnhelper.c:4177:11 frame #20: 0x0000560763ea9bae ruby`rb_vm_exec at vm.c:1953:22 frame #21: 0x0000560763eac08d ruby`rb_yield at vm.c:1132:9 frame #22: 0x0000560763edb4f2 ruby`rb_ary_collect at array.c:3186:9 frame #23: 0x0000560763e9ee15 ruby`vm_call_cfunc_with_frame at vm_insnhelper.c:2575:12 frame #24: 0x0000560763eb2e66 ruby`vm_exec_core at vm_insnhelper.c:4177:11 frame #25: 0x0000560763ea9bae ruby`rb_vm_exec at vm.c:1953:22 frame #26: 0x0000560763ceee01 ruby`rb_ec_exec_node(ec=0x0000560765afa530, n=0x0000560765b088e0) at eval.c:296:2 frame #27: 0x0000560763cf3b7b ruby`ruby_run_node(n=0x0000560765b088e0) at eval.c:354:12 frame #28: 0x0000560763cee4a3 ruby`main(argc=<unavailable>, argv=<unavailable>) at main.c:50:9 frame #29: 0x00007f8b88e560b3 libc.so.6`__libc_start_main + 243 frame #30: 0x0000560763cee4ee ruby`_start + 46 (lldb) f 9 frame #9: 0x0000560763efa8bb ruby`rb_class_remove_from_super_subclasses(klass=94589790314280) at class.c:93:56 90 91 *RCLASS_EXT(klass)->parent_subclasses = entry->next; 92 if (entry->next) { -> 93 RCLASS_EXT(entry->next->klass)->parent_subclasses = RCLASS_EXT(klass)->parent_subclasses; 94 } 95 xfree(entry); 96 } (lldb) command script import -r misc/lldb_cruby.py lldb scripts for ruby has been installed. (lldb) rp entry->next->klass (struct RMoved) $1 = (flags = 30, destination = 94589792806680, next = 94589784369160) (lldb) ```
2020-06-09 23:46:29 +03:00
def self.start full_mark: true, immediate_mark: true, immediate_sweep: true
Primitive.gc_start_internal full_mark, immediate_mark, immediate_sweep, false
2019-11-08 09:32:01 +03:00
end
def garbage_collect full_mark: true, immediate_mark: true, immediate_sweep: true
Primitive.gc_start_internal full_mark, immediate_mark, immediate_sweep, false
end
2019-11-08 09:32:01 +03:00
# call-seq:
# GC.enable -> true or false
#
# Enables garbage collection, returning +true+ if garbage
# collection was previously disabled.
#
# GC.disable #=> false
# GC.enable #=> true
# GC.enable #=> false
#
def self.enable
Primitive.gc_enable
2019-11-08 09:32:01 +03:00
end
# call-seq:
# GC.disable -> true or false
#
# Disables garbage collection, returning +true+ if garbage
# collection was already disabled.
#
# GC.disable #=> false
# GC.disable #=> true
def self.disable
Primitive.gc_disable
2019-11-08 09:32:01 +03:00
end
# call-seq:
# GC.stress -> integer, true or false
#
# Returns current status of \GC stress mode.
2019-11-08 09:32:01 +03:00
def self.stress
Primitive.gc_stress_get
2019-11-08 09:32:01 +03:00
end
# call-seq:
# GC.stress = flag -> flag
#
# Updates the \GC stress mode.
2019-11-08 09:32:01 +03:00
#
# When stress mode is enabled, the \GC is invoked at every \GC opportunity:
2019-11-08 09:32:01 +03:00
# all memory and object allocations.
#
# Enabling stress mode will degrade performance, it is only for debugging.
#
# flag can be true, false, or an integer bit-ORed following flags.
# 0x01:: no major GC
# 0x02:: no immediate sweep
# 0x04:: full mark after malloc/calloc/realloc
def self.stress=(flag)
Primitive.gc_stress_set_m flag
2019-11-08 09:32:01 +03:00
end
# call-seq:
# GC.count -> Integer
#
# The number of times \GC occurred.
2019-11-08 09:32:01 +03:00
#
# It returns the number of times \GC occurred since the process started.
2019-11-08 09:32:01 +03:00
def self.count
Primitive.gc_count
2019-11-08 09:32:01 +03:00
end
# call-seq:
# GC.stat -> Hash
# GC.stat(hash) -> Hash
2019-11-08 09:32:01 +03:00
# GC.stat(:key) -> Numeric
#
# Returns a Hash containing information about the \GC.
2019-11-08 09:32:01 +03:00
#
# The contents of the hash are implementation specific and may change in
# the future without notice.
2019-11-08 09:32:01 +03:00
#
# The hash includes information about internal statistics about \GC such as:
#
# [count]
# The total number of garbage collections ran since application start
# (count includes both minor and major garbage collections)
# [time]
# The total time spent in garbage collections (in milliseconds)
# [heap_allocated_pages]
2023-01-05 17:24:32 +03:00
# The total number of +:heap_eden_pages+ + +:heap_tomb_pages+
# [heap_sorted_length]
# The number of pages that can fit into the buffer that holds references to
# all pages
# [heap_allocatable_pages]
# The total number of pages the application could allocate without additional \GC
# [heap_available_slots]
2023-01-05 17:24:32 +03:00
# The total number of slots in all +:heap_allocated_pages+
# [heap_live_slots]
# The total number of slots which contain live objects
# [heap_free_slots]
# The total number of slots which do not contain live objects
# [heap_final_slots]
# The total number of slots with pending finalizers to be run
# [heap_marked_slots]
# The total number of objects marked in the last \GC
# [heap_eden_pages]
# The total number of pages which contain at least one live slot
# [heap_tomb_pages]
# The total number of pages which do not contain any live slots
# [total_allocated_pages]
# The cumulative number of pages allocated since application start
# [total_freed_pages]
# The cumulative number of pages freed since application start
# [total_allocated_objects]
# The cumulative number of objects allocated since application start
# [total_freed_objects]
# The cumulative number of objects freed since application start
# [malloc_increase_bytes]
# Amount of memory allocated on the heap for objects. Decreased by any \GC
# [malloc_increase_bytes_limit]
2023-01-05 17:24:32 +03:00
# When +:malloc_increase_bytes+ crosses this limit, \GC is triggered
# [minor_gc_count]
# The total number of minor garbage collections run since process start
# [major_gc_count]
# The total number of major garbage collections run since process start
# [compact_count]
# The total number of compactions run since process start
# [read_barrier_faults]
# The total number of times the read barrier was triggered during
# compaction
# [total_moved_objects]
# The total number of objects compaction has moved
# [remembered_wb_unprotected_objects]
# The total number of objects without write barriers
# [remembered_wb_unprotected_objects_limit]
2023-01-05 17:24:32 +03:00
# When +:remembered_wb_unprotected_objects+ crosses this limit,
# major \GC is triggered
# [old_objects]
# Number of live, old objects which have survived at least 3 garbage collections
# [old_objects_limit]
2023-01-05 17:24:32 +03:00
# When +:old_objects+ crosses this limit, major \GC is triggered
# [oldmalloc_increase_bytes]
# Amount of memory allocated on the heap for objects. Decreased by major \GC
# [oldmalloc_increase_bytes_limit]
2023-01-05 17:24:32 +03:00
# When +:old_malloc_increase_bytes+ crosses this limit, major \GC is triggered
#
# If the optional argument, hash, is given,
# it is overwritten and returned.
# This is intended to avoid probe effect.
#
# This method is only expected to work on CRuby.
2019-11-08 09:32:01 +03:00
def self.stat hash_or_key = nil
Primitive.gc_stat hash_or_key
2019-11-08 09:32:01 +03:00
end
# call-seq:
# GC.stat_heap -> Hash
# GC.stat_heap(nil, hash) -> Hash
# GC.stat_heap(heap_name) -> Hash
# GC.stat_heap(heap_name, hash) -> Hash
# GC.stat_heap(heap_name, :key) -> Numeric
#
# Returns information for memory pools in the \GC.
#
# If the first optional argument, +heap_name+, is passed in and not +nil+, it
# returns a +Hash+ containing information about the particular memory pool.
# Otherwise, it will return a +Hash+ with memory pool names as keys and
# a +Hash+ containing information about the memory pool as values.
#
# If the second optional argument, +hash_or_key+, is given as +Hash+, it will
# be overwritten and returned. This is intended to avoid the probe effect.
#
# If both optional arguments are passed in and the second optional argument is
# a symbol, it will return a +Numeric+ of the value for the particular memory
# pool.
#
# On CRuby, +heap_name+ is of the type +Integer+ but may be of type +String+
# on other implementations.
#
# The contents of the hash are implementation specific and may change in
# the future without notice.
#
# If the optional argument, hash, is given, it is overwritten and returned.
#
# This method is only expected to work on CRuby.
#
# The hash includes the following keys about the internal information in
# the \GC:
#
# [slot_size]
# The slot size of the heap in bytes.
# [heap_allocatable_pages]
# The number of pages that can be allocated without triggering a new
# garbage collection cycle.
# [heap_eden_pages]
# The number of pages in the eden heap.
# [heap_eden_slots]
# The total number of slots in all of the pages in the eden heap.
# [heap_tomb_pages]
# The number of pages in the tomb heap. The tomb heap only contains pages
# that do not have any live objects.
# [heap_tomb_slots]
# The total number of slots in all of the pages in the tomb heap.
# [total_allocated_pages]
# The total number of pages that have been allocated in the heap.
# [total_freed_pages]
# The total number of pages that have been freed and released back to the
# system in the heap.
# [force_major_gc_count]
# The number of times major garbage collection cycles this size pool has
# forced to start due to running out of free slots.
#
def self.stat_heap heap_name = nil, hash_or_key = nil
Primitive.gc_stat_heap heap_name, hash_or_key
end
2022-12-21 22:52:55 +03:00
# call-seq:
# GC.latest_gc_info -> hash
2019-11-08 09:32:01 +03:00
# GC.latest_gc_info(hash) -> hash
# GC.latest_gc_info(:major_by) -> :malloc
#
# Returns information about the most recent garbage collection.
#
# If the optional argument, hash, is given,
# it is overwritten and returned.
# This is intended to avoid probe effect.
2019-11-08 09:32:01 +03:00
def self.latest_gc_info hash_or_key = nil
Primitive.gc_latest_gc_info hash_or_key
2019-11-08 09:32:01 +03:00
end
if respond_to?(:compact)
# call-seq:
# GC.verify_compaction_references(toward: nil, double_heap: false) -> hash
#
# Verify compaction reference consistency.
#
# This method is implementation specific. During compaction, objects that
# were moved are replaced with T_MOVED objects. No object should have a
# reference to a T_MOVED object after compaction.
#
Add expand_heap option to GC.verify_compaction_references In order to reliably test compaction we need to be able to move objects between size pools. In order for this to happen there must be pages in a size pool into which we can allocate. The existing implementation of `double_heap` only doubled the existing number of pages in the heap, so if a size pool had a low number of pages (or 0) it's not guaranteed that enough space will be created to move objects into that size pool. This commit deprecates the `double_heap` option and replaces it with `expand_heap` instead. expand heap will expand each heap by enough pages to hold a number of slots defined by `GC_HEAP_INIT_SLOTS` or by `heap->total_pags` whichever is larger. If both `double_heap` and `expand_heap` are present, a deprecation warning will be shown for `double_heap` and the `expand_heap` behaviour will take precedence Given that this is an API intended for debugging and testing GC compaction I'm not concerned about the extra memory usage or time taken to create the pages. However, for completeness: Running the following `test.rb` and using `time` on my Macbook Pro shows the following memory usage and time impact: pp "RSS (kb): #{`ps -o rss #{Process.pid}`.lines.last.to_i}" GC.verify_compaction_references(double_heap: true, toward: :empty) pp "RSS (kb): #{`ps -o rss #{Process.pid}`.lines.last.to_i}" ❯ time make run ./miniruby -I./lib -I. -I.ext/common -r./arm64-darwin21-fake ./test.rb "RSS (kb): 24000" <internal:gc>:251: warning: double_heap is deprecated and will be removed "RSS (kb): 25232" ________________________________________________________ Executed in 124.37 millis fish external usr time 82.22 millis 0.09 millis 82.12 millis sys time 28.76 millis 2.61 millis 26.15 millis ❯ time make run ./miniruby -I./lib -I. -I.ext/common -r./arm64-darwin21-fake ./test.rb "RSS (kb): 24000" "RSS (kb): 49040" ________________________________________________________ Executed in 150.13 millis fish external usr time 103.32 millis 0.10 millis 103.22 millis sys time 35.73 millis 2.59 millis 33.14 millis
2022-07-07 23:52:05 +03:00
# This function expands the heap to ensure room to move all objects,
# compacts the heap to make sure everything moves, updates all references,
# then performs a full \GC. If any object contains a reference to a T_MOVED
# object, that object should be pushed on the mark stack, and will
# make a SEGV.
Add expand_heap option to GC.verify_compaction_references In order to reliably test compaction we need to be able to move objects between size pools. In order for this to happen there must be pages in a size pool into which we can allocate. The existing implementation of `double_heap` only doubled the existing number of pages in the heap, so if a size pool had a low number of pages (or 0) it's not guaranteed that enough space will be created to move objects into that size pool. This commit deprecates the `double_heap` option and replaces it with `expand_heap` instead. expand heap will expand each heap by enough pages to hold a number of slots defined by `GC_HEAP_INIT_SLOTS` or by `heap->total_pags` whichever is larger. If both `double_heap` and `expand_heap` are present, a deprecation warning will be shown for `double_heap` and the `expand_heap` behaviour will take precedence Given that this is an API intended for debugging and testing GC compaction I'm not concerned about the extra memory usage or time taken to create the pages. However, for completeness: Running the following `test.rb` and using `time` on my Macbook Pro shows the following memory usage and time impact: pp "RSS (kb): #{`ps -o rss #{Process.pid}`.lines.last.to_i}" GC.verify_compaction_references(double_heap: true, toward: :empty) pp "RSS (kb): #{`ps -o rss #{Process.pid}`.lines.last.to_i}" ❯ time make run ./miniruby -I./lib -I. -I.ext/common -r./arm64-darwin21-fake ./test.rb "RSS (kb): 24000" <internal:gc>:251: warning: double_heap is deprecated and will be removed "RSS (kb): 25232" ________________________________________________________ Executed in 124.37 millis fish external usr time 82.22 millis 0.09 millis 82.12 millis sys time 28.76 millis 2.61 millis 26.15 millis ❯ time make run ./miniruby -I./lib -I. -I.ext/common -r./arm64-darwin21-fake ./test.rb "RSS (kb): 24000" "RSS (kb): 49040" ________________________________________________________ Executed in 150.13 millis fish external usr time 103.32 millis 0.10 millis 103.22 millis sys time 35.73 millis 2.59 millis 33.14 millis
2022-07-07 23:52:05 +03:00
def self.verify_compaction_references(toward: nil, double_heap: false, expand_heap: false)
Primitive.gc_verify_compaction_references(double_heap, expand_heap, toward == :empty)
end
end
# call-seq:
# GC.measure_total_time = true/false
#
# Enable to measure \GC time.
# You can get the result with <tt>GC.stat(:time)</tt>.
# Note that \GC time measurement can cause some performance overhead.
def self.measure_total_time=(flag)
Primitive.cstmt! %{
rb_objspace.flags.measure_gc = RTEST(flag) ? TRUE : FALSE;
return flag;
}
end
# call-seq:
# GC.measure_total_time -> true/false
#
# Return measure_total_time flag (default: +true+).
# Note that measurement can affect the application performance.
def self.measure_total_time
Primitive.cexpr! %{
RBOOL(rb_objspace.flags.measure_gc)
}
end
# call-seq:
# GC.total_time -> int
#
# Return measured \GC total time in nano seconds.
def self.total_time
Primitive.cexpr! %{
ULL2NUM(rb_objspace.profile.marking_time_ns + rb_objspace.profile.sweeping_time_ns)
}
end
2019-11-08 09:32:01 +03:00
end
module ObjectSpace
def garbage_collect full_mark: true, immediate_mark: true, immediate_sweep: true
Primitive.gc_start_internal full_mark, immediate_mark, immediate_sweep, false
2019-11-08 09:32:01 +03:00
end
module_function :garbage_collect
end