This feature provides a new method `GC.config` that configures internal
GC configuration variables provided by an individual GC implementation.
Implemented in this PR is the option `full_mark`: a boolean value that
will determine whether the Ruby GC is allowed to run a major collection
while the process is running.
It has the following semantics
This feature configures Ruby's GC to only run minor GC's. It's designed
to give users relying on Out of Band GC complete control over when a
major GC is run. Configuring `full_mark: false` does two main things:
* Never runs a Major GC. When the heap runs out of space during a minor
and when a major would traditionally be run, instead we allocate more
heap pages, and mark objspace as needing a major GC.
* Don't increment object ages. We don't promote objects during GC, this
will cause every object to be scanned on every minor. This is an
intentional trade-off between minor GC's doing more work every time,
and potentially promoting objects that will then never be GC'd.
The intention behind not aging objects is that users of this feature
should use a preforking web server, or some other method of pre-warming
the oldgen (like Nakayoshi fork)before disabling Majors. That way most
objects that are going to be old will have already been promoted.
This will interleave major and minor GC collections in exactly the same
what that the Ruby GC runs in versions previously to this. This is the
default behaviour.
* This new method has the following extra semantics:
- `GC.config` with no arguments returns a hash of the keys of the
currently configured GC
- `GC.config` with a key pair (eg. `GC.config(full_mark: true)` sets
the matching config key to the corresponding value and returns the
entire known config hash, including the new values. If the key does
not exist, `nil` is returned
* When a minor GC is run, Ruby sets an internal status flag to determine
whether the next GC will be a major or a minor. When `full_mark:
false` this flag is ignored and every GC will be a minor.
This status flag can be accessed at
`GC.latest_gc_info(:needs_major_by)`. Any value other than `nil` means
that the next collection would have been a major.
Thus it's possible to use this feature to check at a predetermined
time, whether a major GC is necessary and run one if it is. eg. After
a request has finished processing.
```ruby
if GC.latest_gc_info(:needs_major_by)
GC.start(full_mark: true)
end
```
[Feature #20443]
This feature is no longer possible under current design; now that our GC
is pluggable, we cannot assume what was achieved by this compiler flag
is always possble by the dynamically-loaded GC implementation.
Since dln.c is replaced with dmydln.c for miniruby, we cannot load shared
GC for miniruby. This means that many tests do not have the shared GC
loaded and many tests fail because of the warning that emits in miniruby.
This commit changes shared GC to directly use dlopen for loading the
shared GC.
This commit changes the external GC API to use `--with-shared-gc=DIR` at
configure time with a directory of the external GC and uses
`RUBY_GC_LIBRARY` environment variable to load the external GC at
runtime.
This commit splits gc.c into two files:
- gc.c now only contains code not specific to Ruby GC. This includes
code to mark objects (which the GC implementation may choose not to
use) and wrappers for internal APIs that the implementation may need
to use (e.g. locking the VM).
- gc_impl.c now contains the implementation of Ruby's GC. This includes
marking, sweeping, compaction, and statistics. Most importantly,
gc_impl.c only uses public APIs in Ruby and a limited set of functions
exposed in gc.c. This allows us to build gc_impl.c independently of
Ruby and plug Ruby's GC into itself.
This commit changes the external GC to be loaded with the `--gc-library`
command line argument instead of the RUBY_GC_LIBRARY_PATH environment
variable because @nobu pointed out that loading binaries using environment
variables can pose a security risk.
`check_rvalue_consistency` uses bitmap and `RVALUE_WB_UNPROTECTED`
etc calls `check_rvalue_consistency` again.
also `rb_raw_obj_info_common()` is called from `check_rvalue_consistency`
so it should not use call `check_rvalue_consistency`.
Dynamic symbols point to a fstring. When we free the symbol, we hash the
fstring to remove it from the table. However, the fstring could have
already been freed, which can cause a crash.
This commit changes it to remove the reference to the fstring before
freeing the symbol so we can avoid this crash.
Now that we're using the inline predicate functions everywhere, the only
remaining use of the RVALUE_?_BITMAP macros is inside their respective
inline function, so we can remove them.
This function loops twice through the array of size pools. Once to set
up the pages list, and then again later on in the function to set the
allocatable_pages count.
We don't do anything with the size pools in between the invocation of
these loops that will affect the size pools, so this commit removes the
second loop and moves the allocatable_pages count update into the first
loop.
Since we cannot read the rvalue_overhead out of the RVALUE anyways (since
it is at the end of the slot), it makes more sense to move it out of
the RVALUE.
miniruby is used by tool/runruby.rb, so we need to ensure we don't rb_bug
when RUBY_GC_LIBRARY_PATH is set so we can run tests using the make
commands. This commit changes it to warn instead.
The error message is often long, so using a small buffer could cause it
to be truncated. rb_bug also has a 256 byte message buffer, so it could
also be truncated.
Before, if dln_open failed to open RUBY_GC_LIBRARY_PATH, it would segfault
because it would try to raise an error, which cannot happen because the
GC has not been initialized yet.
This commit changes dln_open to return the error that occurred so the
caller can handle the error.
When we're searching for CCs, compare the argc and flags for CI rather
than comparing pointers. This means we don't need to store a reference
to the CI, and it also naturally "de-duplicates" CC objects.
We can observe the effect with the following code:
```ruby
require "objspace"
hash = {}
p ObjectSpace.memsize_of(Hash)
eval ("a".."zzz").map { |key|
"hash.merge(:#{key} => 1)"
}.join("; ")
p ObjectSpace.memsize_of(Hash)
```
On master:
```
$ ruby -v test.rb
ruby 3.4.0dev (2024-04-15T16:21:41Z master d019b3baec) [arm64-darwin23]
test.rb:3: warning: assigned but unused variable - hash
3424
527736
```
On this branch:
```
$ make runruby
compiling vm.c
linking miniruby
builtin_binary.inc updated
compiling builtin.c
linking static-library libruby.3.4-static.a
ln -sf ../../rbconfig.rb .ext/arm64-darwin23/rbconfig.rb
linking ruby
ld: warning: ignoring duplicate libraries: '-ldl', '-lobjc', '-lpthread'
RUBY_ON_BUG='gdb -x ./.gdbinit -p' ./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems ./test.rb
2240
2368
```
Co-authored-by: John Hawthorn <jhawthorn@github.com>
RGENGC_CHECK_MODE >=3 fails with an incinsistency in the old object
count during ec_finalization.
This is due to inconsistency introduced to the object graph using T_DATA
finalizers.
This is explained in commit 79df14c04b,
which disabled gc during finalization to work around this.
```
/* prohibit GC because force T_DATA finalizers can break an object graph consistency */
dont_gc_on()
```
This object graph inconsistency also seems to break RGENGC_CHECK_MODE >=
3, when it attempt to verify the object age relationships during
finalization at VM shutdown (gc_enter is called during finalization).
This commit stops the internal consistency check during gc_enter only
when RGENGC_CHECK_MODE >= 3 and when gc is disabled.
This fixes `make btest` with `-DRGENGC_CHECK_MODE=3`