Граф коммитов

129 Коммитов

Автор SHA1 Сообщение Дата
Peter Zhu 51bd816517 [Feature #20470] Split GC into gc_impl.c
This commit splits gc.c into two files:

- gc.c now only contains code not specific to Ruby GC. This includes
  code to mark objects (which the GC implementation may choose not to
  use) and wrappers for internal APIs that the implementation may need
  to use (e.g. locking the VM).

- gc_impl.c now contains the implementation of Ruby's GC. This includes
  marking, sweeping, compaction, and statistics. Most importantly,
  gc_impl.c only uses public APIs in Ruby and a limited set of functions
  exposed in gc.c. This allows us to build gc_impl.c independently of
  Ruby and plug Ruby's GC into itself.
2024-07-03 09:03:40 -04:00
Jean Boussier e82138e48a Stabilize TestObjSpace#test_dump_special_consts
The test assumes `:foo` is a static symbol, but that is only true
if a literal `:foo` was parsed before `"foo".to_sym` was evaled:

```ruby
require 'objspace'
foo_sym = "foo".to_sym
puts ObjectSpace.dump(eval(":foo"))
```

```
{"address":"0x100fb46d0", "type":"SYMBOL", "shape_id":10, "slot_size":40, "class":"0x100d3e9c8", "frozen":true, "bytesize":3, "value":"foo", "memsize":40, "flags":{"wb_protected":true, "marking":true, "marked":true}}
```
2024-05-09 12:23:34 +02:00
Nobuyoshi Nakada 91485d7dc6
Adjust indent [ci skip] 2024-05-04 01:15:09 +09:00
Jean Boussier 09d8c99cdc Ensure test suite is compatible with --frozen-string-literal
As preparation for https://bugs.ruby-lang.org/issues/20205
making sure the test suite is compatible with frozen string
literals is making things easier.
2024-03-14 17:56:15 +01:00
Takashi Kokubun 1721bb9dc6 Skip a flaky objspace test on Visual Studio
This seems to happen only on VisualStudio:
https://github.com/ruby/ruby/actions/runs/7130917319/job/19418375386

It fails relatively frequently. Nobody seems actively working on it, so
let's skip it until somebody starts working on it.
2023-12-07 09:42:56 -08:00
Jean Boussier 6391ae9ebc objspace_dump.c: dump call cache ids with dump_append_id
Not all `ID` have an associated string.

Fixes a SEGFAULT in ObjectSpace.dump_all spec.
2023-11-22 10:24:35 +01:00
Jean Boussier a1887f4dc2 Revert "Fix crash caused by concurrent ObjectSpace.dump_all calls"
This reverts commit 9a62fd3cba.
2023-11-13 08:57:57 +01:00
KJ Tsanaktsidis 9a62fd3cba Fix crash caused by concurrent ObjectSpace.dump_all calls
Since the callback defined in the objspace module might give up the GVL,
we need to make sure the right cr->mfd value is set back after the GVL
is re-obtained.
2023-11-12 17:50:37 +01:00
Jean Boussier ea1b1ea1aa String#force_encoding don't clear coderange if encoding is unchanged
Some code out there blind calls `force_encoding` without checking
what the original encoding was, which clears the coderange uselessly.

If the String is big, it can be a rather costly mistake.

For instance the `rack-utf8_sanitizer` gem does this on request
bodies.
2023-11-09 12:38:10 +01:00
John Hawthorn 635b92099e Fix ObjectSpace.dump with super() callinfo
super() uses 0 as mid for its callinfo, so we need to check for that to
avoid a segfault when using dump_all.
2023-10-12 10:22:32 +02:00
Takashi Kokubun e210b899dc
Move the PC regardless of the leaf flag (#8232)
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
2023-08-16 20:28:33 -07:00
Peter Zhu ecbedf9bf1 Remove assumption about object order
The address of objects can't be assumed since a later object may be
allocate in a swept slot.
2023-07-18 14:52:37 -04:00
Peter Zhu e8212c55f9 Fix flaky test in test_objspace.rb
Ensure that the frozen string is promoted to the old generation by
running the GC 4 times.
2023-05-31 14:03:30 -04:00
Nobuyoshi Nakada 4f4bc13eb9
Skip too-complex-shape test which is always flaky regardless JIT 2023-05-21 16:44:10 +09:00
Nobuyoshi Nakada a997f144fb
Skip first if flaky [ci skip] 2023-05-21 10:31:38 +09:00
Takashi Kokubun 875adad948 The too-complex test isn't stablefor RJIT either
https://github.com/ruby/ruby/actions/runs/5020231516
2023-05-19 12:56:15 +09:00
Takashi Kokubun b70e3f44c1 Skip test_dump_too_complex_shape for YJIT for now
It fails too often with YJIT:

* https://github.com/ruby/ruby/actions/runs/5015976941/jobs/8992254690
* https://github.com/ruby/ruby/actions/runs/5017310353/jobs/8995281395
* https://github.com/ruby/ruby/actions/runs/5019625711/jobs/9000322487
* https://github.com/ruby/ruby/actions/runs/5019883965/jobs/9000836915

ref: https://github.com/ruby/ruby/pull/7646
2023-05-19 11:33:19 +09:00
lukeg d74b32db9d
change to test/objectspace, don't rely on Object's shape not being "too complex" 2023-05-18 09:17:24 -07:00
Yusuke Endoh a19fa9b2bd Prevent warning: assigned but unused variable
http://rubyci.s3.amazonaws.com/debian10/ruby-master/log/20230510T123003Z.log.html.gz
```
/home/chkbuild/chkbuild/tmp/build/20230510T123003Z/ruby/test/objspace/test_objspace.rb:224: warning: assigned but unused variable - c4
/home/chkbuild/chkbuild/tmp/build/20230510T123003Z/ruby/test/ruby/test_class.rb:362: warning: assigned but unused variable - e
/home/chkbuild/chkbuild/tmp/build/20230510T123003Z/ruby/test/ruby/test_process.rb:2602: warning: assigned but unused variable - parent_pid
```
2023-05-10 23:49:44 +09:00
Nobuyoshi Nakada e135a21a85
Define `RubyVM::Shape` dependent test only if available 2023-05-04 13:41:30 +09:00
KJ Tsanaktsidis 7bd7aee02e Fix interpreter crash caused by RUBY_INTERNAL_EVENT_NEWOBJ + Ractors
When a Ractor is created whilst a tracepoint for
RUBY_INTERNAL_EVENT_NEWOBJ is active, the interpreter crashes. This is
because during the early setup of the Ractor, the stdio objects are
created, which allocates Ruby objects, which fires the tracepoint.
However, the tracepoint machinery tries to dereference the control frame
(ec->cfp->pc), which isn't set up yet and so crashes with a null pointer
dereference.

Fix this by not firing GC tracepoints if cfp isn't yet set up.
2023-03-09 09:46:14 +01:00
Peter Zhu e1bd45624c Fix crash when allocating classes with newobj hook
We need to zero out the whole slot when running the newobj hook for a
newly allocated class because the slot could be filled with garbage,
which would cause a crash if a GC runs inside of the newobj hook.

For example, the following script crashes:

```
require "objspace"

GC.stress = true

ObjectSpace.trace_object_allocations {
  100.times do
    Class.new
  end
}
```

[Bug #19482]
2023-03-08 08:47:18 -05:00
Peter Zhu 3e09822407 Fix incorrect line numbers in GC hook
If the previous instruction is not a leaf instruction, then the PC was
incremented before the instruction was ran (meaning the currently
executing instruction is actually the previous instruction), so we
should not increment the PC otherwise we will calculate the source
line for the next instruction.

This bug can be reproduced in the following script:

```
require "objspace"

ObjectSpace.trace_object_allocations_start
a =

  1.0 / 0.0
p [ObjectSpace.allocation_sourceline(a), ObjectSpace.allocation_sourcefile(a)]
```

Which outputs: [4, "test.rb"]

This is incorrect because the object was allocated on line 10 and not
line 4. The behaviour is correct when we use a leaf instruction (e.g.
if we replaced `1.0 / 0.0` with `"hello"`), then the output is:
[10, "test.rb"].

[Bug #19456]
2023-02-24 14:10:09 -05:00
Koichi Sasada 8c2b6926d2 Skip unfixed assertion about objspace/dump_all
```
{"address":"0x7f8c03e9fcf0", "type":"STRING", "shape_id":10, "slot_size":40, "class":"0x7f8c00dbed98", "frozen":true, "embedded":true, "fstring":true, "bytesize":5, "value":"TEST2", "encoding":"US-ASCII", "coderange":"7bit", "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x7f8c03e9ffc0", "type":"STRING", "shape_id":0, "slot_size":40, "class":"0x7f8c00dbed98", "embedded":true, "bytesize":5, "value":"TEST2", "encoding":"US-ASCII", "coderange":"7bit", "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x7f8c03e487c0", "type":"STRING", "shape_id":0, "slot_size":40, "class":"0x7f8c00dbed98", "embedded":true, "bytesize":5, "value":"TEST2", "encoding":"UTF-8", "coderange":"unknown", "file":"-", "line":4, "method":"dump_my_heap_please", "generation":1, "memsize":40, "flags":{"wb_protected":true}}
  1) Failure:
TestObjSpace#test_dump_all [/tmp/ruby/src/trunk-gc-asserts/test/objspace/test_objspace.rb:622]:
number of strings.
<2> expected but was
<3>.
```

This failure only occurred on a ruby built with `DEFS=\"-DRGENGC_CHECK_MODE=2\""`
and only on a specific machine (Docker container) and difficult to reproduce,
so skip this failure to check other failures.
2023-01-11 18:09:48 +09:00
Peter Zhu 2056c0a7c6 Add embedded status to dumps of T_OBJECT
This commit adds `"embedded":true` in ObjectSpace.dump for T_OBJECTs
that are embedded.
2023-01-05 16:00:36 -05:00
Koichi Sasada 3931921607 add debug print for the failure
http://ci.rvm.jp/results/trunk-gc-asserts@ruby-sp2-docker/4364584

```
  1) Failure:
TestObjSpace#test_dump_all [/tmp/ruby/src/trunk-gc-asserts/test/objspace/test_objspace.rb:599]:
number of strings.
<2> expected but was
<3>.
```
2022-12-28 15:46:16 +09:00
Jemma Issroff e9ba3042e1 Indicate if a shape is too_complex in ObjectSpace#dump 2022-12-15 13:41:47 -08:00
Jean Boussier 73771e4b19 ObjectSpace.dump_all: dump shapes as well
I see several arguments in doing so.

First they use a non trivial amount of memory, so for various memory
profiling/mapping tools it is relevant to have visibility of the space
occupied by shapes.

Then, some pathological code can create a tons of shape, so it is
valuable to have a way to have a way to observe shapes without having
to compile Ruby with `SHAPE_DEBUG=1`.

And additionally it's likely much faster to dump then this way than
to use `RubyVM::Shape`.

There are however a few open questions:

- Shapes can't respect the `since:` argument. Not sure what to do when
  it is provided. Would probably make sense to not dump them.
- Maybe it would make more sense to have a separate `ObjectSpace.dump_shapes`?
- Maybe instead `dump_all` should take a `shapes: false` argument?

Additionally, `ObjectSpace.dump_shapes` is added for the use case of
debugging the evolution of the shape tree.
2022-12-08 18:46:16 +01:00
Peter Zhu 5f95228c76 Add RVALUE_OVERHEAD and move ractor_belonging_id
This commit adds RVALUE_OVERHEAD for storing metadata at the end of the
slot. This commit moves the ractor_belonging_id in debug builds from the
flags to RVALUE_OVERHEAD which frees the 16 bits in the headers for
object shapes.
2022-11-21 11:26:26 -05:00
Jemma Issroff 5246f4027e Transition shape when object's capacity changes
This commit adds a `capacity` field to shapes, and adds shape
transitions whenever an object's capacity changes. Objects which are
allocated out of a bigger size pool will also make a transition from the
root shape to the shape with the correct capacity for their size pool
when they are allocated.

This commit will allow us to remove numiv from objects completely, and
will also mean we can guarantee that if two objects share shapes, their
IVs are in the same positions (an embedded and extended object cannot
share shapes). This will enable us to implement ivar sets in YJIT using
object shapes.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-11-10 10:11:34 -05:00
Koichi Sasada e35c528d72 push dummy frame for loading process
This patch pushes dummy frames when loading code for the
profiling purpose.

The following methods push a dummy frame:
* `Kernel#require`
* `Kernel#load`
* `RubyVM::InstructionSequence.compile_file`
* `RubyVM::InstructionSequence.load_from_binary`

https://bugs.ruby-lang.org/issues/18559
2022-10-20 17:38:28 +09:00
Aaron Patterson 06abfa5be6
Revert this until we can figure out WB issues or remove shapes from GC
Revert "* expand tabs. [ci skip]"

This reverts commit 830b5b5c35.

Revert "This commit implements the Object Shapes technique in CRuby."

This reverts commit 9ddfd2ca00.
2022-09-26 16:10:11 -07:00
Jemma Issroff 9ddfd2ca00 This commit implements the Object Shapes technique in CRuby.
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects.  Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness").  Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree.  Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects.  See
`RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-26 09:21:30 -07:00
Nobuyoshi Nakada cf7d07570f
Dump non-ASCII char as unsigned
Non-ASCII code may be negative on platforms plain char is signed.
2022-07-22 09:56:48 +09:00
Jean byroot Boussier f0ae583a3d Revert "objspace_dump.c: skip dumping method name if not pure ASCII"
This reverts commit 79406e3600.
2022-07-21 19:56:08 +02:00
Jean Boussier 79406e3600 objspace_dump.c: skip dumping method name if not pure ASCII
Sidekiq has a method named `❨╯°□°❩╯︵┻━┻`which corrupts
heap dumps.

Normally we could just dump is as is since it's valid UTF-8 and need
no escaping. But our code to escape control characters isn't UTF-8
aware so it's more complicated than it seems.

Ultimately since the overwhelming majority of method names are
pure ASCII, it's not a big loss to just skip it.
2022-07-21 18:43:45 +02:00
Jean Boussier 890df5f812 ObjectSpace.dump: Include string coderange
I suspect that some shared pages are invalidated because
some static string don't have their coderange set eagerly.

So the first time they are scanned, the entire memory page is
invalidated.

Being able to see the coderange in `ObjectSpace` would help debug
this.

And in addition `dump` currently call `is_broken_string()`  and `is_ascii_string()`
which both end up scanning the string and assigning coderange. I think it's
undesirable as `dump` should be read only.
2022-07-04 20:04:59 +02:00
Jemma Issroff 87123c4fc7 Refactor test_dump_all to make assertions about the contents of the dumped hash 2022-03-29 08:21:10 -07:00
Peter Zhu fb724a887a Show embed status of array when len is 0 in objspace dump 2022-03-01 10:55:53 -05:00
John Hawthorn 05b1944c53 objspace: Hide identhash containing internal objs
Inside ObjectSpace.reachable_objects_from we keep an internal identhash
in order to de-duplicate reachable objects when wrapping them as
InternalObject. Previously this hash was not hidden, making it possible
to leak references to those internal objects to Ruby if using
ObjectSpace.each_object.

This commit solves this by hiding the hash. To simplify collection of
values, we instead now just use the hash as a set of visited objects,
and collect an Array (not hidden) of values to be returned.
2022-02-09 17:32:43 -08:00
Matt Valentine-House 9fab2c1a1a Add the size pool slot size to the output of ObjectSpace.dump/dump_all 2022-02-03 15:07:35 -05:00
Peter Zhu 7b77d46671 Decouple GC slot sizes from RVALUE
Add a new macro BASE_SLOT_SIZE that determines the slot size.

For Variable Width Allocation (compiled with USE_RVARGC=1), all slot
sizes are powers-of-2 multiples of BASE_SLOT_SIZE.

For USE_RVARGC=0, BASE_SLOT_SIZE is set to sizeof(RVALUE).
2022-02-02 09:52:04 -05:00
Peter Zhu a5b6598192 [Feature #18239] Implement VWA for strings
This commit adds support for embedded strings with variable capacity and
uses Variable Width Allocation to allocate strings.
2021-10-25 13:26:23 -04:00
Yusuke Endoh f210d456a8 test/objspace/test_objspace.rb: check stderr before stdout
When `require "objspace/trace"` fails, previously the failure says:
```
  1) Failure:
TestObjSpace#test_objspace_trace [/tmp/ruby/v3/src/trunk-mjit/test/objspace/test_objspace.rb:621]:
<3> expected but was
<0>.
```
but this is hard to debug.
2021-05-14 18:07:58 +09:00
Yusuke Endoh cf1e1879f1 ext/objspace/lib/objspace/trace.rb: Added
This file, when require'ed, starts tracing the object allocations, and
redefines `Kernel#p` to show the allocation site.

This commit is experimental; the library name and APIs may change.

[Feature #17762]
2021-05-14 13:40:32 +09:00
git 81513c9dab * remove trailing spaces. [ci skip] 2021-05-12 17:40:52 +09:00
Koichi Sasada 523a6998dd Use another class for the comparison.
`memsize_of(Object.new)` can be changed with past ivar creation
history for Object instances (another Object instance has 4 or
more ivars, next created Object instance has the area for the
ivars). So use antoher class for the comparison.
2021-05-12 17:40:31 +09:00
Koichi Sasada 5a6af44e20 skip test for debug.
test_memsize_of_iseq fails on repeat tests and it seems to difficult
to solve immediately. Now this test is skipped.

It seems that the result of `memsize_of(Object.new)` are increased.
Why...?
2021-05-12 12:57:53 +09:00
Ryuta Kamizono 33f2ff3bab Fix some typos by spell checker 2021-04-26 10:07:41 +09:00
Nobuyoshi Nakada 2a02b61fae
Use EnvUtil.under_gc_stress 2021-03-31 22:14:15 +09:00