Граф коммитов

67 Коммитов

Автор SHA1 Сообщение Дата
Nobuyoshi Nakada 42c2c8caa5
Adjust indent [ci skip] 2023-10-23 19:28:14 +09:00
Jean Boussier e5364ea496 rb_shape_transition_shape_capa: use optimal sizes transitions
Previously the growth was 3(embed), 6, 12, 24, ...

With this change it's now 3(embed), 8, 16, 32, 64, ... by default.

However, since power of two isn't the best size for all allocators,
if `malloc_usable_size` is vailable, we use it to discover the best
offset.

On Linux/glibc 2.35 for instance, the growth will be 3(embed), 7, 15, 31
to avoid wasting 8B per object.

Test program:

```c

size_t test(size_t slots) {
    size_t allocated = slots * VALUE_SIZE;
    void *test_ptr = malloc(allocated);
    size_t wasted = malloc_usable_size(test_ptr) - allocated;
    free(test_ptr);
    fprintf(stderr, "slots = %lu, wasted_bytes = %lu\n", slots, wasted);
    return wasted;
}

int main(int argc, char *argv[]) {
    size_t best_padding = 0;
    size_t padding = 0;
    for (padding = 0; padding <= 2; padding++) {
        size_t wasted = test(8 - padding);
        if (wasted == 0) {
            best_padding = padding;
            break;
        }
    }

    size_t index = 0;
    fprintf(stderr, "=============== naive ================\n");

    size_t list_size = 4;
    for (index = 0; index < 10; index++) {
        test(list_size);
        list_size *= 2;
    }

    fprintf(stderr, "=============== auto-padded (-%lu) ================\n", best_padding);

    list_size = 4;
    for (index = 0; index < 10; index ++) {
        test(list_size - best_padding);
        list_size *= 2;
    }

    fprintf(stderr, "\n\n");
    return 0;
}
```

```
===== glibc ======
slots = 8, wasted_bytes = 8
slots = 7, wasted_bytes = 0
=============== naive ================
slots = 4, wasted_bytes = 8
slots = 8, wasted_bytes = 8
slots = 16, wasted_bytes = 8
slots = 32, wasted_bytes = 8
slots = 64, wasted_bytes = 8
slots = 128, wasted_bytes = 8
slots = 256, wasted_bytes = 8
slots = 512, wasted_bytes = 8
slots = 1024, wasted_bytes = 8
slots = 2048, wasted_bytes = 8
=============== auto-padded (-1) ================
slots = 3, wasted_bytes = 0
slots = 7, wasted_bytes = 0
slots = 15, wasted_bytes = 0
slots = 31, wasted_bytes = 0
slots = 63, wasted_bytes = 0
slots = 127, wasted_bytes = 0
slots = 255, wasted_bytes = 0
slots = 511, wasted_bytes = 0
slots = 1023, wasted_bytes = 0
slots = 2047, wasted_bytes = 0
```

```
==========  jemalloc =======
slots = 8, wasted_bytes = 0
=============== naive ================
slots = 4, wasted_bytes = 0
slots = 8, wasted_bytes = 0
slots = 16, wasted_bytes = 0
slots = 32, wasted_bytes = 0
slots = 64, wasted_bytes = 0
slots = 128, wasted_bytes = 0
slots = 256, wasted_bytes = 0
slots = 512, wasted_bytes = 0
slots = 1024, wasted_bytes = 0
slots = 2048, wasted_bytes = 0
=============== auto-padded (-0) ================
slots = 4, wasted_bytes = 0
slots = 8, wasted_bytes = 0
slots = 16, wasted_bytes = 0
slots = 32, wasted_bytes = 0
slots = 64, wasted_bytes = 0
slots = 128, wasted_bytes = 0
slots = 256, wasted_bytes = 0
slots = 512, wasted_bytes = 0
slots = 1024, wasted_bytes = 0
slots = 2048, wasted_bytes = 0
```
2023-10-23 09:33:15 +02:00
Jean Boussier 5cc44f48c5 Refactor rb_shape_transition_shape_capa to not accept capacity
This way the groth factor is encapsulated, which allows
rb_shape_transition_shape_capa to be smarter about ideal sizes.
2023-10-10 14:47:54 +02:00
Nobuyoshi Nakada 8d242a33af
`rb_bug` prints a newline after the message 2023-05-20 21:43:30 +09:00
Jean Boussier 04ee666aab Make the maximum shapes variation warning non-verbose
[Feature #19538]

Since that category is not enabled by default, making it a
verbose warning is redundant. Enabling performance warning should
work with the default verbosity level.
2023-05-03 10:43:46 +02:00
Aaron Patterson 3016f30c95 Return NULL to indicate the next shape isn't found
During compaction we must fix up shapes on objects who were extended but
then became embedded.  `rb_shape_traverse_from_new_root` is supposed to
walk shape trees looking for a matching shape.  When a shape has a
"single child" we weren't returning NULL when the edge names didn't
match.

In the case of a single outgoing edge, this patch returns NULL when the
child edge name doesn't match (similar to the case when a shape has a
hash of outgoing edges)
2023-04-18 16:56:48 -07:00
Peter Zhu 24b137336b Move shape ID to flags for classes on 32 bit
Moves shape ID to FL_USER4 to FL_USER19 for the shape ID on 32 bit
systems. This makes the rb_classext_struct smaller so that it can be
embedded.
2023-04-16 11:06:31 -04:00
Jean Boussier ac123f167a Emit a performance warning when a class reached max variations
[Feature #19538]

This new `peformance` warning category is disabled by default.
It needs to be specifically enabled via `-W:performance` or `Warning[:performance] = true`
2023-04-13 16:36:17 +02:00
Matt Valentine-House 026321c5b9 [Feature #19474] Refactor NEWOBJ macros
NEWOBJ_OF is now our canonical newobj macro. It takes an optional ec
2023-04-06 11:07:16 +01:00
Matt Valentine-House d91a82850a Pull the shape tree out of the vm object 2023-04-06 11:07:16 +01:00
Aaron Patterson 7c307e0379 Lazily allocate id tables for children
This patch lazily allocates id tables for shape children.  If a shape
has only one single child, it tags the child with a bit.  When we read
children, if the id table has the bit set, we know it's a single child.
If we need to add more children, then we create a new table and evacuate
the child to the new table.

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2023-03-22 12:50:42 -07:00
Aaron Patterson 0519741702 pull child allocation in to a different function 2023-03-22 12:50:42 -07:00
Aaron Patterson 999ccb2b6b combine allocation functions 2023-03-22 12:50:42 -07:00
Aaron Patterson e055c0c716 Make shape functions static
These functions don't need to be in the header file, we can declare them
as static.
2023-03-22 12:50:42 -07:00
Aaron Patterson 1a9e2d20e2 Fix shape allocation limits
We can only allocate enough shapes to fit in the shape buffer.
MAX_SHAPE_ID was based on the theoretical maximum number of shapes we
could have, not on the amount of memory we can actually consume.  This
commit changes the MAX_SHAPE_ID to be based on the amount of memory
we're allowed to consume.

Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
2023-03-22 08:46:12 -07:00
Peter Zhu cb22d78354 Fix frozen status loss when moving objects
[Bug #19536]

When objects are moved between size pools, their frozen status is lost
in the shape. This will cause the frozen check to be bypassed when there
is an inline cache. For example, the following script should raise a
FrozenError, but doesn't on Ruby 3.2 and master.

    class A
      def add_ivars
        @a = @b = @c = @d = 1
      end

      def set_a
        @a = 10
      end
    end

    a = A.new
    a.add_ivars
    a.freeze

    b = A.new
    b.add_ivars
    b.set_a # Set the inline cache in set_a

    GC.verify_compaction_references(expand_heap: true, toward: :empty)

    a.set_a
2023-03-18 09:07:05 -04:00
Aaron Patterson 365fed6369
Revert "Allow classes and modules to become too complex"
This reverts commit 69465df424.
2023-03-10 08:50:43 -08:00
HParker 69465df424 Allow classes and modules to become too complex
This makes the behavior of classes and modules when there are too many instance variables match the behavior of objects with too many instance variables.
2023-03-09 15:34:49 -08:00
Takashi Kokubun 50a709fb9e Resurrect symbols used by ObjectSpace 2023-03-06 21:59:23 -08:00
Takashi Kokubun 233ddfac54 Stop exporting symbols for MJIT 2023-03-06 21:59:23 -08:00
Haldun Bayhantopcu b03b251aa4 Handle all non-object type objects 2023-02-15 15:43:46 -08:00
Haldun Bayhantopcu 0b4b2cd1ee Fix removing ivars from clases and modules.
Co-authored-by: Adam Hess <hparker@github.com>
2023-02-15 15:43:46 -08:00
Matt Valentine-House 72aba64fff Merge gc.h and internal/gc.h
[Feature #19425]
2023-02-09 10:32:29 -05:00
Jemma Issroff 28da990984 Limit maximum number of IVs on a shape on T_OBJECTS
Create SHAPE_MAX_NUM_IVS (currently 50) and limit all shapes of
T_OBJECTS to that number of IVs. When a shape with a T_OBJECT has more than 50 IVs, fall back to the
obj_too_complex shape which uses hash lookup for ivs.

Note that a previous version of this commit
78fcc9847a was reverted in
88f2b94065 because it did not account for
non-T_OBJECTS
2023-02-06 08:40:51 -08:00
Peter Zhu c4cc3be195 Remove dead code in shapes.c and shapes.h 2023-01-30 14:55:20 -05:00
Aaron Patterson 88f2b94065
Revert "Limit maximum number of IVs on a shape"
This reverts commit 78fcc9847a.
2023-01-26 11:04:55 -05:00
Jemma Issroff 78fcc9847a Limit maximum number of IVs on a shape
Create SHAPE_MAX_NUM_IVS (currently 50) and limit all shapes to that
number of IVs. When a shape has more than 50 IVs, fallback to the
obj_too_complex shape which uses hash lookup for ivs.
2023-01-25 14:48:28 -05:00
Peter Zhu 273dca3aed Fix undefined behavior in shape.c
Under strict aliasing, writing to the memory location of a different
type is not allowed and will result in undefined behavior. This was
happening in shape.c due to `rb_id_table_lookup` writing to the memory
location of `VALUE *` that was casted from a `rb_shape_t **`.

This was causing test failures when compiled with LTO.

Fixes [Bug #19248]

Co-Authored-By: Alan Wu <alanwu@ruby-lang.org>
2023-01-05 13:14:11 -05:00
Takashi Kokubun c566c968f9 Hide RubyVM::Shape's interface as much as possible [ci skip]
RubyVM::Shape is usually not available (you need SHAPE_DEBUG macro,
which is not defined by default). So it seems confusing to leave
RubyVM::Shape in the document.

This hides only method definitions because, well, I can't find a way to
hide things defined by rb_define_const or rb_struct_define_under. I gave
up making the C-based documentation right. You should define things in
Ruby instead.
2022-12-22 15:06:37 -08:00
Jemma Issroff 297df92407 Clean up Ruby Shape API
Make printing shapes better, use a struct instead of specific methods
for each field on a shape.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-12-16 13:27:45 -05:00
Matt Valentine-House bfc66e07b7 Fix Object Movement allocation in GC
When moving Objects between size pools we have to assign a new shape.

This happened during updating references - we tried to create a new shape
tree that mirrored the existing tree, but based on the root shape of the
new size pool.

This causes allocations to happen if the new tree doesn't already exist,
potentially triggering a GC, during GC.

This commit changes object movement to look for a pre-existing new tree
during object movement, and if that tree does not exist, we don't move
the object to the new pool.

This allows us to remove the shape allocation from update references.

Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2022-12-15 15:27:38 -05:00
Jemma Issroff c1ab6ddc9a Transition complex objects to "too complex" shape
When an object becomes "too complex" (in other words it has too many
variations in the shape tree), we transition it to use a "too complex"
shape and use a hash for storing instance variables.

Without this patch, there were rare cases where shape tree growth could
"explode" and cause performance degradation on what would otherwise have
been cached fast paths.

This patch puts a limit on shape tree growth, and gracefully degrades in
the rare case where there could be a factorial growth in the shape tree.

For example:

```ruby
class NG; end

HUGE_NUMBER.times do
  NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1)
end
```

We consider objects to be "too complex" when the object's class has more
than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and
the object introduces a new variation (a new leaf node) associated with
that class.

For example, new variations on instances of the following class would be
considered "too complex" because those instances create more than 8
leaves in the shape tree:

```ruby
class Foo; end
9.times { Foo.new.instance_variable_set(":@uniq_#{_1}", 1) }
```

However, the following class is *not* too complex because it only has
one leaf in the shape tree:

```ruby
class Foo
  def initialize
    @a = @b = @c = @d = @e = @f = @g = @h = @i = nil
  end
end
9.times { Foo.new }
``

This case is rare, so we don't expect this change to impact performance
of most applications, but it needs to be handled.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-12-15 10:06:04 -08:00
Jemma Issroff a3d552aedd Add variation_count on classes
Count how many "variations" each class creates. A "variation" is a a
unique ordering of instance variables on a particular class. This can
also be thought of as a branch in the shape tree.

For example, the following Foo class will have 2 variations:

```ruby
class Foo ; end

Foo.new.instance_variable_set(:@a, 1) # case 1: creates one variation
Foo.new.instance_variable_set(:@b, 1) # case 2: creates another variation

foo = Foo.new
foo.instance_variable_set(:@a, 1) # does not create a new variation
foo.instance_variable_set(:@b, 1) # does not create a new variation (a continuation of the variation in case 1)
```

We will use this number to limit the amount of shapes that a class can
create and fallback to using a hash iv lookup.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-12-15 10:06:04 -08:00
Peter Zhu f50aa19da6 Revert "Fix Object Movement allocation in GC"
This reverts commit 9c54466e29.

We're seeing crashes in Shopify CI after this commit.
2022-12-15 12:00:30 -05:00
Matt Valentine-House 9c54466e29 Fix Object Movement allocation in GC
When moving Objects between size pools we have to assign a new shape.

This happened during updating references - we tried to create a new shape
tree that mirrored the existing tree, but based on the root shape of the
new size pool.

This causes allocations to happen if the new tree doesn't already exist,
potentially triggering a GC, during GC.

This commit changes object movement to look for a pre-existing new tree
during object movement, and if that tree does not exist, we don't move
the object to the new pool.

This allows us to remove the shape allocation from update references.

Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2022-12-15 09:04:30 -05:00
Peter Zhu 7a63114f8e Remove dead code in get_next_shape_internal
If the rb_id_table_lookup fails, then res is not updated so it cannot be
any value other than null.
2022-12-14 13:21:46 -05:00
Jemma Issroff 12003acbb9 Update shape capacity when removing ivar and rewriting shape transitions
Since edc7af48ac, we now no longer have
undef ivar transitions. Instead, we rebuild the shapes table. When we do
this, we need to ensure that we retain our capacities on shapes.
2022-12-10 16:10:21 +01:00
Jean Boussier 73771e4b19 ObjectSpace.dump_all: dump shapes as well
I see several arguments in doing so.

First they use a non trivial amount of memory, so for various memory
profiling/mapping tools it is relevant to have visibility of the space
occupied by shapes.

Then, some pathological code can create a tons of shape, so it is
valuable to have a way to have a way to observe shapes without having
to compile Ruby with `SHAPE_DEBUG=1`.

And additionally it's likely much faster to dump then this way than
to use `RubyVM::Shape`.

There are however a few open questions:

- Shapes can't respect the `since:` argument. Not sure what to do when
  it is provided. Would probably make sense to not dump them.
- Maybe it would make more sense to have a separate `ObjectSpace.dump_shapes`?
- Maybe instead `dump_all` should take a `shapes: false` argument?

Additionally, `ObjectSpace.dump_shapes` is added for the use case of
debugging the evolution of the shape tree.
2022-12-08 18:46:16 +01:00
Aaron Patterson edc7af48ac Stop transitioning to UNDEF when undefining an instance variable
Cases like this:

```ruby
obj = Object.new
loop do
  obj.instance_variable_set(:@foo, 1)
  obj.remove_instance_variable(:@foo)
end
```

can cause us to use many more shapes than we want (and even run out).
This commit changes the code such that when an instance variable is
removed, we'll walk up the shape tree, find the shape, then rebuild any
child nodes that happened to be below the "targetted for removal" IV.

This also requires moving any instance variables so that indexes derived
from the shape tree will work correctly.

Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
Co-authored-by: John Hawthorn <jhawthorn@github.com>
2022-12-07 09:57:11 -08:00
Jemma Issroff e7642d8095
YJIT: Extract SHAPE_ID_NUM_BITS into a constant (#6863) 2022-12-05 13:20:11 -08:00
Jemma Issroff 41bacd9b0d Remove unused rb_shape_flag_shift and rb_shape_flag_mask 2022-12-02 12:53:51 -08:00
Jemma Issroff 4c5e89791b Extracted rb_shape_id_offset 2022-12-02 12:53:51 -08:00
Aaron Patterson 17f9bcd7d7 implement IV writes 2022-12-02 12:53:51 -08:00
John Hawthorn f0cf70c840 Add a macro for SHAPE_DEBUG
Like before, default to VM_CHECK_MODE > 0, but this allows just enabling
shape debug helpers without the rest of VM_CHECK_MODE.
2022-12-01 15:37:15 -08:00
Peter Zhu 1f0888ab3e Speed up shape transitions
This commit significantly speeds up shape transitions as it changes
get_next_shape_internal to not perform a lookup (and instead require
the caller to perform the lookup). This avoids double lookups during
shape transitions.

There is a significant (~2x) speedup in the following micro-benchmark:

    puts(Benchmark.measure do
      o = Object.new

      100_000.times do |i|
        o.instance_variable_set(:"@a#{i}", 0)
      end
    end)

Before:

    22.393194   0.201639  22.594833 ( 22.684237)

After:

    11.323086   0.022284  11.345370 ( 11.389346)
2022-11-21 10:22:29 -05:00
Aaron Patterson 9e067df76b 32 bit comparison on shape id
This commit changes the shape id comparisons to use a 32 bit comparison
rather than 64 bit.  That means we don't need to load the shape id to a
register on x86 machines.

Given the following program:

```ruby
class Foo
  def initialize
    @foo = 1
    @bar = 1
  end

  def read
    [@foo, @bar]
  end
end

foo = Foo.new
foo.read
foo.read
foo.read
foo.read
foo.read

puts RubyVM::YJIT.disasm(Foo.instance_method(:read))
```

The machine code we generated _before_ this change is like this:

```
== BLOCK 1/4, ISEQ RANGE [0,3), 65 bytes ======================
  # getinstancevariable
  0x559a18623023: mov rax, qword ptr [r13 + 0x18]
  # guard object is heap
  0x559a18623027: test al, 7
  0x559a1862302a: jne 0x559a1862502d
  0x559a18623030: cmp rax, 4
  0x559a18623034: jbe 0x559a1862502d
  # guard shape, embedded, and T_OBJECT
  0x559a1862303a: mov rcx, qword ptr [rax]
  0x559a1862303d: movabs r11, 0xffff00000000201f
  0x559a18623047: and rcx, r11
  0x559a1862304a: movabs r11, 0xb000000002001
  0x559a18623054: cmp rcx, r11
  0x559a18623057: jne 0x559a18625046
  0x559a1862305d: mov rax, qword ptr [rax + 0x18]
  0x559a18623061: mov qword ptr [rbx], rax

== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 47 bytes ======================
  # gen_direct_jmp: fallthrough
  # getinstancevariable
  # regenerate_branch
  # getinstancevariable
  # regenerate_branch
  0x559a18623064: mov rax, qword ptr [r13 + 0x18]
  # guard shape, embedded, and T_OBJECT
  0x559a18623068: mov rcx, qword ptr [rax]
  0x559a1862306b: movabs r11, 0xffff00000000201f
  0x559a18623075: and rcx, r11
  0x559a18623078: movabs r11, 0xb000000002001
  0x559a18623082: cmp rcx, r11
  0x559a18623085: jne 0x559a18625099
  0x559a1862308b: mov rax, qword ptr [rax + 0x20]
  0x559a1862308f: mov qword ptr [rbx + 8], rax
```

After this change, it's like this:

```
== BLOCK 1/4, ISEQ RANGE [0,3), 41 bytes ======================
  # getinstancevariable
  0x5560c986d023: mov rax, qword ptr [r13 + 0x18]
  # guard object is heap
  0x5560c986d027: test al, 7
  0x5560c986d02a: jne 0x5560c986f02d
  0x5560c986d030: cmp rax, 4
  0x5560c986d034: jbe 0x5560c986f02d
  # guard shape
  0x5560c986d03a: cmp word ptr [rax + 6], 0x19
  0x5560c986d03f: jne 0x5560c986f046
  0x5560c986d045: mov rax, qword ptr [rax + 0x10]
  0x5560c986d049: mov qword ptr [rbx], rax

== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 23 bytes ======================
  # gen_direct_jmp: fallthrough
  # getinstancevariable
  # regenerate_branch
  # getinstancevariable
  # regenerate_branch
  0x5560c986d04c: mov rax, qword ptr [r13 + 0x18]
  # guard shape
  0x5560c986d050: cmp word ptr [rax + 6], 0x19
  0x5560c986d055: jne 0x5560c986f099
  0x5560c986d05b: mov rax, qword ptr [rax + 0x18]
  0x5560c986d05f: mov qword ptr [rbx + 8], rax
```

The first ivar read is a bit more complex, but the second ivar read is
much simpler.  I think eventually we could teach the context about the
shape, then emit only one shape guard.
2022-11-18 12:04:10 -08:00
Aaron Patterson 6582f34831 rename SHAPE_BITS to SHAPE_ID_NUM_BITS 2022-11-18 12:04:10 -08:00
Aaron Patterson 10788166e7 Differentiate T_OBJECT shapes from other objects
We would like to differentiate types of objects via their shape.  This
commit adds a special T_OBJECT shape when we allocate an instance of
T_OBJECT.  This allows us to avoid testing whether an object is an
instance of a T_OBJECT or not, we can just check the shape.
2022-11-18 08:31:56 -08:00
Peter Zhu 4b29eb17f2 Fix indentation of switch statement in shape.c 2022-11-17 14:43:46 -05:00
Peter Zhu 5dcbe58833 Fix buffer overrun in ivars when rebuilding shapes
In rb_shape_rebuild_shape, we need to increase the capacity when
capacity == next_iv_index since the next ivar will be writing at index
next_iv_index.

This bug can be reproduced when assertions are turned on and you run the
following code:

    class Foo
      def initialize
        @a1 = 1
        @a2 = 1
        @a3 = 1
        @a4 = 1
        @a5 = 1
        @a6 = 1
        @a7 = 1
      end

      def add_ivars
        @a8 = 1
        @a9 = 1
      end
    end

    class Bar < Foo
    end

    foo = Foo.new
    foo.add_ivars
    bar = Bar.new
    GC.start
    bar.add_ivars
    bar.clone

You will get the following crash:

    Assertion Failed: object.c:301:rb_obj_copy_ivar:src_num_ivs <= shape_to_set_on_dest->capacity
2022-11-15 08:53:46 -05:00