Граф коммитов

87 Коммитов

Автор SHA1 Сообщение Дата
Jean Boussier d898e8d6f8 Refactor rb_shape_transition_shape_capa out
Right now the `rb_shape_get_next` shape caller need to
first check if there is capacity left, and if not call
`rb_shape_transition_shape_capa` before it can call `rb_shape_get_next`.

And on each of these it needs to checks if we got a TOO_COMPLEX
back.

All this logic is duplicated in the interpreter, YJIT and RJIT.

Instead we can have `rb_shape_get_next` do the capacity transition
when needed. The caller can compare the old and new shapes capacity
to know if resizing is needed. It also can check for TOO_COMPLEX
only once.
2023-11-08 11:02:55 +01:00
Jean Boussier b92b9e1e9e vm_getivar: assume the cached shape_id like have a common ancestor
When an inline cache misses, it is very likely that the stale shape_id
and the current instance shape_id have a close common ancestor.

For example if the instance variable is sometimes frozen sometimes
not, one of the two shape will be the direct parent of the other.

Another pattern that commonly cause IC misses is "memoization",
in such case the object will have a "base common shape" and then
a number of close descendants.

In addition, when we find a common ancestor, we store it in the
inline cache instead of the current shape. This help prevent the
cache from flip-flopping, ensuring the next lookup will be marginally
faster and more generally avoid writing in memory too much.

However, now that shapes have an ancestors index, we only check
for a few ancestors before falling back to use the index.

So overall this change speeds up what is assumed to be the more common
case, but makes what is assumed to be the less common case a bit slower.

```
compare-ruby: ruby 3.3.0dev (2023-10-26T05:30:17Z master 701ca070b4) [arm64-darwin22]
built-ruby: ruby 3.3.0dev (2023-10-26T09:25:09Z shapes_double_sear.. a723a85235) [arm64-darwin22]
warming up......

|                                     |compare-ruby|built-ruby|
|:------------------------------------|-----------:|---------:|
|vm_ivar_stable_shape                 |     11.672M|   11.679M|
|                                     |           -|     1.00x|
|vm_ivar_memoize_unstable_shape       |      7.551M|   10.506M|
|                                     |           -|     1.39x|
|vm_ivar_memoize_unstable_shape_miss  |     11.591M|   11.624M|
|                                     |           -|     1.00x|
|vm_ivar_unstable_undef               |      9.037M|    7.981M|
|                                     |       1.13x|         -|
|vm_ivar_divergent_shape              |      8.034M|    6.657M|
|                                     |       1.21x|         -|
|vm_ivar_divergent_shape_imbalanced   |     10.471M|    9.231M|
|                                     |       1.13x|         -|
```

Co-Authored-By: John Hawthorn <john@hawthorn.email>
2023-11-03 12:47:43 +01:00
Peter Zhu 38ba040d8b Make every initial size pool shape a root shape
This commit makes every initial size pool shape a root shape and assigns
it a capacity of 0.
2023-11-02 13:42:11 -04:00
Jean Boussier 33795931a0 Better handle running out of shapes in remove_shape_recursive 2023-11-02 12:00:42 +01:00
Jean Boussier b77148ae9f remove_instance_variable: Handle running out of shapes
`remove_shape_recursive` wasn't considering that if we run out of
shapes, it might have to transition to SHAPE_TOO_COMPLEX.

When this happens, we now return with an error and the caller
initiates the evacuation.
2023-11-01 15:21:55 +01:00
Peter Zhu e2d950733e Add ST table to gen_ivtbl for complex shapes
On 32-bit systems, we must store the shape ID in the gen_ivtbl to not
lose the shape. If we directly store the ST table into the generic
ivar table, then we lose the shape. This makes it impossible to
determine the shape of the object and whether it is too complex or not.
2023-10-31 12:07:54 -04:00
Jean Boussier 4aacc559d9 Handle running out of shapes in `Object#dup`
There is a handful of call sites where we may transition to
OBJ_TOO_COMPLEX_SHAPE if we just ran out of shapes, but that
weren't handling it properly.
2023-10-31 12:07:54 -04:00
Jean Boussier 4aee6931c3 Make get_next_shape_internal idempotent
Since the check for MAX_SHAPE_ID was done before even checking
if the transition we're looking for even exists, as soon as the
max shape is reached, get_next_shape_internal would always return
`TOO_COMPLEX` regardless of whether the transition we're looking
for already exist or not.

In addition to entirely de-optimize all newly created objects, it
also made an assertion fail in `vm_setivar`:

```
vm_setivar:rb_shape_get_next_iv_shape(rb_shape_get_shape_by_id(source_shape_id), id) == dest_shape
```
2023-10-27 21:09:03 +02:00
Aaron Patterson bbf1d621ba Decrease redblack cache / shape size in debug
When running tests in debug mode, we have tests that try to exhaust the
space used for shapes and the redblack cache.  However, this can cause
Out of Memory issues on some machines, so this commit decreases the
cache sizes when RUBY_DEBUG is enabled
2023-10-26 14:49:42 -07:00
Jean Boussier 8e62596e38
Move some defines from shape.h to shape.c
If they are only used there, we might as well not expose them.
2023-10-26 13:07:08 -07:00
Aaron Patterson d8cb827f39 Remove SHAPE_MAX_NUM_IVS
There is no longer a limit on the number of IVs you can store.
SHAPE_MAX_NUM_IVS was used to work around the IV10K problem (the well
known problem where setting 10k instance variables in a row would be too
slow).  The redblack tree works well at any shape depth, even depths
greater than 80, and solves the IV10K problem.
2023-10-24 14:23:17 -07:00
Aaron Patterson afae8df373 `get_next_shape_internal` should always return a shape
If it runs out of shapes, or new variations aren't allowed, it will
return "too complex"
2023-10-24 14:23:17 -07:00
Aaron Patterson cfd7c1a276 Allow the shape tree to be traversed
This commit allows the shape tree to be traversed to locate an existing
shape, but it doesn't necessarily allow you to create new variations.
2023-10-24 14:23:17 -07:00
Aaron Patterson 3760baccac Remove new_shape_necessary code
We always create new shapes until we run out!
2023-10-24 14:23:17 -07:00
Aaron Patterson 702b8e3107 golf down ancestor caching 2023-10-24 14:23:17 -07:00
Aaron Patterson e71f343a99 Addressing feedback 2023-10-24 10:52:06 -07:00
Aaron Patterson 54230dea1b Don't cache on platforms without mmap
We're only going to create a redblack tree on platforms that have mmap
2023-10-24 10:52:06 -07:00
Aaron Patterson a3f66e09f6 geniv objects can become too complex 2023-10-24 10:52:06 -07:00
Aaron Patterson caf6a72348 remove IV limit / support complex shapes on classes 2023-10-24 10:52:06 -07:00
Aaron Patterson 84e4453436 Use a functional red-black tree for indexing the shapes
This is an experimental commit that uses a functional red-black tree to
create an index of the ancestor shapes.  It uses an Okasaki style
functional red black tree:

  https://www.cs.tufts.edu/comp/150FP/archive/chris-okasaki/redblack99.pdf

This tree is advantageous because:

* It offers O(n log n) insertions and O(n log n) lookups.
* It shares memory with previous "versions" of the tree

When we insert a node in the tree, only the parts of the tree that need
to be rebalanced are newly allocated.  Parts of the tree that don't need
to be rebalanced are not reallocated, so "new trees" are able to share
memory with old trees.  This is in contrast to a sorted set where we
would have to duplicate the set, and also resort the set on each
insertion.

I've added a new stat to RubyVM.stat so we can understand how the red
black tree increases.
2023-10-24 10:52:06 -07:00
Nobuyoshi Nakada 42c2c8caa5
Adjust indent [ci skip] 2023-10-23 19:28:14 +09:00
Jean Boussier e5364ea496 rb_shape_transition_shape_capa: use optimal sizes transitions
Previously the growth was 3(embed), 6, 12, 24, ...

With this change it's now 3(embed), 8, 16, 32, 64, ... by default.

However, since power of two isn't the best size for all allocators,
if `malloc_usable_size` is vailable, we use it to discover the best
offset.

On Linux/glibc 2.35 for instance, the growth will be 3(embed), 7, 15, 31
to avoid wasting 8B per object.

Test program:

```c

size_t test(size_t slots) {
    size_t allocated = slots * VALUE_SIZE;
    void *test_ptr = malloc(allocated);
    size_t wasted = malloc_usable_size(test_ptr) - allocated;
    free(test_ptr);
    fprintf(stderr, "slots = %lu, wasted_bytes = %lu\n", slots, wasted);
    return wasted;
}

int main(int argc, char *argv[]) {
    size_t best_padding = 0;
    size_t padding = 0;
    for (padding = 0; padding <= 2; padding++) {
        size_t wasted = test(8 - padding);
        if (wasted == 0) {
            best_padding = padding;
            break;
        }
    }

    size_t index = 0;
    fprintf(stderr, "=============== naive ================\n");

    size_t list_size = 4;
    for (index = 0; index < 10; index++) {
        test(list_size);
        list_size *= 2;
    }

    fprintf(stderr, "=============== auto-padded (-%lu) ================\n", best_padding);

    list_size = 4;
    for (index = 0; index < 10; index ++) {
        test(list_size - best_padding);
        list_size *= 2;
    }

    fprintf(stderr, "\n\n");
    return 0;
}
```

```
===== glibc ======
slots = 8, wasted_bytes = 8
slots = 7, wasted_bytes = 0
=============== naive ================
slots = 4, wasted_bytes = 8
slots = 8, wasted_bytes = 8
slots = 16, wasted_bytes = 8
slots = 32, wasted_bytes = 8
slots = 64, wasted_bytes = 8
slots = 128, wasted_bytes = 8
slots = 256, wasted_bytes = 8
slots = 512, wasted_bytes = 8
slots = 1024, wasted_bytes = 8
slots = 2048, wasted_bytes = 8
=============== auto-padded (-1) ================
slots = 3, wasted_bytes = 0
slots = 7, wasted_bytes = 0
slots = 15, wasted_bytes = 0
slots = 31, wasted_bytes = 0
slots = 63, wasted_bytes = 0
slots = 127, wasted_bytes = 0
slots = 255, wasted_bytes = 0
slots = 511, wasted_bytes = 0
slots = 1023, wasted_bytes = 0
slots = 2047, wasted_bytes = 0
```

```
==========  jemalloc =======
slots = 8, wasted_bytes = 0
=============== naive ================
slots = 4, wasted_bytes = 0
slots = 8, wasted_bytes = 0
slots = 16, wasted_bytes = 0
slots = 32, wasted_bytes = 0
slots = 64, wasted_bytes = 0
slots = 128, wasted_bytes = 0
slots = 256, wasted_bytes = 0
slots = 512, wasted_bytes = 0
slots = 1024, wasted_bytes = 0
slots = 2048, wasted_bytes = 0
=============== auto-padded (-0) ================
slots = 4, wasted_bytes = 0
slots = 8, wasted_bytes = 0
slots = 16, wasted_bytes = 0
slots = 32, wasted_bytes = 0
slots = 64, wasted_bytes = 0
slots = 128, wasted_bytes = 0
slots = 256, wasted_bytes = 0
slots = 512, wasted_bytes = 0
slots = 1024, wasted_bytes = 0
slots = 2048, wasted_bytes = 0
```
2023-10-23 09:33:15 +02:00
Jean Boussier 5cc44f48c5 Refactor rb_shape_transition_shape_capa to not accept capacity
This way the groth factor is encapsulated, which allows
rb_shape_transition_shape_capa to be smarter about ideal sizes.
2023-10-10 14:47:54 +02:00
Nobuyoshi Nakada 8d242a33af
`rb_bug` prints a newline after the message 2023-05-20 21:43:30 +09:00
Jean Boussier 04ee666aab Make the maximum shapes variation warning non-verbose
[Feature #19538]

Since that category is not enabled by default, making it a
verbose warning is redundant. Enabling performance warning should
work with the default verbosity level.
2023-05-03 10:43:46 +02:00
Aaron Patterson 3016f30c95 Return NULL to indicate the next shape isn't found
During compaction we must fix up shapes on objects who were extended but
then became embedded.  `rb_shape_traverse_from_new_root` is supposed to
walk shape trees looking for a matching shape.  When a shape has a
"single child" we weren't returning NULL when the edge names didn't
match.

In the case of a single outgoing edge, this patch returns NULL when the
child edge name doesn't match (similar to the case when a shape has a
hash of outgoing edges)
2023-04-18 16:56:48 -07:00
Peter Zhu 24b137336b Move shape ID to flags for classes on 32 bit
Moves shape ID to FL_USER4 to FL_USER19 for the shape ID on 32 bit
systems. This makes the rb_classext_struct smaller so that it can be
embedded.
2023-04-16 11:06:31 -04:00
Jean Boussier ac123f167a Emit a performance warning when a class reached max variations
[Feature #19538]

This new `peformance` warning category is disabled by default.
It needs to be specifically enabled via `-W:performance` or `Warning[:performance] = true`
2023-04-13 16:36:17 +02:00
Matt Valentine-House 026321c5b9 [Feature #19474] Refactor NEWOBJ macros
NEWOBJ_OF is now our canonical newobj macro. It takes an optional ec
2023-04-06 11:07:16 +01:00
Matt Valentine-House d91a82850a Pull the shape tree out of the vm object 2023-04-06 11:07:16 +01:00
Aaron Patterson 7c307e0379 Lazily allocate id tables for children
This patch lazily allocates id tables for shape children.  If a shape
has only one single child, it tags the child with a bit.  When we read
children, if the id table has the bit set, we know it's a single child.
If we need to add more children, then we create a new table and evacuate
the child to the new table.

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2023-03-22 12:50:42 -07:00
Aaron Patterson 0519741702 pull child allocation in to a different function 2023-03-22 12:50:42 -07:00
Aaron Patterson 999ccb2b6b combine allocation functions 2023-03-22 12:50:42 -07:00
Aaron Patterson e055c0c716 Make shape functions static
These functions don't need to be in the header file, we can declare them
as static.
2023-03-22 12:50:42 -07:00
Aaron Patterson 1a9e2d20e2 Fix shape allocation limits
We can only allocate enough shapes to fit in the shape buffer.
MAX_SHAPE_ID was based on the theoretical maximum number of shapes we
could have, not on the amount of memory we can actually consume.  This
commit changes the MAX_SHAPE_ID to be based on the amount of memory
we're allowed to consume.

Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
2023-03-22 08:46:12 -07:00
Peter Zhu cb22d78354 Fix frozen status loss when moving objects
[Bug #19536]

When objects are moved between size pools, their frozen status is lost
in the shape. This will cause the frozen check to be bypassed when there
is an inline cache. For example, the following script should raise a
FrozenError, but doesn't on Ruby 3.2 and master.

    class A
      def add_ivars
        @a = @b = @c = @d = 1
      end

      def set_a
        @a = 10
      end
    end

    a = A.new
    a.add_ivars
    a.freeze

    b = A.new
    b.add_ivars
    b.set_a # Set the inline cache in set_a

    GC.verify_compaction_references(expand_heap: true, toward: :empty)

    a.set_a
2023-03-18 09:07:05 -04:00
Aaron Patterson 365fed6369
Revert "Allow classes and modules to become too complex"
This reverts commit 69465df424.
2023-03-10 08:50:43 -08:00
HParker 69465df424 Allow classes and modules to become too complex
This makes the behavior of classes and modules when there are too many instance variables match the behavior of objects with too many instance variables.
2023-03-09 15:34:49 -08:00
Takashi Kokubun 50a709fb9e Resurrect symbols used by ObjectSpace 2023-03-06 21:59:23 -08:00
Takashi Kokubun 233ddfac54 Stop exporting symbols for MJIT 2023-03-06 21:59:23 -08:00
Haldun Bayhantopcu b03b251aa4 Handle all non-object type objects 2023-02-15 15:43:46 -08:00
Haldun Bayhantopcu 0b4b2cd1ee Fix removing ivars from clases and modules.
Co-authored-by: Adam Hess <hparker@github.com>
2023-02-15 15:43:46 -08:00
Matt Valentine-House 72aba64fff Merge gc.h and internal/gc.h
[Feature #19425]
2023-02-09 10:32:29 -05:00
Jemma Issroff 28da990984 Limit maximum number of IVs on a shape on T_OBJECTS
Create SHAPE_MAX_NUM_IVS (currently 50) and limit all shapes of
T_OBJECTS to that number of IVs. When a shape with a T_OBJECT has more than 50 IVs, fall back to the
obj_too_complex shape which uses hash lookup for ivs.

Note that a previous version of this commit
78fcc9847a was reverted in
88f2b94065 because it did not account for
non-T_OBJECTS
2023-02-06 08:40:51 -08:00
Peter Zhu c4cc3be195 Remove dead code in shapes.c and shapes.h 2023-01-30 14:55:20 -05:00
Aaron Patterson 88f2b94065
Revert "Limit maximum number of IVs on a shape"
This reverts commit 78fcc9847a.
2023-01-26 11:04:55 -05:00
Jemma Issroff 78fcc9847a Limit maximum number of IVs on a shape
Create SHAPE_MAX_NUM_IVS (currently 50) and limit all shapes to that
number of IVs. When a shape has more than 50 IVs, fallback to the
obj_too_complex shape which uses hash lookup for ivs.
2023-01-25 14:48:28 -05:00
Peter Zhu 273dca3aed Fix undefined behavior in shape.c
Under strict aliasing, writing to the memory location of a different
type is not allowed and will result in undefined behavior. This was
happening in shape.c due to `rb_id_table_lookup` writing to the memory
location of `VALUE *` that was casted from a `rb_shape_t **`.

This was causing test failures when compiled with LTO.

Fixes [Bug #19248]

Co-Authored-By: Alan Wu <alanwu@ruby-lang.org>
2023-01-05 13:14:11 -05:00
Takashi Kokubun c566c968f9 Hide RubyVM::Shape's interface as much as possible [ci skip]
RubyVM::Shape is usually not available (you need SHAPE_DEBUG macro,
which is not defined by default). So it seems confusing to leave
RubyVM::Shape in the document.

This hides only method definitions because, well, I can't find a way to
hide things defined by rb_define_const or rb_struct_define_under. I gave
up making the C-based documentation right. You should define things in
Ruby instead.
2022-12-22 15:06:37 -08:00
Jemma Issroff 297df92407 Clean up Ruby Shape API
Make printing shapes better, use a struct instead of specific methods
for each field on a shape.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-12-16 13:27:45 -05:00