Граф коммитов

112 Коммитов

Автор SHA1 Сообщение Дата
Aaron Patterson 8b8dcc7af1 Handle mmap failures for redblack tree cache
The redblack tree cache is totally optional, so if we can't allocate
room for the cache, then just pretend as if the cache is full if mmap
fails
2024-01-12 09:31:36 -08:00
Jean Boussier 7c2d819862 Fix a grammar issue in the shape performance warning message 2023-12-20 09:28:45 +01:00
Jean Boussier f6ad49b87c Use #initialize instead of `initialize` in shape perf warning
This is more consistent with other messages.
2023-12-18 13:42:02 +01:00
Nobuyoshi Nakada 40fc9b070c
[DOC] No document for internal or debug methods 2023-12-18 20:17:45 +09:00
Jean Boussier ba1d1522d3 Make the SHAPE_TOO_COMPLEX performance warning more actionable
As suggested by Mame, we should try to help users fix the issues
without having to lookup the meaning of the warning.
2023-12-18 10:33:18 +01:00
Peter Zhu 12e3b07455 Re-embed when removing Object instance variables
Objects with the same shape must always have the same "embeddedness"
(either embedded or heap allocated) because YJIT assumes so. However,
using remove_instance_variable, it's possible that some objects are
embedded and some are heap allocated because it does not re-embed heap
allocated objects.

This commit changes remove_instance_variable to re-embed Object
instance variables when it becomes small enough.
2023-12-06 11:34:07 -05:00
Peter Zhu 4a7151a8e4 Deduplicate assertions in redblack_balance
The bug in i686 was fixed in commit
71babe5536.
2023-12-06 10:16:58 -05:00
Peter Zhu 56eccb350b Fix alphabetical order of include in shape.c 2023-12-05 16:25:34 -05:00
Peter Zhu 43ef0da0fb Add assertions for shape cache grandchild nodes 2023-12-01 09:56:32 -05:00
Peter Zhu 4541e192d9 Add assertions in redblack_balance
These assertions check that binary search tree invariants are held for
the new tree.
2023-11-30 16:48:51 -05:00
Peter Zhu a1647c460f Rename variables redblack_balance
It's too difficult for me to keep track that y is the new node, x is the
new left node, z is the new right node, a is the new left left node,
b is the new left right node, c is the new right left node, and d is
the new right right node. This commit refactors the variable names to be
more descriptive.
2023-11-30 15:41:08 -05:00
Peter Zhu 57cb47bfe2 Assert that the left and right nodes are correct 2023-11-29 10:30:00 -05:00
Peter Zhu cb70994b0e Assert node inserted into red-black tree exists 2023-11-28 13:37:38 -05:00
Peter Zhu 4d71f70fd1 Add assertions to check created red-black tree 2023-11-27 14:05:25 -05:00
Peter Zhu 872922b03d Fix indentation in comment in shape.c 2023-11-27 14:04:56 -05:00
Peter Zhu b93a1bb40b Verify correctness of shape cache
This commit adds assertions to verify that the shape cache is correct
compared to the shape tree.
2023-11-25 09:32:36 -05:00
Peter Zhu 564ef66e26 Verify that duplicate shape is not created
This adds an assertion that the instance variable does not already exist
in the shape tree when creating a new shape.
2023-11-25 09:32:36 -05:00
Alan Wu 341321f115 Fix off-by-one with RubyVM::Shape.exhaust_shapes
Previously, the method left one shape available (MAX_SHAPE_ID) when
called without arguments.
2023-11-22 12:17:58 -05:00
Jean Boussier 2d7fb9c2fa Speedup test_shape.rb
Many tests start by exhausting all shapes, which is a slow process.
By exposing a method to directly move the bump allocator forward
we cut test runtime in half.

Before:
```
Finished tests in 1.544756s
```

After:
```
Finished tests in 0.759733s,
```
2023-11-22 10:12:07 +01:00
Peter Zhu 3dd77bc056 Fix corruption when out of shape during ivar remove
Reproduction script:

```
o = Object.new
10.times { |i| o.instance_variable_set(:"@a#{i}", i) }

i = 0
a = Object.new
while RubyVM::Shape.shapes_available > 2
  a.instance_variable_set(:"@i#{i}", 1)
  i += 1
end

o.remove_instance_variable(:@a0)

puts o.instance_variable_get(:@a1)
```

Before this patch, it would incorrectly output `2` and now it correctly
outputs `1`.
2023-11-17 13:08:43 -08:00
Jean Boussier 94c9f16663 Refactor rb_obj_evacuate_ivs_to_hash_table
That function is a bit too low level to called from multiple
places. It's always used in tandem with `rb_shape_set_too_complex`
and both have to know how the object is laid out to update the
`iv_ptr`.

So instead we can provide two higher level function:

  - `rb_obj_copy_ivs_to_hash_table` to prepare a `st_table` from an
    arbitrary oject.
  - `rb_obj_convert_to_too_complex` to assign the new `st_table`
    to the old object, and safely free the old `iv_ptr`.

Unfortunately both can't be combined into one, because `rb_obj_copy_ivar`
need `rb_obj_copy_ivs_to_hash_table` to copy from one object
to another.
2023-11-17 09:19:21 +01:00
Peter Zhu fabf5bead7 Don't overwrite shape capacity when removing ivar
Other objects may be using the shape, so we can't change the capacity
otherwise the other objects may have a buffer overflow.
2023-11-13 18:26:36 -05:00
Peter Zhu 68869e9bd9 Revert "Revert "Remove SHAPE_CAPACITY_CHANGE shapes""
This reverts commit 5f3fb4f4e3.
2023-11-13 18:26:36 -05:00
Peter Zhu 5f3fb4f4e3 Revert "Remove SHAPE_CAPACITY_CHANGE shapes"
This reverts commit f6910a6112.

We're seeing crashes in the test suite of Shopify's core monolith after
this change.
2023-11-10 11:27:49 -05:00
Peter Zhu f6910a6112 Remove SHAPE_CAPACITY_CHANGE shapes
We don't need to create a shape to transition capacity as we can
transition the capacity when the capacity of the SHAPE_IVAR changes.
2023-11-09 09:25:02 -05:00
Jean Boussier d898e8d6f8 Refactor rb_shape_transition_shape_capa out
Right now the `rb_shape_get_next` shape caller need to
first check if there is capacity left, and if not call
`rb_shape_transition_shape_capa` before it can call `rb_shape_get_next`.

And on each of these it needs to checks if we got a TOO_COMPLEX
back.

All this logic is duplicated in the interpreter, YJIT and RJIT.

Instead we can have `rb_shape_get_next` do the capacity transition
when needed. The caller can compare the old and new shapes capacity
to know if resizing is needed. It also can check for TOO_COMPLEX
only once.
2023-11-08 11:02:55 +01:00
Jean Boussier b92b9e1e9e vm_getivar: assume the cached shape_id like have a common ancestor
When an inline cache misses, it is very likely that the stale shape_id
and the current instance shape_id have a close common ancestor.

For example if the instance variable is sometimes frozen sometimes
not, one of the two shape will be the direct parent of the other.

Another pattern that commonly cause IC misses is "memoization",
in such case the object will have a "base common shape" and then
a number of close descendants.

In addition, when we find a common ancestor, we store it in the
inline cache instead of the current shape. This help prevent the
cache from flip-flopping, ensuring the next lookup will be marginally
faster and more generally avoid writing in memory too much.

However, now that shapes have an ancestors index, we only check
for a few ancestors before falling back to use the index.

So overall this change speeds up what is assumed to be the more common
case, but makes what is assumed to be the less common case a bit slower.

```
compare-ruby: ruby 3.3.0dev (2023-10-26T05:30:17Z master 701ca070b4) [arm64-darwin22]
built-ruby: ruby 3.3.0dev (2023-10-26T09:25:09Z shapes_double_sear.. a723a85235) [arm64-darwin22]
warming up......

|                                     |compare-ruby|built-ruby|
|:------------------------------------|-----------:|---------:|
|vm_ivar_stable_shape                 |     11.672M|   11.679M|
|                                     |           -|     1.00x|
|vm_ivar_memoize_unstable_shape       |      7.551M|   10.506M|
|                                     |           -|     1.39x|
|vm_ivar_memoize_unstable_shape_miss  |     11.591M|   11.624M|
|                                     |           -|     1.00x|
|vm_ivar_unstable_undef               |      9.037M|    7.981M|
|                                     |       1.13x|         -|
|vm_ivar_divergent_shape              |      8.034M|    6.657M|
|                                     |       1.21x|         -|
|vm_ivar_divergent_shape_imbalanced   |     10.471M|    9.231M|
|                                     |       1.13x|         -|
```

Co-Authored-By: John Hawthorn <john@hawthorn.email>
2023-11-03 12:47:43 +01:00
Peter Zhu 38ba040d8b Make every initial size pool shape a root shape
This commit makes every initial size pool shape a root shape and assigns
it a capacity of 0.
2023-11-02 13:42:11 -04:00
Jean Boussier 33795931a0 Better handle running out of shapes in remove_shape_recursive 2023-11-02 12:00:42 +01:00
Jean Boussier b77148ae9f remove_instance_variable: Handle running out of shapes
`remove_shape_recursive` wasn't considering that if we run out of
shapes, it might have to transition to SHAPE_TOO_COMPLEX.

When this happens, we now return with an error and the caller
initiates the evacuation.
2023-11-01 15:21:55 +01:00
Peter Zhu e2d950733e Add ST table to gen_ivtbl for complex shapes
On 32-bit systems, we must store the shape ID in the gen_ivtbl to not
lose the shape. If we directly store the ST table into the generic
ivar table, then we lose the shape. This makes it impossible to
determine the shape of the object and whether it is too complex or not.
2023-10-31 12:07:54 -04:00
Jean Boussier 4aacc559d9 Handle running out of shapes in `Object#dup`
There is a handful of call sites where we may transition to
OBJ_TOO_COMPLEX_SHAPE if we just ran out of shapes, but that
weren't handling it properly.
2023-10-31 12:07:54 -04:00
Jean Boussier 4aee6931c3 Make get_next_shape_internal idempotent
Since the check for MAX_SHAPE_ID was done before even checking
if the transition we're looking for even exists, as soon as the
max shape is reached, get_next_shape_internal would always return
`TOO_COMPLEX` regardless of whether the transition we're looking
for already exist or not.

In addition to entirely de-optimize all newly created objects, it
also made an assertion fail in `vm_setivar`:

```
vm_setivar:rb_shape_get_next_iv_shape(rb_shape_get_shape_by_id(source_shape_id), id) == dest_shape
```
2023-10-27 21:09:03 +02:00
Aaron Patterson bbf1d621ba Decrease redblack cache / shape size in debug
When running tests in debug mode, we have tests that try to exhaust the
space used for shapes and the redblack cache.  However, this can cause
Out of Memory issues on some machines, so this commit decreases the
cache sizes when RUBY_DEBUG is enabled
2023-10-26 14:49:42 -07:00
Jean Boussier 8e62596e38
Move some defines from shape.h to shape.c
If they are only used there, we might as well not expose them.
2023-10-26 13:07:08 -07:00
Aaron Patterson d8cb827f39 Remove SHAPE_MAX_NUM_IVS
There is no longer a limit on the number of IVs you can store.
SHAPE_MAX_NUM_IVS was used to work around the IV10K problem (the well
known problem where setting 10k instance variables in a row would be too
slow).  The redblack tree works well at any shape depth, even depths
greater than 80, and solves the IV10K problem.
2023-10-24 14:23:17 -07:00
Aaron Patterson afae8df373 `get_next_shape_internal` should always return a shape
If it runs out of shapes, or new variations aren't allowed, it will
return "too complex"
2023-10-24 14:23:17 -07:00
Aaron Patterson cfd7c1a276 Allow the shape tree to be traversed
This commit allows the shape tree to be traversed to locate an existing
shape, but it doesn't necessarily allow you to create new variations.
2023-10-24 14:23:17 -07:00
Aaron Patterson 3760baccac Remove new_shape_necessary code
We always create new shapes until we run out!
2023-10-24 14:23:17 -07:00
Aaron Patterson 702b8e3107 golf down ancestor caching 2023-10-24 14:23:17 -07:00
Aaron Patterson e71f343a99 Addressing feedback 2023-10-24 10:52:06 -07:00
Aaron Patterson 54230dea1b Don't cache on platforms without mmap
We're only going to create a redblack tree on platforms that have mmap
2023-10-24 10:52:06 -07:00
Aaron Patterson a3f66e09f6 geniv objects can become too complex 2023-10-24 10:52:06 -07:00
Aaron Patterson caf6a72348 remove IV limit / support complex shapes on classes 2023-10-24 10:52:06 -07:00
Aaron Patterson 84e4453436 Use a functional red-black tree for indexing the shapes
This is an experimental commit that uses a functional red-black tree to
create an index of the ancestor shapes.  It uses an Okasaki style
functional red black tree:

  https://www.cs.tufts.edu/comp/150FP/archive/chris-okasaki/redblack99.pdf

This tree is advantageous because:

* It offers O(n log n) insertions and O(n log n) lookups.
* It shares memory with previous "versions" of the tree

When we insert a node in the tree, only the parts of the tree that need
to be rebalanced are newly allocated.  Parts of the tree that don't need
to be rebalanced are not reallocated, so "new trees" are able to share
memory with old trees.  This is in contrast to a sorted set where we
would have to duplicate the set, and also resort the set on each
insertion.

I've added a new stat to RubyVM.stat so we can understand how the red
black tree increases.
2023-10-24 10:52:06 -07:00
Nobuyoshi Nakada 42c2c8caa5
Adjust indent [ci skip] 2023-10-23 19:28:14 +09:00
Jean Boussier e5364ea496 rb_shape_transition_shape_capa: use optimal sizes transitions
Previously the growth was 3(embed), 6, 12, 24, ...

With this change it's now 3(embed), 8, 16, 32, 64, ... by default.

However, since power of two isn't the best size for all allocators,
if `malloc_usable_size` is vailable, we use it to discover the best
offset.

On Linux/glibc 2.35 for instance, the growth will be 3(embed), 7, 15, 31
to avoid wasting 8B per object.

Test program:

```c

size_t test(size_t slots) {
    size_t allocated = slots * VALUE_SIZE;
    void *test_ptr = malloc(allocated);
    size_t wasted = malloc_usable_size(test_ptr) - allocated;
    free(test_ptr);
    fprintf(stderr, "slots = %lu, wasted_bytes = %lu\n", slots, wasted);
    return wasted;
}

int main(int argc, char *argv[]) {
    size_t best_padding = 0;
    size_t padding = 0;
    for (padding = 0; padding <= 2; padding++) {
        size_t wasted = test(8 - padding);
        if (wasted == 0) {
            best_padding = padding;
            break;
        }
    }

    size_t index = 0;
    fprintf(stderr, "=============== naive ================\n");

    size_t list_size = 4;
    for (index = 0; index < 10; index++) {
        test(list_size);
        list_size *= 2;
    }

    fprintf(stderr, "=============== auto-padded (-%lu) ================\n", best_padding);

    list_size = 4;
    for (index = 0; index < 10; index ++) {
        test(list_size - best_padding);
        list_size *= 2;
    }

    fprintf(stderr, "\n\n");
    return 0;
}
```

```
===== glibc ======
slots = 8, wasted_bytes = 8
slots = 7, wasted_bytes = 0
=============== naive ================
slots = 4, wasted_bytes = 8
slots = 8, wasted_bytes = 8
slots = 16, wasted_bytes = 8
slots = 32, wasted_bytes = 8
slots = 64, wasted_bytes = 8
slots = 128, wasted_bytes = 8
slots = 256, wasted_bytes = 8
slots = 512, wasted_bytes = 8
slots = 1024, wasted_bytes = 8
slots = 2048, wasted_bytes = 8
=============== auto-padded (-1) ================
slots = 3, wasted_bytes = 0
slots = 7, wasted_bytes = 0
slots = 15, wasted_bytes = 0
slots = 31, wasted_bytes = 0
slots = 63, wasted_bytes = 0
slots = 127, wasted_bytes = 0
slots = 255, wasted_bytes = 0
slots = 511, wasted_bytes = 0
slots = 1023, wasted_bytes = 0
slots = 2047, wasted_bytes = 0
```

```
==========  jemalloc =======
slots = 8, wasted_bytes = 0
=============== naive ================
slots = 4, wasted_bytes = 0
slots = 8, wasted_bytes = 0
slots = 16, wasted_bytes = 0
slots = 32, wasted_bytes = 0
slots = 64, wasted_bytes = 0
slots = 128, wasted_bytes = 0
slots = 256, wasted_bytes = 0
slots = 512, wasted_bytes = 0
slots = 1024, wasted_bytes = 0
slots = 2048, wasted_bytes = 0
=============== auto-padded (-0) ================
slots = 4, wasted_bytes = 0
slots = 8, wasted_bytes = 0
slots = 16, wasted_bytes = 0
slots = 32, wasted_bytes = 0
slots = 64, wasted_bytes = 0
slots = 128, wasted_bytes = 0
slots = 256, wasted_bytes = 0
slots = 512, wasted_bytes = 0
slots = 1024, wasted_bytes = 0
slots = 2048, wasted_bytes = 0
```
2023-10-23 09:33:15 +02:00
Jean Boussier 5cc44f48c5 Refactor rb_shape_transition_shape_capa to not accept capacity
This way the groth factor is encapsulated, which allows
rb_shape_transition_shape_capa to be smarter about ideal sizes.
2023-10-10 14:47:54 +02:00
Nobuyoshi Nakada 8d242a33af
`rb_bug` prints a newline after the message 2023-05-20 21:43:30 +09:00
Jean Boussier 04ee666aab Make the maximum shapes variation warning non-verbose
[Feature #19538]

Since that category is not enabled by default, making it a
verbose warning is redundant. Enabling performance warning should
work with the default verbosity level.
2023-05-03 10:43:46 +02:00