This fixes test failures when running tests with
RUBY_ISEQ_DUMP_DEBUG=to_binary, which started after
5899f6aa55 was committed.
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
In cases where a method accepts both keywords and an anonymous
keyword splat, the method was not marked as taking an anonymous
keyword splat. Fix that in the compiler.
Doing that broke handling of nil keyword splats in yjit, so
update yjit to handle that.
Add a test to check that calling a method that accepts both
a keyword argument and an anonymous keyword splat does not
modify a passed keyword splat hash.
Move the anon_kwrest check from setup_parameters_complex to
ignore_keyword_hash_p, and only use it if the keyword hash
is already a hash. This should speed things up slightly as
it avoids a check previously used for all callers of
setup_parameters_complex.
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
These previously resulted in 2 array allocations, one for newarray
and one for concatarray. This replaces newarray and concatarray
with pushtoarray, and changes splatarray false to splatarray true,
which reduces it to 1 array allocation, in splatarray true.
This also sets VM_CALL_ARGS_SPLAT_MUT, so if the super method
accepts a positional splat, this will avoid an additional array
allocation on the callee side.
This optimization stopped being using when the splatkw VM instruction
was added. This change allows the optimization to apply again. This
also optimizes the following cases:
super(*ary, **kw, &block)
f(...)
super(...)
This optimizes cases such as:
super(arg, *ary)
super(arg, *ary, &block)
super(*ary, **kw)
super(*ary, kw: 1)
super(*ary, kw: 1, &block)
The super(*ary, **kw, &block) case does not use the splatarray false
optimization. This is also true of the send case, since the
introduction of the splatkw VM instruction. That will be fixed in
a later commit.
This is designed to replace the newarraykwsplat instruction, which is
no longer used in the parse.y compiler after this commit. This avoids
an unnecessary array allocation in the case where ARGSCAT is followed
by LIST with keyword:
```ruby
a = []
kw = {}
[*a, 1, **kw]
```
Previous Instructions:
```
0000 newarray 0 ( 1)[Li]
0002 setlocal_WC_0 a@0
0004 newhash 0 ( 2)[Li]
0006 setlocal_WC_0 kw@1
0008 getlocal_WC_0 a@0 ( 3)[Li]
0010 splatarray true
0012 putobject_INT2FIX_1_
0013 putspecialobject 1
0015 newhash 0
0017 getlocal_WC_0 kw@1
0019 opt_send_without_block <calldata!mid:core#hash_merge_kwd, argc:2, ARGS_SIMPLE>
0021 newarraykwsplat 2
0023 concattoarray
0024 leave
```
New Instructions:
```
0000 newarray 0 ( 1)[Li]
0002 setlocal_WC_0 a@0
0004 newhash 0 ( 2)[Li]
0006 setlocal_WC_0 kw@1
0008 getlocal_WC_0 a@0 ( 3)[Li]
0010 splatarray true
0012 putobject_INT2FIX_1_
0013 pushtoarray 1
0015 putspecialobject 1
0017 newhash 0
0019 getlocal_WC_0 kw@1
0021 opt_send_without_block <calldata!mid:core#hash_merge_kwd, argc:2, ARGS_SIMPLE>
0023 pushtoarraykwsplat
0024 leave
```
pushtoarraykwsplat is designed to be simpler than newarraykwsplat.
It does not take a variable number of arguments from the stack, it
pops the top of the stack, and appends it to the second from the top,
unless the top of the stack is an empty hash.
During this work, I found the ARGSPUSH followed by HASH with keyword
did not compile correctly, as it pushed the generated hash to the
array even if the hash was empty. This fixes the behavior, to use
pushtoarraykwsplat instead of pushtoarray in that case:
```ruby
a = []
kw = {}
[*a, **kw]
[{}] # Before
[] # After
```
This does not remove the newarraykwsplat instruction, as it is still
referenced in the prism compiler (which should be updated similar
to this), YJIT (only in the bindings, it does not appear to be
implemented), and RJIT (in a couple comments). After those are
updated, the newarraykwsplat instruction should be removed.
Using RARRAY_CONST_PTR can cause the array object to not exist on the
stack, which could cause it to be GC'd or be moved by GC compaction. This
can cause RARRAY_CONST_PTR to point to the incorrect location if the
array is embedded and moved by GC compaction.
Fixesruby/prism#2444.
assert does not print the bug report, only the file and line number of
the assertion that failed. RUBY_ASSERT prints the full bug report, which
makes it much easier to debug.
Before this commit, there were many places where we had to generate
dummy line nodes to hold both the line number and the node id that
would then immediately get pulled out from the created node. Now
we pass them explicitly so that we don't have to generate these
nodes.
This makes a clearer line between the parser and compiler, and also
makes it easier to generate instructions when we don't have a
specific node to tie them to. As such, it removes almost every
single place where we needed to previously generate dummy nodes.
This also makes it easier for the prism compiler, because now we
can pass in line number and node id instead of trying to generate
dummy nodes for every instruction that we compile.
String nodes holds ruby string object on `VALUE nd_lit`.
This commit changes it to `struct rb_parser_string *string`
to reduce dependency on ruby object.
Sometimes these strings are concatenated with other string
therefore string concatenate functions are needed.
Before this commit, we were mixing a lot of concerns with the prism
compile between RubyVM::InstructionSequence and the general entry
points to the prism parser/compiler.
This commit makes all of the various prism-related APIs mirror
their corresponding APIs in the existing parser/compiler. This means
we now have the correct frame naming, and it's much easier to follow
where the logic actually flows. Furthermore this consolidates a lot
of the prism initialization, making it easier to see where we could
potentially be raising errors.
Previously, this would use newarray followed by concattoarray.
This now uses pushtoarray instead, avoiding the unnecessary
array allocation.
This is implemented by making compile_array take a first_chunk
argument, passing in 1 in the normal array case, and 0 in the
ARGSCAT with LIST body case.
newarray, duparray, concatarray, and splatarray always leave an
array at the top of the stack. expandarray does not, it takes
an array from the top of the stack as input, and leaves individual
elements on the stack. I assume no Ruby code generates the
expandarray/splatarray sequence, or it could break. The only
use of expandarray outside the peephole optimizer is in the
masgn code, and it does not appear to generate splatarray
directly after expandarray.
The splatarray/splatarray peephole optimization is probably
also wrong in the following case:
```
putobject [1,2]
splatarray false
splatarray true
```
This instruction sequence should result in a duplicate of
[1,2] at the top of the stack, but the peephole optimizer would
remove the `splatarray true`, resulting in change that made
[1,2] on top of the stack. I'm not sure Ruby code can generate
`splatarray false` followed by `splatarray true` (I could get it
to generate chains of `splatarray true`), so maybe this has no
effect.
newarray, duparray, and concatarray all result in newly allocated
arrays at the top of the stack, so they shouldn't have an issue
with removing either `splatarray true` or `splatarray false`.
Given code such as:
```ruby
h[*a, :a], h[*b] = v
```
Ruby would previously allocate 5 arrays for the mass assignment:
* splatarray true for a
* newarray for v[0]
* concatarray for [*a, a] and v[0]
* newarray for v[1]
* concatarray for b and v[1]
This optimizes it to only allocate 2 arrays:
* splatarray true for a
* splatarray true for b
Instead of the newarray/concatarray combination, pushtoarray is used.
Note above that there was no splatarray true for b originally. The
normal compilation uses splatarray false for b. Instead of trying
to find and modify the splatarray false to splatarray true, this
adds splatarray true for b, which requires a couple of swap
instructions, before the pushtoarray. This could be further
optimized to remove the need for those three instructions, but I'm
not sure if the complexity is worth it.
Additionally, this sets VM_CALL_ARGS_SPLAT_MUT on the call to
[]= in the h[*b] case, so that if []= has a splat parameter, the
new array can be used directly, without additional duplication.
Given code such as:
```ruby
h[*a, 1] += 1
h[*b] += 2
```
Ruby would previously allocate 5 arrays:
* splatarray true for a
* newarray for 1
* concatarray for [*a, 1] and [1]
* newarray for 2
* concatarray for b and [2]
This optimizes it to only allocate 2 arrays:
* splatarray true for a
* splatarray true for b
Instead of the newarray/concatarray combination, pushtoarray is used.
Note above that there was no splatarray true for b originally. The
normal compilation uses splatarray false for b. Instead of trying
to find and modify the splatarray false to splatarray true, this
adds splatarray true for b, which requires a couple of swap
instructions, before the pushtoarray. This could be further
optimized to remove the need for those three instructions, but I'm
not sure if the complexity is worth it.
Additionally, this sets VM_CALL_ARGS_SPLAT_MUT on the call to
[]= in the h[*b] case, so that if []= has a splat parameter, the
new array can be used directly, without additional duplication.
Previously, a literal array with a splat and any other args resulted in
more than one array allocation:
```ruby
[1, *a]
[*a, 1]
[*a, *a]
[*a, 1, 2]
[*a, a]
[*a, 1, *a]
[*a, 1, a]
[*a, a, a]
[*a, a, *a]
[*a, 1, *a, 1]
[*a, 1, *a, *a]
[*a, a, *a, a]
```
This is because previously Ruby would use newarray and concatarray
to create the array, which both each allocate an array internally.
This changes the compilation to use concattoarray and pushtoarray,
which do not allocate arrays. It also updates the peephole optimizer
to optimize the duparray/concattoarray sequence to
putobject/concattoarray, mirroring the existing duparray/concatarray
optimization.
These changes reduce the array allocations for the above examples to
a single array allocation, except for:
```
[*a, 1, a]
[*a, a, a]
```
The reason for this is because optimizing this case to only allocate
1 array requires changes to compile_array, which would currently
conflict with an unmerged pull request (#9721). After that pull
request is merged, it should be possible to refactor things to only
allocate a 1 array for all literal arrays (or 2 for arrays with
keyword splats).
To avoid stack overflow, Ruby splits compilation of large arrays
into smaller arrays, and concatenates the small arrays together.
It previously used newarray/concatarray for this, which is
inefficient. This switches the compilation to use pushtoarray,
which is much faster. This makes almost all literal arrays only
allocate a single array.
For cases where there is a large amount of static values in the
array, Ruby will statically compile subarrays, and previously
added them using concatarray. This switches to concattoarray,
avoiding an array allocation for the append.
Keyword splats are also supported in arrays, and ignored if the
keyword splat is empty. Previously, this used newarraykwsplat and
concatarray. This still uses newarraykwsplat, but switches to
concattoarray to save an allocation. So large arrays with keyword
splats can allocate 2 arrays instead of 1.
Previously, for the following array sizes (assuming local variable
access for each element), Ruby allocated the following number of
arrays:
1000 elements: 7 arrays
10000 elements: 79 arrays
100000 elements: 781 arrays
With these changes, only a single array is allocated (or 2 for a
large array with a keyword splat.
Results using the included benchmark:
```
array_1000
miniruby: 34770.0 i/s
./miniruby-before: 10511.7 i/s - 3.31x slower
array_10000
miniruby: 4938.8 i/s
./miniruby-before: 483.8 i/s - 10.21x slower
array_100000
miniruby: 727.2 i/s
./miniruby-before: 4.1 i/s - 176.98x slower
```
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
`__ENCODING__ `was managed by `NODE_LIT` with Encoding object.
Introduce `NODE_ENCODING` for
1. `__ENCODING__` is detectable from AST Node.
2. Reduce dependency Ruby object for parse.y
For zsuper calls with a keyword splat but no actual keywords, the
keyword splat is passed directly, so it cannot be mutable, because
if the callee accepts a keyword splat, changes to the keyword splat
by the callee would be reflected in the caller.
While here, simplify the logic when the method supports
literal keywords. I don't think it is possible for
a method with has_kw param flags to not have keywords, so add an
assertion for that, and set VM_CALL_KW_SPLAT_MUT in a single place.
Ruby makes it easy to delegate all arguments from one method to another:
```ruby
def f(*args, **kw)
g(*args, **kw)
end
```
Unfortunately, this indirection decreases performance. One reason it
decreases performance is that this allocates an array and a hash per
call to `f`, even if `args` and `kw` are not modified.
Due to Ruby's ability to modify almost anything at runtime, it's
difficult to avoid the array allocation in the general case. For
example, it's not safe to avoid the allocation in a case like this:
```ruby
def f(*args, **kw)
foo(bar)
g(*args, **kw)
end
```
Because `foo` may be `eval` and `bar` may be a string referencing `args`
or `kw`.
To fix this correctly, you need to perform something similar to escape
analysis on the variables. However, there is a case where you can
avoid the allocation without doing escape analysis, and that is when
the splat variables are anonymous:
```ruby
def f(*, **)
g(*, **)
end
```
When splat variables are anonymous, it is not possible to reference
them directly, it is only possible to use them as splats to other
methods. Since that is the case, if `f` is called with a regular
splat and a keyword splat, it can pass the arguments directly to
`g` without copying them, avoiding allocation. For example:
```ruby
def g(a, b:)
a + b
end
def f(*, **)
g(*, **)
end
a = [1]
kw = {b: 2}
f(*a, **kw)
```
I call this technique: Allocationless Anonymous Splat Forwarding.
This is implemented using a couple additional iseq param flags,
anon_rest and anon_kwrest. If anon_rest is set, and an array splat
is passed when calling the method when the array splat can be used
without modification, `setup_parameters_complex` does not duplicate
it. Similarly, if anon_kwest is set, and a keyword splat is passed
when calling the method, `setup_parameters_complex` does not
duplicate it.
This instruction is similar to concattoarray, but it takes the
number of arguments to push to the array, removes that number
of arguments from the stack, and adds them to the array now at
the top of the stack.
This allows `f(*a, 1)` to allocate only a single array on the
caller side (which can be reused on the callee side in the case of
`def f(*a)`). Prior to this commit, `f(*a, 1)` would generate
3 arrays:
* a dupped by splatarray true
* 1 wrapped in array by newarray
* a dupped again by concatarray
Instructions Before for `a = []; f(*a, 1)`:
```
0000 newarray 0 ( 1)[Li]
0002 setlocal_WC_0 a@0
0004 putself
0005 getlocal_WC_0 a@0
0007 splatarray true
0009 putobject_INT2FIX_1_
0010 newarray 1
0012 concatarray
0013 opt_send_without_block <calldata!mid:f, argc:1, ARGS_SPLAT|FCALL>
0015 leave
```
Instructions After for `a = []; f(*a, 1)`:
```
0000 newarray 0 ( 1)[Li]
0002 setlocal_WC_0 a@0
0004 putself
0005 getlocal_WC_0 a@0
0007 splatarray true
0009 putobject_INT2FIX_1_
0010 pushtoarray 1
0012 opt_send_without_block <calldata!mid:f, argc:1, ARGS_SPLAT|ARGS_SPLAT_MUT|FCALL>
0014 leave
```
With these changes, method calls to Ruby methods should
implicitly allocate at most one array.
Ignore typeprof bundled gem failure due to unrecognized instruction.
This instruction is similar to concatarray, but assumes the first
object is already an array, and appends to it directly. This is
different than concatarray, which will create a new array instead
of appending to an existing array.
Additionally, for both concatarray and concattoarray, if the second
argument cannot be converted to an array, then just push it onto
the array, instead of creating a new array to wrap it, and then
using concat array. This saves an array allocation in that case.
This allows `f(*a, *a, *1)` to allocate only a single array on the
caller side (which can be reused on the callee side in the case of
`def f(*a)`). Prior to this commit, `f(*a, *a, *1)` would generate
4 arrays:
* a dupped by splatarray true
* a dupped again by first concatarray
* 1 wrapped in array by third splatarray
* result of [*a, *a] dupped by second concatarray
Instructions Before for `a = []; f(*a, *a, *1)`:
```
0000 newarray 0 ( 1)[Li]
0002 setlocal_WC_0 a@0
0004 putself
0005 getlocal_WC_0 a@0
0007 splatarray true
0009 getlocal_WC_0 a@0
0011 splatarray false
0013 concatarray
0014 putobject_INT2FIX_1_
0015 splatarray false
0017 concatarray
0018 opt_send_without_block <calldata!mid:g, argc:1, ARGS_SPLAT|ARGS_SPLAT_MUT|FCALL>
0020 leave
```
Instructions After for `a = []; f(*a, *a, *1)`:
```
0000 newarray 0 ( 1)[Li]
0002 setlocal_WC_0 a@0
0004 putself
0005 getlocal_WC_0 a@0
0007 splatarray true
0009 getlocal_WC_0 a@0
0011 concattoarray
0012 putobject_INT2FIX_1_
0013 concattoarray
0014 opt_send_without_block <calldata!mid:f, argc:1, ARGS_SPLAT|ARGS_SPLAT_MUT|FCALL>
0016 leave
```
This flag is set when the caller has already created a new array to
handle a splat, such as for `f(*a, b)` and `f(*a, *b)`. Previously,
if `f` was defined as `def f(*a)`, these calls would create an extra
array on the callee side, instead of using the new array created
by the caller.
This modifies `setup_args_core` to set the flag whenver it would add
a `splatarray true` instruction. However, when `splatarray true` is
changed to `splatarray false` in the peephole optimizer, to avoid
unnecessary allocations on the caller side, the flag must be removed.
Add `optimize_args_splat_no_copy` and have the peephole optimizer call
that. This significantly simplifies the related peephole optimizer
code.
On the callee side, in `setup_parameters_complex`, set
`args->rest_dupped` to true if the flag is set.
This takes a similar approach for optimizing regular splats that was
previiously used for keyword splats in
d2c41b1bff (via VM_CALL_KW_SPLAT_MUT).
The order of iseq may differ from the order of tokens, typically
`while`/`until` conditions are put after the body.
These orders can match by using line numbers as builtin-indexes, but
at the same time, it introduces the restriction that multiple `cexpr!`
and `cstmt!` cannot appear in the same line.
Another possible idea is to use `RubyVM::AbstractSyntaxTree` and
`node_id` instead of ripper, with making BASERUBY 3.1 or later.
to BUILTIN_ATTR_SINGLE_NOARG_LEAF
The attribute was created when the other attribute was called BUILTIN_ATTR_INLINE.
Now that the original attribute is renamed to BUILTIN_ATTR_LEAF, it's
only confusing that we call it "_INLINE".
nil is treated similarly to the empty hash in this case, passing
no keywords and not calling any conversion methods.
Fixes [Bug #20064]
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Previously, it used "expression", as that was the default. However,
op asgn expressions to constants use the NODE_OP_CDECL, so recognize
that node type as assignement.
Fixes [Bug #20111]
`:sym` was managed by `NODE_LIT` with `Symbol` object.
This commit introduces `NODE_SYM` so that
1. Symbol literal is detectable from AST Node
2. Reduce dependency on ruby object
parse.y converted NODE_STR when the string is hash key like
```
h1 = {"str1" => 1}
m1("str2" => 2)
m2({"str3" => 3})
```
This commit stop the conversion.
`static_literal_node_p` needs to know the node is for hash key or not
for the optimization.
When hash keys are duplicated, e.g. `h = {k: 1, l: 2, k: 3}`,
parser changes node structure for correct compilation.
This generates tricky AST. This commit removes AST manipulation
from parser to keep AST structure simple.
`__FILE__` was managed by `NODE_STR` with `String` object.
This commit introduces `NODE_FILE` and `struct rb_parser_string` so that
1. `__FILE__` is detectable from AST Node
2. Reduce dependency ruby object
`__LINE__` was managed by `NODE_LIT` with `Integer` object.
This commit introduces `NODE_LINE` so that
1. `__LINE__` is detectable from AST Node
2. Reduce dependency ruby object