In cases where the compiler can detect the hash is static, it would
use duphash for the hash part. As the hash is static, there is no need
to allocate an array.
The compiler already eliminates the array allocation for
f(*a, &lvar) and f(*a, &@iv). If that is safe, then eliminating
it for f(*a, **lvar) and f(*a, **@iv) as the last commit did is
as safe, and eliminating it for f(*a, **lvar, &lvar) and
f(*a, **@iv, &@iv) is also as safe.
The compiler already eliminates the array allocation for
f(*a, &lvar) and f(*a, &@iv), and eliminating the array allocation
for keyword splat is as safe as eliminating it for block passes.
Due to how the compiler works, while f(*a, &lvar) and f(*a, &@iv)
do not allocate an array, but f(1, *a, &lvar) and f(1, *a, &@iv)
do. It's probably possible to fix this in the compiler, but
seems easiest to fix this in the peephole optimizer.
Eliminating this array allocation is as safe as the current
elimination of the array allocation for f(*a, &lvar) and
f(*a, &@iv).
Due to how the compiler works, while f(*a) does not allocate an
array f(1, *a) does. This is possible to fix in the compiler, but
the change is much more complex. This attempts to fix the issue
in a simpler way using the peephole optimizer.
Eliminating this array allocation is safe, since just as in the
f(*a) case, nothing else on the caller side can modify the array.
The operands in each instruction needs to be pinned because if
auto-compaction runs in iseq_set_sequence, then the objects could exist
on the generated_iseq buffer, which would not be reference updated which
can lead to T_MOVED (and subsequently T_NONE) objects on the iseq.
The function iseq_set_exception_table allocates memory which can cause
a GC compaction to run. Since catch_table_ary is not on the stack, it
can be moved, which would make tptr incorrect.
- Unless `sizeof(BDIGIT) == 4`, (8-byte integer not available), the
size to be loaded was wrong.
- Since `BDIGIT`s are dumped as raw binary, the loaded byte order was
inverted unless little-endian.
ARGSCAT has been used for nd_args to hold index and rvalue,
because there was limitation on the number of members for Node.
We can easily change structure of node now, let's expand it.
We changed ScopeNodes to point to their parent (previous) ScopeNodes.
Accordingly, we can remove pm_compile_context_t, and store all
necessary context in ScopeNodes, allowing us to access locals from
outer scopes.
It's an estimator for application size and could be used as a
compilation heuristic later.
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Tracks other callinfo that references the same kwargs and frees them when all references are cleared.
[bug #19906]
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
All kind of AST nodes use same struct RNode, which has u1, u2, u3 union members
for holding different kind of data.
This has two problems.
1. Low flexibility of data structure
Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand,
NODE_OP_ASGN2 needs more than three union members. However they use same
structure definition, need to allocate three union members for NODE_TRUE and
need to separate NODE_OP_ASGN2 into another node.
This change removes the restriction so make it possible to
change data structure by each node type.
2. No compile time check for union member access
It’s developer’s responsibility for using correct member for each node type when it’s union.
This change clarifies which node has which type of fields and enables compile time check.
This commit also changes node_buffer_elem_struct buf management to handle
different size data with alignment.
It seems not-uncommon for methods to have no IV, ISE, or ICVARC caches.
Calling malloc with 0 will actually allocate something, so if there
aren't any caches (`ISEQ_IS_SIZE(body) == 0`), then we can avoid
allocating memory by not calling malloc. If there are no caches, then
theoretically nobody should be reading from the buffer anyway.
This saves about 1MB on Lobsters benchmark.
There's a missing write barrier for operands in the iseq instruction
list, which can cause crashes.
It can be reproduced when Ruby is compiled with `-DRUBY_DEBUG_ENV=1`.
Using the following command:
```
RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=0 RUBY_DEBUG=gc_stress ruby -w --disable=gems -Itool/lib -W0 test.rb
```
The following script crashes:
```
require "test/unit"
```
* Add a compile_context arg to yp_compile_node
The compile_context will allow us to pass around the parser, and
the constants and lookup table (to be used in future commits).
* Compile yp_program_node_t and yp_statements_node_t
Add the compilation for program and statements node so that we can
successfully compile an empty program with YARP.
* Helper functions for parsing numbers, strings, and symbols
* Compile basic numeric / boolean node types in YARP
* Compile StringNode and SymbolNodes in YARP
* Compile several basic node types in YARP
* Added error return for missing node
* Add yarp/yarp_compiler.c as stencil for compiling YARP
This commit adds yarp/yarp_compiler.c, and changes the sync script
to ensure that yarp/yarp_compiler.c will not get overwritten
* [Misc #119772] Create and expose RubyVM::InstructionSequence.compile_yarp
This commit creates the stencil for a compile_yarp function, which
we will continue to fill out. It allows us to check the output
of compiled YARP code against compiled code without using YARP.
After a0f12a0258 NODE_GASGN and
NODE_GVAR hold same value on both nd_vid and nd_entry.
This commit stops setting value to nd_entry and makes to use only
nd_vid.
98637d421d changes the name of
the function. However this function is exported as global,
then change the name to origin one for keeping compatibility.
This commit introduces a new instruction `opt_newarray_send` which is
used when there is an array literal followed by either the `hash`,
`min`, or `max` method.
```
[a, b, c].hash
```
Will emit an `opt_newarray_send` instruction. This instruction falls
back to a method call if the "interested" method has been monkey
patched.
Here are some examples of the instructions generated:
```
$ ./miniruby --dump=insns -e '[@a, @b].max'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
0000 getinstancevariable :@a, <is:0> ( 1)[Li]
0003 getinstancevariable :@b, <is:1>
0006 opt_newarray_send 2, :max
0009 leave
$ ./miniruby --dump=insns -e '[@a, @b].min'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
0000 getinstancevariable :@a, <is:0> ( 1)[Li]
0003 getinstancevariable :@b, <is:1>
0006 opt_newarray_send 2, :min
0009 leave
$ ./miniruby --dump=insns -e '[@a, @b].hash'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
0000 getinstancevariable :@a, <is:0> ( 1)[Li]
0003 getinstancevariable :@b, <is:1>
0006 opt_newarray_send 2, :hash
0009 leave
```
[Feature #18897] [ruby-core:109147]
Co-authored-by: John Hawthorn <jhawthorn@github.com>
The `catch_except_p` flag is used for communicating between parent and
child iseq's that a throw instruction was emitted. So for example if a
child iseq has a throw in it and the parent wants to catch the throw, we
use this flag to communicate to the parent iseq that a throw instruction
was emitted.
This flag is only useful at compile time, it only impacts the
compilation process so it seems to be fine to move it from the iseq body
to the compile_data struct.
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
If the iseq only contains `opt_invokebuiltin_delegate_leave` insn and
the builtin-function (bf) is inline-able, the caller doesn't need to
build a method frame.
`vm_call_single_noarg_inline_builtin` is fast path for such cases.
`builtin_inline_index` is restored because THEN clause on
`Primitive.mandatory_only?` was compiled twice.
However, f29c9d6d36 skips to compile THEN clause so we don't
need to restore `builtin_inline_index`.
On `f(*a, **kw)` method calls, a rest keyword parameter is identically
same Hash object is passed and it should make `#dup`ed Hahs.
fix https://bugs.ruby-lang.org/issues/19526
This is a variation of the `defined` instruction, for use when we
are checking for an instance variable. Splitting this out as a
separate instruction lets us skip some checks, and it also allows
us to use an instance variable cache, letting shape analysis
speed up the operation further.
If the previous instruction is not a leaf instruction, then the PC was
incremented before the instruction was ran (meaning the currently
executing instruction is actually the previous instruction), so we
should not increment the PC otherwise we will calculate the source
line for the next instruction.
This bug can be reproduced in the following script:
```
require "objspace"
ObjectSpace.trace_object_allocations_start
a =
1.0 / 0.0
p [ObjectSpace.allocation_sourceline(a), ObjectSpace.allocation_sourcefile(a)]
```
Which outputs: [4, "test.rb"]
This is incorrect because the object was allocated on line 10 and not
line 4. The behaviour is correct when we use a leaf instruction (e.g.
if we replaced `1.0 / 0.0` with `"hello"`), then the output is:
[10, "test.rb"].
[Bug #19456]
It doesn't have the right write barriers in place. For example, there is
rb_mark_set(dump->global_buffer.obj_table);
in the mark function, but there is no corresponding write barrier when
adding to the table in the
`ibf_dump_object() -> ibf_table_find_or_insert() -> st_insert()` code path.
To insert write barrier correctly, we need to store the T_STRUCT VALUE
inside `struct ibf_dump`. Instead of doing that, let's just demote it
to WB unproected for correctness. These dumper object are ephemeral so
there is not a huge benefit for having them WB protected.
Users of the bootsnap gem ran into crashes due to this issue:
https://github.com/Shopify/bootsnap/issues/436
Fixes [Bug #19419]
The interrupt check will unintentionally release the VM lock when loading an iseq.
And this will cause issues with the `debug` gem's
[`ObjectSpace.each_iseq` method](0fcfc28aca/ext/debug/iseq_collector.c (L61-L67)),
which wraps iseqs with a wrapper and exposes their internal states when they're actually not ready to be used.
And when that happens, errors like this would occur and kill the `debug` gem's thread:
```
DEBUGGER: ReaderThreadError: uninitialized InstructionSequence
┃ DEBUGGER: Disconnected.
┃ ["/opt/rubies/ruby-3.2.0/lib/ruby/gems/3.2.0/gems/debug-1.7.1/lib/debug/breakpoint.rb:247:in `absolute_path'",
┃ "/opt/rubies/ruby-3.2.0/lib/ruby/gems/3.2.0/gems/debug-1.7.1/lib/debug/breakpoint.rb:247:in `block in iterate_iseq'",
┃ "/opt/rubies/ruby-3.2.0/lib/ruby/gems/3.2.0/gems/debug-1.7.1/lib/debug/breakpoint.rb:246:in `each_iseq'",
...
```
A way to reproduce the issue is to satisfy these conditions at the same time:
1. `debug` gem calling `ObjectSpace.each_iseq` (e.g. [activating a `LineBreakpoint`](0fcfc28aca/lib/debug/breakpoint.rb (L246))).
2. A large amount of iseq being loaded from another thread (possibly through the `bootsnap` gem).
3. 1 and 2 iterating through the same iseq(s) at the same time.
Because this issue requires external dependencies and a rather complicated timing setup to reproduce, I wasn't able to write a test case for it.
But here's some pseudo code to help reproduce it:
```rb
require "debug/session"
Thread.new do
100.times do
ObjectSpace.each_iseq do |iseq|
iseq.absolute_path
end
end
end
sleep 0.1
load_a_bunch_of_iseq
possibly_through_bootsnap
```
[Bug #19348]
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
With this change, we're storing the iv name on an inline cache on
setinstancevariable instructions. This allows us to check the inline
cache to count instance variables set in initialize and give us an
estimate of iv capacity for an object.
For the purpose of estimating the number of instance variables required
for an object, we're assuming that all initialize methods will call
`super`.
This change allows us to estimate the number of instance variables
required without disassembling instruction sequences.
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
This was introduced by b609bdeb53
to suppress warnings. However these warngins were deleted by
beae6cbf0f. Therefore these codes
are not needed anymore.
`throw TAG_BREAK` instruction makes a jump only if the continuation of
catch of TAG_BREAK exactly matches the instruction immediately following
the "send" instruction that is currently being executed. Otherwise, it
seems to determine break from proc-closure.
Branch coverage may insert some recording instructions after "send"
instruction, which broke the conditions for TAG_BREAK to work properly.
This change forces to set the continuation of catch of TAG_BREAK
immediately after "send" (or "invokesuper") instruction.
[Bug #18991]
This patch pushes dummy frames when loading code for the
profiling purpose.
The following methods push a dummy frame:
* `Kernel#require`
* `Kernel#load`
* `RubyVM::InstructionSequence.compile_file`
* `RubyVM::InstructionSequence.load_from_binary`
https://bugs.ruby-lang.org/issues/18559