Before this commit, we were mixing a lot of concerns with the prism
compile between RubyVM::InstructionSequence and the general entry
points to the prism parser/compiler.
This commit makes all of the various prism-related APIs mirror
their corresponding APIs in the existing parser/compiler. This means
we now have the correct frame naming, and it's much easier to follow
where the logic actually flows. Furthermore this consolidates a lot
of the prism initialization, making it easier to see where we could
potentially be raising errors.
This flag is set when the caller has already created a new array to
handle a splat, such as for `f(*a, b)` and `f(*a, *b)`. Previously,
if `f` was defined as `def f(*a)`, these calls would create an extra
array on the callee side, instead of using the new array created
by the caller.
This modifies `setup_args_core` to set the flag whenver it would add
a `splatarray true` instruction. However, when `splatarray true` is
changed to `splatarray false` in the peephole optimizer, to avoid
unnecessary allocations on the caller side, the flag must be removed.
Add `optimize_args_splat_no_copy` and have the peephole optimizer call
that. This significantly simplifies the related peephole optimizer
code.
On the callee side, in `setup_parameters_complex`, set
`args->rest_dupped` to true if the flag is set.
This takes a similar approach for optimizing regular splats that was
previiously used for keyword splats in
d2c41b1bff (via VM_CALL_KW_SPLAT_MUT).
Make it easier to check what annotations an ISEQ has. SINGLE_NOARG_LEAF
is added automatically, so it's hard to be sure about the annotation by
just reading code. It's also unclear to me what happens to it with
Primitive.mandatory_only?, but this at least explains that LEAF
annotation is not added to the non-mandatory_only ISEQ.
This causes the Iseq file names to be wrong, which is affecting
Tracepoint events in certain cases.
because we're taking a pointer to the string and using it in
`pm_string_mapped_pointer` we also need to `RB_GC_GUARD` the relevant
Ruby object to ensure it's not moved or swept before the parser has been
free'd.
pm_scope_node_destroy frees the scope node after we're done using it to
make sure that the index_lookup_table is not leaked.
For example:
10.times do
100_000.times do
RubyVM::InstructionSequence.compile_prism("begin; 1; rescue; 2; end")
end
puts `ps -o rss= -p #{$$}`
end
Before:
33056
50304
67776
84544
101520
118448
135712
152352
169136
186656
After:
15264
15296
15408
17040
17152
17152
18320
18352
18400
18608
There is a memory leak when passing a file to
RubyVM::InstructionSequence.compile_prism because it does not free the
mapped file.
For example:
require "tempfile"
Tempfile.create(%w"test_iseq .rb") do |f|
f.puts "name = 'Prism'; puts 'hello'"
f.close
10.times do
1_000.times do
RubyVM::InstructionSequence.compile_prism(f)
end
puts `ps -o rss= -p #{$$}`
end
end
Before:
27968
44848
61408
77872
94144
110432
126640
142816
159200
175584
After:
11504
12144
12592
13072
13488
13664
14064
14368
14704
15168
Before this commit the Prism compiler would try to intern constants
every time it re-entered. This pool of constants is "constant" (there is
only one pool per parser instance), so we should do it only once: upon
the top level entry to the compiler.
This change does just that: it populates the interned constants once.
Fixes: https://github.com/ruby/prism/issues/2152
This value is only incremented when rb_iseq_translate_threaded_code is
called, which doesn't happen for iseqs which result in a syntax error.
This is easy to hit by running a debug build with RUBY_FREE_AT_EXIT=1,
but any build and options could underflow this value by running enough
evals.
when the RUBY_FREE_ON_SHUTDOWN environment variable is set, manually free memory at shutdown.
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
The operands in each instruction needs to be pinned because if
auto-compaction runs in iseq_set_sequence, then the objects could exist
on the generated_iseq buffer, which would not be reference updated which
can lead to T_MOVED (and subsequently T_NONE) objects on the iseq.
* Provide a new API compile_file_prism which mirrors compile_file
but uses prism to parse/compile.
* Provide the ability to run test-all with RUBY_ISEQ_DUMP_DEBUG set
to "prism". If it is, we'll use the new compile_file_prism API to
load iseqs during the test run.
pm_scope_node_init is only used for CRuby, so should not live in the
ruby/prism repo. We will merge the changes here first so they're
not breaking, and will then remove from ruby/prism
We changed ScopeNodes to point to their parent (previous) ScopeNodes.
Accordingly, we can remove pm_compile_context_t, and store all
necessary context in ScopeNodes, allowing us to access locals from
outer scopes.
It's an estimator for application size and could be used as a
compilation heuristic later.
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* Add a compile_context arg to yp_compile_node
The compile_context will allow us to pass around the parser, and
the constants and lookup table (to be used in future commits).
* Compile yp_program_node_t and yp_statements_node_t
Add the compilation for program and statements node so that we can
successfully compile an empty program with YARP.
* Helper functions for parsing numbers, strings, and symbols
* Compile basic numeric / boolean node types in YARP
* Compile StringNode and SymbolNodes in YARP
* Compile several basic node types in YARP
* Added error return for missing node
* Add yarp/yarp_compiler.c as stencil for compiling YARP
This commit adds yarp/yarp_compiler.c, and changes the sync script
to ensure that yarp/yarp_compiler.c will not get overwritten
* [Misc #119772] Create and expose RubyVM::InstructionSequence.compile_yarp
This commit creates the stencil for a compile_yarp function, which
we will continue to fill out. It allows us to check the output
of compiled YARP code against compiled code without using YARP.
cc is callcache.
cc->klass (klass) should not be marked because if the klass is
free'ed, the cc->klass will be cleared by `vm_cc_invalidate()`.
cc->cme (cme) should not be marked because if cc is invalidated
when cme is free'ed.
- klass marks cme if klass uses cme.
- caller classe's ccs->cme marks cc->cme.
- if cc is invalidated (klass doesn't refer the cc),
cc is invalidated by `vm_cc_invalidate()` and cc->cme is
not be accessed.
- On the multi-Ractors, cme will be collected with global GC
so that it is safe if GC is not interleaving while accessing
cc and cme.
fix [Bug #19436]
```ruby
10_000.times{|i|
# p i if (i%1_000) == 0
str = "x" * 1_000_000
def str.foo = nil
eval "def call#{i}(s) = s.foo"
send "call#{i}", str
}
```
Without this patch:
```
real 1m5.639s
user 0m6.637s
sys 0m58.292s
```
and with this patch:
```
real 0m2.045s
user 0m1.627s
sys 0m0.164s
```
According to the C99 specification section 7.20.3.2 paragraph 2:
> If ptr is a null pointer, no action occurs.
So we do not need to check that the pointer is a null pointer.
Introduce Universal Parser mode for the parser.
This commit includes these changes:
* Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions
are passed via `struct rb_parser_config_struct` when this macro is enabled.
* Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.