Граф коммитов

1425 Коммитов

Автор SHA1 Сообщение Дата
Nobuyoshi Nakada 8b98b9e274
Check only whether `RUBY_DEVEL` is defined 2022-07-12 17:13:57 +09:00
Yusuke Endoh a871fc4d86 Fix a regression of b2e58b02ae
At that commit, I fixed a wrong conditional expression that was always
true.  However, that seemed to have caused a regression. [Bug #18906]

This change removes the condition to make the code always enabled.
It had been enabled until that commit, albeit unintentionally, and even
if it is enabled it only consumes a tiny bit of memory, so I believe it
is harmless. [Bug #18906]
2022-07-11 23:38:37 +09:00
Aaron Patterson 3cf2c2e4a1 Remove ISEQ_MARKABLE_ISEQ flag
We don't need this flag anymore.  We have all the info we need via the
bitmap and the is_entries list.
2022-07-07 11:56:25 -07:00
Aaron Patterson e3ab525f69 Fix ISeq dump / load in array cases
We need to dump relative offsets for inline storage entries so that
loading iseqs as an array works as well.  This commit also has some
minor refactoring to make computing relative ISE information easier.

This should fix the iseq dump / load as array tests we're seeing fail in
CI.

Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-06-29 16:21:48 -07:00
Aaron Patterson 87e2e3f383 Dump inline storage partition information to binary format
ISeqs loaded from binary were breaking because the storage partition
calculation had bugs in it.  Specifically it couldn't take in to account
the case when inline storage was overallocated (for example when we
allocate inline storage for an instruction but peephole optimization
eliminates that instruction).

`RUBY_ISEQ_DUMP_DEBUG=to_binary make test-all` would break, and this
patch fixes it
2022-06-24 15:04:00 -07:00
Aaron Patterson 0b58059f15 Free bitmap buffer if it's not used
If the iseqs don't have any objects in them that need marking, then
immediately free the bitmap buffer
2022-06-23 16:52:00 -07:00
Aaron Patterson 8d63a04703 Flatten bitmap when there is only one element
We can avoid allocating a bitmap when the number of elements in the iseq
is fewer than the size of an iseq_bits_t
2022-06-23 16:52:00 -07:00
Aaron Patterson 1ccdb1a251 Update vm_core.h
Co-authored-by: Tomás Coêlho <36938811+tomascco@users.noreply.github.com>
2022-06-23 14:01:46 -07:00
Aaron Patterson e23540e566 Speed up ISeq by marking via bitmaps and IC rearranging
This commit adds a bitfield to the iseq body that stores offsets inside
the iseq buffer that contain values we need to mark.  We can use this
bitfield to mark objects instead of disassembling the instructions.

This commit also groups inline storage entries and adds a counter for
each entry.  This allows us to iterate and mark each entry without
disassembling instructions

Since we have a bitfield and grouped inline caches, we can mark all
VALUE objects associated with instructions without actually
disassembling the instructions at mark time.

[Feature #18875] [ruby-core:109042]
2022-06-23 14:01:46 -07:00
Peter Zhu 2790bddda6 Remove unused function declaration
iseq_alloc is not used in compile.c. It is also a static function
declared in iseq.c so it's not accessible in compile.c.
2022-06-17 09:44:17 -04:00
Yusuke Endoh b2e58b02ae compile.c (add_adjust_info): Remove `insns_info_index > 0`
... because insns_info_index could not be zero here. Also it adds an
invariant check for that.

This change will prevent the following warning of GCC 12.1

http://rubyci.s3.amazonaws.com/arch/ruby-master/log/20220613T000004Z.log.html.gz
```
compile.c:2230:39: warning: array subscript 2147483647 is outside array bounds of ‘struct iseq_insn_info_entry[2147483647]’ [-Warray-bounds]
 2230 |         insns_info[insns_info_index-1].line_no != adjust->line_no) {
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~
```
2022-06-13 15:22:32 +09:00
Peter Zhu 5f10bd634f Add ISEQ_BODY macro
Use ISEQ_BODY macro to get the rb_iseq_constant_body of the ISeq. Using
this macro will make it easier for us to change the allocation strategy
of rb_iseq_constant_body when using Variable Width Allocation.
2022-03-24 10:03:51 -04:00
S.H f7491e89b9
Using macros to check iseq element 2022-03-02 09:27:30 +09:00
Nobuyoshi Nakada 8f3a36fb6e
Fix indents [ci skip] 2022-02-03 11:21:41 +09:00
Jemma Issroff 2913a2f5cf Treat TS_ICVARC cache as separate from TS_IVC cache 2022-02-02 09:20:34 -08:00
Jeremy Evans ca3d405242 Fix constant assignment evaluation order
Previously, the right hand side was always evaluated before the
left hand side for constant assignments.  For the following:

```ruby
lhs::C = rhs
```

rhs was evaluated before lhs, which is inconsistant with attribute
assignment (lhs.m = rhs), and apparently also does not conform to
JIS 3017:2013 11.4.2.2.3.

Fix this by changing evaluation order.  Previously, the above
compiled to:

```
0000 putself                                                          (   1)[Li]
0001 opt_send_without_block                 <calldata!mid:rhs, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0003 dup
0004 putself
0005 opt_send_without_block                 <calldata!mid:lhs, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0007 setconstant                            :C
0009 leave
```

After this change:

```
0000 putself                                                          (   1)[Li]
0001 opt_send_without_block                 <calldata!mid:lhs, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0003 putself
0004 opt_send_without_block                 <calldata!mid:rhs, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0006 swap
0007 topn                                   1
0009 swap
0010 setconstant                            :C
0012 leave
```

Note that if expr is not a module/class, then a TypeError is not
raised until after the evaluation of rhs.  This is because that
error is raised by setconstant.  If we wanted to raise TypeError
before evaluation of rhs, we would have to add a VM instruction
for calling vm_check_if_namespace.

Changing assignment order for single assignments caused problems
in the multiple assignment code, revealing that the issue also
affected multiple assignment.  Fix the multiple assignment code
so left-to-right evaluation also works for constant assignments.

Do some refactoring of the multiple assignment code to reduce
duplication after adding support for constants. Rename struct
masgn_attrasgn to masgn_lhs_node, since it now handles both
constants and attributes. Add add_masgn_lhs_node static function
for adding data for lhs attribute and constant setting.

Fixes [Bug #15928]
2022-01-14 11:00:26 -08:00
Nobuyoshi Nakada 54f0e63a8c Remove `NODE_DASGN_CURR` [Feature #18406]
This `NODE` type was used in pre-YARV implementation, to improve
the performance of assignment to dynamic local variable defined at
the innermost scope.  It has no longer any actual difference with
`NODE_DASGN`, except for the node dump.
2021-12-13 12:53:03 +09:00
John Hawthorn 4a3e7984bf
Avoid Array allocation when appending to args array (#5211)
* Use duparray when possible for argspush

ARGSPUSH is the node we see with a single value pushed to the end of a
splatted array. ARGSCAT is similar, but is used when multiple values are
being concatenated to the list.

Previously only ARGSCAT had an optimization where when all the values
were static it would use duparray instead of newarray to create the
intermediate array.

This commit adds similar behaviour for ARGSPUSH, using duparray instead
of putobject/newarray.

* Replace duparray with putobject before concatarray

When performing duparray/concatarray we know we'll never use the
intermediate array being created by duparray, so we should be able to
use it as a temporary object.

This avoids an extra array allocation for NODE_ARGSPUSH (ex. [*foo, 1])
and NODE_ARGSCAT (ex. [*foo, 1, 2]).
2021-12-07 15:18:11 -08:00
S.H ec7f14d9fa
Add `nd_type_p` macro 2021-12-04 00:01:24 +09:00
Nobuyoshi Nakada c14f230b26 Assign temporary ID to anonymous ID [Bug #18250]
Dumped iseq binary can not have unnamed symbols/IDs, and ID 0 is
stored instead.  As `struct rb_id_table` disallows ID 0, also for
the distinction, re-assign a new temporary ID based on the local
variable table index when loading from the binary, as well as the
parser.
2021-11-23 21:03:19 +09:00
Yusuke Endoh feda058531 Refactor hacky ID tables to struct rb_ast_id_table_t
The implementation of a local variable tables was represented as `ID*`,
but it was very hacky: the first element is not an ID but the size of
the table, and, the last element is (sometimes) a link to the next local
table only when the id tables are a linked list.

This change converts the hacky implementation to a normal struct.
2021-11-21 08:59:24 +09:00
Koichi Sasada 82ea287018 optimize `Struct` getter/setter
Introduce new optimized method type
`OPTIMIZED_METHOD_TYPE_STRUCT_AREF/ASET` with index information.
2021-11-19 08:32:39 +09:00
Jeremy Evans b08dacfea3
Optimize dynamic string interpolation for symbol/true/false/nil/0-9
This provides a significant speedup for symbol, true, false,
nil, and 0-9, class/module, and a small speedup in most other cases.

Speedups (using included benchmarks):
:symbol        :: 60%
0-9            :: 50%
Class/Module   :: 50%
nil/true/false :: 20%
integer        :: 10%
[]             :: 10%
""             :: 3%

One reason this approach is faster is it reduces the number of
VM instructions for each interpolated value.

Initial idea, approach, and benchmarks from Eric Wong. I applied
the same approach against the master branch, updating it to handle
the significant internal changes since this was first proposed 4
years ago (such as CALL_INFO/CALL_CACHE -> CALL_DATA). I also
expanded it to optimize true/false/nil/0-9/class/module, and added
handling of missing methods, refined methods, and RUBY_DEBUG.

This renames the tostring insn to anytostring, and adds an
objtostring insn that implements the optimization. This requires
making a few functions non-static, and adding some non-static
functions.

This disables 4 YJIT tests.  Those tests should be reenabled after
YJIT optimizes the new objtostring insn.

Implements [Feature #13715]

Co-authored-by: Eric Wong <e@80x24.org>
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Co-authored-by: Yusuke Endoh <mame@ruby-lang.org>
Co-authored-by: Koichi Sasada <ko1@atdot.net>
2021-11-18 15:10:20 -08:00
Yusuke Endoh 1e9ef03639 compile.c: remove dead code 2021-11-18 03:47:35 +09:00
Yusuke Endoh e1f6ca1911 compile.c: Fix typo 2021-11-18 03:47:35 +09:00
Koichi Sasada b1b73936c1 `Primitive.mandatory_only?` for fast path
Compare with the C methods, A built-in methods written in Ruby is
slower if only mandatory parameters are given because it needs to
check the argumens and fill default values for optional and keyword
parameters (C methods can check the number of parameters with `argc`,
so there are no overhead). Passing mandatory arguments are common
(optional arguments are exceptional, in many cases) so it is important
to provide the fast path for such common cases.

`Primitive.mandatory_only?` is a special builtin function used with
`if` expression like that:

```ruby
  def self.at(time, subsec = false, unit = :microsecond, in: nil)
    if Primitive.mandatory_only?
      Primitive.time_s_at1(time)
    else
      Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in))
    end
  end
```

and it makes two ISeq,

```
  def self.at(time, subsec = false, unit = :microsecond, in: nil)
    Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in))
  end

  def self.at(time)
    Primitive.time_s_at1(time)
  end
```

and (2) is pointed by (1). Note that `Primitive.mandatory_only?`
should be used only in a condition of an `if` statement and the
`if` statement should be equal to the methdo body (you can not
put any expression before and after the `if` statement).

A method entry with `mandatory_only?` (`Time.at` on the above case)
is marked as `iseq_overload`. When the method will be dispatch only
with mandatory arguments (`Time.at(0)` for example), make another
method entry with ISeq (2) as mandatory only method entry and it
will be cached in an inline method cache.

The idea is similar discussed in https://bugs.ruby-lang.org/issues/16254
but it only checks mandatory parameters or more, because many cases
only mandatory parameters are given. If we find other cases (optional
or keyword parameters are used frequently and it hurts performance),
we can extend the feature.
2021-11-15 15:58:56 +09:00
Nobuyoshi Nakada 9b751db99c Fix script_lines in loaded iseq as nil 2021-10-29 06:39:57 +09:00
Nobuyoshi Nakada 7459a32af3 suppress warnings for probable NULL dererefences 2021-10-24 19:24:50 +09:00
Koichi Sasada c7550537f1 `RubyVM.keep_script_lines`
`RubyVM.keep_script_lines` enables to keep script lines
for each ISeq and AST. This feature is for debugger/REPL
support.

```ruby
RubyVM.keep_script_lines = true
RubyVM::keep_script_lines = true

eval("def foo = nil\ndef bar = nil")
pp RubyVM::InstructionSequence.of(method(:foo)).script_lines
```
2021-10-21 16:17:39 +09:00
Alan Wu 27358b6ee4 Simplify code for YJIT const cache in compile.c
Since opt_getinlinecache and opt_setinlinecache point to the same cache
struct, there is no need to track the index of the get instruction and
then store it on the cache struct later when processing the set
instruction. Setting it when processing the get instruction works just
as well.

This change reduces our diff.
2021-10-20 18:19:43 -04:00
Noah Gibbs be06112d48 Fix changes from rebase 2021-10-20 18:19:42 -04:00
Alan Wu 5b4305f71c Simpler fix for -DUSE_EMBED_CI=0
Nobu pointed out that saving the old ci to a local is enough to keep it
reachable.
2021-10-20 18:19:38 -04:00
Alan Wu 8cf01dd25c Revert "Fix use-after-free on USE_EMBED_CI=0"
This reverts commit 1e0f2e4b09.
2021-10-20 18:19:38 -04:00
Alan Wu 736eb29a3c Fix use-after-free on USE_EMBED_CI=0
The old code didn't keep old_operands[0] reachable while allocating. You
can crash it by requiring erb under GC stress mode.
2021-10-20 18:19:38 -04:00
Alan Wu b626dd7211 YJIT: Fancier opt_getinlinecache
Make sure `opt_getinlinecache` is in a block all on its own, and
invalidate it from the interpreter when `opt_setinlinecache`.
It will recompile with a filled cache the second time around.
This lets YJIT runs well when the IC for constant is cold.
2021-10-20 18:19:33 -04:00
Maxime Chevalier-Boisvert e4c65ec49c Refactor uJIT code into more files for readability 2021-10-20 18:19:26 -04:00
Alan Wu 8bda11f690 MicroJIT: compile after ten calls 2021-10-20 18:19:25 -04:00
Alan Wu e8c914c250 Implement the --disable-ujit command line option 2021-10-20 18:19:24 -04:00
Alan Wu 265c5ca8b1 Avoid triggering GC while translating threaded code 2021-10-20 18:19:23 -04:00
Maxime Chevalier-Boisvert 038f5d964f Avoid recompiling overlapping instruction sequences in ujit 2021-10-20 18:19:23 -04:00
Alan Wu 4929ba0a5c Generate multiple copies of native code for `pop`
Insert generated addresses into st_table for mapping native code
addresses back to info about VM instructions. Export `encoded_insn_data`
to do this. Also some style fixes.
2021-10-20 18:19:23 -04:00
Maxime Chevalier-Boisvert 1c8fb90f6b Add new files, ujit_compile.c, ujit_compile.h 2021-10-20 18:19:23 -04:00
Maxime Chevalier-Boisvert 566d4abee5 Added shift instructions 2021-10-20 18:19:23 -04:00
Alan Wu 16c5ce863c Yeah, this actually works! 2021-10-20 18:19:22 -04:00
Nobuyoshi Nakada 768ceb4ead
Cast to void pointer for `%p` in commented out code [ci skip] 2021-10-20 11:22:33 +09:00
Aaron Patterson 217df51f0e Dump outer variables tables when dumping an iseq to binary
This commit dumps the outer variables table when dumping an iseq to
binary.  This fixes a case where Ractors aren't able to tell what outer
variables belong to a lambda after the lambda is loaded via ISeq.load_from_binary

[Bug #18232] [ruby-core:105504]
2021-10-07 15:39:47 -07:00
S.H dc9112cf10
Using NIL_P macro instead of `== Qnil` 2021-10-03 22:34:45 +09:00
S-H-GAMELINKS 83a5e2bb5c Using RB_FLOAT_TYPE_P macro 2021-09-12 11:16:31 +09:00
S-H-GAMELINKS 56065f0686 Using SYMBOL_P macro 2021-09-11 08:48:56 +09:00
Nobuyoshi Nakada cfbf2bde40
Remove unused argument 2021-09-10 21:26:16 +09:00
卜部昌平 dddc618d30 suppress GCC's -Wsuggest-attribute=format
I was not aware of this because I use clang these days.
2021-09-10 20:00:06 +09:00
S-H-GAMELINKS bdd6d8746f Replace RBOOL macro 2021-09-05 23:01:27 +09:00
Nobuyoshi Nakada cb3df3d87b
Extract compile_attrasgn from iseq_compile_each0 2021-09-01 15:19:11 +09:00
Nobuyoshi Nakada aac2b0fc6b
Extract compile_kw_arg from iseq_compile_each0 2021-09-01 15:19:11 +09:00
Nobuyoshi Nakada cbf841e3ed
Extract compile_errinfo from iseq_compile_each0 2021-09-01 15:19:11 +09:00
Nobuyoshi Nakada d7bba95eba
Extract compile_dots from iseq_compile_each0 2021-09-01 15:19:10 +09:00
Nobuyoshi Nakada d58143f3b5
Extract compile_colon3 from iseq_compile_each0 2021-09-01 15:19:10 +09:00
Nobuyoshi Nakada 70c8155d8b
Extract compile_colon2 from iseq_compile_each0 2021-09-01 15:19:10 +09:00
Nobuyoshi Nakada 270a674a79
Extract compile_match from iseq_compile_each0 2021-09-01 15:19:09 +09:00
Nobuyoshi Nakada a92fdc90da
Extract compile_yield from iseq_compile_each0 2021-09-01 15:19:09 +09:00
Nobuyoshi Nakada 996489d7e0
Extract compile_super from iseq_compile_each0 2021-09-01 15:19:09 +09:00
Nobuyoshi Nakada 6cf9f17191
Extract compile_op_log from iseq_compile_each0 2021-09-01 15:19:08 +09:00
Nobuyoshi Nakada d045d5f860
Extract compile_op_cdecl from iseq_compile_each0 2021-09-01 15:19:08 +09:00
Nobuyoshi Nakada 0c7ff37540
Extract compile_op_asgn2 from iseq_compile_each0 2021-09-01 15:19:08 +09:00
Nobuyoshi Nakada 0b87b75ae9
Extract compile_op_asgn1 from iseq_compile_each0 2021-09-01 15:19:07 +09:00
Nobuyoshi Nakada f781e537b5
Remove no longer used variable line_node 2021-08-31 15:27:02 +09:00
Nobuyoshi Nakada d23264d359
Extract compile_block from iseq_compile_each0
And constify `node` argument of `iseq_compile_each0`.
2021-08-31 15:27:02 +09:00
Nobuyoshi Nakada 181207e830
Constify line_node in iseq_compile_each0 2021-08-31 10:21:22 +09:00
Jeremy Evans 48c8df9e0e
Allow tracing of optimized methods
This updates the trace instructions to directly dispatch to
opt_send_without_block.  So this should cause no slowdown in
non-trace mode.

To enable the tracing of the optimized methods, RUBY_EVENT_C_CALL
and RUBY_EVENT_C_RETURN are added as events to the specialized
instructions.

Fixes [Bug #14870]

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2021-08-21 10:15:01 -07:00
Kazuki Tsujimoto 4568ba0711
Show verbose error messages when single pattern match fails
[0] => [0, *, a]
    #=> [0] length mismatch (given 1, expected 2+) (NoMatchingPatternError)

Ignore test failures of typeprof caused by this change for now.
2021-08-15 09:38:24 +09:00
Alan Wu cbecf9c7ba
Fix use-after-free on -DUSE_EMBED_CI=0
On -DUSE_EMBED_CI=0, there are more GC allocations and the old code
didn't keep old_operands[0] reachable while allocating. On a Debian
based system, I get a crash requiring erb under GC stress mode. On
macOS, tool/transcode-tblgen.rb runs incorrectly if I put GC.stress=true
as the first line.
2021-07-29 12:04:36 -04:00
Jeremy Evans fa87f72e1e Add pattern matching pin support for instance/class/global variables
Pin matching for local variables and constants is already supported,
and it is fairly simple to add support for these variable types.

Note that pin matching for method calls is still not supported
without wrapping in parentheses (pin expressions).  I think that's
for the best as method calls are far more complex (arguments/blocks).

Implements [Feature #17724]
2021-07-15 09:56:02 -07:00
Aaron Patterson 2599d1a8df Store the dup'd CDHASH in the object list during IBF load
Since b2fc592c30 nothing was holding a reference to the dup'd CDHASH
during IBF loading.  If a GC happened to run during IBF load then the
copied hash wouldn't have anything to keep it alive.  We don't really
want to keep the originally loaded CDHASH hash, so this patch just
overwrites the original hash with the copied / modified hash.

[Bug #17984] [ruby-core:104259]
2021-07-06 17:48:40 -07:00
eileencodes 31f4d26273 Check type of instruction - can be INSN or ADJUST
If the type is ADJUST we don't want to treat it like an INSN so we have
to check the type before reading from `insn_info.events`.

[Bug #18001] [ruby-core:104371]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-06-23 11:34:37 -07:00
eileencodes b91b3bc771 Add a cache for class variables
Redo of 34a2acdac788602c14bf05fb616215187badd504 and
931138b00696419945dc03e10f033b1f53cd50f3 which were reverted.

GitHub PR #4340.

This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up th ancestor tree before returning the cvar value. The deeper the
ancestor tree the slower cvar access will be.

The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows with the cache, this branch
is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the more clear the speed increase. IE if Base had
only one ancestor we'd see no improvement. This benchmark is run on a
vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache /  Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-06-18 10:02:44 -07:00
Yusuke Endoh 0a36cab1b5 Enable USE_ISEQ_NODE_ID by default
... which is formally called EXPERIMENTAL_ISEQ_NODE_ID.

See also ff69ef27b0.

https://bugs.ruby-lang.org/issues/17930
2021-06-18 03:35:38 +09:00
Yusuke Endoh dfba87cd62 Make it possible to get AST::Node from Thread::Backtrace::Location
RubyVM::AST.of(Thread::Backtrace::Location) returns a node that
corresponds to the location. Typically, the node is a method call, but
not always.

This change also includes iseq's dump/load support of node_ids for each
instructions.
2021-06-18 03:35:38 +09:00
Yusuke Endoh fb01411ae8 node.h: Reduce struct size to fit with Ruby object size (five VALUEs)
by merging `rb_ast_body_t#line_count` and `#script_lines`.

Fortunately `line_count == RARRAY_LEN(script_lines)` was always
satisfied. When script_lines is saved, it has an array of lines, and
when not saved, it has a Fixnum that represents the old line_count.
2021-06-18 02:34:27 +09:00
Yusuke Endoh acae5f363d ast.rb: RubyVM::AST.parse and .of accepts `save_script_lines: true`
This option makes the parser keep the original source as an array of
the original code lines. This feature exploits the mechanism of
`SCRIPT_LINES__` but records only the specified code that is passed to
RubyVM::AST.of or .parse, instead of recording all parsed program texts.
2021-06-18 02:34:27 +09:00
Nobuyoshi Nakada e4f891ce8d
Adjust styles [ci skip]
* --braces-after-func-def-line
* --dont-cuddle-else
* --procnames-start-lines
* --space-after-for
* --space-after-if
* --space-after-while
2021-06-17 10:13:40 +09:00
Nobuyoshi Nakada 9f3888d6a3 Warn more duplicate literal hash keys
Following non-special_const literals:
* T_REGEXP
2021-06-03 15:11:18 +09:00
Nobuyoshi Nakada 37eb5e7439 Warn more duplicate literal hash keys
Following non-special_const literals:
* T_BIGNUM
* T_FLOAT (non-flonum)
* T_RATIONAL
* T_COMPLEX
2021-06-03 15:11:18 +09:00
Takashi Kokubun 070caf54d2
Refactor rb_vm_insn_addr2insn calls
It's been a way too much amount of ifdefs.
2021-06-02 01:16:50 -07:00
Alan Wu 5ada23ac12 compile.c: Emit send for === calls in when statements
The checkmatch instruction with VM_CHECKMATCH_TYPE_CASE calls
=== without a call cache. Emit a send instruction to make the call
instead. It includes a call cache.

The call cache improves throughput of using when statements to check the
class of a given object. This is useful for say, JSON serialization.

Use of a regular send instead of checkmatch also avoids taking the VM
lock every time, which is good for multi-ractor workloads.

    Calculating -------------------------------------
                             master        post
         vm_case_classes    11.013M     16.172M i/s -      6.000M times in 0.544795s 0.371009s
             vm_case_lit      2.296       2.263 i/s -       1.000 times in 0.435606s 0.441826s
                 vm_case    74.098M     64.338M i/s -      6.000M times in 0.080974s 0.093257s

    Comparison:
                      vm_case_classes
                    post:  16172114.4 i/s
                  master:  11013316.9 i/s - 1.47x  slower

                          vm_case_lit
                  master:         2.3 i/s
                    post:         2.3 i/s - 1.01x  slower

                              vm_case
                  master:  74097858.6 i/s
                    post:  64338333.9 i/s - 1.15x  slower

The vm_case benchmark is a bit slower post patch, possibily due to the
larger instruction sequence. The benchmark dispatches using
opt_case_dispatch so was not running checkmatch and does not make the
=== call post patch.
2021-05-28 12:34:03 -04:00
Alan Wu 788d30a8b3 Make range literal peephole optimization target "newrange"
It looks for "checkmatch", when it could be applied to anything that has
"newrange".

Making the optimization target more ranges might only be fair play when
all ranges are frozen. So I'm putting a reference to the ticket that
froze all ranges.

[Feature #15504]
2021-05-28 12:34:03 -04:00
Alan Wu b2fc592c30
Build CDHASH properly when loading iseq from binary
Before this change, CDHASH operands were built as plain hashes when
loaded from binary. Without setting up the hash with the correct
st_table type, the hash can sometimes be an ar_table. When the hash is
an ar_table, lookups can call the `eql?` method on keys of the hash,
which makes the `opt_case_dispatch` instruction not "leaf" as it
implicitly declares.

The following script trips the stack canary for checking the leaf
attribute for `opt_case_dispatch` on VM_CHECK_MODE > 0 (enabled by
default with RUBY_DEBUG).

    rb_vm_iseq = RubyVM::InstructionSequence

    iseq = rb_vm_iseq.compile(<<-EOF)
      case Class.new(String).new("foo")
      when "foo"
        42
      end
    EOF

    puts rb_vm_iseq.load_from_binary(iseq.to_binary).eval

This commit changes the binary loading logic to build CDHASH with the
right st_table type. The dumping logic and the dump format stays the
same
2021-05-21 12:13:55 -04:00
Koichi Sasada 817764bd82 simple rescue+while+break should not use `throw`
609de71f04 fixes the issue by using
`throw` insn if `ensure` is used. However, that patch introduce
additional `throw` even if it is not needed. This patch solves
the issue.

This issue is pointed by @mame.
2021-05-21 18:12:14 +09:00
Yusuke Endoh 5026f9a5d5 compile.c: stop the jump-jump optimization if the second has any event
Fixes [Bug #17868]
2021-05-20 19:13:39 +09:00
Jeremy Evans 9ce29c94d8 Avoid improper optimization of case statements mixed integer/rational/complex
Fixes [Bug #17857]
2021-05-12 19:30:05 -07:00
卜部昌平 0ab0b86c84 cdhash_cmp: should use ||
cf: https://github.com/ruby/ruby/pull/4469#discussion_r628386707
2021-05-12 10:30:46 +09:00
卜部昌平 e1eff837cf cdhash_cmp: recursively apply
For instance a rational's numerator can be a bignum.  Comparison using
C's == can be insufficient.
2021-05-12 10:30:46 +09:00
卜部昌平 cc0dc67bbb cdhash_cmp: can also take complex
There are complex literals `123i`, which can also be a case condition.
2021-05-12 10:30:46 +09:00
卜部昌平 d0e6c6e682 cdhash_cmp: rational literals with fractions
Nobu kindly pointed out that rational literals can have fractions.
2021-05-12 10:30:46 +09:00
卜部昌平 2bc293e899 cdhash_cmp: can take rational literals
Rational literals are those integers suffixed with `r`.  They tend to
be a part of more complex expressions like `123/456r`, but in theory
they can live alone.  When such "bare" rational literals are passed to
case-when branch, we have to take care of them.  Fixes [Bug #17854]
2021-05-12 10:30:46 +09:00
Aaron Patterson 07f055bb13
Revert "Filling cache values on cvar write"
This reverts commit 08de37f9fa.
This reverts commit e8ae922b62.
2021-05-11 13:31:00 -07:00
eileencodes 08de37f9fa Filling cache values on cvar write
Instead of on read. Once it's in the inline cache we never have to make
one again. We want to eventually put the value into the cache, and the
best opportunity to do that is when you write the value.
2021-05-11 12:04:27 -07:00
eileencodes e8ae922b62 Add a cache for class variables
This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up th ancestor tree before returning the cvar value. The deeper the
ancestor tree the slower cvar access will be.

The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows with the cache, this branch
is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the more clear the speed increase. IE if Base had
only one ancestor we'd see no improvement. This benchmark is run on a
vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache /  Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-05-11 12:04:27 -07:00
Yusuke Endoh ff69ef27b0 compile.c: Pass node instead of nd_line(node) to ADD_INSN* functions
... then, new_insn_core extracts nd_line(node).

Also, if a macro "EXPERIMENTAL_ISEQ_NODE_ID" is defined, this changeset
keeps nd_node_id(node) for each instruction. This is intended for
TypeProf to identify what AST::Node corresponds to each instruction.

This patch is originally authored by @yui-knk for showing which column a
NoMethodError occurred.

https://github.com/ruby/ruby/compare/master...yui-knk:feature/node_id

Co-Authored-By: Yuichiro Kaneko <yui-knk@ruby-lang.org>
2021-05-07 17:02:15 +09:00
Koichi Sasada 609de71f04 fix raise in exception with jump
add_ensure_iseq() adds ensure block to the end of
jump such as next/redo/return. However, if the rescue
cause are in the body, this rescue catches the exception
in ensure clause.

  iter do
    next
  rescue
    R
  ensure
    raise
  end

In this case, R should not be executed, but executed without this patch.

Fixes [Bug #13930]
Fixes [Bug #16618]

A part of tests are written by @jeremyevans https://github.com/ruby/ruby/pull/4291
2021-04-22 11:33:39 +09:00
Jeremy Evans 50c54d40a8
Evaluate multiple assignment left hand side before right hand side
In regular assignment, Ruby evaluates the left hand side before
the right hand side.  For example:

```ruby
foo[0] = bar
```

Calls `foo`, then `bar`, then `[]=` on the result of `foo`.

Previously, multiple assignment didn't work this way.  If you did:

```ruby
abc.def, foo[0] = bar, baz
```

Ruby would previously call `bar`, then `baz`, then `abc`, then
`def=` on the result of `abc`, then `foo`, then `[]=` on the
result of `foo`.

This change makes multiple assignment similar to single assignment,
changing the evaluation order of the above multiple assignment code
to calling `abc`, then `foo`, then `bar`, then `baz`, then `def=` on
the result of `abc`, then `[]=` on the result of `foo`.

Implementing this is challenging with the stack-based virtual machine.
We need to keep track of all of the left hand side attribute setter
receivers and setter arguments, and then keep track of the stack level
while handling the assignment processing, so we can issue the
appropriate topn instructions to get the receiver.  Here's an example
of how the multiple assignment is executed, showing the stack and
instructions:

```
self                                      # putself
abc                                       # send
abc, self                                 # putself
abc, foo                                  # send
abc, foo, 0                               # putobject 0
abc, foo, 0, [bar, baz]                   # evaluate RHS
abc, foo, 0, [bar, baz], baz, bar         # expandarray
abc, foo, 0, [bar, baz], baz, bar, abc    # topn 5
abc, foo, 0, [bar, baz], baz, abc, bar    # swap
abc, foo, 0, [bar, baz], baz, def=        # send
abc, foo, 0, [bar, baz], baz              # pop
abc, foo, 0, [bar, baz], baz, foo         # topn 3
abc, foo, 0, [bar, baz], baz, foo, 0      # topn 3
abc, foo, 0, [bar, baz], baz, foo, 0, baz # topn 2
abc, foo, 0, [bar, baz], baz, []=         # send
abc, foo, 0, [bar, baz], baz              # pop
abc, foo, 0, [bar, baz]                   # pop
[bar, baz], foo, 0, [bar, baz]            # setn 3
[bar, baz], foo, 0                        # pop
[bar, baz], foo                           # pop
[bar, baz]                                # pop
```

As multiple assignment must deal with splats, post args, and any level
of nesting, it gets quite a bit more complex than this in non-trivial
cases. To handle this, struct masgn_state is added to keep
track of the overall state of the mass assignment, which stores a linked
list of struct masgn_attrasgn, one for each assigned attribute.

This adds a new optimization that replaces a topn 1/pop instruction
combination with a single swap instruction for multiple assignment
to non-aref attributes.

This new approach isn't compatible with one of the optimizations
previously used, in the case where the multiple assignment return value
was not needed, there was no lhs splat, and one of the left hand side
used an attribute setter.  This removes that optimization. Removing
the optimization allowed for removing the POP_ELEMENT and adjust_stack
functions.

This adds a benchmark to measure how much slower multiple
assignment is with the correct evaluation order.

This benchmark shows:

* 4-9% decrease for attribute sets
* 14-23% decrease for array member sets
* Basically same speed for local variable sets

Importantly, it shows no significant difference between the popped
(where return value of the multiple assignment is not needed) and
!popped (where return value of the multiple assignment is needed)
cases for attribute and array member sets.  This indicates the
previous optimization, which was dropped in the evaluation
order fix and only affected the popped case, is not important to
performance.

Fixes [Bug #4443]
2021-04-21 10:49:19 -07:00
Jeremy Evans 7b3c5ab8a5 Make defined? cache the results of method calls
Previously, defined? could result in many more method calls than
the code it was checking. `defined? a.b.c.d.e.f` generated 15 calls,
with `a` called 5 times, `b` called 4 times, etc..  This was due to
the fact that defined works in a recursive manner, but it previously
did not cache results.  So for `defined? a.b.c.d.e.f`, the logic was
similar to

```ruby
return nil unless defined? a
return nil unless defined? a.b
return nil unless defined? a.b.c
return nil unless defined? a.b.c.d
return nil unless defined? a.b.c.d.e
return nil unless defined? a.b.c.d.e.f
"method"
```

With this change, the logic is similar to the following, without
the creation of a local variable:

```ruby
return nil unless defined? a
_ = a
return nil unless defined? _.b
_ = _.b
return nil unless defined? _.c
_ = _.c
return nil unless defined? _.d
_ = _.d
return nil unless defined? _.e
_ = _.e
return nil unless defined? _.f
"method"
```

In addition to eliminating redundant method calls for defined
statements, this greatly simplifies the instruction sequences by
eliminating duplication.  Previously:

```
0000 putnil                                                           (   1)[Li]
0001 putself
0002 defined                                func, :a, false
0006 branchunless                           73
0008 putself
0009 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0011 defined                                method, :b, false
0015 branchunless                           73
0017 putself
0018 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0020 opt_send_without_block                 <calldata!mid:b, argc:0, ARGS_SIMPLE>
0022 defined                                method, :c, false
0026 branchunless                           73
0028 putself
0029 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0031 opt_send_without_block                 <calldata!mid:b, argc:0, ARGS_SIMPLE>
0033 opt_send_without_block                 <calldata!mid:c, argc:0, ARGS_SIMPLE>
0035 defined                                method, :d, false
0039 branchunless                           73
0041 putself
0042 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0044 opt_send_without_block                 <calldata!mid:b, argc:0, ARGS_SIMPLE>
0046 opt_send_without_block                 <calldata!mid:c, argc:0, ARGS_SIMPLE>
0048 opt_send_without_block                 <calldata!mid:d, argc:0, ARGS_SIMPLE>
0050 defined                                method, :e, false
0054 branchunless                           73
0056 putself
0057 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0059 opt_send_without_block                 <calldata!mid:b, argc:0, ARGS_SIMPLE>
0061 opt_send_without_block                 <calldata!mid:c, argc:0, ARGS_SIMPLE>
0063 opt_send_without_block                 <calldata!mid:d, argc:0, ARGS_SIMPLE>
0065 opt_send_without_block                 <calldata!mid:e, argc:0, ARGS_SIMPLE>
0067 defined                                method, :f, true
0071 swap
0072 pop
0073 leave
```

After change:

```
0000 putnil                                                           (   1)[Li]
0001 putself
0002 dup
0003 defined                                func, :a, false
0007 branchunless                           52
0009 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0011 dup
0012 defined                                method, :b, false
0016 branchunless                           52
0018 opt_send_without_block                 <calldata!mid:b, argc:0, ARGS_SIMPLE>
0020 dup
0021 defined                                method, :c, false
0025 branchunless                           52
0027 opt_send_without_block                 <calldata!mid:c, argc:0, ARGS_SIMPLE>
0029 dup
0030 defined                                method, :d, false
0034 branchunless                           52
0036 opt_send_without_block                 <calldata!mid:d, argc:0, ARGS_SIMPLE>
0038 dup
0039 defined                                method, :e, false
0043 branchunless                           52
0045 opt_send_without_block                 <calldata!mid:e, argc:0, ARGS_SIMPLE>
0047 defined                                method, :f, true
0051 swap
0052 pop
0053 leave
```

This fixes issues where for pathological small examples, Ruby would generate
huge instruction sequences.

Unfortunately, implementing this support is kind of a hack.  This adds another
parameter to compile_call for whether we should assume the receiver is already
present on the stack, and has defined? set that parameter for the specific
case where it is compiling a method call where the receiver is also a method
call.

defined_expr0 also takes an additional parameter for whether it should leave
the results of the method call on the stack.  If that argument is true, in
the case where the method isn't defined, we jump to the pop before the leave,
so the extra result is not left on the stack.  This requires space for an
additional label, so lfinish now needs to be able to hold 3 labels.

Fixes [Bug #17649]
Fixes [Bug #13708]
2021-03-29 07:45:15 -07:00
Kazuki Tsujimoto 21863470d9
Pattern matching pin operator against expression [Feature #17411]
This commit is based on the patch by @nobu.
2021-03-21 15:14:31 +09:00
Aaron Patterson 17bf478de1 Store strings for `defined` in the iseqs
We can know the string used for "defined" calls at compile time, then
store the string in the instruction sequences
2021-03-17 10:55:37 -07:00
Jean Boussier ef88225886 Simplify ibf_dump_object_symbol by delegating to ibf_dump_object_string 2021-03-10 13:44:07 -08:00
Jean Boussier 2de7fbcdbb Pre-freeze ISeq names to avoid useless duplication 2021-03-10 13:44:07 -08:00
Jean Boussier d00e7deb5c Use rb_enc_interned_str in ibf_load_object_string 2021-03-10 13:44:07 -08:00
Jean Boussier 8463c8a425 Specialize ibf_load_object_symbol and ibf_dump_object_symbol 2021-03-10 13:44:07 -08:00
Aaron Patterson 938e027cdf Eliminate useless catch tables and nops from lambdas
Before this commit:

```
$ ruby --dump=insn -e '1.times { |x| puts x }'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,22)> (catch: FALSE)
== catch table
| catch type: break  st: 0000 ed: 0004 sp: 0000 cont: 0004
| == disasm: #<ISeq:block in <main>@-e:1 (1,8)-(1,22)> (catch: FALSE)
| == catch table
| | catch type: redo   st: 0001 ed: 0006 sp: 0000 cont: 0001
| | catch type: next   st: 0001 ed: 0006 sp: 0000 cont: 0006
| |------------------------------------------------------------------------
| local table (size: 1, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
| [ 1] x@0<Arg>
| 0000 nop                                                              (   1)[Bc]
| 0001 putself                                [Li]
| 0002 getlocal_WC_0                          x@0
| 0004 opt_send_without_block                 <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
| 0006 leave                                  [Br]
|------------------------------------------------------------------------
0000 putobject_INT2FIX_1_                                             (   1)[Li]
0001 send                                   <calldata!mid:times, argc:0>, block in <main>
0004 leave
```

After this commit:

```
> ruby --dump=insn -e '1.times { |x| puts x }'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,22)> (catch: FALSE)
0000 putobject_INT2FIX_1_                                             (   1)[Li]
0001 send                                   <calldata!mid:times, argc:0>, block in <main>
0004 leave

== disasm: #<ISeq:block in <main>@-e:1 (1,8)-(1,22)> (catch: FALSE)
local table (size: 1, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] x@0<Arg>
0000 putself                                                          (   1)[LiBc]
0001 getlocal_WC_0                          x@0
0003 opt_send_without_block                 <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
0005 leave
```

Fixes [ruby-core:102418] [Feature #17613]

Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>
2021-02-16 14:00:36 -08:00
Vladimir Dementyev 1b89b99941
Mark pattern labels as unremoveable
Peephole optimization doesn't play well with find pattern at
least. The only case when a pattern matching could have
unreachable patterns is when we have lasgn/dasgn node, which
shouldn't happen in real-life.

Fixes https://bugs.ruby-lang.org/issues/17534
2021-01-19 08:34:01 +09:00
Aaron Patterson 5e26619660
Fix WB for callinfo
The WB for callinfo needs to be executed *after* the reference is
written.  Otherwise we get a WB miss.
2021-01-14 09:55:54 -08:00
Aaron Patterson efcdf68e64 Guard callinfo
Callinfo was being written in to an array and the GC would not see the
reference on the stack.  `new_insn_send` creates a new callinfo object,
then it calls `new_insn_core`.  `new_insn_core` allocates a new INSN
linked list item, which can end up calling `xmalloc` which will trigger
a GC:

  70cd351c7c/compile.c (L968-L969)

Since the callinfo object isn't on the stack, the GC won't see it, and
it can get collected.  This patch just refactors `new_insn_send` to keep
the object on the stack

Co-authored-by: John Hawthorn <john@hawthorn.email>
2021-01-13 16:13:53 -08:00
Aaron Patterson c8b47eb7c9 only add the trailing nop if the catch table is not break / next / redo
We don't need nop padding when the catch tables are only for break /
next / redo, so lets avoid them.  This eliminates nop padding in
many lambdas.

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2021-01-13 14:56:05 -08:00
Koichi Sasada e7fc353f04 enable constant cache on ractors
constant cache `IC` is accessed by non-atomic manner and there are
thread-safety issues, so Ruby 3.0 disables to use const cache on
non-main ractors.

This patch enables it by introducing `imemo_constcache` and allocates
it by every re-fill of const cache like `imemo_callcache`.
[Bug #17510]

Now `IC` only has one entry `IC::entry` and it points to
`iseq_inline_constant_cache_entry`, managed by T_IMEMO object.

`IC` is atomic data structure so `rb_mjit_before_vm_ic_update()` and
`rb_mjit_after_vm_ic_update()` is not needed.
2021-01-05 02:27:58 +09:00
Nobuyoshi Nakada 7156248137
Hoisted out compile_builtin_arg to refine messages 2021-01-01 23:32:07 +09:00
Nobuyoshi Nakada 0fbf4d0374
Access to reserved word parameter like as `__builtin.arg!(:if)` 2020-12-31 15:11:38 +09:00
Nobuyoshi Nakada c8010fcec0
Dup kwrest hash when merging other keyword arguments [Bug #17481] 2020-12-28 01:52:18 +09:00
Nobuyoshi Nakada d143b75f8e
Adjusted indents [ci skip] 2020-12-25 00:56:17 +09:00
John Hawthorn 40b7358e93 Skip defined check in NODE_OP_ASGN_OR with ivar
Previously we would add code to check if an ivar was defined when using
`@foo ||= 123`, which was slower than `@foo || (@foo = 123)` when `@foo`
was already defined.

Recently 01b7d5acc7 made it so that
accessing an undefined variable no longer generates a warning, making
the defined check unnecessary and both statements exactly equal.

This commit avoids emitting the defined instruction when compiling
NODE_OP_ASGN_OR with a NODE_IVAR.

Before:

    $ ruby --dump=insn -e '@foo ||= 123'
    == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
    0000 putnil                                                           (   1)[Li]
    0001 defined                      instance-variable, :@foo, false
    0005 branchunless                 14
    0007 getinstancevariable          :@foo, <is:0>
    0010 dup
    0011 branchif                     20
    0013 pop
    0014 putobject                    123
    0016 dup
    0017 setinstancevariable          :@foo, <is:0>
    0020 leave

After:

    $ ./ruby --dump=insn -e '@foo ||= 123'
    == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
    0000 getinstancevariable                    :@foo, <is:0>             (   1)[Li]
    0003 dup
    0004 branchif                               13
    0006 pop
    0007 putobject                              123
    0009 dup
    0010 setinstancevariable                    :@foo, <is:0>
    0013 leave

This seems to be about 50% faster in this benchmark:

    require "benchmark/ips"

    class Foo
      def initialize
        @foo = nil
      end

      def test1
        @foo ||= 123
      end

      def test2
        @foo || (@foo = 123)
      end
    end

    FOO = Foo.new

    Benchmark.ips do |x|
      x.report("test1", "FOO.test1")
      x.report("test2", "FOO.test2")
    end

Before:

    $ ruby benchmark_ivar.rb
    Warming up --------------------------------------
                   test1     1.957M i/100ms
                   test2     3.125M i/100ms
    Calculating -------------------------------------
                   test1     20.030M (± 1.7%) i/s -    101.780M in   5.083040s
                   test2     31.227M (± 4.5%) i/s -    156.262M in   5.015936s

After:

    $ ./ruby benchmark_ivar.rb
    Warming up --------------------------------------
                   test1     3.205M i/100ms
                   test2     3.197M i/100ms
    Calculating -------------------------------------
                   test1     32.066M (± 1.1%) i/s -    163.440M in   5.097581s
                   test2     31.438M (± 4.9%) i/s -    159.860M in   5.098961s
2020-12-14 19:38:59 -08:00
Nobuyoshi Nakada 555bd83a8e
Raise when loading unprovided builtin function [Bug #17192] 2020-11-30 15:19:49 +09:00
Aaron Patterson 67b2c21c32
Add `GC.auto_compact= true/false` and `GC.auto_compact`
* `GC.auto_compact=`, `GC.auto_compact` can be used to control when
  compaction runs.  Setting `auto_compact=` to true will cause
  compaction to occurr duing major collections.  At the moment,
  compaction adds significant overhead to major collections, so please
  test first!

[Feature #17176]
2020-11-02 14:42:48 -08:00
wanabe 4f8d9b0db8 Revert "Use adjusted sp on `iseq_set_sequence()`" and "Delay `remove_unreachable_chunk()` after `iseq_set_sequence()`"
This reverts commit 3685ed7303 and 5dc107b03f.
Because of some CI failures https://github.com/ruby/ruby/pull/3404#issuecomment-719868313.
2020-10-31 11:56:41 +09:00
Nobuyoshi Nakada dd2f99d94a
Removed unused variable 2020-10-31 10:51:57 +09:00
wanabe 3685ed7303 Use adjusted sp on `iseq_set_sequence()` 2020-10-31 09:18:37 +09:00
wanabe 5dc107b03f Delay `remove_unreachable_chunk()` after `iseq_set_sequence()` 2020-10-31 09:18:37 +09:00
Nobuyoshi Nakada 799253dc46
strip trailing spaces [ci skip] 2020-10-30 12:26:59 +09:00
Koichi Sasada 07c03bc309 check isolated Proc more strictly
Isolated Proc prohibit to access outer local variables, but it was
violated by binding and so on, so they should be error.
2020-10-29 23:42:55 +09:00
Kenta Murata fb3c711df3
compile.c: separate compile_builtin_function_call (#3711) 2020-10-28 10:22:28 +09:00
Stefan Stüben 8c2e5bbf58 Don't redefine #rb_intern over and over again 2020-10-21 12:45:18 +09:00
Koichi Sasada f6661f5085 sync RClass::ext::iv_index_tbl
iv_index_tbl manages instance variable indexes (ID -> index).
This data structure should be synchronized with other ractors
so introduce some VM locks.

This patch also introduced atomic ivar cache used by
set/getinlinecache instructions. To make updating ivar cache (IVC),
we changed iv_index_tbl data structure to manage (ID -> entry)
and an entry points serial and index. IVC points to this entry so
that cache update becomes atomically.
2020-10-17 08:18:04 +09:00
wanabe 65ae7f347a Adjust sp for `if true or ...`/`if false and ...` 2020-10-16 08:37:04 +09:00
wanabe ce7a053475 Adjust sp for `x = false; y = (return until x unless x)` [Bug #16695] 2020-10-16 08:37:04 +09:00
Aaron Patterson d528254095
Add missing WB for iseq
The write barrier wasn't being called for this object, so add the
missing WB.  Automatic compaction moved the reference because it didn't
know about the relationship (that's how I found the missing WB).
2020-10-07 17:01:14 -07:00
Nobuyoshi Nakada fced98f464
Added the room for builtin inline prefix 2020-10-03 12:19:56 +09:00
Nobuyoshi Nakada 5a665f6ce7
Check builtin inline function index overflow 2020-10-03 10:47:24 +09:00
Koichi Sasada 8d76b729a1 Put same frozen Range literal if possible
Range literal is now frozen so we can reuse same Range object if
the begin and the last are Numeric (frozen), such as `(1..2)`.
2020-10-02 09:22:17 +09:00
Nobuyoshi Nakada 7b2bea42a2
Unfreeze string-literal-only interpolated string-literal
[Feature #17104]
2020-09-30 22:15:28 +09:00
Benoit Daloze 9b535f3ff7 Interpolated strings are no longer frozen with frozen-string-literal: true
* Remove freezestring instruction since this was the only usage for it.
* [Feature #17104]
2020-09-15 21:32:35 +02:00
Koichi Sasada 79df14c04b Introduce Ractor mechanism for parallel execution
This commit introduces Ractor mechanism to run Ruby program in
parallel. See doc/ractor.md for more details about Ractor.
See ticket [Feature #17100] to see the implementation details
and discussions.

[Feature #17100]

This commit does not complete the implementation. You can find
many bugs on using Ractor. Also the specification will be changed
so that this feature is experimental. You will see a warning when
you make the first Ractor with `Ractor.new`.

I hope this feature can help programmers from thread-safety issues.
2020-09-03 21:11:06 +09:00
Nobuyoshi Nakada a90f29ebb2
procnames-start-lines [ci skip] 2020-08-17 14:27:34 +09:00
Nobuyoshi Nakada 352e923242
Revisit "Refactor to reduce "swap" instruction of pattern matching"
Just moved "case base" after allocating cache space.
2020-08-17 14:25:09 +09:00
Kazuhiro NISHIYAMA 5849309c5a
Revert "Refactor to reduce "swap" instruction of pattern matching"
This reverts commit 3a4be429b5.

To fix following warning:

```
compiling ../compile.c
../compile.c:6336:20: warning: variable 'line' is uninitialized when used here [-Wuninitialized]
    ADD_INSN(head, line, putnil); /* allocate stack for cached #deconstruct value */
                   ^~~~
../compile.c:220:57: note: expanded from macro 'ADD_INSN'
  ADD_ELEM((seq), (LINK_ELEMENT *) new_insn_body(iseq, (line), BIN(insn), 0))
                                                        ^~~~
../compile.c:6327:13: note: initialize the variable 'line' to silence this warning
    int line;
            ^
             = 0
1 warning generated.
```
2020-08-17 09:28:15 +09:00
wanabe 3a4be429b5 Refactor to reduce "swap" instruction of pattern matching 2020-08-16 18:53:39 +09:00
wanabe 5c40c88a3e Adjust sp for `case ... in a: 0 ... end` 2020-08-16 18:39:08 +09:00
wanabe 691f10dd89 Adjust sp for `case ... in *, a, * end` 2020-08-16 18:39:08 +09:00
wanabe 6c407b3668 Adjust sp for `case ... in *v end`/`case ... in v1, v2 end` 2020-08-16 18:39:08 +09:00
wanabe c866d6563f Adjust sp for `case ... in v1 ... in v2 end` 2020-08-16 18:39:08 +09:00
wanabe d594078426 Adjust sp for `case ... in v1, v2 ... end` 2020-08-16 18:39:08 +09:00
wanabe 2bbb7c3d1f Adjust sp for `case ... in pat => var ... end` 2020-08-16 18:39:08 +09:00
wanabe 6bc0c6c18b Adjust sp for `case ... in pat1 | pat2 ... end` 2020-08-16 18:39:08 +09:00
wanabe 0759862458 Adjust sp for pattern matching implicit/explicit "else" 2020-08-16 18:39:08 +09:00
wanabe a7bd0ec570 Warn sp overwriting on compile time 2020-08-16 08:43:29 +09:00
wanabe ac399c2c7a Show hidden object and TS_BUILTIN for halfbaked insn data 2020-08-16 08:43:29 +09:00
Koichi Sasada a0f12a0258
Use ID instead of GENTRY for gvars. (#3278)
Use ID instead of GENTRY for gvars.

Global variables are compiled into GENTRY (a pointer to struct
rb_global_entry). This patch replace this GENTRY to ID and
make the code simple.

We need to search GENTRY from ID every time (st_lookup), so
additional overhead will be introduced.
However, the performance of accessing global variables is not
important now a day and this simplicity helps Ractor development.
2020-07-03 16:56:44 +09:00
卜部昌平 bacd03ebdf compile_redo: fix wrong condition 2020-06-29 11:05:41 +09:00
卜部昌平 9c92dcf366 ibf_dump_object_object: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 a8d992ac00 compile_call: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 aa2cb7f722 compile_redo: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 cf29de7e6e compile_next: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 cc1e9b8e11 compile_break: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 1f90690a1d compile_branch_condition: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 a6b1454a5d optimize_checktype: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 a5342f46e6 iseq_set_exception_table: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
Nobuyoshi Nakada 25fbc88666
Removed non-ASCII code to suppress warnings by localized compilers 2020-06-27 23:15:13 +09:00
Kazuki Tsujimoto 600f3990d6
Cosmetic change 2020-06-27 13:53:47 +09:00
Vladimir Dementyev c9ee34a18b Add #deconstruct cache to find pattern 2020-06-27 13:51:03 +09:00
Vladimir Dementyev 5320375732 Optimize array pattern matching by caching #deconstruct value 2020-06-27 13:51:03 +09:00
Nobuyoshi Nakada 74c345c7b8 Removed no longer used flags 2020-06-27 10:55:18 +09:00
Nobuyoshi Nakada 61984d4760 Not to rewrite node while compiling
Moved this hack mark to an argument to `compile_hash`.
> Bad Hack: temporarily mark hash node with flag so
> compile_hash can compile call differently.
2020-06-27 10:55:18 +09:00
Takashi Kokubun 7561db8c00
Introduce Primitive.attr! to annotate 'inline' (#3242)
[Feature #15589]
2020-06-20 17:13:03 -07:00
Yusuke Endoh 50efa18c6c compile.c: Improve branch coverage instrumentation [Bug #16967]
Formerly, branch coverage measurement counters are generated for each
compilation traverse of the AST.  However, ensure clause node is
traversed twice; one is for normal-exit case (the resulted bytecode is
embedded in its outer scope), and the other is for exceptional case (the
resulted bytecode is used in catch table).  Two branch coverage counters
are generated for the two cases, but it is not desired.

This changeset revamps the internal representation of branch coverage
measurement.  Branch coverage counters are generated only at the first
visit of a branch node.  Visiting the same node reuses the
already-generated counter, so double counting is avoided.
2020-06-20 09:28:03 +09:00
Yusuke Endoh bc0aea0804 compile.c: pass NODE* instead of a quadruple of code location 2020-06-20 09:28:03 +09:00
Yusuke Endoh 4b523e79a0 compile.c (branch_coverage_valid_p): Refactored out 2020-06-20 09:28:03 +09:00
Yusuke Endoh 37cd877dbd compile.c: Use functions for building branch coverage instructions
instead of maros.  Just refactoring.
2020-06-20 09:28:03 +09:00
Nobuyoshi Nakada 49f0fd21e4 [Feature #16254] Allow `Primitive.func` style 2020-06-19 18:46:55 +09:00
Nobuyoshi Nakada c8703a17ce [Feature #16254] Allow `__builtin.func` style 2020-06-19 18:46:55 +09:00
Jeremy Evans aae8223c70 Dup splat array in certain cases where there is a block argument
This makes:

```ruby
  args = [1, 2, -> {}]; foo(*args, &args.pop)
```

call `foo` with 1, 2, and the lambda, in addition to passing the
lambda as a block.  This is different from the previous behavior,
which passed the lambda as a block but not as a regular argument,
which goes against the expected left-to-right evaluation order.

This is how Ruby already compiled arguments if using leading
arguments, trailing arguments, or keywords in the same call.

This works by disabling the optimization that skipped duplicating
the array during the splat (splatarray instruction argument
switches from false to true).  In the above example, the splat
call duplicates the array.  I've tested and cases where a
local variable or symbol are used do not duplicate the array,
so I don't expect this to decrease the performance of most Ruby
programs.  However, programs such as:

```ruby
  foo(*args, &bar)
```

could see a decrease in performance, if `bar` is a method call
and not a local variable.

This is not a perfect solution, there are ways to get around
this:

```ruby
  args = Struct.new(:a).new([:x, :y])
  def args.to_a; a; end
  def args.to_proc; a.pop; ->{}; end
  foo(*args, &args)
  # calls foo with 1 argument (:x)
  # not 2 arguments (:x and :y)
```

A perfect solution would require completely disabling the
optimization.

Fixes [Bug #16504]
Fixes [Bug #16500]
2020-06-18 08:19:33 -07:00
Nobuyoshi Nakada ccb7a4b9f2
Replaced accessors of `Struct` with `invokebuiltin` 2020-06-17 08:18:46 +09:00
Nobuyoshi Nakada 318d52e820
Revert "Replaced accessors of `Struct` with `invokebuiltin`"
This reverts commit 19cabe8b09,
which didn't support tool/lib/iseq_loader_checker.rb.
2020-06-16 18:44:58 +09:00
Nobuyoshi Nakada 19cabe8b09
Replaced accessors of `Struct` with `invokebuiltin` 2020-06-16 18:24:02 +09:00
Kazuki Tsujimoto ddded1157a
Introduce find pattern [Feature #16828] 2020-06-14 09:24:36 +09:00
Jeremy Evans 0ba27259d3 Fix crashes in the peephole optimizer on OpenBSD/sparc64
These crashes are due to alignment issues, casting ADJUST to INSN
and then accessing after the end of the ADJUST.  These patches
come from Stefan Sperling <stsp@apache.org>, who reported the
issue.
2020-06-08 11:11:27 -07:00
Nobuyoshi Nakada 6dcd10ce56
compile.c: Mark cursor in debug list 2020-05-31 03:07:30 +09:00
Nobuyoshi Nakada 2c711273bb
compile.c: Removed wrong conversion 2020-05-31 03:05:10 +09:00
Nobuyoshi Nakada 57fd44d374
Adjusted an indent 2020-05-30 21:20:50 +09:00
Koichi Sasada 03d7f3cdb2 add indent for debug disasm output 2020-05-29 11:25:45 +09:00
Nobuyoshi Nakada 79d9528ddc
Fixed potential memory leak
Create a wrapper object first, then buffer allocation which can
fail.
2020-05-22 06:50:29 +09:00
Aaron Patterson 891e253ee7
Use a pinning list for keeping objects alive during assembly.
The GC will not disassemble incomplete instruction sequences.  So it is
important that when instructions are being assembled, any objects the
instructions point at should not be moved.

This patch implements a fixed width array that pins its references.
When the instructions are done being assembled, the pinning array goes
away and the objects inside the iseqs are allowed to move.
2020-05-20 11:16:21 -07:00
Nobuyoshi Nakada c0cd474d4f
Prefer dedicated enum over int 2020-05-18 12:42:33 +09:00
Nobuyoshi Nakada b02c10b240
built-in method call must not have a receiver 2020-05-18 12:28:50 +09:00
卜部昌平 233c2018f1 drop varargs.h support
This header file is simply out of date (for decades since at least
1989).  It's the 21st century.  Just stop using it.
2020-05-11 14:56:51 +09:00
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平 d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
Nobuyoshi Nakada 5d430c1b34
Added more NORETURN declarations 2020-05-11 00:40:14 +09:00
Kazuki Tsujimoto d6389224da
Fix pseudo code for NODE_ARYPTN, NODE_HSHPTN
Due to the change in 3893a8dd42,
there is no longer a need to put true/false.
2020-05-04 13:27:25 +09:00
Nobuyoshi Nakada 69b3e0ac59 Create succ_index_table as a part of `iseq_setup`
With compiling `CPDEBUG >= 2`, `rb_iseq_disasm` segfaults if this
table has not been created.  Also `ibf_load_iseq_each` calls
`rb_iseq_insns_info_encode_positions`.
2020-04-15 16:06:48 +09:00
Nobuyoshi Nakada 2c12d111cb
Disassemble nop-inserted list 2020-04-15 12:34:40 +09:00
Nobuyoshi Nakada de2afcbf0c
Show heading for update_catch_except_flags 2020-04-15 12:28:31 +09:00
Alan Wu 82fdffc5ec
Avoid UB with flexible array member
Accessing past the end of an array is technically UB. Use C99 flexible
array member instead to avoid the UB and simplify allocation size
calculation.

See also: DCL38-C in the SEI CERT C Coding Standard
2020-04-12 15:19:06 -04:00
Nobuyoshi Nakada e474c189da
Suppress -Wswitch warnings 2020-04-08 15:13:37 +09:00
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00