Commit graph

1203 Commits

Author SHA1 Message Date
Aaron Patterson 2599d1a8df Store the dup'd CDHASH in the object list during IBF load
Since b2fc592c30 nothing was holding a reference to the dup'd CDHASH
during IBF loading.  If a GC happened to run during IBF load then the
copied hash wouldn't have anything to keep it alive.  We don't really
want to keep the originally loaded CDHASH hash, so this patch just
overwrites the original hash with the copied / modified hash.

[Bug #17984] [ruby-core:104259]
2021-07-06 17:48:40 -07:00
eileencodes 31f4d26273 Check type of instruction - can be INSN or ADJUST
If the type is ADJUST we don't want to treat it like an INSN so we have
to check the type before reading from `insn_info.events`.

[Bug #18001] [ruby-core:104371]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-06-23 11:34:37 -07:00
eileencodes b91b3bc771 Add a cache for class variables
Redo of 34a2acdac788602c14bf05fb616215187badd504 and
931138b00696419945dc03e10f033b1f53cd50f3 which were reverted.

GitHub PR #4340.

This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up the ancestor tree before returning the cvar value. The deeper the
ancestor tree the slower cvar access will be.
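
To illustrate the access path described above, here is a minimal sketch in which a class includes 26 anonymous modules and then reads a class variable. The names `Base` and `logger` are made up for this example, and the sketch only shows the shape of the lookup, not the cache itself.

```ruby
# Hypothetical class used only for illustration.
class Base
  # Each included module adds one more entry to the ancestor chain.
  26.times { include Module.new }

  @@logger = :stdout_logger

  # Reading @@logger is the operation the cache is meant to speed up.
  def self.logger
    @@logger
  end
end

puts Base.ancestors.size
puts Base.logger.inspect
```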

The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows with the cache, this branch
is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the clearer the speed increase. That is, if Base had
only one ancestor we'd see no improvement. This benchmark is run on a
vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache /  Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstrate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-06-18 10:02:44 -07:00
Yusuke Endoh 0a36cab1b5 Enable USE_ISEQ_NODE_ID by default
... which was formerly called EXPERIMENTAL_ISEQ_NODE_ID.

See also ff69ef27b0.

https://bugs.ruby-lang.org/issues/17930
2021-06-18 03:35:38 +09:00
Yusuke Endoh dfba87cd62 Make it possible to get AST::Node from Thread::Backtrace::Location
RubyVM::AST.of(Thread::Backtrace::Location) returns a node that
corresponds to the location. Typically, the node is a method call, but
not always.

This change also includes iseq dump/load support for the node_id of each
instruction.
2021-06-18 03:35:38 +09:00
Yusuke Endoh fb01411ae8 node.h: Reduce struct size to fit with Ruby object size (five VALUEs)
by merging `rb_ast_body_t#line_count` and `#script_lines`.

Fortunately `line_count == RARRAY_LEN(script_lines)` was always
satisfied. When script_lines is saved, it has an array of lines, and
when not saved, it has a Fixnum that represents the old line_count.
2021-06-18 02:34:27 +09:00
Yusuke Endoh acae5f363d ast.rb: RubyVM::AST.parse and .of accepts `save_script_lines: true`
This option makes the parser keep the original source as an array of
the original code lines. This feature exploits the mechanism of
`SCRIPT_LINES__` but records only the specified code that is passed to
RubyVM::AST.of or .parse, instead of recording all parsed program texts.
2021-06-18 02:34:27 +09:00
Nobuyoshi Nakada e4f891ce8d
Adjust styles [ci skip]
* --braces-after-func-def-line
* --dont-cuddle-else
* --procnames-start-lines
* --space-after-for
* --space-after-if
* --space-after-while
2021-06-17 10:13:40 +09:00
Nobuyoshi Nakada 9f3888d6a3 Warn more duplicate literal hash keys
Following non-special_const literals:
* T_REGEXP
2021-06-03 15:11:18 +09:00
Nobuyoshi Nakada 37eb5e7439 Warn more duplicate literal hash keys
Following non-special_const literals:
* T_BIGNUM
* T_FLOAT (non-flonum)
* T_RATIONAL
* T_COMPLEX
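
As a rough illustration of the literals listed above, the sketch below evaluates a hash literal with a duplicated rational key and collects whatever is written to `$stderr` while doing so. The helper `capture_warnings` is hypothetical, and the exact warning text is an assumption that is deliberately not checked here.

```ruby
require "stringio"

# Hypothetical helper: evaluate source text while collecting $stderr output.
def capture_warnings(source)
  original = $stderr
  $stderr = StringIO.new
  value = eval(source)
  [value, $stderr.string]
ensure
  $stderr = original
end

value, warnings = capture_warnings("{ 1r => :first, 1r => :second }")
p value        # the later entry overwrites the earlier one
puts warnings  # any duplicate-key warning would show up here
```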
2021-06-03 15:11:18 +09:00
Takashi Kokubun 070caf54d2
Refactor rb_vm_insn_addr2insn calls
There were far too many ifdefs.
2021-06-02 01:16:50 -07:00
Alan Wu 5ada23ac12 compile.c: Emit send for === calls in when statements
The checkmatch instruction with VM_CHECKMATCH_TYPE_CASE calls
=== without a call cache. Emit a send instruction to make the call
instead. It includes a call cache.

The call cache improves throughput of using when statements to check the
class of a given object. This is useful for, say, JSON serialization.
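
For readers unfamiliar with the pattern, here is a small sketch of the kind of code that benefits: a `case` that dispatches on the class of a value, where each `when` clause performs a `Class === value` call. The `serialize` method is a made-up example, not code from this patch.

```ruby
# Hypothetical serializer in the spirit of the JSON use case.
def serialize(value)
  case value
  when String  then "\"#{value}\""
  when Integer then value.to_s
  when Array   then "[" + value.map { |v| serialize(v) }.join(",") + "]"
  when nil     then "null"
  else raise ArgumentError, "unsupported: #{value.class}"
  end
end

puts serialize([1, "two", nil])
```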

Use of a regular send instead of checkmatch also avoids taking the VM
lock every time, which is good for multi-ractor workloads.

    Calculating -------------------------------------
                             master        post
         vm_case_classes    11.013M     16.172M i/s -      6.000M times in 0.544795s 0.371009s
             vm_case_lit      2.296       2.263 i/s -       1.000 times in 0.435606s 0.441826s
                 vm_case    74.098M     64.338M i/s -      6.000M times in 0.080974s 0.093257s

    Comparison:
                      vm_case_classes
                    post:  16172114.4 i/s
                  master:  11013316.9 i/s - 1.47x  slower

                          vm_case_lit
                  master:         2.3 i/s
                    post:         2.3 i/s - 1.01x  slower

                              vm_case
                  master:  74097858.6 i/s
                    post:  64338333.9 i/s - 1.15x  slower

The vm_case benchmark is a bit slower post patch, possibly due to the
larger instruction sequence. The benchmark dispatches using
opt_case_dispatch so was not running checkmatch and does not make the
=== call post patch.
2021-05-28 12:34:03 -04:00
Alan Wu 788d30a8b3 Make range literal peephole optimization target "newrange"
It looks for "checkmatch", when it could be applied to anything that has
"newrange".

Making the optimization target more ranges might only be fair play when
all ranges are frozen. So I'm putting a reference to the ticket that
froze all ranges.
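
The frozen-ness that this reasoning relies on can be observed from Ruby code. A minimal sketch, assuming a Ruby version that includes Feature #15504 (the method name `range_literal` is made up):

```ruby
# Hypothetical method returning a range literal.
def range_literal
  (1..10)
end

puts range_literal.frozen?

begin
  # Any attempt to modify a frozen range is rejected.
  range_literal.instance_variable_set(:@note, "x")
rescue FrozenError => e
  puts e.class
end
```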

[Feature #15504]
2021-05-28 12:34:03 -04:00
Alan Wu b2fc592c30
Build CDHASH properly when loading iseq from binary
Before this change, CDHASH operands were built as plain hashes when
loaded from binary. Without setting up the hash with the correct
st_table type, the hash can sometimes be an ar_table. When the hash is
an ar_table, lookups can call the `eql?` method on keys of the hash,
which makes the `opt_case_dispatch` instruction not "leaf" as it
implicitly declares.

The following script trips the stack canary for checking the leaf
attribute for `opt_case_dispatch` on VM_CHECK_MODE > 0 (enabled by
default with RUBY_DEBUG).

    rb_vm_iseq = RubyVM::InstructionSequence

    iseq = rb_vm_iseq.compile(<<-EOF)
      case Class.new(String).new("foo")
      when "foo"
        42
      end
    EOF

    puts rb_vm_iseq.load_from_binary(iseq.to_binary).eval

This commit changes the binary loading logic to build CDHASH with the
right st_table type. The dumping logic and the dump format stay the
same.
2021-05-21 12:13:55 -04:00
Koichi Sasada 817764bd82 simple rescue+while+break should not use `throw`
609de71f04 fixes the issue by using a
`throw` insn if `ensure` is used. However, that patch introduces an
additional `throw` even when it is not needed. This patch solves
that issue.

This issue was pointed out by @mame.
2021-05-21 18:12:14 +09:00
Yusuke Endoh 5026f9a5d5 compile.c: stop the jump-jump optimization if the second has any event
Fixes [Bug #17868]
2021-05-20 19:13:39 +09:00
Jeremy Evans 9ce29c94d8 Avoid improper optimization of case statements mixed integer/rational/complex
Fixes [Bug #17857]
2021-05-12 19:30:05 -07:00
卜部昌平 0ab0b86c84 cdhash_cmp: should use ||
cf: https://github.com/ruby/ruby/pull/4469#discussion_r628386707
2021-05-12 10:30:46 +09:00
卜部昌平 e1eff837cf cdhash_cmp: recursively apply
For instance a rational's numerator can be a bignum.  Comparison using
C's == can be insufficient.
2021-05-12 10:30:46 +09:00
卜部昌平 cc0dc67bbb cdhash_cmp: can also take complex
There are complex literals `123i`, which can also be a case condition.
2021-05-12 10:30:46 +09:00
卜部昌平 d0e6c6e682 cdhash_cmp: rational literals with fractions
Nobu kindly pointed out that rational literals can have fractions.
2021-05-12 10:30:46 +09:00
卜部昌平 2bc293e899 cdhash_cmp: can take rational literals
Rational literals are those integers suffixed with `r`.  They tend to
be a part of more complex expressions like `123/456r`, but in theory
they can live alone.  When such "bare" rational literals are passed to
a case-when branch, we have to take care of them.  Fixes [Bug #17854]
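
A small sketch of what such a `case` looks like from the Ruby side, with a bare rational literal and an imaginary literal used as `when` conditions. The method `classify` is a made-up example, and the results assume a Ruby that includes the related fixes in this series.

```ruby
# Hypothetical method mixing rational, complex and integer literals.
def classify(x)
  case x
  when 1r then :one       # bare rational literal
  when 2i then :two_i     # imaginary (complex) literal
  when 3  then :three
  else :other
  end
end

p classify(1r)
p classify(2i)
p classify(3.0)
```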
2021-05-12 10:30:46 +09:00
Aaron Patterson 07f055bb13
Revert "Filling cache values on cvar write"
This reverts commit 08de37f9fa.
This reverts commit e8ae922b62.
2021-05-11 13:31:00 -07:00
eileencodes 08de37f9fa Filling cache values on cvar write
Instead of on read. Once it's in the inline cache we never have to make
one again. We want to eventually put the value into the cache, and the
best opportunity to do that is when you write the value.
2021-05-11 12:04:27 -07:00
eileencodes e8ae922b62 Add a cache for class variables
This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up the ancestor tree before returning the cvar value. The deeper the
ancestor tree the slower cvar access will be.

The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows with the cache, this branch
is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the clearer the speed increase. That is, if Base had
only one ancestor we'd see no improvement. This benchmark is run on a
vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache /  Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstrate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-05-11 12:04:27 -07:00
Yusuke Endoh ff69ef27b0 compile.c: Pass node instead of nd_line(node) to ADD_INSN* functions
... then, new_insn_core extracts nd_line(node).

Also, if a macro "EXPERIMENTAL_ISEQ_NODE_ID" is defined, this changeset
keeps nd_node_id(node) for each instruction. This is intended for
TypeProf to identify what AST::Node corresponds to each instruction.

This patch is originally authored by @yui-knk for showing which column a
NoMethodError occurred.

https://github.com/ruby/ruby/compare/master...yui-knk:feature/node_id

Co-Authored-By: Yuichiro Kaneko <yui-knk@ruby-lang.org>
2021-05-07 17:02:15 +09:00
Koichi Sasada 609de71f04 fix raise in exception with jump
add_ensure_iseq() adds the ensure block to the end of
a jump such as next/redo/return. However, if a rescue
clause is in the body, this rescue catches the exception
raised in the ensure clause.

  iter do
    next
  rescue
    R
  ensure
    raise
  end

In this case, R should not be executed, but it is executed without this patch.
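
The snippet above can be turned into a self-contained check. This is only a sketch: `iter`, `run_example` and the `events` array are made-up names, and the expected result assumes a Ruby that includes this patch.

```ruby
# Hypothetical iterator that simply yields once.
def iter
  yield
end

def run_example
  events = []
  begin
    iter do
      next
    rescue
      events << :rescued          # "R" in the snippet above
    ensure
      raise "raised in ensure"
    end
  rescue => e
    # The exception from the ensure clause should arrive here instead.
    events << e.message
  end
  events
end

p run_example
```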

Fixes [Bug #13930]
Fixes [Bug #16618]

Some of the tests were written by @jeremyevans https://github.com/ruby/ruby/pull/4291
2021-04-22 11:33:39 +09:00
Jeremy Evans 50c54d40a8
Evaluate multiple assignment left hand side before right hand side
In regular assignment, Ruby evaluates the left hand side before
the right hand side.  For example:

```ruby
foo[0] = bar
```

Calls `foo`, then `bar`, then `[]=` on the result of `foo`.

Previously, multiple assignment didn't work this way.  If you did:

```ruby
abc.def, foo[0] = bar, baz
```

Ruby would previously call `bar`, then `baz`, then `abc`, then
`def=` on the result of `abc`, then `foo`, then `[]=` on the
result of `foo`.

This change makes multiple assignment similar to single assignment,
changing the evaluation order of the above multiple assignment code
to calling `abc`, then `foo`, then `bar`, then `baz`, then `def=` on
the result of `abc`, then `[]=` on the result of `foo`.
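
The order can be observed by logging each call. The following is a sketch with made-up names (`Tracer`, `Target`, and a setter called `attr=` standing in for `def=`); the expected order assumes a Ruby that includes this change.

```ruby
# Hypothetical tracer that records the order in which things are called.
class Tracer
  # Receiver object whose setters append to the shared log.
  class Target
    def initialize(name, log)
      @name = name
      @log = log
    end

    def attr=(_value)
      @log << "#{@name}.attr="
    end

    def []=(_index, _value)
      @log << "#{@name}[]="
    end
  end

  def initialize
    @log = []
  end

  def abc; @log << "abc"; Target.new("abc", @log); end
  def foo; @log << "foo"; Target.new("foo", @log); end
  def bar; @log << "bar"; 1; end
  def baz; @log << "baz"; 2; end

  def run
    abc.attr, foo[0] = bar, baz
    @log
  end
end

p Tracer.new.run
```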

Implementing this is challenging with the stack-based virtual machine.
We need to keep track of all of the left hand side attribute setter
receivers and setter arguments, and then keep track of the stack level
while handling the assignment processing, so we can issue the
appropriate topn instructions to get the receiver.  Here's an example
of how the multiple assignment is executed, showing the stack and
instructions:

```
self                                      # putself
abc                                       # send
abc, self                                 # putself
abc, foo                                  # send
abc, foo, 0                               # putobject 0
abc, foo, 0, [bar, baz]                   # evaluate RHS
abc, foo, 0, [bar, baz], baz, bar         # expandarray
abc, foo, 0, [bar, baz], baz, bar, abc    # topn 5
abc, foo, 0, [bar, baz], baz, abc, bar    # swap
abc, foo, 0, [bar, baz], baz, def=        # send
abc, foo, 0, [bar, baz], baz              # pop
abc, foo, 0, [bar, baz], baz, foo         # topn 3
abc, foo, 0, [bar, baz], baz, foo, 0      # topn 3
abc, foo, 0, [bar, baz], baz, foo, 0, baz # topn 2
abc, foo, 0, [bar, baz], baz, []=         # send
abc, foo, 0, [bar, baz], baz              # pop
abc, foo, 0, [bar, baz]                   # pop
[bar, baz], foo, 0, [bar, baz]            # setn 3
[bar, baz], foo, 0                        # pop
[bar, baz], foo                           # pop
[bar, baz]                                # pop
```

As multiple assignment must deal with splats, post args, and any level
of nesting, it gets quite a bit more complex than this in non-trivial
cases. To handle this, struct masgn_state is added to keep
track of the overall state of the mass assignment, which stores a linked
list of struct masgn_attrasgn, one for each assigned attribute.

This adds a new optimization that replaces a topn 1/pop instruction
combination with a single swap instruction for multiple assignment
to non-aref attributes.

This new approach isn't compatible with one of the optimizations
previously used, in the case where the multiple assignment return value
was not needed, there was no lhs splat, and one of the left hand sides
used an attribute setter.  This removes that optimization. Removing
the optimization allowed for removing the POP_ELEMENT and adjust_stack
functions.

This adds a benchmark to measure how much slower multiple
assignment is with the correct evaluation order.

This benchmark shows:

* 4-9% decrease for attribute sets
* 14-23% decrease for array member sets
* Basically same speed for local variable sets

Importantly, it shows no significant difference between the popped
(where return value of the multiple assignment is not needed) and
!popped (where return value of the multiple assignment is needed)
cases for attribute and array member sets.  This indicates the
previous optimization, which was dropped in the evaluation
order fix and only affected the popped case, is not important to
performance.

Fixes [Bug #4443]
2021-04-21 10:49:19 -07:00
Jeremy Evans 7b3c5ab8a5 Make defined? cache the results of method calls
Previously, defined? could result in many more method calls than
the code it was checking. `defined? a.b.c.d.e.f` generated 15 calls,
with `a` called 5 times, `b` called 4 times, etc.  This was due to
the fact that `defined?` works in a recursive manner, but it previously
did not cache results.  So for `defined? a.b.c.d.e.f`, the logic was
similar to

```ruby
return nil unless defined? a
return nil unless defined? a.b
return nil unless defined? a.b.c
return nil unless defined? a.b.c.d
return nil unless defined? a.b.c.d.e
return nil unless defined? a.b.c.d.e.f
"method"
```

With this change, the logic is similar to the following, without
the creation of a local variable:

```ruby
return nil unless defined? a
_ = a
return nil unless defined? _.b
_ = _.b
return nil unless defined? _.c
_ = _.c
return nil unless defined? _.d
_ = _.d
return nil unless defined? _.e
_ = _.e
return nil unless defined? _.f
"method"
```
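
The effect on call counts can be checked with a small counter. This is a sketch with a made-up `Chain` class; the counts shown in the test assume a Ruby that includes this change.

```ruby
# Hypothetical class whose methods count how often they are called.
class Chain
  attr_reader :calls

  def initialize
    @calls = Hash.new(0)
  end

  def a; @calls[:a] += 1; self; end
  def b; @calls[:b] += 1; self; end
  def c; @calls[:c] += 1; self; end

  # The last method in the chain is only checked, never called.
  def check
    defined?(a.b.c)
  end
end

chain = Chain.new
p chain.check
p chain.calls
```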

In addition to eliminating redundant method calls for `defined?`
expressions, this greatly simplifies the instruction sequences by
eliminating duplication.  Previously:

```
0000 putnil                                                           (   1)[Li]
0001 putself
0002 defined                                func, :a, false
0006 branchunless                           73
0008 putself
0009 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0011 defined                                method, :b, false
0015 branchunless                           73
0017 putself
0018 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0020 opt_send_without_block                 <calldata!mid:b, argc:0, ARGS_SIMPLE>
0022 defined                                method, :c, false
0026 branchunless                           73
0028 putself
0029 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0031 opt_send_without_block                 <calldata!mid:b, argc:0, ARGS_SIMPLE>
0033 opt_send_without_block                 <calldata!mid:c, argc:0, ARGS_SIMPLE>
0035 defined                                method, :d, false
0039 branchunless                           73
0041 putself
0042 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0044 opt_send_without_block                 <calldata!mid:b, argc:0, ARGS_SIMPLE>
0046 opt_send_without_block                 <calldata!mid:c, argc:0, ARGS_SIMPLE>
0048 opt_send_without_block                 <calldata!mid:d, argc:0, ARGS_SIMPLE>
0050 defined                                method, :e, false
0054 branchunless                           73
0056 putself
0057 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0059 opt_send_without_block                 <calldata!mid:b, argc:0, ARGS_SIMPLE>
0061 opt_send_without_block                 <calldata!mid:c, argc:0, ARGS_SIMPLE>
0063 opt_send_without_block                 <calldata!mid:d, argc:0, ARGS_SIMPLE>
0065 opt_send_without_block                 <calldata!mid:e, argc:0, ARGS_SIMPLE>
0067 defined                                method, :f, true
0071 swap
0072 pop
0073 leave
```

After change:

```
0000 putnil                                                           (   1)[Li]
0001 putself
0002 dup
0003 defined                                func, :a, false
0007 branchunless                           52
0009 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0011 dup
0012 defined                                method, :b, false
0016 branchunless                           52
0018 opt_send_without_block                 <calldata!mid:b, argc:0, ARGS_SIMPLE>
0020 dup
0021 defined                                method, :c, false
0025 branchunless                           52
0027 opt_send_without_block                 <calldata!mid:c, argc:0, ARGS_SIMPLE>
0029 dup
0030 defined                                method, :d, false
0034 branchunless                           52
0036 opt_send_without_block                 <calldata!mid:d, argc:0, ARGS_SIMPLE>
0038 dup
0039 defined                                method, :e, false
0043 branchunless                           52
0045 opt_send_without_block                 <calldata!mid:e, argc:0, ARGS_SIMPLE>
0047 defined                                method, :f, true
0051 swap
0052 pop
0053 leave
```

This fixes issues where for pathological small examples, Ruby would generate
huge instruction sequences.

Unfortunately, implementing this support is kind of a hack.  This adds another
parameter to compile_call for whether we should assume the receiver is already
present on the stack, and has defined? set that parameter for the specific
case where it is compiling a method call where the receiver is also a method
call.

defined_expr0 also takes an additional parameter for whether it should leave
the results of the method call on the stack.  If that argument is true, in
the case where the method isn't defined, we jump to the pop before the leave,
so the extra result is not left on the stack.  This requires space for an
additional label, so lfinish now needs to be able to hold 3 labels.

Fixes [Bug #17649]
Fixes [Bug #13708]
2021-03-29 07:45:15 -07:00
Kazuki Tsujimoto 21863470d9
Pattern matching pin operator against expression [Feature #17411]
This commit is based on the patch by @nobu.
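
A short sketch of the syntax this feature enables, where the pin operator is applied to a parenthesized expression rather than a variable. The method `match_sum` is a made-up example and needs a Ruby that supports Feature #17411.

```ruby
# Hypothetical method: matches when value equals the sum of a and b.
def match_sum(value, a, b)
  case value
  in ^(a + b) then :equal_to_sum   # pin operator against an expression
  in Integer  then :other_integer
  else :not_an_integer
  end
end

p match_sum(5, 2, 3)
p match_sum(6, 2, 3)
```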
2021-03-21 15:14:31 +09:00
Aaron Patterson 17bf478de1 Store strings for `defined` in the iseqs
We can know the string used for "defined" calls at compile time, then
store the string in the instruction sequences.
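
For context, these are the kinds of strings that `defined?` returns, each determined by the form of the expression. The method name and hash keys below are made up for the sketch.

```ruby
# Hypothetical method collecting a few defined? results.
def defined_examples
  {
    constant:   defined?(String),
    method:     defined?(puts),
    ivar:       defined?(@missing),   # nil: the ivar was never set
    call:       defined?(1 + 1),
    assignment: defined?(x = 1),
    self_ref:   defined?(self),
  }
end

p defined_examples
```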
2021-03-17 10:55:37 -07:00
Jean Boussier ef88225886 Simplify ibf_dump_object_symbol by delegating to ibf_dump_object_string 2021-03-10 13:44:07 -08:00
Jean Boussier 2de7fbcdbb Pre-freeze ISeq names to avoid useless duplication 2021-03-10 13:44:07 -08:00
Jean Boussier d00e7deb5c Use rb_enc_interned_str in ibf_load_object_string 2021-03-10 13:44:07 -08:00
Jean Boussier 8463c8a425 Specialize ibf_load_object_symbol and ibf_dump_object_symbol 2021-03-10 13:44:07 -08:00
Aaron Patterson 938e027cdf Eliminate useless catch tables and nops from lambdas
Before this commit:

```
$ ruby --dump=insn -e '1.times { |x| puts x }'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,22)> (catch: FALSE)
== catch table
| catch type: break  st: 0000 ed: 0004 sp: 0000 cont: 0004
| == disasm: #<ISeq:block in <main>@-e:1 (1,8)-(1,22)> (catch: FALSE)
| == catch table
| | catch type: redo   st: 0001 ed: 0006 sp: 0000 cont: 0001
| | catch type: next   st: 0001 ed: 0006 sp: 0000 cont: 0006
| |------------------------------------------------------------------------
| local table (size: 1, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
| [ 1] x@0<Arg>
| 0000 nop                                                              (   1)[Bc]
| 0001 putself                                [Li]
| 0002 getlocal_WC_0                          x@0
| 0004 opt_send_without_block                 <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
| 0006 leave                                  [Br]
|------------------------------------------------------------------------
0000 putobject_INT2FIX_1_                                             (   1)[Li]
0001 send                                   <calldata!mid:times, argc:0>, block in <main>
0004 leave
```

After this commit:

```
> ruby --dump=insn -e '1.times { |x| puts x }'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,22)> (catch: FALSE)
0000 putobject_INT2FIX_1_                                             (   1)[Li]
0001 send                                   <calldata!mid:times, argc:0>, block in <main>
0004 leave

== disasm: #<ISeq:block in <main>@-e:1 (1,8)-(1,22)> (catch: FALSE)
local table (size: 1, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] x@0<Arg>
0000 putself                                                          (   1)[LiBc]
0001 getlocal_WC_0                          x@0
0003 opt_send_without_block                 <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
0005 leave
```
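
The same kind of listing can be produced from within Ruby, which may be handy when comparing versions. This is a sketch using CRuby's `RubyVM::InstructionSequence`; the helper `disasm_of` is a made-up name, and the exact output differs between Ruby versions.

```ruby
# Hypothetical helper: compile source text and return its disassembly.
def disasm_of(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

puts disasm_of("1.times { |x| puts x }")
```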

Fixes [ruby-core:102418] [Feature #17613]

Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>
2021-02-16 14:00:36 -08:00
Vladimir Dementyev 1b89b99941
Mark pattern labels as unremoveable
Peephole optimization doesn't play well with find pattern, at
least. The only case where pattern matching could have
unreachable patterns is when we have an lasgn/dasgn node, which
shouldn't happen in real life.

Fixes https://bugs.ruby-lang.org/issues/17534
2021-01-19 08:34:01 +09:00
Aaron Patterson 5e26619660
Fix WB for callinfo
The WB for callinfo needs to be executed *after* the reference is
written.  Otherwise we get a WB miss.
2021-01-14 09:55:54 -08:00
Aaron Patterson efcdf68e64 Guard callinfo
Callinfo was being written in to an array and the GC would not see the
reference on the stack.  `new_insn_send` creates a new callinfo object,
then it calls `new_insn_core`.  `new_insn_core` allocates a new INSN
linked list item, which can end up calling `xmalloc` which will trigger
a GC:

  70cd351c7c/compile.c (L968-L969)

Since the callinfo object isn't on the stack, the GC won't see it, and
it can get collected.  This patch just refactors `new_insn_send` to keep
the object on the stack.

Co-authored-by: John Hawthorn <john@hawthorn.email>
2021-01-13 16:13:53 -08:00
Aaron Patterson c8b47eb7c9 only add the trailing nop if the catch table is not break / next / redo
We don't need nop padding when the catch tables are only for break /
next / redo, so let's avoid them.  This eliminates nop padding in
many lambdas.

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2021-01-13 14:56:05 -08:00
Koichi Sasada e7fc353f04 enable constant cache on ractors
The constant cache `IC` is accessed in a non-atomic manner and there are
thread-safety issues, so Ruby 3.0 disables the const cache on
non-main ractors.

This patch enables it by introducing `imemo_constcache` and allocating
it on every re-fill of the const cache, like `imemo_callcache`.
[Bug #17510]

Now `IC` only has one entry `IC::entry` and it points to
`iseq_inline_constant_cache_entry`, managed by a T_IMEMO object.

`IC` is an atomic data structure, so `rb_mjit_before_vm_ic_update()` and
`rb_mjit_after_vm_ic_update()` are not needed.
2021-01-05 02:27:58 +09:00
Nobuyoshi Nakada 7156248137
Hoisted out compile_builtin_arg to refine messages 2021-01-01 23:32:07 +09:00
Nobuyoshi Nakada 0fbf4d0374
Access reserved-word parameters like `__builtin.arg!(:if)` 2020-12-31 15:11:38 +09:00
Nobuyoshi Nakada c8010fcec0
Dup kwrest hash when merging other keyword arguments [Bug #17481] 2020-12-28 01:52:18 +09:00
Nobuyoshi Nakada d143b75f8e
Adjusted indents [ci skip] 2020-12-25 00:56:17 +09:00
John Hawthorn 40b7358e93 Skip defined check in NODE_OP_ASGN_OR with ivar
Previously we would add code to check if an ivar was defined when using
`@foo ||= 123`, which was slower than `@foo || (@foo = 123)` when `@foo`
was already defined.

Recently 01b7d5acc7 made it so that
accessing an undefined variable no longer generates a warning, making
the defined check unnecessary and both statements exactly equal.

This commit avoids emitting the defined instruction when compiling
NODE_OP_ASGN_OR with a NODE_IVAR.
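
For reference, the two spellings being compared behave the same from the caller's point of view. A minimal sketch with a made-up `Memo` class:

```ruby
# Hypothetical class showing both forms of lazy ivar initialization.
class Memo
  def with_or_assign
    @foo ||= 123
  end

  def with_explicit_or
    @bar || (@bar = 123)
  end
end

memo = Memo.new
p memo.with_or_assign
p memo.with_explicit_or
```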

Before:

    $ ruby --dump=insn -e '@foo ||= 123'
    == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
    0000 putnil                                                           (   1)[Li]
    0001 defined                      instance-variable, :@foo, false
    0005 branchunless                 14
    0007 getinstancevariable          :@foo, <is:0>
    0010 dup
    0011 branchif                     20
    0013 pop
    0014 putobject                    123
    0016 dup
    0017 setinstancevariable          :@foo, <is:0>
    0020 leave

After:

    $ ./ruby --dump=insn -e '@foo ||= 123'
    == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
    0000 getinstancevariable                    :@foo, <is:0>             (   1)[Li]
    0003 dup
    0004 branchif                               13
    0006 pop
    0007 putobject                              123
    0009 dup
    0010 setinstancevariable                    :@foo, <is:0>
    0013 leave

This seems to be about 50% faster in this benchmark:

    require "benchmark/ips"

    class Foo
      def initialize
        @foo = nil
      end

      def test1
        @foo ||= 123
      end

      def test2
        @foo || (@foo = 123)
      end
    end

    FOO = Foo.new

    Benchmark.ips do |x|
      x.report("test1", "FOO.test1")
      x.report("test2", "FOO.test2")
    end

Before:

    $ ruby benchmark_ivar.rb
    Warming up --------------------------------------
                   test1     1.957M i/100ms
                   test2     3.125M i/100ms
    Calculating -------------------------------------
                   test1     20.030M (± 1.7%) i/s -    101.780M in   5.083040s
                   test2     31.227M (± 4.5%) i/s -    156.262M in   5.015936s

After:

    $ ./ruby benchmark_ivar.rb
    Warming up --------------------------------------
                   test1     3.205M i/100ms
                   test2     3.197M i/100ms
    Calculating -------------------------------------
                   test1     32.066M (± 1.1%) i/s -    163.440M in   5.097581s
                   test2     31.438M (± 4.9%) i/s -    159.860M in   5.098961s
2020-12-14 19:38:59 -08:00
Nobuyoshi Nakada 555bd83a8e
Raise when loading unprovided builtin function [Bug #17192] 2020-11-30 15:19:49 +09:00
Aaron Patterson 67b2c21c32
Add `GC.auto_compact= true/false` and `GC.auto_compact`
* `GC.auto_compact=`, `GC.auto_compact` can be used to control when
  compaction runs.  Setting `auto_compact=` to true will cause
  compaction to occur during major collections.  At the moment,
  compaction adds significant overhead to major collections, so please
  test first!
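
A cautious way to try the setting is to enable it temporarily and restore the previous value afterwards. The helpers below are made-up names for this sketch, and the `NotImplementedError` handling assumes that some platforms may not support compaction.

```ruby
# Hypothetical helper: true when the auto_compact accessor is usable here.
def auto_compact_supported?
  GC.auto_compact
  true
rescue NotImplementedError
  false
end

# Hypothetical helper: run a block with auto compaction enabled, then restore.
def with_auto_compact
  return yield(nil) unless auto_compact_supported?

  previous = GC.auto_compact
  GC.auto_compact = true
  begin
    yield GC.auto_compact
  ensure
    GC.auto_compact = previous
  end
end

with_auto_compact do |enabled|
  puts "auto_compact during block: #{enabled.inspect}"
  GC.start # a major collection; compaction may run here when enabled
end
```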

[Feature #17176]
2020-11-02 14:42:48 -08:00
wanabe 4f8d9b0db8 Revert "Use adjusted sp on `iseq_set_sequence()`" and "Delay `remove_unreachable_chunk()` after `iseq_set_sequence()`"
This reverts commits 3685ed7303 and 5dc107b03f
because of some CI failures https://github.com/ruby/ruby/pull/3404#issuecomment-719868313.
2020-10-31 11:56:41 +09:00
Nobuyoshi Nakada dd2f99d94a
Removed unused variable 2020-10-31 10:51:57 +09:00