Граф коммитов

1023 Коммитов

Автор SHA1 Сообщение Дата
Koichi Sasada df48db987d `mandatory_only_cme` should not be in `def`
`def` (`rb_method_definition_t`) is shared by multiple callable
method entries (cme, `rb_callable_method_entry_t`).

There are two issues:

* old -> young reference: `cme1->def->mandatory_only_cme = monly_cme`
  if `cme1` is young and `monly_cme` is young, there is no problem.
  Howevr, another old `cme2` can refer `def`, in this case, old `cme2`
  points young `monly_cme` and it violates gengc assumption.
* cme can have different `defined_class` but `monly_cme` only has
  one `defined_class`. It does not make sense and `monly_cme`
  should be created for a cme (not `def`).

To solve these issues, this patch allocates `monly_cme` per `cme`.
`cme` does not have another room to store a pointer to the `monly_cme`,
so this patch introduces `overloaded_cme_table`, which is weak key map
`[cme] -> [monly_cme]`.

`def::body::iseqptr::monly_cme` is deleted.

The first issue is reported by Alan Wu.
2021-12-21 11:03:09 +09:00
John Hawthorn 733500e9d0
Lazily create singletons on instance_{exec,eval} (#5146)
* Lazily create singletons on instance_{exec,eval}

Previously when instance_exec or instance_eval was called on an object,
that object would be given a singleton class so that method
definitions inside the block would be added to the object rather than
its class.

This commit aims to improve performance by delaying the creation of the
singleton class unless/until one is needed for method definition. Most
of the time instance_eval is used without any method definition.

This was implemented by adding a flag to the cref indicating that it
represents a singleton of the object rather than a class itself. In this
case CREF_CLASS returns the object's existing class, but in cases that
we are defining a method (either via definemethod or
VM_SPECIAL_OBJECT_CBASE which is used for undef and alias).

This also happens to fix what I believe is a bug. Previously
instance_eval behaved differently with regards to constant access for
true/false/nil than for all other objects. I don't think this was
intentional.

    String::Foo = "foo"
    "".instance_eval("Foo")   # => "foo"
    Integer::Foo = "foo"
    123.instance_eval("Foo")  # => "foo"
    TrueClass::Foo = "foo"
    true.instance_eval("Foo") # NameError: uninitialized constant Foo

This also slightly changes the error message when trying to define a method
through instance_eval on an object which can't have a singleton class.

Before:

    $ ruby -e '123.instance_eval { def foo; end }'
    -e:1:in `block in <main>': no class/module to add method (TypeError)

After:

    $ ./ruby -e '123.instance_eval { def foo; end }'
    -e:1:in `block in <main>': can't define singleton (TypeError)

IMO this error is a small improvement on the original and better matches
the (both old and new) message when definging a method using `def self.`

    $ ruby -e '123.instance_eval{ def self.foo; end }'
    -e:1:in `block in <main>': can't define singleton (TypeError)

Co-authored-by: Matthew Draper <matthew@trebex.net>

* Remove "under" argument from yield_under

* Move CREF_SINGLETON_SET into vm_cref_new

* Simplify vm_get_const_base

* Fix leaf VM_SPECIAL_OBJECT_CONST_BASE

Co-authored-by: Matthew Draper <matthew@trebex.net>
2021-12-02 15:53:39 -08:00
Alan Wu 9121e57a5f Rework tracing for blocks running as methods
The main impetus for this change is to fix [Bug #13392]. Previously, we
fired the "return" TracePoint event after popping the stack frame for
the block running as method (BMETHOD). This gave undesirable source
location outputs as the return event normally fires right before the
frame going away.

The iseq for each block can run both as a block and as a method. To
accommodate that, this commit makes vm_trace() fire call/return events for
instructions that have b_call/b_return events attached when the iseq is
running as a BMETHOD. The logic for rewriting to "trace_*" instruction
is tweaked so that when the user listens to call/return events,
instructions with b_call/b_return become trace variants.

To continue to provide the return value for non-local returns done using
the "return" or "break" keyword inside BMETHODs, the stack unwinding
code is tweaked. b_return events now provide the same return value as
return events for these non-local cases. A pre-existing test deemed not
providing a return value for these b_return events as a limitation.

This commit removes the checks for call/return TracePoint events that
happen when calling into BMETHODs when no TracePoints are active.
Technically, migrating just the return event is enough to fix the bug,
but migrating both call and return removes our reliance on
`VM_FRAME_FLAG_FINISH` and re-entering the interpreter when the caller
is already in the interpreter.
2021-12-01 17:42:33 -05:00
Eileen M. Uchitelle 459f9e3df8
Add setclassvariable to yjit (#5127)
Implements setclassvariable in yjit. Note that this version is not
faster than the standard version because we aren't handling the inline
cache in assembly. This is still important to implement because it will
prevent yjit from exiting in methods that call both a cvar setter and
other code that yjit can compile.

Co-authored-by: Aaron Patterson tenderlove@ruby-lang.org
2021-11-23 14:09:24 -05:00
Nobuyoshi Nakada 8f3432cd44
Fix setting struct member by public_send 2021-11-21 00:31:51 +09:00
Koichi Sasada 82ea287018 optimize `Struct` getter/setter
Introduce new optimized method type
`OPTIMIZED_METHOD_TYPE_STRUCT_AREF/ASET` with index information.
2021-11-19 08:32:39 +09:00
Koichi Sasada be71c95b88 `rb_method_optimized_t` for further extension
Now `rb_method_optimized_t optimized` field is added to represent
optimized method type.
2021-11-19 08:32:39 +09:00
Jeremy Evans b08dacfea3
Optimize dynamic string interpolation for symbol/true/false/nil/0-9
This provides a significant speedup for symbol, true, false,
nil, and 0-9, class/module, and a small speedup in most other cases.

Speedups (using included benchmarks):
:symbol        :: 60%
0-9            :: 50%
Class/Module   :: 50%
nil/true/false :: 20%
integer        :: 10%
[]             :: 10%
""             :: 3%

One reason this approach is faster is it reduces the number of
VM instructions for each interpolated value.

Initial idea, approach, and benchmarks from Eric Wong. I applied
the same approach against the master branch, updating it to handle
the significant internal changes since this was first proposed 4
years ago (such as CALL_INFO/CALL_CACHE -> CALL_DATA). I also
expanded it to optimize true/false/nil/0-9/class/module, and added
handling of missing methods, refined methods, and RUBY_DEBUG.

This renames the tostring insn to anytostring, and adds an
objtostring insn that implements the optimization. This requires
making a few functions non-static, and adding some non-static
functions.

This disables 4 YJIT tests.  Those tests should be reenabled after
YJIT optimizes the new objtostring insn.

Implements [Feature #13715]

Co-authored-by: Eric Wong <e@80x24.org>
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Co-authored-by: Yusuke Endoh <mame@ruby-lang.org>
Co-authored-by: Koichi Sasada <ko1@atdot.net>
2021-11-18 15:10:20 -08:00
Eileen M. Uchitelle ea02b93bb9
Refactor setclassvariable (#5143)
We only need the cref when we have a cache miss so don't look it up until we
need it. This likely speeds up class variable writes in the interpreter but
also simplifies the jit code.

Before

```
Warming up --------------------------------------
        write a cvar   192.280k i/100ms
Calculating -------------------------------------
        write a cvar      1.915M (± 3.5%) i/s -      9.614M in   5.026694s
```

After

```
Warming up --------------------------------------
        write a cvar   216.308k i/100ms
Calculating -------------------------------------
        write a cvar      2.140M (± 3.1%) i/s -     10.815M in   5.058079s
```

Followup to ruby/ruby#5137
2021-11-18 16:17:40 -05:00
Eileen M. Uchitelle ec574ab345
Refactor getclassvariable (#5137)
* Refactor getclassvariable

We only need the cref when we have a cache miss so don't look it up until we
need it. This speeds up class variable reads in the interpreter but
also simplifies the jit code.

Benchmarks for master vs this branch (without yjit):

Before:

```
Warming up --------------------------------------
         read a cvar     1.276M i/100ms
Calculating -------------------------------------
         read a cvar     12.596M (± 1.7%) i/s -     63.781M in   5.064902s
```

After:

```
Warming up --------------------------------------
         read a cvar     1.336M i/100ms
Calculating -------------------------------------
         read a cvar     13.114M (± 3.6%) i/s -     65.488M in   5.000584s
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>

* Clean up function signatures / remove dead code

rb_vm_getclassvariable signature has changed and we don't need
rb_vm_get_cref.

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-11-18 12:11:53 -05:00
Koichi Sasada b95d7d2099 no need to check `cme == NULL`
Now `cc->cme_` is not NULL.
2021-11-17 22:21:42 +09:00
Koichi Sasada b2255153cf `vm_empty_cc_for_super`
Same as `vm_empty_cc`, introduce a global variable which has
`.call_ = vm_call_super_method`. Use it if the `cme == NULL` on
`vm_search_super_method`.
2021-11-17 22:21:42 +09:00
Koichi Sasada 2d1a7bed03 a variable is not needed. 2021-11-17 22:21:42 +09:00
Jean Boussier 1af8ed5f0a `Primitive.mandatory_only?` consider splat args
`vm_ci_argc` gives the number of arguments, but `*[1, 2, 3]` only
counts for one.
2021-11-17 06:38:03 +09:00
Koichi Sasada b1b73936c1 `Primitive.mandatory_only?` for fast path
Compare with the C methods, A built-in methods written in Ruby is
slower if only mandatory parameters are given because it needs to
check the argumens and fill default values for optional and keyword
parameters (C methods can check the number of parameters with `argc`,
so there are no overhead). Passing mandatory arguments are common
(optional arguments are exceptional, in many cases) so it is important
to provide the fast path for such common cases.

`Primitive.mandatory_only?` is a special builtin function used with
`if` expression like that:

```ruby
  def self.at(time, subsec = false, unit = :microsecond, in: nil)
    if Primitive.mandatory_only?
      Primitive.time_s_at1(time)
    else
      Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in))
    end
  end
```

and it makes two ISeq,

```
  def self.at(time, subsec = false, unit = :microsecond, in: nil)
    Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in))
  end

  def self.at(time)
    Primitive.time_s_at1(time)
  end
```

and (2) is pointed by (1). Note that `Primitive.mandatory_only?`
should be used only in a condition of an `if` statement and the
`if` statement should be equal to the methdo body (you can not
put any expression before and after the `if` statement).

A method entry with `mandatory_only?` (`Time.at` on the above case)
is marked as `iseq_overload`. When the method will be dispatch only
with mandatory arguments (`Time.at(0)` for example), make another
method entry with ISeq (2) as mandatory only method entry and it
will be cached in an inline method cache.

The idea is similar discussed in https://bugs.ruby-lang.org/issues/16254
but it only checks mandatory parameters or more, because many cases
only mandatory parameters are given. If we find other cases (optional
or keyword parameters are used frequently and it hurts performance),
we can extend the feature.
2021-11-15 15:58:56 +09:00
Peter Zhu 84202963c5 [Bug #18329] Fix crash when calling non-existent super method
The cme is NULL when a method does not exist, so check it before
accessing the callcache.
2021-11-11 14:08:38 -05:00
Yusuke Endoh c1228f833c vm_core.h: Avoid unaligned access to ic_serial on 32-bit machine
This caused Bus error on 32 bit Solaris
2021-10-29 10:57:46 +09:00
Satoshi Moris Tagomori 489e5e3a82 the core problem is the Proc is not shareable 2021-10-27 16:13:43 +09:00
Alan Wu b74d6563a6 Extract yjit_force_iv_index and make it work when object is frozen
In an effort to simplify the logic YJIT generates for accessing instance
variable, YJIT ensures that a given name-to-index mapping exists at
compile time. In the case that the mapping doesn't exist, it was created
by using rb_ivar_set() with Qundef on the sample object we see at
compile time. This hack isn't fine if the sample object happens to be
frozen, in which case YJIT would raise a FrozenError unexpectedly.

To deal with this, make a new function that only reserves the mapping
but doesn't touch the object. This is rb_obj_ensure_iv_index_mapping().
This new function superceeds the functionality of rb_iv_index_tbl_lookup()
so it was removed.

Reported by and includes a test case from John Hawthorn <john@hawthorn.email>

Fixes: GH-282
2021-10-20 18:19:43 -04:00
Alan Wu 5906a5a732 Add comments about special runtime routines YJIT calls
When YJIT make calls to routines without reconstructing interpreter
state through jit_prepare_routine_call(), it relies on the routine to
never allocate, raise, and push/pop control frames. Comment about this
on the routines that YJTI calls.

This is probably something we should dynamically verify on debug builds.
It's hard to statically verify this as it requires verifying all
functions in the call tree. Maybe something to look at in the future.
2021-10-20 18:19:43 -04:00
Alan Wu 7c08538aa3 Cleanup diff against upstream. Add comments
I did a `git diff --stat` against upstream and looked at all the files
that are outside of YJIT to come up with these minor changes.
2021-10-20 18:19:42 -04:00
Alan Wu f6da559d5b Put YJIT into a single compilation unit
For upstreaming, we want functions we export either prefixed with "rb_"
or made static. Historically we haven't been following this rule, so we
were "leaking" a lot of symbols as `make leak-globals` would tell us.

This change unifies everything YJIT into a single compilation unit,
yjit.o, and makes everything unprefixed static to pass `make leak-globals`.
This manual "unified build" setup is similar to that of vm.o.

Having everything in one compilation unit allows static functions to
be visible across YJIT files and removes the need for declarations in
headers in some cases. Unnecessary declarations were removed.

Other changes of note:
  - switched to MJIT_SYMBOL_EXPORT_BEGIN which indicates stuff as being
    off limits for native extensions
  - the first include of each YJIT file is change to be "internal.h"
  - undefined MAP_STACK before explicitly redefining it since it
    collide's with a definition in system headers. Consider renaming?
2021-10-20 18:19:42 -04:00
eileencodes c8e157bb5c Implement getclassvariable in yjit
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-10-20 18:19:42 -04:00
Alan Wu 78b5e95e41 Add a slowpath for opt_getinlinecache
Before this change, when we encounter a constant cache that is specific
to a lexical scope, we unconditionally exit. This change falls back to
the interpreter's cache in this situation.

This should help constant expressions in `class << self`, which is popular
at Shopify due to the style guide.

This change relies on the cache being warm while compiling to detect the
need for checking the lexical scope for simplicity.
2021-10-20 18:19:41 -04:00
John Hawthorn 5e37f280d1 Remove vm_opt_aset 2021-10-20 18:19:41 -04:00
Aaron Patterson bf8557f487 Add comments for new function 2021-10-20 18:19:40 -04:00
Aaron Patterson 5bc0343261 Refactor attrset to use a function
This new function will do the write barrier / resize the object / check
frozen for us
2021-10-20 18:19:40 -04:00
John Hawthorn 10f1d808d5 Remove rb_opt_equality_specialized 2021-10-20 18:19:40 -04:00
Kevin Newton be648e0940 Implement splatarray 2021-10-20 18:19:37 -04:00
Maxime Chevalier-Boisvert da30f21ab5 Try to fix MJIT symbol clash with cargo cult 2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert 54fe43b45c Implement defined bytecode (#39) 2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert d6412126bc Implement setivar with a plain old function call (#34)
* Implement setivar with a plain old function call

* Remove return
2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert 0c3842d154 Implement opt_aset as interpreter handler call 2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert c9feb72b65 Implement opt_mod as call to interpreter function (#29) 2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert e2c1d69331 Implement opt_eq by calling interpreter function (#28) 2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert e5f8b41786 Implement send with alias method (#23)
* Implement send with alias method

* Add alias_method tests
2021-10-20 18:19:34 -04:00
Alan Wu 36134f7d29 Implement calls to methods with simple optional params
* Implement calls to methods with simple optional params

* Remove unnecessary MJIT_STATIC

See comment for MJIT_STATIC. I added it not knowing whether it's
required because the function next to it has it. Don't use it and wait
for problems to come up instead.

* Better naming, some comments

* Count bailing on kw only iseqs

On railsbench:
```
opt_send_without_block exit reasons:
                  bmethod      59729 (27.7%)
         optimized_method      59137 (27.5%)
      iseq_complex_callee      41362 (19.2%)
             alias_method      33346 (15.5%)
      callsite_not_simple      19170 ( 8.9%)
       iseq_only_keywords       1300 ( 0.6%)
                 kw_splat       1299 ( 0.6%)
    cfunc_ruby_array_varg         18 ( 0.0%)
```
2021-10-20 18:19:34 -04:00
Alan Wu b626dd7211 YJIT: Fancier opt_getinlinecache
Make sure `opt_getinlinecache` is in a block all on its own, and
invalidate it from the interpreter when `opt_setinlinecache`.
It will recompile with a filled cache the second time around.
This lets YJIT runs well when the IC for constant is cold.
2021-10-20 18:19:33 -04:00
Alan Wu 5d834bcf9f YJIT: lazy polymorphic getinstancevariable
Lazily compile out a chain of checks for different known classes and
whether `self` embeds its ivars or not.

* Remove trailing whitespaces

* Get proper addresss in Capstone disassembly

* Lowercase address in Capstone disassembly

Capstone uses lowercase for jump targets in generated listings. Let's
match it.

* Use the same successor in getivar guard chains

Cuts down on duplication

* Address reviews

* Fix copypasta error

* Add a comment
2021-10-20 18:19:31 -04:00
Maxime Chevalier-Boisvert 9d8cc01b75 WIP JIT-to-JIT returns 2021-10-20 18:19:28 -04:00
Nobuyoshi Nakada 8d6dbecc80
Remove useless casts 2021-10-19 17:09:32 +09:00
Nobuyoshi Nakada ec021e469d
Get rid of type-punning cast 2021-10-19 17:08:25 +09:00
S.H dc9112cf10
Using NIL_P macro instead of `== Qnil` 2021-10-03 22:34:45 +09:00
Jeremy Evans 9b04909a85 Introduce rb_vm_call_with_refinements to DRY up a few calls 2021-10-01 08:12:46 -09:00
Jeremy Evans 1f5f8a187a
Make Array#min/max optimization respect refined methods
Pass in ec to vm_opt_newarray_{max,min}. Avoids having to
call GET_EC inside the functions, for better performance.

While here, add a test for Array#min/max being redefined to
test_optimization.rb.

Fixes [Bug #18180]
2021-09-30 15:18:14 -07:00
Nobuyoshi Nakada 70624ae43d
Extract hook macro for attributes 2021-09-19 16:27:47 +09:00
Nobuyoshi Nakada cd829bb078 Remove printf family from the mjit header
Linking printf family functions makes mjit objects to link
unnecessary code.
2021-09-11 08:41:32 +09:00
Jeremy Evans 2d98593bf5 Support tracing of attr_reader and attr_writer
In vm_call_method_each_type, check for c_call and c_return events before
dispatching to vm_call_ivar and vm_call_attrset.  With this approach, the
call cache will still dispatch directly to those functions, so this
change will only decrease performance for the first (uncached) call, and
even then, the performance decrease is very minimal.

This approach requires that we clear the call caches when tracing is
enabled or disabled.  The approach currently switches all vm_call_ivar
and vm_call_attrset call caches to vm_call_general any time tracing is
enabled or disabled. So it could theoretically result in a slowdown for
code that constantly enables or disables tracing.

This approach does not handle targeted tracepoints, but from my testing,
c_call and c_return events are not supported for targeted tracepoints,
so that shouldn't matter.

This includes a benchmark showing the performance decrease is minimal
if detectable at all.

Fixes [Bug #16383]
Fixes [Bug #10470]

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2021-08-29 07:23:39 -07:00
Nobuyoshi Nakada a0a8f2abf5 Get rid of type-punning pointer casts [Bug #18062] 2021-08-11 12:07:44 +09:00
S.H 378e8cdad6
Using RBOOL macro 2021-08-02 12:06:44 +09:00