Commit Graph

259 Commits

Author SHA1 Message Date
Daniel Colson 45f2182438 YJIT: Implement intern
`intern` showed up in the top 20 most frequent exit ops (granted with a
fairly small percentage) in a benchmark run by @jhawthorn on
github/github.

This implementation is similar to gen_anytostring, but with 1
stack pop instead of 2.
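
For context, here is a minimal sketch of Ruby source that would be expected to reach the `intern` instruction: a symbol literal with interpolation, which builds a string and then converts it to a Symbol. The method name `dynamic_key` is hypothetical and not part of the commit.

```ruby
# Hypothetical helper (not from the commit): a symbol literal with
# interpolation first builds a String, then turns it into a Symbol.
def dynamic_key(name)
  :"key_#{name}"
end

p dynamic_key("a")
```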

Co-authored-by: John Hawthorn <jhawthorn@github.com>
2021-12-18 15:59:30 -05:00
Alan Wu cc5fcae170 YJIT: Remove double check for block arg handling
Inline and remove iseq_supported_args_p(iseq) to remove a potentially
dangerous double check on `iseq->body->param.flags.has_block` and
`iseq->body->local_iseq == iseq`. Double checking should be fine at the
moment as there should be no case where we perform a call to an iseq
that takes a block but `local_iseq != iseq`, but such a situation might
be possible when we add support for calling into BMETHODs, for example.
Inlining also has the benefit of mirroring the interpreter's code for
blockarg setup in `setup_parameters_complex()`, making checking for
parity easier.

Extract `vm_ci_flag(ci) & VM_CALL_KWARG` into a const local for brevity.
Constify `doing_kw_call` because we can.
2021-12-17 15:26:04 -08:00
John Hawthorn c2197bf821 YJIT: Fix check for required kwargs
Previously, YJIT would not check that all the required keywords were
specified in the case that there were optional arguments specified. In
this case YJIT would incorrectly call the method with invalid arguments.
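
A minimal sketch of the call shape described above, assuming a method with one optional positional argument and one required keyword. The method `greet` is hypothetical. The second call passes the optional argument but omits the required keyword, so Ruby must raise `ArgumentError` instead of running the method body.

```ruby
# Hypothetical method: an optional positional argument plus a required keyword.
def greet(name = "world", punctuation:)
  "hello #{name}#{punctuation}"
end

# Valid: the required keyword is supplied.
puts greet("yjit", punctuation: "!")

# Invalid: the optional argument is given but the required keyword is missing.
begin
  greet("yjit")
rescue ArgumentError => e
  puts "rejected: #{e.class}"
end
```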
2021-12-17 15:26:04 -08:00
John Hawthorn 83aa68447c YJIT: Allow iseq with both opt and kwargs
Previously we mirrored the fast paths the interpreter had for having
only one of kwargs or optional args. This commit aims to combine the
cases and reduce complexity.

Though this allows calling iseqs which have both optional and
keyword arguments, it requires that all optional arguments are specified
when there are keyword arguments, since unspecified optional arguments
appear before the kwargs. Support for this can be added in a future
PR.
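
To illustrate the two call shapes this message distinguishes, here is a small sketch with a hypothetical method `scale`. Both calls are valid Ruby and return the same results with or without a JIT. According to the message, only the first shape (every optional argument specified alongside keywords) is handled by this commit.

```ruby
# Hypothetical method with one optional positional argument and one keyword.
def scale(value, factor = 2, negate: false)
  result = value * factor
  negate ? -result : result
end

# Every optional argument is specified alongside the keyword.
p scale(2, 3, negate: true)

# The optional argument is left unspecified alongside the keyword.
p scale(2, negate: true)
```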
2021-12-17 15:26:04 -08:00
Alan Wu ac5d6faea8
YJIT: Fix unexpected truncation when outputting VALUE
Previously, YJIT incorrectly discarded the upper 32 bits of the object
pointer when writing out VALUEs to setup default keyword arguments.

In addition to incorrectly truncating, the output pointers were not
properly tracked for handling GC compaction moving the referenced
objects.

YJIT previously attempted to encode a mov instruction with a memory
destination and a 64 bit immediate when there is no such encoding
possible in the ISA. Add an assert to mitigate not being able to
catch this at build time.
2021-12-14 19:47:42 -05:00
Alan Wu 286c07f0dc YJIT: Remove guard_self_is_heap()
It's superseded by functionality added to jit_guard_known_klass().
In weird situations such as the ones in the included test,
guard_self_is_heap() triggered assertions.

Co-authored-by: Jemma Issroff <jemmaissroff@gmail.com>
2021-12-07 17:20:34 -05:00
Alan Wu 794b9a28b5 YJIT: Add integrity checks for blockid
Verify that the iseq idx pair for the block is valid in
invalidate_block_version(). While we are at it, bound loop
iterating over instructions to `iseq_body->iseq_size`.
2021-12-06 20:27:15 -05:00
Alan Wu f41b4d44f9 YJIT: Bounds check every byte in the assembler
Previously, YJIT assumed that basic blocks never consume more than
1 KiB of memory. This assumption does not hold for long Ruby methods
such as the one in the following:

```ruby
eval(<<RUBY)
def set_local_a_lot
  #{'_=0;'*0x40000}
end
RUBY

set_local_a_lot
```

For low `--yjit-exec-mem-size` values, one basic block could exhaust the
entire buffer.

Introduce a new field `codeblock_t::dropped_bytes` that the assembler
sets whenever it runs out of space. Check this field in
gen_single_block() to respond to out of memory situations and other
error conditions. This design avoids making the control flow graph of
existing code generation functions more complex.

Use POSIX shell in misc/test_yjit_asm.sh since bash is expanding
`0%/*/*` differently.

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-12-03 20:02:25 -05:00
eileencodes 54ca530dbe YJIT: Add ivar counter exits
On Rails we're seeing a lot of exits for ivars in the Active Record
tests. In trying to track them down it was hard to find what code is
exiting.

This change adds a counted exit for when an object is "megamorphic". In
these cases there are too many specializations in the Ruby code so YJIT
exits.

Co-authored-by: Aaron Patterson tenderlove@ruby-lang.org
2021-12-03 09:45:58 -08:00
Adam Hess fbc1615761
YJIT: Fix side-exit typo in comments [ci skip] 2021-12-02 12:01:42 -05:00
Aaron Patterson 157095b3a4 Mark JIT code as writeable / executable depending on the situation
Some platforms don't want memory to be marked as writeable and
executable at the same time. When we write to the code block, we
calculate the OS page that the buffer position maps to.  Then we call
`mprotect` to allow writes on that particular page.  As an optimization,
we cache the "last written" aligned page which allows us to amortize the
cost of the `mprotect` call.  In other words, sequential writes to the
same page will only call `mprotect` on the page once.

When we're done writing, we call `mprotect` on the entire JIT buffer.
This means we don't need to keep track of which pages were marked as
writeable, we let the OS take care of that.

Co-authored-by: John Hawthorn <john@hawthorn.email>
2021-12-01 12:45:59 -08:00
Alan Wu d0772632bf YJIT: Fail gracefully while OOM for new entry points
Previously, YJIT crashed with rb_bug() when asked to compile new methods
while out of executable memory.

To handle this situation gracefully, this change keeps track of all the
blocks compiled each invocation in case YJIT runs out of memory in the
middle of a compilation sequence. The list is used to free all blocks in
case compilation fails.

yjit_gen_block() is renamed to gen_single_block() to make it distinct from
gen_block_version(). Call to limit_block_version() and block_t
allocation is moved into the function to help tidy error checking in the
outer loop.

limit_block_version() now returns by value. I feel that an out parameter
with conditional mutation is unnecessarily hard to read in code that
does not need to go for last drop performance. There is a good chance
that the optimizer is able to output identical code anyways.
2021-12-01 12:25:28 -05:00
Alan Wu b5b6ab4194
YJIT: Add ability to exit to interpreter from stubs
Previously, YJIT assumed that it's always possible to generate a new
basic block when servicing a stub in branch_stub_hit(). When YJIT is out
of executable memory, for example, this assumption doesn't hold up.

Add handling to branch_stub_hit() for servicing stubs without consuming
more executable memory by adding a code path that exits to the
interpreter at the location the branch stub represents. The new code
path reconstructs interpreter state in branch_stub_hit() and then exits
with a new snippet called `code_for_exit_from_stub` that returns
`Qundef` from the YJIT native stack frame.

As this change adds another place where we regenerate code from
`branch_t`, extract the logic for it into a new function and call it
regenerate_branch(). While we are at it, make the branch shrinking code
path in branch_stub_hit() more explicit.

This new functionality is hard to test without full support for out of
memory conditions. To verify this change, I ran
`RUBY_YJIT_ENABLE=1 make check -j12` with the following patch to stress
test the new code path:

```diff
diff --git a/yjit_core.c b/yjit_core.c
index 4ab63d9806..5788b8c5ed 100644
--- a/yjit_core.c
+++ b/yjit_core.c
@@ -878,8 +878,12 @@ branch_stub_hit(branch_t *branch, const uint32_t target_idx, rb_execution_contex
                 cb_set_write_ptr(cb, branch->end_addr);
             }

+if (rand() < RAND_MAX/2) {
             // Compile the new block version
             p_block = gen_block_version(target, target_ctx, ec);
+}else{
+    p_block = NULL;
+}

             if (!p_block && branch_modified) {
                 // We couldn't generate a new block for the branch, but we modified the branch.
```

We can enable the new test along with other OOM tests once full support
lands.

Other small changes:
 * yjit_utils.c (print_str): Update to work with new native frame shape.
       Follow up for 8fa0ee4d40.
 * yjit_iface.c (rb_yjit_init): Run yjit_init_core() after
       yjit_init_codegen() so `cb` and `ocb` are available.
2021-11-26 18:00:42 -05:00
John Hawthorn b6f543d4ae
YJIT: Introduce jit_putobject (#5179)
* YJIT: Introduce jit_putobject

This extracts the logic previously inside gen_putobject to a more
reusable helper method jit_putobject.

The motivation for this is that it both simplifies the implementation of
other instructions, and other instructions can reuse the optimized
behaviour for 32-bit special constants (most importantly
opt_getinlinecache).

This commit also expands the optimization to use a mov directly to
memory when we encounter a 32-bit immediate constant. Previously it
covered fixnums and Qtrue/Qfalse, now it will cover any SPECIAL_CONST_P
value which can be represented as a 32-bit immediate. Notably, this
includes static symbols, and Qnil.

* Style touchups and a comment

* delete empty line

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2021-11-25 15:10:42 -08:00
John Hawthorn de9a1e4a96
YJIT: Implement new struct accessors (#5161)
* YJIT: Implement optimized_method_struct_aref

* YJIT: Implement struct_aref without method call

Struct member reads can be compiled directly into a memory read (with
either one or two levels of indirection).

* YJIT: Implement optimized struct aset

* YJIT: Update tests for struct access

* YJIT: Add counters for remaining optimized methods

* Check for INT32_MAX overflow

It only takes a struct with 0x7fffffff/8+1 members. Also add some
cheap compile time checks.

* Add tests for non-embedded struct aref/aset
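
As a rough illustration of the Ruby-level operations these changes target, the sketch below reads and writes struct members through the generated accessors. The helper names are hypothetical. The second helper only restates the member-count arithmetic from the overflow note, assuming 8-byte slots; how that maps onto the exact check in YJIT is not shown in the source.

```ruby
# Hypothetical helper: read and write struct members via generated accessors.
def struct_roundtrip
  point = Struct.new(:x, :y)
  pt = point.new(1, 2)
  pt.y = 5            # member write (aset path)
  [pt.x, pt.y]        # member reads (aref path)
end

# Hypothetical helper: does the byte size of `members` slots exceed INT32_MAX?
# Assumes each slot is 8 bytes wide.
def slots_exceed_int32?(members)
  members * 8 > 0x7fffffff
end

p struct_roundtrip
p slots_exceed_int32?(0x7fffffff / 8 + 1)
```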

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2021-11-25 11:56:58 -08:00
Eileen M. Uchitelle 459f9e3df8
Add setclassvariable to yjit (#5127)
Implements setclassvariable in yjit. Note that this version is not
faster than the standard version because we aren't handling the inline
cache in assembly. This is still important to implement because it will
prevent yjit from exiting in methods that call both a cvar setter and
other code that yjit can compile.

Co-authored-by: Aaron Patterson tenderlove@ruby-lang.org
2021-11-23 14:09:24 -05:00
Alan Wu 13d1ded253 YJIT: Make block invalidation more robust
This commit adds an entry_exit field to block_t for use in
invalidate_block_version(). By patching the start of the block while
invalidating it, invalidate_block_version() can function correctly
while there is no executable memory left for new branch stubs.

This change additionally fixes correctness for situations where we
cannot patch incoming jumps to the invalidated block. In situations
such as Shopify/yjit#226, the address to the start of the block
is saved and used later, possibly after the block is invalidated.

The assume_* family of functions now generates block->entry_exit before
remembering blocks for invalidation.

RubyVM::YJIT.simulate_oom! is introduced for testing out of memory
conditions. The test for it is disabled for now because OOM triggers
other failure conditions not addressed by this commit.

Fixes Shopify/yjit#226
2021-11-22 18:23:28 -05:00
Adam Hess 73388aff5e
Add YJIT codegen for objtostring (#5149)
This is the minimal correct objtostring implementation in YJIT.
For correctness, it is important that `to_s` not be called on strings or subclasses of String.
There is a new test for this behavior.
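
The behavior in question can be observed from plain Ruby. In this sketch, the hypothetical helper `loud_string` builds an instance of a String subclass whose `to_s` override would be visible if interpolation called it. The interpolated result keeps the original contents because the override is not invoked for String objects.

```ruby
# Hypothetical helper: an instance of a String subclass with a noisy to_s.
def loud_string(text)
  klass = Class.new(String) do
    def to_s
      "to_s was called"
    end
  end
  klass.new(text)
end

def interpolate(obj)
  "#{obj}"
end

puts interpolate(loud_string("plain"))
```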

A follow up should implement an optimized version for other types as performed in `vm_objtostring`.

Co-authored-by: John Hawthorn <jhawthorn@github.com>
2021-11-19 16:57:09 -05:00
Jeremy Evans b08dacfea3
Optimize dynamic string interpolation for symbol/true/false/nil/0-9
This provides a significant speedup for symbol, true, false, nil,
0-9, and class/module, and a small speedup in most other cases.

Speedups (using included benchmarks):
:symbol        :: 60%
0-9            :: 50%
Class/Module   :: 50%
nil/true/false :: 20%
integer        :: 10%
[]             :: 10%
""             :: 3%

One reason this approach is faster is it reduces the number of
VM instructions for each interpolated value.

Initial idea, approach, and benchmarks from Eric Wong. I applied
the same approach against the master branch, updating it to handle
the significant internal changes since this was first proposed 4
years ago (such as CALL_INFO/CALL_CACHE -> CALL_DATA). I also
expanded it to optimize true/false/nil/0-9/class/module, and added
handling of missing methods, refined methods, and RUBY_DEBUG.

This renames the tostring insn to anytostring, and adds an
objtostring insn that implements the optimization. This requires
making a few functions non-static, and adding some non-static
functions.

This disables 4 YJIT tests.  Those tests should be reenabled after
YJIT optimizes the new objtostring insn.

Implements [Feature #13715]

Co-authored-by: Eric Wong <e@80x24.org>
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Co-authored-by: Yusuke Endoh <mame@ruby-lang.org>
Co-authored-by: Koichi Sasada <ko1@atdot.net>
2021-11-18 15:10:20 -08:00
Eileen M. Uchitelle ec574ab345
Refactor getclassvariable (#5137)
* Refactor getclassvariable

We only need the cref when we have a cache miss so don't look it up until we
need it. This speeds up class variable reads in the interpreter but
also simplifies the jit code.

Benchmarks for master vs this branch (without yjit):

Before:

```
Warming up --------------------------------------
         read a cvar     1.276M i/100ms
Calculating -------------------------------------
         read a cvar     12.596M (± 1.7%) i/s -     63.781M in   5.064902s
```

After:

```
Warming up --------------------------------------
         read a cvar     1.336M i/100ms
Calculating -------------------------------------
         read a cvar     13.114M (± 3.6%) i/s -     65.488M in   5.000584s
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>

* Clean up function signatures / remove dead code

rb_vm_getclassvariable signature has changed and we don't need
rb_vm_get_cref.

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-11-18 12:11:53 -05:00
John Hawthorn fbd6cc5856
YJIT: Support iseq sends with mixed kwargs (#5082)
* YJIT: Support iseq sends with mixed kwargs

Co-authored-by: Kevin Newton <kddnewton@gmail.com>

* Add additional comments to iseq sends

Co-authored-by: Kevin Newton <kddnewton@gmail.com>
2021-11-05 17:01:07 -04:00
John Hawthorn 9cc2c74b83
YJIT: Implement checkkeyword (#5083)
Co-authored-by: John Crepezzi <john.crepezzi@gmail.com>
2021-11-05 16:54:23 -04:00
Maxime Chevalier-Boisvert 2421527d6e
YJIT code pages refactoring for code GC (#5073)
* New code page allocation logic

* Fix leaked globals

* Fix leaked symbols, yjit asm tests

* Make COUNTED_EXIT take a jit argument, so we can eliminate global ocb

* Remove extra whitespace

* Change block start_pos/end_pos to be pointers instead of uint32_t

* Change branch end_pos and start_pos to end_addr, start_addr
2021-11-04 16:05:41 -04:00
John Hawthorn 2fa51c7068
YJIT: Support kwargs sends with all defaults (#5067)
* YJIT: Support kwargs sends with all defaults

Previously keyword argument methods were only compiled by YJIT when all
keywords were specified in the caller.

This adds support for calling methods with keyword arguments when no
keyword arguments are specified and all are filled with the defaults.

* Remove unused send_iseq_kwargs_none_passed
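
A short sketch of the two call shapes mentioned above, using a hypothetical method `connect`: one call passes no keywords so every keyword takes its default, the other specifies all of them.

```ruby
# Hypothetical method whose keyword arguments all have defaults.
def connect(host, port: 80, tls: false)
  scheme = tls ? "https" : "http"
  "#{scheme}://#{host}:#{port}"
end

# No keywords specified: all are filled with the defaults.
puts connect("example.test")

# All keywords specified by the caller.
puts connect("example.test", port: 443, tls: true)
```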
2021-11-01 10:54:59 -04:00
Maxime Chevalier-Boisvert 2e14fb7df7
Add comments about send method types (#5059) 2021-10-29 14:58:49 -04:00
Yusuke Endoh c1228f833c vm_core.h: Avoid unaligned access to ic_serial on 32-bit machine
This caused Bus error on 32 bit Solaris
2021-10-29 10:57:46 +09:00
John Hawthorn a6104b392a
YJIT: Support newhash with values (#5029)
* YJIT: Implement newhash with values

* YJIT: Add test of duphash

* Fix compilation on macos/clang
2021-10-27 10:55:43 -04:00
Ian C. Anderson e943511455
YJIT: Implement duphash (#5009)
`duphash` showed up in the top-20 most frequent exit ops for @jhawthorn's benchmark that renders github.com/about

The implementation was almost exactly the same as `duparray`
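
For readers unfamiliar with the instruction, a hash literal made only of literal keys and values is the kind of code that would be expected to use `duphash`: each evaluation yields a fresh copy. The sketch below shows that observable behavior with a hypothetical method `default_options`.

```ruby
# Hypothetical method returning a hash literal built only from literals.
def default_options
  { retries: 3, verbose: false }
end

first = default_options
first[:retries] = 10          # mutating one copy...

p default_options[:retries]   # ...does not affect the next evaluation
```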

Co-authored-by: John Hawthorn <john@hawthorn.email>
2021-10-25 10:40:33 -04:00
Alan Wu bdfc23cba9
YJIT: don't compile attr_accessor methods when tracing (#4998)
2d98593bf5 made it so that
attr_accessor methods fire C method tracing events.
Previously, we weren't checking for whether we are tracing before
compiling, leading to missed events.

Since global invalidation invalidates all code, and attr_accessor
methods can never enable tracing while running, events are only dropped
when YJIT tries to compile when tracing is already enabled.

Factor out the code for checking tracing and check it before generating
code for attr_accessor methods.

This change fixes TestSetTraceFunc#test_tracepoint_attr when it's
run in isolation.
2021-10-21 15:07:32 -04:00
Alan Wu 00be5846e4 Fix non RUBY_DEBUG build warnings
On non RUBY_DEBUG builds, assert() compiles to nothing and the compiler
warns about uninitialized variables in those code paths. Replace
those asserts with rb_bug() to fix the warnings and do the assert in
all builds. Since yjit_asm_tests.c compiles outside of Ruby, it needed
a distinct version of rb_bug().

Also add a YJIT_STATS check around a function declaration that is only
defined in YJIT_STATS builds.
2021-10-20 18:19:43 -04:00
Alan Wu cffa116275 Do kwarg shuffle after checking for interrupts
Previously, we were shuffling keyword arguments before checking for
interrupts. In the case that we side exit in the interrupt check,
we left the interpreter with an already-shuffled argument list for
the call, resulting in a double shuffle, leaving the locals in the
wrong order for the callee.

Do keyword shuffling after all the possible side exits.

Co-authored-by: Kevin Newton <kddnewton@gmail.com>
2021-10-20 18:19:43 -04:00
Alan Wu b74d6563a6 Extract yjit_force_iv_index and make it work when object is frozen
In an effort to simplify the logic YJIT generates for accessing instance
variables, YJIT ensures that a given name-to-index mapping exists at
compile time. In the case that the mapping doesn't exist, it was created
by using rb_ivar_set() with Qundef on the sample object we see at
compile time. This hack isn't fine if the sample object happens to be
frozen, in which case YJIT would raise a FrozenError unexpectedly.
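
The expected Ruby-level behavior is that reading an instance variable never mutates the receiver, so it must work on frozen objects. A minimal sketch, using a hypothetical helper and an anonymous class:

```ruby
# Hypothetical helper: read a never-assigned instance variable on a frozen
# object. Reading must not raise FrozenError; the result is nil.
def read_unset_ivar_on_frozen_object
  klass = Class.new do
    def value
      @value
    end
  end
  obj = klass.new.freeze
  obj.value
end

p read_unset_ivar_on_frozen_object
```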

To deal with this, make a new function that only reserves the mapping
but doesn't touch the object. This is rb_obj_ensure_iv_index_mapping().
This new function supersedes the functionality of rb_iv_index_tbl_lookup(),
which was removed.

Reported by and includes a test case from John Hawthorn <john@hawthorn.email>

Fixes: GH-282
2021-10-20 18:19:43 -04:00
Alan Wu 454fbe1046 Expand tabs 2021-10-20 18:19:43 -04:00
eileencodes 4cad893080 Add String#bytesize
Fixes: https://github.com/Shopify/yjit/issues/258

Co-authored-by: Aaron Patterson tenderlove@ruby-lang.org
2021-10-20 18:19:43 -04:00
Alan Wu 32b5125c5e else if style 2021-10-20 18:19:43 -04:00
Maxime Chevalier-Boisvert 35b37c5873 Update yjit_codegen.c
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2021-10-20 18:19:43 -04:00
Maxime Chevalier-Boisvert 201721b713 Update yjit_codegen.c
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2021-10-20 18:19:43 -04:00
Kevin Newton 56b1b93a0c Feedback, tests, and rebase for kwargs 2021-10-20 18:19:43 -04:00
Kevin Newton c5acbd0208 Bail out if passing keyword arguments to only positional and/or optional methods 2021-10-20 18:19:43 -04:00
Kevin Newton 06a826b8c8 Set up the callee stack pointer properly taking into account the bits object 2021-10-20 18:19:43 -04:00
Kevin Newton 5759d840c3 Correct for positional required arguments 2021-10-20 18:19:43 -04:00
Kevin Newton 266e12ac22 Push the unspecified_bits_value onto the stack 2021-10-20 18:19:42 -04:00
Kevin Newton 9aed5809e1 Reuse stack swapping logic 2021-10-20 18:19:42 -04:00
Kevin Newton 2c0891be20 Get kwargs reordering working 2021-10-20 18:19:42 -04:00
Kevin Newton 885bb972bf Get kwargs working for all passed in the correct order 2021-10-20 18:19:42 -04:00
Alan Wu f6da559d5b Put YJIT into a single compilation unit
For upstreaming, we want functions we export either prefixed with "rb_"
or made static. Historically we haven't been following this rule, so we
were "leaking" a lot of symbols as `make leak-globals` would tell us.

This change unifies everything YJIT into a single compilation unit,
yjit.o, and makes everything unprefixed static to pass `make leak-globals`.
This manual "unified build" setup is similar to that of vm.o.

Having everything in one compilation unit allows static functions to
be visible across YJIT files and removes the need for declarations in
headers in some cases. Unnecessary declarations were removed.

Other changes of note:
  - switched to MJIT_SYMBOL_EXPORT_BEGIN which indicates stuff as being
    off limits for native extensions
  - the first include of each YJIT file is changed to be "internal.h"
  - undefined MAP_STACK before explicitly redefining it since it
    collides with a definition in system headers. Consider renaming?
2021-10-20 18:19:42 -04:00
Alan Wu 7eea96c780 Fix gen_getclassvariable
We need to reconstruct interpreter state before calling into the
routines to be able to raise exceptions. I'm getting a crash in
debug build with:
    make test-all 'TESTS=test/ruby/variable.rb' RUN_OPTS='--yjit-call-threshold=1 --yjit-max-versions=1'
2021-10-20 18:19:42 -04:00
Maxime Chevalier-Boisvert 70c5bbf84b Fix counter names for getblockparamproxy. Print in --yjit-stats. 2021-10-20 18:19:42 -04:00
eileencodes c8e157bb5c Implement getclassvariable in yjit
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-10-20 18:19:42 -04:00
eileencodes f911e264a1 Add counted side exit to getblockparamproxy
This is so we know the specific reason we're exiting this instruction.

Co-authored-by: Aaron Patterson tenderlove@ruby-lang.org
2021-10-20 18:19:42 -04:00