Commit graph

272 commits

Author SHA1 Message Date
Kevin Newton 6773832ab9
More Arm64 lowering/backend work (https://github.com/Shopify/ruby/pull/307)
* More Arm64 lowering/backend work

* We now have encoding support for the LDR instruction for loading a PC-relative memory location
* You can now call add/adds/sub/subs with signed immediates, which switches appropriately based on sign
* We can now load immediates into registers appropriately, attempting to keep the minimal number of instructions:
  * If it fits into 16 bits, we use just a single movz.
  * Else if it can be encoded into a bitmask immediate, we use a single mov.
  * Otherwise we use a movz, a movk, and then optionally another one or two movks.
* Fixed a bunch of code to do with the Op::Load opcode.
* We now handle GC-offsets properly for Op::Load by skipping around them with a jump instruction. (This will be made better by constant pools in the future.)
* Op::Lea is doing what it's supposed to do now.
* Fixed a bug in the backend tests to do with not using the result of an Op::Add.

* Fix the remaining tests for Arm64

* Move split loads logic into each backend
2022-08-29 08:46:59 -07:00
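The movz/movk part of the immediate-loading strategy in commit 6773832ab9 can be sketched roughly as follows. This is a minimal illustration, not YJIT's code: `MovInsn` and `plan_load_imm` are hypothetical names, the bitmask-immediate case is left out, and the only architectural facts relied on are that `movz` writes a 16-bit immediate and zeroes the rest of the register, while `movk` overwrites one 16-bit slice and keeps the others.

```rust
/// One planned instruction. Hypothetical type for this sketch only.
#[derive(Debug, PartialEq)]
enum MovInsn {
    /// movz rd, #imm16, lsl #shift (zeroes every other bit of rd)
    Movz { imm16: u16, shift: u8 },
    /// movk rd, #imm16, lsl #shift (keeps every other bit of rd)
    Movk { imm16: u16, shift: u8 },
}

/// Plan the movz/movk sequence that materializes `value` in a register.
/// Assumption: the bitmask-immediate shortcut is not modeled here.
fn plan_load_imm(value: u64) -> Vec<MovInsn> {
    // The low 16 bits always go in through movz, which clears the register.
    let mut insns = vec![MovInsn::Movz {
        imm16: (value & 0xffff) as u16,
        shift: 0,
    }];

    // Each remaining non-zero 16-bit slice needs one movk.
    for shift in [16u8, 32, 48] {
        let chunk = ((value >> shift) & 0xffff) as u16;
        if chunk != 0 {
            insns.push(MovInsn::Movk { imm16: chunk, shift });
        }
    }
    insns
}
```

Under these assumptions a value that fits in 16 bits costs one instruction and an arbitrary 64-bit value costs at most four.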
Maxime Chevalier-Boisvert 0551115912
Add #[must_use] annotations to asm instructions 2022-08-29 08:46:59 -07:00
Maxime Chevalier-Boisvert ab2fa6ebdd
Add a backend test with a load of a GC'd VALUE 2022-08-29 08:46:59 -07:00
Maxime Chevalier-Boisvert 580f26959e
Get started on branchunless port 2022-08-29 08:46:59 -07:00
Maxime Chevalier-Boisvert 65019ed60c
Get codegen for deferred compilation working 2022-08-29 08:46:59 -07:00
Maxime Chevalier-Boisvert aab53e2868
Add test for direct jump to a code pointer 2022-08-29 08:46:59 -07:00
Kevin Newton 7a9b581e08
Arm64 progress (https://github.com/Shopify/ruby/pull/304)
* Get initial wiring up

* Split IncrCounter instruction

* Breakpoints in Arm64

* Support for ORR

* MOV instruction encodings

* Implement JmpOpnd and CRet

* Add ORN

* Add MVN

* PUSH, POP, CCALL for Arm64

* Some formatting and implement Op::Not for Arm64

* Consistent constants when working with the Arm64 SP

* Allow OR-ing values into the memory buffer

* Test lowering Arm64 ADD

* Emit unconditional jumps consistently in Arm64

* Begin emitting conditional jumps for A64

* Back out some labelref changes

* Remove label API that no longer exists

* Use a trait for the label encoders

* Encode nop

* Add in nops so jumps are the same width no matter what on Arm64

* Op::Jbe for CodePtr

* Pass src_addr and dst_addr instead of calculated offset to label refs

* Even more jump work for Arm64

* Fix up jumps to use consistent assertions

* Handle splitting Add, Sub, and Not insns for Arm64

* More Arm64 splits and various fixes

* PR feedback for Arm64 support

* Split up jumps and conditional jump logic
2022-08-29 08:46:58 -07:00
Kevin Newton b272c57f27
LSL, LSR, B.cond (https://github.com/Shopify/ruby/pull/303)
* LSL and LSR

* B.cond

* Move A64 files around to make more sense

* offset -> byte_offset for bcond
2022-08-29 08:46:58 -07:00
Alan Wu d916328078
Concise IR disassembly (https://github.com/Shopify/ruby/pull/302)
The output from `dbg!` was too verbose. For `test_jo` the output went
from 37 lines to 5 lines. The added index helps parsing InsnOut
indices.

Samples:

```
test backend::tests::test_jo ... [src/backend/ir.rs:589] &self = Assembler
    000 Load(Mem64[Reg(3) + 8]) -> Out64(0)
    001 Sub(Out64(0), 1_i64) -> Out64(1)
    002 Load(Out64(1)) -> Out64(2)
    003 Add(Out64(2), Mem64[Reg(3)]) -> Out64(3)
    004 Jo() target=CodePtr(CodePtr(0x5)) -> Out64(4)
    005 Mov(Mem64[Reg(3)], Out64(3)) -> Out64(5)

test backend::tests::test_reuse_reg ... [src/backend/ir.rs:589] &self = Assembler
    000 Load(Mem64[Reg(3)]) -> Out64(0)
    001 Add(Out64(0), 1_u64) -> Out64(1)
    002 Load(Mem64[Reg(3) + 8]) -> Out64(2)
    003 Add(Out64(2), 1_u64) -> Out64(3)
    004 Add(Out64(1), 1_u64) -> Out64(4)
    005 Add(Out64(1), Out64(4)) -> Out64(5)
    006 Store(Mem64[Reg(3)], Out64(4)) -> Out64(6)
    007 Store(Mem64[Reg(3) + 8], Out64(5)) -> Out64(7)
```
2022-08-29 08:46:58 -07:00
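One way to get output shaped like the samples in commit d916328078 is a hand-written `fmt::Debug` implementation in place of the derived one. The sketch below is an assumption about the general technique, not the code from the PR: `Opnd`, `Insn`, and `Assembler` here are drastically simplified stand-ins with hypothetical fields.

```rust
use std::fmt;

/// Simplified operand: either the output of an earlier instruction or an immediate.
enum Opnd {
    Out(usize),
    Imm(i64),
}

/// Simplified instruction: an opcode name plus its operands.
struct Insn {
    op: &'static str,
    opnds: Vec<Opnd>,
}

struct Assembler {
    insns: Vec<Insn>,
}

impl fmt::Debug for Opnd {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Opnd::Out(idx) => write!(f, "Out64({})", idx),
            Opnd::Imm(val) => write!(f, "{}_i64", val),
        }
    }
}

impl fmt::Debug for Assembler {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        writeln!(f, "Assembler")?;
        // One line per instruction, prefixed with its zero-padded index so
        // that Out64(n) references can be matched to the line producing them.
        for (idx, insn) in self.insns.iter().enumerate() {
            let opnds: Vec<String> =
                insn.opnds.iter().map(|o| format!("{:?}", o)).collect();
            writeln!(
                f,
                "    {:03} {}({}) -> Out64({})",
                idx,
                insn.op,
                opnds.join(", "),
                idx
            )?;
        }
        Ok(())
    }
}
```

Because `dbg!` formats its argument through `Debug`, a custom implementation like this controls what it prints.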
Alan Wu 0a96a39189
Delete dbg!() calls 2022-08-29 08:46:58 -07:00
Maxime Chevalier-Boisvert f1b188143b
Fix backend transform bug, add test 2022-08-29 08:46:58 -07:00
Maxime Chevalier-Boisvert 4c0a440b18
Port over duphash and newarray 2022-08-29 08:46:58 -07:00
Maxime Chevalier-Boisvert 2eba6aef72
Port over get_branch_target() 2022-08-29 08:46:58 -07:00
Maxime Chevalier-Boisvert 4254174ca7
Port over setn 2022-08-29 08:46:58 -07:00
Maxime Chevalier-Boisvert 24db233fc7
Add jo insn and test for jo 2022-08-29 08:46:58 -07:00
Maxime Chevalier-Boisvert 8bb7421d8e
Port topn, adjuststack, most of opt_plus 2022-08-29 08:46:58 -07:00
Maxime Chevalier-Boisvert d0204e51e2
Port guard_two_fixnums 2022-08-29 08:46:58 -07:00
Maxime Chevalier-Boisvert 00ad14f8c9
Port gen_full_cfunc_return 2022-08-29 08:46:57 -07:00
Maxime Chevalier-Boisvert b89d878ea6
Port getlocal_WC0 2022-08-29 08:46:57 -07:00
Maxime Chevalier-Boisvert 4c7d7080d2
Port over gen_putspecialobject 2022-08-29 08:46:57 -07:00
Maxime Chevalier-Boisvert c5ae52630f
Port gen_putself, log what can't be compiled in --yjit-dump-insns 2022-08-29 08:46:57 -07:00
Kevin Newton 27dd43bbc5
TST, CMP, AND/ANDS with registers (https://github.com/Shopify/ruby/pull/301)
* Add TST instruction and AND/ANDS entrypoints for immediates

* TST/AND/ANDS for registers

* CMP instruction
2022-08-29 08:46:57 -07:00
Maxime Chevalier-Boisvert 57e64f70c0
Make sure allocated reg size in bits matches insn out size 2022-08-29 08:46:57 -07:00
Kevin Newton eb4c7b4ea5
AND/ANDS for A64 (https://github.com/Shopify/ruby/pull/300) 2022-08-29 08:46:57 -07:00
Maxime Chevalier-Boisvert 67de662c44
Add Opnd.rm_num_bits() method 2022-08-29 08:46:57 -07:00
Maxime Chevalier-Boisvert 084d4bb192
Implement X86Reg::sub_reg() method 2022-08-29 08:46:57 -07:00
Maxime Chevalier-Boisvert 4932a6ef75
Fix small bug in x86_split 2022-08-29 08:46:57 -07:00
Maxime Chevalier-Boisvert b8fc9909bf
Get rid of temporary context methods 2022-08-29 08:46:56 -07:00
Maxime Chevalier-Boisvert 40ac79ada8
Add bitwise and to x86 backend 2022-08-29 08:46:56 -07:00
Maxime Chevalier-Boisvert abea8c8983
Add stores to one of the tests 2022-08-29 08:46:56 -07:00
Maxime Chevalier-Boisvert 1923842b3d
Move backend tests to their own file 2022-08-29 08:46:56 -07:00
Maxime Chevalier-Boisvert 59b818ec87
Add support for using InsnOut as memory operand base 2022-08-29 08:46:56 -07:00
Maxime Chevalier-Boisvert 401521ca14
Rename transform_insns to forward_pass 2022-08-29 08:46:56 -07:00
Maxime Chevalier-Boisvert ae9bcfec8c
Add assert 2022-08-29 08:46:56 -07:00
Maxime Chevalier-Boisvert e743e3bf20
Remove unused code, add backend asm test 2022-08-29 08:46:56 -07:00
Maxime Chevalier-Boisvert 4dbc1e1d82
Port bitwise not, gen_check_ints() 2022-08-29 08:46:56 -07:00
Maxime Chevalier-Boisvert 9d18e6c300
Port gen_code_for_exit_from_stub() 2022-08-29 08:46:56 -07:00
Maxime Chevalier-Boisvert e72dab304e
Add atomic counter increment instruction 2022-08-29 08:46:55 -07:00
Maxime Chevalier-Boisvert 27fcab995e
Get side exits working, get miniruby to boot with threshold=1 2022-08-29 08:46:55 -07:00
Kevin Newton c10e018e1c
LDADDAL, STUR, BL (https://github.com/Shopify/ruby/pull/299)
* LDADDAL instruction

* STUR

* BL instruction

* Remove num_bits from imm and uimm

* Tests for imm_fits_bits and uimm_fits_bits

* Reorder arguments to LDADDAL
2022-08-29 08:46:55 -07:00
Kevin Newton 1daa5942b8
MOVK, MOVZ, BR (https://github.com/Shopify/ruby/pull/296)
* MOVK instruction

* More tests for the A64 entrypoints

* Finish testing entrypoints

* MOVZ

* BR instruction
2022-08-29 08:46:55 -07:00
Maxime Chevalier-Boisvert 0000984fed
Port over putnil, putobject, and gen_leave()
* Remove x86-64 dependency from codegen.rs

* Port over putnil and putobject

* Port over gen_leave()

* Complete port of gen_leave()

* Fix bug in x86 instruction splitting
2022-08-29 08:46:55 -07:00
Maxime Chevalier-Boisvert d75c346c1c
Port gen_leave_exit(), add support for labels to backend 2022-08-29 08:46:55 -07:00
Maxime Chevalier-Boisvert ea9abe547d
Add cpush and cpop IR instructions 2022-08-29 08:46:55 -07:00
Maxime Chevalier-Boisvert 77383b3958
Add conditional jumps 2022-08-29 08:46:55 -07:00
Kevin Newton b63f8bb456
LDUR (https://github.com/Shopify/ruby/pull/295)
* LDUR

* Fix up immediate masking

* Consume operands directly

* Consistency and cleanup

* More consistency and entrypoints

* Cleaner syntax for masks

* Cleaner shifting for encodings
2022-08-29 08:46:55 -07:00
Maxime Chevalier-Boisvert 71770ceee5
Map comments in backend 2022-08-29 08:46:55 -07:00
Maxime Chevalier-Boisvert c2fdec93a9
First pass at porting gen_entry_prologue() 2022-08-29 08:46:55 -07:00
Maxime Chevalier-Boisvert 03ed50310d
Have Assembler::compile() return a list of GC offsets 2022-08-29 08:46:54 -07:00
Kevin Newton 26ba0a454c
RET A64 instructions (https://github.com/Shopify/ruby/pull/294) 2022-08-29 08:46:54 -07:00
Maxime Chevalier-Boisvert e22134277b
Remove x86_64 dependency in core.rs 2022-08-29 08:46:54 -07:00
Maxime Chevalier-Boisvert 3133540be7
Progress on codegen.rs port 2022-08-29 08:46:54 -07:00
Maxime Chevalier-Boisvert a1b8c94738
Arm64 Beginnings (https://github.com/Shopify/ruby/pull/291)
* Initial setup for aarch64

* ADDS and SUBS

* ADD and SUB for immediates

* Revert moved code

* Documentation

* Rename Arm64* to A64*

* Comments on shift types

* Share sig_imm_size and unsig_imm_size
2022-08-29 08:46:54 -07:00
Maxime Chevalier-Boisvert 39dd8b2dfb
Add test for lea and ret. Fix codegen for lea and ret. 2022-08-29 08:46:54 -07:00
Maxime Chevalier-Boisvert 04e2ccede4
Change codegen.rs to use backend Assembler directly 2022-08-29 08:46:54 -07:00
Maxime Chevalier-Boisvert 7c83a904a4
Implement gc offset logic 2022-08-29 08:46:54 -07:00
Maxime Chevalier-Boisvert efb45acb29
Load GC Value operands into registers 2022-08-29 08:46:54 -07:00
Maxime Chevalier-Boisvert a88fc48b3a
Add CCall IR insn, implement gen_swap() 2022-08-29 08:46:54 -07:00
Maxime Chevalier-Boisvert 0032b02045
Add gen_dupn 2022-08-29 08:46:53 -07:00
Maxime Chevalier-Boisvert 872940e215
Add test with register reuse 2022-08-29 08:46:53 -07:00
Maxime Chevalier-Boisvert 151cc55baa
Fix issue with load, gen_dup 2022-08-29 08:46:53 -07:00
Maxime Chevalier-Boisvert 1b2ee62149
Implement target-specific insn splitting with Kevin. Add tests. 2022-08-29 08:46:53 -07:00
Maxime Chevalier-Boisvert 564f950360
Make assembler methods public, sketch gen_dup with new backend 2022-08-29 08:46:53 -07:00
Maxime Chevalier-Boisvert 99cfbdca6b
Fix bug with asm.comment() 2022-08-29 08:46:53 -07:00
Maxime Chevalier-Boisvert 75c995b0d1
Bias register allocator to reuse first operand 2022-08-29 08:46:53 -07:00
Maxime Chevalier-Boisvert 369911d31d
Add dbg!() for Assembler. Fix regalloc issue. 2022-08-29 08:46:53 -07:00
Maxime Chevalier-Boisvert a2aa289594
Function to map from Opnd => X86Opnd 2022-08-29 08:46:53 -07:00
Maxime Chevalier-Boisvert e9cc17dcc9
Start work on platform-specific codegen 2022-08-29 08:46:53 -07:00
Kevin Newton a3d8e20cea
Split insns (https://github.com/Shopify/ruby/pull/290)
* Split instructions if necessary

* Add a reusable transform_insns function

* Split out comments labels from transform_insns

* Refactor alloc_regs to use transform_insns
2022-08-29 08:37:49 -07:00
Kevin Newton 2b7d4f277d
IR register allocation
PR: https://github.com/Shopify/ruby/pull/289
2022-08-29 08:37:49 -07:00
Maxime Chevalier-Boisvert 7753b6b8b6
Removed String opnd so that we can derive Copy for Opnd 2022-08-29 08:37:49 -07:00
Maxime Chevalier-Boisvert 5021f26b4b
Complete sketch for guard_object_is_heap 2022-08-29 08:37:49 -07:00
Maxime Chevalier-Boisvert 884cbaabd9
Change push insn macros 2022-08-29 08:37:49 -07:00
Maxime Chevalier-Boisvert 92e9d1e661
Switch IR to use Option<Target> 2022-08-29 08:37:48 -07:00
Maxime Chevalier-Boisvert 96e5f9def0
Add macro to define ops 2022-08-29 08:37:48 -07:00
Maxime Chevalier-Boisvert 909d214708
Progress on IR sketch 2022-08-29 08:37:48 -07:00
Maxime Chevalier-Boisvert 2ffaa377c2
WIP backend IR sketch 2022-08-29 08:37:48 -07:00
Noah Gibbs b4be3c00c5 add --yjit-dump-iseqs param (https://github.com/Shopify/ruby/pull/332) 2022-08-24 10:42:45 -07:00
Takashi Kokubun 485019c2bd
Rename mjit_exec to jit_exec (#6262)
* Rename mjit_exec to jit_exec

* Rename mjit_exec_slowpath to mjit_check_iseq

* Remove mjit_exec references from comments
2022-08-19 23:57:17 -07:00
Jeremy Evans b7e492fa9e Regen YJIT bindings 2022-08-09 22:19:46 -07:00
Noah Gibbs 1e7a2415a4
YJIT: Allow str-concat arg to be any string subtype, not just rb_cString (#6205)
Allow str-concat arg to be any string subtype, not just rb_cString
2022-08-04 12:19:14 -04:00
John Hawthorn 7f5f9d19c5
YJIT: Add known_* helpers for Type (#6208)
* YJIT: Add known_* helpers for Type

This adds a few helpers to Type which all return Options representing
what is known, from a Ruby perspective, about the type.

This includes:
* known_class_of: If known, the class represented by this type
* known_value_type: If known, the T_ value type
* known_exact_value: If known, the exact VALUE represented by this type
  (currently this is only available for true/false/nil)
* known_truthy: If known, whether or not this value evaluates as true
  (not false or nil)

The goal of this is to abstract the specifics of the mappings between
types away from the codegen wherever possible. For example, when
Type::CString was introduced as a more specific version of
Type::TString, uses of Type::TString in codegen needed to be updated to
check either case. Now, by using known_value_type, at least in theory we
can introduce new types with minimal (if any) codegen changes.

I think Rust's Option type allows us to represent this uncertainty
fairly well and should help avoid mistakes, and the matching using this
turned out pretty clean.

* YJIT: Use known_value_type for checktype

* YJIT: Use known_value_type for T_STRING check

* YJIT: Use known_class_of in guard_known_klass

* YJIT: Use known truthiness in jit_rb_obj_not

* YJIT: Rename known_class_of => known_class
2022-08-04 11:18:24 -04:00
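The idea behind the `known_*` helpers in commit 7f5f9d19c5 can be illustrated with a heavily reduced model. The `Type` and `ValueType` enums below are hypothetical simplifications (the real `Type` has more variants and the real helpers return CRuby values), so treat this as a sketch of the pattern only: each helper returns `Some(..)` when the answer is known and `None` otherwise.

```rust
/// Reduced stand-in for YJIT's Type; the variant set is an assumption.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Type {
    Unknown,
    Nil,
    True,
    False,
    Fixnum,
    TString, // any T_STRING object
    CString, // exactly an instance of String
}

/// Reduced stand-in for Ruby's T_* value types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ValueType {
    Nil,
    True,
    False,
    Fixnum,
    String,
}

impl Type {
    /// The T_* value type, if it is known.
    fn known_value_type(self) -> Option<ValueType> {
        match self {
            Type::Unknown => None,
            Type::Nil => Some(ValueType::Nil),
            Type::True => Some(ValueType::True),
            Type::False => Some(ValueType::False),
            Type::Fixnum => Some(ValueType::Fixnum),
            // Both string flavors map to the same value type, so callers
            // do not need to enumerate them.
            Type::TString | Type::CString => Some(ValueType::String),
        }
    }

    /// Whether the value is truthy (not false or nil), if it is known.
    fn known_truthy(self) -> Option<bool> {
        match self {
            Type::Unknown => None,
            Type::Nil | Type::False => Some(false),
            _ => Some(true),
        }
    }
}
```

A codegen check written as `ty.known_value_type() == Some(ValueType::String)` keeps working if a further string-like variant is added later, which is the decoupling the commit message describes.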
John Hawthorn 0e85586ecc Add --enable-yjit=dev_nodebug configure option 2022-07-29 16:32:14 -07:00
John Hawthorn fbd24793cb Add --enable-yjit=stats configure option 2022-07-29 16:32:14 -07:00
Matthew Draper ab08a43ec5
YJIT: Teach getblockparamproxy to handle the no-block case without exiting (#6191)
Teach getblockparamproxy to handle the no-block case without exiting

Co-authored-by: John Hawthorn <john@hawthorn.email>
2022-07-28 11:38:07 -04:00
John Hawthorn 660b1e973c
YJIT: Skip setlocal WB check for immediate values (#6122)
Write barriers may be required when VM_ENV_FLAG_WB_REQUIRED is set,
however write barriers only affect heap objects being written. If we
know an immediate value is being written we can skip this check.
2022-07-20 12:31:40 -04:00
Jemma Issroff ecff334995 Extract vm_ic_entry API to mimic vm_cc behavior 2022-07-18 12:44:01 -07:00
Peter Zhu 7424ea184f Implement Objects on VWA
This commit implements Objects on Variable Width Allocation. This allows
Objects with more ivars to be embedded (i.e. contents directly follow the
object header) which improves performance through better cache locality.
2022-07-15 09:21:07 -04:00
Eileen M. Uchitelle 59c6b7b7ab
Speed up --yjit-trace-exits code (#6106)
In a small script the speed of this feature isn't really noticeable but
on Rails it's very noticeable how slow this can be. This PR aims to
speed up two parts of the functionality.

1) The Rust exit recording code

Instead of adding all samples as we see them to the yjit_raw_samples and
yjit_line_samples, we can increment the counter on the ones we've seen
before. This will be faster on traces where we are hitting the same
stack often. In a crude measurement of booting just the active record
base test (`test/cases/base_test.rb`) we found that this improved the
speed by 1 second.

This also results in a smaller marshal dump file which sped up the test
boot time by 4 seconds with trace exits on.

2) The Ruby parsing code

Previously we were allocating new arrays using `shift` and
`each_with_index`. This change avoids allocating new arrays by using an
index. This change saves us the most amount of time, gaining 11 seconds.

Before this change the test boot time took 62 seconds, after it took 47
seconds. This is still too long but it's a step closer to faster
functionality. Next we're going to tackle allowing you to collect trace
exits for a specific instruction. There is also some potential slowness
in the GC code that I'd like to take a second look at.

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2022-07-12 16:40:49 -04:00
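The first optimization in commit 59c6b7b7ab, incrementing a counter for samples that were already seen instead of appending them again, might look roughly like this. The sketch is an assumption about the approach: `ExitSamples` is a hypothetical type, frames are modeled as plain `u64` values, and only a repeat of the most recently recorded stack is collapsed.

```rust
/// Hypothetical container: each entry is a stack of frames plus a hit count.
struct ExitSamples {
    entries: Vec<(Vec<u64>, usize)>,
}

impl ExitSamples {
    fn new() -> Self {
        ExitSamples { entries: Vec::new() }
    }

    /// Record one exit. A repeat of the previous stack only bumps its
    /// counter, so hot exits do not grow the sample buffer.
    fn record(&mut self, stack: &[u64]) {
        if let Some((last_stack, count)) = self.entries.last_mut() {
            if last_stack.as_slice() == stack {
                *count += 1;
                return;
            }
        }
        self.entries.push((stack.to_vec(), 1));
    }
}
```

A thousand consecutive exits from the same location then occupy a single entry with a count of 1000, which also keeps the dumped file smaller.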
Noah Gibbs (and/or Benchmark CI) a2e0815e27 Switch YJIT to using rb_str_buf_append rather than rb_str_append when encodings don't match, as discussed with byroot 2022-07-06 17:25:58 +02:00
Maxime Chevalier-Boisvert 3c61e1e77f
YJIT: add a counter for gc object refs in the machine code (#6089)
Add a counter for gc object refs in the machine code

This is to gather data for the eventual implementation of
a constant pool.
2022-07-06 11:13:22 -04:00
Dave Schwantes b6f6fc6e87
YJIT: Refactor gen_opt_mod (#6078)
Refactor gen_opt_mod in YJIT
2022-06-30 10:26:46 -04:00
Noah Gibbs 118e3edc32
Add a check-yjit-bindgen-unused target. Add to CI. (#6066)
This fails if there are any unused rust-bindgen "allow" entries. For
that target we turn on Rust warnings (there are a lot) and grep for the
ones that correspond to unused allow entries.

I've added check-yjit-bindgen-unused as a dependency of
check-yjit-bindings, so unused allow entries will now fail CI.

This change also removes our single unused allow entry (VM_CALL.*) which
was known to be bad.
2022-06-29 12:49:46 -04:00
Noah Gibbs (and/or Benchmark CI) 0fab06f3c3 Separate Type::String into Type::CString and Type::TString.
Also slightly broaden the cases where << on two strings will generate
specialised code rather than a plain method call.
2022-06-27 09:25:57 -07:00
Alan Wu ef79f0a9e5 YJIT: Fix copy pasted comment [ci skip] 2022-06-26 08:36:17 -04:00
John Hawthorn 0c1f64396f Skip protected ancestry guard for FCALLs in YJIT 2022-06-21 18:33:51 -07:00
Alan Wu 9f09397bfe
YJIT: On-demand executable memory allocation; faster boot (#5944)
This commit makes YJIT allocate memory for generated code gradually as
needed. Previously, YJIT allocated all the memory it needed on boot in
one go, leading to higher than necessary resident set size (RSS) and
time spent on boot initializing the memory with a large memset().

Users should no longer need to search for a magic number to pass to
`--yjit-exec-mem` since physical memory consumption should now more
accurately reflect the requirement of the workload.

YJIT now reserves a range of addresses on boot. This region starts out
with no access permission at all, so buggy attempts to jump to the region
crash like before this change. To get this hardening at finer
granularity than the page size, we fill each page with trapping
instructions when we first allocate physical memory for the page.

Most of the time applications don't need 256 MiB of executable code, so
allocating on-demand ends up doing less total work than before. Case in
point, a simple `ruby --yjit-call-threshold=1 -eitself` takes about
half as long after this change. In terms of memory consumption, here is
a table to give a rough summary of the impact:

    | Peak RSS in MiB | -eitself example | railsbench once |
    | :-------------: | ---------------: | --------------: |
    |     before      |              265 |             377 |
    |      after      |               11 |             143 |
    |     no YJIT     |               10 |             101 |

A new module is introduced to handle allocation bookkeeping.
`CodePtr` is moved into the module since it has a close relationship
with the new `VirtualMemory` struct. This new interface has a slightly
smaller surface than before in that marking a region as writable is no
longer a public operation.
2022-06-14 10:23:13 -04:00
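The bookkeeping described in commit 9f09397bfe (reserve an address range up front, back it with memory page by page, and fill fresh pages with trapping instructions) can be modeled without touching the OS. The sketch below is a toy under stated assumptions: `VirtualRegion` is a hypothetical name, a `Vec<u8>` stands in for mapped pages, the page size is fixed at 4096, and `0xCC` (the x86 `int3` opcode) is used as the trap byte.

```rust
const PAGE_SIZE: usize = 4096;
/// int3 on x86; stands in for "trapping instruction" in this model.
const TRAP_BYTE: u8 = 0xCC;

/// Toy model of a reserved code region that is backed on demand.
struct VirtualRegion {
    /// Size of the reserved address range in bytes.
    reserved_bytes: usize,
    /// Bytes that currently have backing memory; grows one page at a time.
    mapped: Vec<u8>,
}

impl VirtualRegion {
    /// Reserve a range without backing any of it.
    fn reserve(reserved_bytes: usize) -> Self {
        VirtualRegion { reserved_bytes, mapped: Vec::new() }
    }

    /// Write one byte, mapping pages as needed. New pages are filled with
    /// trap bytes so a stray jump into unused space faults immediately.
    fn write_byte(&mut self, offset: usize, byte: u8) -> Result<(), &'static str> {
        if offset >= self.reserved_bytes {
            return Err("write outside the reserved region");
        }
        while offset >= self.mapped.len() {
            let new_len = (self.mapped.len() + PAGE_SIZE).min(self.reserved_bytes);
            self.mapped.resize(new_len, TRAP_BYTE);
        }
        self.mapped[offset] = byte;
        Ok(())
    }

    /// How much of the reservation is actually backed by memory.
    fn mapped_bytes(&self) -> usize {
        self.mapped.len()
    }
}
```

In this model a workload that emits a few bytes of code pays for one page rather than the whole reservation, which mirrors the RSS drop shown in the table above.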
Noah Gibbs 9ed9cc9852
Add tests for a variety of string-subclass operations (#5999)
This way YJIT has to match CRuby for each of them.
Remove unused string_p() Rust function
2022-06-10 13:52:43 -04:00
Noah Gibbs e777ac9161
Don't return a value from jit_guard_known_klass. We never return anything but true at this point and we don't usually check the returned value. (#6000) 2022-06-10 12:29:26 -04:00
Eileen M. Uchitelle 473ee328c5
Add ability to trace exit locations in yjit (#5970)
When running with `--yjit-stats` turned on, yjit can inform the user
what the most common exits are. While this is useful information it
doesn't tell you the source location of the code that exited or what the
code that exited looks like. This change intends to fix that.

To use the feature, run yjit with the `--yjit-trace-exits` option,
which will record the backtrace for every exit that occurs. This functionality
requires the stats feature to be turned on. Passing `--yjit-trace-exits`
will automatically set the `--yjit-stats` option.

Users must call `RubyVM::YJIT.dump_exit_locations(filename)` which will
Marshal dump the contents of `RubyVM::YJIT.exit_locations` into a file
based on the passed filename.

*Example usage:*

Given the following script, we write to a file called
`concat_array.dump` the results of `RubyVM::YJIT.exit_locations`.

```ruby
def concat_array
  ["t", "r", *x = "u", "e"].join
end

1000.times do
  concat_array
end

RubyVM::YJIT.dump_exit_locations("concat_array.dump")
```

When we run the file with this branch and the appropriate flags the
stacktrace will be recorded. Note Stackprof needs to be installed or you
need to point to the library directly.

```
./ruby --yjit --yjit-call-threshold=1 --yjit-trace-exits -I/Users/eileencodes/open_source/stackprof/lib test.rb
```

We can then read the dump file with Stackprof:

```
./ruby -I/Users/eileencodes/open_source/stackprof/lib/ /Users/eileencodes/open_source/stackprof/bin/stackprof --text concat_array.dump
```

Results will look similar to the following:

```
==================================
  Mode: ()
  Samples: 1817 (0.00% miss rate)
  GC: 0 (0.00%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
      1001  (55.1%)        1001  (55.1%)     concatarray
       335  (18.4%)         335  (18.4%)     invokeblock
       178   (9.8%)         178   (9.8%)     send
       140   (7.7%)         140   (7.7%)     opt_getinlinecache
       ...etc...
```

Simply inspecting the `concatarray` method will give `SOURCE
UNAVAILABLE` because the source is insns.def.

```
./ruby -I/Users/eileencodes/open_source/stackprof/lib/ /Users/eileencodes/open_source/stackprof/bin/stackprof --text concat_array.dump --method concatarray
```

Result:

```
concatarray (nonexistent.def:1)
  samples:  1001 self (55.1%)  /   1001 total (55.1%)
  callers:
    1000  (   99.9%)  Object#concat_array
       1  (    0.1%)  Gem.suffixes
  callees (0 total):
  code:
        SOURCE UNAVAILABLE
```

However if we go deeper to the callee we can see the exact
source of the `concatarray` exit.

```
./ruby -I/Users/eileencodes/open_source/stackprof/lib/ /Users/eileencodes/open_source/stackprof/bin/stackprof --text concat_array.dump --method Object#concat_array
```

```
Object#concat_array (/Users/eileencodes/open_source/rust_ruby/test.rb:1)
  samples:     0 self (0.0%)  /   1000 total (55.0%)
  callers:
    1000  (  100.0%)  block in <main>
  callees (1000 total):
    1000  (  100.0%)  concatarray
  code:
                                  |     1  | def concat_array
 1000   (55.0%)                   |     2  |   ["t", "r", *x = "u", "e"].join
                                  |     3  | end
```

The `--walk` option is recommended for this feature as it makes it
easier to traverse the tree of exits.

*Goals of this feature:*

This feature is meant to give more information when working on YJIT.
The idea is that if we know what code is exiting we can decide what
areas to prioritize when fixing exits. In some cases this means
prioritizing avoiding certain exits in yjit. In more complex cases it
might mean changing the Ruby code to be more performant when run with
yjit. Ultimately the more information we have about what code is exiting
AND why, the better we can make yjit.

*Known limitations:*

* Due to tracing exits, running this on large codebases like Rails
can be quite slow.
* On complex methods it can still be difficult to pinpoint the exact cause of
an exit.
* Stackprof is a requirement to view the backtrace information from
the dump file.

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2022-06-09 12:59:39 -04:00
Noah Gibbs 1598c9458a
Add special-case code for the String unary plus operator (#5982) 2022-06-07 11:20:57 -04:00
Noah Gibbs 653e517eef
Use bindgen to import Ruby constants wherever possible. (#5943)
Constants that can't be imported via bindgen should have
a comment saying why not.
2022-06-06 13:47:24 -04:00
Nobuyoshi Nakada 689b5ae752
Split YJIT rules for CODEOWNERS 2022-06-02 10:12:34 +09:00
Noah Gibbs 9d18661e1d
Revert incorrect string-guard optimisation. (#5969)
Also add jhawthorn's test for this bug.
Fix String#to_s invalidation test
2022-06-01 10:22:08 -04:00
Alan Wu 899c90cf8a
YJIT: Relax minimum Rust version requirement to 1.58.1
We want to make it convenient for people to build YJIT and Rust version 1.58.1
or above is available on Ubuntu Jammy, Debian testing, and Fedora 36 through
the usual package manager on those systems. This saves the need to install
`rustup` for some people.

Our code is already 1.58.1 compatible so this commit simply tweaks CI to make
sure that we keep supporting that version. We still test against the latest Rust
version in `--enable-yjit=dev` builds through the Rust version available in
GitHub's CI image.

Rust versions older than 1.58.1 might build YJIT today, but we might make
incompatible changes in the future.

Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2022-05-29 13:43:02 -04:00
Noah Gibbs (and/or Benchmark CI) ba88787087 Use bindgen to import CRuby constants for YARV instruction bytecodes 2022-05-26 13:06:47 -04:00
Jemma Issroff 80ad0e751f Remove unnecessary module flag, add module assertions to other module flags 2022-05-23 11:04:34 -07:00
Noah Gibbs 50bad7159a
Special-case jit_guard_known_class for strings. This can remove (#5920)
runtime guard-checks for String#to_s, making some blocks too
short to invalidate later. Add NOPs in those cases to reserve space.
2022-05-20 19:39:37 -04:00
Takashi Kokubun b8a268e293
YJIT: Add opt_succ (#5919) 2022-05-19 11:52:52 -04:00
Aaron Patterson ebaf56c013 YJIT: Implement getblockparam
This implements the getblockparam instruction.

There are two cases we need to handle depending on whether or not
VM_FRAME_FLAG_MODIFIED_BLOCK_PARAM is set in the environment flag.

When the modified flag is unset, we need to call rb_vm_bh_to_procval to
get a proc from our passed block, save the proc in the environment, and
set the modified flag.

In the case that the modified flag is set we are able to just use the
existing proc in the environment.

One quirk of this is that we need to call jit_prepare_routine_call early
and ensure we update PC and SP regardless of the branch taken, so that
we have a consistent SP offset at the start of the next instruction.

We considered using a chain guard to generate these two paths
separately, but decided against it because it's very common to see both
and the modified case is basically a subset of the instructions in the
unmodified case.

This includes tests for both getblockparam and getblockparamproxy which
was previously missing a test.
2022-05-12 14:34:18 -07:00
Aaron Patterson f07a0e79a2
YJIT: Fix getting the EP with registers other than RAX (#5882)
Before this commit we were accidentally clobbering RAX. Additionally,
because this function had RAX hardcoded, it may not have
worked with registers other than RAX.

Co-authored-by: John Hawthorn <john@hawthorn.email>
2022-05-12 12:08:35 -07:00
Noah Gibbs e88ada4699
Ruby shovel operator (<<) speedup. (#5896)
For string concat, see if compile-time encoding of strings matches.
If so, use simple buffer string concat at runtime. Otherwise, use
encoding-checking string concat.
2022-05-11 11:20:21 -04:00
Maxime Chevalier-Boisvert 35e111fd3e
Fix bug identified by @noahgibbs. (#5876)
Turned out to be a one-character fix :)
2022-05-02 16:30:05 -04:00
Koichi ITO 8587bacc25
YJIT: Remove redundant `extern crate` (#5869)
Follow up https://github.com/ruby/ruby/commit/0514d81

Rust YJIT requires Rust 1.60.0 or later. So, `extern crate` looks unnecessary
because it can use the following Rust 2018 edition feature:
https://doc.rust-lang.org/stable/edition-guide/rust-2018/path-changes.html#no-more-extern-crate

It passes the following tests.

```console
% cd yjit
% cargo test --features asm_comments,disasm
(snip)

test result: ok. 56 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
```
2022-05-02 10:05:01 -04:00
Alan Wu 5c843a1a6e
YJIT: Enable default rustc lints (warnings) (#5864)
`rustc` performs in-depth dead code analysis and issues warnings
even for things like unused struct fields and unconstructed enum
variants. This was annoying for us during the port but hopefully
they are less of an issue now.

This patch enables all the unused warnings we disabled and addresses
all the warnings we previously ignored. Generally, the approach I've
taken is to use `cfg!` instead of using the `cfg` attribute and
to delete code where it makes sense. I've put `#[allow(unused)]`
on things we intentionally keep around for printf style debugging
and on items that are too annoying to keep warning-free in all
build configs.
2022-04-29 18:20:23 -04:00
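The preference for `cfg!` over the `cfg` attribute mentioned in commit 5c843a1a6e can be shown with a small example. This is a generic sketch rather than code from the patch: `log_exit` and `side_exit` are hypothetical, and `debug_assertions` is used as the condition only because it is a built-in cfg that needs no Cargo feature.

```rust
/// Helper that is only reached from the debug-only branch below.
fn log_exit(reason: &str) -> String {
    format!("exit: {}", reason)
}

fn side_exit(reason: &str, log: &mut Vec<String>) {
    // If this block were guarded with `#[cfg(debug_assertions)]`, release
    // builds would not compile the call at all and rustc would flag
    // `log_exit` as dead code there. `cfg!` instead expands to a plain
    // `true` or `false`, so the call is type-checked in every build and
    // the helper always counts as used.
    if cfg!(debug_assertions) {
        log.push(log_exit(reason));
    }
}
```

The trade-off is that code inside a `cfg!` branch must compile in every configuration, so the attribute form is still needed when the guarded code refers to items that only exist in some builds.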
Alan Wu fead7107ab YJIT: Adopt Clippy suggestions we like
This adopts most suggestions that rust-clippy is confident enough to
auto apply. The manual changes mostly fix manual if-lets and take
opportunities to use the `Default` trait on standard collections.

Co-authored-by: Kevin Newton <kddnewton@gmail.com>
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2022-04-29 15:03:45 -04:00
Dmitry Dygalo f8e4488e5e
YJIT: Do not create `CodeBlock.asm_comments` if the `asm_comments` feature is disabled (#5863) 2022-04-29 10:07:48 -04:00
Maxime Chevalier-Boisvert 0eb237d99c
YJIT: replace BLOCKID_NULL with Option<BlockId>, more idiomatic (#5858)
* YJIT: replace BLOCKID_NULL with Option<BlockId>, more idiomatic

* Update yjit/src/core.rs

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

* Update yjit/src/core.rs

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2022-04-28 17:12:24 -04:00
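Replacing a sentinel such as `BLOCKID_NULL` with `Option<BlockId>`, as commit 0eb237d99c does, moves the "no block" case into the type system. The sketch below illustrates the pattern with hypothetical, simplified definitions; the fields of `BlockId` and the shape of `Branch` are assumptions and not copied from `yjit/src/core.rs`.

```rust
/// Simplified block identifier; the fields are assumptions for this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
struct BlockId {
    iseq: usize,
    idx: u32,
}

/// A branch with up to two targets. `None` replaces the old sentinel value.
struct Branch {
    targets: [Option<BlockId>; 2],
}

impl Branch {
    /// Number of targets that are actually set.
    fn target_count(&self) -> usize {
        self.targets.iter().flatten().count()
    }

    /// Instruction index of the first target, if there is one. The compiler
    /// forces the caller to handle the missing case.
    fn first_target_idx(&self) -> Option<u32> {
        self.targets[0].map(|id| id.idx)
    }
}
```

With a sentinel, forgetting the null check compiles and fails at run time; with `Option`, the missing case has to be handled before the `BlockId` can be used.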
Kazuhiro NISHIYAMA 7c141f996b
Fix typos [ci skip] 2022-04-28 17:51:05 +09:00
Alan Wu 0514d81715 YJIT: Remove unnecessary `extern crate` declaration
Thanks to suggestion from bjorn3 on GitHub.

Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>
2022-04-27 11:00:22 -04:00
Alan Wu 932bfd0beb YJIT: Make add_comment() more concise
Thanks to suggestions from Stranger6667 on GitHub.

Co-authored-by: Dmitry Dygalo <dmitry@dygalo.dev>
2022-04-27 11:00:22 -04:00
Alan Wu f90549cd38 Rust YJIT
In December 2021, we opened an [issue] to solicit feedback regarding the
porting of the YJIT codebase from C99 to Rust. There were some
reservations, but this project was given the go ahead by Ruby core
developers and Matz. Since then, we have successfully completed the port
of YJIT to Rust.

The new Rust version of YJIT has reached parity with the C version, in
that it passes all the CRuby tests, is able to run all of the YJIT
benchmarks, and performs similarly to the C version (because it works
the same way and largely generates the same machine code). We've even
incorporated some design improvements, such as a more fine-grained
constant invalidation mechanism which we expect will make a big
difference in Ruby on Rails applications.

Because we want to be careful, YJIT is guarded behind a configure
option:

```shell
./configure --enable-yjit # Build YJIT in release mode
./configure --enable-yjit=dev # Build YJIT in dev/debug mode
```

By default, YJIT does not get compiled and cargo/rustc is not required.
If YJIT is built in dev mode, then `cargo` is used to fetch development
dependencies, but when building in release, `cargo` is not required,
only `rustc`. At the moment YJIT requires Rust 1.60.0 or newer.

The YJIT command-line options remain mostly unchanged, and more details
about the build process are documented in `doc/yjit/yjit.md`.

The CI tests have been updated and do not take any more resources than
before.

The development history of the Rust port is available at the following
commit for interested parties:
1fd9573d8b

Our hope is that Rust YJIT will be compiled and included as a part of
system packages and compiled binaries of the Ruby 3.2 release. We do not
anticipate any major problems as Rust is well supported on every
platform which YJIT supports, but to make sure that this process works
smoothly, we would like to reach out to those who take care of building
systems packages before the 3.2 release is shipped and resolve any
issues that may come up.

[issue]: https://bugs.ruby-lang.org/issues/18481

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Co-authored-by: Noah Gibbs <the.codefolio.guy@gmail.com>
Co-authored-by: Kevin Newton <kddnewton@gmail.com>
2022-04-27 11:00:22 -04:00