* Remove references to explicit instruction parts
Previously we would reference individual instruction fields
manually. We can't do that once instructions are enums, so this
commit removes those references. As a side effect, we can remove
the push_insn_parts() function from the assembler, because we now
explicitly push whole instruction structs every time.
* Switch instructions to enum
Instructions are now no longer a large struct with a bunch of
optional fields. Instead they are an enum with individual shapes
for the variants.
In terms of size, the instruction struct was 120 bytes while the
new instruction enum is 106 bytes. The bigger win, however, is that
we're not allocating any vectors for instruction operands (except
for CCall), which should help cut down on memory usage.
Adding new instructions will be a little more complicated going
forward, but every mission-critical function that needs to be
touched will have an exhaustive match, so the compiler should guide
any additions.
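For illustration, the move is roughly the following (a simplified sketch; variant names and field shapes here are stand-ins, not the actual YJIT definitions):

```rust
// Stand-ins for the real YJIT operand/target types.
#[derive(Clone, Copy)]
enum Opnd { None, Reg(u8), Imm(i64) }
enum Target { Label(usize) }

// Before: one struct with mostly-empty optional fields plus a Vec,
// so every instruction paid for a heap allocation.
struct InsnStruct {
    op: u8,
    opnds: Vec<Opnd>,
    out: Opnd,
    target: Option<Target>,
    text: Option<String>,
}

// After: each variant carries exactly what it needs; only the
// variable-arity CCall still owns a Vec of operands.
enum Insn {
    Add { left: Opnd, right: Opnd, out: Opnd },
    Load { opnd: Opnd, out: Opnd },
    Jmp(Target),
    CCall { opnds: Vec<Opnd>, fptr: *const u8, out: Opnd },
}
```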
* Operand iterators
There are a couple of places where we need to iterate through an
instruction's operands. At the moment this is relatively easy
because there's an opnds field we can work with directly. Once
instructions become enums, however, each variant will have a
different shape, so we'll need an iterator to make sense of it.
This commit introduces two new iterators that are created from an
instruction: one iterates over references to each operand (for
cases where they don't need to be mutable, like updating live
ranges), and one iterates over mutable references to each operand
(for cases where you need to mutate them, like loading values on
arm64).
Note that because the Iterator trait can't yield items that borrow
from the iterator itself (the item type can't be generic over a
lifetime), the mutable iterator forces you to use the
`while let Some` syntax, as opposed to the for loop we use when
iterating over instructions.
This commit eliminates the last reference to insn.opnds, which is
going to make it much easier to transition to an enum.
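For illustration, a self-contained sketch of why the mutable iterator can't implement `Iterator` (names and shapes are simplified, not the real YJIT types):

```rust
#[derive(Clone, Copy)]
enum Opnd { Reg(u8), Imm(i64) }

enum Insn {
    Add { left: Opnd, right: Opnd },
    Load { opnd: Opnd },
}

struct OpndMutIter<'a> { insn: &'a mut Insn, idx: usize }

impl Insn {
    fn opnd_iter_mut(&mut self) -> OpndMutIter<'_> {
        OpndMutIter { insn: self, idx: 0 }
    }
}

impl<'a> OpndMutIter<'a> {
    // An inherent `next`, not Iterator::next: the item borrows from
    // `self`, which the Iterator trait can't express.
    fn next(&mut self) -> Option<&mut Opnd> {
        let idx = self.idx;
        self.idx += 1;
        match (&mut *self.insn, idx) {
            (Insn::Add { left, .. }, 0) => Some(left),
            (Insn::Add { right, .. }, 1) => Some(right),
            (Insn::Load { opnd }, 0) => Some(opnd),
            _ => None,
        }
    }
}

fn main() {
    let mut insn = Insn::Add { left: Opnd::Reg(0), right: Opnd::Imm(1) };
    // `for` requires Iterator, so a lending iterator like this is
    // consumed with `while let` instead:
    let mut iter = insn.opnd_iter_mut();
    while let Some(opnd) = iter.next() {
        *opnd = Opnd::Reg(9);
    }
}
```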
* Consolidate output operand fetching
Currently we always look at the .out field on instructions whenever
we want to access the output operand. When the instructions become
an enum, this is not going to be possible since the shape of the
variants will be different. Instead, this commit introduces two
functions on Insn: out_opnd() and out_opnd_mut(). These return an
Option containing a reference to the output operand and a mutable
reference to the output operand, respectively.
This commit then uses those functions to replace all instances of
accessing the output operand. For the most part this was
straightforward: where we previously checked for Opnd::None we now
check for None, and where we assumed there was an output operand we
now unwrap.
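Continuing the illustrative enum sketched earlier, the two accessors look roughly like this (hypothetical variants; the real match is exhaustive over every instruction):

```rust
impl Insn {
    // The Option distinguishes variants with no output (jumps, etc.).
    fn out_opnd(&self) -> Option<&Opnd> {
        match self {
            Insn::Add { out, .. }
            | Insn::Load { out, .. }
            | Insn::CCall { out, .. } => Some(out),
            Insn::Jmp(_) => None,
        }
    }

    fn out_opnd_mut(&mut self) -> Option<&mut Opnd> {
        match self {
            Insn::Add { out, .. }
            | Insn::Load { out, .. }
            | Insn::CCall { out, .. } => Some(out),
            Insn::Jmp(_) => None,
        }
    }
}
```

Call sites then read e.g. `insn.out_opnd().is_none()` where they used to compare against Opnd::None, and unwrap where an output is known to exist.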
Yet another case of `jit_mov_gc_ptr()` being yanked out during the
transition to the new backend, causing a crash after object movement.
The interesting wrinkle with this one is that not all callinfos are GC'ed
objects, so the old code carried an implicit assumption.
b0b9f7201a/yjit/src/codegen.rs (L4087-L4095)
* Mutate in place for register allocation
Currently, during register allocation, we allocate a new
instruction each time: we split the old instruction into its
component parts, map the operands and the output, and push the
parts onto the new assembler.
Since we don't need the old instruction, we can mutate it in place
instead. While not a big win in itself, this matches much more
closely what we'll have to do once the instruction switches from a
struct to an enum: it's much easier for an instruction to modify
itself, since it knows its own shape, than to push a new
instruction that closely matches it.
* Mutate in place for arm64 split
When we're splitting instructions for the arm64 backend, we map all
of the operands for a given instruction when it has an Opnd::Value.
We can do this in place with the existing operand instead of
allocating a new vector each time. This enables us to pattern match
against the entire instruction instead of just the opcode, which is
much closer to matching against an enum.
* Match against entire instruction in arm64_emit
Instead of matching against the opcode and then accessing the
various fields on the instruction when emitting machine code for
arm64, we should match against the entire instruction. This is much
closer to what will happen when we switch it over to being an enum.
* Match against entire instruction in x86_64 backend
When we're splitting or emitting code for x86_64, we should match
against the entire instruction instead of matching against just the
opcode. This gets us closer to matching against an enum instead of
a struct.
* Reuse instructions for arm64_split
When we're splitting, the default behavior was previously to split
up the instruction into its component parts and then reassemble
them in a new instruction. Instead, we can reuse the existing
instruction.
* Only check lowest bit for _Bool type
The `test AL, AL` got lost during porting and we were
generating `test RAX, RAX` instead. The upper bits of a `_Bool`
return value are unspecified, and we were failing
`TestClass#test_singleton_class_should_has_own_namespace`
due to interpreting the return value incorrectly.
* Enable test_class for test-all on x86_64
When we're pushing instructions onto the assembler, we previously
would iterate through the instruction's operands and then assign
the output operand through the push_insn function. This is easy
when all instructions have a vector of operands, but much harder
when each enum variant has a different shape.
This commit changes it so that we explicitly define the output
operand for each instruction before it gets pushed onto the
assembler. This has the added benefit of changing the definition
of push_insn to no longer require a mutable instruction.
This paves the way to make the out field on the instructions an
Option<Opnd> instead which is going to more accurately reflect
the behavior we're going to have once we switch the instructions
over to an enum instead of a struct.
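A before/after sketch of the call shape (helper names like `next_opnd_out` are illustrative, not necessarily the final API):

```rust
// Before: push_insn took a mutable instruction so it could compute
// and write the output operand into it:
// let out = asm.push_insn(Op::Add, vec![left, right], None, None, None);

// After: the caller allocates the output up front, so push_insn can
// take the finished instruction by value, immutably:
let out = asm.next_opnd_out(); // illustrative helper
asm.push_insn(Insn::Add { left, right, out });
```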
Currently we use macros to define the shape of each of the
instruction building methods. This works while all of the
instructions share the same fields, but is really hard to get
working when they're an enum with different shapes. This is an
incremental step toward a bigger refactor of changing the Insn
from a struct to an enum.
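The kind of macro being phased out looked roughly like this (a simplified sketch, not the exact YJIT macro or push_insn signature):

```rust
// One invocation per opcode; this only works while every instruction
// shares the same (opnds, out, target, text) field layout.
macro_rules! def_push_2_opnd {
    ($name:ident, $opcode:expr) => {
        pub fn $name(&mut self, opnd0: Opnd, opnd1: Opnd) -> Opnd {
            self.push_insn($opcode, vec![opnd0, opnd1], None, None, None)
        }
    };
}
```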
* Create code generation func
* Make rb_vm_concat_array available to use in Rust
* Map opcode to code gen func
* Implement code gen for concatarray
* Add test for concatarray
* Use new asm backend
* Add comment to C func wrapper
We have a large extern block in cruby.rs left over from the port. We can
use bindgen for it now and reserve the manual declarations for just a
handful of vm_insnhelper.c functions.
Fix up a few minor discrepancies bindgen found between the C declarations
and the manual declarations, mostly missing `const` on the C side.
`YJIT.simulate_oom!` used to leave one byte of space in the code block,
so our test didn't expose a problem with asserting that the write
position is in bounds in `CodeBlock::set_pos`. We do the following when
patching code:
1. save current write position
2. seek to middle of the code block and patch
3. restore old write position
The bounds check fails on (3) when the code block is already filled up.
Leaving one byte of space also meant that when we write that byte, we
need to fill the entire code region with trapping instructions in
`VirtualMem`, which made the OOM tests unnecessarily slow.
Remove the incorrect bounds check and stop leaving space in the code
block when simulating OOM.
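The failing pattern, sketched with illustrative method names (`patch_pos`, `target`, and `emit_jump` are hypothetical; the real code works through the CodeBlock API):

```rust
let saved_pos = cb.get_write_pos(); // (1) save current write position
cb.set_pos(patch_pos);              // (2) seek into the block...
emit_jump(&mut cb, target);         //     ...and write the patch
cb.set_pos(saved_pos);              // (3) restore; when the block is
                                    //     exactly full, saved_pos equals
                                    //     the capacity, which a strict
                                    //     bounds check would reject
```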
* Iterator
* Use the new iterator for the X86 backend split
* Use iterator for reg alloc, remove forward pass
* Fix up iterator usage on AArch64
* Update yjit/src/backend/ir.rs
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* Various PR feedback for iterators for IR
* Use a local mutable reference for a64_split
* Move tests from ir.rs to tests.rs in backend
* Fix x86 shift instructions live range calculation
* Iterator
* Use the new iterator for the X86 backend split
* Fix up x86 iterator usage
* Fix ARM iterator usage
* Remove unintentionally duplicated tests
* Port gen_send_iseq to the new backend IR
* Replace occurrences of 8 by SIZEOF_VALUE
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
* Update flags for data processing on ARM
* Update yjit/src/backend/arm64/mod.rs
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Previously, we patched in an x64 JMP even on A64, which resulted in
invalid machine code. Use the new assembler to generate a jump instead.
Add an assert to make sure patches don't step on each other since it's
less clear cut on A64, where the size of the jump varies depending on
its placement relative to the target.
Fixes a lot of tests that use `set_trace_func` in `test_insns.rb`.
PR: https://github.com/Shopify/ruby/pull/379
* Left and right shift for IR
* Update yjit/src/backend/x86_64/mod.rs
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* Port opt_eq and opt_neq to the new backend
* Just use into() outside
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* Use C_RET_OPND to share the register
* Revert "Use C_RET_OPND to share the register"
This reverts commit 99381765d0008ff0f03ea97c6c8db608a2298e2b.
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* Port setivar to the new backend IR
* Add a few more setivar test cases
* Prefer const_ptr
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* Fix asm.load(VALUE)
- `<VALUE as Into<Opnd>>::into()` didn't track that the operand holds
  a VALUE
- `Iterator::map` doesn't evaluate the closure you give it until you
  consume the iterator (e.g. with `collect`). Use a for loop instead
  so we actually put the GC offsets into the compiled block.
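In isolation, the laziness pitfall looks like this (a self-contained toy, not the YJIT code):

```rust
fn main() {
    let gc_offsets = vec![8usize, 16, 24];
    let mut recorded = Vec::new();

    // Lazy: the closure never runs because nothing consumes the Map.
    let _ = gc_offsets.iter().map(|off| recorded.push(*off));
    assert!(recorded.is_empty());

    // A for loop actually drives the iteration, so the side effects
    // (recording the offsets) happen:
    for off in gc_offsets.iter() {
        recorded.push(*off);
    }
    assert_eq!(recorded, vec![8, 16, 24]);
}
```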
* x64: Mov(mem, VALUE) should load the value first
Tripped in codegen for putobject now that we are actually feeding
`Opnd::Value` into the backend.
* x64 split: Canonicalize VALUE loads
* Update yjit/src/backend/x86_64/mod.rs
* Port gen_send_cfunc to the new backend
* Remove an obsoleted test
* Add more cfunc tests
* Use csel_e instead and more into()
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* Add a missing lea for build_kwargs
* Split cfunc test cases
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
forward_pass adjusts the indexes of our opnds to reflect the new
instructions as they are generated in the forward pass. However, we were
using the old live_ranges array, for which the new indexes are
incorrect.
This caused us to previously generate an IR which contained unnecessary
trivial load instructions (ex. mov rax, rax), because it was looking at
the wrong lifespans. Presumably this could also cause real bugs,
since the lifespan looked up for the wrong operand index could be
too short.
We've added an assert which would have failed on the previous trivial
case (but not necessarily all cases).
Co-authored-by: Matthew Draper <matthew@trebex.net>
* Convert getinstancevariable to new backend IR
* Support mem-based mem
* Use more into()
* Add tests for getivar
* Just load obj_opnd to a register
* Apply another into()
* Flip the nil-out condition
* Fix duplicated counts of side_exit
* Move allocation into Assembler::pos_marker
We wanted to do this to begin with but didn't because we were confused
about the lifetime parameter. It's actually talking about the lifetime
of the references that the closure captures. Since all of our usages
capture no references (they use `move`), it's fine to put a `+ 'static`
here.
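The shape in miniature (types simplified; `CodePtr` here is a stand-in):

```rust
#[derive(Clone, Copy)]
struct CodePtr(usize);

struct Assembler {
    // `'static` only forbids captured borrows, not captured values.
    pos_markers: Vec<Box<dyn Fn(CodePtr) + 'static>>,
}

impl Assembler {
    fn pos_marker(&mut self, marker: impl Fn(CodePtr) + 'static) {
        self.pos_markers.push(Box::new(marker));
    }
}

fn main() {
    let mut asm = Assembler { pos_markers: Vec::new() };
    let block_idx = 3usize;
    // `move` transfers ownership of block_idx into the closure, so it
    // captures no references and satisfies the `'static` bound:
    asm.pos_marker(move |ptr| println!("block {} at {}", block_idx, ptr.0));
}
```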
* Use optional token syntax for calling convention macro
* Explicitly request C ABI on ARM
It looks like the Rust calling convention for functions is the same as
the C ABI for now, and it's unlikely to change, but it's easy for us to
be explicit here. I also tried saying `extern "aapcs"`, but that
unfortunately doesn't work.
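The change amounts to annotations of this shape (the function and its signature here are illustrative, not a real YJIT entry point):

```rust
// A callback reached from JIT-generated machine code; `extern "C"`
// pins the ABI instead of relying on Rust's unspecified default.
extern "C" fn stub_hit(payload: *mut u8) -> *const u8 {
    payload
}
```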
* A64: Fix off by one in offset encoding for BL
It's relative to the address of the instruction, not the end of it.
* A64: Fix off by one when encoding B
It's relative to the start of the instruction, not the end.
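In sketch form, the corrected offset math shared by B and BL (runnable, with byte addresses as plain integers):

```rust
// A64 B/BL encode a signed word offset relative to the branch
// instruction's own address (its PC), not the following instruction.
fn branch_imm(insn_addr: i64, target_addr: i64) -> i32 {
    let byte_offset = target_addr - insn_addr; // not insn_addr + 4
    assert_eq!(byte_offset % 4, 0, "A64 instructions are word-aligned");
    (byte_offset / 4) as i32
}

fn main() {
    // Branch from 0x1000 to 0x1008: two instructions forward.
    assert_eq!(branch_imm(0x1000, 0x1008), 2);
}
```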
* A64: Add some tests for boundary offsets
It allows for reserving a specific register and prevents the register
allocator from clobbering it. Without this
`./miniruby --yjit-stats --yjit-callthreshold=1 -e0` was crashing because
the counter incrementing code was clobbering RAX incorrectly.
* Better splitting for Op::Add, Op::Sub, and Op::Cmp
* Split stores if the displacement is too large
* Use a shifted immediate argument
* Split all places where shifted immediates are used
* Add more tests to the cirrus workflow
* Refactor defer_compilation to use PosMarker
* Port gen_direct_jump() to use PosMarker
* Port gen_branch, branchunless
* Port over gen_jump()
* Port over branchif and branchnil
* Fix use of record_boundary_patch_point in jump_to_next_insn
* gen_dupn was allocating and throwing away an Assembler object instead of using the one passed in
* Uncomment remaining tests in codegen.rs, which seem to work now
* Implement PosMarker instruction
* Implement PosMarker in the arm backend
* Make bindgen run only for clang image
* Fix if-else in cirrus CI file
* Add missing semicolon
* Try removing trailing semicolon
* Try to fix shell/YAML syntax
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* Move to/from SP on AArch64
* Consolidate loads and stores
* Implement LDR post-index and LDR pre-index for AArch64
* Implement STR post-index and STR pre-index for AArch64
* Module entrypoints for LDR pre/post -index and STR pre/post -index
* Use STR (pre-index) and LDR (post-index) to implement push/pop
* Go back to using MOV for to/from SP
* ADR and ADRP for AArch64
* Implement Op::Jbe on X86
* Lera instruction
* Op::BakeString
* LeaPC -> LeaLabel
* Port print_str to the new backend
* Port print_value to the new backend
* Port print_ptr to the new backend
* Write null-terminators in Op::BakeString
* Fix up rebase issues on print-str port
* Add back in panic for X86 backend for unsupported instructions being lowered
* Fix target architecture
Previously we were using a `Box<dyn FnOnce>` to support patching the
code when jumping to labels. We needed to do this because some of the
closures that were being used to patch needed to capture local variables
(on both X86 and ARM it was the type of condition for the conditional
jumps).
To get around that, we can instead use const generics, since the
condition codes are always known at compile time. The closures go
from polymorphic to monomorphic, so they can be represented as
plain `fn` pointers instead of `Box<dyn FnOnce>`. This simplifies
the storage of the `LabelRef` structs and should hopefully be a
better default going forward.
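A minimal, self-contained sketch of the monomorphization trick (types are stand-ins):

```rust
struct CodeBlock; // stand-in for the real code block type

// The condition code becomes a const generic instead of a captured
// variable, so the function has no environment to close over:
fn emit_conditional_jump<const CONDITION: u8>(_cb: &mut CodeBlock, _src: i64, _dst: i64) {
    // ...emit a jump using CONDITION (a compile-time constant here)
}

fn main() {
    // Each instantiation is monomorphic and coerces to a plain fn
    // pointer, so it can be stored without a Box<dyn FnOnce>:
    let patch: fn(&mut CodeBlock, i64, i64) = emit_conditional_jump::<0b0001>;
    patch(&mut CodeBlock, 0, 0);
}
```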
* More Arm64 lowering/backend work
* We now have encoding support for the LDR instruction for loading a PC-relative memory location
* You can now call add/adds/sub/subs with signed immediates, which switches appropriately based on sign
* We can now load immediates into registers appropriately, attempting to use the minimal number of instructions (see the sketch after this list):
* If it fits into 16 bits, we use just a single movz.
* Else if it can be encoded as a bitmask immediate, we use a single mov.
* Otherwise we use a movz, a movk, and then optionally another one or two movks.
* Fixed a bunch of code to do with the Op::Load opcode.
* We now handle GC-offsets properly for Op::Load by skipping around them with a jump instruction. (This will be made better by constant pools in the future.)
* Op::Lea is doing what it's supposed to do now.
* Fixed a bug in the backend tests to do with not using the result of an Op::Add.
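A runnable sketch of just the MOVZ/MOVK chunking fallback (`movz_movk_parts` is a hypothetical helper; the real emitter first tries the single-MOVZ and bitmask-immediate cases, then writes instructions directly):

```rust
fn movz_movk_parts(val: u64) -> Vec<(u16, u8)> {
    // (16-bit chunk, left-shift) pairs: the first entry is the MOVZ,
    // the rest become MOVKs; all-zero chunks are skipped entirely.
    let mut parts = vec![((val & 0xffff) as u16, 0u8)];
    for shift in [16u8, 32, 48] {
        let chunk = ((val >> shift) & 0xffff) as u16;
        if chunk != 0 {
            parts.push((chunk, shift));
        }
    }
    parts
}

fn main() {
    assert_eq!(movz_movk_parts(0x42).len(), 1);             // one MOVZ
    assert_eq!(movz_movk_parts(0x1234_0000_5678).len(), 2); // MOVZ + one MOVK
}
```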
* Fix the remaining tests for Arm64
* Move split loads logic into each backend
* Get initial wiring up
* Split IncrCounter instruction
* Breakpoints in Arm64
* Support for ORR
* MOV instruction encodings
* Implement JmpOpnd and CRet
* Add ORN
* Add MVN
* PUSH, POP, CCALL for Arm64
* Some formatting and implement Op::Not for Arm64
* Consistent constants when working with the Arm64 SP
* Allow OR-ing values into the memory buffer
* Test lowering Arm64 ADD
* Emit unconditional jumps consistently in Arm64
* Begin emitting conditional jumps for A64
* Back out some labelref changes
* Remove label API that no longer exists
* Use a trait for the label encoders
* Encode nop
* Add in nops so jumps are the same width no matter what on Arm64
* Op::Jbe for CodePtr
* Pass src_addr and dst_addr instead of calculated offset to label refs
* Even more jump work for Arm64
* Fix up jumps to use consistent assertions
* Handle splitting Add, Sub, and Not insns for Arm64
* More Arm64 splits and various fixes
* PR feedback for Arm64 support
* Split up jumps and conditional jump logic
* Remove x86-64 dependency from codegen.rs
* Port over putnil and putobject
* Port over gen_leave()
* Complete port of gen_leave()
* Fix bug in x86 instruction splitting
* LDUR
* Fix up immediate masking
* Consume operands directly
* Consistency and cleanup
* More consistency and entrypoints
* Cleaner syntax for masks
* Cleaner shifting for encodings
* Initial setup for aarch64
* ADDS and SUBS
* ADD and SUB for immediates
* Revert moved code
* Documentation
* Rename Arm64* to A64*
* Comments on shift types
* Share sig_imm_size and unsig_imm_size
* Split instructions if necessary
* Add a reusable transform_insns function
* Split out comments and labels from transform_insns
* Refactor alloc_regs to use transform_insns
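
A hypothetical shape for the shared transform_insns pass (stub types; the real function also threads comment and label bookkeeping through):

```rust
struct Insn;
struct Assembler { insns: Vec<Insn> }

impl Assembler {
    fn new() -> Self { Assembler { insns: Vec::new() } }
}

fn transform_insns<F>(asm: Assembler, mut map_insn: F) -> Assembler
where
    F: FnMut(&mut Assembler, usize, Insn),
{
    let mut new_asm = Assembler::new();
    for (index, insn) in asm.insns.into_iter().enumerate() {
        // Each pass (splitting, register allocation, ...) decides
        // what to push onto the new assembler for this instruction.
        map_insn(&mut new_asm, index, insn);
    }
    new_asm
}
```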