ruby/yjit/bindgen
Aaron Patterson 9e067df76b 32 bit comparison on shape id
This commit changes the shape id comparisons to use a 32 bit comparison
rather than 64 bit.  That means we don't need to load the shape id to a
register on x86 machines.

Given the following program:

```ruby
class Foo
  def initialize
    @foo = 1
    @bar = 1
  end

  def read
    [@foo, @bar]
  end
end

foo = Foo.new
foo.read
foo.read
foo.read
foo.read
foo.read

puts RubyVM::YJIT.disasm(Foo.instance_method(:read))
```

The machine code we generated _before_ this change is like this:

```
== BLOCK 1/4, ISEQ RANGE [0,3), 65 bytes ======================
  # getinstancevariable
  0x559a18623023: mov rax, qword ptr [r13 + 0x18]
  # guard object is heap
  0x559a18623027: test al, 7
  0x559a1862302a: jne 0x559a1862502d
  0x559a18623030: cmp rax, 4
  0x559a18623034: jbe 0x559a1862502d
  # guard shape, embedded, and T_OBJECT
  0x559a1862303a: mov rcx, qword ptr [rax]
  0x559a1862303d: movabs r11, 0xffff00000000201f
  0x559a18623047: and rcx, r11
  0x559a1862304a: movabs r11, 0xb000000002001
  0x559a18623054: cmp rcx, r11
  0x559a18623057: jne 0x559a18625046
  0x559a1862305d: mov rax, qword ptr [rax + 0x18]
  0x559a18623061: mov qword ptr [rbx], rax

== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 47 bytes ======================
  # gen_direct_jmp: fallthrough
  # getinstancevariable
  # regenerate_branch
  # getinstancevariable
  # regenerate_branch
  0x559a18623064: mov rax, qword ptr [r13 + 0x18]
  # guard shape, embedded, and T_OBJECT
  0x559a18623068: mov rcx, qword ptr [rax]
  0x559a1862306b: movabs r11, 0xffff00000000201f
  0x559a18623075: and rcx, r11
  0x559a18623078: movabs r11, 0xb000000002001
  0x559a18623082: cmp rcx, r11
  0x559a18623085: jne 0x559a18625099
  0x559a1862308b: mov rax, qword ptr [rax + 0x20]
  0x559a1862308f: mov qword ptr [rbx + 8], rax
```

After this change, it's like this:

```
== BLOCK 1/4, ISEQ RANGE [0,3), 41 bytes ======================
  # getinstancevariable
  0x5560c986d023: mov rax, qword ptr [r13 + 0x18]
  # guard object is heap
  0x5560c986d027: test al, 7
  0x5560c986d02a: jne 0x5560c986f02d
  0x5560c986d030: cmp rax, 4
  0x5560c986d034: jbe 0x5560c986f02d
  # guard shape
  0x5560c986d03a: cmp word ptr [rax + 6], 0x19
  0x5560c986d03f: jne 0x5560c986f046
  0x5560c986d045: mov rax, qword ptr [rax + 0x10]
  0x5560c986d049: mov qword ptr [rbx], rax

== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 23 bytes ======================
  # gen_direct_jmp: fallthrough
  # getinstancevariable
  # regenerate_branch
  # getinstancevariable
  # regenerate_branch
  0x5560c986d04c: mov rax, qword ptr [r13 + 0x18]
  # guard shape
  0x5560c986d050: cmp word ptr [rax + 6], 0x19
  0x5560c986d055: jne 0x5560c986f099
  0x5560c986d05b: mov rax, qword ptr [rax + 0x18]
  0x5560c986d05f: mov qword ptr [rbx + 8], rax
```

The first ivar read is a bit more complex, but the second ivar read is
much simpler.  I think eventually we could teach the context about the
shape, then emit only one shape guard.
2022-11-18 12:04:10 -08:00
..
src 32 bit comparison on shape id 2022-11-18 12:04:10 -08:00
Cargo.lock Update bindgen crate (#6397) 2022-09-18 20:42:57 +09:00
Cargo.toml Update bindgen crate (#6397) 2022-09-18 20:42:57 +09:00