github/ruby - ruby

Commit graph

Author	SHA1	Message	Date
Maxime Chevalier-Boisvert	c9b30f9d76	YJIT: implement imul instruction encoding in x86 assembler (#8191 )	2023-08-09 13:12:21 -04:00
Kevin Newton	a41c617e41	Implement MUL instruction for aarch64 (#8193 )	2023-08-09 12:21:53 -04:00
Takashi Kokubun	6acfc50bcc	YJIT: Count all opt_getconstant_path exit reasons (#8187 )	2023-08-09 09:54:24 -04:00
Alan Wu	5eef3ce21f	YJIT: Correct name of a counter (#8186 )	2023-08-09 09:47:42 -04:00
Takashi Kokubun	cd8d20cd1f	YJIT: Compile exception handlers (#8171 ) Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-08-08 16:06:22 -07:00
Maxime Chevalier-Boisvert	8d7861e3da	YJIT: expand bitwise shift support in x86 assembler (#8174 )	2023-08-04 14:57:56 -04:00
Maxime Chevalier-Boisvert	fc0b2a8df2	YJIT: guard for array_len >= num in expandarray (#8169 ) Avoid generating long dispatch chains for all array lengths seen.	2023-08-04 10:09:43 -04:00
Maxime Chevalier-Boisvert	4f99240b2e	YJIT: add jb (unsigned less-than) instruction to backend (#8168 )	2023-08-03 16:14:44 -04:00
Maxime Chevalier-Boisvert	98b4256aa7	YJIT: handle expandarray_rhs_too_small case (#8161 ) * YJIT: handle expandarray_rhs_too_small case YJIT: fix csel bug in x86 backend, add test Remove commented out lines Refactor expandarray to use chain guards Propagate Type::Nil when known Update yjit/src/codegen.rs Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> * Add missing counter, use get_array_ptr() in expandarray * Make change suggested by Kokubun to reuse loop --------- Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2023-08-03 16:09:18 -04:00
Jean byroot Boussier	e20f1e443f	YJIT: Fallback setivar if the receiver isn't T_OBJECT (#8160 ) Followup: https://github.com/ruby/ruby/pull/8152 If the receiver is a T_MODULE or T_CLASS and has a lot of ivars, `get_next_shape_internal` will return `NULL`. Co-authored-by: Jean Boussier <byroot@ruby-lang.org>	2023-08-02 11:33:12 -04:00
Takashi Kokubun	81c198b5cf	YJIT: Distinguish exit and fallback reasons for send (#8159 )	2023-08-02 10:19:39 -04:00
Takashi Kokubun	452debba22	YJIT: Fix --yjit-dump-disasm coloring on less(1) (#8158 )	2023-08-02 10:16:37 -04:00
Takashi Kokubun	d405410e3c	YJIT: Move ROBJECT_OFFSET_* to yjit.c (#8157 )	2023-08-02 10:15:29 -04:00
Hiroshi SHIBATA	fd782dcd1e	Revert "YJIT: implement `expandarray_rhs_too_small` case (#8153 )" This reverts commit `3b88a0bee8`. This commit breaks the aarch64 platform and Apple Silicon	2023-08-02 14:25:16 +09:00
Takashi Kokubun	5ff1c00e17	YJIT: Let local yjit-bindgen exit successfully (#8156 )	2023-08-01 14:46:14 -07:00
Maxime Chevalier-Boisvert	3b88a0bee8	YJIT: implement `expandarray_rhs_too_small` case (#8153 ) * YJIT: handle expandarray_rhs_too_small case * YJIT: fix csel bug in x86 backend, add test * Remove commented out lines	2023-08-01 15:58:00 -04:00
Takashi Kokubun	16b91a346f	YJIT: Fallback setivar if the next shape is too complex (#8152 )	2023-08-01 11:43:32 -07:00
Takashi Kokubun	bde4080b39	YJIT: Drop Copy trait from Context (#8138 )	2023-07-29 21:14:04 -04:00
Takashi Kokubun	cf0c907bc7	YJIT: Count setivar too-complex exits (#8131 )	2023-07-27 19:35:30 -04:00
Maxime Chevalier-Boisvert	5669a28fde	YJIT: implement missing `asm.jg` instruction in backend (#8130 ) YJIT: implement missing jg instruction in backend While trying to implement a specialize integer left shift, I ran into a problem where we have no way to do a greater-than comparison at the moment. Surprising we went this far without ever needing it.	2023-07-27 17:47:29 -04:00
Alan Wu	7633299c50	YJIT: getblockparamproxy for when block is a Proc	2023-07-27 15:03:46 -04:00
Alan Wu	34825ed20d	Revert "YJIT: Fix naming for a getblockparamproxy counter" This reverts commit e7804963f09d7df7f6cce44fbb3e37809c9a15cc. Oops. The counter was for getblockparam, without "proxy", so it was aptly named.	2023-07-27 15:03:46 -04:00
Takashi Kokubun	e5effa4bd0	YJIT: Use dynamic dispatch for megamorphic send (#8125 )	2023-07-27 13:09:17 -04:00
Takashi Kokubun	9bdd485972	YJIT: Count the number of dynamic send dispatches (#8122 )	2023-07-26 12:59:59 -07:00
Alan Wu	37160be439	YJIT: Fix naming for a getblockparamproxy counter The rest of the counters are prefixed with `gbpp_` and that's what `yjit.rb` uses when printing the summary. This counter wasn't included in the summary.	2023-07-26 13:58:15 -04:00
ywenc	8ca399d640	Implement `opt_aref_with` instruction (#8118 ) Implement gen_opt_aref_with Vm opt_aref_with is available Test opt_aref_with Stats for opt_aref_with Co-authored-by: jhawthorn <jhawthorn@github.com>	2023-07-26 10:38:59 -04:00
Takashi Kokubun	cef60e93e6	YJIT: Fallback send instructions to vm_sendish (#8106 )	2023-07-24 13:51:46 -07:00
Takashi Kokubun	c4ef3d767b	YJIT: Rename exec_instruction to yjit_insns_count (#8102 )	2023-07-20 15:54:59 -04:00
Takashi Kokubun	a7127745f1	Get rid of obsoleted __bp__ references	2023-07-20 11:55:31 -07:00
Takashi Kokubun	b41fc9b9a4	YJIT: Avoid undercounting retired_in_yjit (#8038 ) * YJIT: Count the number of failed instructions * Rename yjit_insns_count to exec_instructions instead * Hoist out the exec_instruction counter	2023-07-20 13:14:25 -04:00
Alan Wu	f302e725e1	Remove __bp__ and speed-up bmethod calls (#8060 ) Remove rb_control_frame_t::__bp__ and optimize bmethod calls This commit removes the __bp__ field from rb_control_frame_t. It was introduced to help MJIT, but since MJIT was replaced by RJIT, we can use vm_base_ptr() to compute it from the SP of the previous control frame instead. Removing the field avoids needing to set it up when pushing new frames. Simply removing __bp__ would cause crashes since RJIT and YJIT used a slightly different stack layout for bmethod calls than the interpreter. At the moment of the call, the two layouts looked as follows: ┌────────────┐ ┌────────────┐ │ frame_base │ │ frame_base │ ├────────────┤ ├────────────┤ │ ... │ │ ... │ ├────────────┤ ├────────────┤ │ args │ │ args │ ├────────────┤ └────────────┘<─prev_frame_sp │ receiver │ prev_frame_sp─>└────────────┘ RJIT & YJIT interpreter Essentially, vm_base_ptr() needs to compute the address to frame_base given prev_frame_sp in the diagrams. The presence of the receiver created an off-by-one situation. Make the interpreter use the layout the JITs use for iseq-to-iseq bmethod calls. Doing so removes unnecessary argument shifting and vm_exec_core() re-entry from the interpreter, yielding a speed improvement visible through `benchmark/vm_defined_method.yml`: patched: 7578743.1 i/s master: 4796596.3 i/s - 1.58x slower C-to-iseq bmethod calls now store one more VALUE than before, but that should have negligible impact on overall performance. Note that re-entering vm_exec_core() used to be necessary for firing TracePoint events, but that's no longer the case since `9121e57a5f`. Closes ruby/ruby#6428	2023-07-17 13:57:58 -04:00
Maxime Chevalier-Boisvert	d70484f0eb	YJIT: refactoring to allow for fancier call threshold logic (#8078 ) * YJIT: refactoring to allow for fancier call threshold logic * Avoid potentially compiling functions multiple times. * Update vm.c Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> --------- Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>	2023-07-17 10:41:18 -04:00
Takashi Kokubun	d814722fb8	YJIT: Make ratio_in_yjit always available (#8064 )	2023-07-13 18:14:43 -04:00
Peter Zhu	3223181284	Remove RARRAY_CONST_PTR_TRANSIENT RARRAY_CONST_PTR now does the same things as RARRAY_CONST_PTR_TRANSIENT.	2023-07-13 14:48:14 -04:00
Peter Zhu	1e7b67f733	[Feature #19730 ] Remove transient heap	2023-07-13 09:27:33 -04:00
Hiroshi SHIBATA	76ef28186f	[DOC] Removed redundant `the`	2023-07-13 20:30:44 +09:00
Matt Valentine-House	d426343418	Store object age in a bitmap Closes [Feature #19729] Previously 2 bits of the flags on each RVALUE are reserved to store the number of GC cycles that each object has survived. This commit introduces a new bit array on the heap page, called age_bits, to store that information instead. This patch still reserves one of the age bits in the flags (the old FL_PROMOTED0 bit, now renamed FL_PROMOTED). This is set to 0 for young objects and 1 for old objects, and is used as a performance optimisation for the write barrier. Fetching the age_bits from the heap page and doing the required math to calculate if the object was old or not would slow down the write barrier. So we keep this bit synced in the flags for fast access.	2023-07-13 09:21:36 +01:00
Maxime Chevalier-Boisvert	e770006486	YJIT: add counter for untracked gbpp exit reason (#8052 )	2023-07-11 10:17:48 -04:00
Takashi Kokubun	48d95de6a6	YJIT: Use registers to pass stack temps to C calls (#7920 ) * YJIT: Use registers to pass stack temps to C calls * YJIT: Update comments in ccall	2023-07-06 15:51:01 -04:00
Maxime Chevalier-Boisvert	2acb44e044	YJIT: add new stats counter for compiled ISEQ entry points (#8032 ) * YJIT: add new stats counter for compiled ISEQ entry points * Update yjit.rb Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> --------- Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2023-07-06 10:17:03 -04:00
Takashi Kokubun	9c1776e6b0	YJIT: Use --yjit-exec-mem-size=128 by default (#8031 )	2023-07-05 14:21:02 -07:00
Takashi Kokubun	6b2abe570f	YJIT: Avoid reloading InsnOut operands (#8021 )	2023-07-04 16:02:39 -04:00
Takashi Kokubun	a1d4dada6b	YJIT: Break register cycles for C arguments (take 2) (#8018 ) * Revert "Revert "YJIT: Break register cycles for C arguments (#7918)"" This reverts commit `78ca085785`. * Use shfited_live_ranges for the last-insn check	2023-07-04 15:57:32 -04:00
Alan Wu	296782ab60	YJIT: Fix autosplat miscomp for blocks with optionals (#8006 ) * YJIT: Fix autosplat miscomp for blocks with optionals When passing an array as the sole argument to `yield`, and the yieldee takes more than 1 optional parameter, the array is expanded similar to `array` splat calls. This is called "autosplat" in `setup_parameters_complex()`. Previously, YJIT did not detect this autosplat condition. It passed the array without expanding it, deviating from interpreter behavior. Detect this condition and refuse to compile it. Fixes: Shopify/yjit#313 RJIT: Fix autosplat miscomp for blocks with optionals This mirrors the same issue as YJIT. See previous commit.	2023-07-04 10:45:29 -04:00
Hiroshi SHIBATA	218f913aa6	Suppressing security alert of atty dependency by env_logger-0.9.0	2023-07-04 10:18:54 -04:00
Nobuyoshi Nakada	512cac3240	Remove taint and untrusted flags (#7958 ) * Make TAINT and UNTRUSTED flags zero These flags do nothing already, and should break nothing. * Remove TAINT and UNTRUSTED macros same as functions These macros had been defined to use with `#ifdef`, but should not be used anymore.	2023-06-19 11:29:24 -04:00
Takashi Kokubun	78ca085785	Revert "YJIT: Break register cycles for C arguments (#7918 )" This reverts commit `888ba29e46`. It caused a CI failure http://ci.rvm.jp/results/trunk-yjit@ruby-sp2-docker/4598881 and I'm investigating it.	2023-06-12 11:30:25 -07:00
Takashi Kokubun	888ba29e46	YJIT: Break register cycles for C arguments (#7918 )	2023-06-12 12:54:06 -04:00
Peter Zhu	441302be1a	Remove RHASH_TRANSIENT_FLAG Hashes are no longer allocated on the transient heap.	2023-06-08 10:42:59 -04:00
Alan Wu	2b54c135ff	YJIT: Avoid identity-based known-class guards for IO objects (#7911 ) `IO#reopen` is very special in that it is able to change the class and singleton class of IO instances. In its presence, it is not correct to assume that IO instances have a stable class/singleton class and guard by comparing identity.	2023-06-06 10:21:29 -04:00
Peter Zhu	7577c101ed	Unify length field for embedded and heap strings (#7908 ) * Unify length field for embedded and heap strings The length field is of the same type and position in RString for both embedded and heap allocated strings, so we can unify it. * Remove RSTRING_EMBED_LEN	2023-06-06 10:19:20 -04:00
Takashi Kokubun	ebe1077330	YJIT: Fix a warning on cargo test (#7909 )	2023-06-05 14:58:04 -07:00
Peter Zhu	2543a6573f	Implement Struct on VWA The benchmark results show that this feature has either a positive or no impact on performance. The memory usage is also mostly unchanged, except in hexapdf, where there is a decrease in RSS. -------------- ----------- ---------- --------- ----------- ---------- --------- -------------- ------------- bench master (ms) stddev (%) RSS (MiB) branch (ms) stddev (%) RSS (MiB) branch 1st itr master/branch activerecord 70.8 2.2 56.0 71.7 2.2 56.0 0.99 0.99 erubi_rails 20.5 13.6 94.7 20.5 14.3 94.2 0.93 1.00 hexapdf 2541.0 0.7 212.8 2544.4 0.7 203.4 1.00 1.00 liquid-c 65.6 0.3 38.9 65.3 0.3 38.9 1.01 1.01 liquid-compile 63.7 0.3 34.6 61.1 0.2 34.6 1.04 1.04 liquid-render 163.1 0.1 37.1 163.3 0.1 37.1 1.00 1.00 mail 139.3 0.1 50.5 137.0 0.1 50.1 0.99 1.02 psych-load 2065.7 0.1 36.9 2068.2 0.1 37.3 1.00 1.00 railsbench 2034.6 0.5 103.9 2031.9 0.5 103.8 1.02 1.00 ruby-lsp 65.3 3.1 89.8 66.2 3.0 89.7 1.01 0.99 sequel 73.2 1.0 40.3 73.4 1.0 40.3 1.00 1.00 -------------- ----------- ---------- --------- ----------- ---------- --------- -------------- -------------	2023-06-05 15:47:16 -04:00
eileencodes	40f090f433	Revert "Revert "Fix cvar caching when class is cloned"" This reverts commit `10621f7cb9`. This was reverted because the gc integrity build started failing. We have figured out a fix so I'm reopening the PR. Original commit message: Fix cvar caching when class is cloned The class variable cache that was added in ruby#4544 changed the behavior of class variables on cloned classes. As reported when a class is cloned AND a class variable was set, and the class variable was read from the original class, reading a class variable from the cloned class would return the value from the original class. This was happening because the IC (inline cache) is stored on the ISEQ which is shared between the original and cloned class, therefore they share the cache too. To fix this we are now storing the `cref` in the cache so that we can check if it's equal to the current `cref`. If it's different we don't want to read from the cache. If it's the same we do. Cloned classes don't share the same cref with their original class. This will need to be backported to 3.1 in addition to 3.2 since the bug exists in both versions. We also added a marking function which was missing. Fixes [Bug #19379] Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2023-06-05 11:11:12 -07:00
Takashi Kokubun	4e26ae3cb9	YJIT: Use #[cfg] instead of if cfg! (#7899 )	2023-06-02 14:16:52 -07:00
Aaron Patterson	10621f7cb9	Revert "Fix cvar caching when class is cloned" This reverts commit `77d1b08247`.	2023-06-01 14:55:36 -07:00
eileencodes	77d1b08247	Fix cvar caching when class is cloned The class variable cache that was added in https://github.com/ruby/ruby/pull/4544 changed the behavior of class variables on cloned classes. As reported when a class is cloned AND a class variable was set, and the class variable was read from the original class, reading a class variable from the cloned class would return the value from the original class. This was happening because the IC (inline cache) is stored on the ISEQ which is shared between the original and cloned class, therefore they share the cache too. To fix this we are now storing the `cref` in the cache so that we can check if it's equal to the current `cref`. If it's different we don't want to read from the cache. If it's the same we do. Cloned classes don't share the same cref with their original class. This will need to be backported to 3.1 in addition to 3.2 since the bug exists in both versions. We also added a marking function which was missing. Fixes [Bug #19379] Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2023-06-01 08:52:48 -07:00
Takashi Kokubun	1f74e25c3b	YJIT: Force showing a backtrace on panic (#7869 )	2023-05-30 11:20:02 -04:00
Nobuyoshi Nakada	d8da563f9e	Add a newline at EOF [ci skip]	2023-05-24 15:04:07 +09:00
Takashi Kokubun	94a513b08f	YJIT: Enable debug symbols in dev_nodebug (#7822 )	2023-05-19 09:05:39 +09:00
Jimmy Miller	bd532fe3ae	YJIT: Move exits in gen_send_iseq to functions and use ? (#7725 )	2023-05-01 17:01:06 -04:00
Jeremy Evans	99c6d19e50	Generalize cfunc large array splat fix to fix many additional cases raising SystemStackError Originally, when `2e7bceb34e` fixed cfuncs to no longer use the VM stack for large array splats, it was thought to have fully fixed Bug #4040, since the issue was fixed for methods defined in Ruby (iseqs) back in Ruby 2.2. After additional research, I determined that same issue affects almost all types of method calls, not just iseq and cfunc calls. There were two main types of remaining issues, important cases (where large array splat should work) and pedantic cases (where large array splat raised SystemStackError instead of ArgumentError). Important cases: ```ruby define_method(:a){\|a\|} a(1380888.times) def b(a); end send(:b, 1380888.times) :b.to_proc.call(self, 1380888.times) def d; yield(1380888.times) end d(&method(:b)) def self.method_missing(a); end not_a_method(1380888.times) ``` Pedantic cases: ```ruby def a; end a(1380888.times) def b(_); end b(1380888.times) def c(_=nil); end c(1380888.times) c = Class.new do attr_accessor :a alias b a= end.new c.a(1380888.times) c.b(1380888.times) c = Struct.new(:a) do alias b a= end.new c.a(1380888.times) c.b(*1380888.times) ``` This patch fixes all usage of CALLER_SETUP_ARG with splatting a large number of arguments, and required similar fixes to use a temporary hidden array in three other cases where the VM would use the VM stack for handling a large number of arguments. However, it is possible there may be additional cases where splatting a large number of arguments still causes a SystemStackError. This has a measurable performance impact, as it requires additional checks for a large number of arguments in many additional cases. This change is fairly invasive, as there were many different VM functions that needed to be modified to support this. 
To avoid too much API change, I modified struct rb_calling_info to add a heap_argv member for storing the array, so I would not have to thread it through many functions. This struct is always stack allocated, which helps ensure GC doesn't collect it early. Because of how invasive the changes are, and how rarely large arrays are actually splatted in Ruby code, the existing test/spec suites are not great at testing for correct behavior. To try to find and fix all issues, I tested this in CI with VM_ARGC_STACK_MAX to -1, ensuring that a temporary array is used for all array splat method calls. This was very helpful in finding breaking cases, especially ones involving flagged keyword hashes. Fixes [Bug #4040] Co-authored-by: Jimmy Miller <jimmy.miller@shopify.com>	2023-04-25 08:06:16 -07:00
Takashi Kokubun	f492e3b4e5	YJIT: Use general definedivar at the end of chains (#7756 )	2023-04-24 12:20:52 -07:00
John Hawthorn	072ef7a1aa	YJIT: invokesuper: Remove cme mid matching check This check was introduced to match an assertion in the C YJIT when this was originally introduced. I don't believe it's necessary for correctness of the generated code. Co-authored-by: Adam Hess <HParker@github.com> Co-authored-by: Daniel Colson <danieljamescolson@gmail.com> Co-authored-by: Luan Vieira <luanzeba@github.com>	2023-04-20 16:09:16 -07:00
Takashi Kokubun	a42c36c0f1	YJIT: Merge lower_stack into the split pass (#7748 )	2023-04-20 15:35:27 -07:00
Maxime Chevalier-Boisvert	277098bde2	Fix inaccurate comment	2023-04-20 16:31:34 -04:00
Takashi Kokubun	64a25977ed	YJIT: Merge csel and mov on arm64 (#7747 ) * YJIT: Refactor arm64_split with &mut insn * YJIT: Merge csel and mov on arm64	2023-04-20 13:08:42 -07:00
Takashi Kokubun	995b960c70	YJIT: Avoid splitting mov for small values on arm64 (#7745 ) * YJIT: Avoid splitting mov for small values on arm64 * Fix a comment Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> * YJIT: Test the 0xffff boundary --------- Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>	2023-04-20 10:05:30 -07:00
Takashi Kokubun	d2d0954c16	YJIT: Replace Mov with LoadInto on arm64 (#7744 ) * YJIT: Replace Mov with LoadInto on arm64 * YJIT: Add a test for the new pass	2023-04-19 17:25:24 -07:00
Takashi Kokubun	5fc11f306c	YJIT: Tweak asm comments (#7743 )	2023-04-19 19:30:43 -04:00
Takashi Kokubun	2531bb0b66	YJIT: Remove Insn::RegTemps (#7741 ) * YJIT: Remove Insn::RegTemps * Update a comment Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com> --------- Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-04-19 13:08:35 -07:00
Aaron Patterson	b816ea8772	Implement opt_newarray_send in YJIT This commit implements opt_newarray_send along with min / max / hash for stack allocated arrays	2023-04-18 17:16:22 -07:00
Aaron Patterson	66938fc703	updating bindgen	2023-04-18 17:16:22 -07:00
John Hawthorn	2dff1d4fda	YJIT: Fix raw sample stack lengths in exit traces (#7728 ) yjit-trace-exits appends a synthetic sample for the instruction being exited, but we didn't increment the size of the stack. Fixing this count correctly lets us successfully generate a flamegraph from the exits. I also replaced the line number for instructions with 0, as I don't think the previous value had meaning. Co-authored-by: Adam Hess <HParker@github.com>	2023-04-18 10:09:16 -04:00
Jimmy Miller	293913905e	YJIT: Fixes failure reported by rails for opt+splat+rest (#7727 )	2023-04-17 17:58:04 -04:00
Takashi Kokubun	5aa3be65f4	YJIT: Spill a caller stack as late as possible (#7726 )	2023-04-17 17:57:33 -04:00
Takashi Kokubun	45c6b58768	YJIT: Add a counter to all side exits (#7720 )	2023-04-14 19:49:44 -04:00
Alan Wu	0b95cbcbde	YJIT: Remove duplicate `asm.spill_temps()` `jit_prepare_routine_call()` calls it, and there is another call above on line 2302. Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2023-04-14 18:01:14 -04:00
Alan Wu	31e67a476f	YJIT: Fix false object collection when setting ivar Previously, setinstancevariable could generate code that calls `rb_ensure_iv_list_size()` without first updating `cfp->sp`. This means in the event that a GC starts from within said routine, the top few objects would not be marked, causing them to be falsely collected. Call `jit_prepare_routine_call()` first. [Bug #19601]	2023-04-14 18:01:14 -04:00
Takashi Kokubun	4501fb8b46	YJIT: Introduce Target::SideExit (#7712 ) * YJIT: Introduce Target::SideExit * YJIT: Obviate Insn::SideExitContext * YJIT: Avoid cloning a Context for each insn	2023-04-14 17:00:10 -04:00
Jimmy Miller	d83e59e6b8	YJIT: Change to Option<CodegenStatus> (#7717 )	2023-04-14 14:21:29 -04:00
Takashi Kokubun	8286eed20b	Allow testing a different version	2023-04-13 17:53:41 -07:00
Jimmy Miller	08413f982c	YJIT: Add support for rest with option and splat args (#7698 )	2023-04-13 16:21:02 -07:00
Takashi Kokubun	bbe69fba59	YJIT: Use an enum to represent counters (#7701 )	2023-04-13 13:54:09 -04:00
Takashi Kokubun	2fcd3ea6d8	YJIT: Move stack_opnd from Context to Assembler (#7700 )	2023-04-13 12:20:29 -04:00
Adam Hess	854baee2c9	YJIT: Add a sampling option to exit tracing (#7693 ) Add a sampling option to trace exits Running YJIT with trace exits enabled can make very large metrics files. This allows us to configure a sample rate to make tracing exits possible on larger tests. This also updates the documented YJIT options. Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> Co-authored-by: John Hawthorn <john@hawthorn.email> Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-04-13 10:07:07 -04:00
John Hawthorn	0ce2bdc76d	YJIT: Fix missing argc check in known cfuncs Previously we were missing a compile-time check that the known cfuncs receive the correct number of arguments. We noticed this because in particular when using ARGS_SPLAT, which also wasn't checked, YJIT would crash on code which was otherwise correct (didn't raise exceptions in the VM). This still supports vararg (argc == -1) cfuncs. I added an additional assertion that when we use the specialized codegen for one of these known functions that the argc are popped off the stack correctly, which should help ensure they're implemented correctly (previously the crash was usually observed on a future `leave` insn). [Bug #19595]	2023-04-12 17:48:34 -07:00
Takashi Kokubun	00bbd31edb	YJIT: Let Assembler own Context (#7691 ) * YJIT: Let Assembler own Context * Update a comment Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com> --------- Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-04-12 12:11:44 -07:00
Alan Wu	c767e6f0c2	YJIT: Fix build on A64 Typo fix for the last commit (`1432b37`)	2023-04-11 19:35:30 -04:00
Takashi Kokubun	1432b3768a	YJIT: Fix a compilation warning in x86_64 This is used only for arm64's cb.jmp_ptr_bytes().	2023-04-11 14:59:21 -07:00
Takashi Kokubun	7297374c5e	YJIT: Reduce paddings if --yjit-exec-mem-size <= 128 on arm64 (#7671 ) * YJIT: Reduce paddings if --yjit-exec-mem-size <= 128 on arm64 * YJIT: Define jmp_ptr_bytes on CodeBlock	2023-04-11 11:02:52 -04:00
Takashi Kokubun	1ff14a855a	YJIT: Avoid using a register for unspecified_bits (#7685 ) Fix [Bug #19586]	2023-04-10 16:35:48 -07:00
Nobuyoshi Nakada	08324ab9eb	Include `--no-llvm-bc` option in `NM` macro only if usable	2023-04-08 12:47:27 +09:00
Takashi Kokubun	89bdf6e94c	YJIT: Stack temp register allocation for arm64 (#7659 ) * YJIT: Stack temp register allocation for arm64 * Update a comment Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com> * Update comments about assertion * Update a comment Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com> --------- Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-04-06 11:34:58 -04:00
Matt Valentine-House	2a34bcaa10	Update VPATH for socket, & dependencies The socket extensions rubysocket.h pulls in the "private" include/gc.h, which now depends on vm_core.h. vm_core.h pulls in id.h when tool/update-deps generates the dependencies for the makefiles, it generates the line for id.h to be based on VPATH, which is configured in the extconf.rb for each of the extensions. By default VPATH does not include the actual source directory of the current Ruby so the dependency fails to resolve and linking fails. We need to append the topdir and top_srcdir to VPATH to have the dependency picked up correctly (and I believe we need both of these to cope with in-tree and out-of-tree builds). I copied this from the approach taken in https://github.com/ruby/ruby/blob/master/ext/objspace/extconf.rb#L3	2023-04-06 11:07:16 +01:00
Takashi Kokubun	1587494b0b	YJIT: Add codegen for Integer methods (#7665 ) * YJIT: Add codegen for Integer methods * YJIT: Update dependencies * YJIT: Fix Integer#[] for argc=2	2023-04-05 13:19:31 -07:00
Takashi Kokubun	615a1bc470	YJIT: Count the number of actually written bytes (#7658 )	2023-04-05 10:32:04 -04:00
Alan Wu	3e1e09b2b7	YJIT: Smoke test on Rust 1.58.0 Since warnings might show up on older version but not newer ones.	2023-04-05 09:49:31 -04:00
Alan Wu	8f734cf93e	YJIT: Enable `unsafe_op_in_unsafe_fn` on crate::core Encourages commenting about soundness of `unsafe` usages.	2023-04-05 09:49:31 -04:00
Alan Wu	929d55c3c7	Revert "YJIT: Suppress unnecessary `unsafe` block (GH-7634)" This reverts commit `9e678cdbd0`. Without the `unsafe` annotations, the SAFETY comments make less sense. I want to keep the SAFETY comments.	2023-04-05 09:49:31 -04:00
Peter Zhu	1da2e7fca3	[Feature #19579 ] Remove !USE_RVARGC code (#7655 ) Remove !USE_RVARGC code [Feature #19579] The Variable Width Allocation feature was turned on by default in Ruby 3.2. Since then, we haven't received bug reports or backports to the non-Variable Width Allocation code paths, so we assume that nobody is using it. We also don't plan on maintaining the non-Variable Width Allocation code, so we are going to remove it.	2023-04-04 17:30:06 -04:00
Maxime Chevalier-Boisvert	d26d3575ca	YJIT: add stats for ratio of versions per block (#7653 )	2023-04-04 16:41:52 -04:00
Takashi Kokubun	e5de0fe108	Remove an unused counter I ended up not using it.	2023-04-04 11:09:09 -07:00
Takashi Kokubun	116c0f92ef	Resurrect yjit-smoke-test before #7651	2023-04-04 11:07:57 -07:00
Takashi Kokubun	b7717fc390	YJIT: Stack temp register allocation (#7651 ) Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-04-04 10:58:11 -07:00
Takashi Kokubun	38209ffdca	YJIT: Add codegen for Array#<< (#7645 )	2023-04-03 14:10:35 -07:00
Takashi Kokubun	1b475fcd10	Remove an unneeded function copy	2023-04-01 23:09:05 -07:00
Takashi Kokubun	df1b007fbd	Remove unused VM_CALL_BLOCKISEQ flag	2023-04-01 10:22:47 -07:00
Alan Wu	8938f146ab	YJIT: Remove unused variable [ci skip]	2023-03-31 15:19:02 -04:00
Nobuyoshi Nakada	9e678cdbd0	YJIT: Suppress unnecessary `unsafe` block (#7634 )	2023-03-31 10:15:39 -04:00
Jimmy Miller	a8782c454c	YJIT: Test more kw and rest cases and change exit name	2023-03-30 18:01:26 -04:00
Takashi Kokubun	9a617c067f	YJIT: Generate side exits as late as possible (#7612 ) * YJIT: Generate side exits late as possible * YJIT: s/for_stack_size/with_stack_size/ * YJIT: s/get_counter/exit_counter/	2023-03-30 14:15:59 -07:00
Alan Wu	1b06422767	YJIT: Leave cfp->pc uninitialized for VM_FRAME_MAGIC_CFUNC C function frames don't need to use the VM-specific pc field to run properly. When pushing a control frame from output code, save one instruction by leaving the field uninitialized. Fix-up rb_vm_svar_lep(), which is used while setting local variables via Regexp#=~. Use cfp->iseq as a secondary signal so it can stop assuming that all CFUNC frames always have zero pc's.	2023-03-29 17:57:52 -04:00
Alan Wu	a1a4d77472	YJIT: code_gc(): Assert self is inline to avoid other_cb() The derived `&mut` from `other_cb()` overlapped with the parameter `ocb`. Use `cfg!()` instead of `#[cfg...]` to avoid unused warnings.	2023-03-29 14:53:49 -04:00
Alan Wu	cdededf24d	YJIT: Take VM lock in RubyVM::YJIT.code_gc Code GC needs synchronization.	2023-03-29 14:53:49 -04:00
Alan Wu	93b6997103	YJIT: Fix overlapping &mut in Assembler::code_gc() Making overlapping `&mut`s triggers Undefined Behavior. This function previously had them through `cb` and `ocb` aliasing with `self` or live references in the caller. To fix the overlap, take `ocb` as a parameter and don't use `get_inline_cb()` in the body of the function.	2023-03-29 14:53:49 -04:00
Jimmy Miller	a8c6ba23a6	YJIT: Rest and keyword (non-supplying) (#7608 ) * YJIT: Rest and keyword (non-supplying) * Update yjit/src/codegen.rs --------- Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-03-29 12:31:41 -04:00
Maxime Chevalier-Boisvert	39a34694a0	YJIT: Add `--yjit-pause` and `RubyVM::YJIT.resume` (#7609 ) * YJIT: Add --yjit-pause and RubyVM::YJIT.resume This allows booting YJIT in a suspended state. We chose to add a new command line option as opposed to simply allowing YJIT.resume to work without any command line option because it allows for combining with YJIT tuning command line options. It also simplifies implementation. Paired with Kokubun and Maxime. * Update yjit.rb Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> --------- Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2023-03-28 15:21:19 -04:00
Takashi Kokubun	2f8a598dc5	YJIT: Stop using the starting_context pattern (#7610 )	2023-03-28 11:40:48 -07:00
Jimmy Miller	59c3fac6c4	YJIT: Rest and block_arg support (#7584 )	2023-03-24 17:01:59 -04:00
Alan Wu	35e9b5348d	YJIT: Constify EC to avoid an `as` pointer cast (#7591 )	2023-03-24 12:36:06 -04:00
Takashi Kokubun	b4e438d8aa	YJIT: Save PC on rb_str_concat (#7586 ) [Bug #19483] Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>	2023-03-23 13:42:49 -07:00
Jimmy Miller	8286544dc5	YJIT: Use starting context for status === CantCompile (#7583 )	2023-03-23 13:11:46 -04:00
Ole Friis Østergaard	e950781880	Use shape information in YJIT's definedivar implementation (#7579 ) * Use shape information in YJIT's definedivar implementation * Handle complex shape for definedivar	2023-03-23 11:04:30 -04:00
Koichi Sasada	c9fd81b860	`vm_call_single_noarg_inline_builtin` If the iseq contains only the `opt_invokebuiltin_delegate_leave` insn and the builtin-function (bf) is inline-able, the caller doesn't need to build a method frame. `vm_call_single_noarg_inline_builtin` is a fast path for such cases.	2023-03-23 14:03:12 +09:00
Alan Wu	aa54082d70	YJIT: Fix large ISeq rejection (#7576 ) We crashed in some edge cases due to the recent change to not compile encoded iseqs that are larger than `u16::MAX`. - Match the C signature of rb_yjit_constant_ic_update() and clamp down to `IseqIdx` size - Return failure instead of panicking with `unwrap()` in codegen when the iseq is too large Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Noah Gibbs <noah.gibbs@shopify.com>	2023-03-21 14:24:17 -04:00
Jimmy Miller	5de26bc031	YJIT: Fix incorrect exit in splat (#7575 ) So by itself, this shouldn't have been a correctness issue, but we also pop the stack for block_args. Doing stack manipulation like that and then side-exiting causes issues. So, while this fixes the immediate failure, we have a bigger issue with block_args popping and then exiting that we need to deal with.	2023-03-21 12:57:26 -04:00
Peter Zhu	30e7561d1d	Revert "YJIT: Rest and block_arg support (#7557 )" This reverts commit `5d0a1ffafa`. This commit is causing sequel in yjit-bench to raise with this stack trace: ``` sequel-5.64.0/lib/sequel/dataset/sql.rb:266:in `literal': wrong argument type Array (expected Proc) (TypeError) from sequel-5.64.0/lib/sequel/database/misc.rb:269:in `literal' from sequel-5.64.0/lib/sequel/adapters/shared/sqlite.rb:314:in `column_definition_default_sql' from sequel-5.64.0/lib/sequel/database/schema_methods.rb:564:in `block in column_definition_sql' from sequel-5.64.0/lib/sequel/database/schema_methods.rb:564:in `each' from sequel-5.64.0/lib/sequel/database/schema_methods.rb:564:in `column_definition_sql' from sequel-5.64.0/lib/sequel/database/schema_methods.rb:634:in `block in column_list_sql' from sequel-5.64.0/lib/sequel/database/schema_methods.rb:634:in `map' from sequel-5.64.0/lib/sequel/database/schema_methods.rb:634:in `column_list_sql' from sequel-5.64.0/lib/sequel/database/schema_methods.rb:753:in `create_table_sql' from sequel-5.64.0/lib/sequel/adapters/shared/sqlite.rb:348:in `create_table_sql' from sequel-5.64.0/lib/sequel/database/schema_methods.rb:702:in `create_table_from_generator' from sequel-5.64.0/lib/sequel/database/schema_methods.rb:203:in `create_table' from benchmarks/sequel/benchmark.rb:19:in `<main>' ```	2023-03-21 10:51:35 -04:00
Takashi Kokubun	51834ff2ec	YJIT: Make dev_nodebug closer to dev (#7570 )	2023-03-20 13:03:22 -07:00
Maxime Chevalier-Boisvert	44f444478a	YJIT: tag output type as UnknownHeap in `toregexp` (#7562 )	2023-03-20 10:16:22 -04:00
Alan Wu	b9f411b3a8	YJIT: Simplify using the BITS associated constant All the integer types have it.	2023-03-17 17:32:06 -04:00
Maxime Chevalier-Boisvert	6ba07df490	YJIT: make type info more specific in gen_fixnum_cmp and gen_opt_mod (#7555 )	2023-03-17 16:16:34 -04:00
Alan Wu	7fc796f92a	YJIT: Delete --yjit-global-constant-state (#7559 ) It was useful for evaluating `6068da8937` but I think we should remove it now to make the logic around invalidation more straightforward.	2023-03-17 16:16:17 -04:00
Alan Wu	2a26a5e677	YJIT: Add and use Branch::assert_layout() This assert would've caught a bug I wrote while developing ruby/ruby#7443 so I figured it would be good to commit it as it could be helpful in the future.	2023-03-17 16:15:58 -04:00
Jimmy Miller	5d0a1ffafa	YJIT: Rest and block_arg support (#7557 ) * YJIT: Rest and block_arg support * Update bootstraptest/test_yjit.rb --------- Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-03-17 16:11:30 -04:00
Takashi Kokubun	9fd94d6a0c	YJIT: Support entry for multiple PCs per ISEQ (GH-7535)	2023-03-17 11:53:17 -07:00
Alan Wu	10e4fa3a0f	YJIT: Use raw pointers and shared references over `Rc<RefCell<_>>` `Rc` and `RefCell` both incur runtime space costs. In addition, `RefCell` has given us some headaches with the non-obvious borrow panics it likes to throw out. The latest one started with `7fd53eeb46` and is yet to be resolved. Since we already rely on the GC to properly reclaim memory for `Block` and `Branch`, we might as well stop paying the overhead of `Rc` and `RefCell`. The `RefCell` panics go away with this change, too. On 25 iterations of `railsbench` with a stats build I got `yjit_alloc_size: 8,386,129 => 7,348,637`, with the new memory size 87.6% of the status quo. This makes the metadata and machine code size roughly line up one-to-one. The general idea here is to use `&` shared references with [interior mutability][1] with `Cell`, which doesn't take any extra space. The `noalias` requirement that `&mut` imposes is way too hard to meet and verify. Imagine replacing places where we would've gotten `BorrowError` from `RefCell` with Rust/LLVM miscompiling us due to aliasing violations. With shared references, we don't have to think about subtle cases like the GC _sometimes_ calling the mark callback while codegen has an aliasing reference in a stack frame below. We mostly only need to worry about liveness, with which the GC already helps. There is now a clean split between blocks and branches that are not yet fully constructed and ones that are "in-service", so to speak. Working with `PendingBranch` and `JITState` doesn't really involve `unsafe` stuff. This change allows `Branch` and `Block` to not have as many optional fields as many of them are only optional during compilation. Fields that change post-compilation are wrapped in `Cell` to facilitate mutation through shared references. I do some `unsafe` dances here. I've included just a couple tests to run with Miri (`cargo +nightly miri test miri`). We can add more Miri tests if desired. 
[1]: https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html	2023-03-17 09:30:24 -07:00
Jimmy Miller	5825d7d4a1	YJIT: Remove exit for rest and send combo (#7546 )	2023-03-16 17:40:36 -04:00
Maxime Chevalier-Boisvert	473009d7cb	YJIT: add stats to keep track of when branch direction is known (#7544 ) This measures the impact of changes made by @jhawthorn last year.	2023-03-16 17:24:08 -04:00
Takashi Kokubun	6183180603	YJIT: Eliminate unnecessary mov for trampolines (#7537 )	2023-03-15 16:27:36 -07:00
Alan Wu	ca10274fe3	YJIT: Use assert_disasm! in an A64 test to avoid unused warning I kept getting unused warnings for this macro on A64 macOS.	2023-03-15 19:07:49 -04:00
Maxime Chevalier-Boisvert	9a735c776b	YJIT: use u16 for insn_idx instead of u32 (#7534 )	2023-03-15 17:55:29 -04:00
Alan Wu	de174681f7	YJIT: Assert that we have the VM lock while marking Somewhat important because having the lock is a key part of the soundness reasoning for the `unsafe` usage here.	2023-03-15 15:45:20 -04:00
Aaron Patterson	77c8daa2d4	Make EC required on JIT state (#7520 ) * Make EC required on JIT state Let's make EC required on the JITState object so we don't need to `unwrap` it. * Minor nitpicks --------- Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>	2023-03-15 10:55:07 -04:00
Takashi Kokubun	70ba310212	YJIT: Introduce no_gc attribute (#7511 )	2023-03-14 15:38:58 -07:00
Takashi Kokubun	9a43c63d43	YJIT: Implement throw instruction (#7491 ) * Break up jit_exec from vm_sendish * YJIT: Implement throw instruction * YJIT: Explain what rb_vm_throw does [ci skip]	2023-03-14 13:39:06 -07:00
Takashi Kokubun	76f2031884	YJIT: Allow testing assembler with disasm (#7470 ) * YJIT: Allow testing assembler with disasm * YJIT: Drop new dependencies * YJIT: Avoid address manipulation * YJIT: Introduce assert_disasm! macro * YJIT: Update the comment about assert_disasm	2023-03-14 13:26:05 -04:00
Takashi Kokubun	c7822b8dbb	YJIT: Merge add/sub/and/or/xor and mov on x86_64 (#7492 )	2023-03-13 16:32:45 -04:00
Jimmy Miller	45127c84d9	YJIT: Handle rest+splat where non-splat < required (#7499 )	2023-03-13 11:12:23 -04:00
Takashi Kokubun	83f6eee76c	YJIT: Bump SEND_MAX_DEPTH to 20 (#7469 ) * YJIT: Bump SEND_MAX_DEPTH to 20 * Fix a test failure	2023-03-10 17:14:38 -05:00
Maxime Chevalier-Boisvert	65a95b8259	YJIT: upgrade type in `guard_object_is_string` (#7489 ) * YJIT: upgrade type in guard_object_is_string Also make logic more in line with other guard_xxx methods * Update yjit/src/core.rs * Revert changes to Type::upgrade()	2023-03-09 18:26:21 -05:00
Takashi Kokubun	487142928a	YJIT: Merge x86_merge into x86_split (#7487 )	2023-03-09 11:58:40 -08:00
Takashi Kokubun	74f44dae96	Another fix for `262254dc7d` Koichi might want to adjust his editor configuration.	2023-03-09 09:29:08 -08:00
Takashi Kokubun	3938b79ef9	Revert an unneeded diff in `262254dc7d`	2023-03-09 09:15:21 -08:00
Koichi Sasada	262254dc7d	rename `defined_ivar` to `definedivar` because non-opt instructions should contain `_` char.	2023-03-10 00:37:11 +09:00
Takashi Kokubun	22d8e95ffe	YJIT: Optimize `cmp REG, 0` into `test REG, REG` (#7471 )	2023-03-09 10:19:59 -05:00
Ole Friis Østergaard	4667a3a665	Add defined_ivar as YJIT instruction as well This works much like the existing `defined` implementation, but calls out to rb_ivar_defined instead of the more general rb_vm_defined. The other difference from the existing `defined` implementation is that this new instruction has to take the same operands as the CRuby `defined_ivar` instruction.	2023-03-08 09:34:31 -08:00
Takashi Kokubun	e93e780f3d	Remove MJIT's builtin function compiler	2023-03-07 23:16:24 -08:00
Takashi Kokubun	3e731cd945	YJIT: Add comments to peek and x86_merge	2023-03-07 14:59:37 -08:00
Takashi Kokubun	7f557d02c3	YJIT: Merge lea and mov on x86_64 when possible	2023-03-07 14:59:37 -08:00
Jimmy Miller	56df6d5f9d	YJIT: Handle splat+rest for args pass greater than required (#7468 ) For example: ```ruby def my_func(x, y, *rest) p [x, y, rest] end my_func(1, 2, 3, *[4, 5]) ```	2023-03-07 17:03:43 -05:00
Takashi Kokubun	33edcc1120	YJIT: Protect strings from GC on String#<< (#7466 ) Fix https://github.com/Shopify/yjit/issues/310 [Bug #19483] Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Jimmy Miller <jimmy.miller@shopify.com>	2023-03-07 16:10:07 -05:00
Jimmy Miller	719a7726d1	YJIT: Handle special case of splat and rest lining up (#7422 ) If you have a method that takes rest arguments and a splat call that happens to line up perfectly with that rest, you can just dupe the array rather than move anything around. We still have to dupe, because people could have a custom to_a method or something like that which means it is hard to guarantee we have exclusive access to that array. Example: ```ruby def foo(a, b, *rest) end foo(1, 2, *[3, 4]) ```	2023-03-07 12:29:59 -05:00
Takashi Kokubun	a6de8b0d2d	YJIT: Bump SEND_MAX_DEPTH to 10 (#7452 )	2023-03-07 10:21:22 -05:00
Maxime Chevalier-Boisvert	4d59d01621	YJIT: fix CI issue reported by Koichi caused by small stack patch (#7442 ) Includes small reproduction produced by Kokubun. http://ci.rvm.jp/results/trunk-yjit@ruby-sp2-docker	2023-03-03 15:02:52 -08:00
Takashi Kokubun	8c8548b175	YJIT: Fix a cargo test warning on x86_64 (#7428 )	2023-03-03 15:48:14 -05:00
Maxime Chevalier-Boisvert	7b9aeaffcb	YJIT: shrink stack_size/sp_offset to u8/i8 (#7426 )	2023-03-02 17:30:31 -05:00
Alan Wu	34026afd04	YJIT: Delete stale `frozen_bytes` related code (#7423 ) The code and comments in there have been disabled by comments for a long time. The issues that the counter used to solve are now solved more comprehensively by "runningness" [tracking][1] introduced by Code GC and [delayed deallocation][2]. Having a single counter doesn't fit our current model where code pages that could be touched or not are interleaved, anyway. Just delete the code. [1]: `e7c71c6c92` [2]: `a0b0365e90`	2023-03-02 16:21:05 -05:00
Jimmy Miller	ce476cdfb7	YJIT: Fix cfunc splat Follow-up for `cb8a040b79`.	2023-03-02 10:57:19 -05:00
Jimmy Miller	cb8a040b79	YJIT: Properly deal with cfunc splat when no args needed (#7413 ) Related to: https://github.com/ruby/ruby/pull/7377 Previously it was believed that there was a problem with a combination of cfuncs + splat + send, but it turns out the same issue happened without send. For example `Integer.sqrt(1, *[])`. The issue happened not because of send, but because of setting the wrong argc when we don't need to splat any args.	2023-03-01 16:33:16 -05:00
Maxime Chevalier-Boisvert	27c2572dbd	YJIT: reject large stacks so we can use i8/u8 stack_size and stack_offset (#7412 ) * Reject large stacks so we can use i8/u8 stack_size and stack_offset * Add rejection test for iseq too long as well	2023-03-01 15:09:25 -05:00
Takashi Kokubun	5e607cfa4c	YJIT: Use a boxed slice for outgoing branches and cme dependencies (#7409 )	2023-03-01 12:15:36 -05:00
Takashi Kokubun	966adfb799	YJIT: Compress BranchGenFn and BranchShape (#7401 ) * YJIT: Compress BranchGenFn and BranchShape * YJIT: Derive Debug for Branch * YJIT: Capitalize BranchGenFn names Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>	2023-02-28 10:04:28 -08:00
Takashi Kokubun	67ad831b5f	YJIT: Use a boxed slice for gc_obj_offsets (#7397 ) * YJIT: Use a boxed slice for gc_obj_offsets * YJIT: Stop using Option * YJIT: s/add_counter/incr_counter_by/ Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>	2023-02-28 10:03:24 -08:00
Matt Valentine-House	3766cbce13	Update Rust bindgen	2023-02-28 09:09:00 -08:00
Maxime Chevalier-Boisvert	de66b60f33	YJIT: add defer_empty_count stat Count how often we defer from a block that is empty	2023-02-28 11:57:41 -05:00
Matt Valentine-House	98e4bdf3e7	Update YJIT-bindgen	2023-02-27 10:11:56 -08:00
Matt Valentine-House	ae5e62ee90	Merge internal/intern/gc.h into internal/gc.h	2023-02-27 10:11:56 -08:00
Alan Wu	0eb634ae73	YJIT: Detect and reject `send(:alias_for_send, :foo)` Previously, YJIT failed to put the stack into the correct shape when `BasicObject#send` calls an alias method for the send method itself. This can manifest as strange `NoMethodError`s in the final non-send receiver, as [seen][1] with the kt-paperclip gem. I also found a case where it makes YJIT fail the stack size assertion while compiling `leave`. YJIT's `BasicObject#__send__` implementation already rejects sends to `send`, but didn't detect sends to aliases of `send`. Adjust the detection and reject these cases. Fixes [Bug #19464] [1]: https://github.com/Shopify/yjit/issues/306	2023-02-27 11:12:22 -05:00
Alan Wu	55a24f9b08	YJIT: Reject __send__ with splat to cfunc for now `make test-spec` revealed this issue after applying an unrelated bug fix. A crashing case is included, though I suspect there are other scenarios where it misbehaves. Don't compile for now. Note that this is not an issue on the 3.2.x series; it has `send_args_splat_non_iseq` which already rejects all splats to cfuncs, including sends with splats.	2023-02-27 11:12:22 -05:00
Alan Wu	132934b82b	YJIT: Generate Block::entry_exit with block entry PC Previously, when Block::entry_exit is requested from any instruction that is not the first one in the block, we generated the exit with an incorrect PC. We should always be using the PC for the entry of the block for Block::entry_exit. It was a simple typo. The bug was [introduced][1] while we were refactoring to use the current backend. Later, we had a chance to spot this issue while [preparing][2] to enable unused variable warnings, but didn't spot the issue. Fixes [Bug #19463] [1]: `27fcab995e` [2]: `31461c7e0e`	2023-02-24 16:18:53 -05:00
Peter Zhu	3e09822407	Fix incorrect line numbers in GC hook If the previous instruction is not a leaf instruction, then the PC was incremented before the instruction was run (meaning the currently executing instruction is actually the previous instruction), so we should not increment the PC; otherwise we will calculate the source line for the next instruction. This bug can be reproduced in the following script: ``` require "objspace" ObjectSpace.trace_object_allocations_start a = 1.0 / 0.0 p [ObjectSpace.allocation_sourceline(a), ObjectSpace.allocation_sourcefile(a)] ``` Which outputs: [4, "test.rb"] This is incorrect because the object was allocated on line 10 and not line 4. The behaviour is correct when we use a leaf instruction (e.g. if we replaced `1.0 / 0.0` with `"hello"`), then the output is: [10, "test.rb"]. [Bug #19456]	2023-02-24 14:10:09 -05:00
Takashi Kokubun	f471f46184	YJIT: Use enum for expressing type diff (#7370 )	2023-02-24 09:03:59 -05:00
Takashi Kokubun	d8d152e681	YJIT: Compress TempMapping (#7368 )	2023-02-24 09:01:53 -05:00
Takashi Kokubun	b9f9440e95	YJIT: Trivial fixes in codegen.rs	2023-02-23 10:08:26 -08:00
Takashi Kokubun	5444dde738	YJIT: Skip type checks on splat args and expandarray if possible (#7363 )	2023-02-23 10:03:34 -08:00
Alan Wu	c3cd191092	YJIT: Add `make yjit-smoke-test` [ci skip] I have this as a shell command and Maxime told me that she finds it useful, too. I tested this on a release build and a dev build. Note I intentionally didn't put `$(Q)` in front of everything so `make` echoes the command it runs.	2023-02-23 12:12:57 -05:00
Takashi Kokubun	e9e4e1cb46	YJIT: Introduce Opnd::Stack (#7352 )	2023-02-22 16:22:41 -05:00
eileencodes	ae9e1aee59	Call rb_ivar_set instead of exiting for many ivars Previously, when we have a lot of ivars defined, we would exit via `jit_chain_guard` for megamorphic ivars. Now if we have more than the max depth of ivars we can call `rb_ivar_set` instead of exiting. Using the following script: ```ruby class A def initialize @a = 1 end def a @a end end N = 30 N.times do |i| eval <<-eorb class A#{i} < A def initialize @a#{i} = 1 super end end eorb end klasses = N.times.map { Object.const_get(:"A#{_1}") } 1000.times do klasses.each do |k| k.new.a end end ``` Exits before this change show exits for `setinstancevariable`: ``` *YJIT: Printing YJIT statistics on exit* method call exit reasons: klass_megamorphic: 24,975 (100.0%) invokeblock exit reasons: (all relevant counters are zero) invokesuper exit reasons: (all relevant counters are zero) leave exit reasons: interp_return: 26,948 (100.0%) se_interrupt: 1 ( 0.0%) getblockparamproxy exit reasons: (all relevant counters are zero) getinstancevariable exit reasons: megamorphic: 13,986 (100.0%) setinstancevariable exit reasons: megamorphic: 19,980 (100.0%) opt_aref exit reasons: (all relevant counters are zero) expandarray exit reasons: (all relevant counters are zero) opt_getinlinecache exit reasons: (all relevant counters are zero) invalidation reasons: (all relevant counters are zero) num_send: 155,823 num_send_known_class: 0 ( 0.0%) num_send_polymorphic: 119,880 (76.9%) bindings_allocations: 0 bindings_set: 0 compiled_iseq_count: 36 compiled_block_count: 158 compiled_branch_count: 240 block_next_count: 10 defer_count: 70 freed_iseq_count: 0 invalidation_count: 0 constant_state_bumps: 0 inline_code_size: 29,216 outlined_code_size: 27,948 freed_code_size: 0 code_region_size: 65,536 live_context_size: 8,322 live_context_count: 219 live_page_count: 4 freed_page_count: 0 code_gc_count: 0 num_gc_obj_refs: 130 object_shape_count: 295 side_exit_count: 58,942 total_exit_count: 85,890 yjit_insns_count: 1,023,581 
avg_len_in_yjit: 11.2 Top-4 most frequent exit ops (100.0% of exits): opt_send_without_block: 24,975 (42.4%) setinstancevariable: 19,980 (33.9%) getinstancevariable: 13,986 (23.7%) leave: 1 ( 0.0%) ``` Exits after this change show we have no exits for `setinstancevariable`. ``` *YJIT: Printing YJIT statistics on exit* method call exit reasons: klass_megamorphic: 24,975 (100.0%) invokeblock exit reasons: (all relevant counters are zero) invokesuper exit reasons: (all relevant counters are zero) leave exit reasons: interp_return: 60,912 (100.0%) se_interrupt: 3 ( 0.0%) getblockparamproxy exit reasons: (all relevant counters are zero) getinstancevariable exit reasons: (all relevant counters are zero) setinstancevariable exit reasons: (all relevant counters are zero) opt_aref exit reasons: (all relevant counters are zero) expandarray exit reasons: (all relevant counters are zero) opt_getinlinecache exit reasons: (all relevant counters are zero) invalidation reasons: (all relevant counters are zero) num_send: 155,823 num_send_known_class: 0 ( 0.0%) num_send_polymorphic: 119,880 (76.9%) bindings_allocations: 0 bindings_set: 0 compiled_iseq_count: 36 compiled_block_count: 179 compiled_branch_count: 240 block_next_count: 11 defer_count: 70 freed_iseq_count: 0 invalidation_count: 0 constant_state_bumps: 0 inline_code_size: 31,032 outlined_code_size: 29,708 freed_code_size: 0 code_region_size: 65,536 live_context_size: 8,360 live_context_count: 220 live_page_count: 4 freed_page_count: 0 code_gc_count: 0 num_gc_obj_refs: 130 object_shape_count: 295 side_exit_count: 24,978 total_exit_count: 85,890 yjit_insns_count: 1,076,966 avg_len_in_yjit: 12.2 Top-2 most frequent exit ops (100.0% of exits): opt_send_without_block: 24,975 (100.0%) leave: 3 ( 0.0%) ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2023-02-21 13:59:54 -08:00
Alan Wu	9f8056a7dd	YJIT: Fastpath for Module#=== (#7351 ) Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Jimmy Miller <jimmy.miller@shopify.com>	2023-02-21 16:41:23 -05:00
Takashi Kokubun	0353277b20	YJIT: Avoid checking symbol ID twice on send (#7350 )	2023-02-21 16:10:10 -05:00
Jimmy Miller	5baef07506	YJIT: Fix clippy issues and remove unused params (#7348 ) * YJIT: Fix clippy issues and remove unused params * Remove an unnecessary whitespace --------- Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2023-02-21 13:21:25 -05:00
Takashi Kokubun	ecd0cdaf82	YJIT: Fix assertion for partially mapped last pages (#7337 ) Follows up [Bug #19400]	2023-02-20 09:06:09 -08:00
Jimmy Miller	c024cc05ef	YJIT: Consolidate jit methods in JITState impl (#7336 ) These jit_* methods don't jit code, but instead check things on the JITState. We had other methods that did the same thing that were just added on the impl JITState. For consistency I added these methods there.	2023-02-17 16:40:01 -05:00
Takashi Kokubun	034d5ee43c	YJIT: Use rb_ivar_get at the end of ivar chains (#7334 ) * YJIT: Use rb_ivar_get at the end of ivar chains * Rename the counter to get_ivar_max_depth	2023-02-17 12:44:39 -08:00
Maxime Chevalier-Boisvert	c3bae033eb	Add asm comment to YJIT's rb_str_empty_p	2023-02-17 13:10:16 -05:00
Alan Wu	a4b7ec1229	YJIT: Fix false assumption that String#+@ => ::String Could return a subclass. [Bug #19444]	2023-02-16 18:50:42 -05:00
Alan Wu	c178926fbe	YJIT: jit_prepare_routine_call() for String#+@ missing We saw SEGVs due to this when running with StackProf, which needs a correct PC for RUBY_INTERNAL_EVENT_NEWOBJ, the same event used for ObjectSpace allocation tracing. [Bug #19444]	2023-02-16 18:50:42 -05:00
Takashi Kokubun	21f9c92c71	YJIT: Show Context stats on exit (#7327 )	2023-02-16 11:32:13 -08:00
Takashi Kokubun	8f22dc39f3	YJIT: Refactor getlocal and setlocal insns (#7320 )	2023-02-16 11:29:45 -05:00
Jimmy Miller	b4c38f3c59	YJIT: Initial support for rest args (#7311 ) * YJIT: Initial support for rest args * Update yjit/src/codegen.rs --------- Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-02-16 11:25:48 -05:00
Jean Boussier	1a4b4cd7f8	Move `attached_object` into `rb_classext_struct` Given that singleton classes don't have an allocator, we can re-use these bytes to store the attached object in `rb_classext_struct` without making it larger.	2023-02-16 08:14:44 +01:00
Jimmy Miller	8943b0d411	YJIT: `Kernel#{is_a?,instance_of?}` fast paths (GH-7297) Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>	2023-02-15 14:05:42 -05:00
Jean Boussier	7413079dae	Encapsulate RCLASS_ATTACHED_OBJECT Right now the attached object is stored as an instance variable and all the call sites that either get or set it have to know how it's stored. It's preferable to hide this implementation detail behind accessors so that it is easier to change how it's stored.	2023-02-15 15:24:22 +01:00
Takashi Kokubun	15ef2b2d7c	YJIT: Optimize != for Integers and Strings (#7301 )	2023-02-14 16:31:33 -05:00
Takashi Kokubun	6c5582815d	YJIT: Check correct BOP on gen_fixnum_cmp (#7303 )	2023-02-14 12:54:50 -08:00
Takashi Kokubun	55af69b15e	YJIT: Don't side-exit on too-complex shapes (#7298 )	2023-02-14 12:12:48 -05:00
Takashi Kokubun	dbe5b0dcff	YJIT: Fix a typo in a counter name I added `invokeblock_iseq_arg0_args_splat` counter but it wasn't used because of a typo. Related to https://github.com/ruby/ruby/pull/7234	2023-02-13 16:58:46 -08:00
Maxime Chevalier-Boisvert	a7e8eabeed	YJIT: add counters for polymorphic send and send with known class (#7288 )	2023-02-10 16:05:16 -05:00
Koichi Sasada	be94808282	use correct svar even if env is escaped This patch is a follow-up of `0a82bfe`. Without this patch, if env is escaped (Proc'ed), strange svar can be touched. This patch tracks escaped env and uses it.	2023-02-10 17:55:25 +09:00
Maxime Chevalier-Boisvert	810aeb2d91	YJIT: optimized codegen for `rb_ary_empty_p` (WIP) (#7242 ) * YJIT: add specialized implementation of rb_ary_empty_p() * Update yjit/src/codegen.rs Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> --------- Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2023-02-09 15:14:26 -05:00
Maple Ong	381bdee118	YJIT: Support invokesuper in a block (#7264 ) Support invokesuper in a block on YJIT invokesuper previously side exited when it is in a block. To make sure we're compiling the correct method in super, we now use the local environment pointer (LEP) to get the method, which will work in a block. Co-authored-by: John Hawthorn <john@hawthorn.email>	2023-02-09 10:41:29 -05:00
Takashi Kokubun	0601ba6a1b	YJIT: Add counter for megamorphic send (#7274 )	2023-02-09 10:38:31 -05:00
Alan Wu	b78f871d83	YJIT: Use the system page size when the code page size is too small (#7267 ) Previously on ARM64 Linux systems that use 64 KiB pages (`CONFIG_ARM64_64K_PAGES=y`), YJIT was panicking on boot due to a failed assertion. The assertion was making sure that code GC can free the last code page that YJIT manages without freeing unrelated memory. YJIT prefers picking 16 KiB as the granularity at which to free code memory, but when the system can only free at 64 KiB granularity, that is not possible. The fix is to use the system page size as the code page size when the system page size is 64 KiB. Continue to use 16 KiB as the code page size on common systems that use 16/4 KiB pages. Add asserts to code_gc() and free_page() about code GC's assumptions. Fixes [Bug #19400]	2023-02-09 10:34:19 -05:00
Matt Valentine-House	72aba64fff	Merge gc.h and internal/gc.h [Feature #19425]	2023-02-09 10:32:29 -05:00
Takashi Kokubun	e2b6289bab	YJIT: Add counters for ivar exits (#7266 )	2023-02-09 10:16:17 -05:00
Takashi Kokubun	c30602e64c	YJIT: Support arg0 splat on invokeblock (#7234 )	2023-02-06 16:12:20 -05:00
Takashi Kokubun	21dcf5d766	YJIT: Check interrupts on frame pop (#7248 ) YJIT: Skip gen_check_ints on ISEQ send On the interpreter, vm_push_frame doesn't check interrupts. Only vm_pop_frame does.	2023-02-06 10:29:41 -05:00
Alan Wu	f901b934fd	YJIT: Make Block::start_addr non-optional We set the block address as soon as we make the block, so there is no point in making it `Option<CodePtr>`. No memory saving, unfortunately, as `mem::size_of::<Block>() = 176` before and after this change. Still a simplification for the logic, though.	2023-02-03 14:58:01 -05:00
Takashi Kokubun	08c529be90	YJIT: Support ifunc on invokeblock (#7233 )	2023-02-03 10:14:42 -05:00
Maxime Chevalier-Boisvert	73674cac2b	YJIT: log the names of methods we call to in disasm (#7231 ) * YJIT: log the names of methods we call to in disasm * Assert that pointer is not null * Handle case where UTF8 conversion not possible	2023-02-02 16:54:16 -05:00
Alan Wu	92ac5f686b	Fix typos in YJIT [ci skip]	2023-02-02 16:16:45 -05:00
Alan Wu	3b83b265f1	YJIT: Crash with rb_bug() when panicking Helps with getting good bug reports in the wild. Intended to be backported to the 3.2.x series.	2023-02-02 15:16:09 -05:00
Alan Wu	188688a53e	YJIT: ARM64: Fix long jumps to labels Previously, with Code GC, YJIT panicked while trying to emit a B.cond instruction with an offset that is not encodable in 19 bits. This only happens when the code in an assembler instance straddles two pages. To fix this, when we detect that a jump to a label can land on a different page, we switch to a fresh new page and regenerate all the code in the assembler there. We still assume that no one assembler has so much code that it wouldn't fit inside a fresh new page. [Bug #19385] Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>	2023-02-02 10:05:00 -05:00
Alan Wu	905e12a30d	YJIT: ARM64: Move functions out of arm64_emit()	2023-02-02 10:05:00 -05:00
Alan Wu	a690db390d	YJIT: other_cb is None in tests Since the other cb is in CodegenGlobals, and we want Rust tests to be self-contained.	2023-02-02 10:05:00 -05:00
Alan Wu	81b7f86f47	YJIT: Move CodegenGlobals::freed_pages into an Rc This allows for supplying a freed_pages vec in Rust tests. We need it so we can test scenarios that occur after code GC.	2023-02-02 10:05:00 -05:00
Koichi Sasada	0a82bfe5e1	use correct svar (#7225 ) * use correct svar Without this patch, the svar location used is that of the "nearest Ruby frame". That is usually correct, but not when the `each` method is written in Ruby. ```ruby class C include Enumerable def each %w(bar baz).each{|e| yield e} end end C.new.grep(/(b.)/){|e| p [$1, e]} ``` This patch fixes this issue by traversing ifunc's cfp. Note that if cfp doesn't specify this Thread's cfp stack, the reserved svar location (`ec->root_svar`) is used. * make yjit-bindgen --------- Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2023-02-01 16:13:19 -08:00
Maxime Chevalier-Boisvert	2675f2c864	Remove whitespace	2023-02-01 16:05:22 -05:00
Jimmy Miller	1148fab7ae	YJIT: Handle splat with opt more fully (#7209 ) * YJIT: Handle splat with opt more fully * Update yjit/src/codegen.rs --------- Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-01-31 16:18:56 -05:00
Takashi Kokubun	e11067ebbf	YJIT: Fix BorrowMutError on BOP invalidation (#7212 )	2023-01-31 15:26:56 -05:00
Alan Wu	eac5ae22e2	YJIT: Group unimplemented method types together Grouping these together helps with finding all of the unimplemented method types. Previously they were interleaved with other match arms, long and short.	2023-01-31 14:29:18 -05:00
Takashi Kokubun	2a0bf269c9	YJIT: Implement codegen for Kernel#block_given? (#7202 )	2023-01-31 10:11:10 -05:00
Nobuyoshi Nakada	be81495c16	Silence dozens of useless warnings from `nm` on macOS	2023-01-31 19:42:01 +09:00
Jimmy Miller	07d1b3ddc3	YJIT: Add splat optimized_send (#7167 )	2023-01-30 15:54:09 -05:00
Jimmy Miller	b32e1169c9	YJIT: Initial implementation of splat with optional params (#7166 )	2023-01-30 15:51:55 -05:00
Takashi Kokubun	2e0f3b5546	YJIT: Fix BorrowMutError on GC.compact (#7176 ) YJIT: Fix BorrowMutError	2023-01-30 11:16:33 -08:00
Takashi Kokubun	bc0dc9d40e	YJIT: Skip defer_compilation for fixnums if possible (#7168 ) * YJIT: Skip defer_compilation for fixnums if possible * YJIT: It should be Some(false) * YJIT: Define two_fixnums_on_stack on Context	2023-01-30 13:55:00 -05:00
Alan Wu	e1ffafb285	YJIT: Inline return address callback (#7198 ) This makes it so that the generator and the output code read in the same order. I think it reads better this way.	2023-01-30 12:50:08 -05:00
Alan Wu	7d4395cb69	YJIT: Fix shared/static library symbol leaks Rust 1.58.0 unfortunately doesn't provide facilities to control symbol visibility/presence, but we care about controlling the list of symbols exported from libruby-static.a and libruby.so. This commit uses `ld -r` to make a single object out of rustc's staticlib output, libyjit.a. This moves libyjit.a out of MAINLIBS and adds libyjit.o into COMMONOBJS, which obviates the code for merging libyjit.a into libruby-static.a. The odd appearance of libyjit.a in SOLIBS is also gone. To filter out symbols we do not want to export on ELF platforms, we use objcopy after the partial link. On darwin, we supply a symbol list to the linker which takes care of hiding unprefixed symbols. [Bug #19255] Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>	2023-01-27 12:28:09 -05:00
Takashi Kokubun	887d21613c	YJIT: Avoid BorrowError on GC.compact (#7164 )	2023-01-20 13:07:03 -08:00
Jimmy Miller	36fa4f13ca	YJIT: get rid of unneeded `.into()`	2023-01-20 10:57:41 -05:00
Jimmy Miller	bf3940a306	YJIT: Refactor side_exits	2023-01-19 16:10:58 -05:00
Takashi Kokubun	5ce0c13f18	YJIT: Remove duplicated information in BranchTarget (#7151 ) Note: On the new code of yjit/src/core.rs:2178, we no longer leave the state `.block=None` but `.address=Some...`, which might be important. We assume it's actually not needed and take a risk here to minimize heap allocations, but in case it turns out to be necessary, we could signal/resurrect that state by introducing a new BranchTarget (or BranchShape) variant dedicated to it.	2023-01-19 12:02:25 -08:00
Jimmy Miller	762a3d80f7	Implement splat for cfuncs. Split exit cases to better capture where we are exiting (#6929 ) YJIT: Implement splat for cfuncs. Split exit cases This also implements a new check for ruby2keywords as the last argument of a splat. This does mean that we generate more code, but in actual benchmarks where we gained speed from this (binarytrees) I don't see any significant slowdown. I did have to struggle here with the register allocator to find code that didn't allocate too many registers. It's a bit hard when everything is implicit. But I think I got to the minimal amount of copying and stuff given our current allocation strategy.	2023-01-19 13:42:49 -05:00
Alan Wu	4b42392f8e	YJIT: Use .as_side_exit() for jumps to counted exits Fewer cycles running nops when these jumps are not taken. Fixing all these so when they get copy pasted in the future we save on padding.	2023-01-18 20:52:19 -05:00
Maxime Chevalier-Boisvert	6bb576fe75	YJIT: implement codegen for `String#empty?` (#7148 ) YJIT: implement codegen for String#empty?	2023-01-18 15:41:28 -05:00
Maxime Chevalier-Boisvert	cd97976328	Add stats so we can keep track of x86 rel32 vs register calls (#7142 ) * Add stats so we can keep track of x86 rel32 vs register calls To know if we get that "prime real estate" as Alan put it. * Fix bug pointed by Alan	2023-01-18 11:08:55 -05:00
Alan Wu	14fe7a081a	YJIT: Use ThinLTO for Rust parts in release mode This reduces the code size of libyjit.a by a lot. On darwin it went from 23 MiB to 12 MiB for me. I chose ThinLTO over fat LTO for the relatively fast build time; in case we need to debug release-build-only problems it won't be painful.	2023-01-16 17:32:15 -05:00
Alan Wu	b4cdde468b	YJIT: Use SIZEOF_VALUE_I32 instead of `... as i32` Shorter, and easier to parse without parentheses.	2023-01-13 15:32:28 -05:00
Alan Wu	84b1f48891	YJIT: Factor out VALUE_BITS = (8 * SIZE_OF_VALUE as u8) Using a constant shows intention better and is less noisy. It always took me a second to parse the long expression.	2023-01-13 15:32:28 -05:00
Ian Ker-Seymer	8d3ff66389	Enable `clippy` checks for yjit in CI (#7093 ) * Add job to check clippy lints in CI * Address all remaining clippy lints * Check lints on arm64 as well * Apply latest clippy lints * Do not exit 0 on clippy warnings	2023-01-12 10:14:17 -05:00
Nobuyoshi Nakada	cc15963aa3	Strip trailing spaces [ci skip]	2023-01-12 09:29:56 +09:00
Takashi Kokubun	3642006872	YJIT: Add a few asm comments (#7105 ) * YJIT: Add a few asm comments * YJIT: Clarify exiting insns * YJIT: Fix cargo test	2023-01-11 11:12:15 -08:00
Aaron Patterson	5bf7218b01	Differentiate T_ARRAY and array subclasses (#7091 ) * Differentiate T_ARRAY and array subclasses This commit teaches the YJIT context the difference between Arrays (objects with type T_ARRAY and class rb_cArray) vs Array subclasses (objects with type T_ARRAY but _not_ class rb_cArray). It uses this information to reduce the number of guards emitted when using `jit_guard_known_klass` with rb_cArray, notably opt_aref * Update yjit/src/core.rs Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-01-10 13:54:07 -05:00
Alan Wu	aeddc19340	YJIT: Save PC and SP before calling leaf builtins (#7090 ) Previously, we did not update `cfp->sp` before calling the C function of ISEQs marked with `Primitive.attr! "inline"` (leaf builtins). This caused the GC to miss temporary values on the stack in case the function allocates and triggers a GC run. Right now, there are only a few leaf builtins in numeric.rb on Integer methods such as `Integer#~`. Since these methods only allocate when operating on big numbers, we missed this issue. Fix by saving PC and SP before calling the functions -- our usual protocol for calling C functions that may allocate on the GC heap. [Bug #19316]	2023-01-10 11:11:10 -05:00
Takashi Kokubun	6a585dbd5a	YJIT: Fix a compilation warning with release build (#7092 ) warning: unused variable: `start_addr` --> ../yjit/src/asm/mod.rs:359:39 | 359 | pub fn remove_comments(&mut self, start_addr: CodePtr, end_addr: CodePtr) { | ^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_start_addr` | = note: `#[warn(unused_variables)]` on by default warning: unused variable: `end_addr` --> ../yjit/src/asm/mod.rs:359:60 | 359 | pub fn remove_comments(&mut self, start_addr: CodePtr, end_addr: CodePtr) { |	2023-01-10 11:00:25 -05:00
Takashi Kokubun	a7fbdc35a2	YJIT: Remove old comments for regenerated branches (#7083 )	2023-01-09 11:29:41 -05:00
Takashi Kokubun	00d58afb5d	YJIT: Make iseq_get_location consistent with iseq.c (#7074 ) * YJIT: Make iseq_get_location consistent with iseq.c * YJIT: Call it "YJIT entry point" Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com> Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-01-06 11:49:59 -08:00
Takashi Kokubun	311ce91733	YJIT: Colorize outlined code differently on --yjit-dump-disasm (#7073 ) * YJIT: Colorize outlined code differently on --yjit-dump-disasm * YJIT: Reduce the number of escape sequences	2023-01-06 11:49:45 -08:00
Aaron Patterson	6c618cb789	Use a different name for megamorphic setivar exits We should differentiate between set and get for megamorphic exits. This patch fixes the megamorphic exit name in gen_setinstancevariable so that we can tell the difference between megamorphic get / set sites	2023-01-05 17:49:30 -08:00
Alan Wu	c240a18968	YJIT: Dump spill error to stderr [ci skip] Since the panic message is in stderr, better to use the same stream in case stdout and stderr are not synced due to IO redirection.	2023-01-03 16:33:47 -05:00
Alan Wu	43ff0c2c48	YJIT: Fix `yield` into block with >=30 locals on ARM It's a register spill issue. Fix by moving the Qnil fill snippet to after registers are released. [Bug #19299]	2023-01-03 16:17:50 -05:00
Takashi Kokubun	1d3bfd804c	MJIT: Export fewer shape functions (#7007 )	2022-12-23 10:18:57 -08:00
John Hawthorn	fbaa5db44a	Use a BOP for Hash#default On a hash miss we need to call default if it is redefined in order to return the default value to be used. Previously we checked this with rb_method_basic_definition_p, which avoids the method call but requires a method lookup. This commit replaces the previous check with BASIC_OP_UNREDEFINED_P and a new BOP_DEFAULT. We still need to fall back to rb_method_basic_definition_p when called on a subclass of hash. | |compare-ruby|built-ruby| |:---------------|-----------:|---------:| |hash_aref_miss | 2.692| 3.531| | | -| 1.31x| Co-authored-by: Daniel Colson <danieljamescolson@gmail.com> Co-authored-by: "Ian C. Anderson" <ian@iancanderson.com> Co-authored-by: Jack McCracken <me@jackmc.xyz>	2022-12-17 14:51:49 -08:00
Alan Wu	14158f1f8c	YJIT: Fix `obj.send(:call)` All the method call types need to handle argument shifting in case they're called by `.send`, and we weren't handling that in `OPTIMIZED_METHOD_TYPE_CALL`. Lack of shifting caused the stack size assertion in gen_leave() to fail. Discovered by Rails CI: https://buildkite.com/rails/rails/builds/91705#018516c4-f8f8-469e-bc2d-ddeb25ca8317/1920-2067 Diagnosed with help from `@eileencodes` and `@k0kubun`.	2022-12-15 18:10:28 -05:00
Peter Zhu	c505448cdb	Move definition of SIZE_POOL_COUNT back to gc.h SIZE_POOL_COUNT is a GC macro, it should belong in gc.h and not shape.h. SIZE_POOL_COUNT doesn't depend on shape.h so we can have shape.h depend on gc.h. Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>	2022-12-15 16:33:46 -05:00
Alan Wu	5fa608ed79	YJIT: Fix code GC freeing stubs with a trampoline (#6937 ) Stubs we generate for invalidation don't necessarily co-locate with the code that jumps to the stub. Since we rely on co-location to keep stubs alive as they are in the outlined code block, it used to be possible for code GC inside branch_stub_hit() to free the stub that's its direct caller, leading us to return to freed code after. Stubs used to look like: ``` mov arg0, branch_ptr mov arg1, target_idx mov arg2, ec call branch_stub_hit jmp return_reg ``` Since the call and the jump after the call are the same for all stubs, we can extract them and use a static trampoline for them. That makes branch_stub_hit() always return to static code. Stubs now look like: ``` mov arg0, branch_ptr mov arg1, target_idx jmp trampoline ``` Where the trampoline is: ``` mov arg2, ec call branch_stub_hit jmp return_reg ``` Code GC can now free stubs without problems since we'll always return to the trampoline, which we generate once on boot and which lives forever. This might save a small bit of memory due to factoring out the static part of stubs, but it's probably minor. [Bug #19234] Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2022-12-15 15:10:14 -05:00
Jemma Issroff	c1ab6ddc9a	Transition complex objects to "too complex" shape When an object becomes "too complex" (in other words it has too many variations in the shape tree), we transition it to use a "too complex" shape and use a hash for storing instance variables. Without this patch, there were rare cases where shape tree growth could "explode" and cause performance degradation on what would otherwise have been cached fast paths. This patch puts a limit on shape tree growth, and gracefully degrades in the rare case where there could be a factorial growth in the shape tree. For example: ```ruby class NG; end HUGE_NUMBER.times do NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1) end ``` We consider objects to be "too complex" when the object's class has more than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and the object introduces a new variation (a new leaf node) associated with that class. For example, new variations on instances of the following class would be considered "too complex" because those instances create more than 8 leaves in the shape tree: ```ruby class Foo; end 9.times { Foo.new.instance_variable_set(:"@uniq_#{_1}", 1) } ``` However, the following class is not too complex because it only has one leaf in the shape tree: ```ruby class Foo def initialize @a = @b = @c = @d = @e = @f = @g = @h = @i = nil end end 9.times { Foo.new } ``` This case is rare, so we don't expect this change to impact performance of most applications, but it needs to be handled. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>	2022-12-15 10:06:04 -08:00
Alan Wu	693c01d509	YJIT: Remove duplicate call to jit_prepare_routine_call() It's idempotent.	2022-12-14 16:17:46 -05:00
Takashi Kokubun	65dfe2eea8	Suppress the output of `if [ 'xyes' = xyes ];` code itself	2022-12-13 22:26:24 -08:00
Takashi Kokubun	a66a69865d	YJIT: Change the default mem size to 64MiB (#6912 ) * YJIT: Change the default mem size to 64MiB * Also update ruby --help Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>	2022-12-13 11:00:22 -05:00
Alan Wu	258ac07907	YJIT: Generate debug info in release builds (#6910 ) * YJIT: Generate debug info in release builds They are helpful in case we need to do core dump debugging. * Remove Cirrus DOC skip rule The syntax for this is weird, and escaping [ and ] cause parse failures. Cirrus' docs said to surround with .*, but then that seems to skip everything. Revert `e0a4205eb7` for now.	2022-12-12 15:59:29 -05:00
Takashi Kokubun	ece6246057	YJIT: Implement opt_newarray_max instruction (#6893 )	2022-12-12 10:19:24 -05:00
Takashi Kokubun	24043031be	YJIT: Split send_iseq_complex_callee exit reasons (#6895 )	2022-12-09 16:45:38 -08:00
Maxime Chevalier-Boisvert	daa893db41	YJIT: implement `getconstant` YARV instruction (#6884 ) * YJIT: implement getconstant YARV instruction * Constant id is not a pointer * Stack operands must be read after jit_prepare_routine_call Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2022-12-09 14:12:15 -08:00
Alan Wu	e714907d82	YJIT: Upgrade bindgen to stabilize and reduce output The new version has an option to merge everything into a big `extern "C"` block and it's nicer. More importantly, this upgrade fixes an issue where Ubuntu with Clang 12 and macOS with Clang 14 gave a one line diff for `rb_shape_t`. It was slightly annoying because we use macOS locally.	2022-12-08 17:35:18 -05:00
Takashi Kokubun	51ef991d8d	YJIT: Drop Copy trait from Context (#6889 )	2022-12-08 17:33:18 -05:00
Maxime Chevalier-Boisvert	b26c9ce5e9	YJIT: implement opt_newarray_min YARV instruction (#6888 )	2022-12-08 17:31:33 -05:00
Alan Wu	47a5b34aba	YJIT: Fold check-yjit-bindings into yjit-bindgen So it's shorter on CI and the hint about how to fix the failure shows up. It's going to print a diff locally too, but that should be fine.	2022-12-08 15:58:00 -05:00
Samuel Williams	6fd5d2dc00	Introduce `IO.new(..., path:)` and promote `File#path` to `IO#path`. (#6867 )	2022-12-08 18:19:53 +13:00
Jemma Issroff	40a9964b89	Set max_iv_count (used for object shapes) based on inline caches With this change, we're storing the iv name on an inline cache on setinstancevariable instructions. This allows us to check the inline cache to count instance variables set in initialize and give us an estimate of iv capacity for an object. For the purpose of estimating the number of instance variables required for an object, we're assuming that all initialize methods will call `super`. This change allows us to estimate the number of instance variables required without disassembling instruction sequences. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>	2022-12-06 13:43:42 -08:00
Daniel Colson	e69b91fae4	Introduce BOP_CMP for optimized comparison Prior to this commit the `OPTIMIZED_CMP` macro relied on a method lookup to determine whether `<=>` was overridden. The result of the lookup was cached, but only for the duration of the specific method that initialized the cmp_opt_data cache structure. With this method lookup, `[x,y].max` is slower than doing `x > y ? x : y` even though there's an optimized instruction for "new array max". (John noticed somebody proposed a micro-optimization based on this fact in https://github.com/mastodon/mastodon/pull/19903.) ```rb a, b = 1, 2 Benchmark.ips do |bm| bm.report('conditional') { a > b ? a : b } bm.report('method') { [a, b].max } bm.compare! end ``` Before: ``` Comparison: conditional: 22603733.2 i/s method: 19820412.7 i/s - 1.14x (± 0.00) slower ``` This commit replaces the method lookup with a new CMP basic op, which gives the examples above equivalent performance. After: ``` Comparison: method: 24022466.5 i/s conditional: 23851094.2 i/s - same-ish: difference falls within error ``` Relevant benchmarks show an improvement to Array#max and Array#min when not using the optimized newarray_max instruction as well. They are noticeably faster for small arrays with the relevant types, and the same or maybe a touch faster on larger arrays. ``` $ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_min $ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_max ``` The benchmarks added in this commit also look generally improved. Co-authored-by: John Hawthorn <jhawthorn@github.com>	2022-12-06 12:37:23 -08:00
Daniel Colson	c43951e60e	Move BOP macros to separate file This commit moves ruby_basic_operators and the unredefined macros out of vm_core.h and into basic_operators.h so that we can use them more broadly in places where we currently use a method look up via `rb_method_basic_definition_p` (e.g. object.c, numeric.c, complex.c, enum.c, but also in internal/compar.h after introducing BOP_CMP and elsewhere if we introduce more BOPs) The most controversial part of this change is probably moving redefined_flag out of rb_vm_t. [vm_opt_method_def_table and vm_opt_mid_table](`9da2a5204f/vm.c`) are not part of rb_vm_t either, and I think this fits well with those. But more significantly it seems to result in one fewer instruction. For example: Before: ``` (lldb) disassemble -n vm_opt_str_freeze miniruby`vm_exec_core: miniruby[0x10028233e] <+14558>: movq 0x11a86b(%rip), %rax ; ruby_current_vm_ptr miniruby[0x100282345] <+14565>: testb $0x4, 0x242c(%rax) ``` After: ``` (lldb) disassemble -n vm_opt_str_freeze ruby`vm_exec_core: ruby[0x100280ebe] <+14510>: testb $0x4, 0x120147(%rip) ; ruby_vm_redefined_flag + 43 ``` Co-authored-by: John Hawthorn <jhawthorn@github.com>	2022-12-06 12:37:23 -08:00
Alan Wu	235fc50447	YJIT: Remove --yjit-code-page-size (#6865 ) Certain code page sizes don't work and can cause crashes, so having this value available as a command-line option is a bit dangerous. Remove it and turn it into a constant instead.	2022-12-05 17:43:17 -05:00
Jemma Issroff	e7642d8095	YJIT: Extract SHAPE_ID_NUM_BITS into a constant (#6863 )	2022-12-05 13:20:11 -08:00
Jemma Issroff	41bacd9b0d	Remove unused rb_shape_flag_shift and rb_shape_flag_mask	2022-12-02 12:53:51 -08:00
Jemma Issroff	ebd4c7bb01	Fixed yjit bindings rb_gc_write_barrier	2022-12-02 12:53:51 -08:00
Jemma Issroff	4c5e89791b	Extracted rb_shape_id_offset	2022-12-02 12:53:51 -08:00
Maxime Chevalier-Boisvert	606653e43a	Update yjit/src/codegen.rs	2022-12-02 12:53:51 -08:00
Aaron Patterson	be40af284a	make flag clearing better	2022-12-02 12:53:51 -08:00
Aaron Patterson	07fe3d37c5	only generate wb when we really need to	2022-12-02 12:53:51 -08:00
Aaron Patterson	744b0527ea	bail on compilation if the comptime receiver is frozen	2022-12-02 12:53:51 -08:00
Aaron Patterson	7b5ee9a8a6	do not fire the wb when writing immediates	2022-12-02 12:53:51 -08:00
Aaron Patterson	17f9bcd7d7	implement IV writes	2022-12-02 12:53:51 -08:00
Alan Wu	eb2b717a8b	YJIT: Make case-when optimization respect === redefinition (#6846 ) * YJIT: Make case-when optimization respect === redefinition Even when a fixnum key is in the dispatch hash, if there is a case such that its basic operations for === is redefined, we need to fall back to checking each case like the interpreter. Semantically we're always checking each case by calling === in order, it's just that this is not observable when basic operations are intact. When all the keys are fixnums, though, we can do the optimization we're doing right now. Check for this condition. * Update yjit/src/cruby_bindings.inc.rs Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2022-12-02 11:40:16 -05:00
Takashi Kokubun	fa77bcf722	YJIT: Change the default --yjit-call-threshold to 30 (#6850 )	2022-12-02 11:32:49 -05:00
Takashi Kokubun	dcbea7671b	YJIT: Respect destination num_bits on STUR (#6848 )	2022-12-01 16:13:38 -08:00
Takashi Kokubun	2c939458ca	YJIT: Reorder branches for Fixnum opt_case_dispatch (#6841 ) * YJIT: Reorder branches for Fixnum opt_case_dispatch Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> * YJIT: Don't support too large values Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>	2022-12-01 10:59:56 -05:00
Jemma Issroff	06a0c58016	YJIT: fix 32 and 16 bit register store (#6840 ) * Fix 32 and 16 bit register store in YJIT Co-Authored-By: Takashi Kokubun <takashikkbn@gmail.com> * Remove an unnecessary diff * Reuse an rm_num_bits result * Use u16::MAX instead * Update the link Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> * Just use sturh for 16 bits Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>	2022-12-01 10:53:50 -05:00
Takashi Kokubun	0d3fc08ff4	YJIT: Optimize rb_int_equal (#6838 )	2022-11-30 16:16:11 -05:00
Maxime Chevalier-Boisvert	d98d84b75d	YJIT: add new counters for deferred compilation and queued blocks (#6837 )	2022-11-30 14:09:10 -05:00
Alan Wu	a0b0365e90	YJIT: Deallocate `struct Block` to plug memory leaks Previously we essentially never freed blocks, even after invalidation. Their reference counts never reached zero for a couple of reasons: 1. `Branch::block` formed a cycle with the block holding the branch 2. The strong count on a branch that has ever contained a stub never reached 0 because the increment from the `.clone()` call for `BranchRef::into_raw()` didn't have a matching decrement. It's not safe to immediately deallocate blocks during invalidation since `branch_stub_hit()` can end up running with a branch pointer from an invalidated branch. To plug the leaks, we wait until code GC or global invalidation and deallocate the blocks for iseqs that are definitely not running.	2022-11-30 12:23:50 -05:00
Alan Wu	b30248f74a	YJIT: Deallocate when assumptions tables are empty When we run global invalidation for TracePoints or code GC, we clear out all blocks in our assumptions table but we don't deallocate the backing buffers. Let's reclaim some memory during these rare events.	2022-11-30 12:23:50 -05:00
Alan Wu	03f1e6a2aa	YJIT: Fix IseqPayload::pages memory bloat HashSet::clear() doesn't deallocate the backing buffer and shrink the capacity. Replace with a 0-capacity set instead so we reclaim some memory each code GC.	2022-11-30 12:23:50 -05:00
Takashi Kokubun	3e4d1a1dd1	YJIT: Skip checking interrupt_mask (#6825 )	2022-11-29 10:09:32 -05:00
Takashi Kokubun	6844bcc6b4	MJIT: Use a String buffer in builtin compilers instead of FILE*. Using C.fprintf is slower than String manipulation on memory. I'm going to change the way MJIT writes files, and this is a prerequisite for it.	2022-11-27 21:11:33 -08:00
Maxime Chevalier-Boisvert	d2fa67de81	YJIT: rename `InsnOpnd` => `YARVOpnd` (#6801 ) Rename InsnOpnd => YARVOpnd Make it more clear this refers to YARV insn/vm operands rather than backend IR, x86 or ARM insn operands.	2022-11-24 10:30:28 -05:00
Alan Wu	d92054e371	YJIT: Use a Box for branch targets to save memory We frequently make branches that only have one target but we used to always allocate space for two branch targets. This patch moves all the information a branch target has into a struct and refer to them using Option<Box<BranchTarget>>, this way when the second branch target is not present it only takes 8 bytes. Retained heap size on railsbench went from 16.17 MiB to 14.57 MiB, a ratio of about 1.1.	2022-11-23 18:00:12 -05:00
Takashi Kokubun	a50aabde9c	YJIT: Simplify Insn::CCall to obviate Target::FunPtr (#6793 )	2022-11-23 12:14:43 -05:00
Takashi Kokubun	d88adaad7e	YJIT: Use NonNull pointer for CodePtr (#6792 )	2022-11-23 12:02:05 -05:00
Takashi Kokubun	9c36de3c48	YJIT: Stop passing target1 to gen_return_branch	2022-11-23 11:59:50 -05:00
Takashi Kokubun	fe2bed6778	YJIT: Simplify code for RB_SPECIAL_CONST_P (#6795 )	2022-11-23 11:59:02 -05:00
Jemma Issroff	e82b15b660	Fix YJIT backend to account for unsigned int immediates (#6789 ) YJIT: x86_64: Fix cmp with number where sign bit is set Before this commit, we were unconditionally treating unsigned ints as signed ints when counting the number of bits required for representing the immediate in machine code. When the size of the immediate matches the size of the other operand, no sign extension happens, so this was incorrect. `asm.cmp(opnd64, 0x8000_0000)` panicked even though it's encodable as `CMP r/m32, imm32`. Large shape ids were impacted by this issue. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: Alan Wu <alanwu@ruby-lang.org> Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-authored-by: Alan Wu <alanwu@ruby-lang.org>	2022-11-23 10:48:17 -05:00
Takashi Kokubun	63f4a7a1ec	YJIT: Skip padding jumps to side exits on Arm (#6790 ) YJIT: Skip padding jumps to side exits Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>	2022-11-22 15:57:17 -05:00
Takashi Kokubun	607fb49dbc	YJIT: Lower the required Rust version from 1.58.1 to 1.58.0 (#6780 )	2022-11-21 10:27:39 -08:00
Takashi Kokubun	6dcb7b9216	YJIT: Improve the failure message on enlarging a branch (#6769 )	2022-11-18 17:27:07 -08:00
Aaron Patterson	9e067df76b	32 bit comparison on shape id This commit changes the shape id comparisons to use a 32 bit comparison rather than 64 bit. That means we don't need to load the shape id to a register on x86 machines. Given the following program: ```ruby class Foo def initialize @foo = 1 @bar = 1 end def read [@foo, @bar] end end foo = Foo.new foo.read foo.read foo.read foo.read foo.read puts RubyVM::YJIT.disasm(Foo.instance_method(:read)) ``` The machine code we generated _before_ this change is like this: ``` == BLOCK 1/4, ISEQ RANGE [0,3), 65 bytes ====================== # getinstancevariable 0x559a18623023: mov rax, qword ptr [r13 + 0x18] # guard object is heap 0x559a18623027: test al, 7 0x559a1862302a: jne 0x559a1862502d 0x559a18623030: cmp rax, 4 0x559a18623034: jbe 0x559a1862502d # guard shape, embedded, and T_OBJECT 0x559a1862303a: mov rcx, qword ptr [rax] 0x559a1862303d: movabs r11, 0xffff00000000201f 0x559a18623047: and rcx, r11 0x559a1862304a: movabs r11, 0xb000000002001 0x559a18623054: cmp rcx, r11 0x559a18623057: jne 0x559a18625046 0x559a1862305d: mov rax, qword ptr [rax + 0x18] 0x559a18623061: mov qword ptr [rbx], rax == BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes ======================= == BLOCK 3/4, ISEQ RANGE [3,6), 47 bytes ====================== # gen_direct_jmp: fallthrough # getinstancevariable # regenerate_branch # getinstancevariable # regenerate_branch 0x559a18623064: mov rax, qword ptr [r13 + 0x18] # guard shape, embedded, and T_OBJECT 0x559a18623068: mov rcx, qword ptr [rax] 0x559a1862306b: movabs r11, 0xffff00000000201f 0x559a18623075: and rcx, r11 0x559a18623078: movabs r11, 0xb000000002001 0x559a18623082: cmp rcx, r11 0x559a18623085: jne 0x559a18625099 0x559a1862308b: mov rax, qword ptr [rax + 0x20] 0x559a1862308f: mov qword ptr [rbx + 8], rax ``` After this change, it's like this: ``` == BLOCK 1/4, ISEQ RANGE [0,3), 41 bytes ====================== # getinstancevariable 0x5560c986d023: mov rax, qword ptr [r13 + 0x18] # guard object is heap 0x5560c986d027: test al, 7 0x5560c986d02a: jne 0x5560c986f02d 0x5560c986d030: cmp rax, 4 0x5560c986d034: jbe 0x5560c986f02d # guard shape 0x5560c986d03a: cmp word ptr [rax + 6], 0x19 0x5560c986d03f: jne 0x5560c986f046 0x5560c986d045: mov rax, qword ptr [rax + 0x10] 0x5560c986d049: mov qword ptr [rbx], rax == BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes ======================= == BLOCK 3/4, ISEQ RANGE [3,6), 23 bytes ====================== # gen_direct_jmp: fallthrough # getinstancevariable # regenerate_branch # getinstancevariable # regenerate_branch 0x5560c986d04c: mov rax, qword ptr [r13 + 0x18] # guard shape 0x5560c986d050: cmp word ptr [rax + 6], 0x19 0x5560c986d055: jne 0x5560c986f099 0x5560c986d05b: mov rax, qword ptr [rax + 0x18] 0x5560c986d05f: mov qword ptr [rbx + 8], rax ``` The first ivar read is a bit more complex, but the second ivar read is much simpler. I think eventually we could teach the context about the shape, then emit only one shape guard.	2022-11-18 12:04:10 -08:00
Jimmy Miller	98e9165b0a	Fix bug involving .send and overwritten methods. (#6752 ) @casperisfine reported a bug in this gist: https://gist.github.com/casperisfine/d59e297fba38eb3905a3d7152b9e9350 After investigating I found it was caused by a combination of send and a c_func that we have overwritten in the JIT. For send calls, we need to do some stack manipulation before making the call. Because of the way exits work, we need to do that stack manipulation at the last possible moment. In this case, we weren't doing that stack manipulation at all. Unfortunately, with how the code is structured there isn't a great place to do that stack manipulation for our overridden C funcs. Each overridden C func can return a boolean stating that it shouldn't be used. We would need to do the stack manipulation after all of those checks are done. We could pass a lambda(?) or separate out the logic for "can I run this override" from "now generate the code for it". Since we are coming up on a release, I went with the path of least resistance and just decided to not use these overrides if we are in a send call. We definitely should revisit this in the future.	2022-11-17 23:17:40 -05:00
Takashi Kokubun	a777ec0d85	YJIT: Shrink version lists after mutation (#6749 )	2022-11-16 16:30:39 -08:00
Takashi Kokubun	3259aceb35	YJIT: Pack BlockId and CodePtr (#6748 )	2022-11-16 15:48:46 -08:00
Takashi Kokubun	1b8236acc2	YJIT: Add compiled_branch_count stats (#6746 )	2022-11-16 15:31:13 -08:00
Takashi Kokubun	6de4032e40	YJIT: Stop wrapping CmePtr with CmeDependency (#6747 ) * YJIT: Stop wrapping CmePtr with CmeDependency * YJIT: Fix an outdated comment [ci skip]	2022-11-16 15:30:29 -08:00
Takashi Kokubun	3eb7a6521c	YJIT: Shrink the vectors of Block after mutation (#6739 )	2022-11-16 10:09:15 -08:00
Takashi Kokubun	41b0f641ef	YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets (#6733 ) * YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> * Introduce heap_object_p * Leave original mov intact * Remove unneeded branches * Add a test for movabs Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>	2022-11-15 15:23:20 -08:00
Takashi Kokubun	0d384ce6e6	YJIT: Include actual memory region size in stats (#6736 )	2022-11-15 15:20:02 -08:00
Takashi Kokubun	d1fb659547	YJIT: Count getivar side exits by receiver flag changes (#6735 )	2022-11-15 14:50:12 -08:00
Takashi Kokubun	1125274c4e	YJIT: Invalidate redefined methods only through cme (#6734 ) Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>	2022-11-15 12:57:43 -08:00
Aaron Patterson	e74d82f674	Implement LDURH on Aarch64 When RUBY_DEBUG is enabled, shape ids are 16 bits. I would like to do 16 bit comparisons, so I need to load halfwords sometimes. This commit adds LDURH so that I can load halfwords. https://developer.arm.com/documentation/ddi0596/2021-12/Base-Instructions/LDURH--Load-Register-Halfword--unscaled--?lang=en I verified the bytes using clang: ``` $ cat asmthing.s .global _start .align 2 _start: ldurh w10, [x1] ldurh w10, [x1, #123] $ as asmthing.s -o asmthing.o && objdump --disassemble asmthing.o asmthing.o: file format mach-o arm64 Disassembly of section __TEXT,__text: 0000000000000000 <ltmp0>: 0: 2a 00 40 78 ldurh w10, [x1] 4: 2a b0 47 78 ldurh w10, [x1, #123] ```	2022-11-14 17:04:50 -08:00
Aaron Patterson	b7d591643a	Remove USE_RVARGC code We don't need this constant to be exposed anymore, so remove it	2022-11-14 15:42:11 -08:00
Takashi Kokubun	6246788bc4	YJIT: Instrument global allocations on stats build (#6712 ) * YJIT: Instrument global allocations on stats build * Just use GLOBAL_ALLOCATOR.stats()	2022-11-13 12:54:41 -05:00
Takashi Kokubun	d5e1b82f5c	YJIT: Remove unused src_ctx from Block (#6714 )	2022-11-13 12:33:23 -05:00
Alan Wu	04c5adf806	YJIT: Fix staying in invalidated code after proc calls Previously, there is no instruction boundary patch point after the call to a non-leaf C function we generate for OPTIMIZED_METHOD_TYPE_CALL. This meant that if code GC is triggered while inside the C function, we would keep running invalidated code when we return from the C function. This had the effect of running stale branch stubs, jumping to bad code, etc. Use jit_prepare_routine_call() to make sure we exit from the invalidated region as soon as possible after the C call in case of invalidation.	2022-11-11 11:13:07 -05:00
Jimmy Miller	8b3347950e	Enable --yjit-stats for release builds (#6694 ) * Enable --yjit-stats for release builds In order for people in the real world to report information about how their application runs with YJIT, we want to expose stats without requiring rebuilding ruby. We can do this without overhead, with the exception of count ratio in yjit, since this relies on the interpreter also counting instructions. This change exposes those stats, while not showing ratio in yjit if we are not in a stats build. * Update yjit.rb Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2022-11-10 12:56:22 -05:00
Jemma Issroff	c726c48a3d	Remove numiv from RObject Since object shapes store the capacity of an object, we no longer need the numiv field on RObjects. This gives us one extra slot which we can use to give embedded objects one more instance variable (for a total of 3 ivs). This commit removes the concept of numiv from RObject.	2022-11-10 10:11:34 -05:00
Jemma Issroff	5246f4027e	Transition shape when object's capacity changes This commit adds a `capacity` field to shapes, and adds shape transitions whenever an object's capacity changes. Objects which are allocated out of a bigger size pool will also make a transition from the root shape to the shape with the correct capacity for their size pool when they are allocated. This commit will allow us to remove numiv from objects completely, and will also mean we can guarantee that if two objects share shapes, their IVs are in the same positions (an embedded and extended object cannot share shapes). This will enable us to implement ivar sets in YJIT using object shapes. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>	2022-11-10 10:11:34 -05:00
Alan Wu	5d95cd99f4	YJIT: Reset dropped_bytes when patching code We switch to a new page when we detect dropped_bytes flipping from false to true. Previously, when we patch code for invalidation during code gc, we start with the flag being set to true, so we failed to apply patches that straddle pages. We would write out jumps half way and then stop, which left the code corrupted. Reset the flag before patching so we patch across pages properly.	2022-11-08 16:09:02 -05:00
Jimmy Miller	1a65ab20cb	Implement optimize call (#6691 ) This dispatches to a c func for doing the dynamic lookup. I experimented with chain on the proc but wasn't able to detect which call sites would be monomorphic vs polymorphic. There is definitely room for optimization here, but it does reduce exits.	2022-11-08 15:28:28 -05:00
Takashi Kokubun	7442cb461b	YJIT: Free pages after ObjectSpace API usages (#6676 )	2022-11-07 10:48:26 -05:00
Takashi Kokubun	ea77aa2fd0	YJIT: Make Code GC metrics available for non-stats builds (#6665 )	2022-11-03 13:41:35 -04:00
Takashi Kokubun	124f10f56b	YJIT: Fix a wrong type reference (#6661 ) * YJIT: Fix a wrong type reference * YJIT: Just remove CapturedSelfOpnd for now	2022-11-03 13:33:49 -04:00
Takashi Kokubun	68ef97d788	YJIT: Stop incrementing write_pos if cb.has_dropped_bytes (#6664 ) Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>	2022-11-03 11:42:28 -04:00
Takashi Kokubun	81e84e0a4d	YJIT: Support invokeblock (#6640 ) * YJIT: Support invokeblock * Update yjit/src/backend/arm64/mod.rs * Update yjit/src/codegen.rs Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2022-11-02 12:30:48 -04:00
Takashi Kokubun	946bb34fb5	YJIT: Avoid accumulating freed pages in the payload (#6657 ) Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>	2022-11-02 11:14:31 -04:00
Takashi Kokubun	0d1e1987d1	YJIT: Visualize live ranges on register spill (#6651 )	2022-11-01 15:05:36 -04:00
Alan Wu	cbf15e5cbe	YJIT: Add an assert to help with Context changes While experimenting I found that it's easy to change Context and forget to also change the copying operation in limit_block_versions(). Add an assert to make sure we substitute a compatible generic context when limiting the number of versions.	2022-11-01 14:16:32 -04:00
Alan Wu	a70f90e1a9	YJIT: Delete redundant ways to make Context Context::new() is the same as Context::default() and Context::new_with_stack_size() was only used in tests.	2022-11-01 14:16:32 -04:00
Takashi Kokubun	2b39640b0b	YJIT: Add RubyVM::YJIT.code_gc (#6644 ) * YJIT: Add RubyVM::YJIT.code_gc * Rename compiled_page_count to live_page_count	2022-10-31 14:29:45 -04:00
Maxime Chevalier-Boisvert	5e6633fcf9	YJIT: reduce default `--yjit-exec-mem-size` to 128MiB instead of 256 (#6649 ) Reduce default --yjit-exec-mem-size to 128MiB instead of 256	2022-10-31 14:29:11 -04:00
Alan Wu	9cf027f83a	YJIT: Use guard_known_class() for opt_aref on Arrays (#6643 ) This code used to roll its own heap object check before we made a better version in guard_known_class(). The improved version uses one fewer comparison, so let's use that.	2022-10-27 18:52:58 -04:00

975 commits