github/ruby - ruby

Граф коммитов

Автор	SHA1	Сообщение	Дата
Aaron Patterson	bbeebc9258	Only set shape id for CCs on attr_set + ivar Only ivar reader and writer methods should need the shape ID set on the inline cache. We shouldn't accidentally overwrite parts of the CC union when it's not necessary.	2024-07-31 16:23:28 -07:00
Gabriel Lacroix	1652c194c8	Fix comment for VM_CALL_ARGS_SIMPLE (#11067 ) * Set VM_CALL_KWARG flag first and reuse it to avoid checking kw_arg twice * Fix comment for VM_CALL_ARGS_SIMPLE * Make VM_CALL_ARGS_SIMPLE set-site match its comment	2024-06-28 10:11:35 -04:00
Aaron Patterson	cdf33ed5f3	Optimized forwarding callers and callees This patch optimizes forwarding callers and callees. It only optimizes methods that only take `...` as their parameter, and then pass `...` to other calls. Calls it optimizes look like this: ```ruby def bar(a) = a def foo(...) = bar(...) # optimized foo(123) ``` ```ruby def bar(a) = a def foo(...) = bar(1, 2, ...) # optimized foo(123) ``` ```ruby def bar(a) = a def foo(...) list = [1, 2] bar(list, ...) # optimized end foo(123) ``` All variants of the above but using `super` are also optimized, including a bare super like this: ```ruby def foo(...) super end ``` This patch eliminates intermediate allocations made when calling methods that accept `...`. We can observe allocation elimination like this: ```ruby def m x = GC.stat(:total_allocated_objects) yield GC.stat(:total_allocated_objects) - x end def bar(a) = a def foo(...) = bar(...) def test m { foo(123) } end test p test # allocates 1 object on master, but 0 objects with this patch ``` ```ruby def bar(a, b:) = a + b def foo(...) = bar(...) def test m { foo(1, b: 2) } end test p test # allocates 2 objects on master, but 0 objects with this patch ``` How does it work? ----------------- This patch works by using a dynamic stack size when passing forwarded parameters to callees. The caller's info object (known as the "CI") contains the stack size of the parameters, so we pass the CI object itself as a parameter to the callee. When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee. The CI at the forwarded call site is adjusted using information from the caller's CI. I think this description is kind of confusing, so let's walk through an example with code. ```ruby def delegatee(a, b) = a + b def delegator(...) delegatee(...) # CI2 (FORWARDING) end def caller delegator(1, 2) # CI1 (argc: 2) end ``` Before we call the delegator method, the stack looks like this: ``` Executing Line \| Code \| Stack ---------------+---------------------------------------+-------- 1\| def delegatee(a, b) = a + b \| self 2\| \| 1 3\| def delegator(...) \| 2 4\| # \| 5\| delegatee(...) # CI2 (FORWARDING) \| 6\| end \| 7\| \| 8\| def caller \| -> 9\| delegator(1, 2) # CI1 (argc: 2) \| 10\| end \| ``` The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls in to `delegator`, it writes `CI1` on to the stack as a local variable for the `delegator` method. The `delegator` method has a special local called `...` that holds the caller's CI object. Here is the ISeq disasm fo `delegator`: ``` == disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] "..."@0 0000 putself ( 1)[LiCa] 0001 getlocal_WC_0 "..."@0 0003 send <calldata!mid:delegatee, argc:0, FCALL\|FORWARDING>, nil 0006 leave [Re] ``` The local called `...` will contain the caller's CI: CI1. Here is the stack when we enter `delegator`: ``` Executing Line \| Code \| Stack ---------------+---------------------------------------+-------- 1\| def delegatee(a, b) = a + b \| self 2\| \| 1 3\| def delegator(...) \| 2 -> 4\| # \| CI1 (argc: 2) 5\| delegatee(...) # CI2 (FORWARDING) \| cref_or_me 6\| end \| specval 7\| \| type 8\| def caller \| 9\| delegator(1, 2) # CI1 (argc: 2) \| 10\| end \| ``` The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to memcopy the caller's stack before calling `delegatee`. In this case, it will memcopy self, 1, and 2 to the stack before calling `delegatee`. It knows how much memory to copy from the caller because `CI1` contains stack size information (argc: 2). Before executing the `send` instruction, we push `...` on the stack. The `send` instruction pops `...`, and because it is tagged with `FORWARDING`, it knows to memcopy (using the information in the CI it just popped): ``` == disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] "..."@0 0000 putself ( 1)[LiCa] 0001 getlocal_WC_0 "..."@0 0003 send <calldata!mid:delegatee, argc:0, FCALL\|FORWARDING>, nil 0006 leave [Re] ``` Instruction 001 puts the caller's CI on the stack. `send` is tagged with FORWARDING, so it reads the CI and _copies_ the callers stack to this stack: ``` Executing Line \| Code \| Stack ---------------+---------------------------------------+-------- 1\| def delegatee(a, b) = a + b \| self 2\| \| 1 3\| def delegator(...) \| 2 4\| # \| CI1 (argc: 2) -> 5\| delegatee(...) # CI2 (FORWARDING) \| cref_or_me 6\| end \| specval 7\| \| type 8\| def caller \| self 9\| delegator(1, 2) # CI1 (argc: 2) \| 1 10\| end \| 2 ``` The "FORWARDING" call site combines information from CI1 with CI2 in order to support passing other values in addition to the `...` value, as well as perfectly forward splat args, kwargs, etc. Since we're able to copy the stack from `caller` in to `delegator`'s stack, we can avoid allocating objects. I want to do this to eliminate object allocations for delegate methods. My long term goal is to implement `Class#new` in Ruby and it uses `...`. I was able to implement `Class#new` in Ruby [here](https://github.com/ruby/ruby/pull/9289). If we adopt the technique in this patch, then we can optimize allocating objects that take keyword parameters for `initialize`. For example, this code will allocate 2 objects: one for `SomeObject`, and one for the kwargs: ```ruby SomeObject.new(foo: 1) ``` If we combine this technique, plus implement `Class#new` in Ruby, then we can reduce allocations for this common operation. Co-Authored-By: John Hawthorn <john@hawthorn.email> Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>	2024-06-18 09:28:25 -07:00
Aaron Patterson	0434dfb76b	We don't need to check if the ci is markable anymore It doesn't matter if CI's are stack allocated or not.	2024-04-24 15:09:06 -07:00
Aaron Patterson	147ca9585e	Implement equality for CI comparison when CC searching When we're searching for CCs, compare the argc and flags for CI rather than comparing pointers. This means we don't need to store a reference to the CI, and it also naturally "de-duplicates" CC objects. We can observe the effect with the following code: ```ruby require "objspace" hash = {} p ObjectSpace.memsize_of(Hash) eval ("a".."zzz").map { \|key\| "hash.merge(:#{key} => 1)" }.join("; ") p ObjectSpace.memsize_of(Hash) ``` On master: ``` $ ruby -v test.rb ruby 3.4.0dev (2024-04-15T16:21:41Z master `d019b3baec`) [arm64-darwin23] test.rb:3: warning: assigned but unused variable - hash 3424 527736 ``` On this branch: ``` $ make runruby compiling vm.c linking miniruby builtin_binary.inc updated compiling builtin.c linking static-library libruby.3.4-static.a ln -sf ../../rbconfig.rb .ext/arm64-darwin23/rbconfig.rb linking ruby ld: warning: ignoring duplicate libraries: '-ldl', '-lobjc', '-lpthread' RUBY_ON_BUG='gdb -x ./.gdbinit -p' ./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems ./test.rb 2240 2368 ``` Co-authored-by: John Hawthorn <jhawthorn@github.com>	2024-04-18 09:06:33 -07:00
Peter Zhu	330830dd1a	Add IMEMO_NEW Rather than exposing that an imemo has a flag and four fields, this changes the implementation to only expose one field (the klass) and fills the rest with 0. The type will have to fill in the values themselves.	2024-02-21 11:33:05 -05:00
John Hawthorn	1c97abaaba	De-dup identical callinfo objects Previously every call to vm_ci_new (when the CI was not packable) would result in a different callinfo being returned this meant that every kwarg callsite had its own CI. When calling, different CIs result in different CCs. These CIs and CCs both end up persisted on the T_CLASS inside cc_tbl. So in an eval loop this resulted in a memory leak of both types of object. This also likely resulted in extra memory used, and extra time searching, in non-eval cases. For simplicity in this commit I always allocate a CI object inside rb_vm_ci_lookup, but ideally we would lazily allocate it only when needed. I hope to do that as a follow up in the future.	2024-02-20 18:55:00 -08:00
Jeremy Evans	22e488464a	Add VM_CALL_ARGS_SPLAT_MUT callinfo flag This flag is set when the caller has already created a new array to handle a splat, such as for `f(a, b)` and `f(a, b)`. Previously, if `f` was defined as `def f(a)`, these calls would create an extra array on the callee side, instead of using the new array created by the caller. This modifies `setup_args_core` to set the flag whenver it would add a `splatarray true` instruction. However, when `splatarray true` is changed to `splatarray false` in the peephole optimizer, to avoid unnecessary allocations on the caller side, the flag must be removed. Add `optimize_args_splat_no_copy` and have the peephole optimizer call that. This significantly simplifies the related peephole optimizer code. On the callee side, in `setup_parameters_complex`, set `args->rest_dupped` to true if the flag is set. This takes a similar approach for optimizing regular splats that was previiously used for keyword splats in `d2c41b1bff` (via VM_CALL_KW_SPLAT_MUT).	2024-01-24 18:25:55 -08:00
Jeremy Evans	3081c83169	Support tracing of struct member accessor methods This follows the same approach used for attr_reader/attr_writer in `2d98593bf5`, skipping the checking for tracing after the first call using the call cache, and clearing the call cache when tracing is turned on/off. Fixes [Bug #18886]	2023-12-07 10:29:33 -08:00
HParker	c74dc8b4af	Use reference counting to avoid memory leak in kwargs Tracks other callinfo that references the same kwargs and frees them when all references are cleared. [bug #19906] Co-authored-by: Peter Zhu <peter@peterzhu.ca>	2023-10-01 10:55:19 -04:00
Koichi Sasada	cfd7729ce7	use inline cache for refinements From Ruby 3.0, refined method invocations are slow because resolved methods are not cached by inline cache because of conservertive strategy. However, `using` clears all caches so that it seems safe to cache resolved method entries. This patch caches resolved method entries in inline cache and clear all of inline method caches when `using` is called. fix [Bug #18572] ```ruby # without refinements class C def foo = :C end N = 1_000_000 obj = C.new require 'benchmark' Benchmark.bm{\|x\| x.report{N.times{ obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; }} } _END__ user system total real master 0.362859 0.002544 0.365403 ( 0.365424) modified 0.357251 0.000000 0.357251 ( 0.357258) ``` ```ruby # with refinment but without using class C def foo = :C end module R refine C do def foo = :R end end N = 1_000_000 obj = C.new require 'benchmark' Benchmark.bm{\|x\| x.report{N.times{ obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; }} } __END__ user system total real master 0.957182 0.000000 0.957182 ( 0.957212) modified 0.359228 0.000000 0.359228 ( 0.359238) ``` ```ruby # with using class C def foo = :C end module R refine C do def foo = :R end end N = 1_000_000 using R obj = C.new require 'benchmark' Benchmark.bm{\|x\| x.report{N.times{ obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; }} }	2023-07-31 17:13:43 +09:00
Koichi Sasada	36023d5cb7	mark `cc->cme_` if it is for `super` `vm_search_super_method()` makes orphan CCs (they are not connected from ccs) and `cc->cme_` can be collected before without marking.	2023-07-31 14:04:31 +09:00
Ruby	6391132c03	fix typo (CACH_ -> CACHE_)	2023-07-28 18:01:42 +09:00
Nobuyoshi Nakada	469e644c93	Compile code for non-embedded CI always	2023-06-30 23:59:04 +09:00
Takashi Kokubun	df1b007fbd	Remove unused VM_CALL_BLOCKISEQ flag	2023-04-01 10:22:47 -07:00
Takashi Kokubun	175538e433	Improve explanation of FCALL and VCALL	2023-04-01 10:17:59 -07:00
Koichi Sasada	c9fd81b860	`vm_call_single_noarg_inline_builtin` If the iseq only contains `opt_invokebuiltin_delegate_leave` insn and the builtin-function (bf) is inline-able, the caller doesn't need to build a method frame. `vm_call_single_noarg_inline_builtin` is fast path for such cases.	2023-03-23 14:03:12 +09:00
Takashi Kokubun	2e875549a9	s/MJIT/RJIT/	2023-03-06 23:44:01 -08:00
Yusuke Endoh	2cc3963a00	Prevent wrong integer expansion `(attr_index + 1)` leads to wrong integer expansion on 32-bit machines (including Solaris 10 CI) because `attr_index_t` is uint16_t. http://rubyci.s3.amazonaws.com/solaris10-gcc/ruby-master/log/20221013T080004Z.fail.html.gz ``` 1) Failure: TestRDocClassModule#test_marshal_load_version_2 [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_class_module.rb:493]: <[doc: [doc (file.rb): [para: "this is a comment"]]]> expected but was <[doc: [doc (file.rb): [para: "this is a comment"]]]>. 2) Failure: TestRDocStats#test_report_method_line [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_stats.rb:460]: Expected /\#\ in\ file\ file\.rb:4/ to match "The following items are not documented:\n" + "\n" + " class C # is documented\n" + "\n" + " # in file file.rb\n" + " def m1; end\n" + "\n" + " end\n" + "\n" + "\n". 3) Failure: TestRDocStats#test_report_attr_line [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_stats.rb:91]: Expected /\#\ in\ file\ file\.rb:3/ to match "The following items are not documented:\n" + "\n" + " class C # is documented\n" + "\n" + " attr_accessor :a # in file file.rb\n" + "\n" + " end\n" + "\n" + "\n". ```	2022-10-13 08:14:04 -07:00
Nobuyoshi Nakada	b55e3b842a	Initialize shape attr index also in non-markable CC	2022-10-12 09:14:55 -07:00
Yusuke Endoh	7a9f865a1d	Do not read cached_id from callcache on stack The inline cache is initialized by vm_cc_attr_index_set only when vm_cc_markable(cc). However, vm_getivar attempted to read the cache even if the cc is not vm_cc_markable. This caused a condition that depends on uninitialized value. Here is an output of valgrind: ``` ==10483== Conditional jump or move depends on uninitialised value(s) ==10483== at 0x4C1D60: vm_getivar (vm_insnhelper.c:1171) ==10483== by 0x4C1D60: vm_call_ivar (vm_insnhelper.c:3257) ==10483== by 0x4E8E48: vm_call_symbol (vm_insnhelper.c:3481) ==10483== by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035) ==10483== by 0x4C62B2: vm_exec_core (insns.def:820) ==10483== by 0x4DD519: rb_vm_exec (vm.c:0) ==10483== by 0x4F00B3: invoke_block (vm.c:1417) ==10483== by 0x4F00B3: invoke_iseq_block_from_c (vm.c:1473) ==10483== by 0x4F00B3: invoke_block_from_c_bh (vm.c:1491) ==10483== by 0x4D42B6: rb_yield (vm_eval.c:0) ==10483== by 0x259128: rb_ary_each (array.c:2733) ==10483== by 0x4E8730: vm_call_cfunc_with_frame (vm_insnhelper.c:3227) ==10483== by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035) ==10483== by 0x4C6254: vm_exec_core (insns.def:801) ==10483== by 0x4DD519: rb_vm_exec (vm.c:0) ==10483== ``` In fact, the CI on FreeBSD 12 started failing since `ad63b668e2`. ``` gmake[1]: Entering directory '/usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby' /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:924:in `complete': undefined method `complete' for nil:NilClass (NoMethodError) from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1816:in `block in visit' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `reverse_each' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `visit' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1847:in `block in complete' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `catch' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `complete' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1640:in `block in parse_in_order' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `catch' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `parse_in_order' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1626:in `order!' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1732:in `permute!' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1757:in `parse!' from ./ext/extmk.rb:359:in `parse_args' from ./ext/extmk.rb:396:in `<main>' ``` This change adds a guard to read the cache only when vm_cc_markable(cc). It might be better to initialize the cache as INVALID_SHAPE_ID when the cc is not vm_cc_markable.	2022-10-12 20:28:24 +09:00
Jemma Issroff	913979bede	Make inline cache reads / writes atomic with object shapes Prior to this commit, we were reading and writing ivar index and shape ID in inline caches in two separate instructions when getting and setting ivars. This meant there was a race condition with ractors and these caches where one ractor could change a value in the cache while another was still reading from it. This commit instead reads and writes shape ID and ivar index to inline caches atomically so there is no longer a race condition. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: John Hawthorn <john@hawthorn.email>	2022-10-11 08:40:56 -07:00
Jemma Issroff	ad63b668e2	Revert "Revert "This commit implements the Object Shapes technique in CRuby."" This reverts commit `9a6803c90b`.	2022-10-11 08:40:56 -07:00
Aaron Patterson	9a6803c90b	Revert "This commit implements the Object Shapes technique in CRuby." This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.	2022-09-30 16:01:50 -07:00
Jemma Issroff	d594a5a8bd	This commit implements the Object Shapes technique in CRuby. Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape. For example: ```ruby class Foo def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end class Bar def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end foo = Foo.new # `foo` has shape id 2 bar = Bar.new # `bar` has shape id 2 ``` Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order. This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers. This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details. For more context on Object Shapes, see [Feature: #18776] Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com> Co-Authored-By: John Hawthorn <john@hawthorn.email>	2022-09-28 08:26:21 -07:00
Aaron Patterson	06abfa5be6	Revert this until we can figure out WB issues or remove shapes from GC Revert "* expand tabs. [ci skip]" This reverts commit `830b5b5c35`. Revert "This commit implements the Object Shapes technique in CRuby." This reverts commit `9ddfd2ca00`.	2022-09-26 16:10:11 -07:00
Jemma Issroff	9ddfd2ca00	This commit implements the Object Shapes technique in CRuby. Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape. For example: ```ruby class Foo def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end class Bar def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end foo = Foo.new # `foo` has shape id 2 bar = Bar.new # `bar` has shape id 2 ``` Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order. This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers. This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details. For more context on Object Shapes, see [Feature: #18776] Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com> Co-Authored-By: John Hawthorn <john@hawthorn.email>	2022-09-26 09:21:30 -07:00
Jemma Issroff	ecff334995	Extract vm_ic_entry API to mimic vm_cc behavior	2022-07-18 12:44:01 -07:00
Nobuyoshi Nakada	90a8b1c543	Remove a typo hash [ci skip]	2022-01-29 15:33:13 +09:00
Jemma Issroff	1a180b7e18	Streamline cached attr reader / writer indexes This commit removes the need to increment and decrement the indexes used by vm_cc_attr_index getters and setters. It also introduces a vm_cc_attr_index_p predicate function, and a vm_cc_attr_index_initalize function.	2022-01-26 09:02:59 -08:00
Koichi Sasada	df48db987d	`mandatory_only_cme` should not be in `def` `def` (`rb_method_definition_t`) is shared by multiple callable method entries (cme, `rb_callable_method_entry_t`). There are two issues: * old -> young reference: `cme1->def->mandatory_only_cme = monly_cme` if `cme1` is young and `monly_cme` is young, there is no problem. Howevr, another old `cme2` can refer `def`, in this case, old `cme2` points young `monly_cme` and it violates gengc assumption. * cme can have different `defined_class` but `monly_cme` only has one `defined_class`. It does not make sense and `monly_cme` should be created for a cme (not `def`). To solve these issues, this patch allocates `monly_cme` per `cme`. `cme` does not have another room to store a pointer to the `monly_cme`, so this patch introduces `overloaded_cme_table`, which is weak key map `[cme] -> [monly_cme]`. `def::body::iseqptr::monly_cme` is deleted. The first issue is reported by Alan Wu.	2021-12-21 11:03:09 +09:00
Koichi Sasada	ca21eed6eb	fix assertion on `gc_cc_cme()` `cc->cme_` can be NULL when it is not initialized yet. It can be observed on `GC.stress == true` running.	2021-11-25 13:57:49 +09:00
Koichi Sasada	7ec1fc37f4	add `VM_CALLCACHE_ON_STACK` check if iseq refers to on stack CC (it shouldn't).	2021-11-17 22:21:42 +09:00
Koichi Sasada	8d7116552d	assert `cc->cme_ != NULL` when `vm_cc_markable(cc)`.	2021-11-17 22:21:42 +09:00
Koichi Sasada	b2255153cf	`vm_empty_cc_for_super` Same as `vm_empty_cc`, introduce a global variable which has `.call_ = vm_call_super_method`. Use it if the `cme == NULL` on `vm_search_super_method`.	2021-11-17 22:21:42 +09:00
Koichi Sasada	84aba25031	assert `cc->call_ != NULL`	2021-11-17 22:21:42 +09:00
Koichi Sasada	b1b73936c1	`Primitive.mandatory_only?` for fast path Compare with the C methods, A built-in methods written in Ruby is slower if only mandatory parameters are given because it needs to check the argumens and fill default values for optional and keyword parameters (C methods can check the number of parameters with `argc`, so there are no overhead). Passing mandatory arguments are common (optional arguments are exceptional, in many cases) so it is important to provide the fast path for such common cases. `Primitive.mandatory_only?` is a special builtin function used with `if` expression like that: ```ruby def self.at(time, subsec = false, unit = :microsecond, in: nil) if Primitive.mandatory_only? Primitive.time_s_at1(time) else Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in)) end end ``` and it makes two ISeq, ``` def self.at(time, subsec = false, unit = :microsecond, in: nil) Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in)) end def self.at(time) Primitive.time_s_at1(time) end ``` and (2) is pointed by (1). Note that `Primitive.mandatory_only?` should be used only in a condition of an `if` statement and the `if` statement should be equal to the methdo body (you can not put any expression before and after the `if` statement). A method entry with `mandatory_only?` (`Time.at` on the above case) is marked as `iseq_overload`. When the method will be dispatch only with mandatory arguments (`Time.at(0)` for example), make another method entry with ISeq (2) as mandatory only method entry and it will be cached in an inline method cache. The idea is similar discussed in https://bugs.ruby-lang.org/issues/16254 but it only checks mandatory parameters or more, because many cases only mandatory parameters are given. If we find other cases (optional or keyword parameters are used frequently and it hurts performance), we can extend the feature.	2021-11-15 15:58:56 +09:00
Aaron Patterson	0d63600e4f	Partial revert of `ceebc7fc98` I'm looking through the places where YJIT needs notifications. It looks like these changes to gc.c and vm_callinfo.h have become unnecessary since `84ab77ba59`. This commit just makes the diff against upstream smaller, but otherwise shouldn't change any behavior.	2021-10-20 18:19:36 -04:00
Alan Wu	c378c7a7cb	MicroJIT: generate less code for CFUNCs Added UJIT_CHECK_MODE. Set to 1 to double check method dispatch in generated code. It's surprising to me that we need to watch both cc and cme. There might be opportunities to simplify there.	2021-10-20 18:19:26 -04:00
Nobuyoshi Nakada	cd829bb078	Remove printf family from the mjit header Linking printf family functions makes mjit objects to link unnecessary code.	2021-09-11 08:41:32 +09:00
卜部昌平	daf0c04a47	internal/*.h: skip doxygen These contents are purely implementation details, not worth appearing in CAPI documents. [ci skip]	2021-09-10 20:00:06 +09:00
Koichi Sasada	1ecda21366	global call-cache cache table for rb_funcall* rb_funcall* (rb_funcall(), rb_funcallv(), ...) functions invokes Ruby's method with given receiver. Ruby 2.7 introduced inline method cache with static memory area. However, Ruby 3.0 reimplemented the method cache data structures and the inline cache was removed. Without inline cache, rb_funcall* searched methods everytime. Most of cases per-Class Method Cache (pCMC) will be helped but pCMC requires VM-wide locking and it hurts performance on multi-Ractor execution, especially all Ractors calls methods with rb_funcall. This patch introduced Global Call-Cache Cache Table (gccct) for rb_funcall. Call-Cache was introduced from Ruby 3.0 to manage method cache entry atomically and gccct enables method-caching without VM-wide locking. This table solves the performance issue on multi-ractor execution. [Bug #17497] Ruby-level method invocation does not use gccct because it has inline-method-cache and the table size is limited. Basically rb_funcall* is not used frequently, so 1023 entries can be enough. We will revisit the table size if it is not enough.	2021-01-29 16:22:12 +09:00
Koichi Sasada	a7d933e502	fix conditon of vm_cc_invalidated_p() vm_cc_invalidated_p() returns false when the cme is NOT invalidated.	2021-01-19 14:16:37 +09:00
Koichi Sasada	954d6c7432	remove invalidated cc if cc is invalidated, cc should be released from iseq.	2021-01-06 14:57:48 +09:00
Koichi Sasada	aa6287cd26	fix inline method cache sync bug `cd` is passed to method call functions to method invocation functions, but `cd` can be manipulated by other ractors simultaneously so it contains thread-safety issue. To solve this issue, this patch stores `ci` and found `cc` to `calling` and stops to pass `cd`.	2020-12-15 13:29:30 +09:00
Alan Wu	196eada8c6	mutete -> mutate	2020-10-22 18:20:35 -04:00
卜部昌平	e1e84fbb4f	VM_CI_NEW_ID: USE_EMBED_CI could be false It was a wrong idea to assume CIs are always embedded.	2020-06-09 09:52:46 +09:00
卜部昌平	46728557c1	rb_vm_call0: on-stack call info This changeset reduces the generated binary of rb_vm_call0 from 281 bytes to 211 bytes on my machine. Should reduce GC pressure as well.	2020-06-09 09:52:46 +09:00
卜部昌平	8f3d4090f0	rb_equal_opt: fully static call data This changeset reduces the generated binary of rb_equal_opt from 129 bytes to 17 bytes on my machine, according to nm(1).	2020-06-09 09:52:46 +09:00
卜部昌平	77293cef91	vm_ci_markable: added CIs are created on-the-fly, which increases GC pressure. However they include no references to other objects, and those on-the-fly CIs tend to be short lived. Why not skip allocation of them. In doing so we need to add a flag denotes the CI object does not reside inside of objspace.	2020-06-09 09:52:46 +09:00

1 2

60 Коммитов