github/ruby - ruby

Commit graph

Author	SHA1	Message	Date
Peter Zhu	330830dd1a	Add IMEMO_NEW Rather than exposing that an imemo has a flag and four fields, this changes the implementation to only expose one field (the klass) and fills the rest with 0. The type will have to fill in the values themselves.	2024-02-21 11:33:05 -05:00
John Hawthorn	1c97abaaba	De-dup identical callinfo objects Previously every call to vm_ci_new (when the CI was not packable) would result in a different callinfo being returned. This meant that every kwarg callsite had its own CI. When calling, different CIs result in different CCs. These CIs and CCs both end up persisted on the T_CLASS inside cc_tbl. So in an eval loop this resulted in a memory leak of both types of object. This also likely resulted in extra memory used, and extra time searching, in non-eval cases. For simplicity in this commit I always allocate a CI object inside rb_vm_ci_lookup, but ideally we would lazily allocate it only when needed. I hope to do that as a follow up in the future.	2024-02-20 18:55:00 -08:00
Jeremy Evans	22e488464a	Add VM_CALL_ARGS_SPLAT_MUT callinfo flag This flag is set when the caller has already created a new array to handle a splat, such as for `f(*a, b)` and `f(*a, *b)`. Previously, if `f` was defined as `def f(*a)`, these calls would create an extra array on the callee side, instead of using the new array created by the caller. This modifies `setup_args_core` to set the flag whenever it would add a `splatarray true` instruction. However, when `splatarray true` is changed to `splatarray false` in the peephole optimizer, to avoid unnecessary allocations on the caller side, the flag must be removed. Add `optimize_args_splat_no_copy` and have the peephole optimizer call that. This significantly simplifies the related peephole optimizer code. On the callee side, in `setup_parameters_complex`, set `args->rest_dupped` to true if the flag is set. This takes a similar approach for optimizing regular splats that was previously used for keyword splats in `d2c41b1bff` (via VM_CALL_KW_SPLAT_MUT).	2024-01-24 18:25:55 -08:00
Jeremy Evans	3081c83169	Support tracing of struct member accessor methods This follows the same approach used for attr_reader/attr_writer in `2d98593bf5`, skipping the checking for tracing after the first call using the call cache, and clearing the call cache when tracing is turned on/off. Fixes [Bug #18886]	2023-12-07 10:29:33 -08:00
HParker	c74dc8b4af	Use reference counting to avoid memory leak in kwargs Tracks other callinfo that references the same kwargs and frees them when all references are cleared. [bug #19906] Co-authored-by: Peter Zhu <peter@peterzhu.ca>	2023-10-01 10:55:19 -04:00
Koichi Sasada	cfd7729ce7	use inline cache for refinements From Ruby 3.0, refined method invocations are slow because resolved methods are not cached by inline cache because of a conservative strategy. However, `using` clears all caches so that it seems safe to cache resolved method entries. This patch caches resolved method entries in inline cache and clears all inline method caches when `using` is called. fix [Bug #18572] ```ruby # without refinements class C def foo = :C end N = 1_000_000 obj = C.new require 'benchmark' Benchmark.bm{|x| x.report{N.times{ obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; }} } __END__ user system total real master 0.362859 0.002544 0.365403 ( 0.365424) modified 0.357251 0.000000 0.357251 ( 0.357258) ``` ```ruby # with refinement but without using class C def foo = :C end module R refine C do def foo = :R end end N = 1_000_000 obj = C.new require 'benchmark' Benchmark.bm{|x| x.report{N.times{ obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; }} } __END__ user system total real master 0.957182 0.000000 0.957182 ( 0.957212) modified 0.359228 0.000000 0.359228 ( 0.359238) ``` ```ruby # with using class C def foo = :C end module R refine C do def foo = :R end end N = 1_000_000 using R obj = C.new require 'benchmark' Benchmark.bm{|x| x.report{N.times{ obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; }} }	2023-07-31 17:13:43 +09:00
Koichi Sasada	36023d5cb7	mark `cc->cme_` if it is for `super` `vm_search_super_method()` makes orphan CCs (they are not connected from ccs) and `cc->cme_` can be collected before without marking.	2023-07-31 14:04:31 +09:00
Ruby	6391132c03	fix typo (CACH_ -> CACHE_)	2023-07-28 18:01:42 +09:00
Nobuyoshi Nakada	469e644c93	Compile code for non-embedded CI always	2023-06-30 23:59:04 +09:00
Takashi Kokubun	df1b007fbd	Remove unused VM_CALL_BLOCKISEQ flag	2023-04-01 10:22:47 -07:00
Takashi Kokubun	175538e433	Improve explanation of FCALL and VCALL	2023-04-01 10:17:59 -07:00
Koichi Sasada	c9fd81b860	`vm_call_single_noarg_inline_builtin` If the iseq only contains the `opt_invokebuiltin_delegate_leave` insn and the builtin-function (bf) is inline-able, the caller doesn't need to build a method frame. `vm_call_single_noarg_inline_builtin` is a fast path for such cases.	2023-03-23 14:03:12 +09:00
Takashi Kokubun	2e875549a9	s/MJIT/RJIT/	2023-03-06 23:44:01 -08:00
Yusuke Endoh	2cc3963a00	Prevent wrong integer expansion `(attr_index + 1)` leads to wrong integer expansion on 32-bit machines (including Solaris 10 CI) because `attr_index_t` is uint16_t. http://rubyci.s3.amazonaws.com/solaris10-gcc/ruby-master/log/20221013T080004Z.fail.html.gz ``` 1) Failure: TestRDocClassModule#test_marshal_load_version_2 [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_class_module.rb:493]: <[doc: [doc (file.rb): [para: "this is a comment"]]]> expected but was <[doc: [doc (file.rb): [para: "this is a comment"]]]>. 2) Failure: TestRDocStats#test_report_method_line [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_stats.rb:460]: Expected /\#\ in\ file\ file\.rb:4/ to match "The following items are not documented:\n" + "\n" + " class C # is documented\n" + "\n" + " # in file file.rb\n" + " def m1; end\n" + "\n" + " end\n" + "\n" + "\n". 3) Failure: TestRDocStats#test_report_attr_line [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_stats.rb:91]: Expected /\#\ in\ file\ file\.rb:3/ to match "The following items are not documented:\n" + "\n" + " class C # is documented\n" + "\n" + " attr_accessor :a # in file file.rb\n" + "\n" + " end\n" + "\n" + "\n". ```	2022-10-13 08:14:04 -07:00
Nobuyoshi Nakada	b55e3b842a	Initialize shape attr index also in non-markable CC	2022-10-12 09:14:55 -07:00
Yusuke Endoh	7a9f865a1d	Do not read cached_id from callcache on stack The inline cache is initialized by vm_cc_attr_index_set only when vm_cc_markable(cc). However, vm_getivar attempted to read the cache even if the cc is not vm_cc_markable. This caused a condition that depends on uninitialized value. Here is an output of valgrind: ``` ==10483== Conditional jump or move depends on uninitialised value(s) ==10483== at 0x4C1D60: vm_getivar (vm_insnhelper.c:1171) ==10483== by 0x4C1D60: vm_call_ivar (vm_insnhelper.c:3257) ==10483== by 0x4E8E48: vm_call_symbol (vm_insnhelper.c:3481) ==10483== by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035) ==10483== by 0x4C62B2: vm_exec_core (insns.def:820) ==10483== by 0x4DD519: rb_vm_exec (vm.c:0) ==10483== by 0x4F00B3: invoke_block (vm.c:1417) ==10483== by 0x4F00B3: invoke_iseq_block_from_c (vm.c:1473) ==10483== by 0x4F00B3: invoke_block_from_c_bh (vm.c:1491) ==10483== by 0x4D42B6: rb_yield (vm_eval.c:0) ==10483== by 0x259128: rb_ary_each (array.c:2733) ==10483== by 0x4E8730: vm_call_cfunc_with_frame (vm_insnhelper.c:3227) ==10483== by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035) ==10483== by 0x4C6254: vm_exec_core (insns.def:801) ==10483== by 0x4DD519: rb_vm_exec (vm.c:0) ==10483== ``` In fact, the CI on FreeBSD 12 started failing since `ad63b668e2`. 
``` gmake[1]: Entering directory '/usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby' /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:924:in `complete': undefined method `complete' for nil:NilClass (NoMethodError) from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1816:in `block in visit' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `reverse_each' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `visit' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1847:in `block in complete' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `catch' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `complete' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1640:in `block in parse_in_order' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `catch' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `parse_in_order' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1626:in `order!' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1732:in `permute!' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1757:in `parse!' from ./ext/extmk.rb:359:in `parse_args' from ./ext/extmk.rb:396:in `<main>' ``` This change adds a guard to read the cache only when vm_cc_markable(cc). It might be better to initialize the cache as INVALID_SHAPE_ID when the cc is not vm_cc_markable.	2022-10-12 20:28:24 +09:00
Jemma Issroff	913979bede	Make inline cache reads / writes atomic with object shapes Prior to this commit, we were reading and writing ivar index and shape ID in inline caches in two separate instructions when getting and setting ivars. This meant there was a race condition with ractors and these caches where one ractor could change a value in the cache while another was still reading from it. This commit instead reads and writes shape ID and ivar index to inline caches atomically so there is no longer a race condition. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: John Hawthorn <john@hawthorn.email>	2022-10-11 08:40:56 -07:00
Jemma Issroff	ad63b668e2	Revert "Revert "This commit implements the Object Shapes technique in CRuby."" This reverts commit `9a6803c90b`.	2022-10-11 08:40:56 -07:00
Aaron Patterson	9a6803c90b	Revert "This commit implements the Object Shapes technique in CRuby." This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.	2022-09-30 16:01:50 -07:00
Jemma Issroff	d594a5a8bd	This commit implements the Object Shapes technique in CRuby. Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape. For example: ```ruby class Foo def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end class Bar def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end foo = Foo.new # `foo` has shape id 2 bar = Bar.new # `bar` has shape id 2 ``` Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order. This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers. This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details. For more context on Object Shapes, see [Feature: #18776] Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com> Co-Authored-By: John Hawthorn <john@hawthorn.email>	2022-09-28 08:26:21 -07:00
Aaron Patterson	06abfa5be6	Revert this until we can figure out WB issues or remove shapes from GC Revert "* expand tabs. [ci skip]" This reverts commit `830b5b5c35`. Revert "This commit implements the Object Shapes technique in CRuby." This reverts commit `9ddfd2ca00`.	2022-09-26 16:10:11 -07:00
Jemma Issroff	9ddfd2ca00	This commit implements the Object Shapes technique in CRuby. Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape. For example: ```ruby class Foo def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end class Bar def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end foo = Foo.new # `foo` has shape id 2 bar = Bar.new # `bar` has shape id 2 ``` Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order. This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers. This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details. For more context on Object Shapes, see [Feature: #18776] Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com> Co-Authored-By: John Hawthorn <john@hawthorn.email>	2022-09-26 09:21:30 -07:00
Jemma Issroff	ecff334995	Extract vm_ic_entry API to mimic vm_cc behavior	2022-07-18 12:44:01 -07:00
Nobuyoshi Nakada	90a8b1c543	Remove a typo hash [ci skip]	2022-01-29 15:33:13 +09:00
Jemma Issroff	1a180b7e18	Streamline cached attr reader / writer indexes This commit removes the need to increment and decrement the indexes used by vm_cc_attr_index getters and setters. It also introduces a vm_cc_attr_index_p predicate function, and a vm_cc_attr_index_initalize function.	2022-01-26 09:02:59 -08:00
Koichi Sasada	df48db987d	`mandatory_only_cme` should not be in `def` `def` (`rb_method_definition_t`) is shared by multiple callable method entries (cme, `rb_callable_method_entry_t`). There are two issues: * old -> young reference: `cme1->def->mandatory_only_cme = monly_cme` if `cme1` is young and `monly_cme` is young, there is no problem. However, another old `cme2` can refer to `def`; in this case, old `cme2` points to young `monly_cme` and it violates the gengc assumption. * cme can have different `defined_class` but `monly_cme` only has one `defined_class`. It does not make sense and `monly_cme` should be created for a cme (not `def`). To solve these issues, this patch allocates `monly_cme` per `cme`. `cme` does not have another room to store a pointer to the `monly_cme`, so this patch introduces `overloaded_cme_table`, which is a weak key map `[cme] -> [monly_cme]`. `def::body::iseqptr::monly_cme` is deleted. The first issue is reported by Alan Wu.	2021-12-21 11:03:09 +09:00
Koichi Sasada	ca21eed6eb	fix assertion on `gc_cc_cme()` `cc->cme_` can be NULL when it is not initialized yet. It can be observed when running with `GC.stress == true`.	2021-11-25 13:57:49 +09:00
Koichi Sasada	7ec1fc37f4	add `VM_CALLCACHE_ON_STACK` check if iseq refers to on stack CC (it shouldn't).	2021-11-17 22:21:42 +09:00
Koichi Sasada	8d7116552d	assert `cc->cme_ != NULL` when `vm_cc_markable(cc)`.	2021-11-17 22:21:42 +09:00
Koichi Sasada	b2255153cf	`vm_empty_cc_for_super` Same as `vm_empty_cc`, introduce a global variable which has `.call_ = vm_call_super_method`. Use it if the `cme == NULL` on `vm_search_super_method`.	2021-11-17 22:21:42 +09:00
Koichi Sasada	84aba25031	assert `cc->call_ != NULL`	2021-11-17 22:21:42 +09:00
Koichi Sasada	b1b73936c1	`Primitive.mandatory_only?` for fast path Compared with C methods, a built-in method written in Ruby is slower if only mandatory parameters are given because it needs to check the arguments and fill default values for optional and keyword parameters (C methods can check the number of parameters with `argc`, so there is no overhead). Passing only mandatory arguments is common (optional arguments are exceptional, in many cases) so it is important to provide the fast path for such common cases. `Primitive.mandatory_only?` is a special builtin function used with an `if` expression like this: ```ruby def self.at(time, subsec = false, unit = :microsecond, in: nil) if Primitive.mandatory_only? Primitive.time_s_at1(time) else Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in)) end end ``` and it makes two ISeq, ``` def self.at(time, subsec = false, unit = :microsecond, in: nil) Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in)) end def self.at(time) Primitive.time_s_at1(time) end ``` and (2) is pointed by (1). Note that `Primitive.mandatory_only?` should be used only in a condition of an `if` statement and the `if` statement should be equal to the method body (you can not put any expression before and after the `if` statement). A method entry with `mandatory_only?` (`Time.at` on the above case) is marked as `iseq_overload`. When the method is dispatched only with mandatory arguments (`Time.at(0)` for example), make another method entry with ISeq (2) as mandatory only method entry and it will be cached in an inline method cache. The idea is similar to the one discussed in https://bugs.ruby-lang.org/issues/16254 but it only checks mandatory parameters or more, because in many cases only mandatory parameters are given. If we find other cases (optional or keyword parameters are used frequently and it hurts performance), we can extend the feature.	2021-11-15 15:58:56 +09:00
Aaron Patterson	0d63600e4f	Partial revert of `ceebc7fc98` I'm looking through the places where YJIT needs notifications. It looks like these changes to gc.c and vm_callinfo.h have become unnecessary since `84ab77ba59`. This commit just makes the diff against upstream smaller, but otherwise shouldn't change any behavior.	2021-10-20 18:19:36 -04:00
Alan Wu	c378c7a7cb	MicroJIT: generate less code for CFUNCs Added UJIT_CHECK_MODE. Set to 1 to double check method dispatch in generated code. It's surprising to me that we need to watch both cc and cme. There might be opportunities to simplify there.	2021-10-20 18:19:26 -04:00
Nobuyoshi Nakada	cd829bb078	Remove printf family from the mjit header Linking printf family functions makes mjit objects link unnecessary code.	2021-09-11 08:41:32 +09:00
卜部昌平	daf0c04a47	internal/*.h: skip doxygen These contents are purely implementation details, not worth appearing in CAPI documents. [ci skip]	2021-09-10 20:00:06 +09:00
Koichi Sasada	1ecda21366	global call-cache cache table for rb_funcall* rb_funcall* (rb_funcall(), rb_funcallv(), ...) functions invoke Ruby's method with a given receiver. Ruby 2.7 introduced inline method cache with static memory area. However, Ruby 3.0 reimplemented the method cache data structures and the inline cache was removed. Without inline cache, rb_funcall* searched methods every time. In most cases the per-Class Method Cache (pCMC) helps, but pCMC requires VM-wide locking and it hurts performance on multi-Ractor execution, especially when all Ractors call methods with rb_funcall. This patch introduced Global Call-Cache Cache Table (gccct) for rb_funcall. Call-Cache was introduced from Ruby 3.0 to manage method cache entry atomically and gccct enables method-caching without VM-wide locking. This table solves the performance issue on multi-ractor execution. [Bug #17497] Ruby-level method invocation does not use gccct because it has inline-method-cache and the table size is limited. Basically rb_funcall* is not used frequently, so 1023 entries can be enough. We will revisit the table size if it is not enough.	2021-01-29 16:22:12 +09:00
Koichi Sasada	a7d933e502	fix condition of vm_cc_invalidated_p() vm_cc_invalidated_p() returns false when the cme is NOT invalidated.	2021-01-19 14:16:37 +09:00
Koichi Sasada	954d6c7432	remove invalidated cc if cc is invalidated, cc should be released from iseq.	2021-01-06 14:57:48 +09:00
Koichi Sasada	aa6287cd26	fix inline method cache sync bug `cd` is passed through method call functions to method invocation functions, but `cd` can be manipulated by other ractors simultaneously, so this is a thread-safety issue. To solve this issue, this patch stores `ci` and the found `cc` in `calling` and stops passing `cd`.	2020-12-15 13:29:30 +09:00
Alan Wu	196eada8c6	mutete -> mutate	2020-10-22 18:20:35 -04:00
卜部昌平	e1e84fbb4f	VM_CI_NEW_ID: USE_EMBED_CI could be false It was a wrong idea to assume CIs are always embedded.	2020-06-09 09:52:46 +09:00
卜部昌平	46728557c1	rb_vm_call0: on-stack call info This changeset reduces the generated binary of rb_vm_call0 from 281 bytes to 211 bytes on my machine. Should reduce GC pressure as well.	2020-06-09 09:52:46 +09:00
卜部昌平	8f3d4090f0	rb_equal_opt: fully static call data This changeset reduces the generated binary of rb_equal_opt from 129 bytes to 17 bytes on my machine, according to nm(1).	2020-06-09 09:52:46 +09:00
卜部昌平	77293cef91	vm_ci_markable: added CIs are created on-the-fly, which increases GC pressure. However they include no references to other objects, and those on-the-fly CIs tend to be short lived. Why not skip allocating them? In doing so we need to add a flag that denotes the CI object does not reside inside of objspace.	2020-06-09 09:52:46 +09:00
Nobuyoshi Nakada	a9e0046c26	Moved vm_empty_cc to local in vm.c [Bug #16934 ] Missed to commit a staged change.	2020-06-04 17:13:56 +09:00
卜部昌平	4ff3f20540	add #include guard hack According to MSVC manual (1), cl.exe can skip including a header file when that: - contains #pragma once, or - starts with #ifndef, or - starts with #if ! defined. GCC has a similar trick (2), but it is stricter (e. g. there must be _no tokens_ outside of #ifndef...#endif). Sun C lacked #pragma once for a looong time. Oracle Developer Studio 12.5 finally implemented it, but we cannot assume such a recent version. This changeset modifies header files so that each of them includes strictly one #ifndef...#endif. I believe this is the most portable way to trigger compiler optimizations. [Bug #16770] 1: https://docs.microsoft.com/en-us/cpp/preprocessor/once 2: https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html	2020-04-13 16:06:00 +09:00
Alan Wu	82fdffc5ec	Avoid UB with flexible array member Accessing past the end of an array is technically UB. Use C99 flexible array member instead to avoid the UB and simplify allocation size calculation. See also: DCL38-C in the SEI CERT C Coding Standard	2020-04-12 15:19:06 -04:00
卜部昌平	9e6e39c351	Merge pull request #2991 from shyouhei/ruby.h Split ruby.h	2020-04-08 13:28:13 +09:00
Jeremy Evans	d2c41b1bff	Reduce allocations for keyword argument hashes Previously, passing a keyword splat to a method always allocated a hash on the caller side, and accepting arbitrary keywords in a method allocated a separate hash on the callee side. Passing explicit keywords to a method that accepted a keyword splat did not allocate a hash on the caller side, but resulted in two hashes allocated on the callee side. This commit makes passing a single keyword splat to a method not allocate a hash on the caller side. Passing multiple keyword splats or a mix of explicit keywords and a keyword splat still generates a hash on the caller side. On the callee side, if arbitrary keywords are not accepted, it does not allocate a hash. If arbitrary keywords are accepted, it will allocate a hash, but this commit uses a callinfo flag to indicate whether the caller already allocated a hash, and if so, the callee can use the passed hash without duplicating it. So this commit should make it so that a maximum of a single hash is allocated during method calls. To set the callinfo flag appropriately, method call argument compilation checks if only a single keyword splat is given. If only one keyword splat is given, the VM_CALL_KW_SPLAT_MUT callinfo flag is not set, since in that case the keyword splat is passed directly and not mutable. If more than one splat is used, a new hash needs to be generated on the caller side, and in that case the callinfo flag is set, indicating the keyword splat is mutable by the callee. In compile_hash, used for both hash and keyword argument compilation, if compiling keyword arguments and only a single keyword splat is used, pass the argument directly. On the caller side, in vm_args.c, the callinfo flag needs to be recognized and handled. Because the keyword splat argument may not be a hash, it needs to be converted to a hash first if not. Then, unless the callinfo flag is set, the hash needs to be duplicated. 
The temporary copy of the callinfo flag, kw_flag, is updated if a hash was duplicated, to prevent the need to duplicate it again. If we are converting to a hash or duplicating a hash, we need to update the argument array, which can include duplicating the positional splat array if one was passed. CALLER_SETUP_ARG and a couple of other places need to be modified to handle similar issues for other types of calls. This includes fairly comprehensive tests for different ways keywords are handled internally, checking that you get equal results but that keyword splats on the caller side result in distinct objects for keyword rest parameters. Included are benchmarks for keyword argument calls. Brief results when compiled without optimization: def kw(a: 1) a end def kws(**kw) kw end h = {a: 1} kw(a: 1) # about same kw(**h) # 2.37x faster kws(a: 1) # 1.30x faster kws(**h) # 2.19x faster kw(a: 1, **h) # 1.03x slower kw(**h, **h) # about same kws(a: 1, **h) # 1.16x faster kws(**h, **h) # 1.14x faster	2020-03-17 12:09:43 -07:00
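
The splat-related commits in this log (`d2c41b1bff` for keyword splats, `22e488464a` for positional splats) only change where the copy is allocated; the behavior visible from Ruby code is meant to stay the same. A minimal sketch of that visible behavior follows. The method names `kws_mutate` and `rest_identity` are hypothetical and chosen only for illustration.

```ruby
# Hypothetical helper: accepts arbitrary keywords and mutates the
# callee-side hash before returning it.
def kws_mutate(**kw)
  kw[:added] = true
  kw
end

# Hypothetical helper: returns the callee-side rest array unchanged.
def rest_identity(*rest)
  rest
end

h = { a: 1 }
result = kws_mutate(**h)

# The caller's hash is untouched, and the callee received a distinct object.
puts h.key?(:added)      # false
puts result.equal?(h)    # false

arr = [1, 2]
# Equal contents, but a distinct array object on the callee side.
puts rest_identity(*arr) == arr       # true
puts rest_identity(*arr).equal?(arr)  # false
```

However many intermediate hashes or arrays the VM allocates, the callee must never be handed the caller's own object, which is why the flags described in those commits only let the callee skip a copy when the caller has already made one.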


55 commits
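
The de-duplication idea from commit `1c97abaaba` (return one shared callinfo for identical call sites instead of allocating a new one each time) can be sketched as a toy interning table in Ruby. This is an illustration under assumptions, not CRuby's implementation: `CallInfoSketch` and `CallInfoTable` are hypothetical names, and the real lookup lives in C.

```ruby
# Toy model only: a record with the fields a callinfo conceptually carries.
CallInfoSketch = Struct.new(:mid, :flag, :argc, :kwargs)

# Hypothetical interning table: identical field values yield the same object.
class CallInfoTable
  def initialize
    @table = {}
  end

  # Return the existing record for these values, creating it on first use.
  def lookup(mid, flag, argc, kwargs)
    kwargs = kwargs.dup.freeze            # keep the key stable against later mutation
    key = [mid, flag, argc, kwargs].freeze
    @table[key] ||= CallInfoSketch.new(mid, flag, argc, kwargs).freeze
  end

  def size
    @table.size
  end
end

table = CallInfoTable.new
a = table.lookup(:foo, 0, 1, [:k])
b = table.lookup(:foo, 0, 1, [:k])
c = table.lookup(:bar, 0, 1, [:k])

puts a.equal?(b)  # true: the second lookup reuses the first record
puts a.equal?(c)  # false: a different method id gets its own record
puts table.size   # 2
```

Because repeated lookups return the same object, anything keyed on the record (call caches, in the commit's description) is also shared rather than accumulating per call site.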