Commit graph

1023 commits

Author SHA1 Message Date
Aaron Patterson 2324584d46 Update vm_insnhelper.c
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2020-11-09 09:44:16 -08:00
Aaron Patterson 5582c5a232 Remove iv table size check
iv tables cannot shrink.  If the inline cache was ever set, then there
must be an entry for the instance variable in the iv table.  Just set
the iv list on the object to be equal to the iv index table size, then
set the iv.
2020-11-09 09:44:16 -08:00
Aaron Patterson eb229994e5 eagerly initialize ivar table when index is small enough
When the inline cache is written, the iv table will contain an entry for
the instance variable.  If we get an inline cache hit, then we know the
iv table must contain a value for the index written to the inline cache.

If the index in the inline cache is larger than the list on the object,
but *smaller* than the iv index table on the class, then we can just
eagerly allocate the iv list to be the same size as the iv index table.

This avoids the duplicate work of checking the frozen status as well as
looking up the index for the particular instance variable name.
2020-11-09 09:44:16 -08:00
Koichi Sasada 5d97bdc2dc Ractor.make_shareable(a_proc)
Ractor.make_shareable() supports a Proc object if
(1) the Proc only reads outer local variables (no assignments), and
(2) the outer local variables it reads are shareable.

The read local variables are stored in a snapshot, so after making
the Proc shareable, later assignments do not affect it, like this:

```ruby
a = 1
pr = Ractor.make_shareable(Proc.new{p a})
pr.call #=> 1
a = 2
pr.call #=> 1 # `a = 2` doesn't affect
```

[Feature #17284]
2020-10-30 03:12:09 +09:00
Jeremy Evans ff2276ef8c Fix bootstrap-test error in previous commit 2020-10-25 15:56:10 -07:00
Koichi Sasada f6661f5085 sync RClass::ext::iv_index_tbl
iv_index_tbl manages instance variable indexes (ID -> index).
This data structure should be synchronized across ractors,
so introduce some VM locks.

This patch also introduces an atomic ivar cache used by the
set/getinlinecache instructions. To make ivar cache (IVC) updates
atomic, the iv_index_tbl data structure is changed to manage
(ID -> entry), where an entry holds a serial and an index. The IVC
points to this entry so that cache updates become atomic.
2020-10-17 08:18:04 +09:00
Alan Wu 0d17cdd0ac Abort on system stack overflow during GC
Buggy native extensions could have mark functions that cause stack
overflow. When a stack overflow happens during GC, Ruby used to recover
by raising an exception, which runs the interpreter. It's not safe to
run the interpreter during GC since the GC is in an inconsistent state.
This could cause object allocation during GC, for example.

Instead of running the interpreter and potentially causing a crash down
the line, fail fast and abort.
2020-10-16 10:24:12 -04:00
Koichi Sasada fad97f1f96 sync generic_ivtbl
generic_ivtbl is a process global table to maintain instance variables
for non T_OBJECT/T_CLASS/... objects. So we need to protect them
for multi-Ractor execution.

Hint: we can make them Ractor-local for unshareable objects, but
      for now that is premature optimization.
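
A minimal sketch (mine, not from the commit) of the kind of object this table serves: instance variables on non-T_OBJECT values such as Strings live in the process-global generic_ivtbl rather than in the object slot itself, which is why the table needs synchronization once multiple Ractors exist.

```ruby
# Hypothetical illustration: an ivar on a String (T_STRING, not T_OBJECT)
# is kept in the generic ivar table, not inside the object itself.
s = +"hello"
s.instance_variable_set(:@tag, :greeting)
p s.instance_variable_get(:@tag) #=> :greeting
```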
2020-10-14 16:36:55 +09:00
eileencodes 8dd9a23693 Make minor improvements to super
The changes here include:

* Using `FL_TEST_RAW` instead of `FL_TEST` in the first check in
`vm_search_super_method`. While the profile showed us spending a fair
amount of time here, the subsequent benchmarks didn't show much
improvement when adding this. Regardless, we know this does less work
than `FL_TEST` and we know that `FL_TEST_RAW` is safe due to the
previous check so it's a small but accurate optimization.
* Set `mid` only once. Both `vm_ci_new_runtime` and `vm_ci_mid` were
getting the `original_id` for the method entry. We can do this once
and pass the variable to the 2 callers that need it. This also doesn't
have a huge performance improvement but cleans up the code a bit.

Benchmark:

```
|                 |compare-ruby|built-ruby|
|:----------------|-----------:|---------:|
|vm_iclass_super  |      3.540M|    3.940M|
|                 |           -|     1.11x|
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2020-10-01 10:11:02 -07:00
Koichi Sasada caaa36b4e6 prohibit method calls by define_method in other ractors
We cannot call a non-isolated Proc in multiple ractors.
2020-09-25 20:37:38 +09:00
eileencodes 637d1cc0c0 Improve the performance of super
This PR improves the performance of `super` calls. While working on some
Rails optimizations jhawthorn discovered that `super` calls were slower
than expected.

The changes here do the following:

1) Adds a check for whether the call frame is not equal to the method
entry iseq. This avoids the `rb_obj_is_kind_of` check on the next line
which is quite slow. If the current call frame is equal to the method
entry we know we can't have an instance eval, etc.
2) Changes `FL_TEST` to `FL_TEST_RAW`. This is safe because we've
already done the check for `T_ICLASS` above.
3) Adds a benchmark for `T_ICLASS` super calls.
4) Note: makes a change for `method_entry_cref` to use `const`.

On master the benchmarks showed that `super` is 1.76x slower. Our
changes improved the performance so that it is now only 1.36x slower.

Benchmark IPS:

```
Warming up --------------------------------------
               super   244.918k i/100ms
         method call   383.007k i/100ms
Calculating -------------------------------------
               super      2.280M (± 6.7%) i/s -     11.511M in   5.071758s
         method call      3.834M (± 4.9%) i/s -     19.150M in   5.008444s

Comparison:
         method call:  3833648.3 i/s
               super:  2279837.9 i/s - 1.68x  (± 0.00) slower
```

With changes:

```
Warming up --------------------------------------
               super   308.777k i/100ms
         method call   375.051k i/100ms
Calculating -------------------------------------
               super      2.951M (± 5.4%) i/s -     14.821M in   5.039592s
         method call      3.551M (± 4.9%) i/s -     18.002M in   5.081695s

Comparison:
         method call:  3551372.7 i/s
               super:  2950557.9 i/s - 1.20x  (± 0.00) slower
```

Ruby VM benchmarks also showed an improvement:

Existing `vm_super` benchmark:

```
$ make benchmark ITEM=vm_super

|          |compare-ruby|built-ruby|
|:---------|-----------:|---------:|
|vm_super  |     21.555M|   37.819M|
|          |           -|     1.75x|
```

New `vm_iclass_super` benchmark:

```
$ make benchmark ITEM=vm_iclass_super

|                 |compare-ruby|built-ruby|
|:----------------|-----------:|---------:|
|vm_iclass_super  |      1.669M|    3.683M|
|                 |           -|     2.21x|
```

This is the benchmark script used for the benchmark-ips benchmarks:

```ruby
require "benchmark/ips"

class Foo
  def zuper; end
  def top; end

  last_method = "top"

  ("A".."M").each do |module_name|
    eval <<-EOM
    module #{module_name}
      def zuper; super; end
      def #{module_name.downcase}
        #{last_method}
      end
    end
    prepend #{module_name}
    EOM
    last_method = module_name.downcase
  end
end

foo = Foo.new

Benchmark.ips do |x|
  x.report "super" do
    foo.zuper
  end

  x.report "method call" do
    foo.m
  end

  x.compare!
end
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Co-authored-by: John Hawthorn <john@hawthorn.email>
2020-09-23 11:52:36 -07:00
Jeremy Evans 179384a668 Revert "Prevent SystemStackError when calling super in module with activated refinement"
This reverts commit eeef16e190.

This also reverts the spec change.

Preventing the SystemStackError would be nice, but there is valid
code that the fix breaks, and it is probably more common than cases
that cause the SystemStackError.

Fixes [Bug #17182]
2020-09-22 12:04:48 -07:00
Benoit Daloze 9b535f3ff7 Interpolated strings are no longer frozen with frozen-string-literal: true
* Remove freezestring instruction since this was the only usage for it.
* [Feature #17104]
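A hedged sketch (mine, not from the commit) of the resulting behavior under the `# frozen_string_literal: true` magic comment: plain literals stay frozen, interpolated ones no longer are.

```ruby
# frozen_string_literal: true
name = "world"
p "plain".frozen?          #=> true
p "hello #{name}".frozen?  #=> false after this change (was true)
```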
2020-09-15 21:32:35 +02:00
Koichi Sasada 79df14c04b Introduce Ractor mechanism for parallel execution
This commit introduces Ractor mechanism to run Ruby program in
parallel. See doc/ractor.md for more details about Ractor.
See ticket [Feature #17100] to see the implementation details
and discussions.

[Feature #17100]

This commit does not complete the implementation. You may find
many bugs when using Ractor. Also, the specification may still
change, so this feature is experimental. You will see a warning
when you make the first Ractor with `Ractor.new`.

I hope this feature can help protect programmers from thread-safety issues.
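
A minimal usage sketch (mine, not part of the commit); the experimental warning mentioned above is printed on the first `Ractor.new`:

```ruby
# A Ractor runs its block in parallel; arguments are passed explicitly
# and the result is retrieved with #take.
r = Ractor.new(21) { |n| n * 2 }
p r.take #=> 42
```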
2020-09-03 21:11:06 +09:00
Alan Wu 4c3f0597de Remove the pc argument of vm_trace()
This makes the binary 272 bytes smaller on -O3 GCC 10.2.0.
2020-09-01 22:02:29 -04:00
Jeremy Evans c60aaed185
Fix Method#super_method for aliased methods
Previously, Method#super_method looked at the called_id to
determine the method id to use, but that isn't correct for
aliased methods, because the super target depends on the
original method id, not the called_id.

Additionally, aliases can reference methods defined in other
classes and modules, and super lookup needs to start in the
super of the defined class in such cases.

This adds tests for Method#super_method for both types of
aliases, one that uses VM_METHOD_TYPE_ALIAS and another that
does not.  Both check that the results for calling super
methods return the expected values.

To find the defined class for alias methods, add an rb_ prefix
to find_defined_class_by_owner in vm_insnhelper.c and make it
non-static, so that it can be called from method_super_method
in proc.c.

This bug was originally discovered while researching [Bug #11189].

Fixes [Bug #17130]
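
A hedged sketch (mine, with hypothetical classes A and B) of the aliased case this addresses: the super target should follow the original method id (:name), not the alias's called_id (:label).

```ruby
class A
  def name; "A"; end
end

class B < A
  def name; "B"; end
  alias label name
end

m = B.new.method(:label)
p m.super_method       # expected to be A#name after this fix
p m.super_method&.call #=> "A"
```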
2020-08-27 08:37:03 -07:00
Jeremy Evans eeef16e190 Prevent SystemStackError when calling super in module with activated refinement
Without this, if a refinement defines a method that calls super and
includes a module with a method that calls super and has an activated
refinement at the point super is called, the module method's super call
will end up calling back into the refinement method, creating a loop.

Fixes [Bug #17007]
2020-07-27 08:18:11 -07:00
Nobuyoshi Nakada a82252df42
Fixed another typo 2020-07-10 12:48:47 +09:00
Nobuyoshi Nakada 234f8eee33
Fixed typos 2020-07-10 12:32:48 +09:00
卜部昌平 e4ee992099 vm_push_frame_debug_counter_inc: use branches
Ko1 doesn't like previous code.
2020-07-10 12:23:41 +09:00
卜部昌平 0e276dc458 vm_push_frame: move assignments around
Struct assignment using a compound literal is more readable than before,
to me at least.  It seems compilers reorder assignments anyways.
Neither speedup nor slowdown is observed on my machine.
2020-07-10 12:23:41 +09:00
卜部昌平 4b8170ce80 vm_push_frame: move assertions out of the function
These assertions are purely static.  They need not be checked on the fly.
2020-07-10 12:23:41 +09:00
卜部昌平 1d93705d6a vm_push_frame: hoist out debug codes
Made it a bit more readable.
2020-07-10 12:23:41 +09:00
卜部昌平 db7f3496dd nobody uses the return value of vm_push_frame
Surprised to see such a waste of time in this super duper hot path.
2020-07-10 12:23:41 +09:00
Koichi Sasada a0f12a0258
Use ID instead of GENTRY for gvars. (#3278)
Use ID instead of GENTRY for gvars.

Global variables are compiled into GENTRY (a pointer to struct
rb_global_entry). This patch replaces this GENTRY with ID and
makes the code simpler.

We need to search for the GENTRY from the ID every time (st_lookup), so
additional overhead is introduced.
However, the performance of accessing global variables is not
important nowadays, and this simplicity helps Ractor development.
2020-07-03 16:56:44 +09:00
Nobuyoshi Nakada 52ef2477e4
Extracted METHOD_ENTRY_CACHEABLE macro 2020-06-30 19:12:02 +09:00
卜部昌平 1bf0d36171 vm_getivar: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
Takashi Kokubun 7982dc1dfd
Decide JIT-ed insn based on cached cfunc
for opt_* insns.

opt_eq handles rb_obj_equal inside opt_eq, and all other cfuncs are
handled by opt_send_without_block. Therefore we can't decide which insn
should be generated just by checking whether it's a cfunc cc or not.

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-06-26T05:21:43Z master 9dbc2294a6) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-26T06:30:18Z master 75cece1b0b) +JIT [x86_64-linux]
last_commit=Decide JIT-ed insn based on cached cfunc
Calculating -------------------------------------
                     before --jit  after --jit
        mjit_nil?(1)      73.878M      74.021M i/s -     40.000M times in 0.541432s 0.540391s
         mjit_not(1)      72.635M      74.601M i/s -     40.000M times in 0.550702s 0.536187s
     mjit_eq(1, nil)       7.331M       7.445M i/s -      8.000M times in 1.091211s 1.074596s
     mjit_eq(nil, 1)      49.450M      64.711M i/s -      8.000M times in 0.161781s 0.123627s

Comparison:
                     mjit_nil?(1)
         after --jit:  74020528.4 i/s
        before --jit:  73878185.9 i/s - 1.00x  slower

                      mjit_not(1)
         after --jit:  74600882.0 i/s
        before --jit:  72634507.6 i/s - 1.03x  slower

                  mjit_eq(1, nil)
         after --jit:   7444657.4 i/s
        before --jit:   7331304.3 i/s - 1.02x  slower

                  mjit_eq(nil, 1)
         after --jit:  64710790.6 i/s
        before --jit:  49449507.4 i/s - 1.31x  slower
```
2020-06-25 23:33:08 -07:00
Takashi Kokubun d9f608b686
Verify builtin inline annotation with VM_CHECK_MODE (#3244)
* Verify builtin inline annotation with VM_CHECK_MODE

* Remove static to fix the link issue on MJIT
2020-06-21 10:27:04 -07:00
Takashi Kokubun 426db4cd90
Fix -Wmaybe-uninitialized at vm_invoke_block 2020-06-21 00:34:39 -07:00
Nobuyoshi Nakada ccb7a4b9f2
Replaced accessors of `Struct` with `invokebuiltin` 2020-06-17 08:18:46 +09:00
Nobuyoshi Nakada 318d52e820
Revert "Replaced accessors of `Struct` with `invokebuiltin`"
This reverts commit 19cabe8b09,
which didn't support tool/lib/iseq_loader_checker.rb.
2020-06-16 18:44:58 +09:00
Nobuyoshi Nakada 19cabe8b09
Replaced accessors of `Struct` with `invokebuiltin` 2020-06-16 18:24:02 +09:00
卜部昌平 5648976c3c vm_call_method: avoid marking on-stack object
This callcache is on the stack and must not be GCed.  However its contents
are copied from other materials, which can be an ordinary object.  Should
set a flag to make sure it is properly skipped by the GC.
2020-06-10 10:22:39 +09:00
卜部昌平 1016cff4ff rb_eql_opt,rb_equal_opt: purge stale cc
When on USE_EMBED_CI, cd is stored statically.  A previous use could cache
a stale cd->cc, which could have already been GCed.  Need to flush them.
2020-06-09 09:52:46 +09:00
卜部昌平 ffe58b9c8b vm_ccs_push: do not cache non-heap entries
Entries not GC-able must be considered volatile.  Not eligible for
later use.
2020-06-09 09:52:46 +09:00
卜部昌平 e1e84fbb4f VM_CI_NEW_ID: USE_EMBED_CI could be false
It was a wrong idea to assume CIs are always embedded.
2020-06-09 09:52:46 +09:00
卜部昌平 324038c66e eliminate C99 compound literals
Ko1 prefers variables be assigned, instead of bare literals in function
arguments.
2020-06-09 09:52:46 +09:00
卜部昌平 4fbe86d0e2 vm_call_method: use struct assignment
This further reduces the generated binary of vm_call_method from 566
bytes to 545 bytes on my machine, according to nm(1).
2020-06-09 09:52:46 +09:00
卜部昌平 46728557c1 rb_vm_call0: on-stack call info
This changeset reduces the generated binary of rb_vm_call0 from 281
bytes to 211 bytes on my machine.  Should reduce GC pressure as well.
2020-06-09 09:52:46 +09:00
卜部昌平 db406daa60 vm_yield_setup_args: refactor use macro 2020-06-09 09:52:46 +09:00
卜部昌平 367263c3dd vm_call_method: no call vm_cc_fill
This changeset reduces the generated binary of vm_call_method from 600
bytes to 566 bytes on my machine, according to nm(1).
2020-06-09 09:52:46 +09:00
卜部昌平 fb3f1f95e8 vm_call_refined: no call vm_cc_fill
This changeset reduces the generated binary of vm_call_method_each_type
from 2,442 bytes to 2,378 bytes on my machine, according to nm(1).
2020-06-09 09:52:46 +09:00
卜部昌平 be5dfdd8a2 vm_call_zsuper: no call vm_cc_fill
This changeset reduces the generated binary of vm_call_method_each_type
from 2,522 bytes to 2,442 bytes on my machine, according to nm(1).
2020-06-09 09:52:46 +09:00
卜部昌平 dbbde61cef vm_call_method_missing_body: on-stack call info
This changeset reduces the generated binary of
vm_call_method_missing_body from 604 bytes to 532 bytes on my machine.
Should reduce GC pressure as well.
2020-06-09 09:52:46 +09:00
卜部昌平 9c287f8caa vm_call_symbol: on-stack call info
This changeset reduces the generated binary of vm_call_symbol from 808
bytes to 798 bytes on my machine.  Should reduce GC pressure as well.
2020-06-09 09:52:46 +09:00
卜部昌平 62b471bd44 vm_call_alias: no call vm_cc_fill
This changeset reduces the generated binary of vm_call_alias from 188
bytes to 149 bytes on my machine, according to nm(1).
2020-06-09 09:52:46 +09:00
卜部昌平 97f456374d rb_eql_opt: fully static call data
This changeset reduces the generated binary of rb_eql_opt from 86 bytes to
20 bytes on my machine, according to nm(1).
2020-06-09 09:52:46 +09:00
卜部昌平 3da9c51973 rb_vm_search_method_slowpath: skip vm_empty_cc
vm_empty_cc is now statically allocated outside of the object
space, so it shall not be GCed.  Because vm_search_cc can return
it, it must not be blindly passed to RB_OBJ_WRITE; otherwise assertions
fail on RGENGC_CHECK_MODE, like this:

-- C level backtrace information
-------------------------------------------
ruby(rb_print_backtrace+0x19) [0x5555557fd579] vm_dump.c:757
ruby(rb_vm_bugreport+0x151) [0x5555557fd6f1] vm_dump.c:955
ruby(rb_bug+0x1d6) [0x5555558d6396] error.c:660
ruby(check_rvalue_consistency_force+0x707) [0x5555555adb97] gc.c:1289
ruby(check_rvalue_consistency+0x1a) [0x555555598a0a] gc.c:1305
ruby(RVALUE_OLD_P+0x15) [0x5555555975d5] gc.c:1382
ruby(rb_gc_writebarrier+0x9f) [0x55555559753f] gc.c:6882
ruby(rb_obj_written+0x3a) [0x5555557a025a] include/ruby/internal/rgengc.h:180
ruby(rb_obj_write+0x41) [0x5555557a1a81] include/ruby/internal/rgengc.h:195
ruby(rb_vm_search_method_slowpath+0x5a) [0x5555557a125a] vm_insnhelper.c:1603
ruby(vm_search_method_fastpath+0x197) [0x5555557d8027] vm_insnhelper.c:1638
ruby(vm_search_method+0xea) [0x5555557d7d2a] vm_insnhelper.c:1650
ruby(vm_search_method_wrap+0x29) [0x5555557dbaf9] vm_insnhelper.c:4091
ruby(vm_sendish+0xa9) [0x5555557dba39] vm_insnhelper.c:4143
ruby(vm_exec_core+0xe357) [0x5555557b0757] insns.def:801
ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942
ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058
ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130
ruby(invoke_block_from_c_bh) vm.c:1148
ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193
ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141
ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147
ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157
ruby(rb_ary_collect+0xb0) [0x555555828320] array.c:3186
ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385
ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553
ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146
ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782
ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942
ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058
ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130
ruby(invoke_block_from_c_bh) vm.c:1148
ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193
ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141
ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147
ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157
ruby(rb_ary_each+0xa5) [0x55555581c795] array.c:2242
ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385
ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553
ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146
ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782
ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942
ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058
ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130
ruby(invoke_block_from_c_bh) vm.c:1148
ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193
ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141
ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147
ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157
ruby(rb_ary_each+0xa5) [0x55555581c795] array.c:2242
ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385
ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553
ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146
ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782
ruby(rb_vm_exec+0x19f) [0x5555557d183f] vm.c:1951
ruby(rb_iseq_eval+0x30) [0x5555557d2530] vm.c:2190
ruby(load_iseq_eval+0xd6) [0x5555555fa7e6] load.c:592
ruby(require_internal+0x25e) [0x5555555f7f5e] load.c:1022
ruby(rb_require_string+0x27) [0x5555555f74e7] load.c:1094
ruby(rb_f_require_relative+0x5f) [0x5555555f758f] load.c:837
ruby(call_cfunc_1+0x30) [0x5555557f0f70] vm_insnhelper.c:2391
ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553
ruby(vm_call_cfunc+0xad) [0x5555557e521d] vm_insnhelper.c:2574
ruby(vm_call_method_each_type+0xc7) [0x5555557e4af7] vm_insnhelper.c:3040
ruby(vm_call_method+0x19c) [0x5555557e45dc] vm_insnhelper.c:3144
ruby(vm_call_general+0x2d) [0x5555557c8c3d] vm_insnhelper.c:3176
ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146
ruby(vm_exec_core+0xe357) [0x5555557b0757] insns.def:801
ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942
ruby(rb_iseq_eval_main+0x30) [0x5555557d2670] vm.c:2201
ruby(rb_ec_exec_node+0x16b) [0x55555557e39b] eval.c:296
ruby(ruby_run_node+0x72) [0x55555557e1f2] eval.c:354
ruby(main+0x78) [0x55555557a5d8] main.c:50
2020-06-09 09:52:46 +09:00
卜部昌平 8f3d4090f0 rb_equal_opt: fully static call data
This changeset reduces the generated binary of rb_equal_opt from 129 bytes
to 17 bytes on my machine, according to nm(1).
2020-06-09 09:52:46 +09:00
卜部昌平 3928c151a6 vm_search_method_fastpath: avoid rb_vm_empty_cc()
This is such a hot path that it's worth eliminating a function call.  Use
the static variable directly instead.
2020-06-09 09:52:46 +09:00
卜部昌平 877238f2d3 check_cfunc: add assertions
For debug.  Must not change generated binary unless VM_ASSERT is on.
2020-06-09 09:52:46 +09:00
Nobuyoshi Nakada 184f78314e Properly resolve refinements in defined? on private call [Bug #16932] 2020-06-04 02:12:57 +09:00
Nobuyoshi Nakada 8340c773e5 Properly resolve refinements in defined? on method call [Bug #16932] 2020-06-04 02:12:57 +09:00
卜部昌平 de5e0f7c06 vm_invoke_proc_block: reduce recursion
According to nobu, recursion can be deeper than I expected.  Limit
it here.
2020-06-03 16:13:47 +09:00
卜部昌平 b61e82eac9 vm_call_symbol: check stack overflow
The VM stack could overflow here.  The condition is when a symbol is passed
to a block-taking method via &variable, and that symbol has never been
used as an actual method name (thus yielding to it results in calling
method_missing), and the VM stack is full (not a single word left).  This
is a once-in-a-blue-moon event.  Yet there is a very tiny chance of stack
overflow.  We need to check for that.
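
A hedged sketch (mine) of the call shape described above; the overflow itself additionally requires a completely full VM stack:

```ruby
# A Symbol passed via &variable that names no method, so each yielded
# call goes through method_missing.
blk = :surely_not_a_method_name
begin
  [1].each(&blk)
rescue NoMethodError => e
  p e.name #=> :surely_not_a_method_name
end
```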
2020-06-03 16:13:47 +09:00
卜部昌平 ba20e6080d vm_invoke_block: remove auto qualifier
Was (harmless but) redundant.
2020-06-03 16:13:47 +09:00
卜部昌平 6302b96368 vm_insnhelper.c: add space [ci skip]
Just cosmetic change to improve readability.
2020-06-03 16:13:47 +09:00
卜部昌平 054c2fdfdf vm_invoke_symbol_block: reduce MEMCPY
This commit changes the number of calls of MEMCPY from...

                           | send  | &:sym
  -------------------------|-------|-------
   Symbol already interned | once  | twice
   Symbol not pinned yet   | none  | once

to:

                           | send  | &:sym
  -------------------------|-------|-------
   Symbol already interned | once  | none
   Symbol not pinned yet   | twice | once

So it sacrifices the exceptional situation for the normal path.
2020-06-03 16:13:47 +09:00
卜部昌平 36322942db vm_invoke_symbol_block: call vm_call_opt_send
Symbol#to_proc and Object#send are closely related to each other.  Why not
share their implementations?  By doing so we can skip a recursive call of
vm_exec(), which could benefit speed.
2020-06-03 16:13:47 +09:00
卜部昌平 973883aaa1 vm_invoke_block: force indirect jump
This changeset gives a slight speedup on my machine.

Calculating -------------------------------------
                                       before                 after
Optcarrot Lan_Master.nes    38.33488426546287     40.89825082589147 fps
                            40.91288557922081     41.48687465359386
                            40.96591995270991     41.98499064664184
                            41.20461943032173     43.67314690779162
                            42.38344888176518     44.02777536251875
                            43.43563728880915     44.88695892714136
                            43.88082889062643     45.11226186242523
2020-06-03 16:13:47 +09:00
卜部昌平 796f9edae0 vm_invoke_block: insertion of unused args
This makes it possible for vm_invoke_block to pass the arguments it was
passed verbatim to the functions it calls.  Because they are tail-called,
the function calls can be strength-reduced into indirect jumps, which is a
huge win.
2020-06-03 16:13:47 +09:00
卜部昌平 ec87a58d55 vm_invoke_block: eliminate goto
Use recursion for better readability.
2020-06-03 16:13:47 +09:00
卜部昌平 11c70b3162 vm_invoke_block: move logics around
Moved block handler -> captured block conversion from vm_invokeblock to
each vm_invoke_*_block functions.
2020-06-03 16:13:47 +09:00
Nobuyoshi Nakada d05f04d27d
Fixed `defined?` against protected method call
Protected methods are restricted to be called according to the
class/module where the method is defined, not the actual receiver's
class.  [Bug #16931]
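
A hedged sketch (mine, hypothetical class C) of how `defined?` treats a protected call depending on where it is evaluated:

```ruby
class C
  protected def hidden; :ok; end
  def probe(other)
    defined?(other.hidden)
  end
end

a, b = C.new, C.new
p defined?(a.hidden) #=> nil      (caller is outside C)
p a.probe(b)         #=> "method" (caller is inside C, where hidden is defined)
```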
2020-06-02 19:01:56 +09:00
卜部昌平 40ced763b4 vm_insnhelper.c: merge opt_eq_func / opt_eql_func
These two functions were almost identical, except in the case of
T_STRING/T_FLOAT. Why not merge them into one, and let the difference be
handled in normal method calls (slowpath)?  This does not improve
runtime performance for me, but it at least reduces, for instance, rb_eql_opt
from 653 bytes to 86 bytes on my machine, according to nm(1).
2020-06-02 12:38:01 +09:00
Takashi Kokubun 9d71373c23
Mark vm_stackoverflow as NOINLINE COLDFUNC on JIT
to reduce code size and improve locality of hot code.
2020-05-26 23:24:58 -07:00
Jeremy Evans ad729a1d11 Fix origin iclass pointer for modules
If a module has an origin, and that module is included in another
module or class, previously the iclass created for the module had
an origin pointer to the module's origin instead of the iclass's
origin.

Setting the origin pointer correctly requires using a stack, since
the origin iclass is not created until after the iclass itself.
Use a hidden ruby array to implement that stack.

Correctly assigning the origin pointers in the iclass caused a
use-after-free in GC.  If a module with an origin is included
in a class, the iclass shares a method table with the module
and the iclass origin shares a method table with module origin.

Mark iclass origin with a flag that notes that even though the
iclass is an origin, it shares a method table, so the method table
should not be garbage collected.  The shared method table will be
garbage collected when the module origin is garbage collected.
I've tested that this does not introduce a memory leak.

This change caused a VM assertion failure, which was traced to callable
method entries using the incorrect defined_class.  Update
rb_vm_check_redefinition_opt_method and find_defined_class_by_owner
to treat iclass origins differently than class origins to avoid this
issue.

This also includes a fix for Module#included_modules to skip
iclasses with origins.

Fixes [Bug #16736]
2020-05-22 20:31:23 -07:00
Koichi Sasada cbd45af2a9 fix memory leak of ccs
rb_callable_method_entry() creates a ccs entry in cc_tbl, but this
code overwrote it by inserting a newly created ccs, and the overwritten
ccs was never freed.
[Bug #16900]
2020-05-22 11:23:20 +09:00
卜部昌平 15e977349e more on NULL versus functions
Function pointers are not void*.  See also
115fec062c
ce4ea956d2
8427fca49b
2020-05-11 16:47:25 +09:00
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平 d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
Nobuyoshi Nakada cfe0e660f4
Disable -Wswitch warning when VM_CHECK_MODE 2020-05-03 00:15:56 +09:00
Takashi Kokubun 79f3403be0
Invalidate fastpath when calling attr_reader by super
The same bug as 8355a99883 existed in attr_reader too.
2020-04-14 23:49:29 -07:00
Takashi Kokubun 8355a99883
Invalidate fastpath when calling attr_writer by super
We started to use the fastpath on invokesuper when a method is not a refinement
in 5c27681813, but we shouldn't have used the fastpath for attr_writer either.

`cc->aux_.attr_index` is for an actual receiver class, while we store
its superclass in `cc->klass` and therefore there's no way to properly
invalidate attr_writer's inline cache when it's called by super.

[Bug #16785]

I suspect the same bug also exists in attr_reader. I'll address that in
another commit.
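
A hedged sketch (mine, hypothetical classes) of the call shape affected by this bug: an attr_writer reached through `super`, where the inline cache keyed on the superclass cannot be invalidated correctly.

```ruby
class Base
  attr_writer :val
end

class Sub < Base
  def val=(v)
    super # dispatches to the attr_writer defined in Base
  end
end

Sub.new.val = 1
```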
2020-04-14 23:32:13 -07:00
Takashi Kokubun 310ef9f40b
Make vm_call_cfunc_with_frame a fastpath (#3027)
when there's no need to call CALLER_SETUP_ARG and CALLER_REMOVE_EMPTY_KW_SPLAT
(i.e. !rb_splat_or_kwargs_p(ci) && !calling->kw_splat).

Micro benchmark:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark/vm_send_cfunc.yml --repeat-count=4
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
                         before       after
       vm_send_cfunc    69.585M     88.724M i/s -    100.000M times in 1.437097s 1.127096s

Comparison:
                    vm_send_cfunc
               after:  88723605.2 i/s
              before:  69584737.1 i/s - 1.28x  slower
```

Optcarrot:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark.yml --repeat-count=12 --output=all
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
                                       before                 after
Optcarrot Lan_Master.nes    50.76119601545175     42.73858236484051 fps
                            50.76388649761503     51.04211379912850
                            50.80930672252514     51.39455790755538
                            50.90236000778749     51.75656936556145
                            51.01744746340430     51.86875277356489
                            51.06495279015112     51.88692482485558
                            51.07785337168974     51.93429603190578
                            51.20163525187862     51.95768145071314
                            51.34671771913112     52.45577266040274
                            51.35918340835583     52.53163888762858
                            51.46641337418146     52.62172484121034
                            51.50835463462257     52.85064021113239
```
2020-04-13 20:32:59 -07:00
Takashi Kokubun 5c27681813
Enable fastpath on invokesuper (#3021)
The fastpath has not been used for invokesuper because it set vm_call_super_method on every invocation.
Since it seems to be blocked only by refinements, try enabling the fastpath on invokesuper when the cme is not for refinements.

While this patch itself should be helpful for VM performance, a part of this patch's motivation is to unblock inlining invokesuper on JIT. 

$ benchmark-driver -v --rbenv 'before;after' benchmark/vm2_super.yml --repeat-count=4
before: ruby 2.8.0dev (2020-04-11T15:19:58Z master a01bda5949) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-12T02:00:08Z invokesuper-fastpath c171984ee3) [x86_64-linux]
Calculating -------------------------------------
                         before       after
           vm2_super    20.031M     32.860M i/s -      6.000M times in 0.299534s 0.182593s

Comparison:
                        vm2_super
               after:  32859885.2 i/s
              before:  20031097.3 i/s - 1.64x  slower
2020-04-11 20:45:22 -07:00
Jeremy Evans 900e83b501 Turn class variable warnings into exceptions
This changes the following warnings:

* warning: class variable access from toplevel
* warning: class variable @foo of D is overtaken by C

into RuntimeErrors.  Handle defined?(@@foo) at toplevel
by returning nil instead of raising an exception (the previous
behavior warned before returning nil when defined? was used).

Refactor the specs to avoid the warnings even in older versions.
The specs were checking for the warnings, but the purpose of
the related specs as evidenced from their description is to
test for behavior, not for warnings.

Fixes [Bug #14541]
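
A hedged sketch (mine) of the resulting behavior at toplevel:

```ruby
# Class variable access from toplevel now raises instead of warning,
# while defined? simply returns nil.
begin
  @@foo = 1
rescue RuntimeError => e
  puts e.message # e.g. "class variable access from toplevel"
end
p defined?(@@bar) #=> nil
```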
2020-04-10 00:29:05 -07:00
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
Jeremy Evans d2c41b1bff Reduce allocations for keyword argument hashes
Previously, passing a keyword splat to a method always allocated
a hash on the caller side, and accepting arbitrary keywords in
a method allocated a separate hash on the callee side.  Passing
explicit keywords to a method that accepted a keyword splat
did not allocate a hash on the caller side, but resulted in two
hashes allocated on the callee side.

This commit makes passing a single keyword splat to a method not
allocate a hash on the caller side.  Passing multiple keyword
splats or a mix of explicit keywords and a keyword splat still
generates a hash on the caller side.  On the callee side,
if arbitrary keywords are not accepted, it does not allocate a
hash.  If arbitrary keywords are accepted, it will allocate a
hash, but this commit uses a callinfo flag to indicate whether
the caller already allocated a hash, and if so, the callee can
use the passed hash without duplicating it.  So this commit
should make it so that a maximum of a single hash is allocated
during method calls.

To set the callinfo flag appropriately, method call argument
compilation checks if only a single keyword splat is given.
If only one keyword splat is given, the VM_CALL_KW_SPLAT_MUT
callinfo flag is not set, since in that case the keyword
splat is passed directly and not mutable.  If more than one
splat is used, a new hash needs to be generated on the caller
side, and in that case the callinfo flag is set, indicating
the keyword splat is mutable by the callee.

In compile_hash, used for both hash and keyword argument
compilation, if compiling keyword arguments and only a
single keyword splat is used, pass the argument directly.

On the caller side, in vm_args.c, the callinfo flag needs to
be recognized and handled.  Because the keyword splat
argument may not be a hash, it needs to be converted to a
hash first if not.  Then, unless the callinfo flag is set,
the hash needs to be duplicated.  The temporary copy of the
callinfo flag, kw_flag, is updated if a hash was duplicated,
to prevent the need to duplicate it again.  If we are
converting to a hash or duplicating a hash, we need to update
the argument array, which can include duplicating the
positional splat array if one was passed.  CALLER_SETUP_ARG
and a couple of other places need to be modified to handle
similar issues for other types of calls.

This includes fairly comprehensive tests for different ways
keywords are handled internally, checking that you get equal
results but that keyword splats on the caller side result in
distinct objects for keyword rest parameters.

Included are benchmarks for keyword argument calls.
Brief results when compiled without optimization:

  def kw(a: 1) a end
  def kws(**kw) kw end
  h = {a: 1}

  kw(a: 1)       # about same
  kw(**h)        # 2.37x faster
  kws(a: 1)      # 1.30x faster
  kws(**h)       # 2.19x faster
  kw(a: 1, **h)  # 1.03x slower
  kw(**h, **h)   # about same
  kws(a: 1, **h) # 1.16x faster
  kws(**h, **h)  # 1.14x faster
2020-03-17 12:09:43 -07:00
卜部昌平 f12b9a3338 %p is for void *
See also
35eb12c063
6f5eb28507
687308cf0d
b6a2d63eb3
2020-03-04 12:30:42 +09:00
Koichi Sasada 91de0daaa2 method_missing_reason should be set.
send() has a special method launcher in the VM and it has a special
method_missing caller. This path doesn't set
ec->method_missing_reason, which is used at exception creation,
so set up this information. Without this setting, the NoMethodError
exception becomes a NameError.

This patch will fix:
http://ci.rvm.jp/results/trunk-random1@phosphorus-docker/2761643
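
A hedged sketch (mine) of the expected exception class when a missing method is invoked through send:

```ruby
begin
  Object.new.send(:no_such_method)
rescue NoMethodError => e
  p e.class #=> NoMethodError (a subclass of NameError), not plain NameError
end
```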
2020-03-03 02:44:02 +09:00
Koichi Sasada 18674aef0d check imemo_type
check imemo_type to debug
http://ci.rvm.jp/results/trunk-vm-asserts@silicon-docker/2744755
2020-02-27 10:50:20 +09:00
Koichi Sasada b9007b6c54 Introduce disposable call-cache.
This patch contains several ideas:

(1) Disposable inline method cache (IMC) for race-free inline method cache
    * Make the call-cache (CC) an RVALUE (GC target object) and allocate a
      new CC on cache miss.
    * This technique allows race-free access from parallel processing
      elements like RCU.
(2) Introduce per-Class method cache (pCMC)
    * Instead of fixed-size global method cache (GMC), pCMC allows flexible
      cache size.
    * Caching CCs reduces CC allocation and allows sharing a CC's fast-path
      between call-sites with the same call-info (CI).
(3) Invalidate an inline method cache by invalidating corresponding method
    entries (MEs)
    * Instead of using class serials, we set an "invalidated" flag on the
      method entry itself to represent cache invalidation.
    * Compared with using class serials, the impact of method modification
      (add/overwrite/delete) is small.
    * Updating class serials invalidates all method caches of the class and
      its sub-classes.
    * The proposed approach invalidates the method cache of only one ME.

See [Feature #16614] for more details.
2020-02-22 09:58:59 +09:00
Koichi Sasada f2286925f0 VALUE size packed callinfo (ci).
Now, rb_call_info describes how to call the method with a tuple of
(mid, orig_argc, flags, kwarg). In most cases, kwarg == NULL and
mid+argc+flags only requires 64 bits. So this patch packs
rb_call_info into a VALUE (1 word) in such cases. If we cannot
represent it in a VALUE, then we use imemo_callinfo, which contains
a conventional callinfo (rb_callinfo, renamed from rb_call_info).

iseq->body->ci_kw_size is removed because every callinfo is now VALUE
sized (a packed ci or a pointer to imemo_callinfo).

To access ci information, we need to use these functions:
vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci).

struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg.

rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc()
are temporarily removed because cd->ci should be marked.
2020-02-22 09:58:59 +09:00
Nobuyoshi Nakada 0b4500d982
Adjusted indent [ci skip] 2020-02-22 00:17:31 +09:00
Koichi Sasada 99a8742067 should be compared with called_id
me->called_id and me->def->original_id can be different sometimes
so we should compare with called_id, which is mtbl's key.
(fix GH-PR #2869)
2020-02-13 03:30:22 +09:00
John Hawthorn ed7b46b66b Use inline cache for super calls 2020-02-13 00:14:55 +09:00
Koichi Sasada a635c93fde support MJIT with debug option.
VM_CHECK_MODE > 0 with optflags=-O0 cannot run JIT tests because
of link problems. This patch fixes them.
2020-02-03 16:57:41 +09:00
Jeremy Evans beae6cbf0f Fully separate positional arguments and keyword arguments
This removes the warnings added in 2.7, and changes the behavior
so that a final positional hash is not treated as keywords or
vice-versa.

To handle the arg_setup_block splat case correctly with keyword
arguments, we need to check if we are taking a keyword hash.
That case didn't have a test, but it affects real-world code,
so add a test for it.

This removes rb_empty_keyword_given_p() and related code, as
that is not needed in Ruby 3.  The empty keyword case is the
same as the no keyword case in Ruby 3.

This changes rb_scan_args to implement keyword argument
separation for C functions when the : character is used.
For backwards compatibility, it returns a duped hash.
This is a bad idea for performance, but not duping the hash
breaks at least Enumerator::ArithmeticSequence#inspect.

Instead of having RB_PASS_CALLED_KEYWORDS be a number,
simplify the code by just making it be rb_keyword_given_p().
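
A hedged sketch (mine) of the user-visible separation:

```ruby
def m(x, b: 0)
  [x, b]
end

h = { b: 1 }
p m(h)      #=> [{:b=>1}, 0]  a final positional hash stays positional
p m(1, **h) #=> [1, 1]        keywords must be passed explicitly with **
```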
2020-01-02 18:40:45 -08:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves committers' daily lives by avoiding #include-ing everything from
internal.h, making each file do so instead.  This should significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not to be self-
contained (the headers have inclusion-order dependencies).
2019-12-26 20:45:12 +09:00
卜部昌平 0958e19ffb add several __has_something macro
With these macros implemented we can write code just as if we could assume
the compiler is clang.  MSC_VERSION_SINCE is defined to implement
those macros, but turned out to be handy in other places too.  The -fdeclspec
compiler flag is necessary for clang to properly handle __has_declspec().
2019-12-26 20:45:12 +09:00
Nobuyoshi Nakada db16629008
Fixed misspellings
Fixed misspellings reported at [Bug #16437], only in ruby and rubyspec.
2019-12-20 09:32:42 +09:00
卜部昌平 f054f11a38 per-method serial number
Methods and their definitions can be allocated/deallocated on-the-fly.
One pathological situation is when a method is deallocated then another
one is allocated immediately after that.  The address of those old/new method
entries/definitions can then be the same, depending on the underlying
malloc/free implementation.

So pointer comparison is insufficient.  We have to check the contents.
To do so we introduce def->method_serial, which is an integer unique to
that specific method definition.

PS: Note that method_serial being uintptr_t rather than rb_serial_t is
intentional.  This is because rb_serial_t can be bigger than a pointer
on a 32bit system (rb_serial_t is at least 64bit).  In order to preserve
old packing of struct rb_call_cache, rb_serial_t is inappropriate.
2019-12-18 12:52:28 +09:00
Koichi Sasada fbe229906b add debug counter to count `call` reusing cases. 2019-12-17 13:15:38 +09:00
卜部昌平 ba11a74745 ensure cc->def == cc->me->def
The equation shall hold for every call cache.  However prior to this
changeset cc->me could be updated without also updating cc->def.  Let's
make sure of it by introducing a new macro named CC_SET_ME which sets cc->me
and cc->def at once.
2019-12-16 17:52:18 +09:00
Jeremy Evans 55b7ba3686 Make super in instance_eval in method in module raise TypeError
This makes behavior the same as super in instance_eval in method
in class.  The reason this wasn't implemented before is that
there is a check to determine if the self in the current context
is of the expected class, and a module itself can be included
in multiple classes, so it doesn't have an expected class.

Implementing this requires giving iclasses knowledge of which
class created them, so that super call in the module method
knows the expected class for super calls.  This reference
is called includer, and should only be set for iclasses.

Note that the approach Ruby uses in this check is not robust. If
you instance_eval another object of the same class and call super,
instead of a TypeError, you get super called with the
instance_eval receiver instead of the method receiver.  Truly
fixing super would require keeping a reference to the super object
(method receiver) in each frame where scope has changed, and using
that instead of current self when calling super.

Fixes [Bug #11636]
2019-12-12 15:50:19 +09:00
Aaron Patterson 2c8d186c6e
Introduce an "Inline IVAR cache" struct
This commit introduces an "inline ivar cache" struct.  The reason we
need this is so compaction can differentiate from an ivar cache and a
regular inline cache.  Regular inline caches contain references to
`VALUE` and ivar caches just contain references to the ivar index.  With
this new struct we can easily update references for inline caches (but
not inline var caches as they just contain an int)
2019-12-05 13:37:02 -08:00
Koichi Sasada 36da0b3da1 check interrupts at each frame pop timing.
Asynchronous events such as signal trap, finalization timing,
thread switching and so on are managed by "interrupt_flag".
Ruby's threads check this flag periodically, and if a thread
does not check this flag, the above events don't happen.

This checking is done by the CHECK_INTS() (and related) macro, which is
placed in some places (the leave instruction and so on). However, at the
end of C methods, C blocks (IMEMO_IFUNC) etc. there is no such checking,
and this can introduce an uninterruptible thread.

To remedy this situation, we decided to place CHECK_INTS() in
vm_pop_frame(). It increases the number of interrupt checking points.
[Bug #16366]

This patch can introduce unexpected events...
2019-11-29 17:47:02 +09:00
Yusuke Endoh 191ce5344e Reduce duplicated warnings for the change of Ruby 3 keyword arguments
By this change, the following code prints only one warning.

```
def foo(**opt); end
100.times { foo({kw:1}) }
```

A global variable `st_table *caller_to_callees` is a map from caller to
a set of callee methods.  It remembers that a warning is already printed
for each pair of caller and callee.

[Feature #16289]
2019-11-29 17:32:27 +09:00
Koichi Sasada f38b6d197f Revert "export for MJIT"
This reverts commit 2e6f1cf8b2.
2019-11-29 03:22:24 +09:00
Koichi Sasada e4e41840ad Revert "* remove trailing spaces. [ci skip]"
This reverts commit 27d0d7c0d3.
2019-11-29 03:22:13 +09:00
git 27d0d7c0d3 * remove trailing spaces. [ci skip] 2019-11-29 03:18:19 +09:00
Koichi Sasada 2e6f1cf8b2 export for MJIT 2019-11-29 03:17:52 +09:00
Koichi Sasada dd723771c1 fastpath for ivar read of FL_EXIVAR objects.
vm_getivar() provides a fastpath for T_OBJECT by caching the index
of an ivar. This patch also provides a fastpath for FL_EXIVAR objects.
Each FL_EXIVAR object has its own ivar array, and the index can be cached
just as for T_OBJECT. To access this ivar array, generic_iv_tbl is exposed
by rb_ivar_generic_ivtbl() (declared in the newly introduced variable.h).

Benchmark script:

Benchmark.driver(repeat_count: 3){|x|
  x.executable name: 'clean', command: %w'../clean/miniruby'
  x.executable name: 'trunk', command: %w'./miniruby'

  objs = [Object.new, 'str', {a: 1, b: 2}, [1, 2]]

  objs.each.with_index{|obj, i|
    rep = obj.inspect
    rep = 'Object.new' if /\#/ =~ rep
    x.prelude str = %Q{
      v#{i} = #{rep}
      def v#{i}.foo
        @iv # ivar access method (attr_reader)
      end
      v#{i}.instance_variable_set(:@iv, :iv)
    }
    puts str
    x.report %Q{
      v#{i}.foo
    }
  }
}

Result:

      v0.foo # T_OBJECT

               clean:  85387141.8 i/s
               trunk:  85249373.6 i/s - 1.00x  slower

      v1.foo # T_STRING

               trunk:  57894407.5 i/s
               clean:  39957178.6 i/s - 1.45x  slower

      v2.foo # T_HASH

               trunk:  56629413.2 i/s
               clean:  39227088.9 i/s - 1.44x  slower

      v3.foo # T_ARRAY

               trunk:  55797530.2 i/s
               clean:  38263572.9 i/s - 1.46x  slower
2019-11-29 03:11:04 +09:00
Kazuhiro NISHIYAMA 09e76e9828
Improve consistency of bool/true/false 2019-11-25 15:09:09 +09:00
Koichi Sasada e27acb6148 add fast path for argc==0.
If calling builtin functions with no arguments, we don't need to
calculate argv location.
2019-11-25 14:04:21 +09:00
卜部昌平 f6239ce0fc peep-hole optimize VM instructions
Some minor optimizations.

Calculating -------------------------------------
                           ours       trunk
          vm2_regexp     8.479M      8.346M i/s -      6.000M times in 0.707612s 0.718916s
   vm2_regexp_invert     8.605M      8.350M i/s -      6.000M times in 0.697298s 0.718576s

Comparison:
                       vm2_regexp
                ours:   8479223.3 i/s
               trunk:   8345893.8 i/s - 1.02x  slower

                vm2_regexp_invert
                ours:   8604647.4 i/s
               trunk:   8349852.8 i/s - 1.03x  slower

Calculating -------------------------------------
                           ours+jit   trunk+jit
Optcarrot Lan_Master.nes     68.603      64.167 fps

Comparison:
             Optcarrot Lan_Master.nes
                ours+jit:        68.6 fps
               trunk+jit:        64.2 fps - 1.07x  slower
2019-11-19 13:56:13 +09:00
Koichi Sasada 57cd4623cf should not use __func__ 2019-11-18 13:53:32 +09:00
Koichi Sasada 5e34ab5406 add casts.
add casts to avoid compile error.
http://ci.rvm.jp/results/trunk_clang_39@silicon-docker/2402215
2019-11-18 10:36:48 +09:00
Koichi Sasada 71fee9bc72 vm_invoke_builtin_delegate with start index.
opt_invokebuiltin_delegate and opt_invokebuiltin_delegate_leave
invoke builtin functions with the same parameters as the method.
This technique eliminates stack push operations. However, the delegated
parameters had to be exactly the same as the given parameters.
(e.g. `def foo(a, b, c) __builtin_foo(a, b, c)` is okay, but
__builtin_foo(b, c) is not allowed)

This patch relaxes this restriction. An ISeq has a local variable
table which includes the parameters. For example, for a method defined
as `def foo(a, b, c) x=y=nil`, the local variable table contains
[a, b, c, x, y]. If a builtin function is called with arguments that
are a sub-array of the lvar table, the opt_invokebuiltin_delegate
instruction with a start index is used. For example, `__builtin_foo(b, c)`,
`__builtin_bar(c, x, y)` are okay, and so on.
2019-11-18 10:16:11 +09:00
Koichi Sasada 179062dd80 move rb_vm_lvar_exposed() correctly.
rb_vm_lvar_exposed() is prepared for __builtin_inline!(), needed for
mini_builtin.c and builtin.c. However, it's only in builtin.c.
So move it to make it a part of the VM.
2019-11-14 04:21:24 +09:00
Dylan Thacker-Smith ac112f2b5d Avoid top-level search for nested constant reference from nil in defined?
Fixes [Bug #16332]

Constant access was changed to no longer allow top-level constant access
through `nil`, but `defined?` wasn't changed at the same time to stay
consistent.

Use a separate defined type to distinguish between a constant
referenced from the current lexical scope and one referenced from
another namespace.
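
A hedged sketch (mine) of the resulting consistency:

```ruby
A_CONST = 1
ns = nil
p defined?(ns::A_CONST)     #=> nil after this change (no top-level fallback)
p defined?(Object::A_CONST) #=> "constant"
```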
2019-11-13 15:36:58 +09:00
Koichi Sasada 9142f802f1 rewrite comment.
Pointed by nagachika-san.
https://ruby-trunk-changes.hatenablog.com/entry/ruby_trunk_changes_20191109
2019-11-11 16:47:50 +09:00
Koichi Sasada 43ceedecc0 use STACK_ADDR_FROM_TOP()
vm_invoke_builtin() accesses the VM stack via cfp->sp. However, MJIT
can use its own stack. To access it appropriately, we need to
use STACK_ADDR_FROM_TOP().
2019-11-09 16:18:58 +09:00
Koichi Sasada 21f7cca2c6 initialize kw special local var.
A method which has keyword parameters has an implicit local variable
to specify which keywords are (un)specified.

vm_call_iseq_setup_kwparm_nokwarg() is a special function to invoke
an ISeq method without any keyword arguments. However, it should
also initialize the special local var. Without this initialization,
the implicit lvar can point to a freed (T_NONE) object.
2019-11-09 10:04:04 +09:00
卜部昌平 90fc555258 name the result of calccall
This is a pure refactoring for better understanding of what is
happening here.  Should change nothing but readability.
2019-11-08 12:09:01 +09:00
卜部昌平 a1a08ac9aa describe vm_cache_check_for_class_serial [ci skip]
Added comments describing what it is.  Requested by ko1.
2019-11-08 10:31:06 +09:00
Koichi Sasada 46acd0075d support builtin features with Ruby and C.
Support loading builtin features written in Ruby, which implement
with C builtin functions.
[Feature #16254]

Several features:

(1) Load .rb file at boottime with native binary.

Now, prelude.rb is loaded at boot time. However, this file is embedded
in the interpreter in text format and we need to compile it.
This patch contains a feature to load it from a binary format.

(2) __builtin_func() in Ruby call func() written in C.

In a Ruby file, we can write `__builtin_func()` like a method call.
However, this is not a method call, but special syntax to call
a function `func()` written in C. The C function should be defined
in the file (same compile unit) which loads this .rb file.

Functions (`func` in above example) should be defined with
  (a) 1st parameter: rb_execution_context_t *ec
  (b) rest parameters (0 to 15).
  (c) VALUE return type.
These requirements are very similar to those for functions used by
rb_define_method(); however, `rb_execution_context_t *ec`
is a new requirement.

(3) automatic C code generation from .rb files.

tool/mk_builtin_loader.rb creates C code to load the .rb files
needed by the miniruby and ruby commands. This script is run by
BASERUBY, so those *.rb files should be written in BASERUBY-compatible
syntax. This script loads a .rb file, finds all __builtin_-prefixed
method calls, and generates a part of the C code to export the
functions.

tool/mk_builtin_binary.rb creates C code which contains the
binary-compiled Ruby files needed by the ruby command.
2019-11-08 09:09:29 +09:00
卜部昌平 d45a013a1a extend rb_call_cache
Prior to this changeset, the majority of inline cache misses resulted
in the same method entry when rb_callable_method_entry() resolved
the method search.  Let's not call the function in the first place in
such situations.

In doing so we extend the struct rb_call_cache from 44 bytes (in
case of 64 bit machine) to 64 bytes, and fill the gap with
secondary class serial(s).  The call cache's class serials now behave
as an LRU cache.

Calculating -------------------------------------
                           ours         2.7         2.6
vm2_poly_same_method     2.339M      1.744M      1.369M i/s - 6.000M times in 2.565086s 3.441329s 4.381386s

Comparison:
             vm2_poly_same_method
                ours:   2339103.0 i/s
                 2.7:   1743512.3 i/s - 1.34x  slower
                 2.6:   1369429.8 i/s - 1.71x  slower
2019-11-07 17:41:30 +09:00
卜部昌平 6ff1250739 rb_method_basic_definition_p with CC
Noticed that rb_method_basic_definition_p is frequently called.
Its callers include vm_caller_setup_args_block(),
rb_hash_default_value(), rb_num_negative_int_p(), and a lot more.

It seems worth caching the method resolution part.  The majority of
rb_method_basic_definition_p() usages take fixed class and fixed
method id combinations.

Calculating -------------------------------------
                           ours       trunk
           so_matrix      2.379       2.115 i/s -       1.000 times in 0.420409s 0.472879s

Comparison:
                        so_matrix
                ours:         2.4 i/s
               trunk:         2.1 i/s - 1.12x  slower
2019-11-05 11:39:35 +09:00
卜部昌平 cc5580f175 fix bug in keyword + protected combination
A test is included for the situation that formerly was not working.
2019-10-28 14:38:05 +09:00
卜部昌平 356e203a3a more on struct rb_call_data
Replace adjacent struct rb_call_info and struct rb_call_cache
with a single struct rb_call_data.
2019-10-25 12:24:22 +09:00
wanabe 4ff2c58f91 retry tailcall optimization (#2529)
Sorry, f62f90367f is push miss.
2019-10-25 04:40:39 +09:00
Jeremy Evans d6a2507e49 Duplicate hash when converting keyword hash to keywords
This mirrors the behavior when manually splatting a hash.  This
mirrors the changes made in setup_parameters_complex in
6081ddd6e6, so that splatting to a
non-iseq method works the same as splatting to an iseq method.
2019-10-24 12:35:04 -07:00
Alan Wu 89e7997622 Combine call info and cache to speed up method invocation
To perform a regular method call, the VM needs two structs,
`rb_call_info` and `rb_call_cache`. At the moment, we allocate these two
structures in separate buffers. In the worst case, the CPU needs to read
4 cache lines to complete a method call. Putting the two structures
together reduces the maximum number of cache line reads to 2.

Combining the structures also saves 8 bytes per call site as the current
layout uses separate two pointers for the call info and the call cache.
This saves about 2 MiB on Discourse.

This change improves the Optcarrot benchmark at least 3%. For more
details, see attached bugs.ruby-lang.org ticket.

Complications:
 - A new instruction attribute `comptime_sp_inc` is introduced to
 calculate SP increase at compile time without using call caches. At
 compile time, a `TS_CALLDATA` operand points to a call info struct, but
 at runtime, the same operand points to a call data struct. Instructions
 that explicitly define `sp_inc` also need to define `comptime_sp_inc`.
 - MJIT code for copying call cache becomes slightly more complicated.
 - This changes the bytecode format, which might break existing tools.

[Misc #16258]
2019-10-24 18:03:42 +09:00
Nobuyoshi Nakada 42edb05626
extracted declare_under 2019-10-10 01:08:42 +09:00
Koichi Sasada ddf5020e4f Revert "tailcall optimization again (#2528)"
This reverts commit f62f90367f.
2019-10-06 17:01:00 +09:00
wanabe f62f90367f tailcall optimization again (#2528)
This is a follow-up of r67315.
2019-10-06 16:52:09 +09:00
卜部昌平 3ffd98c5cd add debug counters for vm_search_method_slowpath()
Implemented fine-grained inspection of cache misses.  Handy for
counting the reasons why an inline method cache was evicted.
2019-10-03 15:24:09 +09:00
卜部昌平 eb92159d72 Revert https://github.com/ruby/ruby/pull/2486
This reverts commits: 10d6a3aca7 8ba48c1b85 fba8627dc1 dd883de5ba
6c6a25feca 167e6b48f1 7cb96d41a5 3207979278 595b3c4fdd 1521f7cf89
c11c5e69ac cf33608203 3632a812c0 f56506be0d 86427a3219 .

The reason for the revert is that we observed an ABA problem around the
inline method cache.  When a cache misses, we search for a
method entry.  And if the entry is identical to what was cached
before, we reuse the cache.  But the commits we are reverting here
introduced situations where a method entry is freed, then the
identical memory region is used for another method entry.  An
inline method cache cannot detect that ABA.

Here is code that reproduces such a situation:

```ruby
require 'prime'

class << Integer
  alias org_sqrt sqrt
  def sqrt(n)
    raise
  end

  GC.stress = true
  Prime.each(7*37){} rescue nil # <- Here we populate CC
  class << Object.new; end

  # These adjacent remove-then-alias maneuver
  # frees a method entry, then immediately
  # reuses it for another.
  remove_method :sqrt
  alias sqrt org_sqrt
end

Prime.each(7*37).to_a # <- SEGV
```
2019-10-03 12:45:24 +09:00
Jeremy Evans ef697388be
Treat return in block in class/module as LocalJumpError (#2511)
A return directly in a class/module body is an error, so a return in a
proc in a class/module body should also be an error.  I believe the
previous behavior was an unintentional oversight during the
addition of top-level return in 2.4.
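
A small example of the case addressed here, assuming the behavior introduced
by this change (class and value are made up):

```ruby
class Example
  begin
    proc { return 42 }.call   # a return in a proc evaluated in a class body
  rescue LocalJumpError => e
    p e.class   #=> LocalJumpError, matching a bare return in a class body
  end
end
```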
2019-10-02 07:56:28 -07:00
Nobuyoshi Nakada 10d6a3aca7
Fix assertion
callable_method_entry_p is for rb_callable_method_entry_t.
2019-09-30 17:43:11 +09:00
卜部昌平 fba8627dc1 delete unnecessary branch
At last, not only I but also the compiler can be fully confident
that the method entries pointed to from call caches are immutable.  We
don't have to worry about silent updates.  Just delete the branch
that is now always false.

Calculating -------------------------------------
                           ours       trunk
vm2_poly_same_method     2.142M      2.070M i/s -      6.000M times in 2.801148s 2.898994s

Comparison:
             vm2_poly_same_method
                ours:   2141979.2 i/s
               trunk:   2069683.8 i/s - 1.03x  slower
2019-09-30 10:26:38 +09:00
卜部昌平 dd883de5ba refactor constify most of rb_method_entry_t
Now that we have eliminated most destructive operations over
rb_method_entry_t / rb_callable_method_entry_t, let's make them
mostly immutable and mark them const.

One exception is rb_export_method(), which destructively modifies
visibilities of method entries.  I have left that operation as is
because I suspect that destructiveness is the nature of that
function.
2019-09-30 10:26:38 +09:00
卜部昌平 6c6a25feca refactor add rb_method_entry_from_template
Tired of the rb_method_entry_create(..., rb_method_definition_create(
..., &(rb_method_foo_t) {...})) maneuver.  Provide a function that
does this in one step to reduce copy & paste.
2019-09-30 10:26:38 +09:00
卜部昌平 7cb96d41a5 refactor delete rb_method_entry_copy
The deleted function was to destructively overwrite existing method
entries, which is now considered to be a bad idea.  Delete it, and
assign a newly created method entry instead.
2019-09-30 10:26:38 +09:00
卜部昌平 3207979278 refactor delete rb_method_definition_set
Instead of destructively writing fields of method entries, create a
new entry and let it overwrite its owner.
2019-09-30 10:26:38 +09:00
卜部昌平 595b3c4fdd refactor rb_method_definition_create take opts
Before this changeset rb_method_definition_create only allocated a
memory region and we had to destructively initialize it later.
That is not a good design so we change the API to return a complete
struct instead.
2019-09-30 10:26:38 +09:00
卜部昌平 cf33608203 refactor constify most of rb_method_definition_t
Most (if not all) of the fields of rb_method_definition_t are never
meant to be modified once after they are stored.  Marking them const
makes it possible for compilers to warn on unintended modifications.
2019-09-30 10:26:38 +09:00
Jeremy Evans 6fdd701472 Remove VM_NO_KEYWORDS, replace with RB_NO_KEYWORDS
VM_NO_KEYWORDS was introduced first in vm_core.h, but it is best
to only use a single definition for this.
2019-09-29 16:41:00 -07:00
Jeremy Evans 7814b6c657 Correctly issue ArgumentError when calling method that accepts no keywords
If a method accepts no keywords and was called with a keyword, an
ArgumentError was not always issued previously.  Force methods that
accept no keywords to go through setup_parameters_complex so that
an ArgumentError is raised if keywords are provided.
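
A minimal illustration of the intended outcome (the method name is made up):

```ruby
def takes_nothing
end

begin
  takes_nothing(key: 1)       # a keyword given to a method that accepts none
rescue ArgumentError => e
  p e.class  #=> ArgumentError
end
```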
2019-09-27 11:21:50 -07:00
Nobuyoshi Nakada 8d0ff88727
Adjusted spaces [ci skip] 2019-09-27 14:06:07 +09:00
Nobuyoshi Nakada 0c6f36668a
Adjusted spaces [ci skip] 2019-09-27 10:20:56 +09:00
Jeremy Evans 3b302ea8c9 Add Module#ruby2_keywords for passing keywords through regular argument splats
This approach uses a flag bit on the final hash object in the regular splat,
as opposed to a previous approach that used a VM frame flag.  The hash flag
approach is less invasive, and handles some cases that the VM frame flag
approach does not, such as saving the argument splat array and splatting it
later:

  ruby2_keywords def foo(*args)
    @args = args
    bar
  end
  def bar
    baz(*@args)
  end
  def baz(*args, **kw)
    [args, kw]
  end
  foo(a:1)    #=> [[], {a: 1}]
  foo({a: 1}, **{}) #=> [[{a: 1}], {}]

  foo({a: 1}) #=> 2.7: [[], {a: 1}] # and warning
  foo({a: 1}) #=> 3.0: [[{a: 1}], {}]

It doesn't handle some cases that the VM frame flag handles, such as when
the final hash object is replaced using Hash#merge, but those cases are
probably less common and are unlikely to properly support keyword
argument separation.

Use ruby2_keywords to handle argument delegation in the delegate library.
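
As a hedged sketch of that delegation pattern (class names here are made up,
not the delegate library's code):

```ruby
class Wrapper
  def initialize(target)
    @target = target
  end

  # ruby2_keywords flags the final hash in *args so it is splatted back out
  # as keywords when the call is forwarded.
  ruby2_keywords def method_missing(name, *args, &blk)
    @target.__send__(name, *args, &blk)
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

class Target
  def greet(name:, punct: "!")
    "hi #{name}#{punct}"
  end
end

p Wrapper.new(Target.new).greet(name: "world")  #=> "hi world!"
```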
2019-09-25 12:33:52 -07:00
Takashi Kokubun 6e0dd3e7c1
Use RUBY_VM_NEXT_CONTROL_FRAME macro
in vm_push_frame and limit scope of i.
Just a minor maintainability improvement.
2019-09-20 21:06:08 +09:00
卜部昌平 fcfe36b733 fix spec failure
See also https://travis-ci.org/ruby/ruby/jobs/586452224
2019-09-19 15:18:10 +09:00
卜部昌平 d74fa8e55c reuse cc->call
I noticed that in the case of a cache miss, the re-calculated cc->me can
be the same method entry as the previous one.  That is an okay
situation, but can't we partially reuse the cache, because cc->call
should still be valid then?

One thing that has to be special-cased is when the method entry
gets amended by some refinements.  That happens behind the scenes
of the call cache mechanism.  We have to check whether cc->me->def
points to the previously saved one.

Calculating -------------------------------------
                          trunk        ours
vm2_poly_same_method     1.534M      2.025M i/s -      6.000M times in 3.910203s 2.962752s

Comparison:
             vm2_poly_same_method
                ours:   2025143.9 i/s
               trunk:   1534447.2 i/s - 1.32x  slower
2019-09-19 15:18:10 +09:00
卜部昌平 bcd5f2e9d3 delete unused variable 2019-09-18 11:06:24 +09:00
Jeremy Evans 775365cbd2 Fix keyword argument separation issues with sym procs when using refinements
Make sure that vm_yield_with_cfunc can correctly set the empty keyword
flag by passing 2 as the kw_splat value when calling it in
vm_invoke_ifunc_block.  Make sure calling.kw_splat is set to 1 and not
128 in vm_sendish, so we can safely check for different kw_splat values.

vm_args.c needs to call add_empty_keyword, and to make JIT happy, the
function needs to be exported.  Rename the function to
rb_adjust_argv_kw_splat to more accurately reflect what it does, and
mark it as MJIT exported.
2019-09-17 16:22:44 -07:00