github/ruby - ruby

Граф коммитов

Автор	SHA1	Сообщение	Дата
Takashi Kokubun	d9f608b686	Verify builtin inline annotation with VM_CHECK_MODE (#3244 ) * Verify builtin inline annotation with VM_CHECK_MODE * Remove static to fix the link issue on MJIT	2020-06-21 10:27:04 -07:00
Takashi Kokubun	426db4cd90	Fix -Wmaybe-uninitialized at vm_invoke_block	2020-06-21 00:34:39 -07:00
Nobuyoshi Nakada	ccb7a4b9f2	Replaced accessors of `Struct` with `invokebuiltin`	2020-06-17 08:18:46 +09:00
Nobuyoshi Nakada	318d52e820	Revert "Replaced accessors of `Struct` with `invokebuiltin`" This reverts commit `19cabe8b09`, which didn't support tool/lib/iseq_loader_checker.rb.	2020-06-16 18:44:58 +09:00
Nobuyoshi Nakada	19cabe8b09	Replaced accessors of `Struct` with `invokebuiltin`	2020-06-16 18:24:02 +09:00
卜部昌平	5648976c3c	vm_call_method: avoid marking on-stack object This callcache is on stack, must not be GCed. However its contents are copied from other materials, which can be an ordinal object. Should set a flag to make sure it is properly skipped by the GC.	2020-06-10 10:22:39 +09:00
卜部昌平	1016cff4ff	rb_eql_opt,rb_equal_opt: purge stale cc When on USE_EMBED_CI, cd is stored statically. Previous use could cache stale cd->cc, which could have already been GCed. Need flush them.	2020-06-09 09:52:46 +09:00
卜部昌平	ffe58b9c8b	vm_ccs_push: do not cache non-heap entries Entires not GC-able must be considered to be volatile. Not eligible for later use.	2020-06-09 09:52:46 +09:00
卜部昌平	e1e84fbb4f	VM_CI_NEW_ID: USE_EMBED_CI could be false It was a wrong idea to assume CIs are always embedded.	2020-06-09 09:52:46 +09:00
卜部昌平	324038c66e	eliminate C99 compound literals Ko1 prefers variables be assgined, instead of bare literals in function arguments.	2020-06-09 09:52:46 +09:00
卜部昌平	4fbe86d0e2	vm_call_method: use struct assignment This further reduces the generated binary of vm_call_method from 566 bytes to 545 bytes on my machine, according to nm(1).	2020-06-09 09:52:46 +09:00
卜部昌平	46728557c1	rb_vm_call0: on-stack call info This changeset reduces the generated binary of rb_vm_call0 from 281 bytes to 211 bytes on my machine. Should reduce GC pressure as well.	2020-06-09 09:52:46 +09:00
卜部昌平	db406daa60	vm_yield_setup_args: refactor use macro	2020-06-09 09:52:46 +09:00
卜部昌平	367263c3dd	vm_call_method: no call vm_cc_fill This changeset reduces the generated binary of vm_call_method from 600 bytes to 566 bytes on my machine, accroding to nm(1).	2020-06-09 09:52:46 +09:00
卜部昌平	fb3f1f95e8	vm_call_refined: no call vm_cc_fill This changeset reduces the generated binary of vm_call_method_each_type from 2,442 bytes to 2,378 bytes on my machine, accroding to nm(1).	2020-06-09 09:52:46 +09:00
卜部昌平	be5dfdd8a2	vm_call_zsuper: no call vm_cc_fill This changeset reduces the generated binary of vm_call_method_each_type from 2,522 bytes to 2,442 bytes on my machine, accroding to nm(1).	2020-06-09 09:52:46 +09:00
卜部昌平	dbbde61cef	vm_call_method_missing_body: on-stack call info This changeset reduces the generated binary of vm_call_method_missing_body from 604 bytes to 532 bytes on my machine. Should reduce GC pressure as well.	2020-06-09 09:52:46 +09:00
卜部昌平	9c287f8caa	vm_call_symbol: on-stack call info This changeset reduces the generated binary of vm_call_symbol from 808 bytes to 798 bytes on my machine. Should reduce GC pressure as well.	2020-06-09 09:52:46 +09:00
卜部昌平	62b471bd44	vm_call_alias: no call vm_cc_fill This changeset reduces the generated binary of vm_call_alias from 188 bytes to 149 bytes on my machine, accroding to nm(1).	2020-06-09 09:52:46 +09:00
卜部昌平	97f456374d	rb_eql_opt: fully static call data This changeset reduces the generated binary of rb_eql_opt from 86 bytes to 20 bytes on my machine, according to nm(1).	2020-06-09 09:52:46 +09:00
卜部昌平	3da9c51973	rb_vm_search_method_slowpath: skip vm_empty_cc Now that vm_empty_cc is statically allocated outside of the object space. It shall not be GCed. Here, because vm_search_cc can return that. Must not be blindly passed to RB_OBJ_WRITE, unless assertions fail on RGENGC_CHECK_MODE, like this: -- C level backtrace information ------------------------------------------- ruby(rb_print_backtrace+0x19) [0x5555557fd579] vm_dump.c:757 ruby(rb_vm_bugreport+0x151) [0x5555557fd6f1] vm_dump.c:955 ruby(rb_bug+0x1d6) [0x5555558d6396] error.c:660 ruby(check_rvalue_consistency_force+0x707) [0x5555555adb97] gc.c:1289 ruby(check_rvalue_consistency+0x1a) [0x555555598a0a] gc.c:1305 ruby(RVALUE_OLD_P+0x15) [0x5555555975d5] gc.c:1382 ruby(rb_gc_writebarrier+0x9f) [0x55555559753f] gc.c:6882 ruby(rb_obj_written+0x3a) [0x5555557a025a] include/ruby/internal/rgengc.h:180 ruby(rb_obj_write+0x41) [0x5555557a1a81] include/ruby/internal/rgengc.h:195 ruby(rb_vm_search_method_slowpath+0x5a) [0x5555557a125a] vm_insnhelper.c:1603 ruby(vm_search_method_fastpath+0x197) [0x5555557d8027] vm_insnhelper.c:1638 ruby(vm_search_method+0xea) [0x5555557d7d2a] vm_insnhelper.c:1650 ruby(vm_search_method_wrap+0x29) [0x5555557dbaf9] vm_insnhelper.c:4091 ruby(vm_sendish+0xa9) [0x5555557dba39] vm_insnhelper.c:4143 ruby(vm_exec_core+0xe357) [0x5555557b0757] insns.def:801 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058 ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130 ruby(invoke_block_from_c_bh) vm.c:1148 ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193 ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141 ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147 ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157 ruby(rb_ary_collect+0xb0) [0x555555828320] array.c:3186 ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058 ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130 ruby(invoke_block_from_c_bh) vm.c:1148 ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193 ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141 ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147 ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157 ruby(rb_ary_each+0xa5) [0x55555581c795] array.c:2242 ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058 ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130 ruby(invoke_block_from_c_bh) vm.c:1148 ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193 ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141 ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147 ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157 ruby(rb_ary_each+0xa5) [0x55555581c795] array.c:2242 ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782 ruby(rb_vm_exec+0x19f) [0x5555557d183f] vm.c:1951 ruby(rb_iseq_eval+0x30) [0x5555557d2530] vm.c:2190 ruby(load_iseq_eval+0xd6) [0x5555555fa7e6] load.c:592 ruby(require_internal+0x25e) [0x5555555f7f5e] load.c:1022 ruby(rb_require_string+0x27) [0x5555555f74e7] load.c:1094 ruby(rb_f_require_relative+0x5f) [0x5555555f758f] load.c:837 ruby(call_cfunc_1+0x30) [0x5555557f0f70] vm_insnhelper.c:2391 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_call_cfunc+0xad) [0x5555557e521d] vm_insnhelper.c:2574 ruby(vm_call_method_each_type+0xc7) [0x5555557e4af7] vm_insnhelper.c:3040 ruby(vm_call_method+0x19c) [0x5555557e45dc] vm_insnhelper.c:3144 ruby(vm_call_general+0x2d) [0x5555557c8c3d] vm_insnhelper.c:3176 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe357) [0x5555557b0757] insns.def:801 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(rb_iseq_eval_main+0x30) [0x5555557d2670] vm.c:2201 ruby(rb_ec_exec_node+0x16b) [0x55555557e39b] eval.c:296 ruby(ruby_run_node+0x72) [0x55555557e1f2] eval.c:354 ruby(main+0x78) [0x55555557a5d8] main.c:50	2020-06-09 09:52:46 +09:00
卜部昌平	8f3d4090f0	rb_equal_opt: fully static call data This changeset reduces the generated binary of rb_equal_opt from 129 bytes to 17 bytes on my machine, according to nm(1).	2020-06-09 09:52:46 +09:00
卜部昌平	3928c151a6	vm_search_method_fastpath: avoid rb_vm_empty_cc() This is such a hot path that it's worth eliminating a function call. Use the static variable directly instead.	2020-06-09 09:52:46 +09:00
卜部昌平	877238f2d3	check_cfunc: add assertions For debug. Must not change generated binary unless VM_ASSERT is on.	2020-06-09 09:52:46 +09:00
Nobuyoshi Nakada	184f78314e	Properly resolve refinements in defined? on private call [Bug #16932 ]	2020-06-04 02:12:57 +09:00
Nobuyoshi Nakada	8340c773e5	Properly resolve refinements in defined? on method call [Bug #16932 ]	2020-06-04 02:12:57 +09:00
卜部昌平	de5e0f7c06	vm_invoke_proc_block: reduce recursion According to nobu recursion can be longer than my expectation. Limit them here.	2020-06-03 16:13:47 +09:00
卜部昌平	b61e82eac9	vm_call_symbol: check stack overflow VM stack could overflow here. The condition is when a symbol is passed to a block-taking method via &variable, and that symbol has never been used for actual method names (thus yielding that results in calling method_missing), and the VM stack is full (no single word left). This is a once-in-a-blue-moon event. Yet there is a very tiny room of stack overflow. We need to check that.	2020-06-03 16:13:47 +09:00
卜部昌平	ba20e6080d	vm_invoke_block: remove auto qualifier Was (harmless but) redundant.	2020-06-03 16:13:47 +09:00
卜部昌平	6302b96368	vm_insnhelper.c: add space [ci skip] Just cosmetic change to improve readability.	2020-06-03 16:13:47 +09:00
卜部昌平	054c2fdfdf	vm_invoke_symbol_block: reduce MEMCPY This commit changes the number of calls of MEMCPY from... \| send \| &:sym -------------------------\|-------\|------- Symbol already interned \| once \| twice Symbol not pinned yet \| none \| once to: \| send \| &:sym -------------------------\|-------\|------- Symbol already interned \| once \| none Symbol not pinned yet \| twice \| once So it sacrifices exceptional situation for normal path.	2020-06-03 16:13:47 +09:00
卜部昌平	36322942db	vm_invoke_symbol_block: call vm_call_opt_send Symbol#to_proc and Object#send are closely related each other. Why not share their implementations. By doing so we can skip recursive call of vm_exec(), which could benefit for speed.	2020-06-03 16:13:47 +09:00
卜部昌平	973883aaa1	vm_invoke_block: force indirect jump This changeset slightly speeds up on my machine. Calculating ------------------------------------- before after Optcarrot Lan_Master.nes 38.33488426546287 40.89825082589147 fps 40.91288557922081 41.48687465359386 40.96591995270991 41.98499064664184 41.20461943032173 43.67314690779162 42.38344888176518 44.02777536251875 43.43563728880915 44.88695892714136 43.88082889062643 45.11226186242523	2020-06-03 16:13:47 +09:00
卜部昌平	796f9edae0	vm_invoke_block: insertion of unused args This makes it possible for vm_invoke_block to pass its passed arguments verbatimly to calling functions. Because they are tail-called the function calls can be strength-recuced into indirect jumps, which is a huge win.	2020-06-03 16:13:47 +09:00
卜部昌平	ec87a58d55	vm_invoke_block: eliminate goto Use recursion for better readability.	2020-06-03 16:13:47 +09:00
卜部昌平	11c70b3162	vm_invoke_block: move logics around Moved block handler -> captured block conversion from vm_invokeblock to each vm_invoke_*_block functions.	2020-06-03 16:13:47 +09:00
Nobuyoshi Nakada	d05f04d27d	Fixed `defined?` against protected method call Protected methods are restricted to be called according to the class/module in where it is defined, not the actual receiver's class. [Bug #16931]	2020-06-02 19:01:56 +09:00
卜部昌平	40ced763b4	vm_insnhelper.c: merge opt_eq_func / opt_eql_func These two function were almost identical, except in case of T_STRING/T_FLOAT. Why not merge them into one, and let the difference be handled in normal method calls (slowpath). This does not improve runtime performance for me, but at least reduces for instance rb_eql_opt from 653 bytes to 86 bytes on my machine, according to nm(1).	2020-06-02 12:38:01 +09:00
Takashi Kokubun	9d71373c23	Mark vm_stackoverflow as NOINLINE COLDFUNC on JIT to reduce code size and improve locality of hot code.	2020-05-26 23:24:58 -07:00
Jeremy Evans	ad729a1d11	Fix origin iclass pointer for modules If a module has an origin, and that module is included in another module or class, previously the iclass created for the module had an origin pointer to the module's origin instead of the iclass's origin. Setting the origin pointer correctly requires using a stack, since the origin iclass is not created until after the iclass itself. Use a hidden ruby array to implement that stack. Correctly assigning the origin pointers in the iclass caused a use-after-free in GC. If a module with an origin is included in a class, the iclass shares a method table with the module and the iclass origin shares a method table with module origin. Mark iclass origin with a flag that notes that even though the iclass is an origin, it shares a method table, so the method table should not be garbage collected. The shared method table will be garbage collected when the module origin is garbage collected. I've tested that this does not introduce a memory leak. This change caused a VM assertion failure, which was traced to callable method entries using the incorrect defined_class. Update rb_vm_check_redefinition_opt_method and find_defined_class_by_owner to treat iclass origins different than class origins to avoid this issue. This also includes a fix for Module#included_modules to skip iclasses with origins. Fixes [Bug #16736]	2020-05-22 20:31:23 -07:00
Koichi Sasada	cbd45af2a9	fix memory leak of ccs rb_callable_method_entry() creates ccs entry in cc_tbl, but this code overwrite by insert newly created ccs and overwrote ccs never freed. [Bug #16900]	2020-05-22 11:23:20 +09:00
卜部昌平	15e977349e	more on NULL versus functions Function pointers are not void*. See also `115fec062c` `ce4ea956d2` `8427fca49b`	2020-05-11 16:47:25 +09:00
卜部昌平	9e41a75255	sed -i 's\|ruby/impl\|ruby/internal\|' To fix build failures.	2020-05-11 09:24:08 +09:00
卜部昌平	d7f4d732c1	sed -i s\|ruby/3\|ruby/impl\|g This shall fix compile errors.	2020-05-11 09:24:08 +09:00
Nobuyoshi Nakada	cfe0e660f4	Disable -Wswitch warning when VM_CHECK_MODE	2020-05-03 00:15:56 +09:00
Takashi Kokubun	79f3403be0	Invalidate fastpath when calling attr_reader by super The same bug as `8355a99883` existed in attr_reader too.	2020-04-14 23:49:29 -07:00
Takashi Kokubun	8355a99883	Invalidate fastpath when calling attr_writer by super We started to use fastpath on invokesuper when a method is not refinements since `5c27681813`, but we shouldn't have used fastpath for attr_writer either. `cc->aux_.attr_index` is for an actual receiver class, while we store its superclass in `cc->klass` and therefore there's no way to properly invalidate attr_writer's inline cache when it's called by super. [Bug #16785] I suspect the same bug also exists in attr_reader. I'll address that in another commit.	2020-04-14 23:32:13 -07:00
Takashi Kokubun	310ef9f40b	Make vm_call_cfunc_with_frame a fastpath (#3027 ) when there's no need to call CALLER_SETUP_ARG and CALLER_REMOVE_EMPTY_KW_SPLAT (i.e. !rb_splat_or_kwargs_p(ci) && !calling->kw_splat). Micro benchmark: ``` $ benchmark-driver -v --rbenv 'before;after' benchmark/vm_send_cfunc.yml --repeat-count=4 before: ruby 2.8.0dev (2020-04-13T23:45:05Z master `b9d3ceee8f`) [x86_64-linux] after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux] Calculating ------------------------------------- before after vm_send_cfunc 69.585M 88.724M i/s - 100.000M times in 1.437097s 1.127096s Comparison: vm_send_cfunc after: 88723605.2 i/s before: 69584737.1 i/s - 1.28x slower ``` Optcarrot: ``` $ benchmark-driver -v --rbenv 'before;after' benchmark.yml --repeat-count=12 --output=all before: ruby 2.8.0dev (2020-04-13T23:45:05Z master `b9d3ceee8f`) [x86_64-linux] after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux] Calculating ------------------------------------- before after Optcarrot Lan_Master.nes 50.76119601545175 42.73858236484051 fps 50.76388649761503 51.04211379912850 50.80930672252514 51.39455790755538 50.90236000778749 51.75656936556145 51.01744746340430 51.86875277356489 51.06495279015112 51.88692482485558 51.07785337168974 51.93429603190578 51.20163525187862 51.95768145071314 51.34671771913112 52.45577266040274 51.35918340835583 52.53163888762858 51.46641337418146 52.62172484121034 51.50835463462257 52.85064021113239 ```	2020-04-13 20:32:59 -07:00
Takashi Kokubun	5c27681813	Enable fastpath on invokesuper (#3021 ) Fastpath has not been used for invokesuper since it has set vm_call_super_method on every invocation. Because it seems to be blocked only by refinements, try enabling fastpath on invokesuper when cme is not for refinements. While this patch itself should be helpful for VM performance, a part of this patch's motivation is to unblock inlining invokesuper on JIT. $ benchmark-driver -v --rbenv 'before;after' benchmark/vm2_super.yml --repeat-count=4 before: ruby 2.8.0dev (2020-04-11T15:19:58Z master `a01bda5949`) [x86_64-linux] after: ruby 2.8.0dev (2020-04-12T02:00:08Z invokesuper-fastpath c171984ee3) [x86_64-linux] Calculating ------------------------------------- before after vm2_super 20.031M 32.860M i/s - 6.000M times in 0.299534s 0.182593s Comparison: vm2_super after: 32859885.2 i/s before: 20031097.3 i/s - 1.64x slower	2020-04-11 20:45:22 -07:00
Jeremy Evans	900e83b501	Turn class variable warnings into exceptions This changes the following warnings: * warning: class variable access from toplevel * warning: class variable @foo of D is overtaken by C into RuntimeErrors. Handle defined?(@@foo) at toplevel by returning nil instead of raising an exception (the previous behavior warned before returning nil when defined? was used). Refactor the specs to avoid the warnings even in older versions. The specs were checking for the warnings, but the purpose of the related specs as evidenced from their description is to test for behavior, not for warnings. Fixes [Bug #14541]	2020-04-10 00:29:05 -07:00
卜部昌平	9e6e39c351	Merge pull request #2991 from shyouhei/ruby.h Split ruby.h	2020-04-08 13:28:13 +09:00
Jeremy Evans	d2c41b1bff	Reduce allocations for keyword argument hashes Previously, passing a keyword splat to a method always allocated a hash on the caller side, and accepting arbitrary keywords in a method allocated a separate hash on the callee side. Passing explicit keywords to a method that accepted a keyword splat did not allocate a hash on the caller side, but resulted in two hashes allocated on the callee side. This commit makes passing a single keyword splat to a method not allocate a hash on the caller side. Passing multiple keyword splats or a mix of explicit keywords and a keyword splat still generates a hash on the caller side. On the callee side, if arbitrary keywords are not accepted, it does not allocate a hash. If arbitrary keywords are accepted, it will allocate a hash, but this commit uses a callinfo flag to indicate whether the caller already allocated a hash, and if so, the callee can use the passed hash without duplicating it. So this commit should make it so that a maximum of a single hash is allocated during method calls. To set the callinfo flag appropriately, method call argument compilation checks if only a single keyword splat is given. If only one keyword splat is given, the VM_CALL_KW_SPLAT_MUT callinfo flag is not set, since in that case the keyword splat is passed directly and not mutable. If more than one splat is used, a new hash needs to be generated on the caller side, and in that case the callinfo flag is set, indicating the keyword splat is mutable by the callee. In compile_hash, used for both hash and keyword argument compilation, if compiling keyword arguments and only a single keyword splat is used, pass the argument directly. On the caller side, in vm_args.c, the callinfo flag needs to be recognized and handled. Because the keyword splat argument may not be a hash, it needs to be converted to a hash first if not. Then, unless the callinfo flag is set, the hash needs to be duplicated. The temporary copy of the callinfo flag, kw_flag, is updated if a hash was duplicated, to prevent the need to duplicate it again. If we are converting to a hash or duplicating a hash, we need to update the argument array, which can including duplicating the positional splat array if one was passed. CALLER_SETUP_ARG and a couple other places needs to be modified to handle similar issues for other types of calls. This includes fairly comprehensive tests for different ways keywords are handled internally, checking that you get equal results but that keyword splats on the caller side result in distinct objects for keyword rest parameters. Included are benchmarks for keyword argument calls. Brief results when compiled without optimization: def kw(a: 1) a end def kws(kw) kw end h = {a: 1} kw(a: 1) # about same kw(h) # 2.37x faster kws(a: 1) # 1.30x faster kws(h) # 2.19x faster kw(a: 1, h) # 1.03x slower kw(h, h) # about same kws(a: 1, h) # 1.16x faster kws(h, **h) # 1.14x faster	2020-03-17 12:09:43 -07:00
卜部昌平	f12b9a3338	%p is for void * See also `35eb12c063` `6f5eb28507` `687308cf0d` `b6a2d63eb3`	2020-03-04 12:30:42 +09:00
Koichi Sasada	91de0daaa2	method_missing_reason should be set. send() has special method launcher in VM and it has special method_missing caller. This path doesn't set ec->method_missing_reason which is used at exception creation, so setup this information. Without this setting, NoMethodError exception becomes NameError. This patch will fix: http://ci.rvm.jp/results/trunk-random1@phosphorus-docker/2761643	2020-03-03 02:44:02 +09:00
Koichi Sasada	18674aef0d	check imemo_type check imemo_type to debug http://ci.rvm.jp/results/trunk-vm-asserts@silicon-docker/2744755	2020-02-27 10:50:20 +09:00
Koichi Sasada	b9007b6c54	Introduce disposable call-cache. This patch contains several ideas: (1) Disposable inline method cache (IMC) for race-free inline method cache * Making call-cache (CC) as a RVALUE (GC target object) and allocate new CC on cache miss. * This technique allows race-free access from parallel processing elements like RCU. (2) Introduce per-Class method cache (pCMC) * Instead of fixed-size global method cache (GMC), pCMC allows flexible cache size. * Caching CCs reduces CC allocation and allow sharing CC's fast-path between same call-info (CI) call-sites. (3) Invalidate an inline method cache by invalidating corresponding method entries (MEs) * Instead of using class serials, we set "invalidated" flag for method entry itself to represent cache invalidation. * Compare with using class serials, the impact of method modification (add/overwrite/delete) is small. * Updating class serials invalidate all method caches of the class and sub-classes. * Proposed approach only invalidate the method cache of only one ME. See [Feature #16614] for more details.	2020-02-22 09:58:59 +09:00
Koichi Sasada	f2286925f0	VALUE size packed callinfo (ci). Now, rb_call_info contains how to call the method with tuple of (mid, orig_argc, flags, kwarg). Most of cases, kwarg == NULL and mid+argc+flags only requires 64bits. So this patch packed rb_call_info to VALUE (1 word) on such cases. If we can not represent it in VALUE, then use imemo_callinfo which contains conventional callinfo (rb_callinfo, renamed from rb_call_info). iseq->body->ci_kw_size is removed because all of callinfo is VALUE size (packed ci or a pointer to imemo_callinfo). To access ci information, we need to use these functions: vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci). struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg. rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc() is temporary removed because cd->ci should be marked.	2020-02-22 09:58:59 +09:00
Nobuyoshi Nakada	0b4500d982	Adjusted indent [ci skip]	2020-02-22 00:17:31 +09:00
Koichi Sasada	99a8742067	should be compared with called_id me->called_id and me->def->original_id can be different sometimes so we should compare with called_id, which is mtbl's key. (fix GH-PR #2869)	2020-02-13 03:30:22 +09:00
John Hawthorn	ed7b46b66b	Use inline cache for super calls	2020-02-13 00:14:55 +09:00
Koichi Sasada	a635c93fde	support MJIT with debug option. VM_CHECK_MODE > 0 with optflags=-O0 can not run JIT tests because of link problems. This patch fix them.	2020-02-03 16:57:41 +09:00
Jeremy Evans	beae6cbf0f	Fully separate positional arguments and keyword arguments This removes the warnings added in 2.7, and changes the behavior so that a final positional hash is not treated as keywords or vice-versa. To handle the arg_setup_block splat case correctly with keyword arguments, we need to check if we are taking a keyword hash. That case didn't have a test, but it affects real-world code, so add a test for it. This removes rb_empty_keyword_given_p() and related code, as that is not needed in Ruby 3. The empty keyword case is the same as the no keyword case in Ruby 3. This changes rb_scan_args to implement keyword argument separation for C functions when the : character is used. For backwards compatibility, it returns a duped hash. This is a bad idea for performance, but not duping the hash breaks at least Enumerator::ArithmeticSequence#inspect. Instead of having RB_PASS_CALLED_KEYWORDS be a number, simplify the code by just making it be rb_keyword_given_p().	2020-01-02 18:40:45 -08:00
卜部昌平	5e22f873ed	decouple internal.h headers Saves comitters' daily life by avoid #include-ing everything from internal.h to make each file do so instead. This would significantly speed up incremental builds. We take the following inclusion order in this changeset: 1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very first thing among everything). 2. RUBY_EXTCONF_H if any. 3. Standard C headers, sorted alphabetically. 4. Other system headers, maybe guarded by #ifdef 5. Everything else, sorted alphabetically. Exceptions are those win32-related headers, which tend not be self- containing (headers have inclusion order dependencies).	2019-12-26 20:45:12 +09:00
卜部昌平	0958e19ffb	add several __has_something macro With these macros implemented we can write codes just like we can assume the compiler being clang. MSC_VERSION_SINCE is defined to implement those macros, but turned out to be handy for other places. The -fdeclspec compiler flag is necessary for clang to properly handle __has_declspec().	2019-12-26 20:45:12 +09:00
Nobuyoshi Nakada	db16629008	Fixed misspellings Fixed misspellings reported at [Bug #16437], only in ruby and rubyspec.	2019-12-20 09:32:42 +09:00
卜部昌平	f054f11a38	per-method serial number Methods and their definitions can be allocated/deallocated on-the-fly. One pathological situation is when a method is deallocated then another one is allocated immediately after that. Address of those old/new method entries/definitions can be the same then, depending on underlying malloc/free implementation. So pointer comparison is insufficient. We have to check the contents. To do so we introduce def->method_serial, which is an integer unique to that specific method definition. PS: Note that method_serial being uintptr_t rather than rb_serial_t is intentional. This is because rb_serial_t can be bigger than a pointer on a 32bit system (rb_serial_t is at least 64bit). In order to preserve old packing of struct rb_call_cache, rb_serial_t is inappropriate.	2019-12-18 12:52:28 +09:00
Koichi Sasada	fbe229906b	add debug counter to count `call` reusing cases.	2019-12-17 13:15:38 +09:00
卜部昌平	ba11a74745	ensure cc->def == cc->me->def The equation shall hold for every call cache. However prior to this changeset cc->me could be updated without also updating cc->def. Let's make it sure by introducing new macro named CC_SET_ME which sets cc->me and cc->def at once.	2019-12-16 17:52:18 +09:00
Jeremy Evans	55b7ba3686	Make super in instance_eval in method in module raise TypeError This makes behavior the same as super in instance_eval in method in class. The reason this wasn't implemented before is that there is a check to determine if the self in the current context is of the expected class, and a module itself can be included in multiple classes, so it doesn't have an expected class. Implementing this requires giving iclasses knowledge of which class created them, so that super call in the module method knows the expected class for super calls. This reference is called includer, and should only be set for iclasses. Note that the approach Ruby uses in this check is not robust. If you instance_eval another object of the same class and call super, instead of an TypeError, you get super called with the instance_eval receiver instead of the method receiver. Truly fixing super would require keeping a reference to the super object (method receiver) in each frame where scope has changed, and using that instead of current self when calling super. Fixes [Bug #11636]	2019-12-12 15:50:19 +09:00
Aaron Patterson	2c8d186c6e	Introduce an "Inline IVAR cache" struct This commit introduces an "inline ivar cache" struct. The reason we need this is so compaction can differentiate from an ivar cache and a regular inline cache. Regular inline caches contain references to `VALUE` and ivar caches just contain references to the ivar index. With this new struct we can easily update references for inline caches (but not inline var caches as they just contain an int)	2019-12-05 13:37:02 -08:00
Koichi Sasada	36da0b3da1	check interrupts at each frame pop timing. Asynchronous events such as signal trap, finalization timing, thread switching and so on are managed by "interrupt_flag". Ruby's threads check this flag periodically and if a thread does not check this flag, above events doesn't happen. This checking is CHECK_INTS() (related) macro and it is placed at some places (laeve instruction and so on). However, at the end of C methods, C blocks (IMEMO_IFUNC) etc there are no checking and it can introduce uninterruptible thread. To modify this situation, we decide to place CHECK_INTS() at vm_pop_frame(). It increases interrupt checking points. [Bug #16366] This patch can introduce unexpected events...	2019-11-29 17:47:02 +09:00
Yusuke Endoh	191ce5344e	Reduce duplicated warnings for the change of Ruby 3 keyword arguments By this change, the following code prints only one warning. ``` def foo(*opt); end 100.times { foo({kw:1}) } ``` A global variable `st_table caller_to_callees` is a map from caller to a set of callee methods. It remembers that a warning is already printed for each pair of caller and callee. [Feature #16289]	2019-11-29 17:32:27 +09:00
Koichi Sasada	f38b6d197f	Revert "export for MJIT" This reverts commit `2e6f1cf8b2`.	2019-11-29 03:22:24 +09:00
Koichi Sasada	e4e41840ad	Revert "* remove trailing spaces. [ci skip]" This reverts commit `27d0d7c0d3`.	2019-11-29 03:22:13 +09:00
git	27d0d7c0d3	* remove trailing spaces. [ci skip]	2019-11-29 03:18:19 +09:00
Koichi Sasada	2e6f1cf8b2	export for MJIT	2019-11-29 03:17:52 +09:00
Koichi Sasada	dd723771c1	fastpath for ivar read of FL_EXIVAR objects. vm_getivar() provides fastpath for T_OBJECT by caching an index of ivar. This patch also provides fastpath for FL_EXIVAR objects. FL_EXIVAR objects have an each ivar array and index can be cached as T_OBJECT. To access this ivar array, generic_iv_tbl is exposed by rb_ivar_generic_ivtbl() (declared in variable.h which is newly introduced). Benchmark script: Benchmark.driver(repeat_count: 3){\|x\| x.executable name: 'clean', command: %w'../clean/miniruby' x.executable name: 'trunk', command: %w'./miniruby' objs = [Object.new, 'str', {a: 1, b: 2}, [1, 2]] objs.each.with_index{\|obj, i\| rep = obj.inspect rep = 'Object.new' if /\#/ =~ rep x.prelude str = %Q{ v#{i} = #{rep} def v#{i}.foo @iv # ivar access method (attr_reader) end v#{i}.instance_variable_set(:@iv, :iv) } puts str x.report %Q{ v#{i}.foo } } } Result: v0.foo # T_OBJECT clean: 85387141.8 i/s trunk: 85249373.6 i/s - 1.00x slower v1.foo # T_STRING trunk: 57894407.5 i/s clean: 39957178.6 i/s - 1.45x slower v2.foo # T_HASH trunk: 56629413.2 i/s clean: 39227088.9 i/s - 1.44x slower v3.foo # T_ARRAY trunk: 55797530.2 i/s clean: 38263572.9 i/s - 1.46x slower	2019-11-29 03:11:04 +09:00
Kazuhiro NISHIYAMA	09e76e9828	Improve consistency of bool/true/false	2019-11-25 15:09:09 +09:00
Koichi Sasada	e27acb6148	add fast path for argc==0. If calling builtin functions with no arguments, we don't need to calculate argv location.	2019-11-25 14:04:21 +09:00
卜部昌平	f6239ce0fc	peep-hole optimize VM instructions Some minor optimizations. Calculating ------------------------------------- ours trunk vm2_regexp 8.479M 8.346M i/s - 6.000M times in 0.707612s 0.718916s vm2_regexp_invert 8.605M 8.350M i/s - 6.000M times in 0.697298s 0.718576s Comparison: vm2_regexp ours: 8479223.3 i/s trunk: 8345893.8 i/s - 1.02x slower vm2_regexp_invert ours: 8604647.4 i/s trunk: 8349852.8 i/s - 1.03x slower Calculating ------------------------------------- ours+jit trunk+jit Optcarrot Lan_Master.nes 68.603 64.167 fps Comparison: Optcarrot Lan_Master.nes ours+jit: 68.6 fps trunk+jit: 64.2 fps - 1.07x slower	2019-11-19 13:56:13 +09:00
Koichi Sasada	57cd4623cf	should not use __func__	2019-11-18 13:53:32 +09:00
Koichi Sasada	5e34ab5406	add casts. add casts to avoid compile error. http://ci.rvm.jp/results/trunk_clang_39@silicon-docker/2402215	2019-11-18 10:36:48 +09:00
Koichi Sasada	71fee9bc72	vm_invoke_builtin_delegate with start index. opt_invokebuiltin_delegate and opt_invokebuiltin_delegate_leave invokes builtin functions with same parameters of the method. This technique eliminate stack push operations. However, delegation parameters should be completely same as given parameters. (e.g. `def foo(a, b, c) __builtin_foo(a, b, c)` is okay, but __builtin_foo(b, c) is not allowed) This patch relaxes this restriction. ISeq has a local variables table which includes parameters. For example, the method defined as `def foo(a, b, c) x=y=nil`, then local variables table contains [a, b, c, x, y]. If calling builtin-function with arguments which are sub-array of the lvar table, use opt_invokebuiltin_delegate instruction with start index. For example, `__builtin_foo(b, c)`, `__builtin_bar(c, x, y)` is okay, and so on.	2019-11-18 10:16:11 +09:00
Koichi Sasada	179062dd80	move rb_vm_lvar_exposed() correctly. rb_vm_lvar_exposed() is prepared for __builtin_inline!(), needed for mini_builtin.c and builtin.c. However, it's only on builtin.c. So move it to make it as a part of VM.	2019-11-14 04:21:24 +09:00
Dylan Thacker-Smith	ac112f2b5d	Avoid top-level search for nested constant reference from nil in defined? Fixes [Bug #16332] Constant access was changed to no longer allow top-level constant access through `nil`, but `defined?` wasn't changed at the same time to stay consistent. Use a separate defined type to distinguish between a constant referenced from the current lexical scope and one referenced from another namespace.	2019-11-13 15:36:58 +09:00
Koichi Sasada	9142f802f1	rewrite comment. Pointed by nagachika-san. https://ruby-trunk-changes.hatenablog.com/entry/ruby_trunk_changes_20191109	2019-11-11 16:47:50 +09:00
Koichi Sasada	43ceedecc0	use STACK_ADDR_FROM_TOP() vm_invoke_builtin() accesses VM stack via cfp->sp. However, MJIT can use their own stack. To access them appropriately, we need to use STACK_ADDR_FROM_TOP().	2019-11-09 16:18:58 +09:00
Koichi Sasada	21f7cca2c6	initialize kw special local var. A method which has keyword parameters has an implicit local variable to specify which keywords are (un)specified. vm_call_iseq_setup_kwparm_nokwarg() is special function to invoke a ISeq method without any keyword arguments. However, it should also initialize the special local var. Without this initialization, the implicit lvar can points a freed (T_NONE) object.	2019-11-09 10:04:04 +09:00
卜部昌平	90fc555258	name the result of calccall This is a pure refactoring for better understanding of what is happening here. Should change nothing but readability.	2019-11-08 12:09:01 +09:00
卜部昌平	a1a08ac9aa	describe vm_cache_check_for_class_serial [ci skip] Added comments describing what it is. Requested by ko1.	2019-11-08 10:31:06 +09:00
Koichi Sasada	46acd0075d	support builtin features with Ruby and C. Support loading builtin features written in Ruby, which implement with C builtin functions. [Feature #16254] Several features: (1) Load .rb file at boottime with native binary. Now, prelude.rb is loaded at boottime. However, this file is contained into the interpreter as a text format and we need to compile it. This patch contains a feature to load from binary format. (2) __builtin_func() in Ruby call func() written in C. In Ruby file, we can write `__builtin_func()` like method call. However this is not a method call, but special syntax to call a function `func()` written in C. C functions should be defined in a file (same compile unit) which load this .rb file. Functions (`func` in above example) should be defined with (a) 1st parameter: rb_execution_context_t ec (b) rest parameters (0 to 15). (c) VALUE return type. This is very similar requirements for functions used by rb_define_method(), however `rb_execution_context_t ec` is new requirement. (3) automatic C code generation from .rb files. tool/mk_builtin_loader.rb creates a C code to load .rb files needed by miniruby and ruby command. This script is run by BASERUBY, so *.rb should be written in BASERUBY compatbile syntax. This script load a .rb file and find all of __builtin_ prefix method calls, and generate a part of C code to export functions. tool/mk_builtin_binary.rb creates a C code which contains binary compiled Ruby files needed by ruby command.	2019-11-08 09:09:29 +09:00
卜部昌平	d45a013a1a	extend rb_call_cache Prior to this changeset, majority of inline cache mishits resulted into the same method entry when rb_callable_method_entry() resolves a method search. Let's not call the function at the first place on such situations. In doing so we extend the struct rb_call_cache from 44 bytes (in case of 64 bit machine) to 64 bytes, and fill the gap with secondary class serial(s). Call cache's class serials now behavies as a LRU cache. Calculating ------------------------------------- ours 2.7 2.6 vm2_poly_same_method 2.339M 1.744M 1.369M i/s - 6.000M times in 2.565086s 3.441329s 4.381386s Comparison: vm2_poly_same_method ours: 2339103.0 i/s 2.7: 1743512.3 i/s - 1.34x slower 2.6: 1369429.8 i/s - 1.71x slower	2019-11-07 17:41:30 +09:00
卜部昌平	6ff1250739	rb_method_basic_definition_p with CC Noticed that rb_method_basic_definition_p is frequently called. Its callers include vm_caller_setup_args_block(), rb_hash_default_value(), rb_num_neative_int_p(), and a lot more. It seems worth caching the method resolution part. Majority of rb_method_basic_definion_p() usages take fixed class and fixed method id combinations. Calculating ------------------------------------- ours trunk so_matrix 2.379 2.115 i/s - 1.000 times in 0.420409s 0.472879s Comparison: so_matrix ours: 2.4 i/s trunk: 2.1 i/s - 1.12x slower	2019-11-05 11:39:35 +09:00
卜部昌平	cc5580f175	fix bug in keyword + protected combination Test included for the situation formerly was not working.	2019-10-28 14:38:05 +09:00
卜部昌平	356e203a3a	more on struct rb_call_data Replacing adjacent struct rb_call_info and struct rb_call_cache into a struct rb_call_data.	2019-10-25 12:24:22 +09:00
wanabe	4ff2c58f91	retry tailcall optimization (#2529 ) Sorry, `f62f90367f` is push miss.	2019-10-25 04:40:39 +09:00
Jeremy Evans	d6a2507e49	Duplicate hash when converting keyword hash to keywords This mirrors the behavior when manually splatting a hash. This mirrors the changes made in setup_parameters_complex in `6081ddd6e6`, so that splatting to a non-iseq method works the same as splatting to an iseq method.	2019-10-24 12:35:04 -07:00
Alan Wu	89e7997622	Combine call info and cache to speed up method invocation To perform a regular method call, the VM needs two structs, `rb_call_info` and `rb_call_cache`. At the moment, we allocate these two structures in separate buffers. In the worst case, the CPU needs to read 4 cache lines to complete a method call. Putting the two structures together reduces the maximum number of cache line reads to 2. Combining the structures also saves 8 bytes per call site as the current layout uses separate two pointers for the call info and the call cache. This saves about 2 MiB on Discourse. This change improves the Optcarrot benchmark at least 3%. For more details, see attached bugs.ruby-lang.org ticket. Complications: - A new instruction attribute `comptime_sp_inc` is introduced to calculate SP increase at compile time without using call caches. At compile time, a `TS_CALLDATA` operand points to a call info struct, but at runtime, the same operand points to a call data struct. Instruction that explicitly define `sp_inc` also need to define `comptime_sp_inc`. - MJIT code for copying call cache becomes slightly more complicated. - This changes the bytecode format, which might break existing tools. [Misc #16258]	2019-10-24 18:03:42 +09:00
Nobuyoshi Nakada	42edb05626	extracted declare_under	2019-10-10 01:08:42 +09:00
Koichi Sasada	ddf5020e4f	Revert "tailcall optimization again (#2528 )" This reverts commit `f62f90367f`.	2019-10-06 17:01:00 +09:00

1 2 3 4 5 ...

945 Коммитов