github/ruby - ruby

Commit graph

Author	SHA1	Message	Date
eileencodes	c8e157bb5c	Implement getclassvariable in yjit Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2021-10-20 18:19:42 -04:00
Alan Wu	78b5e95e41	Add a slowpath for opt_getinlinecache Before this change, when we encounter a constant cache that is specific to a lexical scope, we unconditionally exit. This change falls back to the interpreter's cache in this situation. This should help constant expressions in `class << self`, which is popular at Shopify due to the style guide. This change relies on the cache being warm while compiling to detect the need for checking the lexical scope for simplicity.	2021-10-20 18:19:41 -04:00
John Hawthorn	5e37f280d1	Remove vm_opt_aset	2021-10-20 18:19:41 -04:00
Aaron Patterson	bf8557f487	Add comments for new function	2021-10-20 18:19:40 -04:00
Aaron Patterson	5bc0343261	Refactor attrset to use a function This new function will do the write barrier / resize the object / check frozen for us	2021-10-20 18:19:40 -04:00
John Hawthorn	10f1d808d5	Remove rb_opt_equality_specialized	2021-10-20 18:19:40 -04:00
Kevin Newton	be648e0940	Implement splatarray	2021-10-20 18:19:37 -04:00
Maxime Chevalier-Boisvert	da30f21ab5	Try to fix MJIT symbol clash with cargo cult	2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert	54fe43b45c	Implement defined bytecode (#39)	2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert	d6412126bc	Implement setivar with a plain old function call (#34) * Implement setivar with a plain old function call * Remove return	2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert	0c3842d154	Implement opt_aset as interpreter handler call	2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert	c9feb72b65	Implement opt_mod as call to interpreter function (#29)	2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert	e2c1d69331	Implement opt_eq by calling interpreter function (#28)	2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert	e5f8b41786	Implement send with alias method (#23) * Implement send with alias method * Add alias_method tests	2021-10-20 18:19:34 -04:00
Alan Wu	36134f7d29	Implement calls to methods with simple optional params * Implement calls to methods with simple optional params * Remove unnecessary MJIT_STATIC See comment for MJIT_STATIC. I added it not knowing whether it's required because the function next to it has it. Don't use it and wait for problems to come up instead. * Better naming, some comments * Count bailing on kw only iseqs On railsbench: ``` opt_send_without_block exit reasons: bmethod 59729 (27.7%) optimized_method 59137 (27.5%) iseq_complex_callee 41362 (19.2%) alias_method 33346 (15.5%) callsite_not_simple 19170 ( 8.9%) iseq_only_keywords 1300 ( 0.6%) kw_splat 1299 ( 0.6%) cfunc_ruby_array_varg 18 ( 0.0%) ```	2021-10-20 18:19:34 -04:00
Alan Wu	b626dd7211	YJIT: Fancier opt_getinlinecache Make sure `opt_getinlinecache` is in a block all on its own, and invalidate it from the interpreter when `opt_setinlinecache` runs. It will recompile with a filled cache the second time around. This lets YJIT run well when the IC for a constant is cold.	2021-10-20 18:19:33 -04:00
Alan Wu	5d834bcf9f	YJIT: lazy polymorphic getinstancevariable Lazily compile out a chain of checks for different known classes and whether `self` embeds its ivars or not. * Remove trailing whitespaces * Get proper address in Capstone disassembly * Lowercase address in Capstone disassembly Capstone uses lowercase for jump targets in generated listings. Let's match it. * Use the same successor in getivar guard chains Cuts down on duplication * Address reviews * Fix copypasta error * Add a comment	2021-10-20 18:19:31 -04:00
Maxime Chevalier-Boisvert	9d8cc01b75	WIP JIT-to-JIT returns	2021-10-20 18:19:28 -04:00
Nobuyoshi Nakada	8d6dbecc80	Remove useless casts	2021-10-19 17:09:32 +09:00
Nobuyoshi Nakada	ec021e469d	Get rid of type-punning cast	2021-10-19 17:08:25 +09:00
S.H	dc9112cf10	Using NIL_P macro instead of `== Qnil`	2021-10-03 22:34:45 +09:00
Jeremy Evans	9b04909a85	Introduce rb_vm_call_with_refinements to DRY up a few calls	2021-10-01 08:12:46 -09:00
Jeremy Evans	1f5f8a187a	Make Array#min/max optimization respect refined methods Pass in ec to vm_opt_newarray_{max,min}. Avoids having to call GET_EC inside the functions, for better performance. While here, add a test for Array#min/max being redefined to test_optimization.rb. Fixes [Bug #18180]	2021-09-30 15:18:14 -07:00
Nobuyoshi Nakada	70624ae43d	Extract hook macro for attributes	2021-09-19 16:27:47 +09:00
Nobuyoshi Nakada	cd829bb078	Remove printf family from the mjit header Linking printf family functions makes mjit objects to link unnecessary code.	2021-09-11 08:41:32 +09:00
Jeremy Evans	2d98593bf5	Support tracing of attr_reader and attr_writer In vm_call_method_each_type, check for c_call and c_return events before dispatching to vm_call_ivar and vm_call_attrset. With this approach, the call cache will still dispatch directly to those functions, so this change will only decrease performance for the first (uncached) call, and even then, the performance decrease is very minimal. This approach requires that we clear the call caches when tracing is enabled or disabled. The approach currently switches all vm_call_ivar and vm_call_attrset call caches to vm_call_general any time tracing is enabled or disabled. So it could theoretically result in a slowdown for code that constantly enables or disables tracing. This approach does not handle targeted tracepoints, but from my testing, c_call and c_return events are not supported for targeted tracepoints, so that shouldn't matter. This includes a benchmark showing the performance decrease is minimal if detectable at all. Fixes [Bug #16383] Fixes [Bug #10470] Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2021-08-29 07:23:39 -07:00
Nobuyoshi Nakada	a0a8f2abf5	Get rid of type-punning pointer casts [Bug #18062]	2021-08-11 12:07:44 +09:00
S.H	378e8cdad6	Using RBOOL macro	2021-08-02 12:06:44 +09:00
Nobuyoshi Nakada	0700ee0e94	Refactor class variable cache functions Extracted repeated code as update_classvariable_cache. When cvc table is not set in getclassvariable, an empty table was created but it has no id and would cause [BUG], so made the code same as setclassvariable.	2021-06-23 10:55:22 +09:00
eileencodes	b91b3bc771	Add a cache for class variables Redo of 34a2acdac788602c14bf05fb616215187badd504 and 931138b00696419945dc03e10f033b1f53cd50f3 which were reverted. GitHub PR #4340. This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up the ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master `9e5105c`) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19] \| \|compare-ruby\|built-ruby\| \|:--------\|-----------:\|---------:\| \|vm_cvar \| 5.681M\| 36.980M\| \| \| -\| 6.51x\| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do \|x\| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstrate the difference between master and our cache when the number of modules increases. 
This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2021-06-18 10:02:44 -07:00
Aaron Patterson	07f055bb13	Revert "Filling cache values on cvar write" This reverts commit `08de37f9fa`. This reverts commit `e8ae922b62`.	2021-05-11 13:31:00 -07:00
eileencodes	08de37f9fa	Filling cache values on cvar write Instead of on read. Once it's in the inline cache we never have to make one again. We want to eventually put the value into the cache, and the best opportunity to do that is when you write the value.	2021-05-11 12:04:27 -07:00
eileencodes	e8ae922b62	Add a cache for class variables This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up the ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master `9e5105ca45`) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19] \| \|compare-ruby\|built-ruby\| \|:--------\|-----------:\|---------:\| \|vm_cvar \| 5.681M\| 36.980M\| \| \| -\| 6.51x\| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do \|x\| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstrate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. 
Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2021-05-11 12:04:27 -07:00
Nobuyoshi Nakada	0bbab1e515	Protoized old pre-ANSI K&R style declarations and definitions	2021-05-07 00:04:36 +09:00
Alan Wu	dee58d7ae7	Add back checks for empty kw splat with tests (#4405) This reverts commit `a224ce8150`. Turns out the checks are needed to handle splatting an array with an empty ruby2 keywords hash.	2021-04-23 22:17:20 -04:00
Jeremy Evans	1f2b5c6dfe	Remove part of comment that is no longer accurate In Ruby 2.7, empty keyword splats could be added back for backwards compatibility. However, that stopped in Ruby 3.0.	2021-04-23 16:44:01 -07:00
Alan Wu	a224ce8150	Remove unnecessary checks for empty kw splat These two checks are surrounded by an if that ensures the call site is not a kw splat call site.	2021-04-23 19:37:03 -04:00
Koichi Sasada	ecfa8dcdba	fix return from orphan Proc in lambda A "return" statement in a Proc in a lambda like: `lambda{ proc{ return }.call }` should return from the outer lambda block. However, the inner Proc can become an orphan Proc of the lambda block. Such a "return" escapes the outer scope like a method return, but this behavior was decided to be a bug. [Bug #17105] This patch raises LocalJumpError by checking whether the proc is orphaned from its lambda block before escaping via "return". Most of the tests were written by Jeremy Evans https://github.com/ruby/ruby/pull/4223	2021-04-02 09:25:33 +09:00
Aaron Patterson	04a814931a	return bool instead of VALUE	2021-03-17 10:55:37 -07:00
Aaron Patterson	ea817c60fc	Refactor vm_defined to return a boolean We just need this function to return whether or not the thing we're looking for is defined. If it's defined, return something true, otherwise false.	2021-03-17 10:55:37 -07:00
Aaron Patterson	c3971bea33	Stop calling `rb_iseq_defined_string` in vm_defined We already have access to the string from the iseqs, so we can stop calling this function.	2021-03-17 10:55:37 -07:00
John Hawthorn	9d0ae387c8	Remove DEFINED_IVAR2 from enum This version of defined? doesn't seem to be possible to emit anymore.	2021-03-10 09:38:20 -08:00
Koichi Sasada	813fe4c256	opt_equality_by_mid for rb_equal_opt This patch improves the performance of sequential and parallel execution of rb_equal() (and rb_eql()). [Bug #17497] rb_equal_opt (and rb_eql_opt) does not have own cd and it waste a time to initialize cd. This patch introduces opt_equality_by_mid() to check equality without cd. Furthermore, current master uses "static" cd on rb_equal_opt (and rb_eql_opt) and it hurts CPU caches on multi-thread execution. Now they are gone so there are no bottleneck on parallel execution.	2021-02-13 11:51:33 +09:00
Koichi Sasada	1ecda21366	global call-cache cache table for rb_funcall* rb_funcall* (rb_funcall(), rb_funcallv(), ...) functions invoke a Ruby method with the given receiver. Ruby 2.7 introduced inline method cache with static memory area. However, Ruby 3.0 reimplemented the method cache data structures and the inline cache was removed. Without inline cache, rb_funcall* searched methods every time. In most cases the per-Class Method Cache (pCMC) helps, but pCMC requires VM-wide locking, which hurts performance on multi-Ractor execution, especially when all Ractors call methods with rb_funcall. This patch introduced Global Call-Cache Cache Table (gccct) for rb_funcall. Call-Cache was introduced from Ruby 3.0 to manage method cache entry atomically and gccct enables method-caching without VM-wide locking. This table solves the performance issue on multi-ractor execution. [Bug #17497] Ruby-level method invocation does not use gccct because it has inline-method-cache and the table size is limited. Basically rb_funcall* is not used frequently, so 1023 entries can be enough. We will revisit the table size if it is not enough.	2021-01-29 16:22:12 +09:00
Nobuyoshi Nakada	2bd8098c6d	Suppress the warning for the invalid method_explorer case	2021-01-17 18:07:33 +09:00
Nobuyoshi Nakada	85b5d4c8bf	Revert "[Bug #11213] let defined?(super) call respond_to_missing?" This reverts commit `fac2498e02` for now, due to [Bug #17509], the breakage in the case `super` is called in `respond_to?`.	2021-01-13 18:11:46 +09:00
Koichi Sasada	55e52c19e7	simplify assertion searched_cme is used only this line so the variable is not needed.	2021-01-07 23:44:25 +09:00
Takashi Kokubun	7a3322a0fd	Fix broken JIT of getinlinecache `e7fc353f04` reverted vm_ic_hit_p's signature change made in `53babf35ef`, which broke JIT compilation of getinlinecache. To make sure it doesn't happen again, I separated vm_inlined_ic_hit_p to make the intention clear.	2021-01-04 13:09:08 -08:00
Koichi Sasada	e7fc353f04	enable constant cache on ractors constant cache `IC` is accessed by non-atomic manner and there are thread-safety issues, so Ruby 3.0 disables to use const cache on non-main ractors. This patch enables it by introducing `imemo_constcache` and allocates it by every re-fill of const cache like `imemo_callcache`. [Bug #17510] Now `IC` only has one entry `IC::entry` and it points to `iseq_inline_constant_cache_entry`, managed by T_IMEMO object. `IC` is atomic data structure so `rb_mjit_before_vm_ic_update()` and `rb_mjit_after_vm_ic_update()` is not needed.	2021-01-05 02:27:58 +09:00
Nobuyoshi Nakada	292230cbf9	Fixed leaked global symbols	2020-12-26 09:39:53 +09:00
Koichi Sasada	02d9524cda	separate rb_ractor_pub from rb_ractor_t separate some fields from rb_ractor_t to rb_ractor_pub and put it at the beginning of rb_ractor_t and declare it in vm_core.h so vm_core.h can access rb_ractor_pub fields. Now rb_ec_ractor_hooks() is a complete inline function and there is no MJIT-related issue.	2020-12-22 00:03:00 +09:00
Koichi Sasada	a2950369bd	TracePoint.new(&block) should be ractor-local TracePoint should be ractor-local because the Proc can violate the Ractor-safe.	2020-12-22 00:03:00 +09:00
Koichi Sasada	dca6752fec	Introduce Ractor::IsolationError Ractor has several restrictions to keep each ractor being isolated and some operation such as `CONST="foo"` in non-main ractor raises an exception. This kind of operation raises an error but there is confusion (some code raises RuntimeError and some code raises NameError). To make clear we introduce Ractor::IsolationError which is raised when the isolation between ractors is violated.	2020-12-21 22:29:05 +09:00
Nobuyoshi Nakada	8148f88b92	ALWAYS_INLINE implies inline	2020-12-19 18:04:47 +09:00
Takashi Kokubun	52b1716c78	Fix vm_search_invokeblock call_ needs to be vm_invokeblock_i, and flags is also not empty.	2020-12-19 00:27:44 -08:00
Takashi Kokubun	8ec8f37566	discourage inlining for vm_sendish() reversing `9213771817` only for JIT, because it made JIT slower. $ benchmark-driver -v --rbenv 'before;after;before --jit;after --jit' --repeat-count=36 --alternate --output=all benchmark.yml before: ruby 3.0.0dev (2020-12-19T07:38:17Z master `a139318538`) [x86_64-linux] after: ruby 3.0.0dev (2020-12-19T07:52:01Z master ce9faaeff5) [x86_64-linux] last_commit=discourage inlining for vm_sendish() before --jit: ruby 3.0.0dev (2020-12-19T07:38:17Z master `a139318538`) +JIT [x86_64-linux] after --jit: ruby 3.0.0dev (2020-12-19T07:52:01Z master ce9faaeff5) +JIT [x86_64-linux] last_commit=discourage inlining for vm_sendish() Calculating ------------------------------------- before after before --jit after --jit Optcarrot Lan_Master.nes 42.83365858987760 42.68912456143848 76.50136803552716 65.74704713379785 fps 42.87724738609940 42.89045158177300 79.72624911659534 81.26221749201044 43.34963955708526 42.95431841174180 80.18085951039328 82.86458983313545 43.56786038452823 43.57563008888242 80.45933051716041 83.09150550702445 43.83219269706004 43.60748924115331 80.67164125046142 83.39458202043882 43.99035062888973 43.62050459554573 80.93204435712701 83.56303651352751 44.25176047881120 44.04822899344536 81.15051082548314 83.58166141398522 44.41978060794512 44.06521657912991 81.35651907376140 83.80036752456826 44.46864790591856 44.09325484326153 81.53456531520031 83.87502933718609 45.54712020644544 44.70693952869038 81.97738413452767 83.95818356402224 45.84292299382878 44.77704345873913 82.35118338199700 83.95966387450966 45.89411137280815 45.41425773286726 83.01052538434648 84.12812994632024 45.93130099197283 46.16884439916935 83.50833510120576 84.26276094927231 46.13648038236674 46.66645417860622 84.88757531920830 85.41732546800056 46.74873798919658 46.71790568883760 84.90953097036886 85.56340808970482 47.11273577214855 46.74581938882115 84.93196765297411 85.57603396455576 47.17870777128640 46.82414166607185 
84.97178445888456 86.63510466280221 47.19338055580042 46.83645774240446 85.43536447262163 86.74129103462393 47.25761413477774 46.86834469505590 85.59822430471097 86.85376073363715 47.53327847102834 46.90228589364909 85.76446609620548 87.26108400015282 47.64308771617673 47.02814519551055 85.79904863600991 87.72293541243303 47.80286861846863 47.44672838168050 85.88640862064263 87.86803587836525 47.86455937950740 47.65301489003541 85.88750199172448 88.16881051171814 47.90065455321760 47.73425082354376 85.94295700508701 88.71267004066843 47.90727961241468 47.86377917424705 85.94674546805844 88.77726627283683 47.93243954623904 47.88720812998766 86.51872778134982 88.78993962536994 47.95062952008558 47.88774830879015 86.63116771614249 88.88085054889298 47.95097849989396 47.89825669442417 86.77387990931732 89.72021826461126 48.04730571166697 47.89981045730949 86.95084011077047 89.75804193954582 48.08042611622322 48.03246661737583 87.87239147980547 90.05949240088842 48.08999523258601 48.15253490344558 88.31289344498016 90.36439442190294 48.25670456430854 48.26904755214532 88.33999433286937 90.54253266759406 48.25947200597002 48.41894159956091 88.35502296938638 90.72591894564106 48.30826210577268 48.43125201523194 88.58311746582939 90.77173035874087 48.31514124187375 48.53932287546499 88.89099681179805 91.07747476133886 48.44349281318267 48.58969411593706 89.34043973691581 91.08545627378257	2020-12-19 00:03:46 -08:00
Koichi Sasada	9213771817	encourage inlining for vm_sendish() Some tunings. * add `inline` for vm_sendish() * pass enum instead of func ptr to vm_sendish() * reorder initial order of `calling` struct. * add ALWAYS_INLINE for vm_search_method_fastpath() * call vm_search_method_fastpath() from vm_sendish()	2020-12-17 16:53:41 +09:00
Takashi Kokubun	53babf35ef	Inline getconstant on JIT (#3906) * Inline getconstant on JIT * Support USE_MJIT=0	2020-12-16 06:24:07 -08:00
Koichi Sasada	98dca85573	tuning vm_setivar_slowpath() more. specify inline/noinline code for is_attr condition.	2020-12-16 14:20:20 +09:00
Koichi Sasada	1e11c12a06	remove unused function	2020-12-16 13:30:34 +09:00
Koichi Sasada	5499651ee7	tuning ivar set * make rb_init_iv_list() simple * introduce vm_setivar_slowpath() for cache miss cases ../clean/miniruby is `647ee6f091`. Calculating ------------------------------------- ./miniruby ../clean/miniruby ../ruby_2_7/miniruby vm_ivar_init 7.388M 6.814M 5.771M i/s - 30.000M times in 4.060420s 4.402534s 5.198781s vm_ivar_init_subclass 2.158M 2.147M 1.974M i/s - 3.000M times in 1.390328s 1.397587s 1.519951s vm_ivar_set 128.607M 97.931M 140.668M i/s - 30.000M times in 0.233269s 0.306338s 0.213268s vm_ivar 144.315M 151.722M 117.734M i/s - 30.000M times in 0.207879s 0.197730s 0.254811s Comparison: vm_ivar_init ./miniruby: 7388398.8 i/s ../clean/miniruby: 6814257.1 i/s - 1.08x slower ../ruby_2_7/miniruby: 5770583.9 i/s - 1.28x slower vm_ivar_init_subclass ./miniruby: 2157763.6 i/s ../clean/miniruby: 2146557.0 i/s - 1.01x slower ../ruby_2_7/miniruby: 1973747.9 i/s - 1.09x slower vm_ivar_set ../ruby_2_7/miniruby: 140668063.8 i/s ./miniruby: 128606912.1 i/s - 1.09x slower ../clean/miniruby: 97931027.8 i/s - 1.44x slower vm_ivar ../clean/miniruby: 151722121.9 i/s ./miniruby: 144314526.5 i/s - 1.05x slower ../ruby_2_7/miniruby: 117734305.5 i/s - 1.29x slower	2020-12-16 13:06:13 +09:00
Koichi Sasada	171f0431e7	fix typo	2020-12-16 10:36:23 +09:00
Koichi Sasada	72a73691bd	add several debug counters add cc_found_in_ccs (renamed from cc_found_ccs), cc_not_found_in_ccs, call0_public, call0_other debug counters to measure more details. also it contains several modification.	2020-12-15 13:29:30 +09:00
Koichi Sasada	aa6287cd26	fix inline method cache sync bug `cd` is passed to method call functions to method invocation functions, but `cd` can be manipulated by other ractors simultaneously so it contains thread-safety issue. To solve this issue, this patch stores `ci` and found `cc` to `calling` and stops to pass `cd`.	2020-12-15 13:29:30 +09:00
Koichi Sasada	7060d6b721	fix condition and add another debug counter mc_inline_miss_same_def is added to check same method or not. Also the mc_inline_miss_same_cc calculation was fixed.	2020-12-14 18:38:40 +09:00
Koichi Sasada	da3be76cb0	add debug counters to survey the IMC miss	2020-12-14 17:56:34 +09:00
Koichi Sasada	fa63052be1	create ccs with 0 capa ccs is created with default 4 capa, but it is too much, so start with 0 and extend to 1, 2, 4, 8, ...	2020-12-14 11:57:46 +09:00
Koichi Sasada	d741c77b5f	fix ivar with shareable objects issue Instance variables of shareable objects are accessible only from the main ractor, so we need to check it correctly.	2020-12-12 06:19:18 +09:00
Jeremy Evans	01b7d5acc7	Remove the uninitialized instance variable verbose mode warning This speeds up all instance variable access, even when not in verbose mode. Uninitialized instance variable warnings were rarely helpful, and resulted in slower code if you wanted to avoid warnings when run in verbose mode. Implements [Feature #17055]	2020-12-10 10:16:05 -08:00
Koichi Sasada	182fb73c40	rb_ext_ractor_safe() to declare ractor-safe ext C extensions can violate ractor-safety, so only ractor-safe C extensions (C methods) can run on non-main ractors. rb_ext_ractor_safe(true) declares that the subsequently defined methods are ractor-safe. Otherwise, defined methods check that they are invoked in the main ractor and raise an error if invoked in non-main ractors. [Feature #17307]	2020-12-01 15:44:18 +09:00
Takashi Kokubun	8ce1711c25	Revert "Set VM_FRAME_FLAG_FINISH at once on MJIT" This reverts commit `4d2c8edca6`. Unfortunately this seems to cause several issues: https://github.com/ruby/ruby/runs/1462188376?check_suite_focus=true http://ci.rvm.jp/results/trunk-mjit-wait@phosphorus-docker/3272802	2020-11-26 22:41:15 -08:00
Takashi Kokubun	4d2c8edca6	Set VM_FRAME_FLAG_FINISH at once on MJIT Performance is probably improved? $ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml before --jit: ruby 3.0.0dev (2020-11-27T04:37:47Z master `69e77e81dc`) +JIT [x86_64-linux] after --jit: ruby 3.0.0dev (2020-11-27T05:28:19Z master df6b05c6dd) +JIT [x86_64-linux] last_commit=Set VM_FRAME_FLAG_FINISH at once Calculating ------------------------------------- before --jit after --jit Optcarrot Lan_Master.nes 80.89292998533379 82.19497327502751 fps 80.93130641142331 85.13943315260148 81.06214830270119 87.43757879797808 82.29172808453910 87.89942441487113 84.61206450455929 87.91309779491075 85.44545883567997 87.98026086648694 86.02923132404449 88.03081060383973 86.07411817365879 88.14650206137341 86.34348799602836 88.32791633649961 87.90257338977324 88.57599644892220 88.58006509876580 88.67426384743277 89.26611118140011 88.81669430874207 This should have no bad impact on VM because this function is ALWAYS_INLINE.	2020-11-26 21:32:14 -08:00
Alan Wu	e0944bde91	Prefer rb_module_new() over rb_define_module_id() rb_define_module_id() doesn't do anything with its parameter so it's a bit confusing.	2020-11-25 17:05:06 -05:00
Nobuyoshi Nakada	fac2498e02	[Bug #11213] let defined?(super) call respond_to_missing?	2020-11-20 16:04:45 +09:00
Alan Wu	68ffc8db08	Set allocator on class creation Allocating an instance of a class uses the allocator for the class. When the class has no allocator set, Ruby looks for it in the super class (see rb_get_alloc_func()). It's uncommon for classes created from Ruby code to ever have an allocator set, so it's common during the allocation process to search all the way to BasicObject from the class with which the allocation is being performed. This makes creating instances of classes that have long ancestry chains more expensive than creating instances of classes that have shorter ancestry chains. Setting the allocator at class creation time removes the need to perform a search for the allocator during allocation. This is a breaking change for C-extensions that assume that classes created from Ruby code have no allocator set. Libraries that setup a class hierarchy in Ruby code and then set the allocator on some parent class, for example, can experience breakage. This seems like an unusual use case and hopefully it is rare or non-existent in practice. Rails has many classes that have upwards of 60 elements in the ancestry chain and benchmark shows a significant improvement for allocating with a class that includes 64 modules. ``` pre: ruby 3.0.0dev (2020-11-12T14:39:27Z master `6325866421`) post: ruby 3.0.0dev (2020-11-12T20:15:30Z cut-allocator-lookup) Comparison: allocate_8_deep post: 10336985.6 i/s pre: 8691873.1 i/s - 1.19x slower allocate_32_deep post: 10423181.2 i/s pre: 6264879.1 i/s - 1.66x slower allocate_64_deep post: 10541851.2 i/s pre: 4936321.5 i/s - 2.14x slower allocate_128_deep post: 10451505.0 i/s pre: 3031313.5 i/s - 3.45x slower ```	2020-11-16 17:41:40 -05:00
Jeremy Evans	ce9beb9d20	Improve error message when subclassing non-Class Fixes [Bug #14726]	2020-11-13 07:06:13 -08:00
Aaron Patterson	4219cb7adb	Add debug counter for ivar inline cache misses that could hit This commit adds a debug counter for the case where the inline cache missed but the ivar index table has an entry for that ivar. This is a case where a polymorphic cache could help	2020-11-09 14:05:41 -08:00
Aaron Patterson	f259906eab	Avoid slow path ivar setting If the ivar index table exists, we can avoid the slowest path for setting ivars.	2020-11-09 14:05:41 -08:00
Aaron Patterson	2324584d46	Update vm_insnhelper.c Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2020-11-09 09:44:16 -08:00
Aaron Patterson	5582c5a232	Remove iv table size check iv tables cannot shrink. If the inline cache was ever set, then there must be an entry for the instance variable in the iv table. Just set the iv list on the object to be equal to the iv index table size, then set the iv.	2020-11-09 09:44:16 -08:00
Aaron Patterson	eb229994e5	eagerly initialize ivar table when index is small enough When the inline cache is written, the iv table will contain an entry for the instance variable. If we get an inline cache hit, then we know the iv table must contain a value for the index written to the inline cache. If the index in the inline cache is larger than the list on the object, but smaller than the iv index table on the class, then we can just eagerly allocate the iv list to be the same size as the iv index table. This avoids duplicate work of checking frozen as well as looking up the index for the particular instance variable name.	2020-11-09 09:44:16 -08:00
Koichi Sasada	5d97bdc2dc	Ractor.make_shareable(a_proc) Ractor.make_shareable() supports a Proc object if (1) the Proc only reads outer local variables (no assignments) and (2) the outer local variables it reads are shareable. The local variables it reads are stored in a snapshot, so after the Proc is made shareable, later assignments have no effect on it, like this: ```ruby a = 1 pr = Ractor.make_shareable(Proc.new{p a}) pr.call #=> 1 a = 2 pr.call #=> 1 # `a = 2` doesn't affect ``` [Feature #17284]	2020-10-30 03:12:09 +09:00
Jeremy Evans	ff2276ef8c	Fix bootstrap-test error in previous commit	2020-10-25 15:56:10 -07:00
Koichi Sasada	f6661f5085	sync RClass::ext::iv_index_tbl iv_index_tbl manages instance variable indexes (ID -> index). This data structure should be synchronized with other ractors, so this patch introduces some VM locks. This patch also introduces an atomic ivar cache used by set/getinlinecache instructions. To make updating the ivar cache (IVC) atomic, we changed the iv_index_tbl data structure to manage (ID -> entry), where an entry holds a serial and an index. The IVC points to this entry so that a cache update becomes atomic.	2020-10-17 08:18:04 +09:00
Alan Wu	0d17cdd0ac	Abort on system stack overflow during GC Buggy native extensions could have mark functions that cause stack overflow. When a stack overflow happens during GC, Ruby used to recover by raising an exception, which runs the interpreter. It's not safe to run the interpreter during GC since the GC is in an inconsistent state. This could cause object allocation during GC, for example. Instead of running the interpreter and potentially causing a crash down the line, fail fast and abort.	2020-10-16 10:24:12 -04:00
Koichi Sasada	fad97f1f96	sync generic_ivtbl generic_ivtbl is a process global table to maintain instance variables for non T_OBJECT/T_CLASS/... objects. So we need to protect it for multi-Ractor execution. Hint: we can make it Ractor local for unshareable objects, but for now that is premature optimization.	2020-10-14 16:36:55 +09:00
eileencodes	8dd9a23693	Make minor improvements to super The changes here include: * Using `FL_TEST_RAW` instead of `FL_TEST` in the first check in `vm_search_super_method`. While the profile showed us spending a fair amount of time here, the subsequent benchmarks didn't show much improvement when adding this. Regardless, we know this does less work than `FL_TEST` and we know that `FL_TEST_RAW` is safe due to the previous check so it's a small but accurate optimization. * Set `mid` only once. Both `vm_ci_new_runtime` and `vm_ci_mid` were getting the `original_id` for the method entry. We can do this once and pass the variable to the 2 callers that need it. This also doesn't have a huge performance improvement but cleans up the code a bit. Benchmark: ``` \| \|compare-ruby\|built-ruby\| \|:----------------\|-----------:\|---------:\| \|vm_iclass_super \| 3.540M\| 3.940M\| \| \| -\| 1.11x\| ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2020-10-01 10:11:02 -07:00
Koichi Sasada	caaa36b4e6	prohibit method call by defined_method in other ractors We cannot call a non-isolated Proc in multiple ractors.	2020-09-25 20:37:38 +09:00
eileencodes	637d1cc0c0	Improve the performance of super This PR improves the performance of `super` calls. While working on some Rails optimizations jhawthorn discovered that `super` calls were slower than expected. The changes here do the following: 1) Adds a check for whether the call frame is not equal to the method entry iseq. This avoids the `rb_obj_is_kind_of` check on the next line which is quite slow. If the current call frame is equal to the method entry we know we can't have an instance eval, etc. 2) Changes `FL_TEST` to `FL_TEST_RAW`. This is safe because we've already done the check for `T_ICLASS` above. 3) Adds a benchmark for `T_ICLASS` super calls. 4) Note: makes a change for `method_entry_cref` to use `const`. On master the benchmarks showed that `super` is 1.76x slower. Our changes improved the performance so that it is now only 1.36x slower. Benchmark IPS: ``` Warming up -------------------------------------- super 244.918k i/100ms method call 383.007k i/100ms Calculating ------------------------------------- super 2.280M (± 6.7%) i/s - 11.511M in 5.071758s method call 3.834M (± 4.9%) i/s - 19.150M in 5.008444s Comparison: method call: 3833648.3 i/s super: 2279837.9 i/s - 1.68x (± 0.00) slower ``` With changes: ``` Warming up -------------------------------------- super 308.777k i/100ms method call 375.051k i/100ms Calculating ------------------------------------- super 2.951M (± 5.4%) i/s - 14.821M in 5.039592s method call 3.551M (± 4.9%) i/s - 18.002M in 5.081695s Comparison: method call: 3551372.7 i/s super: 2950557.9 i/s - 1.20x (± 0.00) slower ``` Ruby VM benchmarks also showed an improvement: Existing `vm_super` benchmark: 
``` $ make benchmark ITEM=vm_super \| \|compare-ruby\|built-ruby\| \|:---------\|-----------:\|---------:\| \|vm_super \| 21.555M\| 37.819M\| \| \| -\| 1.75x\| ``` New `vm_iclass_super` benchmark: ``` $ make benchmark ITEM=vm_iclass_super \| \|compare-ruby\|built-ruby\| \|:----------------\|-----------:\|---------:\| \|vm_iclass_super \| 1.669M\| 3.683M\| \| \| -\| 2.21x\| ``` This is the benchmark script used for the benchmark-ips benchmarks: ```ruby require "benchmark/ips" class Foo def zuper; end def top; end last_method = "top" ("A".."M").each do \|module_name\| eval <<-EOM module #{module_name} def zuper; super; end def #{module_name.downcase} #{last_method} end end prepend #{module_name} EOM last_method = module_name.downcase end end foo = Foo.new Benchmark.ips do \|x\| x.report "super" do foo.zuper end x.report "method call" do foo.m end x.compare! end ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-authored-by: John Hawthorn <john@hawthorn.email>	2020-09-23 11:52:36 -07:00
Jeremy Evans	179384a668	Revert "Prevent SystemStackError when calling super in module with activated refinement" This reverts commit `eeef16e190`. This also reverts the spec change. Preventing the SystemStackError would be nice, but there is valid code that the fix breaks, and it is probably more common than cases that cause the SystemStackError. Fixes [Bug #17182]	2020-09-22 12:04:48 -07:00
Benoit Daloze	9b535f3ff7	Interpolated strings are no longer frozen with frozen-string-literal: true * Remove freezestring instruction since this was the only usage for it. * [Feature #17104]	2020-09-15 21:32:35 +02:00
Koichi Sasada	79df14c04b	Introduce Ractor mechanism for parallel execution This commit introduces the Ractor mechanism to run Ruby programs in parallel. See doc/ractor.md for more details about Ractor. See ticket [Feature #17100] for the implementation details and discussions. [Feature #17100] This commit does not complete the implementation. You may find many bugs when using Ractor. The specification may also change, so this feature is experimental. You will see a warning when you make the first Ractor with `Ractor.new`. I hope this feature can help programmers avoid thread-safety issues.	2020-09-03 21:11:06 +09:00
Alan Wu	4c3f0597de	Remove the pc argument of vm_trace() This makes the binary 272 bytes smaller on -O3 GCC 10.2.0.	2020-09-01 22:02:29 -04:00
Jeremy Evans	c60aaed185	Fix Method#super_method for aliased methods Previously, Method#super_method looked at the called_id to determine the method id to use, but that isn't correct for aliased methods, because the super target depends on the original method id, not the called_id. Additionally, aliases can reference methods defined in other classes and modules, and super lookup needs to start in the super of the defined class in such cases. This adds tests for Method#super_method for both types of aliases, one that uses VM_METHOD_TYPE_ALIAS and another that does not. Both check that the results for calling super methods return the expected values. To find the defined class for alias methods, add an rb_ prefix to find_defined_class_by_owner in vm_insnhelper.c and make it non-static, so that it can be called from method_super_method in proc.c. This bug was originally discovered while researching [Bug #11189]. Fixes [Bug #17130]	2020-08-27 08:37:03 -07:00
Jeremy Evans	eeef16e190	Prevent SystemStackError when calling super in module with activated refinement Without this, if a refinement defines a method that calls super and includes a module with a module that calls super and has an activated refinement at the point super is called, the module method super call will end up calling back into the refinement method, creating a loop. Fixes [Bug #17007]	2020-07-27 08:18:11 -07:00
Nobuyoshi Nakada	a82252df42	Fixed another typo	2020-07-10 12:48:47 +09:00
Nobuyoshi Nakada	234f8eee33	Fixed typos	2020-07-10 12:32:48 +09:00
卜部昌平	e4ee992099	vm_push_frame_debug_counter_inc: use branches Ko1 doesn't like previous code.	2020-07-10 12:23:41 +09:00
卜部昌平	0e276dc458	vm_push_frame: move assignments around Struct assignment using a compound literal is more readable than before, to me at least. It seems compilers reorder assignments anyways. Neither speedup nor slowdown is observed on my machine.	2020-07-10 12:23:41 +09:00
卜部昌平	4b8170ce80	vm_push_frame: move assertions out of the function These assertions are purely static. Need not be checked on-the-fly.	2020-07-10 12:23:41 +09:00
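
The allocator lookup described in commit 68ffc8db08 can be illustrated with a rough timing sketch. This is not the benchmark script behind the numbers in that commit message: the helper `class_with_depth`, the `ITERATIONS` constant, and the use of the stdlib `Benchmark.realtime` are assumptions made here for illustration, and the timings will vary by machine and Ruby version.

```ruby
require "benchmark"

# Hypothetical helper (not from the commit): build an anonymous class whose
# ancestry chain contains `depth` extra modules, loosely mirroring the
# "allocate_N_deep" cases named in the commit message.
def class_with_depth(depth)
  Class.new do
    depth.times { include Module.new }
  end
end

# Assumed iteration count; chosen only to keep the sketch quick to run.
ITERATIONS = 200_000

[8, 32, 64, 128].each do |depth|
  klass = class_with_depth(depth)
  # Time repeated allocations; before the change the allocator was looked up
  # by walking the superclass chain on each allocation.
  seconds = Benchmark.realtime { ITERATIONS.times { klass.allocate } }
  puts format("allocate_%d_deep: %.3fs for %d allocations (%d ancestors)",
              depth, seconds, ITERATIONS, klass.ancestors.size)
end
```

On a Ruby build that includes this commit, the reported times should stay roughly flat as the depth grows, which is the effect the commit message reports.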


1051 commits