github/ruby - ruby

Commit Graph

Author	SHA1	Message	Date
Takashi Kokubun	a0216b1acf	Do not run File.write while Ractors are running also make sure all local variables have the __bmdv_ prefix.	2021-02-11 00:25:46 -08:00
Takashi Kokubun	27382eb9fc	Add a benchmark-driver runner for Ractor (#4172 ) * Add a benchmark-driver runner for Ractor * Process.clock_gettime(Process:CLOCK_MONOTONIC) could be slow in Ruby 3.0 Ractor * Fetching Time could also be slow * Fix a comment * Assert overriding a private method	2021-02-10 21:24:25 -08:00
Nobuyoshi Nakada	ad2c7f8a1e	Simple benchmark of Float#to_s	2021-02-10 19:42:00 +09:00
S.H	fad7908a5d	Improve performance Float#positive? and Float#negative? [Feature #17614 ] (#4160 )	2021-02-08 20:29:42 -08:00
Takashi Kokubun	e1fee7f949	Rename RubyVM::MJIT to RubyVM::JIT because the name "MJIT" is an internal code name, it's inconsistent with --jit while they are related to each other, and I want to discourage future JIT implementation-specific (e.g. MJIT-specific) APIs by this rename. [Feature #17490]	2021-01-13 22:46:51 -08:00
S.H	daec5f9edc	Improve performance some Float methods [Feature #17498 ] (#4018 )	2021-01-01 18:39:07 -08:00
Takashi Kokubun	dbb4f19969	Allow inlining Integer#-@ and #~ ``` $ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_integer.yml --filter '(comp\|uminus)' before --jit: ruby 3.0.0dev (2020-12-23T05:41:44Z master `0dd4896175`) +JIT [x86_64-linux] after --jit: ruby 3.0.0dev (2020-12-23T06:25:41Z master 8887d78992) +JIT [x86_64-linux] last_commit=Allow inlining Integer#-@ and #~ Calculating ------------------------------------- before --jit after --jit mjit_comp(1) 44.006M 70.417M i/s - 40.000M times in 0.908967s 0.568042s mjit_uminus(1) 44.333M 68.422M i/s - 40.000M times in 0.902255s 0.584603s Comparison: mjit_comp(1) after --jit: 70417331.4 i/s before --jit: 44005980.4 i/s - 1.60x slower mjit_uminus(1) after --jit: 68422468.8 i/s before --jit: 44333371.0 i/s - 1.54x slower ```	2020-12-22 22:32:19 -08:00
Koichi Sasada	dff67c809f	fix duplicated name	2020-12-16 10:29:48 +09:00
Benoit Daloze	b4ec4a41c2	Guard all accesses to RubyVM::MJIT with defined?(RubyVM::MJIT) && * Otherwise those tests, etc cannot run on alternative Ruby implementations.	2020-12-04 16:45:54 +01:00
Alan Wu	68ffc8db08	Set allocator on class creation Allocating an instance of a class uses the allocator for the class. When the class has no allocator set, Ruby looks for it in the super class (see rb_get_alloc_func()). It's uncommon for classes created from Ruby code to ever have an allocator set, so it's common during the allocation process to search all the way to BasicObject from the class with which the allocation is being performed. This makes creating instances of classes that have long ancestry chains more expensive than creating instances of classes that have shorter ancestry chains. Setting the allocator at class creation time removes the need to perform a search for the allocator during allocation. This is a breaking change for C-extensions that assume that classes created from Ruby code have no allocator set. Libraries that set up a class hierarchy in Ruby code and then set the allocator on some parent class, for example, can experience breakage. This seems like an unusual use case and hopefully it is rare or non-existent in practice. Rails has many classes that have upwards of 60 elements in the ancestry chain and benchmark shows a significant improvement for allocating with a class that includes 64 modules. ``` pre: ruby 3.0.0dev (2020-11-12T14:39:27Z master `6325866421`) post: ruby 3.0.0dev (2020-11-12T20:15:30Z cut-allocator-lookup) Comparison: allocate_8_deep post: 10336985.6 i/s pre: 8691873.1 i/s - 1.19x slower allocate_32_deep post: 10423181.2 i/s pre: 6264879.1 i/s - 1.66x slower allocate_64_deep post: 10541851.2 i/s pre: 4936321.5 i/s - 2.14x slower allocate_128_deep post: 10451505.0 i/s pre: 3031313.5 i/s - 3.45x slower ```	2020-11-16 17:41:40 -05:00
Aaron Patterson	d7581370fd	Add a benchmark for polymorphic ivar setting This benchmark demonstrates the performance of setting an instance variable when the type of object is constantly changing. This benchmark should give us an idea of the performance of ivar setting in a polymorphic environment	2020-11-09 14:05:41 -08:00
Aaron Patterson	eb229994e5	eagerly initialize ivar table when index is small enough When the inline cache is written, the iv table will contain an entry for the instance variable. If we get an inline cache hit, then we know the iv table must contain a value for the index written to the inline cache. If the index in the inline cache is larger than the list on the object, but smaller than the iv index table on the class, then we can just eagerly allocate the iv list to be the same size as the iv index table. This avoids duplicate work of checking frozen as well as looking up the index for the particular instance variable name.	2020-11-09 09:44:16 -08:00
Nobuyoshi Nakada	8f9c113f35	Added benchmark of vm_send by variable [ci skip]	2020-10-28 09:47:46 +09:00
eileencodes	637d1cc0c0	Improve the performance of super This PR improves the performance of `super` calls. While working on some Rails optimizations jhawthorn discovered that `super` calls were slower than expected. The changes here do the following: 1) Adds a check for whether the call frame is not equal to the method entry iseq. This avoids the `rb_obj_is_kind_of` check on the next line which is quite slow. If the current call frame is equal to the method entry we know we can't have an instance eval, etc. 2) Changes `FL_TEST` to `FL_TEST_RAW`. This is safe because we've already done the check for `T_ICLASS` above. 3) Adds a benchmark for `T_ICLASS` super calls. 4) Note: makes a change for `method_entry_cref` to use `const`. On master the benchmarks showed that `super` is 1.76x slower. Our changes improved the performance so that it is now only 1.36x slower. Benchmark IPS: ``` Warming up -------------------------------------- super 244.918k i/100ms method call 383.007k i/100ms Calculating ------------------------------------- super 2.280M (± 6.7%) i/s - 11.511M in 5.071758s method call 3.834M (± 4.9%) i/s - 19.150M in 5.008444s Comparison: method call: 3833648.3 i/s super: 2279837.9 i/s - 1.68x (± 0.00) slower ``` With changes: ``` Warming up -------------------------------------- super 308.777k i/100ms method call 375.051k i/100ms Calculating ------------------------------------- super 2.951M (± 5.4%) i/s - 14.821M in 5.039592s method call 3.551M (± 4.9%) i/s - 18.002M in 5.081695s Comparison: method call: 3551372.7 i/s super: 2950557.9 i/s - 1.20x (± 0.00) slower ``` Ruby VM benchmarks also showed an improvement: Existing `vm_super` benchmark: 
``` $ make benchmark ITEM=vm_super \| \|compare-ruby\|built-ruby\| \|:---------\|-----------:\|---------:\| \|vm_super \| 21.555M\| 37.819M\| \| \| -\| 1.75x\| ``` New `vm_iclass_super` benchmark: ``` $ make benchmark ITEM=vm_iclass_super \| \|compare-ruby\|built-ruby\| \|:----------------\|-----------:\|---------:\| \|vm_iclass_super \| 1.669M\| 3.683M\| \| \| -\| 2.21x\| ``` This is the benchmark script used for the benchmark-ips benchmarks: ```ruby require "benchmark/ips" class Foo def zuper; end def top; end last_method = "top" ("A".."M").each do \|module_name\| eval <<-EOM module #{module_name} def zuper; super; end def #{module_name.downcase} #{last_method} end end prepend #{module_name} EOM last_method = module_name.downcase end end foo = Foo.new Benchmark.ips do \|x\| x.report "super" do foo.zuper end x.report "method call" do foo.m end x.compare! end ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-authored-by: John Hawthorn <john@hawthorn.email>	2020-09-23 11:52:36 -07:00
Jean Boussier	5001cc4716	Optimize ObjectSpace.dump_all The two main optimizations are: - buffer writes for improved performance - avoid formatting functions when possible ``` \| \|compare-ruby\|built-ruby\| \|:------------------\|-----------:\|---------:\| \|dump_all_string \| 1.038\| 195.925\| \| \| -\| 188.77x\| \|dump_all_file \| 33.453\| 139.645\| \| \| -\| 4.17x\| \|dump_all_dev_null \| 44.030\| 278.552\| \| \| -\| 6.33x\| ```	2020-09-09 11:11:36 -07:00
Nobuyoshi Nakada	54acb3dd52	Improved Enumerable::Lazy#zip \| \|compare-ruby\|built-ruby\| \|:-------------------\|-----------:\|---------:\| \|first_ary \| 290.514k\| 296.331k\| \| \| -\| 1.02x\| \|first_nonary \| 166.954k\| 169.178k\| \| \| -\| 1.01x\| \|first_noarg \| 299.547k\| 305.358k\| \| \| -\| 1.02x\| \|take3_ary \| 129.388k\| 188.360k\| \| \| -\| 1.46x\| \|take3_nonary \| 90.684k\| 112.688k\| \| \| -\| 1.24x\| \|take3_noarg \| 131.940k\| 189.471k\| \| \| -\| 1.44x\| \|chain-first_ary \| 195.913k\| 286.194k\| \| \| -\| 1.46x\| \|chain-first_nonary \| 127.483k\| 168.716k\| \| \| -\| 1.32x\| \|chain-first_noarg \| 201.252k\| 298.562k\| \| \| -\| 1.48x\| \|chain-take3_ary \| 101.189k\| 183.188k\| \| \| -\| 1.81x\| \|chain-take3_nonary \| 75.381k\| 112.301k\| \| \| -\| 1.49x\| \|chain-take3_noarg \| 101.483k\| 192.148k\| \| \| -\| 1.89x\| \|block \| 296.696k\| 292.877k\| \| \| 1.01x\| -\|	2020-07-23 16:57:26 +09:00
Nobuyoshi Nakada	6b3cff12f6	Improved Enumerable::Lazy#flat_map \| \|compare-ruby\|built-ruby\| \|:-------\|-----------:\|---------:\| \|num3 \| 96.333k\| 160.732k\| \| \| -\| 1.67x\| \|num10 \| 96.615k\| 159.150k\| \| \| -\| 1.65x\| \|ary2 \| 103.836k\| 172.787k\| \| \| -\| 1.66x\| \|ary10 \| 109.249k\| 177.252k\| \| \| -\| 1.62x\| \|ary20 \| 106.628k\| 177.371k\| \| \| -\| 1.66x\| \|ary50 \| 107.135k\| 162.282k\| \| \| -\| 1.51x\| \|ary100 \| 106.513k\| 177.626k\| \| \| -\| 1.67x\|	2020-07-23 16:57:26 +09:00
Kenta Murata	b4e784434c	Optimize Array#min (#3324 ) The benchmark result is below: \| \|compare-ruby\|built-ruby\| \|:---------------\|-----------:\|---------:\| \|ary2.min \| 39.105M\| 39.442M\| \| \| -\| 1.01x\| \|ary10.min \| 23.995M\| 30.762M\| \| \| -\| 1.28x\| \|ary100.min \| 6.249M\| 10.783M\| \| \| -\| 1.73x\| \|ary500.min \| 1.408M\| 2.714M\| \| \| -\| 1.93x\| \|ary1000.min \| 828.397k\| 1.465M\| \| \| -\| 1.77x\| \|ary2000.min \| 332.256k\| 570.504k\| \| \| -\| 1.72x\| \|ary3000.min \| 338.079k\| 573.868k\| \| \| -\| 1.70x\| \|ary5000.min \| 168.217k\| 286.114k\| \| \| -\| 1.70x\| \|ary10000.min \| 85.512k\| 143.551k\| \| \| -\| 1.68x\| \|ary20000.min \| 43.264k\| 71.935k\| \| \| -\| 1.66x\| \|ary50000.min \| 17.317k\| 29.107k\| \| \| -\| 1.68x\| \|ary100000.min \| 9.072k\| 14.540k\| \| \| -\| 1.60x\| \|ary1000000.min \| 872.930\| 1.436k\| \| \| -\| 1.64x\| compare-ruby is `9f4b7fc82e`.	2020-07-18 23:45:25 +09:00
Kenta Murata	a63f520971	Optimize Array#max (#3325 ) The benchmark result is below: \| \|compare-ruby\|built-ruby\| \|:---------------\|-----------:\|---------:\| \|ary2.max \| 38.837M\| 40.830M\| \| \| -\| 1.05x\| \|ary10.max \| 23.035M\| 32.626M\| \| \| -\| 1.42x\| \|ary100.max \| 5.490M\| 11.020M\| \| \| -\| 2.01x\| \|ary500.max \| 1.324M\| 2.679M\| \| \| -\| 2.02x\| \|ary1000.max \| 699.167k\| 1.403M\| \| \| -\| 2.01x\| \|ary2000.max \| 284.321k\| 570.446k\| \| \| -\| 2.01x\| \|ary3000.max \| 282.613k\| 571.683k\| \| \| -\| 2.02x\| \|ary5000.max \| 145.120k\| 285.546k\| \| \| -\| 1.97x\| \|ary10000.max \| 72.102k\| 142.831k\| \| \| -\| 1.98x\| \|ary20000.max \| 36.065k\| 72.077k\| \| \| -\| 2.00x\| \|ary50000.max \| 14.343k\| 29.139k\| \| \| -\| 2.03x\| \|ary100000.max \| 7.586k\| 14.472k\| \| \| -\| 1.91x\| \|ary1000000.max \| 726.915\| 1.495k\| \| \| -\| 2.06x\|	2020-07-18 23:45:00 +09:00
Takashi Kokubun	167d139487	Inline builtin struct aref We don't do this for aset because it might raise a FrozenError. ``` $ benchmark-driver -v --rbenv 'before;after;before --jit;after --jit' benchmark/mjit_struct_aref.yml --repeat-count=4 before: ruby 2.8.0dev (2020-07-06T01:47:11Z master `d94ef7c6b6`) [x86_64-linux] after: ruby 2.8.0dev (2020-07-06T07:11:51Z master 85425168f4) [x86_64-linux] last_commit=Inline builtin struct aref before --jit: ruby 2.8.0dev (2020-07-06T01:47:11Z master `d94ef7c6b6`) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-07-06T07:11:51Z master 85425168f4) +JIT [x86_64-linux] last_commit=Inline builtin struct aref Calculating ------------------------------------- before after before --jit after --jit mjit_struct_aref(struct) 34.783M 34.810M 48.321M 58.378M i/s - 40.000M times in 1.149996s 1.149097s 0.827794s 0.685192s Comparison: mjit_struct_aref(struct) after --jit: 58377836.7 i/s before --jit: 48321205.7 i/s - 1.21x slower after: 34809935.5 i/s - 1.68x slower before: 34782736.5 i/s - 1.68x slower ```	2020-07-06 00:14:00 -07:00
Takashi Kokubun	24fa37d87a	Make Kernel#then, #yield_self, #frozen? builtin (#3283 ) * Make Kernel#then, #yield_self, #frozen? builtin * Fix test_jit	2020-07-03 18:02:43 -07:00
Takashi Kokubun	f3a0d7a203	Rewrite Kernel#tap with Ruby (#3281 ) * Rewrite Kernel#tap with Ruby This was good for VM too, but of course my intention is to unblock JIT's inlining of a block over yield (inlining invokeyield has not been committed though). * Fix test_settracefunc About the :tap deletions, the :tap events are actually traced (we already have a TracePoint test for builtin methods), but it's filtered out by tp.path == "xyzzy" (it became "<internal:kernel>"). We could trace tp.path == "<internal:kernel>" cases too, but the lineno is impacted by kernel.rb changes and I didn't want to make it fragile for kernel.rb lineno changes.	2020-07-03 09:52:35 -07:00
Takashi Kokubun	0703e01471	Mark some Integer methods as inline (#3264 )	2020-06-27 10:07:47 -07:00
Vladimir Dementyev	6770d8f1b0	Add pattern matching with arrays benchmark	2020-06-27 13:51:03 +09:00
Takashi Kokubun	7982dc1dfd	Decide JIT-ed insn based on cached cfunc for opt_* insns. opt_eq handles rb_obj_equal inside opt_eq, and all other cfunc is handled by opt_send_without_block. Therefore we can't decide which insn should be generated by checking whether it's cfunc cc or not. ``` $ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4 before --jit: ruby 2.8.0dev (2020-06-26T05:21:43Z master `9dbc2294a6`) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-06-26T06:30:18Z master 75cece1b0b) +JIT [x86_64-linux] last_commit=Decide JIT-ed insn based on cached cfunc Calculating ------------------------------------- before --jit after --jit mjit_nil?(1) 73.878M 74.021M i/s - 40.000M times in 0.541432s 0.540391s mjit_not(1) 72.635M 74.601M i/s - 40.000M times in 0.550702s 0.536187s mjit_eq(1, nil) 7.331M 7.445M i/s - 8.000M times in 1.091211s 1.074596s mjit_eq(nil, 1) 49.450M 64.711M i/s - 8.000M times in 0.161781s 0.123627s Comparison: mjit_nil?(1) after --jit: 74020528.4 i/s before --jit: 73878185.9 i/s - 1.00x slower mjit_not(1) after --jit: 74600882.0 i/s before --jit: 72634507.6 i/s - 1.03x slower mjit_eq(1, nil) after --jit: 7444657.4 i/s before --jit: 7331304.3 i/s - 1.02x slower mjit_eq(nil, 1) after --jit: 64710790.6 i/s before --jit: 49449507.4 i/s - 1.31x slower ```	2020-06-25 23:33:08 -07:00
Takashi Kokubun	946e5cc668	Annotate Kernel#class as inline (#3250 ) ``` $ benchmark-driver -v --rbenv 'before;after;before --jit;after --jit' benchmark/mjit_class.yml --repeat-count=4 before: ruby 2.8.0dev (2020-06-23T07:09:54Z master `37a2e48d76`) [x86_64-linux] after: ruby 2.8.0dev (2020-06-23T17:29:56Z inline-class 0ff147c007) [x86_64-linux] before --jit: ruby 2.8.0dev (2020-06-23T07:09:54Z master `37a2e48d76`) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-06-23T17:29:56Z inline-class 0ff147c007) +JIT [x86_64-linux] Calculating ------------------------------------- before after before --jit after --jit mjit_class(self) 39.219M 40.060M 53.502M 69.202M i/s - 40.000M times in 1.019915s 0.998495s 0.747631s 0.578021s mjit_class(1) 39.567M 41.242M 52.100M 68.895M i/s - 40.000M times in 1.010935s 0.969885s 0.767749s 0.580591s Comparison: mjit_class(self) after --jit: 69201690.7 i/s before --jit: 53502336.4 i/s - 1.29x slower after: 40060289.1 i/s - 1.73x slower before: 39218939.2 i/s - 1.76x slower mjit_class(1) after --jit: 68895358.6 i/s before --jit: 52100353.0 i/s - 1.32x slower after: 41241993.6 i/s - 1.67x slower before: 39567314.0 i/s - 1.74x slower ```	2020-06-23 23:49:03 -07:00
Takashi Kokubun	78352fb52e	Compile opt_send for opt_* only when cc has ISeq because opt_nil/opt_not/opt_eq populates cc even when it doesn't fallback to opt_send_without_block because of vm_method_cfunc_is. ``` $ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4 before --jit: ruby 2.8.0dev (2020-06-22T08:11:24Z master `d231b8f95b`) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-06-22T08:53:27Z master e1125879ed) +JIT [x86_64-linux] last_commit=Compile opt_send for opt_* only when cc has ISeq Calculating ------------------------------------- before --jit after --jit mjit_nil?(1) 54.106M 73.693M i/s - 40.000M times in 0.739288s 0.542795s mjit_not(1) 53.398M 74.477M i/s - 40.000M times in 0.749090s 0.537075s mjit_eq(1, nil) 7.427M 6.497M i/s - 8.000M times in 1.077136s 1.231326s Comparison: mjit_nil?(1) after --jit: 73692594.3 i/s before --jit: 54106108.4 i/s - 1.36x slower mjit_not(1) after --jit: 74477487.9 i/s before --jit: 53398125.0 i/s - 1.39x slower mjit_eq(1, nil) before --jit: 7427105.9 i/s after --jit: 6497063.0 i/s - 1.14x slower ``` Actually opt_eq becomes slower by this. Maybe it's indeed using opt_send_without_block, but I'll approach that one in another commit.	2020-06-22 02:08:21 -07:00
Takashi Kokubun	4c5780e51e	Share warmup logic across MJIT benchmarks	2020-06-22 00:54:27 -07:00
Takashi Kokubun	faf93e4545	The RUBYOPT= comment is no longer needed	2020-06-22 00:20:30 -07:00
Takashi Kokubun	8838600c1e	Stop relying on `make benchmark`'s `-I$(srcdir)/benchmark/lib` These days I don't use `make benchmark`. The YAML files should be executable with bare `benchmark-driver` CLI without passing `RUBYOPT=-Ibenchmark/lib`.	2020-06-22 00:17:10 -07:00
Takashi Kokubun	7561db8c00	Introduce Primitive.attr! to annotate 'inline' (#3242 ) [Feature #15589]	2020-06-20 17:13:03 -07:00
Takashi Kokubun	95b0fed371	Make Integer#zero? a separated method and builtin (#3226 ) A prerequisite to fix https://bugs.ruby-lang.org/issues/15589 with JIT. This commit alone doesn't make a significant difference yet, but I thought this commit should be committed independently. This method override was discussed in [Misc #16961].	2020-06-20 14:55:09 -07:00
Ryuta Kamizono	9d24ddbb53	Fix `make benchmark` example `make benchmark ARGS=../benchmark/erb_render.yml` does not work. ``` % make benchmark ARGS=../benchmark/erb_render.yml /Users/kamipo/.rbenv/shims/ruby --disable=gems -rrubygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver \ --executables="compare-ruby::/Users/kamipo/.rbenv/shims/ruby --disable=gems -I.ext/common --disable-gem" \ --executables="built-ruby::./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems --disable-gem" \ ../benchmark/erb_render.yml Traceback (most recent call last): 6: from ./benchmark/benchmark-driver/exe/benchmark-driver:112:in `<main>' 5: from ./benchmark/benchmark-driver/exe/benchmark-driver:112:in `flat_map' 4: from ./benchmark/benchmark-driver/exe/benchmark-driver:112:in `each' 3: from ./benchmark/benchmark-driver/exe/benchmark-driver:122:in `block in <main>' 2: from /Users/kamipo/.rbenv/versions/2.6.6/lib/ruby/2.6.0/psych.rb:577:in `load_file' 1: from /Users/kamipo/.rbenv/versions/2.6.6/lib/ruby/2.6.0/psych.rb:577:in `open' /Users/kamipo/.rbenv/versions/2.6.6/lib/ruby/2.6.0/psych.rb:577:in `initialize': No such file or directory @ rb_sysopen - ../benchmark/erb_render.yml (Errno::ENOENT) make: *** [benchmark] Error 1 % make benchmark ARGS=benchmark/erb_render.yml /Users/kamipo/.rbenv/shims/ruby --disable=gems -rrubygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver \ --executables="compare-ruby::/Users/kamipo/.rbenv/shims/ruby --disable=gems -I.ext/common --disable-gem" \ --executables="built-ruby::./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems --disable-gem" \ benchmark/erb_render.yml Calculating ------------------------------------- compare-ruby built-ruby erb_render 825.454k 783.664k i/s - 1.500M times in 1.817181s 1.914086s Comparison: erb_render compare-ruby: 825454.4 i/s built-ruby: 783663.8 i/s - 1.05x slower ```	2020-06-07 10:33:14 +09:00
卜部昌平	d4015cfee3	add benchmark for different block handlers	2020-06-03 16:13:47 +09:00
Nobuyoshi Nakada	02cb643ddb	Added String#split benchmark for empty regexp \| \|compare-ruby\|built-ruby\| \|:--------------\|-----------:\|---------:\| \|re_chars-1 \| 169.230k\| 973.855k\| \| \| -\| 5.75x\| \|re_chars-10 \| 25.536k\| 107.598k\| \| \| -\| 4.21x\| \|re_chars-100 \| 2.621k\| 11.207k\| \| \| -\| 4.28x\| \|re_chars-1000 \| 259.098\| 1.133k\| \| \| -\| 4.37x\|	2020-05-12 22:59:58 +09:00
Nobuyoshi Nakada	693f7ab315	Optimize String#split Optimized `String#split` with `/ /` (single space regexp) as simple string splitting. [ruby-core:98272] \| \|compare-ruby\|built-ruby\| \|:--------------\|-----------:\|---------:\| \|re_space-1 \| 432.786k\| 1.539M\| \| \| -\| 3.56x\| \|re_space-10 \| 76.231k\| 191.547k\| \| \| -\| 2.51x\| \|re_space-100 \| 8.152k\| 19.557k\| \| \| -\| 2.40x\| \|re_space-1000 \| 837.405\| 2.022k\| \| \| -\| 2.41x\| ruby-core:98272: https://bugs.ruby-lang.org/issues/15771#change-85511	2020-05-12 19:58:58 +09:00
S.H	17083011ee	support builtin for Kernel#Float # Iteration per second (i/s) \| \|compare-ruby\|built-ruby\| \|:------------\|-----------:\|---------:\| \|float \| 30.395M\| 38.314M\| \| \| -\| 1.26x\| \|float_true \| 3.833M\| 27.322M\| \| \| -\| 7.13x\| \|float_false \| 4.182M\| 24.938M\| \| \| -\| 5.96x\|	2020-04-22 09:49:13 +09:00
Takashi Kokubun	f883d6219e	Unify vm benchmark prefixes to vm_ (#3028 ) The vm1_ prefix and vm2_ had had special meaning until `820ad9cb1d` and `12068aa4e9`. AFAIK there's no special meaning in vm3_ prefix. As they have confused people (like "In `benchmark` what is difference between `vm1_`, `vm2_` and `vm3_`"), I'd like to remove the obsoleted prefix as we obviated that two years ago.	2020-04-13 21:37:42 -07:00
Takashi Kokubun	310ef9f40b	Make vm_call_cfunc_with_frame a fastpath (#3027 ) when there's no need to call CALLER_SETUP_ARG and CALLER_REMOVE_EMPTY_KW_SPLAT (i.e. !rb_splat_or_kwargs_p(ci) && !calling->kw_splat). Micro benchmark: ``` $ benchmark-driver -v --rbenv 'before;after' benchmark/vm_send_cfunc.yml --repeat-count=4 before: ruby 2.8.0dev (2020-04-13T23:45:05Z master `b9d3ceee8f`) [x86_64-linux] after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux] Calculating ------------------------------------- before after vm_send_cfunc 69.585M 88.724M i/s - 100.000M times in 1.437097s 1.127096s Comparison: vm_send_cfunc after: 88723605.2 i/s before: 69584737.1 i/s - 1.28x slower ``` Optcarrot: ``` $ benchmark-driver -v --rbenv 'before;after' benchmark.yml --repeat-count=12 --output=all before: ruby 2.8.0dev (2020-04-13T23:45:05Z master `b9d3ceee8f`) [x86_64-linux] after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux] Calculating ------------------------------------- before after Optcarrot Lan_Master.nes 50.76119601545175 42.73858236484051 fps 50.76388649761503 51.04211379912850 50.80930672252514 51.39455790755538 50.90236000778749 51.75656936556145 51.01744746340430 51.86875277356489 51.06495279015112 51.88692482485558 51.07785337168974 51.93429603190578 51.20163525187862 51.95768145071314 51.34671771913112 52.45577266040274 51.35918340835583 52.53163888762858 51.46641337418146 52.62172484121034 51.50835463462257 52.85064021113239 ```	2020-04-13 20:32:59 -07:00
Takashi Kokubun	b9d3ceee8f	Unwrap vm_call_cfunc indirection on JIT for VM_METHOD_TYPE_CFUNC. This has been known to decrease optcarrot fps: ``` $ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark.yml --repeat-count=24 --output=all before --jit: ruby 2.8.0dev (2020-04-13T16:25:13Z master `fb40495cd9`) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-04-13T23:23:11Z mjit-inline-c bdcd06d159) +JIT [x86_64-linux] Calculating ------------------------------------- before --jit after --jit Optcarrot Lan_Master.nes 66.38132676191719 67.41369177299630 fps 69.42728743772243 68.90327567263054 72.16028300263211 69.62605130880686 72.46631319102777 70.48818243767207 73.37078877002490 70.79522887347566 73.69422431217367 70.99021920193194 74.01471487018695 74.69931965402584 75.48685183295630 74.86714575949016 75.54445264507932 75.97864419721677 77.28089738169756 76.48908637569581 78.04183397891302 76.54320932488021 78.36807984096562 76.59407262898067 78.92898762543574 77.31316743361343 78.93576483233765 77.97153484180480 79.13754917503078 77.98478782102325 79.62648945850653 78.02263322726446 79.86334213878064 78.26333724045934 80.05100635898518 78.60056756355614 80.26186843769584 78.91082645644468 80.34205717020330 79.01226659142263 80.62286066044338 79.32733939423721 80.95883033058557 79.63793060542024 80.97376819251613 79.73108936622778 81.23050939202896 80.18280109433088 ``` and I deleted this capability in an early stage of YARV-MJIT development: `0ab130feee` I suspect either of the following things could be the cause: * Directly calling vm_call_cfunc requires more optimization effort in GCC, resulting in 30ms-ish compilation time increase for such methods and decreasing the number of methods compiled in a benchmarked period. * Code size increase => icache miss hit These hypotheses could be verified by some methodologies. However, I'd like to introduce this regardless of the result because this blocks inlining C method's definition. 
I may revert this commit when I give up to implement inlining C method definition, which requires this change. Microbenchmark-wise, this gives slight performance improvement: ``` $ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_send_cfunc.yml --repeat-count=4 before --jit: ruby 2.8.0dev (2020-04-13T16:25:13Z master `fb40495cd9`) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-04-13T23:23:11Z mjit-inline-c bdcd06d159) +JIT [x86_64-linux] Calculating ------------------------------------- before --jit after --jit mjit_send_cfunc 41.961M 56.489M i/s - 100.000M times in 2.383143s 1.770244s Comparison: mjit_send_cfunc after --jit: 56489372.5 i/s before --jit: 41961388.1 i/s - 1.35x slower ```	2020-04-13 16:45:05 -07:00
Takashi Kokubun	151f8be40d	Make JIT-ed leave insn leaf to eliminate sp / pc moves by cancelling JIT execution on interrupts. $ benchmark-driver benchmark.yml -v --rbenv 'before --jit;after --jit' --repeat-count=12 --output=all before --jit: ruby 2.8.0dev (2020-04-01T03:48:56Z master `5a81562dfe`) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-04-01T04:58:01Z master 39beb26a27) +JIT [x86_64-linux] Calculating ------------------------------------- before --jit after --jit Optcarrot Lan_Master.nes 75.06409603894944 76.06422026555558 fps 75.12025067279242 78.48161731616810 77.42020273492177 79.78958240950033 79.07253675128945 79.88645902325614 79.99179109732327 80.33743931749331 80.07633091008627 80.53790081529166 80.15450942667547 80.99048270668010 80.48372803283709 81.70497146081003 80.57410149187352 82.79494539467382 81.80449157081202 82.85797792223954 82.24629397834902 83.00603891515506 82.63708148686703 83.23221006969828 $ benchmark-driver -v --rbenv 'before;before --jit;after --jit' benchmark/mjit_leave.yml --repeat-count=4 before: ruby 2.8.0dev (2020-04-01T03:48:56Z master `5a81562dfe`) [x86_64-linux] before --jit: ruby 2.8.0dev (2020-04-01T03:48:56Z master `5a81562dfe`) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-04-01T04:58:01Z master 39beb26a27) +JIT [x86_64-linux] Calculating ------------------------------------- before before --jit after --jit mjit_leave 106.656M 82.786M 91.635M i/s - 200.000M times in 1.875183s 2.415881s 2.182569s Comparison: mjit_leave before: 106656239.9 i/s after --jit: 91635143.7 i/s - 1.16x slower before --jit: 82785537.2 i/s - 1.29x slower	2020-03-31 22:10:16 -07:00
Takashi Kokubun	dad110d068	Remove an unused pragma It originally had a string literal, but it no longer has one.	2020-03-30 23:30:08 -07:00
Takashi Kokubun	b736ea63bd	Optimize exivar access on JIT-ed getivar JIT support of `dd723771c1`. $ benchmark-driver -v --rbenv 'before;before --jit;after --jit' benchmark/mjit_exivar.yml --repeat-count=4 before: ruby 2.8.0dev (2020-03-30T12:32:26Z master `e5db3da9d3`) [x86_64-linux] before --jit: ruby 2.8.0dev (2020-03-30T12:32:26Z master `e5db3da9d3`) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-03-31T05:57:24Z mjit-exivar 128625baec) +JIT [x86_64-linux] Calculating ------------------------------------- before before --jit after --jit mjit_exivar 57.944M 53.579M 54.471M i/s - 200.000M times in 3.451588s 3.732772s 3.671687s Comparison: mjit_exivar before: 57944345.1 i/s after --jit: 54470876.7 i/s - 1.06x slower before --jit: 53579483.4 i/s - 1.08x slower	2020-03-30 23:16:35 -07:00
Jeremy Evans	d2c41b1bff	Reduce allocations for keyword argument hashes Previously, passing a keyword splat to a method always allocated a hash on the caller side, and accepting arbitrary keywords in a method allocated a separate hash on the callee side. Passing explicit keywords to a method that accepted a keyword splat did not allocate a hash on the caller side, but resulted in two hashes allocated on the callee side. This commit makes passing a single keyword splat to a method not allocate a hash on the caller side. Passing multiple keyword splats or a mix of explicit keywords and a keyword splat still generates a hash on the caller side. On the callee side, if arbitrary keywords are not accepted, it does not allocate a hash. If arbitrary keywords are accepted, it will allocate a hash, but this commit uses a callinfo flag to indicate whether the caller already allocated a hash, and if so, the callee can use the passed hash without duplicating it. So this commit should make it so that a maximum of a single hash is allocated during method calls. To set the callinfo flag appropriately, method call argument compilation checks if only a single keyword splat is given. If only one keyword splat is given, the VM_CALL_KW_SPLAT_MUT callinfo flag is not set, since in that case the keyword splat is passed directly and not mutable. If more than one splat is used, a new hash needs to be generated on the caller side, and in that case the callinfo flag is set, indicating the keyword splat is mutable by the callee. In compile_hash, used for both hash and keyword argument compilation, if compiling keyword arguments and only a single keyword splat is used, pass the argument directly. On the caller side, in vm_args.c, the callinfo flag needs to be recognized and handled. Because the keyword splat argument may not be a hash, it needs to be converted to a hash first if not. Then, unless the callinfo flag is set, the hash needs to be duplicated. 
The temporary copy of the callinfo flag, kw_flag, is updated if a hash was duplicated, to prevent the need to duplicate it again. If we are converting to a hash or duplicating a hash, we need to update the argument array, which can include duplicating the positional splat array if one was passed. CALLER_SETUP_ARG and a couple of other places need to be modified to handle similar issues for other types of calls. This includes fairly comprehensive tests for different ways keywords are handled internally, checking that you get equal results but that keyword splats on the caller side result in distinct objects for keyword rest parameters. Included are benchmarks for keyword argument calls. Brief results when compiled without optimization: def kw(a: 1) a end def kws(**kw) kw end h = {a: 1} kw(a: 1) # about same kw(**h) # 2.37x faster kws(a: 1) # 1.30x faster kws(**h) # 2.19x faster kw(a: 1, **h) # 1.03x slower kw(**h, **h) # about same kws(a: 1, **h) # 1.16x faster kws(**h, **h) # 1.14x faster	2020-03-17 12:09:43 -07:00
S.H	290d608637	support builtin for Kernel#clone	2020-03-17 19:37:07 +09:00
Nobuyoshi Nakada	5e897227ff	Added more benchmarks for String	2020-02-29 15:42:24 +09:00
Nobuyoshi Nakada	05229cef45	Improve `String#slice!` performance Instead of searching twice to extract and to delete, extract and delete the found position at the first search. This makes it nearly twice as fast for both regexps and strings. \| \|compare-ruby\|built-ruby\| \|:-------------\|-----------:\|---------:\| \|regexp-short \| 2.143M\| 3.918M\| \|regexp-long \| 105.162k\| 205.410k\| \|string-short \| 3.789M\| 7.964M\| \|string-long \| 1.301M\| 2.457M\|	2020-01-31 17:12:05 +09:00
Kazuhiro NISHIYAMA	c90fc55a1f	Drop executable bit of *.{yml,h,mk.tmpl}	2020-01-22 16:04:38 +09:00
Lourens Naudé	40c57ad4a1	Let execution context local storage be an ID table	2020-01-11 14:40:36 +13:00
Lourens Naudé	592d7ceeeb	Speeds up fallback to Hash#default_proc in rb_hash_aref by removing a method call	2020-01-08 18:09:52 +09:00
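
The keyword-argument call forms discussed in commit d2c41b1bff (explicit keywords, a single keyword splat, and mixed or multiple splats) can be illustrated with a small sketch. This is only an illustration: the method names `kw` and `kws` follow the commit message, the variable `received` is a hypothetical name introduced here, and the snippet shows observable behavior rather than measuring the allocation counts the commit describes.

```ruby
# Illustrative sketch; method names follow the commit message of d2c41b1bff.

# Accepts one specific keyword only.
def kw(a: 1)
  a
end

# Accepts arbitrary keywords through a keyword rest parameter.
def kws(**kw)
  kw
end

h = { a: 1 }

# The call forms benchmarked in the commit message.
kw(a: 1)        # explicit keyword
kw(**h)         # single keyword splat
kws(a: 1)       # explicit keyword into a keyword rest parameter
kws(**h)        # single keyword splat into a keyword rest parameter
kw(a: 1, **h)   # explicit keyword mixed with a splat
kw(**h, **h)    # multiple splats
kws(a: 1, **h)
kws(**h, **h)

# `received` is a hypothetical name: the hash seen by the callee has the
# same contents as `h` but is a distinct object, as the commit's tests check.
received = kws(**h)
puts received == h        # same contents
puts received.equal?(h)   # identity comparison
```

Mutating `received` therefore leaves the caller's `h` untouched, which is why the callee may only reuse a passed hash when the caller has already allocated a fresh one.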


341 Commits