Граф коммитов

208 Коммитов

Автор SHA1 Сообщение Дата
Takashi Kokubun d409837729
Cache access to reg_cfp->self on JIT
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-11-27T06:41:15Z master 8ce1711c25) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-11-27T08:36:02Z master 2c592126b9) +JIT [x86_64-linux]
last_commit=Cache access to reg_cfp->self on JIT
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    82.40522392468650     82.66023870551237 fps
                            82.67998539899482     83.08660305312587
                            85.51280693947453     87.09311989553235
                            86.32925337181406     87.16115255191410
                            87.35617494926235     87.30699391518075
                            87.91865339426212     88.47590342996875
                            88.11573661006648     88.64778616696353
                            88.16060826662158     88.67015079203991
                            88.21639244865058     89.19630739497482
                            88.47241577897603     89.23443637947730
                            89.37087287229809     89.57052723997015
                            89.46969964699964     89.97803363889025
```
2020-11-27 00:42:42 -08:00
Takashi Kokubun 8ce1711c25
Revert "Set VM_FRAME_FLAG_FINISH at once on MJIT"
This reverts commit 4d2c8edca6.

Unfortunately this seems to cause several issues:
https://github.com/ruby/ruby/runs/1462188376?check_suite_focus=true
http://ci.rvm.jp/results/trunk-mjit-wait@phosphorus-docker/3272802
2020-11-26 22:41:15 -08:00
Takashi Kokubun 4d2c8edca6
Set VM_FRAME_FLAG_FINISH at once on MJIT
Performance is probably improved?

$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-11-27T04:37:47Z master 69e77e81dc) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-11-27T05:28:19Z master df6b05c6dd) +JIT [x86_64-linux]
last_commit=Set VM_FRAME_FLAG_FINISH at once
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    80.89292998533379     82.19497327502751 fps
                            80.93130641142331     85.13943315260148
                            81.06214830270119     87.43757879797808
                            82.29172808453910     87.89942441487113
                            84.61206450455929     87.91309779491075
                            85.44545883567997     87.98026086648694
                            86.02923132404449     88.03081060383973
                            86.07411817365879     88.14650206137341
                            86.34348799602836     88.32791633649961
                            87.90257338977324     88.57599644892220
                            88.58006509876580     88.67426384743277
                            89.26611118140011     88.81669430874207

This should have no bad impact on VM because this function is ALWAYS_INLINE.
2020-11-26 21:32:14 -08:00
Takashi Kokubun 1fea0367d2
Clarify the intention of `false &&` 2020-11-22 22:09:42 -08:00
Nobuyoshi Nakada bf951c763d
An ellipsis (...) can only be placed at the beginning 2020-10-29 18:14:27 +09:00
Nobuyoshi Nakada 396e921044
Escape '/*' within block comment too 2020-10-26 09:01:27 +09:00
Koichi Sasada f6661f5085 sync RClass::ext::iv_index_tbl
iv_index_tbl manages instance variable indexes (ID -> index).
This data structure should be synchronized with other ractors
so introduce some VM locks.

This patch also introduced atomic ivar cache used by
set/getinlinecache instructions. To make updating ivar cache (IVC),
we changed iv_index_tbl data structure to manage (ID -> entry)
and an entry points serial and index. IVC points to this entry so
that cache update becomes atomically.
2020-10-17 08:18:04 +09:00
Koichi Sasada fad97f1f96 sync generic_ivtbl
generic_ivtbl is a process global table to maintain instance variables
for non T_OBJECT/T_CLASS/... objects. So we need to protect them
for multi-Ractor exection.

Hint: we can make them Ractor local for unshareable objects, but
      now it is premature optimization.
2020-10-14 16:36:55 +09:00
Koichi Sasada 79df14c04b Introduce Ractor mechanism for parallel execution
This commit introduces Ractor mechanism to run Ruby program in
parallel. See doc/ractor.md for more details about Ractor.
See ticket [Feature #17100] to see the implementation details
and discussions.

[Feature #17100]

This commit does not complete the implementation. You can find
many bugs on using Ractor. Also the specification will be changed
so that this feature is experimental. You will see a warning when
you make the first Ractor with `Ractor.new`.

I hope this feature can help programmers from thread-safety issues.
2020-09-03 21:11:06 +09:00
Alan Wu 4c3f0597de Remove the pc argument of vm_trace()
This makes the binary 272 bytes smaller on -O3 GCC 10.2.0.
2020-09-01 22:02:29 -04:00
卜部昌平 acd8ee8dbc tool/prelude.c.tmpl: use RubyVM::CEscape
Do not repeat yourself.
2020-08-11 16:51:07 +09:00
卜部昌平 b0eb5aa344 RubyVM::CEscape#rstring2cstr: do not escape '
A single quote "is representable either by itself or by the escape
sequence", according to ISO/IEC 9899 (checked all versions).  So this is
not a bug fix.  But the generated output is a bit readable without
backslashes.
2020-08-11 16:51:07 +09:00
卜部昌平 1fb4e28002 skip inlining cexpr! that are not attr! inline
Requested by ko1.
2020-07-16 11:49:09 +09:00
卜部昌平 8d3a084572 _mjit_compile_invokebuiltin: sp_inc can be negative
Was my bad to assume sp_inc was positive.  Real criteria is the
calculated sp is non-negative.  We have to assert that.
2020-07-14 13:15:06 +09:00
卜部昌平 927fe2422f mk_builtin_loader.rb: STACK_ADDR_FROM_TOP unusable
Stacks are emulated in MJIT, must not touch the original VM stack.

See also http://ci.rvm.jp/results/trunk-mjit-wait@silicon-docker/3061353
2020-07-13 12:30:43 +09:00
卜部昌平 7e536b3be2 builtin.h: avoid copy&paste
Instead of doubling the invokebuiltin logic here and there, use the same
insns.def definition for both MJIT/non-JIT situations.
2020-07-13 08:56:18 +09:00
卜部昌平 9721f477c7 inline Primitive.cexpr!
We can obtain the verbatim source code of Primitive.cexpr!.  Why not
paste that content into the JITed program.
2020-07-13 08:56:18 +09:00
卜部昌平 f66e0212ef precalc invokebuiltin destinations
Noticed that struct rb_builtin_function is a purely compile-time
constant.  MJIT can eliminate some runtime calculations by statically
generate dedicated C code generator for each builtin functions.
2020-07-13 08:56:18 +09:00
Takashi Kokubun 7fa3c71bec
Make sure vm_call_cfunc uses inlined cc
which is checked by the first guard. When JIT-inlined cc and operand
cd->cc are different, the JIT-ed code might wrongly dispatch cd->cc even
while class check is done with another cc inlined by JIT.

This fixes SEGV on railsbench.
2020-07-10 00:44:02 -07:00
Takashi Kokubun e4f7eee009
Check ROBJECT_EMBED on guards-merged ivar access
Fix CI failure like
http://ci.rvm.jp/results/trunk-mjit-wait@silicon-docker/3043247
introduced by a69dd699ee
2020-07-04 16:02:46 -07:00
Takashi Kokubun a69dd699ee
Merge ivar guards on JIT (#3284)
when an ISeq has multiple ivar accesses.
2020-07-03 17:52:52 -07:00
Koichi Sasada a0f12a0258
Use ID instead of GENTRY for gvars. (#3278)
Use ID instead of GENTRY for gvars.

Global variables are compiled into GENTRY (a pointer to struct
rb_global_entry). This patch replace this GENTRY to ID and
make the code simple.

We need to search GENTRY from ID every time (st_lookup), so
additional overhead will be introduced.
However, the performance of accessing global variables is not
important now a day and this simplicity helps Ractor development.
2020-07-03 16:56:44 +09:00
Takashi Kokubun 40b40523dc
Show what's inlined first in "JIT inline" log
and add a debug log
2020-06-25 23:50:19 -07:00
Takashi Kokubun 7982dc1dfd
Decide JIT-ed insn based on cached cfunc
for opt_* insns.

opt_eq handles rb_obj_equal inside opt_eq, and all other cfunc is
handled by opt_send_without_block. Therefore we can't decide which insn
should be generated by checking whether it's cfunc cc or not.

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-06-26T05:21:43Z master 9dbc2294a6) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-26T06:30:18Z master 75cece1b0b) +JIT [x86_64-linux]
last_commit=Decide JIT-ed insn based on cached cfunc
Calculating -------------------------------------
                     before --jit  after --jit
        mjit_nil?(1)      73.878M      74.021M i/s -     40.000M times in 0.541432s 0.540391s
         mjit_not(1)      72.635M      74.601M i/s -     40.000M times in 0.550702s 0.536187s
     mjit_eq(1, nil)       7.331M       7.445M i/s -      8.000M times in 1.091211s 1.074596s
     mjit_eq(nil, 1)      49.450M      64.711M i/s -      8.000M times in 0.161781s 0.123627s

Comparison:
                     mjit_nil?(1)
         after --jit:  74020528.4 i/s
        before --jit:  73878185.9 i/s - 1.00x  slower

                      mjit_not(1)
         after --jit:  74600882.0 i/s
        before --jit:  72634507.6 i/s - 1.03x  slower

                  mjit_eq(1, nil)
         after --jit:   7444657.4 i/s
        before --jit:   7331304.3 i/s - 1.02x  slower

                  mjit_eq(nil, 1)
         after --jit:  64710790.6 i/s
        before --jit:  49449507.4 i/s - 1.31x  slower
```
2020-06-25 23:33:08 -07:00
Takashi Kokubun 37a2e48d76
Avoid generating opt_send with cfunc cc with JIT
only for opt_nil_p and opt_not.

While vm_method_cfunc_is is used for opt_eq too, many fast paths of it
don't call it. So if it's populated, it should generate opt_send,
regardless of cfunc or not. And again, opt_neq isn't relevant due to the
difference in operands.
So opt_nil_p and opt_not are the only variants using vm_method_cfunc_is
like they use.

```
$ benchmark-driver -v --rbenv 'before2 --jit::ruby --jit;before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before2 --jit: ruby 2.8.0dev (2020-06-22T08:37:37Z master 3238641750) +JIT [x86_64-linux]
before --jit: ruby 2.8.0dev (2020-06-23T01:01:24Z master 9ce2066209) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-23T06:58:37Z master 17e9df3157) +JIT [x86_64-linux]
last_commit=Avoid generating opt_send with cfunc cc with JIT
Calculating -------------------------------------
                     before2 --jit  before --jit  after --jit
        mjit_nil?(1)       54.204M       75.536M      75.031M i/s -     40.000M times in 0.737947s 0.529548s 0.533110s
         mjit_not(1)       53.822M       70.921M      71.920M i/s -     40.000M times in 0.743195s 0.564007s 0.556171s
     mjit_eq(1, nil)        7.367M        6.496M       7.331M i/s -      8.000M times in 1.085882s 1.231470s 1.091327s

Comparison:
                     mjit_nil?(1)
        before --jit:  75536059.3 i/s
         after --jit:  75031409.4 i/s - 1.01x  slower
       before2 --jit:  54204431.6 i/s - 1.39x  slower

                      mjit_not(1)
         after --jit:  71920324.1 i/s
        before --jit:  70921063.1 i/s - 1.01x  slower
       before2 --jit:  53821697.6 i/s - 1.34x  slower

                  mjit_eq(1, nil)
       before2 --jit:   7367280.0 i/s
         after --jit:   7330527.4 i/s - 1.01x  slower
        before --jit:   6496302.8 i/s - 1.13x  slower
```
2020-06-23 00:09:54 -07:00
Takashi Kokubun 78352fb52e
Compile opt_send for opt_* only when cc has ISeq
because opt_nil/opt_not/opt_eq populates cc even when it doesn't
fallback to opt_send_without_block because of vm_method_cfunc_is.

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-06-22T08:11:24Z master d231b8f95b) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-22T08:53:27Z master e1125879ed) +JIT [x86_64-linux]
last_commit=Compile opt_send for opt_* only when cc has ISeq
Calculating -------------------------------------
                     before --jit  after --jit
        mjit_nil?(1)      54.106M      73.693M i/s -     40.000M times in 0.739288s 0.542795s
         mjit_not(1)      53.398M      74.477M i/s -     40.000M times in 0.749090s 0.537075s
     mjit_eq(1, nil)       7.427M       6.497M i/s -      8.000M times in 1.077136s 1.231326s

Comparison:
                     mjit_nil?(1)
         after --jit:  73692594.3 i/s
        before --jit:  54106108.4 i/s - 1.36x  slower

                      mjit_not(1)
         after --jit:  74477487.9 i/s
        before --jit:  53398125.0 i/s - 1.39x  slower

                  mjit_eq(1, nil)
        before --jit:   7427105.9 i/s
         after --jit:   6497063.0 i/s - 1.14x  slower
```

Actually opt_eq becomes slower by this. Maybe it's indeed using
opt_send_without_block, but I'll approach that one in another commit.
2020-06-22 02:08:21 -07:00
Takashi Kokubun d9f608b686
Verify builtin inline annotation with VM_CHECK_MODE (#3244)
* Verify builtin inline annotation with VM_CHECK_MODE

* Remove static to fix the link issue on MJIT
2020-06-21 10:27:04 -07:00
Takashi Kokubun 7561db8c00
Introduce Primitive.attr! to annotate 'inline' (#3242)
[Feature #15589]
2020-06-20 17:13:03 -07:00
Takashi Kokubun e544a3a23c
Remove obsoleted opt_call_c_function insn (#3232)
* Remove obsoleted opt_call_c_function insn

* Keep opt_call_c_function with DEFINE_INSN_IF
2020-06-17 09:16:01 -07:00
Takashi Kokubun 0bd025ad69
Add a debug_counter for JIT cancel on leave 2020-05-28 22:45:35 -07:00
Takashi Kokubun b16a2aa938
Reduce code size for rb_class_of
by inlining only hot path.

=== mame/optcarrot ===

$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark.yml --repeat-count=24 --output=all
before --jit: ruby 2.8.0dev (2020-05-18T05:21:31Z master 0e5a58b6bf) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-05-18T06:12:04Z master 0e3d71a8d1) +JIT [x86_64-linux]
last_commit=Reduce code size for rb_class_of
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    71.62880463568773     70.95730063273503 fps
                            71.73973684273152     71.98447841929851
                            75.03923801841310     75.54262519509039
                            75.16300287174957     77.64029272984344
                            75.16834828625935     78.67861469580785
                            75.17670723726911     78.81879353707393
                            75.67637908020630     79.18188850392886
                            76.19843953215396     79.66484891814478
                            77.28166716118808     79.80278072861037
                            77.38509903325165     80.05859292679696
                            78.12693418455953     80.34624804808006
                            78.73654441746730     80.66326571254345
                            79.25387513454415     80.69760605740196
                            79.44137881689524     81.32053489212245
                            79.50497657368358     81.50250852553751
                            79.62401328582868     82.27544931834611
                            79.79178811723664     82.67455264522741
                            81.20275352937418     82.93857260493297
                            81.57027048640776     83.15019118788184
                            81.63373188649095     83.20728816044721
                            81.93420437766426     83.25027576772972
                            82.05716136357167     83.27072145898173
                            82.21070805525066     83.36008265822194
                            82.56924063784872     83.36112268888493

=== benchmark-driver/sinatra ===

[rps]
before: 13143.49 rps
after: 13505.70 rps

[inlined rb_class_of size]
before: 11.5K
after: 3.8K

(calculated by `dwarftree --die inlined_subroutine --flat --merge --show-size`)
2020-05-17 23:38:19 -07:00
Takashi Kokubun a5073c053f
Always correct sp on leave cancel
Even if local stack optimization is not used and values are written to
VM stack, the stack pointer itself may not be moved properly. So this
should be always moved on JIT cancellation.

By the way it's hard to write a test for this because if we try to
generate an interrupt, it will be a method call and it consumes the
interrupt by itself on popping a frame.
2020-05-06 20:26:03 -07:00
Takashi Kokubun f5ddbba9a2
Include unit id in a function name of an inlined method
I'm trying to make it possible to include all JIT-ed code in a single C
file. This is needed to guarantee uniqueness of all function names
2020-04-30 23:08:13 -07:00
Takashi Kokubun 04e56958e6
Make sure newarraykwsplat accesses a correct index
on stack when local_stack_p is enabled.

This fixes `RB_FL_TEST_RAW:"RB_FL_ABLE(obj)"` assertion failure
on power_assert's test with JIT enabled.
2020-04-18 01:41:50 -07:00
Takashi Kokubun 310ef9f40b
Make vm_call_cfunc_with_frame a fastpath (#3027)
when there's no need to call CALLER_SETUP_ARG and CALLER_REMOVE_EMPTY_KW_SPLAT
(i.e. !rb_splat_or_kwargs_p(ci) && !calling->kw_splat).

Micro benchmark:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark/vm_send_cfunc.yml --repeat-count=4
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
                         before       after
       vm_send_cfunc    69.585M     88.724M i/s -    100.000M times in 1.437097s 1.127096s

Comparison:
                    vm_send_cfunc
               after:  88723605.2 i/s
              before:  69584737.1 i/s - 1.28x  slower
```

Optcarrot:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark.yml --repeat-count=12 --output=all
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
                                       before                 after
Optcarrot Lan_Master.nes    50.76119601545175     42.73858236484051 fps
                            50.76388649761503     51.04211379912850
                            50.80930672252514     51.39455790755538
                            50.90236000778749     51.75656936556145
                            51.01744746340430     51.86875277356489
                            51.06495279015112     51.88692482485558
                            51.07785337168974     51.93429603190578
                            51.20163525187862     51.95768145071314
                            51.34671771913112     52.45577266040274
                            51.35918340835583     52.53163888762858
                            51.46641337418146     52.62172484121034
                            51.50835463462257     52.85064021113239
```
2020-04-13 20:32:59 -07:00
Takashi Kokubun b9d3ceee8f
Unwrap vm_call_cfunc indirection on JIT
for VM_METHOD_TYPE_CFUNC.

This has been known to decrease optcarrot fps:

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark.yml --repeat-count=24 --output=all
before --jit: ruby 2.8.0dev (2020-04-13T16:25:13Z master fb40495cd9) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-04-13T23:23:11Z mjit-inline-c bdcd06d159) +JIT [x86_64-linux]
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    66.38132676191719     67.41369177299630 fps
                            69.42728743772243     68.90327567263054
                            72.16028300263211     69.62605130880686
                            72.46631319102777     70.48818243767207
                            73.37078877002490     70.79522887347566
                            73.69422431217367     70.99021920193194
                            74.01471487018695     74.69931965402584
                            75.48685183295630     74.86714575949016
                            75.54445264507932     75.97864419721677
                            77.28089738169756     76.48908637569581
                            78.04183397891302     76.54320932488021
                            78.36807984096562     76.59407262898067
                            78.92898762543574     77.31316743361343
                            78.93576483233765     77.97153484180480
                            79.13754917503078     77.98478782102325
                            79.62648945850653     78.02263322726446
                            79.86334213878064     78.26333724045934
                            80.05100635898518     78.60056756355614
                            80.26186843769584     78.91082645644468
                            80.34205717020330     79.01226659142263
                            80.62286066044338     79.32733939423721
                            80.95883033058557     79.63793060542024
                            80.97376819251613     79.73108936622778
                            81.23050939202896     80.18280109433088
```

and I deleted this capability in an early stage of YARV-MJIT development:
0ab130feee

I suspect either of the following things could be the cause:

* Directly calling vm_call_cfunc requires more optimization effort in GCC,
  resulting in 30ms-ish compilation time increase for such methods and
  decreasing the number of methods compiled in a benchmarked period.

* Code size increase => icache miss hit

These hypotheses could be verified by some methodologies. However, I'd
like to introduce this regardless of the result because this blocks
inlining C method's definition.

I may revert this commit when I give up to implement inlining C method
definition, which requires this change.

Microbenchmark-wise, this gives slight performance improvement:

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_send_cfunc.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-04-13T16:25:13Z master fb40495cd9) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-04-13T23:23:11Z mjit-inline-c bdcd06d159) +JIT [x86_64-linux]
Calculating -------------------------------------
                     before --jit  after --jit
     mjit_send_cfunc      41.961M      56.489M i/s -    100.000M times in 2.383143s 1.770244s

Comparison:
                  mjit_send_cfunc
         after --jit:  56489372.5 i/s
        before --jit:  41961388.1 i/s - 1.35x  slower
```
2020-04-13 16:45:05 -07:00
Takashi Kokubun b66d7d9be5
Remove unused variable stack_size
_mjit_compile_send.erb doesn't use _mjit_compile_insn_body.erb
2020-04-06 02:00:23 -07:00
Takashi Kokubun 3194cd36e2
Delay definition of pc_moved_p
to unify the duplicated declarations and to make sure it's not used
until set properly.

Also changed it from legacy TRUE/FALSE to stdbool.
2020-04-06 01:55:18 -07:00
Takashi Kokubun 928bb17770
Fix -Wshorten-64-to-32 in 4f802828f4 2020-04-06 01:50:12 -07:00
Takashi Kokubun 4f802828f4
Refactor `argc` in mjit_compile_send
using sp_inc_of_sendish for consistency and to make it easier to
understand
2020-04-06 01:42:32 -07:00
Takashi Kokubun 1a33845215
Update outdated comments in mjit_compile_send
and simplify `v` variable references a little.

There's no CALL_METHOD anymore, and the original code lives in
vm_sendish instead of insns.def now.
2020-04-06 01:31:11 -07:00
Takashi Kokubun f984975c4d
Collapse `if` conditions to decrease indentation
in mjit_compile_send to clarify it's not that deeply branched.
2020-04-06 00:45:43 -07:00
Nobuyoshi Nakada ec03d13742
Fallback if Pathname#relative_path_from fails
It can fail due to different prefixes, e.g., drive letters or UNC
paths on DOSish platform.
2020-04-05 11:58:31 +09:00
Takashi Kokubun 151f8be40d
Make JIT-ed leave insn leaf
to eliminate sp / pc moves by cancelling JIT execution on interrupts.

$ benchmark-driver benchmark.yml -v --rbenv 'before --jit;after --jit' --repeat-count=12 --output=all
before --jit: ruby 2.8.0dev (2020-04-01T03:48:56Z master 5a81562dfe) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-04-01T04:58:01Z master 39beb26a27) +JIT [x86_64-linux]
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    75.06409603894944     76.06422026555558 fps
                            75.12025067279242     78.48161731616810
                            77.42020273492177     79.78958240950033
                            79.07253675128945     79.88645902325614
                            79.99179109732327     80.33743931749331
                            80.07633091008627     80.53790081529166
                            80.15450942667547     80.99048270668010
                            80.48372803283709     81.70497146081003
                            80.57410149187352     82.79494539467382
                            81.80449157081202     82.85797792223954
                            82.24629397834902     83.00603891515506
                            82.63708148686703     83.23221006969828

$ benchmark-driver -v --rbenv 'before;before --jit;after --jit' benchmark/mjit_leave.yml --repeat-count=4
before: ruby 2.8.0dev (2020-04-01T03:48:56Z master 5a81562dfe) [x86_64-linux]
before --jit: ruby 2.8.0dev (2020-04-01T03:48:56Z master 5a81562dfe) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-04-01T04:58:01Z master 39beb26a27) +JIT [x86_64-linux]
Calculating -------------------------------------
                         before  before --jit  after --jit
          mjit_leave   106.656M       82.786M      91.635M i/s -    200.000M times in 1.875183s 2.415881s 2.182569s

Comparison:
                       mjit_leave
              before: 106656239.9 i/s
         after --jit:  91635143.7 i/s - 1.16x  slower
        before --jit:  82785537.2 i/s - 1.29x  slower
2020-03-31 22:10:16 -07:00
Takashi Kokubun b736ea63bd
Optimize exivar access on JIT-ed getivar
JIT support of dd723771c1.

$ benchmark-driver -v --rbenv 'before;before --jit;after --jit' benchmark/mjit_exivar.yml --repeat-count=4
before: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) [x86_64-linux]
before --jit: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-03-31T05:57:24Z mjit-exivar 128625baec) +JIT [x86_64-linux]
Calculating -------------------------------------
                         before  before --jit  after --jit
         mjit_exivar    57.944M       53.579M      54.471M i/s -    200.000M times in 3.451588s 3.732772s 3.671687s

Comparison:
                      mjit_exivar
              before:  57944345.1 i/s
         after --jit:  54470876.7 i/s - 1.06x  slower
        before --jit:  53579483.4 i/s - 1.08x  slower
2020-03-30 23:16:35 -07:00
Takashi Kokubun 0cd7be99e9
Avoid referring to an old value of realloc
OpenBSD RubyCI has failed with SEGV since 4bcd5981e8.
https://rubyci.org/logs/rubyci.s3.amazonaws.com/openbsd-current/ruby-master/log/20200312T223005Z.fail.html.gz

This was because `status->cc_entries` could be stale after `realloc` call
for inlined iseqs.
2020-03-12 22:51:34 -07:00
Takashi Kokubun da4b97a0e3
Pin and inline cme in JIT-ed method calls
```
$ benchmark-driver benchmark.yml -v --rbenv 'before --jit;after --jit' --repeat-count=12 --output=all
before --jit: ruby 2.8.0dev (2020-03-11T07:43:12Z master e89ebdcb87) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-03-11T07:54:18Z master 143776a0da) +JIT [x86_64-linux]
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    73.86976729561439     77.20184819316513 fps
                            74.46997176460742     78.43493030231805
                            77.59686308754307     78.55714131655935
                            78.53693921126656     79.08984255596820
                            80.10158944910573     79.17751731838183
                            80.12254974411167     79.60853122429181
                            80.28678655204945     79.74674066871896
                            80.38690681095379     79.90624544440300
                            80.79223498756919     80.57881084206193
                            80.82857188422419     80.70677614429169
                            81.06447745878245     81.03868541295149
                            81.21620802278490     82.16354660940607
```
2020-03-11 00:59:34 -07:00
Takashi Kokubun 9511b4c8fa
Optimize away call data refs in JIT-ed method calls
According to ko1, `cd->cc != cc` was for GC.compact guard.
As we pin cc by rb_gc_mark(), we don't need the check.

```
$ benchmark-driver benchmark.yml -v --rbenv 'before --jit;after --jit' --repeat-count=12 --output=all
before --jit: ruby 2.8.0dev (2020-03-11T05:36:48Z master da6948753e) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-03-11T06:26:34Z master 36b20b8b4a) +JIT [x86_64-linux]
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    74.03480698689405     71.63404803273507 fps
                            74.15085286586992     73.43923328104295
                            75.51738277744781     75.75465268365384
                            76.24922600109410     76.74071607861318
                            76.45513422802325     77.47521029238116
                            76.86617230739330     78.14759496269018
                            77.71509137131933     79.14051571125866
                            77.72839157096146     79.35884822673313
                            78.25218904561633     79.92538876408051
                            78.72521071333249     79.98075556706726
                            78.79950460165091     80.51747831497875
                            79.43884960720381     80.97973166525254
```
2020-03-10 23:29:50 -07:00
Takashi Kokubun aa3a7d6d74
Remove an unnecessary TODO comment
Fixing 4bcd5981e8/mjit.c (L338)
should be the right solution for this. We may not be able to free the cc immediately.

Plus, we're not copying cc but just holding references to be marked. cc
should be GC-ed once jit_unit is freed.
2020-03-10 01:33:38 -07:00
Takashi Kokubun 4bcd5981e8
Capture inlined iseq's cc entries in root iseq's
jit_unit to avoid marking wrong cc entries when inlined iseq is compiled
multiple times, resolving the TODO added by daf7c48d88.

This obviates pseudo jit_unit in inlined iseq introduced by 7ec2359374
and fixes memory leak of the adhoc unit.
2020-03-10 00:53:35 -07:00