Граф коммитов

102 Коммитов

Автор SHA1 Сообщение Дата
Takashi Kokubun 53babf35ef
Inline getconstant on JIT (#3906)
* Inline getconstant on JIT

* Support USE_MJIT=0
2020-12-16 06:24:07 -08:00
Takashi Kokubun d409837729
Cache access to reg_cfp->self on JIT
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-11-27T06:41:15Z master 8ce1711c25) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-11-27T08:36:02Z master 2c592126b9) +JIT [x86_64-linux]
last_commit=Cache access to reg_cfp->self on JIT
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    82.40522392468650     82.66023870551237 fps
                            82.67998539899482     83.08660305312587
                            85.51280693947453     87.09311989553235
                            86.32925337181406     87.16115255191410
                            87.35617494926235     87.30699391518075
                            87.91865339426212     88.47590342996875
                            88.11573661006648     88.64778616696353
                            88.16060826662158     88.67015079203991
                            88.21639244865058     89.19630739497482
                            88.47241577897603     89.23443637947730
                            89.37087287229809     89.57052723997015
                            89.46969964699964     89.97803363889025
```
2020-11-27 00:42:42 -08:00
Takashi Kokubun 8ce1711c25
Revert "Set VM_FRAME_FLAG_FINISH at once on MJIT"
This reverts commit 4d2c8edca6.

Unfortunately this seems to cause several issues:
https://github.com/ruby/ruby/runs/1462188376?check_suite_focus=true
http://ci.rvm.jp/results/trunk-mjit-wait@phosphorus-docker/3272802
2020-11-26 22:41:15 -08:00
Takashi Kokubun 4d2c8edca6
Set VM_FRAME_FLAG_FINISH at once on MJIT
Performance is probably improved?

$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-11-27T04:37:47Z master 69e77e81dc) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-11-27T05:28:19Z master df6b05c6dd) +JIT [x86_64-linux]
last_commit=Set VM_FRAME_FLAG_FINISH at once
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    80.89292998533379     82.19497327502751 fps
                            80.93130641142331     85.13943315260148
                            81.06214830270119     87.43757879797808
                            82.29172808453910     87.89942441487113
                            84.61206450455929     87.91309779491075
                            85.44545883567997     87.98026086648694
                            86.02923132404449     88.03081060383973
                            86.07411817365879     88.14650206137341
                            86.34348799602836     88.32791633649961
                            87.90257338977324     88.57599644892220
                            88.58006509876580     88.67426384743277
                            89.26611118140011     88.81669430874207

This should have no bad impact on VM because this function is ALWAYS_INLINE.
2020-11-26 21:32:14 -08:00
Takashi Kokubun 2700df3c9d
ruby/internal/config.h needs to be included first
to define USE_MJIT.
2020-11-22 21:02:01 -08:00
Takashi Kokubun 0960f56a1d
Eliminate IVC sync between JIT and Ruby threads (#3799)
Thanks to Ractor (https://github.com/ruby/ruby/pull/2888 and https://github.com/ruby/ruby/pull/3662),
inline caches support parallel access now.
2020-11-20 22:18:37 -08:00
Koichi Sasada f6661f5085 sync RClass::ext::iv_index_tbl
iv_index_tbl manages instance variable indexes (ID -> index).
This data structure should be synchronized with other ractors
so introduce some VM locks.

This patch also introduced atomic ivar cache used by
set/getinlinecache instructions. To make updating ivar cache (IVC),
we changed iv_index_tbl data structure to manage (ID -> entry)
and an entry points serial and index. IVC points to this entry so
that cache update becomes atomically.
2020-10-17 08:18:04 +09:00
Takashi Kokubun 65ab2385e3
Use size_t for MJIT's max_ivar_index
iseq_inline_iv_cache_entry's index is also size_t.
%"PRIuSIZE" seems to print warnings against st_index_t in some environments.
2020-09-08 09:22:34 -07:00
Takashi Kokubun a69dd699ee
Merge ivar guards on JIT (#3284)
when an ISeq has multiple ivar accesses.
2020-07-03 17:52:52 -07:00
Takashi Kokubun 40b40523dc
Show what's inlined first in "JIT inline" log
and add a debug log
2020-06-25 23:50:19 -07:00
Takashi Kokubun 7982dc1dfd
Decide JIT-ed insn based on cached cfunc
for opt_* insns.

opt_eq handles rb_obj_equal inside opt_eq, and all other cfunc is
handled by opt_send_without_block. Therefore we can't decide which insn
should be generated by checking whether it's cfunc cc or not.

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-06-26T05:21:43Z master 9dbc2294a6) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-26T06:30:18Z master 75cece1b0b) +JIT [x86_64-linux]
last_commit=Decide JIT-ed insn based on cached cfunc
Calculating -------------------------------------
                     before --jit  after --jit
        mjit_nil?(1)      73.878M      74.021M i/s -     40.000M times in 0.541432s 0.540391s
         mjit_not(1)      72.635M      74.601M i/s -     40.000M times in 0.550702s 0.536187s
     mjit_eq(1, nil)       7.331M       7.445M i/s -      8.000M times in 1.091211s 1.074596s
     mjit_eq(nil, 1)      49.450M      64.711M i/s -      8.000M times in 0.161781s 0.123627s

Comparison:
                     mjit_nil?(1)
         after --jit:  74020528.4 i/s
        before --jit:  73878185.9 i/s - 1.00x  slower

                      mjit_not(1)
         after --jit:  74600882.0 i/s
        before --jit:  72634507.6 i/s - 1.03x  slower

                  mjit_eq(1, nil)
         after --jit:   7444657.4 i/s
        before --jit:   7331304.3 i/s - 1.02x  slower

                  mjit_eq(nil, 1)
         after --jit:  64710790.6 i/s
        before --jit:  49449507.4 i/s - 1.31x  slower
```
2020-06-25 23:33:08 -07:00
Takashi Kokubun 37a2e48d76
Avoid generating opt_send with cfunc cc with JIT
only for opt_nil_p and opt_not.

While vm_method_cfunc_is is used for opt_eq too, many fast paths of it
don't call it. So if it's populated, it should generate opt_send,
regardless of cfunc or not. And again, opt_neq isn't relevant due to the
difference in operands.
So opt_nil_p and opt_not are the only variants using vm_method_cfunc_is
like they use.

```
$ benchmark-driver -v --rbenv 'before2 --jit::ruby --jit;before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before2 --jit: ruby 2.8.0dev (2020-06-22T08:37:37Z master 3238641750) +JIT [x86_64-linux]
before --jit: ruby 2.8.0dev (2020-06-23T01:01:24Z master 9ce2066209) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-23T06:58:37Z master 17e9df3157) +JIT [x86_64-linux]
last_commit=Avoid generating opt_send with cfunc cc with JIT
Calculating -------------------------------------
                     before2 --jit  before --jit  after --jit
        mjit_nil?(1)       54.204M       75.536M      75.031M i/s -     40.000M times in 0.737947s 0.529548s 0.533110s
         mjit_not(1)       53.822M       70.921M      71.920M i/s -     40.000M times in 0.743195s 0.564007s 0.556171s
     mjit_eq(1, nil)        7.367M        6.496M       7.331M i/s -      8.000M times in 1.085882s 1.231470s 1.091327s

Comparison:
                     mjit_nil?(1)
        before --jit:  75536059.3 i/s
         after --jit:  75031409.4 i/s - 1.01x  slower
       before2 --jit:  54204431.6 i/s - 1.39x  slower

                      mjit_not(1)
         after --jit:  71920324.1 i/s
        before --jit:  70921063.1 i/s - 1.01x  slower
       before2 --jit:  53821697.6 i/s - 1.34x  slower

                  mjit_eq(1, nil)
       before2 --jit:   7367280.0 i/s
         after --jit:   7330527.4 i/s - 1.01x  slower
        before --jit:   6496302.8 i/s - 1.13x  slower
```
2020-06-23 00:09:54 -07:00
Takashi Kokubun 78352fb52e
Compile opt_send for opt_* only when cc has ISeq
because opt_nil/opt_not/opt_eq populates cc even when it doesn't
fallback to opt_send_without_block because of vm_method_cfunc_is.

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-06-22T08:11:24Z master d231b8f95b) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-22T08:53:27Z master e1125879ed) +JIT [x86_64-linux]
last_commit=Compile opt_send for opt_* only when cc has ISeq
Calculating -------------------------------------
                     before --jit  after --jit
        mjit_nil?(1)      54.106M      73.693M i/s -     40.000M times in 0.739288s 0.542795s
         mjit_not(1)      53.398M      74.477M i/s -     40.000M times in 0.749090s 0.537075s
     mjit_eq(1, nil)       7.427M       6.497M i/s -      8.000M times in 1.077136s 1.231326s

Comparison:
                     mjit_nil?(1)
         after --jit:  73692594.3 i/s
        before --jit:  54106108.4 i/s - 1.36x  slower

                      mjit_not(1)
         after --jit:  74477487.9 i/s
        before --jit:  53398125.0 i/s - 1.39x  slower

                  mjit_eq(1, nil)
        before --jit:   7427105.9 i/s
         after --jit:   6497063.0 i/s - 1.14x  slower
```

Actually opt_eq becomes slower by this. Maybe it's indeed using
opt_send_without_block, but I'll approach that one in another commit.
2020-06-22 02:08:21 -07:00
Takashi Kokubun 7561db8c00
Introduce Primitive.attr! to annotate 'inline' (#3242)
[Feature #15589]
2020-06-20 17:13:03 -07:00
Takashi Kokubun 61b14bb32b
Eliminate a call instruction on JIT cancel path
by calling combined functions specialized for each cancel type.

I'm hoping to improve locality of hot code, but this patch's impact should
be insignificant.
2020-05-26 23:01:52 -07:00
Takashi Kokubun 3bada9208a
Simplify maybe_special_const_class_p 2020-05-17 23:42:24 -07:00
Takashi Kokubun b16a2aa938
Reduce code size for rb_class_of
by inlining only hot path.

=== mame/optcarrot ===

$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark.yml --repeat-count=24 --output=all
before --jit: ruby 2.8.0dev (2020-05-18T05:21:31Z master 0e5a58b6bf) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-05-18T06:12:04Z master 0e3d71a8d1) +JIT [x86_64-linux]
last_commit=Reduce code size for rb_class_of
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    71.62880463568773     70.95730063273503 fps
                            71.73973684273152     71.98447841929851
                            75.03923801841310     75.54262519509039
                            75.16300287174957     77.64029272984344
                            75.16834828625935     78.67861469580785
                            75.17670723726911     78.81879353707393
                            75.67637908020630     79.18188850392886
                            76.19843953215396     79.66484891814478
                            77.28166716118808     79.80278072861037
                            77.38509903325165     80.05859292679696
                            78.12693418455953     80.34624804808006
                            78.73654441746730     80.66326571254345
                            79.25387513454415     80.69760605740196
                            79.44137881689524     81.32053489212245
                            79.50497657368358     81.50250852553751
                            79.62401328582868     82.27544931834611
                            79.79178811723664     82.67455264522741
                            81.20275352937418     82.93857260493297
                            81.57027048640776     83.15019118788184
                            81.63373188649095     83.20728816044721
                            81.93420437766426     83.25027576772972
                            82.05716136357167     83.27072145898173
                            82.21070805525066     83.36008265822194
                            82.56924063784872     83.36112268888493

=== benchmark-driver/sinatra ===

[rps]
before: 13143.49 rps
after: 13505.70 rps

[inlined rb_class_of size]
before: 11.5K
after: 3.8K

(calculated by `dwarftree --die inlined_subroutine --flat --merge --show-size`)
2020-05-17 23:38:19 -07:00
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平 d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
Takashi Kokubun 0244f91e89
Remove OPT_CHECKED_RUN code
Now this one is actually not in use because we override entire leave
definition for JIT.
2020-05-06 20:24:19 -07:00
Takashi Kokubun 1e3c910bfc
Enable OPT_CHECKED_RUN on MJIT for debugging
Trying to debug errors like
http://ci.rvm.jp/results/trunk-mjit@silicon-docker/2921397
http://ci.rvm.jp/results/trunk-mjit@silicon-docker/2894526
2020-05-06 10:47:43 -07:00
Takashi Kokubun e19f4b3ac0
Fix MJIT compiler warnings in clang 2020-05-01 02:35:18 -07:00
Takashi Kokubun 90969edf9b
Fix a wrong argument of vm_exec on JIT cancel 2020-05-01 02:12:42 -07:00
Takashi Kokubun 5c8bfad078
Make sure unit->id is inherited
to child compile_status
2020-05-01 00:40:21 -07:00
Takashi Kokubun f5ddbba9a2
Include unit id in a function name of an inlined method
I'm trying to make it possible to include all JIT-ed code in a single C
file. This is needed to guarantee uniqueness of all function names
2020-04-30 23:08:13 -07:00
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
Takashi Kokubun b736ea63bd
Optimize exivar access on JIT-ed getivar
JIT support of dd723771c1.

$ benchmark-driver -v --rbenv 'before;before --jit;after --jit' benchmark/mjit_exivar.yml --repeat-count=4
before: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) [x86_64-linux]
before --jit: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-03-31T05:57:24Z mjit-exivar 128625baec) +JIT [x86_64-linux]
Calculating -------------------------------------
                         before  before --jit  after --jit
         mjit_exivar    57.944M       53.579M      54.471M i/s -    200.000M times in 3.451588s 3.732772s 3.671687s

Comparison:
                      mjit_exivar
              before:  57944345.1 i/s
         after --jit:  54470876.7 i/s - 1.06x  slower
        before --jit:  53579483.4 i/s - 1.08x  slower
2020-03-30 23:16:35 -07:00
Takashi Kokubun 0cd7be99e9
Avoid referring to an old value of realloc
OpenBSD RubyCI has failed with SEGV since 4bcd5981e8.
https://rubyci.org/logs/rubyci.s3.amazonaws.com/openbsd-current/ruby-master/log/20200312T223005Z.fail.html.gz

This was because `status->cc_entries` could be stale after `realloc` call
for inlined iseqs.
2020-03-12 22:51:34 -07:00
Takashi Kokubun 4bcd5981e8
Capture inlined iseq's cc entries in root iseq's
jit_unit to avoid marking wrong cc entries when inlined iseq is compiled
multiple times, resolving the TODO added by daf7c48d88.

This obviates pseudo jit_unit in inlined iseq introduced by 7ec2359374
and fixes memory leak of the adhoc unit.
2020-03-10 00:53:35 -07:00
Takashi Kokubun 69f377a3d6
Internalize rb_mjit_unit definition again
Fixed a TODO in b9007b6c54
2020-02-26 00:27:29 -08:00
Koichi Sasada b9007b6c54 Introduce disposable call-cache.
This patch contains several ideas:

(1) Disposable inline method cache (IMC) for race-free inline method cache
    * Making call-cache (CC) as a RVALUE (GC target object) and allocate new
      CC on cache miss.
    * This technique allows race-free access from parallel processing
      elements like RCU.
(2) Introduce per-Class method cache (pCMC)
    * Instead of fixed-size global method cache (GMC), pCMC allows flexible
      cache size.
    * Caching CCs reduces CC allocation and allow sharing CC's fast-path
      between same call-info (CI) call-sites.
(3) Invalidate an inline method cache by invalidating corresponding method
    entries (MEs)
    * Instead of using class serials, we set "invalidated" flag for method
      entry itself to represent cache invalidation.
    * Compare with using class serials, the impact of method modification
      (add/overwrite/delete) is small.
    * Updating class serials invalidate all method caches of the class and
      sub-classes.
    * Proposed approach only invalidate the method cache of only one ME.

See [Feature #16614] for more details.
2020-02-22 09:58:59 +09:00
Koichi Sasada f2286925f0 VALUE size packed callinfo (ci).
Now, rb_call_info contains how to call the method with tuple of
(mid, orig_argc, flags, kwarg). Most of cases, kwarg == NULL and
mid+argc+flags only requires 64bits. So this patch packed
rb_call_info to VALUE (1 word) on such cases. If we can not
represent it in VALUE, then use imemo_callinfo which contains
conventional callinfo (rb_callinfo, renamed from rb_call_info).

iseq->body->ci_kw_size is removed because all of callinfo is VALUE
size (packed ci or a pointer to imemo_callinfo).

To access ci information, we need to use these functions:
vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci).

struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg.

rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc()
is temporary removed because cd->ci should be marked.
2020-02-22 09:58:59 +09:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
Koichi Sasada 1197a036ae Add a proper cast to pass JIT tests on mswin.
https://ci.appveyor.com/project/ruby/ruby/builds/29001248/job/ye80bsrmewdgw294
2019-11-21 04:38:40 +09:00
Koichi Sasada 46acd0075d support builtin features with Ruby and C.
Support loading builtin features written in Ruby, which implement
with C builtin functions.
[Feature #16254]

Several features:

(1) Load .rb file at boottime with native binary.

Now, prelude.rb is loaded at boottime. However, this file is contained
into the interpreter as a text format and we need to compile it.
This patch contains a feature to load from binary format.

(2) __builtin_func() in Ruby call func() written in C.

In Ruby file, we can write `__builtin_func()` like method call.
However this is not a method call, but special syntax to call
a function `func()` written in C. C functions should be defined
in a file (same compile unit) which load this .rb file.

Functions (`func` in above example) should be defined with
  (a) 1st parameter: rb_execution_context_t *ec
  (b) rest parameters (0 to 15).
  (c) VALUE return type.
This is very similar requirements for functions used by
rb_define_method(), however `rb_execution_context_t *ec`
is new requirement.

(3) automatic C code generation from .rb files.

tool/mk_builtin_loader.rb creates a C code to load .rb files
needed by miniruby and ruby command. This script is run by
BASERUBY, so *.rb should be written in BASERUBY compatbile
syntax. This script load a .rb file and find all of __builtin_
prefix method calls, and generate a part of C code to export
functions.

tool/mk_builtin_binary.rb creates a C code which contains
binary compiled Ruby files needed by ruby command.
2019-11-08 09:09:29 +09:00
卜部昌平 d45a013a1a extend rb_call_cache
Prior to this changeset, majority of inline cache mishits resulted
into the same method entry when rb_callable_method_entry() resolves
a method search.  Let's not call the function at the first place on
such situations.

In doing so we extend the struct rb_call_cache from 44 bytes (in
case of 64 bit machine) to 64 bytes, and fill the gap with
secondary class serial(s).  Call cache's class serials now behavies
as a LRU cache.

Calculating -------------------------------------
                           ours         2.7         2.6
vm2_poly_same_method     2.339M      1.744M      1.369M i/s - 6.000M times in 2.565086s 3.441329s 4.381386s

Comparison:
             vm2_poly_same_method
                ours:   2339103.0 i/s
                 2.7:   1743512.3 i/s - 1.34x  slower
                 2.6:   1369429.8 i/s - 1.71x  slower
2019-11-07 17:41:30 +09:00
Alan Wu 89e7997622 Combine call info and cache to speed up method invocation
To perform a regular method call, the VM needs two structs,
`rb_call_info` and `rb_call_cache`. At the moment, we allocate these two
structures in separate buffers. In the worst case, the CPU needs to read
4 cache lines to complete a method call. Putting the two structures
together reduces the maximum number of cache line reads to 2.

Combining the structures also saves 8 bytes per call site as the current
layout uses separate two pointers for the call info and the call cache.
This saves about 2 MiB on Discourse.

This change improves the Optcarrot benchmark at least 3%. For more
details, see attached bugs.ruby-lang.org ticket.

Complications:
 - A new instruction attribute `comptime_sp_inc` is introduced to
 calculate SP increase at compile time without using call caches. At
 compile time, a `TS_CALLDATA` operand points to a call info struct, but
 at runtime, the same operand points to a call data struct. Instruction
 that explicitly define `sp_inc` also need to define `comptime_sp_inc`.
 - MJIT code for copying call cache becomes slightly more complicated.
 - This changes the bytecode format, which might break existing tools.

[Misc #16258]
2019-10-24 18:03:42 +09:00
Takashi Kokubun 10cc6bc4d9
Just disable inlining with local varaible for now
This partially reverts commit 712a66b074.

The previous fix made CI strange like:
http://ci.rvm.jp/results/trunk-vm-asserts@silicon-docker/2124178

Let me just downgrade the behavior for now and deal with it later.

[Bug #15971]
2019-07-03 10:39:22 +09:00
k0kubun 72cbc3142d Revert "Try dropping const qualifier to suppress msiwn warning"
This reverts commit b023c1cc07.

in favor of r67666.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67667 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-21 03:35:07 +00:00
usa e6a7b8a487 Get rid of warinings of VC
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67666 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-21 03:33:05 +00:00
k0kubun b023c1cc07 Try dropping const qualifier to suppress msiwn warning
https://ci.appveyor.com/project/ruby/ruby/builds/23995093/job/qo728n1uorepkx16

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67665 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-21 03:32:03 +00:00
k0kubun fcd679ed11 Recompile without method inlining
if cancel happens in an inlined method.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-16 17:02:35 +00:00
k0kubun d71b78575b Introduce frame-omitted method inlining
for ISeq including only leaf and no-handles_sp insns except leaf.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67574 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-16 17:02:16 +00:00
k0kubun b0614decfc Implement single-level basic method inlining in JIT
"Basic" means it does not omit a call frame.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67572 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-16 17:01:05 +00:00
k0kubun e68495e5f4 Carve out mjit_compile_body
This refactoring is needed for implementing inlining later.
I've had this since long ago and it has conflicted sometimes. So let me
just commit this now.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67566 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-15 14:38:57 +00:00
k0kubun 1222534719 Share optimization cancel handlers
$ benchmark-driver benchmark.yml --rbenv='before --jit;after --jit' -v --output=all --repeat-count=12
before --jit: ruby 2.7.0dev (2019-04-14 trunk 67549) +JIT [x86_64-linux]
after --jit: ruby 2.7.0dev (2019-04-14 trunk 67549) +JIT [x86_64-linux]
last_commit=Share optimization cancel handlers
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    69.55360655447375     74.15329176797863 fps
                            73.74545038318978     79.60903046141544
                            75.85637357897092     82.00930075612054
                            77.10594124022951     82.56228187301674
                            78.67350527368366     83.37512204205953
                            79.97235230767613     83.41521927993719
                            81.03050342478066     84.20227901852776
                            81.61308297895094     84.73733526226468
                            82.06805141753206     85.27884867863791
                            82.46493179193394     85.36558922650367
                            83.85259832896313     85.39993587223481
                            84.02325292922997     85.63649355214602

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67550 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-14 12:25:23 +00:00
k0kubun fa13bb1a6f Unify comment styles across MJIT sources
I'm writing `//` comments in newer MJIT code after C99 enablement
(because I write 1-line comments more often than multi-line comments
 and `//` requires fewer chars on 1-line) and then they are mixed
with `/* */` now.

For consistency and to avoid the conversion in future changes, let me
finish the rewrite in MJIT-related code.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67533 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-14 05:26:46 +00:00
k0kubun 9b6b4674d7 Recompile JIT-ed code without optimization
based on inline cache when JIT cancel happens by that.

This feature was in the original MJIT implementation by Vladimir, but on
merging MJIT to Ruby it was removed for simplification. This commit adds
the functionality again for the following benchmark:

52f05781f6/concurrent-map/bench.rb
(shown float is duration seconds. shorter is better)

* Before
```
$ INHERIT=0 ruby -v bench.rb
ruby 2.7.0dev (2019-04-13 trunk 67523) [x86_64-linux]
--
1.6507579649914987

$ INHERIT=0 ruby -v --jit bench.rb
ruby 2.7.0dev (2019-04-13 trunk 67523) +JIT [x86_64-linux]
--
1.5091587850474752

$ INHERIT=1 ruby -v bench.rb
ruby 2.7.0dev (2019-04-13 trunk 67523) [x86_64-linux]
--
1.6124781150138006

$ INHERIT=1 ruby --jit -v bench.rb
ruby 2.7.0dev (2019-04-13 trunk 67523) +JIT [x86_64-linux]
--
1.7495657080435194 # <-- this
```

* After
```
$ INHERIT=0 ruby -v bench.rb
ruby 2.7.0dev (2019-04-13 trunk 67523) [x86_64-linux]
last_commit=Recompile JIT-ed code without optimization
--
1.653559010999743

$ INHERIT=0 ruby --jit -v bench.rb
ruby 2.7.0dev (2019-04-13 trunk 67523) +JIT [x86_64-linux]
last_commit=Recompile JIT-ed code without optimization
--
1.4738391840364784

$ INHERIT=1 ruby -v bench.rb
ruby 2.7.0dev (2019-04-13 trunk 67523) [x86_64-linux]
last_commit=Recompile JIT-ed code without optimization
--
1.645227018976584

$ INHERIT=1 ruby --jit -v bench.rb
ruby 2.7.0dev (2019-04-13 trunk 67523) +JIT [x86_64-linux]
last_commit=Recompile JIT-ed code without optimization
--
1.523708809982054 # <-- this
```

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67530 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-14 04:52:02 +00:00
k0kubun b03c11a337 Add debug counters for MJIT cancel
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67379 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-29 13:54:29 +00:00
k0kubun c92c0a5935 Prefer using vm_base_ptr rather than cfp->bp
in MJIT implementation.

This allows us to drop cfp->bp by just modifying vm_base_ptr in the
future.

No performance impact:

$ benchmark-driver benchmark.yml --rbenv='before::before --disable-gems --jit;bp_::after --disable-gems --jit;vm_env_ptr::ruby-svn --disable-gems --jit' -v --output=all --repeat-count=12
before: ruby 2.7.0dev (2019-03-24 trunk 67341) +JIT [x86_64-linux]
bp_: ruby 2.7.0dev (2019-03-24 trunk 67342) +JIT [x86_64-linux]
vm_env_ptr: ruby 2.7.0dev (2019-03-25 trunk 67343) +JIT [x86_64-linux]
last_commit=Prefer using vm_base_ptr rather than cfp->bp
Calculating -------------------------------------
                                       before                   bp_            vm_env_ptr
Optcarrot Lan_Master.nes    77.15059205092646     70.18873044267853     69.62171387083328 fps
                            78.75767783870441     77.49867689173411     75.43496867709587
                            79.60102690369321     77.78037687683523     79.36688927929428
                            80.25144236638835     78.74729849101701     80.42363742291455
                            82.22375417165489     80.44265482494045     80.90287243299306
                            82.29166786292619     80.51740049420938     81.81153053252902
                            83.35386925305345     80.91054205210609     81.93562989125176
                            83.39770634366975     81.34550754145043     82.24544621470430
                            83.88523450309972     81.60698516017347     82.76801860263230
                            84.17553130135879     82.69615943446324     83.02530407910871
                            84.42132328119858     83.00969158037691     83.19968539409922
                            84.60731429793329     83.32703363300098     83.81352746019631

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67344 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-25 14:26:11 +00:00