Commit Graph

59 Commits

Author SHA1 Message Date
Takashi Kokubun 0bd025ad69
Add a debug_counter for JIT cancel on leave 2020-05-28 22:45:35 -07:00
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平 d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
Takashi Kokubun 310ef9f40b
Make vm_call_cfunc_with_frame a fastpath (#3027)
when there's no need to call CALLER_SETUP_ARG and CALLER_REMOVE_EMPTY_KW_SPLAT
(i.e. !rb_splat_or_kwargs_p(ci) && !calling->kw_splat).

Micro benchmark:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark/vm_send_cfunc.yml --repeat-count=4
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
                         before       after
       vm_send_cfunc    69.585M     88.724M i/s -    100.000M times in 1.437097s 1.127096s

Comparison:
                    vm_send_cfunc
               after:  88723605.2 i/s
              before:  69584737.1 i/s - 1.28x  slower
```

Optcarrot:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark.yml --repeat-count=12 --output=all
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
                                       before                 after
Optcarrot Lan_Master.nes    50.76119601545175     42.73858236484051 fps
                            50.76388649761503     51.04211379912850
                            50.80930672252514     51.39455790755538
                            50.90236000778749     51.75656936556145
                            51.01744746340430     51.86875277356489
                            51.06495279015112     51.88692482485558
                            51.07785337168974     51.93429603190578
                            51.20163525187862     51.95768145071314
                            51.34671771913112     52.45577266040274
                            51.35918340835583     52.53163888762858
                            51.46641337418146     52.62172484121034
                            51.50835463462257     52.85064021113239
```
2020-04-13 20:32:59 -07:00
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
Takashi Kokubun b736ea63bd
Optimize exivar access on JIT-ed getivar
JIT support of dd723771c1.

$ benchmark-driver -v --rbenv 'before;before --jit;after --jit' benchmark/mjit_exivar.yml --repeat-count=4
before: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) [x86_64-linux]
before --jit: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-03-31T05:57:24Z mjit-exivar 128625baec) +JIT [x86_64-linux]
Calculating -------------------------------------
                         before  before --jit  after --jit
         mjit_exivar    57.944M       53.579M      54.471M i/s -    200.000M times in 3.451588s 3.732772s 3.671687s

Comparison:
                      mjit_exivar
              before:  57944345.1 i/s
         after --jit:  54470876.7 i/s - 1.06x  slower
        before --jit:  53579483.4 i/s - 1.08x  slower
2020-03-30 23:16:35 -07:00
Kazuhiro NISHIYAMA 304538e6ff
Fix typos [ci skip] 2020-03-16 23:30:33 +09:00
Takashi Kokubun f6a54e6e46
Add debug counter for unload_units
changing add_iseq_to_process's debug counter name as well for comparison
2020-03-15 00:24:18 -07:00
Koichi Sasada b9007b6c54 Introduce disposable call-cache.
This patch contains several ideas:

(1) Disposable inline method cache (IMC) for race-free inline method cache
    * Make the call-cache (CC) an RVALUE (GC target object) and allocate a
      new CC on cache miss.
    * This technique allows race-free access from parallel processing
      elements, like RCU.
(2) Introduce per-Class method cache (pCMC)
    * Instead of a fixed-size global method cache (GMC), pCMC allows a
      flexible cache size.
    * Caching CCs reduces CC allocation and allows sharing a CC's fast-path
      between call-sites with the same call-info (CI).
(3) Invalidate an inline method cache by invalidating the corresponding
    method entries (MEs)
    * Instead of using class serials, we set an "invalidated" flag on the
      method entry itself to represent cache invalidation.
    * Compared with using class serials, the impact of method modification
      (add/overwrite/delete) is small.
    * Updating class serials invalidates all method caches of the class and
      its sub-classes.
    * The proposed approach invalidates the method cache of only one ME.

See [Feature #16614] for more details.
2020-02-22 09:58:59 +09:00
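Idea (3) of b9007b6c54 can be illustrated with a small sketch. This is not the code from the patch: the types `method_entry`, `call_cache` and `klass`, the two-slot method table, and the sample method bodies are all hypothetical, and the sketch refills a cache in place where the real design allocates a fresh CC object. It only shows why flagging a single method entry leaves caches for other methods untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical method entry: a body plus the "invalidated" flag. */
typedef struct method_entry {
    int (*body)(int);
    bool invalidated;
} method_entry;

/* Hypothetical call cache: remembers the method entry found last time. */
typedef struct { method_entry *me; } call_cache;

/* Hypothetical class with two method slots, indexed by a small method id. */
typedef struct { method_entry *slot[2]; } klass;

/* Sample method bodies used only for illustration. */
static int add_one(int x) { return x + 1; }
static int add_two(int x) { return x + 2; }
static int twice(int x)   { return x * 2; }

/* Call through the cache; fall back to the slow path when the cache is
 * empty or its method entry has been flagged as invalidated. */
static int call_cached(call_cache *cc, const klass *k, int mid, int arg,
                       unsigned *misses)
{
    if (cc->me == NULL || cc->me->invalidated) {
        ++*misses;
        cc->me = k->slot[mid];  /* the real design would allocate a new CC */
    }
    return cc->me->body(arg);
}

/* Redefining a method flags only the old entry; other entries of the
 * class keep their caches valid. */
static void redefine(klass *k, int mid, method_entry *new_me)
{
    k->slot[mid]->invalidated = true;
    k->slot[mid] = new_me;
}
```

With a class serial, the `redefine` step would instead bump one number shared by every method of the class, so every cache would miss on its next use.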
Koichi Sasada f2286925f0 VALUE size packed callinfo (ci).
Now, rb_call_info describes how to call the method with a tuple of
(mid, orig_argc, flags, kwarg). In most cases, kwarg == NULL and
mid+argc+flags require only 64 bits. So this patch packs
rb_call_info into a VALUE (1 word) in such cases. If we cannot
represent it in a VALUE, we use imemo_callinfo, which contains the
conventional callinfo (rb_callinfo, renamed from rb_call_info).

iseq->body->ci_kw_size is removed because every callinfo is now VALUE
size (a packed ci or a pointer to imemo_callinfo).

To access ci information, we need to use these functions:
vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci).

struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg.

rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc()
are temporarily removed because cd->ci should be marked.
2020-02-22 09:58:59 +09:00
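The packing described in f2286925f0 can be sketched as plain bit manipulation. The field widths, the tag bit, and every name below (`packed_ci`, `sketch_ci_*`) are assumptions made for illustration; the actual layout lives in the Ruby sources and may differ. The sketch only shows the idea: when kwarg is NULL and each field fits, the whole tuple becomes one 64-bit word, and accessors recover the fields by shifting and masking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout (not taken from the Ruby sources):
 *   bit  0      : tag, 1 = packed word (a heap pointer would have 0 here)
 *   bits 1..15  : argc  (15 bits)
 *   bits 16..31 : flags (16 bits)
 *   bits 32..63 : mid   (32 bits)
 */
#define CI_TAG        UINT64_C(1)
#define CI_ARGC_BITS  15
#define CI_FLAG_BITS  16
#define CI_MID_BITS   32
#define CI_ARGC_SHIFT 1
#define CI_FLAG_SHIFT (CI_ARGC_SHIFT + CI_ARGC_BITS)
#define CI_MID_SHIFT  (CI_FLAG_SHIFT + CI_FLAG_BITS)
#define CI_MASK(bits) ((UINT64_C(1) << (bits)) - 1)

typedef uint64_t packed_ci;

/* True when the tuple fits in one word: no kwarg, every field in range. */
static bool sketch_ci_packable(uint64_t mid, uint64_t argc, uint64_t flags,
                               const void *kwarg)
{
    return kwarg == NULL &&
           mid   <= CI_MASK(CI_MID_BITS) &&
           argc  <= CI_MASK(CI_ARGC_BITS) &&
           flags <= CI_MASK(CI_FLAG_BITS);
}

/* Pack the three fields and set the tag bit. */
static packed_ci sketch_ci_pack(uint32_t mid, uint32_t argc, uint32_t flags)
{
    assert(sketch_ci_packable(mid, argc, flags, NULL));
    return ((uint64_t)mid   << CI_MID_SHIFT)  |
           ((uint64_t)flags << CI_FLAG_SHIFT) |
           ((uint64_t)argc  << CI_ARGC_SHIFT) |
           CI_TAG;
}

static bool sketch_ci_is_packed(packed_ci ci) { return (ci & CI_TAG) != 0; }

static uint32_t sketch_ci_mid(packed_ci ci)
{
    return (uint32_t)((ci >> CI_MID_SHIFT) & CI_MASK(CI_MID_BITS));
}

static uint32_t sketch_ci_argc(packed_ci ci)
{
    return (uint32_t)((ci >> CI_ARGC_SHIFT) & CI_MASK(CI_ARGC_BITS));
}

static uint32_t sketch_ci_flag(packed_ci ci)
{
    return (uint32_t)((ci >> CI_FLAG_SHIFT) & CI_MASK(CI_FLAG_BITS));
}
```

A tag bit in the lowest position works because aligned heap pointers have that bit clear, so one word can hold either a packed tuple or a pointer to the full structure.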
卜部昌平 ba78bf9778 debug_counter.h must be self-contained
Include what is necessary.
2019-12-26 20:45:12 +09:00
Koichi Sasada 5220145ea2 add debug_counter access functions.
These functions are enabled only on USE_DEBUG_COUNTER=1.
2019-12-25 01:34:41 +09:00
Koichi Sasada 9eeaae432b add more debug counters to count numeric objects. 2019-12-23 16:31:17 +09:00
Nobuyoshi Nakada cc87037f1c
Fixed misspellings
Fixed misspellings reported at [Bug #16437], missed and a new
typo.
2019-12-22 22:49:17 +09:00
卜部昌平 dcb603bbdb describe mc_miss_reuse_call [ci skip] 2019-12-18 14:14:51 +09:00
Koichi Sasada fbe229906b add debug counter to count `call` reusing cases. 2019-12-17 13:15:38 +09:00
卜部昌平 3ffd98c5cd add debug counters for vm_search_method_slowpath()
Implemented fine-grained inspection of cache misses.  Handy for
counting the reasons why an inline method cache was evicted.
2019-10-03 15:24:09 +09:00
Koichi Sasada 3deeb3fd91 introduce `obj_ary_extracapa`.
Introduce a new debug counter `obj_ary_extracapa` which counts
arrays for which `len < capa`.
2019-09-25 17:01:54 +09:00
Takashi Kokubun 4ffcadd39c
Fix rb_define_singleton_method warning
for debug counters

```
../include/ruby/intern.h:1175:137: warning: passing argument 3 of 'rb_define_singleton_method0' from incompatible pointer type [-Wincompatible-pointer-types]
 #define rb_define_singleton_method(klass, mid, func, arity) rb_define_singleton_method_choose_prototypem3((arity),(func))((klass),(mid),(func),(arity));
                                                                                                                                         ^
../vm.c:2958:5: note: in expansion of macro 'rb_define_singleton_method'
     rb_define_singleton_method(rb_cRubyVM, "show_debug_counters", rb_debug_counter_show, 0);
     ^~~~~~~~~~~~~~~~~~~~~~~~~~
../include/ruby/intern.h:1139:99: note: expected 'VALUE (*)(VALUE) {aka long unsigned int (*)(long unsigned int)}' but argument is of type 'VALUE (*)(void) {aka long unsigned int (*)(void)}'
 __attribute__((__unused__,__weakref__("rb_define_singleton_method"),__nonnull__(2,3)))static void rb_define_singleton_method0 (VALUE,const char*,VALUE(*)(VALUE),int);
```
2019-09-20 17:44:48 +09:00
Aaron Patterson 70fd099220
Add a way to print debug counters without exiting
I am trying to study debug counters inside a Rails application.
Accessing debug counters by killing the process is hard because child
processes don't get the same TRAP as the parent, and Rails seems to
intercept calls to `exit`.  Adding this method lets me print the debug
counters when I want (at the end of requests for example)
2019-08-07 10:36:17 -07:00
Koichi Sasada e03b3b4ae0 add debug_counters to check details.
add debug_counters to check the Hash object statistics.
2019-08-02 15:59:47 +09:00
Koichi Sasada 72825c35b0 Use 1 byte hint for ar_table [Feature #15602]
On ar_table, do not keep a full-length hash value (FLHV, 8 bytes)
but keep a 1-byte hint from the FLHV (the lowest byte of the FLHV).
An ar_table contains at most 8 entries, so the hints consume
8 bytes at most. We can store the hints in RHash::ar_hint.

On 32-bit CPUs, we use a 4-entry ar_table.

The advantages:
* We don't need to keep the FLHV, so an ar_table consumes only
  16 bytes (VALUEs of key and value) * 8 entries = 128 bytes.
* In many cases we don't need to scan the ar_table and only need to
  check the hints. In particular, we don't need to access the ar_table
  at all if there is no matching entry (in many cases).
  This improves memory cache locality.

The disadvantages:
* This technique can increase `#eql?` time because hints can
  conflict (in theory, once in 256 times).
  It can introduce an incompatibility if there is an object x where
  x.eql? returns true even if the hash values are different.
  I believe we don't need to care about such an irregular case.
* We need to re-calculate the FLHV if we need to switch from ar_table
  to st_table (e.g. when exceeding 8 entries).
  This can also introduce an incompatibility when key objects are mutated.
  I believe we don't need to care about such an irregular case either.

Add new debug counters to measure the performance:
* artable_hint_hit - hint is matched and eql?#=>true
* artable_hint_miss - hint is not matched but eql?#=>false
* artable_hint_notfound - lookup counts
2019-07-31 09:52:03 +09:00
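The hint scheme from 72825c35b0 can be sketched as follows. This is a simplified model, not the hash.c implementation: keys are plain integers, `full_hash` is a stand-in multiplicative hash, and the struct and function names are hypothetical. It shows the core of the technique: compare the 1-byte hints first, and run the full key comparison (the `#eql?` equivalent) only for entries whose hint matches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AR_MAX 8  /* entry limit from the commit message (64-bit case) */

typedef struct {
    uint64_t key;
    uint64_t value;
} ar_entry;

typedef struct {
    uint8_t  hints[AR_MAX];    /* lowest byte of each key's full hash */
    ar_entry entries[AR_MAX];
    unsigned len;
    unsigned long eql_calls;   /* number of full key comparisons made */
} ar_table_sketch;

/* Stand-in hash function; any function producing a 64-bit hash would do. */
static uint64_t full_hash(uint64_t key)
{
    return key * UINT64_C(0x9E3779B97F4A7C15);
}

static uint8_t hint_of(uint64_t hash) { return (uint8_t)(hash & 0xff); }

/* Append an entry; returns false when full (the caller would then switch
 * to a real hash table). Duplicate keys are not handled in this sketch. */
static bool ar_insert(ar_table_sketch *t, uint64_t key, uint64_t value)
{
    if (t->len >= AR_MAX) return false;
    t->hints[t->len] = hint_of(full_hash(key));
    t->entries[t->len].key = key;
    t->entries[t->len].value = value;
    t->len++;
    return true;
}

/* Scan the 8 hint bytes; touch the entries only when a hint matches. */
static bool ar_lookup(ar_table_sketch *t, uint64_t key, uint64_t *value_out)
{
    uint8_t hint = hint_of(full_hash(key));
    for (unsigned i = 0; i < t->len; i++) {
        if (t->hints[i] != hint) continue;  /* cheap reject, no key access */
        t->eql_calls++;
        if (t->entries[i].key == key) {
            *value_out = t->entries[i].value;
            return true;
        }
    }
    return false;
}
```

Because the hints sit in one 8-byte array, a lookup for an absent key usually finishes without reading any entry, which is the cache-locality benefit the message describes.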
Koichi Sasada fba3e76e3f fix debug counter for Hash counts.
Change debug_counters for Hash object counts:

* obj_hash_under4 (1-3) -> obj_hash_1_4 (1-4)
* obj_hash_ge4 (4-7)    -> obj_hash_5_8 (5-8)
* obj_hash_ge8 (>=8)    -> obj_hash_g8  (> 8)

For example on rdoc benchmark:

[RUBY_DEBUG_COUNTER]    obj_hash_empty                         554,900
[RUBY_DEBUG_COUNTER]    obj_hash_under4                        572,998
[RUBY_DEBUG_COUNTER]    obj_hash_ge4                             1,825
[RUBY_DEBUG_COUNTER]    obj_hash_ge8                             2,344
[RUBY_DEBUG_COUNTER]    obj_hash_empty                         553,097
[RUBY_DEBUG_COUNTER]    obj_hash_1_4                           571,880
[RUBY_DEBUG_COUNTER]    obj_hash_5_8                               982
[RUBY_DEBUG_COUNTER]    obj_hash_g8                              2,189
2019-07-19 16:24:14 +09:00
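The renamed buckets in fba3e76e3f amount to a change of boundaries, which a small classifier makes explicit. The enum and function names below are hypothetical; only the ranges (empty, 1-4, 5-8, more than 8) come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Buckets matching the new counter names in the commit message. */
typedef enum {
    HASH_BUCKET_EMPTY,  /* obj_hash_empty: 0 entries */
    HASH_BUCKET_1_4,    /* obj_hash_1_4:   1-4 entries */
    HASH_BUCKET_5_8,    /* obj_hash_5_8:   5-8 entries */
    HASH_BUCKET_G8      /* obj_hash_g8:    more than 8 entries */
} hash_bucket;

/* Map a Hash size to the bucket whose counter would be incremented. */
static hash_bucket hash_bucket_for(size_t n)
{
    if (n == 0) return HASH_BUCKET_EMPTY;
    if (n <= 4) return HASH_BUCKET_1_4;
    if (n <= 8) return HASH_BUCKET_5_8;
    return HASH_BUCKET_G8;
}
```

Under the old names, a Hash with exactly 4 or exactly 8 entries fell into the next bucket up (`obj_hash_ge4`, `obj_hash_ge8`), which is why the counts in the rdoc example shift between the two listings.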
Koichi Sasada 182ae1407b fix shared array terminology.
Shared arrays created by Array#dup and so on point to
a shared_root object that manages the lifetime of the Array buffer.
However, shared_root is sometimes called just "shared", which
is confusing. So I changed the wording from "shared" to "shared_root".

* RArray::heap::aux::shared -> RArray::heap::aux::shared_root
* ARY_SHARED() -> ARY_SHARED_ROOT()
* ARY_SHARED_NUM() -> ARY_SHARED_ROOT_REFCNT()

Also, add some debug_counters to count shared array objects.

* ary_shared_create: shared ary by Array#dup and so on.
* ary_shared: finished in shared.
* ary_shared_root_occupied: shared_root but has only 1 refcnt.
  The number (ary_shared - ary_shared_root_occupied) is meaningful.
2019-07-19 13:07:59 +09:00
Lourens Naudé a47f598d77
Reduce ONIG_NREGION from 10 to 4: power of 2 and testing revealed most pattern matches are less than or equal to 4 results
Closes: https://github.com/ruby/ruby/pull/2135
2019-05-07 21:58:55 +09:00
Koichi Sasada 4dc5d3c5dd add new debug_counters about is_pointer_to_heap().
is_pointer_to_heap() is used for conservative marking. To analyze
this function's behavior, introduce some debug_counters.
2019-05-07 14:10:43 +09:00
k0kubun b2ffafd238 Invalidate JIT-ed code if ISeq is moved by GC.compact
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67638 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-20 05:48:22 +00:00
k0kubun b041cf55bb Fix missing debug counter name
r67550 introduced the typo

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67553 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-14 13:10:14 +00:00
k0kubun 18b5148215 Add debug counter for MJIT stale_units
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67546 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-14 07:12:44 +00:00
k0kubun 5ce28c0642 Add RubyVM.reset_debug_counters when RB_DEBUG_COUNTER
is defined. It's 0 by default and so it disappears in actual builds.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67544 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-14 06:57:21 +00:00
k0kubun 5a6b0e39a1 Add debug counter for VM <-> MJIT calls
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67460 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-06 14:42:02 +00:00
k0kubun 9d047fd3e5 Add mjit_compile_failures debug counter
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67382 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-29 14:44:09 +00:00
k0kubun b03c11a337 Add debug counters for MJIT cancel
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67379 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-29 13:54:29 +00:00
k0kubun 017cb09eb6 Add debug counter for rb_mjit_unit_list
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67378 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-29 13:24:56 +00:00
kazu 26d56f7b6f Fix a typo [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67377 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-29 13:06:48 +00:00
k0kubun bb5ab13bf5 Add debug counters for MJIT
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67375 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-29 12:52:59 +00:00
k0kubun fe904f1eec Elaborate more on some debug counters [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67374 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-29 12:31:08 +00:00
ko1 c671f836b4 add debug counters to count call cache fastpath.
Add counters to count ccf (call cache fastpath) usage.
These counters will help identify which kinds of method dispatch
are important to optimize.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67336 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-22 07:57:26 +00:00
ko1 e4c79d0d10 rename li_table->ar_table (and related names).
* internal.h: rename the following names:
  * li_table -> ar_table. "li" means linear (from linear search),
    but we use the word "array" (from data layout).
  * RHASH_ARRAY -> RHASH_AR_TABLE. AR_TABLE is more clear.
  * rb_hash_array_* -> rb_hash_ar_table_*.
  * RHASH_TABLE_P() -> RHASH_ST_TABLE_P(). more clear.
  * RHASH_CLEAR() -> RHASH_ST_CLEAR().

* hash.c: rename "linear_" prefix functions to "ar_" prefix.

* hash.c (linear_init_table): rename to ar_alloc_table.

* debug_counter.h: rename obj_hash_array to obj_hash_ar.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66390 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-14 01:10:15 +00:00
ko1 8f675cdd00 support theap for T_HASH. [Feature #14989]
* hash.c, internal.h: support theap for small Hash.
  Introduce RHASH_ARRAY (li_table) besides st_table and small Hash
  (<=8 entries) are managed by an array data structure.
  This array data can be managed by theap.
  If st_table is needed, then converting array data to st_table data.

  For st_table using code, we prepare "stlike" APIs which accepts hash value
  and are very similar to st_ APIs.

  This work is based on the GSoC achievement
  by tacinight <tacingiht@gmail.com> and refined by ko1.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65454 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 22:11:51 +00:00
ko1 198ff42258 support theap for T_STRUCT.
* struct.c: members memory can use theap.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65452 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 22:03:42 +00:00
ko1 873d57347f support theap for T_OBJECT.
* variable.c: instance variable space now supports theap.
  obj_ivar_heap_alloc() tries to acquire memory from theap.

* debug_counter.h: add some counters for theap.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65451 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 22:01:17 +00:00
ko1 312b105d0e introduce TransientHeap. [Bug #14858]
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
  theap is designed for Ruby's object system. theap is like the Eden heap
  in generational GC terminology. theap allocation is very fast because
  it only needs to bump a pointer, and deallocation is also fast because
  we don't do anything. However, we need to evacuate (copying GC terminology)
  if theap memory is long-lived. Evacuation logic is needed for each type.

  See [Bug #14858] for details.

* array.c: Now, theap for T_ARRAY is supported.

  ary_heap_alloc() tries to allocate a memory area from theap. If this attempt
  succeeds, the array has a theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
  We don't need to free the theap ptr.

* ruby.h: RARRAY_CONST_PTR() returns a malloc'ed memory area. This means that
  if ary is allocated in theap, evacuation to malloc'ed memory is forced.
  It makes programs slow, but very compatible with current code because
  theap memory can be evacuated (theap memory will be recycled).

  If you want to get the transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
  instead of RARRAY_CONST_PTR(). If you can't tell when evacuation
  will occur, use RARRAY_CONST_PTR().

(re-commit of r65444)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65449 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 21:53:56 +00:00
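The allocation and evacuation behaviour described for theap in 312b105d0e can be sketched with a fixed-size bump allocator. Everything here is a simplified assumption: the names (`theap_sketch`, `theap_alloc`, `theap_evacuate`, `theap_reset`), the single 1 KiB block, and the 8-byte alignment are hypothetical and do not mirror transient_heap.c. The sketch shows the three properties the message names: allocation is a pointer bump, deallocation does nothing per object, and long-lived data must be copied out to malloc'ed memory before the block is recycled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define THEAP_SIZE 1024  /* hypothetical block size */

typedef struct {
    unsigned char buf[THEAP_SIZE];
    size_t top;  /* offset of the next free byte */
} theap_sketch;

/* Allocate by bumping `top`; returns NULL when the block is exhausted,
 * in which case a caller would fall back to malloc. */
static void *theap_alloc(theap_sketch *h, size_t size)
{
    size_t aligned = (size + 7) & ~(size_t)7;  /* round up to 8 bytes */
    if (aligned < size || aligned > THEAP_SIZE - h->top) return NULL;
    void *p = h->buf + h->top;
    h->top += aligned;
    return p;
}

/* True when `p` points into the transient block. */
static int theap_contains(const theap_sketch *h, const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)h->buf;
    return a >= lo && a < lo + THEAP_SIZE;
}

/* Evacuate: copy a transient area to malloc'ed memory so it survives
 * recycling. Pointers that are already outside the block are returned
 * unchanged. The caller owns (and must free) an evacuated copy. */
static void *theap_evacuate(const theap_sketch *h, void *p, size_t size)
{
    if (!theap_contains(h, p)) return p;
    void *copy = malloc(size);
    if (copy != NULL) memcpy(copy, p, size);
    return copy;
}

/* Recycle the whole block at once; no per-object free is needed. */
static void theap_reset(theap_sketch *h) { h->top = 0; }
```

In this model, forcing evacuation on every pointer access (as the message describes for RARRAY_CONST_PTR()) trades speed for safety: the returned pointer stays valid after `theap_reset`, whereas a raw transient pointer would not.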
ko1 7d359f9b69 revert r65444 and r65446 because of commit miss
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65447 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 21:01:55 +00:00
ko1 efe869c0e5 support theap for T_OBJECT.
* variable.c: instance variable space now supports theap.
  obj_ivar_heap_alloc() tries to acquire memory from theap.

* debug_counter.h: add some counters for theap.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65446 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 20:49:35 +00:00
ko1 90ac549fa6 introduce TransientHeap. [Bug #14858]
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
  theap is designed for Ruby's object system. theap is like the Eden heap
  in generational GC terminology. theap allocation is very fast because
  it only needs to bump a pointer, and deallocation is also fast because
  we don't do anything. However, we need to evacuate (copying GC terminology)
  if theap memory is long-lived. Evacuation logic is needed for each type.

  See [Bug #14858] for details.

* array.c: Now, theap for T_ARRAY is supported.

  ary_heap_alloc() tries to allocate a memory area from theap. If this attempt
  succeeds, the array has a theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
  We don't need to free the theap ptr.

* ruby.h: RARRAY_CONST_PTR() returns a malloc'ed memory area. This means that
  if ary is allocated in theap, evacuation to malloc'ed memory is forced.
  It makes programs slow, but very compatible with current code because
  theap memory can be evacuated (theap memory will be recycled).

  If you want to get the transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
  instead of RARRAY_CONST_PTR(). If you can't tell when evacuation
  will occur, use RARRAY_CONST_PTR().


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65444 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 20:46:24 +00:00
ko1 85173be41f add new counter about GC.
* debug_counter.h: add `gc_major_oldmalloc`.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65362 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-25 02:23:58 +00:00
ko1 f8dbff557a add new debug_counters for GC.
* debug_counter.h: add new debug counters to count GC.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65359 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-24 22:17:03 +00:00
ko1 aa1023edc4 add more debug counters.
* debug_counter.h: add debug counters to count frame state transitions:
  * frame_R2R: Ruby frame to Ruby frame
  * frame_R2C: Ruby frame to C frame
  * frame_C2C: C frame to C frame
  * frame_C2R: C frame to Ruby frame

* vm_insnhelper.c (vm_push_frame): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64871 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-09-28 03:35:15 +00:00
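The four frame-transition counters listed in aa1023edc4 can be modelled as a 2x2 classification of the previous and the newly pushed frame. The enum names, the counter array, and `count_frame_push` are hypothetical stand-ins; the actual counting is done inside `vm_push_frame` with the debug counter macros, which this sketch does not reproduce.

```c
#include <assert.h>

/* Hypothetical frame kinds: a Ruby-level frame or a C-level frame. */
typedef enum { FRAME_KIND_RUBY = 0, FRAME_KIND_C = 1 } frame_kind;

/* One slot per transition named in the commit message. */
typedef enum {
    CNT_FRAME_R2R,  /* Ruby frame to Ruby frame */
    CNT_FRAME_R2C,  /* Ruby frame to C frame */
    CNT_FRAME_C2C,  /* C frame to C frame */
    CNT_FRAME_C2R,  /* C frame to Ruby frame */
    CNT_FRAME_MAX
} transition_counter;

static unsigned long transition_counts[CNT_FRAME_MAX];

/* Pick the counter for a push from `prev` to `next`. */
static transition_counter classify_transition(frame_kind prev, frame_kind next)
{
    if (prev == FRAME_KIND_RUBY) {
        return next == FRAME_KIND_RUBY ? CNT_FRAME_R2R : CNT_FRAME_R2C;
    }
    return next == FRAME_KIND_RUBY ? CNT_FRAME_C2R : CNT_FRAME_C2C;
}

/* Record one frame push. */
static void count_frame_push(frame_kind prev, frame_kind next)
{
    transition_counts[classify_transition(prev, next)]++;
}
```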
ko1 df5ec4107d add more debug counters.
* debug_counter.h: add the following counters.
  * frame_push: control frame counts (total counts).
  * frame_push_*: control frame counts per every frame type.
  * obj_*: add free'ed counts for each type.

* gc.c: ditto.

* vm_insnhelper.c (vm_push_frame): ditto.

* debug_counter.c (rb_debug_counter_show_results): widen counts field
  to show >10G numbers.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64867 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-09-28 01:10:43 +00:00