which has been developed by Takashi Kokubun <takashikkbn@gmail> as
YARV-MJIT. Many of its bugs are fixed by wanabe <s.wanabe@gmail.com>.
This JIT compiler is designed to be a safe migration path to introduce
JIT compiler to MRI. So this commit does not include any bytecode
changes or dynamic instruction modifications, which are done in original
MJIT.
This commit even strips off some aggressive optimizations from
YARV-MJIT, and thus it's slower than YARV-MJIT too. But it's still
fairly faster than Ruby 2.5 in some benchmarks (attached below).
Note that this JIT compiler passes `make test`, `make test-all`, `make
test-spec` without JIT, and even with JIT. Not only it's perfectly safe
with JIT disabled because it does not replace VM instructions unlike
MJIT, but also with JIT enabled it stably runs Ruby applications
including Rails applications.
I'm expecting this version as just "initial" JIT compiler. I have many
optimization ideas which are skipped for initial merging, and you may
easily replace this JIT compiler with a faster one by just replacing
mjit_compile.c. `mjit_compile` interface is designed for the purpose.
common.mk: update dependencies for mjit_compile.c.
internal.h: declare `rb_vm_insn_addr2insn` for MJIT.
vm.c: exclude some definitions if `-DMJIT_HEADER` is provided to
compiler. This avoids to include some functions which take a long time
to compile, e.g. vm_exec_core. Some of the purpose is achieved in
transform_mjit_header.rb (see `IGNORED_FUNCTIONS`) but others are
manually resolved for now. Load mjit_helper.h for MJIT header.
mjit_helper.h: New. This is a file used only by JIT-ed code. I'll
refactor `mjit_call_cfunc` later.
vm_eval.c: add some #ifdef switches to skip compiling some functions
like Init_vm_eval.
win32/mkexports.rb: export thread/ec functions, which are used by MJIT.
include/ruby/defines.h: add MJIT_FUNC_EXPORTED macro alis to clarify
that a function is exported only for MJIT.
array.c: export a function used by MJIT.
bignum.c: ditto.
class.c: ditto.
compile.c: ditto.
error.c: ditto.
gc.c: ditto.
hash.c: ditto.
iseq.c: ditto.
numeric.c: ditto.
object.c: ditto.
proc.c: ditto.
re.c: ditto.
st.c: ditto.
string.c: ditto.
thread.c: ditto.
variable.c: ditto.
vm_backtrace.c: ditto.
vm_insnhelper.c: ditto.
vm_method.c: ditto.
I would like to improve maintainability of function exports, but I
believe this way is acceptable as initial merging if we clarify the
new exports are for MJIT (so that we can use them as TODO list to fix)
and add unit tests to detect unresolved symbols.
I'll add unit tests of JIT compilations in succeeding commits.
Author: Takashi Kokubun <takashikkbn@gmail.com>
Contributor: wanabe <s.wanabe@gmail.com>
Part of [Feature #14235]
---
* Known issues
* Code generated by gcc is faster than clang. The benchmark may be worse
in macOS. Following benchmark result is provided by gcc w/ Linux.
* Performance is decreased when Google Chrome is running
* JIT can work on MinGW, but it doesn't improve performance at least
in short running benchmark.
* Currently it doesn't perform well with Rails. We'll try to fix this
before release.
---
* Benchmark reslts
Benchmarked with:
Intel 4.0GHz i7-4790K with 16GB memory under x86-64 Ubuntu 8 Cores
- 2.0.0-p0: Ruby 2.0.0-p0
- r62186: Ruby trunk (early 2.6.0), before MJIT changes
- JIT off: On this commit, but without `--jit` option
- JIT on: On this commit, and with `--jit` option
** Optcarrot fps
Benchmark: https://github.com/mame/optcarrot
| |2.0.0-p0 |r62186 |JIT off |JIT on |
|:--------|:--------|:--------|:--------|:--------|
|fps |37.32 |51.46 |51.31 |58.88 |
|vs 2.0.0 |1.00x |1.38x |1.37x |1.58x |
** MJIT benchmarks
Benchmark: https://github.com/benchmark-driver/mjit-benchmarks
(Original: https://github.com/vnmakarov/ruby/tree/rtl_mjit_branch/MJIT-benchmarks)
| |2.0.0-p0 |r62186 |JIT off |JIT on |
|:----------|:--------|:--------|:--------|:--------|
|aread |1.00 |1.09 |1.07 |2.19 |
|aref |1.00 |1.13 |1.11 |2.22 |
|aset |1.00 |1.50 |1.45 |2.64 |
|awrite |1.00 |1.17 |1.13 |2.20 |
|call |1.00 |1.29 |1.26 |2.02 |
|const2 |1.00 |1.10 |1.10 |2.19 |
|const |1.00 |1.11 |1.10 |2.19 |
|fannk |1.00 |1.04 |1.02 |1.00 |
|fib |1.00 |1.32 |1.31 |1.84 |
|ivread |1.00 |1.13 |1.12 |2.43 |
|ivwrite |1.00 |1.23 |1.21 |2.40 |
|mandelbrot |1.00 |1.13 |1.16 |1.28 |
|meteor |1.00 |2.97 |2.92 |3.17 |
|nbody |1.00 |1.17 |1.15 |1.49 |
|nest-ntimes|1.00 |1.22 |1.20 |1.39 |
|nest-while |1.00 |1.10 |1.10 |1.37 |
|norm |1.00 |1.18 |1.16 |1.24 |
|nsvb |1.00 |1.16 |1.16 |1.17 |
|red-black |1.00 |1.02 |0.99 |1.12 |
|sieve |1.00 |1.30 |1.28 |1.62 |
|trees |1.00 |1.14 |1.13 |1.19 |
|while |1.00 |1.12 |1.11 |2.41 |
** Discourse's script/bench.rb
Benchmark: https://github.com/discourse/discourse/blob/v1.8.7/script/bench.rb
NOTE: Rails performance was somehow a little degraded with JIT for now.
We should fix this.
(At least I know opt_aref is performing badly in JIT and I have an idea
to fix it. Please wait for the fix.)
*** JIT off
Your Results: (note for timings- percentile is first, duration is second in millisecs)
categories_admin:
50: 17
75: 18
90: 22
99: 29
home_admin:
50: 21
75: 21
90: 27
99: 40
topic_admin:
50: 17
75: 18
90: 22
99: 32
categories:
50: 35
75: 41
90: 43
99: 77
home:
50: 39
75: 46
90: 49
99: 95
topic:
50: 46
75: 52
90: 56
99: 101
*** JIT on
Your Results: (note for timings- percentile is first, duration is second in millisecs)
categories_admin:
50: 19
75: 21
90: 25
99: 33
home_admin:
50: 24
75: 26
90: 30
99: 35
topic_admin:
50: 19
75: 20
90: 25
99: 30
categories:
50: 40
75: 44
90: 48
99: 76
home:
50: 42
75: 48
90: 51
99: 89
topic:
50: 49
75: 55
90: 58
99: 99
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
r61560 ("offsetof(type, foo.bar) is (arguably) a GCCism")
introduced 16 bytes of stack overhead on 64-bit systems.
Remove that overhead and cast, instead. While we're at it,
restore the "waitq" name to clarify the purpose of the field.
(This is one unfortunate consequence of the CC0 ccan/list.h
implementation compared to the *GPL ones in glibc/urcu/linux)
* variable.c (struct autoload_state): remove head field, clarify naming
(autoload_reset): cast and adjust
(autoload_sleep_done): ditto
(rb_autoload_load): ditto
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61574 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
TL;DR see http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2031.htm
Suppose we have:
struct X {
struct Y {
z_t z;
} y;
} x;
then, you _cant_ infer offsetof(struct X, y.z). The ISO C99 section
7.17 says nothing about such situation. At least clang warns this
being an extension to the language (-Wextended-offsetof).
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61560 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
We do not need list_del_init in ensure callbacks, only list_del,
since it can only ever be called after list_del_init in
autoload_reset. So avoid the needless re-initialization.
* variable.c (autoload_sleep_done): s/list_del_init/list_del/
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58848 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
We cannot assume autoload_provided/rb_feature_provided returning
TRUE means it is safe to proceed without waiting. Another
thread may call rb_provide_feature before setting the constant
(via autoload_const_set). So we must wait until autoload is
completed by another thread.
Note: this patch was tested with an explicit rb_thread_schedule
in rb_provide_feature to make the race condition more apparent
as suggested by <s.wanabe@gmail.com>:
> --- a/load.c
> +++ b/load.c
> @@ -563,6 +563,7 @@ rb_provide_feature(VALUE feature)
> rb_str_freeze(feature);
>
> rb_ary_push(features, rb_fstring(feature));
> +rb_thread_schedule();
> features_index_add(feature, INT2FIX(RARRAY_LEN(features)-1));
> reset_loaded_features_snapshot();
> }
* variable.c (check_autoload_required): do not assume a provided
feature means autoload is complete, always wait if autoload is
being performed by another thread.
[ruby-core:81105] [Bug #11384] Thanks to <s.wanabe@gmail.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58696 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (autoload_reset): use idempotent list_del_init
(autoload_sleep): moved code from rb_autoload_load
(autoload_sleep_done): cleanup for use with rb_ensure
(rb_autoload_load): ensure list delete happens in case the
thread dies during sleep
* test/ruby/bug-13526.rb: new script for separate execution
* test/ruby/test_autoload.rb (test_bug_13526): new test
[ruby-core:81016] [Bug #13526]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58587 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* How to enable this feature?
* define USE_DEBUG_COUNTER as 1.
* you can disable to output the result with
RUBY_DEBUG_COUNTER_DISABLE environment variable
even if USE_DEBUG_COUNTER == 1.
* How to add new counter?
* add COUNTER(<name>) line on debug_counter.h.
* include "debug_counter.h"
* insert RB_DEBUG_COUNTER_INC(<name>) line on your favorite place.
* counter output example:
[RUBY_DEBUG_COUNTER] mc_inline_hit 999
[RUBY_DEBUG_COUNTER] mc_inline_miss 3
[RUBY_DEBUG_COUNTER] mc_global_hit 23
[RUBY_DEBUG_COUNTER] mc_global_miss 273
[RUBY_DEBUG_COUNTER] mc_global_state_miss 3
[RUBY_DEBUG_COUNTER] mc_class_serial_miss 0
[RUBY_DEBUG_COUNTER] mc_cme_complement 0
[RUBY_DEBUG_COUNTER] mc_cme_complement_hit 0
[RUBY_DEBUG_COUNTER] mc_search_super 1384
[RUBY_DEBUG_COUNTER] ivar_get_hit 0
[RUBY_DEBUG_COUNTER] ivar_get_miss 0
[RUBY_DEBUG_COUNTER] ivar_set_hit 0
[RUBY_DEBUG_COUNTER] ivar_set_miss 0
[RUBY_DEBUG_COUNTER] ivar_get 431
[RUBY_DEBUG_COUNTER] ivar_set 465
* mc_... is related to method caching.
* ivar_... is related to instance variable accesses.
* compare with dtrace/system tap features, there are completely
no performacne penalties when it is disabled.
* This feature is supported only on __GNUC__ compilers.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57676 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (rb_generic_ivar_table): declare as noreturn only in
GCC, which does not err on different attributes.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (rb_const_set): fix the condition to cache the class
path and cache permanent or temporary path corresponding to the
outer klass. [ruby-core:79039] [Bug #13120]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57305 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (rb_const_set): resolve and cache class name
immediately only if the outer class/module has the name,
otherwise just set the ID. [ruby-core:79007] [Bug #13113]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57284 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
If you have code like this:
```ruby
class A
def initialize
@a = nil
@b = nil
@c = nil
@d = nil
@e = nil
end
end
x = A.new
y = x.clone
100.times { |z| x.instance_variable_set(:"@foo#{z}", nil) }
puts y.inspect
```
`x` and `y` will share `iv_index_tbl` hashes. However, the size of the
hash will grow larger than the number if entries in `ivptr` in `y`.
Before this commit, `rb_ivar_count` would use the size of the hash to
determine how far to read in to the array, but this means that it could
read past the end of the array and cause the program to segv
[ruby-core:78403]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56938 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (rb_deprecate_constant): new function to deprecate a
constant by the name.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56119 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (rb_const_search): warn with the actual class/module
name which defines the deprecated constant.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56118 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (rb_const_search): raise with the actual class/module
name which defines the private constant.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56117 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (rb_path_to_class): consider the string length
instead of a terminator.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55449 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (rb_path_to_class): search the constant at once
instead of checking if defined and then getting it.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55448 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (check_autoload_required): check length first before
checking the first byte.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55207 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* vm_insnhelper.c (vm_get_ev_const): warn deprecated constant even
in the class context. [ruby-core:75505] [Bug #12382]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55005 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (rb_f_global_variables): add matched back references
only, as well as defiend? operator.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53534 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (rb_f_global_variables): add $1..$9 only if $~ is
set. fix the condition removed at r14014.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53530 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This further avoids class name resolution issues which came
about due to relying on hash table ordering before r53376.
Pre-caching the class name when it is never used raises memory
use, but the overall gain from moving away from st still gives
us a small gain. Reverting r53376 and this patch and testing with
"valgrind -v ./ruby -rrdoc -eexit" on x86 (32-bit) shows:
before:
in use at exit: 1,662,239 bytes in 25,286 blocks
total heap usage: 49,514 allocs, 24,228 frees, 6,005,561 bytes allocated
after, with this change:
in use at exit: 1,646,529 bytes in 24,572 blocks
total heap usage: 48,891 allocs, 24,319 frees, 6,003,921 bytes allocated
* class.c (Init_class_hierarchy): resolve name for rb_cObject ASAP
* object.c (rb_mod_const_set): move name resolution to rb_const_set
* variable.c (rb_const_set): do class resolution here
[ruby-core:72807] [Bug #11977]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53518 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
to avoid name conflict with /usr/include/floatingpoint.h on
Solaris. [Bug #11853] [ruby-dev:49448]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53230 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rb_autoload_str may be safer by preventing premature GC. It
can also be more efficient by passing a pre-frozen string that
can be deduped using rb_fstring. Common autoload callers (e.g.
rubygems, rdoc) already use string literals as the file
argument.
There seems to be no reason to expose rb_autoload_str to the
public C API since autoload is not performance-critical.
Applications may declare autoloads in Ruby code or via
rb_funcall; so merely deprecate rb_autoload without exposing
rb_autoload_str to new users.
Running: valgrind -v ruby -rrdoc -rubygems -e exit
shows a minor memory reduction (32-bit userspace)
before:
in use at exit: 1,600,621 bytes in 28,819 blocks
total heap usage: 55,786 allocs, 26,967 frees, 6,693,790 bytes allocated
after:
in use at exit: 1,599,778 bytes in 28,789 blocks
total heap usage: 55,739 allocs, 26,950 frees, 6,692,973 bytes allocated
* include/ruby/intern.h (rb_autoload): deprecate
* internal.h (rb_autoload_str): declare
* load.c (rb_mod_autoload): use rb_autoload_str
* variable.c (rb_autoload): become compatibility wrapper
(rb_autoload_str): hoisted out from old rb_autoload
[ruby-core:71369] [Feature #11664]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52909 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Removing the indirection helps me with readability, at
least. It doesn't seem like there are many other places
in the Ruby code where macros are used like this.
[ruby-core:71735] [Feature #11749]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52790 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c (autoload_reset): initialize formally to suppress a
warning from container_off_var() by Visual C.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52446 b2dd03c8-39d4-4d8f-98ff-823fe69b080e