* st.c (rb_hash_bulk_insert): new API to bulk insert entries
into a hash. Given arguments are first inserted into the
table at once, then reindexed. This is faster than inserting
things using rb_hash_aset() one by one.
This arrangement (rb_ prefixed function placed in st.c) is
unavoidable because it both touches table internal and write
barrier at once.
* internal.h: delcare the new function.
* hash.c (rb_hash_s_create): use the new function.
* vm.c (core_hash_merge): ditto.
* insns.def (newhash): ditto.
* test/ruby/test_hash.rb: more coverage on hash creation.
* test/ruby/test_literal.rb: ditto.
-----------------------------------------------------------
benchmark results:
minimum results in each 7 measurements.
Execution time (sec)
name before after
loop_whileloop2 0.136 0.137
vm2_bighash* 1.249 0.623
Speedup ratio: compare with the result of `before' (greater is better)
name after
loop_whileloop2 0.996
vm2_bighash* 2.004
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58492 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
We need to fix GC bug before merging this. Revert revisions
58452, 58435, 58434, 58428, 58427 in this order.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58463 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Same as rb_ary_tmp_new_from_values(), it reduces vm_exec_core binary
size from 26,176 bytes to 26,080 bytes. But this time, also with a
bit of optimizations:
- Because we are allocating a new hash and no back references are
introduced at all, we can safely skip write barriers.
- Also, the iteration never recurs. We can avoid complicated
function callbacks by using st_insert instead of st_update.
----
* hash.c (rb_hash_new_from_values): refactor
extract the bulk insert into a function.
* hash.c (rb_hash_new_from_object): also refactor.
* hash.c (rb_hash_s_create): use the new functions.
* insns.def (newhash): ditto.
* vm.c (core_hash_from_ary): ditto.
* iternal.h: export the new function.
-----------------------------------------------------------
benchmark results:
minimum results in each 7 measurements.
Execution time (sec)
name before after
loop_whileloop2 0.135 0.134
vm2_bighash* 1.236 0.687
Speedup ratio: compare with the result of `before' (greater is better)
name after
loop_whileloop2 1.008
vm2_bighash* 1.798
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58427 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Found a part where copy&paste can be eliminated. Reduces vm_exec_core
from 26,228 bytes to 26,176 bytes in size on my machine. I believe it
does not affect any runtime performance.
----
* array.c (rb_ary_tmp_new_from_values): extend existing
rb_ary_new_from_values function so that it can take
additional value for klass.
* array.c (rb_ary_new_from_values): use the new function.
* insns.def (toregexp): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58416 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* insns.def (trace): use cast `flag` to pass compilation with clang on MacOSX.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58394 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Contemporary C compilers are good at function inlining. They fold
multiple functions into one. However they are not yet smart enough to
unfold a function into several ones. So generally speaking, it is
wiser for a C programmer to manually split C functions whenever
possible. That should make rooms for compilers to optimize at will.
Before this changeset insns.def was converted into single HUGE
function called vm_exec_core(). By moving each instruction's core
into individual functions, generated C source code is reduced from
3,428 lines to 2,847 lines. Looking at the generated assembly
however, it seems my compiler (gcc 6.2) is extraordinary smart so that
it inlines almost all functions I introduced in this changeset back
into that vm_exec_core. On my machine compiled machine binary of the
function does not shrink very much in size (28,432 bytes to 26,816
bytes, according to nm(1)).
I believe this change is zero-cost. Several benchmarks I exercised
showed no significant difference beyond error mergin. For instance
3 repeated runs of optcarrot benchmark on my machine resulted in:
before this: 28.330329285707490, 27.513378371065920, 29.40420215754537
after this: 27.107195867280414, 25.549324021385907, 30.31581919050884
in fps (greater==faster).
----
* internal.h (rb_obj_not_equal): used from vm_insnhelper.c
* insns.def: move vast majority of lines into vm_insnhelper.c
* vm_insnhelper.c: moved here.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58390 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Unfortunately this enlarges insns.def by yet another
instruction. However, it is much prettier than opt_str_freeze
in use, and maybe we can avoid having so many instructions in
the future.
[ruby-core:80368]
* insns.def (DEFINE_INSN): new instruction: opt_str_uminus (maybe temporary)
* compile.c (iseq_compile_each0): split instructions
* test/ruby/test_optimization.rb (test_string_uminus): new test
* vm.c (vm_init_redefined_flag): set redefinintion flag for uminus
* vm_core.h (enum ruby_basic_operators): add BOP_UMINUS
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58144 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This commit is auto-generated using following command:
svn diff -r57807:57788 include internal.h bignum.c numeric.c compile.c insns.def object.c sprintf.c | patch -p0
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57818 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Looking at the source code, FIXABLE tends to be just before LOING2FIX
to check applicability of that operation. Why not try computing first
then check for overflow, which should be optimial.
I also tried the same thing for unsigned types but resulted in slower
execution. It seems RB_POSFIXABLE() is fast enough on modern CPUs.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57789 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
NOTE:
(1) Fixnum's LSB is always 1.
It means you can always run `x - 1` without overflow.
(2) Of course `z = x + (y-1)` may overflow.
Now z's LSB is always 1, and the MSB of true result is also 1.
You can get true result in long as `(1<<63)|(z>>1)`,
and it equals to `(z<<63)|(z>>1)` == `ror(z)`.
GCC and Clang have __builtin_add_ovewflow:
* https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
* https://clang.llvm.org/docs/LanguageExtensions.html#checked-arithmetic-builtins
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57506 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Patch by Eric Wong [ruby-core:78797].
I don't like the idea of making insns.def any bigger to support
a corner case, and "test_hash_aref_fstring_identity" shows
how contrived this is.
[ruby-core:78783] [Bug #12855]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57278 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* insns.def (checkmatch): adjust type of the index variable, to
get rid of (potential) overflow.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56911 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* insns.def (opt_case_dispatch): extract float value only if the
Float method is not redefnined.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56511 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
[Bug #12628]
This patch introduce many changes.
* Introduce concept of "Block Handler (BH)" to represent
passed blocks.
* move rb_control_frame_t::flag to ep[0] (as a special local
variable). This flags represents not only frame type, but also
env flags such as escaped.
* rename `rb_block_t` to `struct rb_block`.
* Make Proc, Binding and RubyVM::Env objects wb-protected.
Check [Bug #12628] for more details.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55766 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
instead of setting rb_thread_t::cfp directly.
* vm_insnhelper.c (vm_pop_frame): return the result of
finish frame or not.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55755 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
compilers to use x86 LEA instruction (3 operand).
Even if 3 operand LEA's latency is 3 cycle after SandyBridge,
it reduces code size and can be faster because of super scalar.
* insns.def (opt_plus): calculate and use rb_int2big.
On positive Fixnum overflow, `recv - 1 + obj` doesn't carry
because recv's msb and obj's msb are 0, and resulted msb is 1.
Therefore simply rshift and cast as signed long works fine.
On negative Fixnum overflow, it will carry because both arguments'
msb are 1, and resulted msb is also 1.
In this case it needs to restore carried sign bit after rshift.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55515 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* [Feature #12005] Unify Fixnum and Bignum into Integer
* include/ruby/ruby.h (rb_class_of): Return rb_cInteger for fixnums.
* insns.def (INTEGER_REDEFINED_OP_FLAG): Unified from
FIXNUM_REDEFINED_OP_FLAG and BIGNUM_REDEFINED_OP_FLAG.
* vm_core.h: Ditto.
* vm_insnhelper.c (opt_eq_func): Use INTEGER_REDEFINED_OP_FLAG instead
of FIXNUM_REDEFINED_OP_FLAG.
* vm.c (vm_redefinition_check_flag): Use rb_cInteger instead of
rb_cFixnum and rb_cBignum.
(C): Use Integer instead of Fixnum and Bignum.
* numeric.c (fix_succ): Removed.
(Init_Numeric): Define Fixnum as Integer.
* bignum.c (bignew): Use rb_cInteger instead of Rb_cBignum.
(rb_int_coerce): replaced from rb_big_coerce and return fixnums
as-is.
(Init_Bignum): Define Bignum as Integer.
Don't define ===.
* error.c (builtin_class_name): Return "Integer" for fixnums.
* sprintf.c (ruby__sfvextra): Use rb_cInteger instead of rb_cFixnum.
* ext/-test-/testutil: New directory to test.
Currently it provides utilities for fixnum and bignum.
* ext/json/generator/generator.c: Define mInteger_to_json.
* lib/mathn.rb (Fixnum#/): Redefinition removed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55024 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
superclass of a class as Object and it has another superclass.
[Bug #12367] [ruby-core:75446]
* test/ruby/test_class.rb: test for above.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54970 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
and define CONSTFUNC and PUREFUNC if available.
Note that I don't add those options as default because
it still shows many false-positive (it seems not to consider
longjmp).
* vm_eval.c (stack_check): get rb_thread_t* as an argument
to avoid duplicate call of GET_THREAD().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54952 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
x == FIXNUM_MIN && y == -1. This must be a rare case and it is
expected compiler to handle well.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54216 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
but LONG_LONG is always defined as 64bit), or there's int128_t.
* internal.h (DL2NUM): defined if DLONG is defined.
* internal.h (rb_fix_mul_fix): defined for `Fixnum * Fixnum`.
* insns.def (opt_mul): use rb_fix_mul_fix().
* numeric.c (fix_mul): ditto.
* time.c (mul): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54203 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Now `[x, y].max` is optimized so that a temporal array object is not
created in some condition.
* insns.def (opt_newarray_max, opt_newarray_min): added.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54153 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* bignum.c (rb_uint128t2big): added for opt_mult.
* bignum.c (rb_uint128t2big): added for rb_uint128t2big..
* configure.in: define int128_t, uint128_t and related MACROs.
Initially introduced by r41379 but reverted by r50749.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53741 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
The position of `/* fall through */` was moved by r52931.
* insns.def (opt_case_dispatch): Move a comment to the
appropriate position.
[ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53423 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Infinity cannot be written as an optimizable literal,
so it can never match a key in a CDHASH.
Avoid converting it to prevent FloatDomainError.
* insns.def (opt_case_dispatch): avoid converting Infinity
* test/ruby/test_optimization.rb (test_opt_case_dispatch_inf): new
[ruby-dev:49423] [Bug #11804]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53039 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
and a pre-compilation/runtime loader sample.
[Feature #11788]
* iseq.c: add new methods:
* RubyVM::InstructionSequence#to_binary_format(extra_data = nil)
* RubyVM::InstructionSequence.from_binary_format(binary)
* RubyVM::InstructionSequence.from_binary_format_extra_data(binary)
* compile.c: implement body of this new feature.
* load.c (rb_load_internal0), iseq.c (rb_iseq_load_iseq):
call RubyVM::InstructionSequence.load_iseq(fname) with
loading script name if this method is defined.
We can return any ISeq object as a result value.
Otherwise loading will be continue as usual.
This interface is not matured and is not extensible.
So that we don't guarantee the future compatibility of this method.
Basically, you should'nt use this method.
* iseq.h: move ISEQ_MAJOR/MINOR_VERSION (and some definitions)
from iseq.c.
* encoding.c (rb_data_is_encoding), internal.h: added.
* vm_core.h: add several supports for lazy load.
* add USE_LAZY_LOAD macro to specify enable or disable of
this feature.
* add several fields to rb_iseq_t.
* introduce new macro rb_iseq_check().
* insns.def: some check for lazy loading feature.
* vm_insnhelper.c: ditto.
* proc.c: ditto.
* vm.c: ditto.
* test/lib/iseq_loader_checker.rb: enabled iff suitable
environment variables are provided.
* test/runner.rb: enable lib/iseq_loader_checker.rb.
* sample/iseq_loader.rb: add sample compiler and loader.
$ ruby sample/iseq_loader.rb [dir]
will compile all ruby scripts in [dir].
With default setting, this compile creates *.rb.yarb files
in same directory of target .rb scripts.
$ ruby -r sample/iseq_loader.rb [app]
will run with enable to load compiled binary data.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
nil/true/false are special literals just like floats, integers,
literal strings, and symbols. Optimize when statements with
them by using a jump table, too.
target 0: a (ruby 2.3.0dev (2015-12-08 trunk 52928) [x86_64-linux]) at "/home/ew/rrrr/b/ruby"
target 1: b (ruby 2.3.0dev (2015-12-08 master 52928) [x86_64-linux]) at "/home/ew/ruby/b/ruby"
benchmark results:
minimum results in each 5 measurements.
Execution time (sec)
name a b
loop_whileloop2 0.102 0.103
vm2_case_lit* 1.657 0.549
Speedup ratio: compare with the result of `a' (greater is better)
name b
loop_whileloop2 0.988
vm2_case_lit* 3.017
* benchmark/bm_vm2_case_lit.rb: new benchmark
* compile.c (case_when_optimizable_literal): add nil/true/false
* insns.def (opt_case_dispatch): ditto
* vm.c (vm_redefinition_check_flag): ditto
* vm.c (vm_init_redefined_flag): ditto
* vm_core.h: ditto
* object.c (InitVM_Object): define === explicitly for nil/true/false
* test/ruby/test_case.rb (test_deoptimize_nil): new test
* test/ruby/test_optimization.rb (test_opt_case_dispatch): update
(test_eqq): new test
[ruby-core:71923] [Feature #11769]
Original patch by Aaron Patterson <tenderlove@ruby-lang.org>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52931 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
The missing check for Float#=== redefinition was noticed while
working on enhancing optimized case dispatch for nil/true/false
in [ruby-core:71818] https://bugs.ruby-lang.org/issues/11769
So no, I don't normally redefine core classes like this :P
* insns.def (opt_case_dispatch): check Float#=== redefinition
* test/ruby/test_optimization.rb (test_opt_case_dispatch): new
[ruby-core:71920] [Bug #11784]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52928 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
strings as default.
[Feature #11725]
* insns.def (freezestring): add new instruction to support adding
debug information for dynamically constracted strings.
* compile.c (iseq_compile_each): support adding debug information
for NODE_DSTR with freezestring instruction.
* error.c (rb_error_frozen): change the debug information ID name
id_debug_created_info and this field should have a 2 element array
containing path and line information.
* defs/id.def: ditto.
* test/ruby/test_rubyoptions.rb: catch up this fix.
* test/ruby/test_iseq.rb: now frozen strings are not same.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52688 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
current cref only when cached CREF list includes singleton class.
Singleton classes have own namespaces, so that we need to check
cref as a key (#10943).
However, if current CREF list does not include singleton class,
no need to check CREF beacuse it should be same name space.
* vm_insnhelper.c (vm_get_const_key_cref): add a function returns
CREF only when it includes singleton class.
* vm_core.h: constify iseq_inline_cache_entry::ic_cref.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52371 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* internal.h (RUBY_DTRACE_CREATE_HOOK): macro to call hook at
object creation.
* vm.c (rb_source_location, rb_source_loc): retrieve source path
and line number at once.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52340 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (iseq_peephole_optimize): peephole optimization for
branchnil jumps.
* compile.c (iseq_compile_each): generate save navigation operator
code.
* insns.def (branchnil): new opcode to pop the tos and branch if
it is nil.
* parse.y (NEW_QCALL, call_op, parser_yylex): parse token '.?'.
[Feature #11537]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52214 b2dd03c8-39d4-4d8f-98ff-823fe69b080e