Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead. This would significantly
speed up incremental builds.
We take the following inclusion order in this changeset:
1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very
first thing among everything).
2. RUBY_EXTCONF_H if any.
3. Standard C headers, sorted alphabetically.
4. Other system headers, maybe guarded by #ifdef
5. Everything else, sorted alphabetically.
Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
On USE_LAZY_LOAD=1, the iseq should be loaded. So rb_iseq_check()
is needed. Furthermore, now lazy loading with builtin_function_table
is not supported, so it should cancel lazy loading.
`foo(*rest, post, **empty_kw)` is compiled like
`foo(*rest + [post, **empty_kw])`, and `**empty_kw` is removed by
"newarraykwsplat" instruction.
However, the method call still has a flag of KW_SPLAT, so "post" is
considered as a keyword hash, which caused a segfault.
Note that the flag cannot be removed if "empty_kw" is not always empty.
This change fixes the issue by compiling arguments with "newarray"
instead of "newarraykwsplat".
[Bug #16442]
Now, C functions written by __builtin_cexpr!(code) and others are
named as "__builtin_inline#{n}". However, it is difficult to know
what the function is. This patch rename them into
"__builtin_foo_#{lineno}" when cexpr! is in 'foo' method.
%p is for void *. Becuase fprintf is a function with variadic arguments
automatic cast from any pointer to void * does not work. We have to be
explicit.
(This is the second try of 036bc1da6c6c9b0fa9b7f5968d897a9554dd770e.)
If iseq is GC'ed, the pointer of iseq may be reused, which may hide a
deprecation warning of keyword argument change.
http://ci.rvm.jp/results/trunk-test1@phosphorus-docker/2474221
```
1) Failure:
TestKeywordArguments#test_explicit_super_kwsplat [/tmp/ruby/v2/src/trunk-test1/test/ruby/test_keyword.rb:549]:
--- expected
+++ actual
@@ -1 +1 @@
-/The keyword argument is passed as the last hash parameter.* for `m'/m
+""
```
This change ad-hocly adds iseq_unique_id for each iseq, and use it
instead of iseq pointer. This covers the case where caller is GC'ed.
Still, the case where callee is GC'ed, is not covered.
But anyway, it is very rare that iseq is GC'ed. Even when it occurs, it
just hides some warnings. It's no big deal.
This commit introduces an "inline ivar cache" struct. The reason we
need this is so compaction can differentiate from an ivar cache and a
regular inline cache. Regular inline caches contain references to
`VALUE` and ivar caches just contain references to the ivar index. With
this new struct we can easily update references for inline caches (but
not inline var caches as they just contain an int)
jump-jump optimization ignores the event flags of the jump instruction
being skipped, which leads to overlook of line events.
This changeset stops the wrong optimization when coverage measurement is
neabled and when the jump instruction has any event flag.
Note that this issue is not only for coverage but also for TracePoint,
and this change does not fix TracePoint.
However, fixing it fundamentally is tough (which requires revamp of
the compiler). This issue is critical in terms of coverage measurement,
but minor for TracePoint (ko1 said), so we here choose a stopgap
measurement.
[Bug #15980] [Bug #16397]
Note for backporters: this changeset can be viewed by `git diff -w`.
rename __builtin_inline!(code) to __builtin_cstmt(code).
Also this commit introduce the following inlining C code features.
* __builtin_cstmt!(STMT)
(renamed from __builtin_inline!)
Define a function which run STMT implicitly and call this function at
evatuation time. Note that you need to return some value in STMT.
If there is a local variables (includes method parameters), you can
read these values.
static VALUE func(ec, self) {
VALUE x = ...;
STMT
}
Usage:
def double a
# a is readable from C code.
__builtin_cstmt! 'return INT2FIX(FIX2INT(a) * 2);'
end
* __builtin_cexpr!(EXPR)
Define a function which invoke EXPR implicitly like `__builtin_cstmt!`.
Different from cstmt!, which compiled with `return EXPR;`.
(`return` and `;` are added implicitly)
static VALUE func(ec, self) {
VALUE x = ...;
return EXPPR;
}
Usage:
def double a
__builtin_cexpr! 'INT2FIX(FIX2INT(a) * 2)'
end
* __builtin_cconst!(EXPR)
Define a function which invoke EXPR implicitly like cexpr!.
However, the function is called once at compile time, not evaluated time.
Any local variables are not accessible (because there is no local variable
at compile time).
Usage:
GCC = __builtin_cconst! '__GNUC__'
* __builtin_cinit!(STMT)
STMT are writtein in auto-generated code.
This code does not return any value.
Usage:
__builtin_cinit! '#include <zlib.h>'
def no_compression?
__builtin_cconst! 'Z_NO_COMPRESSION ? Qtrue : Qfalse'
end
These functions are used from within a compilation unit so we can
make them static, for better binary size. This changeset reduces
the size of generated ruby binary from 26,590,128 bytes to
26,584,472 bytes on my macihne.
opt_invokebuiltin_delegate and opt_invokebuiltin_delegate_leave
invokes builtin functions with same parameters of the method.
This technique eliminate stack push operations. However, delegation
parameters should be completely same as given parameters.
(e.g. `def foo(a, b, c) __builtin_foo(a, b, c)` is okay, but
__builtin_foo(b, c) is not allowed)
This patch relaxes this restriction. ISeq has a local variables
table which includes parameters. For example, the method defined
as `def foo(a, b, c) x=y=nil`, then local variables table contains
[a, b, c, x, y]. If calling builtin-function with arguments which
are sub-array of the lvar table, use opt_invokebuiltin_delegate
instruction with start index. For example, `__builtin_foo(b, c)`,
`__builtin_bar(c, x, y)` is okay, and so on.
Fixes [Bug #16332]
Constant access was changed to no longer allow top-level constant access
through `nil`, but `defined?` wasn't changed at the same time to stay
consistent.
Use a separate defined type to distinguish between a constant
referenced from the current lexical scope and one referenced from
another namespace.
Add an experimental `__builtin_inline!(c_expression)` special intrinsic
which run a C code snippet.
In `c_expression`, you can access the following variables:
* ec (rb_execution_context_t *)
* self (const VALUE)
* local variables (const VALUE)
Not that you can read these variables, but you can not write them.
You need to return from this expression and return value will be a
result of __builtin_inline!().
Examples:
`def foo(x) __builtin_inline!('return rb_p(x);'); end` calls `p(x)`.
`def double(x) __builtin_inline!('return INT2NUM(NUM2INT(x) * 2);')`
returns x*2.
Support loading builtin features written in Ruby, which implement
with C builtin functions.
[Feature #16254]
Several features:
(1) Load .rb file at boottime with native binary.
Now, prelude.rb is loaded at boottime. However, this file is contained
into the interpreter as a text format and we need to compile it.
This patch contains a feature to load from binary format.
(2) __builtin_func() in Ruby call func() written in C.
In Ruby file, we can write `__builtin_func()` like method call.
However this is not a method call, but special syntax to call
a function `func()` written in C. C functions should be defined
in a file (same compile unit) which load this .rb file.
Functions (`func` in above example) should be defined with
(a) 1st parameter: rb_execution_context_t *ec
(b) rest parameters (0 to 15).
(c) VALUE return type.
This is very similar requirements for functions used by
rb_define_method(), however `rb_execution_context_t *ec`
is new requirement.
(3) automatic C code generation from .rb files.
tool/mk_builtin_loader.rb creates a C code to load .rb files
needed by miniruby and ruby command. This script is run by
BASERUBY, so *.rb should be written in BASERUBY compatbile
syntax. This script load a .rb file and find all of __builtin_
prefix method calls, and generate a part of C code to export
functions.
tool/mk_builtin_binary.rb creates a C code which contains
binary compiled Ruby files needed by ruby command.
Get rid of these redundant and useless warnings.
```
$ ruby -e 'def bar(a) a; end; def foo(...) bar(...) end; foo({})'
-e:1: warning: The last argument is used as the keyword parameter
-e:1: warning: for `foo' defined here
-e:1: warning: The keyword argument is passed as the last hash parameter
-e:1: warning: for `bar' defined here
```
To perform a regular method call, the VM needs two structs,
`rb_call_info` and `rb_call_cache`. At the moment, we allocate these two
structures in separate buffers. In the worst case, the CPU needs to read
4 cache lines to complete a method call. Putting the two structures
together reduces the maximum number of cache line reads to 2.
Combining the structures also saves 8 bytes per call site as the current
layout uses separate two pointers for the call info and the call cache.
This saves about 2 MiB on Discourse.
This change improves the Optcarrot benchmark at least 3%. For more
details, see attached bugs.ruby-lang.org ticket.
Complications:
- A new instruction attribute `comptime_sp_inc` is introduced to
calculate SP increase at compile time without using call caches. At
compile time, a `TS_CALLDATA` operand points to a call info struct, but
at runtime, the same operand points to a call data struct. Instruction
that explicitly define `sp_inc` also need to define `comptime_sp_inc`.
- MJIT code for copying call cache becomes slightly more complicated.
- This changes the bytecode format, which might break existing tools.
[Misc #16258]
This changeset basically replaces `ruby_xmalloc(x * y)` into
`ruby_xmalloc2(x, y)`. Some convenient functions are also
provided for instance `rb_xmalloc_mul_add(x, y, z)` which allocates
x * y + z byes.
The parser needs to determine whether a local varaiable is defined or
not in outer scope. For the sake, "base_block" field has kept the outer
block.
However, the whole block was actually unneeded; the parser used only
base_block->iseq.
So, this change lets parser_params have the iseq directly, instead of
the whole block.
`freeze_string` essentially called iseq_add_mark_object_compile_time. I
need to know where all writes occur on the `rb_iseq_t`, so this commit
separates the function calls so we can add write barriers in the right
place.
Move the "add mark object" function to the location where we should be
calling RB_OBJ_WRITTEN. I'm going to add verification code next so we
can make sure the objects we're adding to the array are also reachable
from the mark function.
This makes it consistent with calling private attribute assignment
methods, which currently is allowed (e.g. `self.value =`).
Calling a private method in this way can be useful when trying to
assign the return value to a local variable with the same name.
[Feature #11297] [Feature #16123]
The output of RubyVM::InstructionSequence#to_binary is extremely large.
We have reduced the output of #to_binary by more than 70%.
The execution speed of RubyVM::InstructionSequence.load_from_binary is about 7% slower, but when reading a binary from a file, it may be faster than the master.
Since Bootsnap gem uses #to_binary, this proposal reduces the compilation cache size of Rails projects to about 1/4.
See details: [Feature #16163]
RubyVM::InstructionSequence.to_binary generates a bytecode binary
representation. To check compatibility with binary and loading
MRI we prepared major/minor version and compare them at loading
time. However, development version of MRI can change this format
but we can not increment minor version to make them consistent
with Ruby's major/minor versions.
To solve this issue, we introduce new minor version scheme
(binary's minor_version = ruby's minor * 10000 + dev ver)
and we can check incompatibility with older dev version.
The original code looks unnecessarily complicated (to me).
Also, it creates a pre-allocated array only for the prefix of the array.
The new code optimizes not only the prefix but also the subsequence that
is longer than 0x40 elements.
# not optimized
10000000.times { [1+1, 1,2,3,4,...,63] } # 2.12 sec.
# (1+1; push 1; push 2; ...; puts 63; newarray 64; concatarray)
# optimized
10000000.times { [1+1, 1,2,3,4,...,63,64] } # 1.46 sec.
# (1+1; newarray 1; putobject [1,2,3,...,64]; concatarray)
"popped" case can be so simple, so this change moves the branch to the
first, instead of scattering `if (popped)` branches to the main part.
Also, the return value "len" is not used. So it returns just 0 or 1.
compile_list was for the compilation of Array literal and Hash literal.
I guess it was originally reasonable to handle them in one function, but
now, compilation of Array is very different from Hash. So the function
was complicated by many branches for Array and Hash.
This change separates the function to two ones for Array and Hash.
An array literal [1,2,...,301] was compiled to the following iseq:
duparray [1,2,...,300]
putobject [301]
concatarray
The Array literal optimization took every two elements maybe because it
must handle not only Array but also Hash.
Now the optimization takes each element if it is an Array literal. So
the new iseq is: duparray [1,2,...,301].
`[{}, {}, {}, ..., {}, *{}]` is wrongly created.
A big array literal is created and concatenated for every 256 elements.
The newarraykwsplat must be emitted only at the last chunk.
and NODE_ZARRAY to NODE_ZLIST.
NODE_ARRAY is used not only by an Array literal, but also the contents
of Hash literals, method call arguments, dynamic string literals, etc.
In addition, the structure of NODE_ARRAY is a linked list, not an array.
This is very confusing, so I believe `NODE_LIST` is a better name.
Previously, **{} was removed by the parser:
```
$ ruby --dump=parse -e '{**{}}'
@ NODE_SCOPE (line: 1, location: (1,0)-(1,6))
+- nd_tbl: (empty)
+- nd_args:
| (null node)
+- nd_body:
@ NODE_HASH (line: 1, location: (1,0)-(1,6))*
+- nd_brace: 1 (hash literal)
+- nd_head:
(null node)
```
Since it was removed by the parser, the compiler did not know
about it, and `m(**{})` was therefore treated as `m()`.
This modifies the parser to not remove the `**{}`. A simple
approach for this is fairly simple by just removing a few
lines from the parser, but that would cause two hash
allocations every time it was used. The approach taken here
modifies both the parser and the compiler, and results in `**{}`
not allocating any hashes in the usual case.
The basic idea is we use a literal node in the parser containing
a frozen empty hash literal. In the compiler, we recognize when
that is used, and if it is the only keyword present, we just
push it onto the VM stack (no creation of a new hash or merging
of keywords). If it is the first keyword present, we push a
new empty hash onto the VM stack, so that later keywords can
merge into it. If it is not the first keyword present, we can
ignore it, since the there is no reason to merge an empty hash
into the existing hash.
Example instructions for `m(**{})`
Before (note ARGS_SIMPLE):
```
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,7)> (catch: FALSE)
0000 putself ( 1)[Li]
0001 opt_send_without_block <callinfo!mid:m, argc:0, FCALL|ARGS_SIMPLE>, <callcache>
0004 leave
```
After (note putobject and KW_SPLAT):
```
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,7)> (catch: FALSE)
0000 putself ( 1)[Li]
0001 putobject {}
0003 opt_send_without_block <callinfo!mid:m, argc:1, FCALL|KW_SPLAT>, <callcache>
0006 leave
```
Example instructions for `m(**h, **{})`
Before and After (no change):
```
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
0000 putself ( 1)[Li]
0001 putspecialobject 1
0003 newhash 0
0005 putself
0006 opt_send_without_block <callinfo!mid:h, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache>
0009 opt_send_without_block <callinfo!mid:core#hash_merge_kwd, argc:2, ARGS_SIMPLE>, <callcache>
0012 opt_send_without_block <callinfo!mid:m, argc:1, FCALL|KW_SPLAT>, <callcache>
0015 leave
```
Example instructions for `m(**{}, **h)`
Before:
```
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
0000 putself ( 1)[Li]
0001 putspecialobject 1
0003 newhash 0
0005 putself
0006 opt_send_without_block <callinfo!mid:h, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache>
0009 opt_send_without_block <callinfo!mid:core#hash_merge_kwd, argc:2, ARGS_SIMPLE>, <callcache>
0012 opt_send_without_block <callinfo!mid:m, argc:1, FCALL|KW_SPLAT>, <callcache>
0015 leave
```
After (basically the same except for the addition of swap):
```
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
0000 putself ( 1)[Li]
0001 newhash 0
0003 putspecialobject 1
0005 swap
0006 putself
0007 opt_send_without_block <callinfo!mid:h, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache>
0010 opt_send_without_block <callinfo!mid:core#hash_merge_kwd, argc:2, ARGS_SIMPLE>, <callcache>
0013 opt_send_without_block <callinfo!mid:m, argc:1, FCALL|KW_SPLAT>, <callcache>
0016 leave
```
for simplicity and consistency.
Now SUPPORT_JOKE needs to be prefixed with OPT_ to make the config
visible in `RubyVM::VmOptsH`, and the inconsistency was introduced.
As it has never been available for override in configure (no #ifndef
guard), it should be fine to rename the config.
This syntax means the method should be treated as a method that
uses keyword arguments, but no specific keyword arguments are
supported, and therefore calling the method with keyword arguments
will raise an ArgumentError. It is still allowed to double splat
an empty hash when calling the method, as that does not pass
any keyword arguments.
After 5e86b005c0, I now think ANYARGS is
dangerous and should be extinct. This commit adds function prototypes
for rb_hash_foreach / st_foreach_safe. Also fixes some prototype
mismatches.
After 5e86b005c0, I now think ANYARGS is
dangerous and should be extinct. This commit deletes ANYARGS from
struct vm_ifunc, but in doing so we also have to decouple the usage
of this struct in compile.c, which (I think) is an abuse of ANYARGS.
Some tooling depends on the current bytecode, and adding an operand
changes the bytecode. While tooling can be updated for new bytecode,
this support doesn't warrant such a change.
This was an intentional bug added in 1.9.
The approach taken here is to add a second operand to the
getconstant instruction for whether nil should be allowed and
treated as current scope.
Fixes [Bug #11718]
Binary dumping the iseq for `case foo in []; end` used to crash as
there was no handling for these exception classes.
Pattern matching generates these classes as operands to `putobject`.
[Bug #16088]
Closes: https://github.com/ruby/ruby/pull/2325
`(ID)1` was assigned to NODE_ARGS#rest_arg for `{|x,| }`.
This change removes the magic number by introducing an explicit macro
variable for it: NODE_SPECIAL_EXCESSED_COMMA.
* internal.h (UNALIGNED_MEMBER_ACCESS, UNALIGNED_MEMBER_PTR):
moved from eval_intern.h.
* compile.c iseq.c, vm.c: use UNALIGNED_MEMBER_PTR for `entries`
in `struct iseq_catch_table`.
* vm_eval.c, vm_insnhelper.c: use UNALIGNED_MEMBER_PTR for `body`
in `rb_method_definition_t`.
Because hard to specify commits related to r67479 only.
So please commit again.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67499 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
ISeq pins references in the mark array during compile, so it manually
marks references in the mark_ary. This was causing write barrier
misses, so we need to add a write barrier when pushing on the mark
array.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67489 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* insns.def: add definemethod and definesmethod (singleton method)
instructions. Old YARV contains these instructions, but it is moved
to methods of FrozenCore class because remove number of instructions
can improve performance for some techniques (static stack caching
and so on). However, we don't employ these technique and it is hard
to optimize/analysis definition sequence. So I decide to introduce
them (and remove definition methods). `putiseq` insn is also removed.
* vm_method.c (rb_scope_visibility_get): renamed to
`vm_scope_visibility_get()` and make it accept `ec`.
Same for `vm_scope_module_func_check()`.
These fixes are result of refactoring `vm_define_method`.
* vm_insnhelper.c (rb_vm_get_cref): renamed to `vm_get_cref`
because of consistency with other functions.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67442 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
NODE_HASH#nd_brace is a flag that is 1 for `foo({ k: 1 })` and 0 for
`foo(k: 1)`.
nd_alen had been abused for the flag (and the implementation is
completely the same), but an explicit name is better to read.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67266 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
For unknown reason, setup_args processed the arguments from the last to
the first. This is not only difficult to read, but also inefficient in
some cases. For example, the arguments of `foo(*a1, *a2, *a3)` was
compiled like `a1.dup << (a2.dup << a3)`. The second dup (`a2.dup`) is
not needed.
This change refactors the function so that it processes the arguments
forward: `foo(*a1, *a2, *a3)` is compiled as `a1.dup << a2 << a3`, and
in my opinion, the source code is now much more readable.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67255 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
compile_array function had three usages: array literal, hash literal,
and method arguments. I think the third is completely different than the
first and second. For example, method arguments and popped are
meaningless; keywords_ptr and flag parameter for array/hash literal is
also unused.
This change refactors them: a function "compile_args" is created for the
third, and removes no longer used parameters of "compile_array".
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67252 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c: refacetoring:
* initialize `branches` with Qfalse intead of 0.
* make compile_call* functions from `iseq_compile_each0()`
to make modifying them easy.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67113 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (check_yield_place): this function check the yield location.
* show a warning if yield in `class` syntax. [Feature #15575]
* do strict check for toplevel `yield`. Without this patch,
`1.times{ yield }` in toplevel is valid-syntax (raise LocalJumpError
at runtime) although toplevel simple `yield` is not valid syntax.
This patch make them syntax error.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66999 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
When copying `leave` insn, TRACE also should be copied if it is
present, but this optimization is trivial and not worth the
complexity. [ruby-core:91366] [Bug #15578]
4cae5353c05afd479de6
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66977 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Not only TRACE_ELEMENT but also INSN_ELEMENT may have events.
The old pc2branchindex was created using only events of TRACE_ELEMENTs.
This change uses events of INSN_ELEMENTs too for pc2branchindex table.
[Bug #15476]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66676 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Previously, these hash literals were not frozen, and thus could be
modified by ObjectSpace, resulting in undesired behavior. Example:
```ruby
require 'objspace'
def a(b={0=>1,1=>4,2=>17})
b
end
p a
ObjectSpace.each_object(Hash) do |a|
a[3] = 8 if a.class == Hash && a[0] == 1 && a[1] == 4 && a[2] == 17
end
p a
```
It may be desirable to hide such hashes from ObjectSpace, since
they are internal, but I'm not sure how to do that.
From: Jeremy Evans <code@jeremyevans.net>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66464 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
`code_index` doesn't need to be incremented since the mark array has
been removed. Thanks for the patch ko1!
[ruby-core:90456] [Bug #15406]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66376 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_load_iseq_each): iseq_mark assumes that if
param.flags.has_kw is TRUE, then param.keyword is not NULL.
To confirm this assumption, make it FALSE before param.keyword
is initialized.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66367 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c: we need to restore `catch_except_p` flag at
`load_from_binary`. [Bug #15395]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66366 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (iseq_insert_nop_between_end_and_cont): insert nop so
that the end of rescue and continuing points are not same, to
get rid of infinite loop. [Bug #15385]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66326 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
The coverage counters is initialized with `counter[lineno - 1] = 0`,
but lineno may be 0, which led to write access for index -1.
[ruby-core:90085] [Bug#15346]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66025 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (get_local_var_idx, get_dyna_var_idx): raise a compile
error which is useful than rb_bug, when ID is not found.
* compile.c (iseq_set_sequence): ditto when IC index overflow,
with dumping generated code.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65617 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (iseq_calc_param_size): use UNREACHABLE than rb_bug,
at where never reachable.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65614 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
The instructions are just for optimization. To clarity the intention,
this change adds the prefix "opt_", like "opt_case_dispatch".
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65600 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
When loading iseq from binary while a TracePoint is on, we need to
recompile instructions to their "trace_" variant. Before this commit
we only recompiled instructions in the top level iseq, which meant
that TracePoint was malfunctioning for code inside module/class/method
definitions.
* compile.c: Move rb_iseq_init_trace to rb_ibf_load_iseq_complete.
It is called on all iseqs during loading.
* test_iseq.rb: Test that tracepoints fire within children iseq when
using load_from_binary.
This patch is from: Alan Wu <XrXr@users.noreply.github.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65567 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c, internal.h: support theap for small Hash.
Introduce RHASH_ARRAY (li_table) besides st_table and small Hash
(<=8 entries) are managed by an array data structure.
This array data can be managed by theap.
If st_table is needed, then converting array data to st_table data.
For st_table using code, we prepare "stlike" APIs which accepts hash value
and are very similar to st_ APIs.
This work is based on the GSoC achievement
by tacinight <tacingiht@gmail.com> and refined by ko1.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65454 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
theap is designed for Ruby's object system. theap is like Eden heap
on generational GC terminology. theap allocation is very fast because
it only needs to bump up pointer and deallocation is also fast because
we don't do anything. However we need to evacuate (Copy GC terminology)
if theap memory is long-lived. Evacuation logic is needed for each type.
See [Bug #14858] for details.
* array.c: Now, theap for T_ARRAY is supported.
ary_heap_alloc() tries to allocate memory area from theap. If this trial
sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
We don't need to free theap ptr.
* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that
if ary is allocated at theap, force evacuation to malloc'ed memory.
It makes programs slow, but very compatible with current code because
theap memory can be evacuated (theap memory will be recycled).
If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
will occur, use RARRAY_CONST_PTR().
(re-commit of r65444)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65449 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
theap is designed for Ruby's object system. theap is like Eden heap
on generational GC terminology. theap allocation is very fast because
it only needs to bump up pointer and deallocation is also fast because
we don't do anything. However we need to evacuate (Copy GC terminology)
if theap memory is long-lived. Evacuation logic is needed for each type.
See [Bug #14858] for details.
* array.c: Now, theap for T_ARRAY is supported.
ary_heap_alloc() tries to allocate memory area from theap. If this trial
sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
We don't need to free theap ptr.
* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that
if ary is allocated at theap, force evacuation to malloc'ed memory.
It makes programs slow, but very compatible with current code because
theap memory can be evacuated (theap memory will be recycled).
If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
will occur, use RARRAY_CONST_PTR().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65444 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_dump_object_hash): use `rb_hash_foreach()`
instead of using `st_foreach()`.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65411 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (iseq_peephole_optimize): should `pop` before jump
instruction which succeeds to `newarray` of a literal object,
not after. [ruby-core:89536] [Bug #15245]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65350 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
The instructions were used only for branch coverage.
Instead, it now uses a trace framework [Feature #14104].
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65225 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This patch introduces "oneshot_lines" mode for `Coverage.start`, which
checks "whether each line was executed at least once or not", instead of
"how many times each line was executed". A hook for each line is fired
at most once, and after it is fired, the hook flag was removed; it runs
with zero overhead.
See [Feature #15022] in detail.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65195 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Recent GCC warns that default_len can be negative (thus can
overflow PTRDIFF_MAX), which is a false assert. Suppresses
warnings by adding __builtin_unreachable.
See also: https://travis-ci.org/ruby/ruby/jobs/443568193#L2227
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65167 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
The former states explicitly that the argument must be a literal,
and can optimize away `strlen` on all compilers.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65059 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
because r64849 seems to fix issues which we were confused about.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64850 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
as a workaround to fix the build pipeline broken by r64824,
because optimizing Ruby should be prioritized higher than supporting unused jokes.
In the current build system, exceeding 200 insns somehow crashes C
extension build on some of MinGW environments like "mingw32-make[1]:
*** No rule to make target 'note'. Stop."
https://ci.appveyor.com/project/ruby/ruby/build/9725/job/co4nu9jugm8qwdrp
and on some of Linux environments like "cannot load such file -- stringio (LoadError)"
```
build_install /home/ko1/ruby/src/trunk_gcc5/lib/rubygems/specification.rb:18:in `require': cannot load such file -- stringio (LoadError)
from /home/ko1/ruby/src/trunk_gcc5/lib/rubygems/specification.rb:18:in `<top (required)>'
from /home/ko1/ruby/src/trunk_gcc5/lib/rubygems.rb:1365:in `require'
from /home/ko1/ruby/src/trunk_gcc5/lib/rubygems.rb:1365:in `<module:Gem>'
from /home/ko1/ruby/src/trunk_gcc5/lib/rubygems.rb:116:in `<top (required)>'
from /home/ko1/ruby/src/trunk_gcc5/tool/rbinstall.rb:24:in `require'
from /home/ko1/ruby/src/trunk_gcc5/tool/rbinstall.rb:24:in `<main>'
make: *** [do-install-nodoc] Error 1
```
http://ci.rvm.jp/results/trunk_gcc5@silicon-docker/1353447
This commit removes "bitblt" and "trace_bitblt" insns, which reduces the
number of insns from 202 to 200 and fixes at least the latter build
failure. I hope this fixes the MinGW build failure as well. Let me
confirm the situation on AppVeyor CI.
Note that this is hard to fix because some MinGW environments (MSP-Greg's
MinGW CI on AppVeyor) don't reproduce this and some Linux environments
(including my local machine) don't reproduce it either. Make sure you
have the reproductive environment and confirm it's fixed when reverting
this commit.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64839 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This reverts commit r64829. I'll prepare another temporary fix, but I'll
separately commit that to make it easier to revert that later.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64838 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
not optimizing Array#& and Array#| because vm_insnhelper.c can't easily
inline it (large amount of array.c code would be needed in vm_insnhelper.c)
and the method body is a little complicated compared to Integer's ones.
So I thought only Integer#& and Integer#| have a significant impact,
and eliminating unnecessary branches would contribute to JIT's performance.
vm_insnhelper.c: ditto
tool/transform_mjit_header.rb: make sure these instructions are inlined
on JIT.
compile.c: compile vm_opt_and and vm_opt_or.
id.def: define id for them to be used in compile.c and vm*.c
vm.c: track redefinition of Integer#& and Integer#|
vm_core.h: allow detecting redefinition of & and |
test/ruby/test_jit.rb: test new insns
test/ruby/test_optimization.rb: ditto
* Optcarrot benchmark
This is a kind of experimental thing but I'm committing this since the
performance impact is significant especially on Optcarrot with JIT.
$ benchmark-driver benchmark.yml --rbenv 'before::before --disable-gems;before+JIT::before --disable-gems --jit;after::after --disable-gems;after+JIT::after --disable-gems --jit' -v --repeat-count 24
before: ruby 2.6.0dev (2018-09-24 trunk 64821) [x86_64-linux]
before+JIT: ruby 2.6.0dev (2018-09-24 trunk 64821) +JIT [x86_64-linux]
after: ruby 2.6.0dev (2018-09-24 opt_and 64821) [x86_64-linux]
last_commit=opt_or
after+JIT: ruby 2.6.0dev (2018-09-24 opt_and 64821) +JIT [x86_64-linux]
last_commit=opt_or
Calculating -------------------------------------
before before+JIT after after+JIT
Optcarrot Lan_Master.nes 51.460 66.315 53.023 71.173 fps
Comparison:
Optcarrot Lan_Master.nes
after+JIT: 71.2 fps
before+JIT: 66.3 fps - 1.07x slower
after: 53.0 fps - 1.34x slower
before: 51.5 fps - 1.38x slower
[close https://github.com/ruby/ruby/pull/1963]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64824 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (iseq_compile_each0): Use `opt_aref`/`opt_aset` over
`opt_aref_with`/`opt_aset_with` when frozen_string_literal: true,
not to resurrect the index string on non-Hash receiver.
[Fix GH-1957]
From: chopraanmol1 <chopraanmol1@gmail.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64745 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
I assume we always prefix rb_ to non-static functions to avoid conflict.
These functions are not exported and safe to be renamed.
iseq.h: ditto
compile.c: ditto
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64736 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Simply use DISPATCH_ORIGINAL_INSN instead of rb_funcall. This is,
when possible, overall performant because method dispatch results are
cached inside of CALL_CACHE. Should also be good for JIT.
----
trunk: ruby 2.6.0dev (2018-09-12 trunk 64689) [x86_64-darwin15]
ours: ruby 2.6.0dev (2018-09-12 leaf-insn 64688) [x86_64-darwin15]
last_commit=make opt_str_freeze leaf
Calculating -------------------------------------
trunk ours
vm2_freezestring 5.440M 31.411M i/s - 6.000M times in 1.102968s 0.191017s
Comparison:
vm2_freezestring
ours: 31410864.5 i/s
trunk: 5439865.4 i/s - 5.77x slower
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64690 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This instruction can be written without rb_funcall. It not only boosts
performance of case statements, but also makes room of future JIT
improvements. Because opt_case_dispatch is about optimization this
should not be a bad thing to have.
----
trunk: ruby 2.6.0dev (2018-09-05 trunk 64634) [x86_64-darwin15]
ours: ruby 2.6.0dev (2018-09-12 leaf-insn 64688) [x86_64-darwin15]
last_commit=make opt_case_dispatch leaf
Calculating -------------------------------------
trunk ours
vm2_case_lit 1.366 2.012 i/s - 1.000 times in 0.731839s 0.497008s
Comparison:
vm2_case_lit
ours: 2.0 i/s
trunk: 1.4 i/s - 1.47x slower
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64689 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This enhances rb_vm_insn_addr2insn which retrieves a decoded insn number
from encoded insn.
The insn data table include not only decoded insn number, but also its
len, trace and non-trace version of encoded insn.
This table can be used to simplify trace instrumentation.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64518 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (iseq_ibf_load): remove `const` to pass iseq as no `const`
parameter.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64515 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* iseq.c (iseq_init_trace): at ISeq loading time, we need to check
`ruby_vm_event_enabled_flags` to turn on trace instructions.
Seprate this checking code from `finish_iseq_build()` and make
new function. `iseq_ibf_load()` calls this funcation after loading.
* test/ruby/test_iseq.rb: add a test for this fix.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64514 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (compile_branch_condition): pop dynamic literal
object, which is never nil/false, as the branch condition.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64512 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Line coverage was based on special instruction "tracecoverage".
Now, instead, it uses the mechanism of trace hook [Feature #14104].
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64509 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
The code fragments that initializes coverage data were scattered into
both parse.y and compile.c. parse.y allocated a coverage data, and
compile.c initialize the data.
To remove this cross-cutting concern, this change moves the allocation
from "coverage" function of parse.y to "rb_iseq_new_top" of iseq.c.
For the sake, parse.y just counts the line number of the original source
code, and the number is passed via rb_ast_body_t.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64508 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (when_vals): return a negative value on error.
* compile.c (compile_case): check error in when_vals().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64315 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This is just a refactoring.
The receiver of "invokesuper" was a boolean to represent if it is ZSUPER
or not. This was used in vm_search_super_method to prohibit ZSUPER call
in define_method. (It is currently prohibited because of the limitation
of the implementation.)
This change removes the hack by introducing an explicit flag,
VM_CALL_SUPER, to signal the information. Now, the implementation of
"invokesuper" is consistent with "send" instruction.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64268 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
During instruction translation (linked list -> iseq generation), we can
treat `TS_VALUE` and `TS_ISEQ` the same as they are just embedded in the
generated sequences. The only difference between `TS_ISE` and `TS_IC`
is that an inline storage entry may contain a markable `VALUE` pointer
at some point, so we need to flag the iseq as containing markable
objects.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63923 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Objects loaded during iseq deserialization using arrays need to be added
to the compile time mark array so that they stay alive until iseqs
finish loading.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63920 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (iseq_peephole_optimize): remove unreachable jump
instruction only. if it is labeled and referred from other
instructions, it is reachable and must not be removed.
[ruby-core:87830] [Bug #14897]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63870 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
```
if L1
L0:
jump L2
L1:
...
L2:
```
was wrongly optimized to:
```
unless L2
L0:
L1:
...
L2:
```
To make it conservative, this optimization is now disabled when there is
any label between `if` and `jump` instructions.
Fixes [Bug #14897].
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63868 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* vm.c (core_hash_merge_kwd): simplified to merge the second hash
into the first hash.
* compile.c (compile_array): call core#hash_merge_kwd with 2
hashes always, by passing an new empty hash to at the first
iteration.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63845 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Now endless range can be created by either a literal `(1..)` or explicit
range creation `Range.new(1, nil)`. [Bug #14845]
This change is intended for "early failure"; for example,
`(1..var).to_a` causes out of memory if `var` is inadvertently nil.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63646 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
The current VM_INSTRUCTION_SIZE is 198, so the linear search
painful during a major GC phase.
I noticed rb_vm_insn_addr2insn2 showing up at the top of some
profiles while working on some malloc-related stuff, so I
decided to attack it.
Most notably, the benchmark/bm_vm3_gc.rb improves by over 40%:
https://80x24.org/spew/20180602220554.GA9991@whir/raw
[ruby-core:87361] [Feature #14814]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63594 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c, iseq.c: extract body and param.keyword in iseq as
local variables.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63441 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_dump_iseq_each): Fix a range of a conditional.
`positions` is only used when VM_INSN_INFO_TABLE_IMPL is 2.
And always `dump_body` is expected to be initialized by
`iseq->body`. For example, `dump_body->insns_info.size` is
used in `ibf_dump_insns_info_positions`.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63413 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Don't abuse struct RString to hold arbitrary memory region.
Raw pointer should just suffice.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63368 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
which has throw insn, not only ancestor iseqs of it.
I think we should remove catch_except_p flag and try to simplify the
catch table itself, to prevent similar bugs in the future.
test_jit.rb: add test to prevent the bug
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63320 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (iseq_peephole_optimize): copy not only `leave`, with
a non-operand instruction, which are not longer than `jump`.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63248 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (iseq_peephole_optimize): more eliminatable
instructions before `pop` without side effects.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63246 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* insns.def (checktype): split branchiftype to checktype and
branchif, to make branch condition negation possible.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63225 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This commit adds write barriers for objects marked from `rb_iseq_mark`.
r62851 introduced direct marking from iseqs to:
* keyword arg default values
* catch table iseqs
* VALUEs embedded in encoded instructions
This patch adds missing write barrier calls to those references.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63147 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_load_iseq_complete): use alternate hexadecimal
form for offset.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63111 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_load_iseq_each): iseq_size necessary to encode
positions is set in ibf_load_code(). [Bug #14660]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63103 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_dump_iseq_each): ensure succ_index_table pointer
field to be 0.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63102 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (IBF_ZERO): clear padding of struct not to include
garbages in dumped binary data.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63101 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_dump_iseq_each): do not dump succ_index_table
pointer. positions are dumped as integer arrays. pointer
values are meaningless outside the process.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63099 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_dump_align): fill padding with zero, instead of
resizing only, not to leave garbages.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63098 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (compile_if): branch to end_label is not used if
else_seq is not used.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63050 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (compile_if): rewind callinfo indexes used in
unreachable paths, to get rid of dumping unused callinfos.
[ruby-core:86399] [Bug #14553]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63040 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (add_insn_info, add_adjust_info): split for each
list->type, to remove unnecessary repeated conditions and casts.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62908 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_load_iseq_each): manage iseq_size to point loaded
objects in iseq_encoded. now marking iseq scans iseq_encoded
directly.
* test/ruby/test_iseq.rb (test_to_binary_with_objects): skip for
now, but fix argument order of assert_equal.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62856 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
We need to mark default values for kwarg methods. This also fixes
Bootsnap. IBF iseq loading needed to mark iseqs as "having markable
objects".
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62851 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (iseq_set_arguments): determine argument variable
indexes by the order, not by just IDs. arguments begin with `_`
can be duplicate, so by-ID index may result in a wrong value.
[ruby-core:86159] [Bug #14611]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62833 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_dump_object_object): fix a probable typo in the
function name, s/lbf/ibf/.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62807 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_dump_align): resize the dump buffer.
rb_str_modify_expand expands the buffer but not set the length.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62796 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_dump_align): expand the buffer for alignment.
* compile.c (ibf_dump_iseq_list, ibf_dump_object_list): align as
ibf_offset_t. not all processors do not allow unaligned word,
or larger, access.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62791 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This change assumes that continuously reading `parent_iseq` from block
ISeq would reach non-block ISeq finally.
test/ruby/test_jit.rb: add test that catches 2-depth exception
Combination of r62654 and r62678 caused following error in this test.
-e:12:in `wrapper': Stack consistency error (sp: 14, bp: 13) (fatal)
== disasm: #<ISeq:wrapper@-e:10 (10,0)-(12,3)> (catch: FALSE)===========
local table (size: 2, argc: 2 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] paths<Arg> [ 1] prefixes<Arg>
0000 putself ( 11)[LiCa]
0001 getlocal_WC_0 paths
0003 getlocal_WC_0 prefixes
0005 opt_send_without_block <callinfo!mid:catch_true, argc:2, FCALL|ARGS_SIMPLE>, <callcache>
0008 leave ( 12)[Re]
As you can see, it says `catch: FALSE`, but obviously it catches
exception raised from `return path`.
As of r62655, it was kind of intentional because I only cared about
expiration of JIT-ed frame and I've thought calling `vm_exec` is only
needed once for it. So r62654 was NOT actually checking if it may catch
exception.
But for r62678, obviously we should set catch_except_p=TRUE for all
ISeqs which may catch exception. Otherwise catch table lookup would
fail.
With this bugfix, code generated by r62655 might be worse, but at least
while loop can be marked as `catch: FALSE` as expected.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62717 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Directly marking iseq operands allows us to eliminate the "mark array"
stored on ISEQ objects, which will reduce the amount of memory ISEQ
objects consume. This patch changes the iseq mark function to:
* Directly marks ISEQ operands
* Iterate over and mark child ISEQs
It also introduces two flags on the ISEQ object. In order to mark
instruction operands, we have to disassemble the instructions and find
the instruction parameters and types. Instructions may also be
translated to jump addresses. Instruction sequences may get marked by
the GC *while* they're mid flight (being compiled). The
`ISEQ_TRANSLATED` flag is used to indicate whether or not the
instructions have been translated to jump addresses so that when we
decode the instructions we know whether or not we need to go from jump
location back to original instruction or not.
Not all ISEQ objects have any markable objects embedded in their
instructions. We can detect whether or not an ISEQ has markable objects
in the instructions at compile time. If the instructions contain
markable objects, we set a flag `ISEQ_MARKABLE_ISEQ` on the ISEQ object.
This means that during the mark phase, we can skip decompilation if the
flag is *not* set. In other words, we can avoid decompilation of we
know in advance there is nothing to mark.
`once` instructions have an operand that contains the result of a
one-time compilation of a regex. Before this patch, that operand was
called an "inline cache", even though the struct was actually an "inline
storage". This patch changes the operand to be an "inline storage" so
that we can differentiate between caches that need marking (the inline
storage) and caches that don't need marking (inline cache).
[ruby-core:84909]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62706 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
to be used for MJIT's optimization. It's not used for optimization
in this commit yet.
vm_core.h: added catch_except_p field.
iseq.c: show the flag in ISeq disasm for debugging.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62654 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_load_object_unsupported, ibf_load_object_class):
should raise an exception. rejection of invalid input is not a
bug.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62622 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* compile.c (ibf_dump_object_regexp): do not truncate VALUE to
long. it makes invalid VALUE on IL32LLP64 platforms where long
is shorter than VALUE.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62621 b2dd03c8-39d4-4d8f-98ff-823fe69b080e