Previously, the following crashes with
`CFLAGS=-DRGENGC_CHECK_MODE=2 -DRUBY_DEBUG=1 -fno-inline`:
$ ./miniruby -e 'GC.stress = true; Marshal.dump({})'
It crashes with a write barrier (WB) miss assertion on an edge from the
`Hash` class object to a newly allocated negative method entry.
This is due to usages of vm_ccs_create() running the WB too early,
before the method entry is inserted into the cc table, so before the
reference edge is established. The insertion can trigger GC and promote
the class object, so running the WB after the insertion is necessary.
Move the insertion into vm_ccs_create() and run the WB after the
insertion.
Discovered on CI:
http://ci.rvm.jp/results/trunk-asserts@ruby-sp2-docker/4391770
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects. Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness"). Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree. Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.
For example:
```ruby
class Foo
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
class Bar
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```
Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.
This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.
This commit also adds some methods for debugging shapes on objects. See
`RubyVM::Shape` for more details.
For more context on Object Shapes, see [Feature: #18776]
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects. Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness"). Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree. Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.
For example:
```ruby
class Foo
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
class Bar
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```
Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.
This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.
This commit also adds some methods for debugging shapes on objects. See
`RubyVM::Shape` for more details.
For more context on Object Shapes, see [Feature: #18776]
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
Use ISEQ_BODY macro to get the rb_iseq_constant_body of the ISeq. Using
this macro will make it easier for us to change the allocation strategy
of rb_iseq_constant_body when using Variable Width Allocation.
* Lazily create singletons on instance_{exec,eval}
Previously when instance_exec or instance_eval was called on an object,
that object would be given a singleton class so that method
definitions inside the block would be added to the object rather than
its class.
This commit aims to improve performance by delaying the creation of the
singleton class unless/until one is needed for method definition. Most
of the time instance_eval is used without any method definition.
This was implemented by adding a flag to the cref indicating that it
represents a singleton of the object rather than a class itself. In this
case CREF_CLASS returns the object's existing class, but in cases that
we are defining a method (either via definemethod or
VM_SPECIAL_OBJECT_CBASE which is used for undef and alias).
This also happens to fix what I believe is a bug. Previously
instance_eval behaved differently with regards to constant access for
true/false/nil than for all other objects. I don't think this was
intentional.
String::Foo = "foo"
"".instance_eval("Foo") # => "foo"
Integer::Foo = "foo"
123.instance_eval("Foo") # => "foo"
TrueClass::Foo = "foo"
true.instance_eval("Foo") # NameError: uninitialized constant Foo
This also slightly changes the error message when trying to define a method
through instance_eval on an object which can't have a singleton class.
Before:
$ ruby -e '123.instance_eval { def foo; end }'
-e:1:in `block in <main>': no class/module to add method (TypeError)
After:
$ ./ruby -e '123.instance_eval { def foo; end }'
-e:1:in `block in <main>': can't define singleton (TypeError)
IMO this error is a small improvement on the original and better matches
the (both old and new) message when definging a method using `def self.`
$ ruby -e '123.instance_eval{ def self.foo; end }'
-e:1:in `block in <main>': can't define singleton (TypeError)
Co-authored-by: Matthew Draper <matthew@trebex.net>
* Remove "under" argument from yield_under
* Move CREF_SINGLETON_SET into vm_cref_new
* Simplify vm_get_const_base
* Fix leaf VM_SPECIAL_OBJECT_CONST_BASE
Co-authored-by: Matthew Draper <matthew@trebex.net>
Compare with the C methods, A built-in methods written in Ruby is
slower if only mandatory parameters are given because it needs to
check the argumens and fill default values for optional and keyword
parameters (C methods can check the number of parameters with `argc`,
so there are no overhead). Passing mandatory arguments are common
(optional arguments are exceptional, in many cases) so it is important
to provide the fast path for such common cases.
`Primitive.mandatory_only?` is a special builtin function used with
`if` expression like that:
```ruby
def self.at(time, subsec = false, unit = :microsecond, in: nil)
if Primitive.mandatory_only?
Primitive.time_s_at1(time)
else
Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in))
end
end
```
and it makes two ISeq,
```
def self.at(time, subsec = false, unit = :microsecond, in: nil)
Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in))
end
def self.at(time)
Primitive.time_s_at1(time)
end
```
and (2) is pointed by (1). Note that `Primitive.mandatory_only?`
should be used only in a condition of an `if` statement and the
`if` statement should be equal to the methdo body (you can not
put any expression before and after the `if` statement).
A method entry with `mandatory_only?` (`Time.at` on the above case)
is marked as `iseq_overload`. When the method will be dispatch only
with mandatory arguments (`Time.at(0)` for example), make another
method entry with ISeq (2) as mandatory only method entry and it
will be cached in an inline method cache.
The idea is similar discussed in https://bugs.ruby-lang.org/issues/16254
but it only checks mandatory parameters or more, because many cases
only mandatory parameters are given. If we find other cases (optional
or keyword parameters are used frequently and it hurts performance),
we can extend the feature.
In vm_call_method_each_type, check for c_call and c_return events before
dispatching to vm_call_ivar and vm_call_attrset. With this approach, the
call cache will still dispatch directly to those functions, so this
change will only decrease performance for the first (uncached) call, and
even then, the performance decrease is very minimal.
This approach requires that we clear the call caches when tracing is
enabled or disabled. The approach currently switches all vm_call_ivar
and vm_call_attrset call caches to vm_call_general any time tracing is
enabled or disabled. So it could theoretically result in a slowdown for
code that constantly enables or disables tracing.
This approach does not handle targeted tracepoints, but from my testing,
c_call and c_return events are not supported for targeted tracepoints,
so that shouldn't matter.
This includes a benchmark showing the performance decrease is minimal
if detectable at all.
Fixes [Bug #16383]
Fixes [Bug #10470]
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
This commit removes T_PAYLOAD since the new VWA implementation no longer
requires T_PAYLOAD types.
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
This commit removes T_PAYLOAD since the new VWA implementation no longer
requires T_PAYLOAD types.
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
This changes Thread::Location::Backtrace#absolute_path to return
nil for methods/procs defined in eval. If the realpath of an iseq
is nil, that indicates it was defined in eval, in which case you
cannot use RubyVM::AbstractSyntaxTree.of.
Fixes [Bug #16983]
Co-authored-by: Koichi Sasada <ko1@atdot.net>
"alias" type method entries can chain another aliased method
so that machine stack can be overflow on nested alias chain.
http://ci.rvm.jp/results/trunk-repeat20@phosphorus-docker/3344209
This patch fix this issue by use goto instead of recursion if possible.
TODO: Essentially, the alias method should not points another aliased
method entry. Try to fix it later.
rb_funcall* (rb_funcall(), rb_funcallv(), ...) functions invokes
Ruby's method with given receiver. Ruby 2.7 introduced inline method
cache with static memory area. However, Ruby 3.0 reimplemented the
method cache data structures and the inline cache was removed.
Without inline cache, rb_funcall* searched methods everytime.
Most of cases per-Class Method Cache (pCMC) will be helped but
pCMC requires VM-wide locking and it hurts performance on
multi-Ractor execution, especially all Ractors calls methods
with rb_funcall*.
This patch introduced Global Call-Cache Cache Table (gccct) for
rb_funcall*. Call-Cache was introduced from Ruby 3.0 to manage
method cache entry atomically and gccct enables method-caching
without VM-wide locking. This table solves the performance issue
on multi-ractor execution.
[Bug #17497]
Ruby-level method invocation does not use gccct because it has
inline-method-cache and the table size is limited. Basically
rb_funcall* is not used frequently, so 1023 entries can be enough.
We will revisit the table size if it is not enough.
add cc_found_in_ccs (renamed from cc_found_ccs), cc_not_found_in_ccs,
call0_public, call0_other debug counters to measure more details.
also it contains several modification.
`cd` is passed to method call functions to method invocation
functions, but `cd` can be manipulated by other ractors simultaneously
so it contains thread-safety issue.
To solve this issue, this patch stores `ci` and found `cc` to `calling`
and stops to pass `cd`.
Not every compilers understand that rb_raise does not return. When a
function does not end with a return statement, such compilers can issue
warnings. We would better tell them about reachabilities.
This patch contains several ideas:
(1) Disposable inline method cache (IMC) for race-free inline method cache
* Making call-cache (CC) as a RVALUE (GC target object) and allocate new
CC on cache miss.
* This technique allows race-free access from parallel processing
elements like RCU.
(2) Introduce per-Class method cache (pCMC)
* Instead of fixed-size global method cache (GMC), pCMC allows flexible
cache size.
* Caching CCs reduces CC allocation and allow sharing CC's fast-path
between same call-info (CI) call-sites.
(3) Invalidate an inline method cache by invalidating corresponding method
entries (MEs)
* Instead of using class serials, we set "invalidated" flag for method
entry itself to represent cache invalidation.
* Compare with using class serials, the impact of method modification
(add/overwrite/delete) is small.
* Updating class serials invalidate all method caches of the class and
sub-classes.
* Proposed approach only invalidate the method cache of only one ME.
See [Feature #16614] for more details.
Now, rb_call_info contains how to call the method with tuple of
(mid, orig_argc, flags, kwarg). Most of cases, kwarg == NULL and
mid+argc+flags only requires 64bits. So this patch packed
rb_call_info to VALUE (1 word) on such cases. If we can not
represent it in VALUE, then use imemo_callinfo which contains
conventional callinfo (rb_callinfo, renamed from rb_call_info).
iseq->body->ci_kw_size is removed because all of callinfo is VALUE
size (packed ci or a pointer to imemo_callinfo).
To access ci information, we need to use these functions:
vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci).
struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg.
rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc()
is temporary removed because cd->ci should be marked.
This removes the warning that was added in
3802fb92ff, and switches the behavior
so that the eval does not use the binding's __FILE__ and __LINE__
implicitly.
Fixes [Bug #4352]
This removes the warnings added in 2.7, and changes the behavior
so that a final positional hash is not treated as keywords or
vice-versa.
To handle the arg_setup_block splat case correctly with keyword
arguments, we need to check if we are taking a keyword hash.
That case didn't have a test, but it affects real-world code,
so add a test for it.
This removes rb_empty_keyword_given_p() and related code, as
that is not needed in Ruby 3. The empty keyword case is the
same as the no keyword case in Ruby 3.
This changes rb_scan_args to implement keyword argument
separation for C functions when the : character is used.
For backwards compatibility, it returns a duped hash.
This is a bad idea for performance, but not duping the hash
breaks at least Enumerator::ArithmeticSequence#inspect.
Instead of having RB_PASS_CALLED_KEYWORDS be a number,
simplify the code by just making it be rb_keyword_given_p().
Methods and their definitions can be allocated/deallocated on-the-fly.
One pathological situation is when a method is deallocated then another
one is allocated immediately after that. Address of those old/new method
entries/definitions can be the same then, depending on underlying
malloc/free implementation.
So pointer comparison is insufficient. We have to check the contents.
To do so we introduce def->method_serial, which is an integer unique to
that specific method definition.
PS: Note that method_serial being uintptr_t rather than rb_serial_t is
intentional. This is because rb_serial_t can be bigger than a pointer
on a 32bit system (rb_serial_t is at least 64bit). In order to preserve
old packing of struct rb_call_cache, rb_serial_t is inappropriate.
The equation shall hold for every call cache. However prior to this
changeset cc->me could be updated without also updating cc->def. Let's
make it sure by introducing new macro named CC_SET_ME which sets cc->me
and cc->def at once.
These functions are used from within a compilation unit so we can
make them static, for better binary size. This changeset reduces
the size of generated ruby binary from 26,590,128 bytes to
26,584,472 bytes on my macihne.
rb_eval_cmd takes a safe level, and now that $SAFE is deprecated,
it should be deprecated as well.
Replace with rb_eval_cmd_kw, which takes a keyword flag. Switch
the two callers to this function.
This removes the security features added by $SAFE = 1, and warns for access
or modification of $SAFE from Ruby-level, as well as warning when calling
all public C functions related to $SAFE.
This modifies some internal functions that took a safe level argument
to no longer take the argument.
rb_require_safe now warns, rb_require_string has been added as a
version that takes a VALUE and does not warn.
One public C function that still takes a safe level argument and that
this doesn't warn for is rb_eval_cmd. We may want to consider
adding an alternative method that does not take a safe level argument,
and warn for rb_eval_cmd.
Prior to this changeset, majority of inline cache mishits resulted
into the same method entry when rb_callable_method_entry() resolves
a method search. Let's not call the function at the first place on
such situations.
In doing so we extend the struct rb_call_cache from 44 bytes (in
case of 64 bit machine) to 64 bytes, and fill the gap with
secondary class serial(s). Call cache's class serials now behavies
as a LRU cache.
Calculating -------------------------------------
ours 2.7 2.6
vm2_poly_same_method 2.339M 1.744M 1.369M i/s - 6.000M times in 2.565086s 3.441329s 4.381386s
Comparison:
vm2_poly_same_method
ours: 2339103.0 i/s
2.7: 1743512.3 i/s - 1.34x slower
2.6: 1369429.8 i/s - 1.71x slower
A method call is often with `argc = 1` and `argv = &v` where v is a
VALUE, and some functions shift the arguments by `argc-1` and `argv+1`
(for example, rb_sym_proc_call). I'm unsure whether it is safe or not
to pass a pointer `argv+1` to memcpy with zero length, but Coverity Scan
complains it. So this attempts to suppress the warning by explicit
check of the length.
The parser needs to determine whether a local varaiable is defined or
not in outer scope. For the sake, "base_block" field has kept the outer
block.
However, the whole block was actually unneeded; the parser used only
base_block->iseq.
So, this change lets parser_params have the iseq directly, instead of
the whole block.
If the keyword flag is set, there should be at least one argument,
if there isn't, that is a sign the keyword flag was passed when it
should not have been.
This adds rb_funcall_passing_block_kw, rb_funcallv_public_kw,
and rb_yield_splat_kw. This functions are necessary to easily
handle cases where rb_funcall_passing_block, rb_funcallv_public,
and rb_yield_splat are currently used and a keyword argument
separation warning is raised.
In general RB_PASS_CALLED_KEYWORDS should only be set if we are
sure the arguments passed come directly from Ruby. For direct calls
to these C functions, we should not assume that keywords are passed.
Add static *_internal versions of these functions that
Kernel#instance_{eval,exec} and Module#{class,module}_{eval,exec}
call that set RB_PASS_CALLED_KEYWORDS.
Also, change struct.c back to calling rb_mod_module_eval, now that
the call is safe.
This fixes instance_exec and similar methods. It also fixes
Enumerator::Yielder#yield, rb_yield_block, and a couple of cases
with Proc#{<<,>>}.
This support requires the addition of rb_yield_values_kw, similar to
rb_yield_values2, for passing the keyword flag.
Unlike earlier attempts at this, this does not modify the rb_block_call_func
type or add a separate function type. The functions of type
rb_block_call_func are called by Ruby with a separate VM frame, and we can
get the keyword flag information from the VM frame flags, so it doesn't need
to be passed as a function argument.
These changes require the following VM functions accept a keyword flag:
* vm_yield_with_cref
* vm_yield
* vm_yield_with_block
rb_vm_call_kw handles the tmp buffer for you.
Also, change method_missing so it also calls rb_vm_call_kw to
handle the kw_splat flag, instead of requiring callers to handle
kw_splat flag before calling method_missing. This may fix other
cases where method_missing is currently called without the kw_splat
being handled.
When Object#to_enum is passed a block, the block is called to get
a size with the arguments given to to_enum. This calls the block
with the same keyword flag as to_enum is called with.
This requires adding rb_check_funcall_kw and
rb_check_funcall_default_kw to handle keyword flags.
If defined in Ruby, dig would be defined as def dig(arg, *rest) end,
it would not use keywords. If the last dig argument was an empty
hash, it could be treated as keyword arguments by the next dig
method. Allow dig to pass along the empty keyword flag if called
with an empty keyword, to suppress the previous behavior and force
treating the hash as a positional argument and not keywords.
Also handle the case where dig calls method_missing, passing the
empty keyword flag to that as well.
This requires adding rb_check_funcall_with_hook_kw functions, so
that dig can specify how arguments are treated. It also adds
kw_splat arguments to a couple static functions.
rb_vm_call0 allocates its own struct call_info etc. But they are
already there in case of rb_funcallv_with_cc. Let's just pass the
existing ones, instead of re-creation.
Make sure that vm_yield_with_cfunc can correctly set the empty keyword
flag by passing 2 as the kw_splat value when calling it in
vm_invoke_ifunc_block. Make sure calling.kw_splat is set to 1 and not
128 in vm_sendish, so we can safely check for different kw_splat values.
vm_args.c needs to call add_empty_keyword, and to make JIT happy, the
function needs to be exported. Rename the function to
rb_adjust_argv_kw_splat to more accurately reflect what it does, and
mark it as MJIT exported.
This makes method_missing take a flag for whether keyword arguments
were passed.
Adds tests both for rb_call_super_kw usage as well as general usage
of super calling method_missing in Ruby methods.
This should only happen if the API is misused. It's much better
to warn here and fix the problem, versus to try to debug TypeErrors
or segfaults later.
nagachika pointed out that ALLOC_N is actually just malloc, so
this memory wasn't being freed. This shouldn't be a performance
sensitive code path, and will be going away after 2.7, so just
allocate a temp buffer that will be freed later by Ruby GC.
It is not safe to set this in C functions that can be called from
other C functions, as in the non argument-delegation case, you
can end up calling a Ruby method with a flag indicating keywords
are set without passing keywords.
Introduce some new *_kw functions that take a kw_splat flag and
use these functions to set RB_PASS_CALLED_KEYWORDS in places where
we know we are delegating methods (e.g. Class#new, Method#call)
Remove rb_add_empty_keyword, and instead of calling that every
place you need to add empty keyword hashes, run that code in
a single static function in vm_eval.c.
Add 4 defines to include/ruby/ruby.h, these are to be used as
int kw_splat values when calling the various rb_*_kw functions:
RB_NO_KEYWORDS :: Do not pass keywords
RB_PASS_KEYWORDS :: Pass final argument (which should be hash) as keywords
RB_PASS_EMPTY_KEYWORDS :: Add an empty hash to arguments and pass as keywords
RB_PASS_CALLED_KEYWORDS :: Passes same keyword type as current method was
called with (for method delegation)
rb_empty_keyword_given_p needs to stay. It is required if argument
delegation is done but delayed to a later point, which Enumerator
does.
Use RB_PASS_CALLED_KEYWORDS in rb_call_super to correctly
delegate keyword arguments to super method.
This sets the correct VM frame flags when using Method#call to
call funcs, and handles empty keyword hashes for cfuncs,
attr_reader, and attr_writer. It also fixes calls to send through
Method#call. It adds tests for all of those, as well as tests for
using Method#call to call define_method, lambda, and sym_procs
(which didn't require code changes).