rb_funcall* (rb_funcall(), rb_funcallv(), ...) functions invokes
Ruby's method with given receiver. Ruby 2.7 introduced inline method
cache with static memory area. However, Ruby 3.0 reimplemented the
method cache data structures and the inline cache was removed.
Without inline cache, rb_funcall* searched methods everytime.
Most of cases per-Class Method Cache (pCMC) will be helped but
pCMC requires VM-wide locking and it hurts performance on
multi-Ractor execution, especially all Ractors calls methods
with rb_funcall*.
This patch introduced Global Call-Cache Cache Table (gccct) for
rb_funcall*. Call-Cache was introduced from Ruby 3.0 to manage
method cache entry atomically and gccct enables method-caching
without VM-wide locking. This table solves the performance issue
on multi-ractor execution.
[Bug #17497]
Ruby-level method invocation does not use gccct because it has
inline-method-cache and the table size is limited. Basically
rb_funcall* is not used frequently, so 1023 entries can be enough.
We will revisit the table size if it is not enough.
add cc_found_in_ccs (renamed from cc_found_ccs), cc_not_found_in_ccs,
call0_public, call0_other debug counters to measure more details.
also it contains several modification.
`cd` is passed to method call functions to method invocation
functions, but `cd` can be manipulated by other ractors simultaneously
so it contains thread-safety issue.
To solve this issue, this patch stores `ci` and found `cc` to `calling`
and stops to pass `cd`.
Not every compilers understand that rb_raise does not return. When a
function does not end with a return statement, such compilers can issue
warnings. We would better tell them about reachabilities.
This patch contains several ideas:
(1) Disposable inline method cache (IMC) for race-free inline method cache
* Making call-cache (CC) as a RVALUE (GC target object) and allocate new
CC on cache miss.
* This technique allows race-free access from parallel processing
elements like RCU.
(2) Introduce per-Class method cache (pCMC)
* Instead of fixed-size global method cache (GMC), pCMC allows flexible
cache size.
* Caching CCs reduces CC allocation and allow sharing CC's fast-path
between same call-info (CI) call-sites.
(3) Invalidate an inline method cache by invalidating corresponding method
entries (MEs)
* Instead of using class serials, we set "invalidated" flag for method
entry itself to represent cache invalidation.
* Compare with using class serials, the impact of method modification
(add/overwrite/delete) is small.
* Updating class serials invalidate all method caches of the class and
sub-classes.
* Proposed approach only invalidate the method cache of only one ME.
See [Feature #16614] for more details.
Now, rb_call_info contains how to call the method with tuple of
(mid, orig_argc, flags, kwarg). Most of cases, kwarg == NULL and
mid+argc+flags only requires 64bits. So this patch packed
rb_call_info to VALUE (1 word) on such cases. If we can not
represent it in VALUE, then use imemo_callinfo which contains
conventional callinfo (rb_callinfo, renamed from rb_call_info).
iseq->body->ci_kw_size is removed because all of callinfo is VALUE
size (packed ci or a pointer to imemo_callinfo).
To access ci information, we need to use these functions:
vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci).
struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg.
rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc()
is temporary removed because cd->ci should be marked.
This removes the warning that was added in
3802fb92ff, and switches the behavior
so that the eval does not use the binding's __FILE__ and __LINE__
implicitly.
Fixes [Bug #4352]
This removes the warnings added in 2.7, and changes the behavior
so that a final positional hash is not treated as keywords or
vice-versa.
To handle the arg_setup_block splat case correctly with keyword
arguments, we need to check if we are taking a keyword hash.
That case didn't have a test, but it affects real-world code,
so add a test for it.
This removes rb_empty_keyword_given_p() and related code, as
that is not needed in Ruby 3. The empty keyword case is the
same as the no keyword case in Ruby 3.
This changes rb_scan_args to implement keyword argument
separation for C functions when the : character is used.
For backwards compatibility, it returns a duped hash.
This is a bad idea for performance, but not duping the hash
breaks at least Enumerator::ArithmeticSequence#inspect.
Instead of having RB_PASS_CALLED_KEYWORDS be a number,
simplify the code by just making it be rb_keyword_given_p().
Methods and their definitions can be allocated/deallocated on-the-fly.
One pathological situation is when a method is deallocated then another
one is allocated immediately after that. Address of those old/new method
entries/definitions can be the same then, depending on underlying
malloc/free implementation.
So pointer comparison is insufficient. We have to check the contents.
To do so we introduce def->method_serial, which is an integer unique to
that specific method definition.
PS: Note that method_serial being uintptr_t rather than rb_serial_t is
intentional. This is because rb_serial_t can be bigger than a pointer
on a 32bit system (rb_serial_t is at least 64bit). In order to preserve
old packing of struct rb_call_cache, rb_serial_t is inappropriate.
The equation shall hold for every call cache. However prior to this
changeset cc->me could be updated without also updating cc->def. Let's
make it sure by introducing new macro named CC_SET_ME which sets cc->me
and cc->def at once.
These functions are used from within a compilation unit so we can
make them static, for better binary size. This changeset reduces
the size of generated ruby binary from 26,590,128 bytes to
26,584,472 bytes on my macihne.
rb_eval_cmd takes a safe level, and now that $SAFE is deprecated,
it should be deprecated as well.
Replace with rb_eval_cmd_kw, which takes a keyword flag. Switch
the two callers to this function.
This removes the security features added by $SAFE = 1, and warns for access
or modification of $SAFE from Ruby-level, as well as warning when calling
all public C functions related to $SAFE.
This modifies some internal functions that took a safe level argument
to no longer take the argument.
rb_require_safe now warns, rb_require_string has been added as a
version that takes a VALUE and does not warn.
One public C function that still takes a safe level argument and that
this doesn't warn for is rb_eval_cmd. We may want to consider
adding an alternative method that does not take a safe level argument,
and warn for rb_eval_cmd.
Prior to this changeset, majority of inline cache mishits resulted
into the same method entry when rb_callable_method_entry() resolves
a method search. Let's not call the function at the first place on
such situations.
In doing so we extend the struct rb_call_cache from 44 bytes (in
case of 64 bit machine) to 64 bytes, and fill the gap with
secondary class serial(s). Call cache's class serials now behavies
as a LRU cache.
Calculating -------------------------------------
ours 2.7 2.6
vm2_poly_same_method 2.339M 1.744M 1.369M i/s - 6.000M times in 2.565086s 3.441329s 4.381386s
Comparison:
vm2_poly_same_method
ours: 2339103.0 i/s
2.7: 1743512.3 i/s - 1.34x slower
2.6: 1369429.8 i/s - 1.71x slower
A method call is often with `argc = 1` and `argv = &v` where v is a
VALUE, and some functions shift the arguments by `argc-1` and `argv+1`
(for example, rb_sym_proc_call). I'm unsure whether it is safe or not
to pass a pointer `argv+1` to memcpy with zero length, but Coverity Scan
complains it. So this attempts to suppress the warning by explicit
check of the length.
The parser needs to determine whether a local varaiable is defined or
not in outer scope. For the sake, "base_block" field has kept the outer
block.
However, the whole block was actually unneeded; the parser used only
base_block->iseq.
So, this change lets parser_params have the iseq directly, instead of
the whole block.
If the keyword flag is set, there should be at least one argument,
if there isn't, that is a sign the keyword flag was passed when it
should not have been.
This adds rb_funcall_passing_block_kw, rb_funcallv_public_kw,
and rb_yield_splat_kw. This functions are necessary to easily
handle cases where rb_funcall_passing_block, rb_funcallv_public,
and rb_yield_splat are currently used and a keyword argument
separation warning is raised.
In general RB_PASS_CALLED_KEYWORDS should only be set if we are
sure the arguments passed come directly from Ruby. For direct calls
to these C functions, we should not assume that keywords are passed.
Add static *_internal versions of these functions that
Kernel#instance_{eval,exec} and Module#{class,module}_{eval,exec}
call that set RB_PASS_CALLED_KEYWORDS.
Also, change struct.c back to calling rb_mod_module_eval, now that
the call is safe.
This fixes instance_exec and similar methods. It also fixes
Enumerator::Yielder#yield, rb_yield_block, and a couple of cases
with Proc#{<<,>>}.
This support requires the addition of rb_yield_values_kw, similar to
rb_yield_values2, for passing the keyword flag.
Unlike earlier attempts at this, this does not modify the rb_block_call_func
type or add a separate function type. The functions of type
rb_block_call_func are called by Ruby with a separate VM frame, and we can
get the keyword flag information from the VM frame flags, so it doesn't need
to be passed as a function argument.
These changes require the following VM functions accept a keyword flag:
* vm_yield_with_cref
* vm_yield
* vm_yield_with_block
rb_vm_call_kw handles the tmp buffer for you.
Also, change method_missing so it also calls rb_vm_call_kw to
handle the kw_splat flag, instead of requiring callers to handle
kw_splat flag before calling method_missing. This may fix other
cases where method_missing is currently called without the kw_splat
being handled.
When Object#to_enum is passed a block, the block is called to get
a size with the arguments given to to_enum. This calls the block
with the same keyword flag as to_enum is called with.
This requires adding rb_check_funcall_kw and
rb_check_funcall_default_kw to handle keyword flags.
If defined in Ruby, dig would be defined as def dig(arg, *rest) end,
it would not use keywords. If the last dig argument was an empty
hash, it could be treated as keyword arguments by the next dig
method. Allow dig to pass along the empty keyword flag if called
with an empty keyword, to suppress the previous behavior and force
treating the hash as a positional argument and not keywords.
Also handle the case where dig calls method_missing, passing the
empty keyword flag to that as well.
This requires adding rb_check_funcall_with_hook_kw functions, so
that dig can specify how arguments are treated. It also adds
kw_splat arguments to a couple static functions.
rb_vm_call0 allocates its own struct call_info etc. But they are
already there in case of rb_funcallv_with_cc. Let's just pass the
existing ones, instead of re-creation.
Make sure that vm_yield_with_cfunc can correctly set the empty keyword
flag by passing 2 as the kw_splat value when calling it in
vm_invoke_ifunc_block. Make sure calling.kw_splat is set to 1 and not
128 in vm_sendish, so we can safely check for different kw_splat values.
vm_args.c needs to call add_empty_keyword, and to make JIT happy, the
function needs to be exported. Rename the function to
rb_adjust_argv_kw_splat to more accurately reflect what it does, and
mark it as MJIT exported.
This makes method_missing take a flag for whether keyword arguments
were passed.
Adds tests both for rb_call_super_kw usage as well as general usage
of super calling method_missing in Ruby methods.