Commit graph

549 commits

Author SHA1 Message Date
Jeremy Evans d2c41b1bff Reduce allocations for keyword argument hashes
Previously, passing a keyword splat to a method always allocated
a hash on the caller side, and accepting arbitrary keywords in
a method allocated a separate hash on the callee side.  Passing
explicit keywords to a method that accepted a keyword splat
did not allocate a hash on the caller side, but resulted in two
hashes allocated on the callee side.

This commit makes passing a single keyword splat to a method not
allocate a hash on the caller side.  Passing multiple keyword
splats or a mix of explicit keywords and a keyword splat still
generates a hash on the caller side.  On the callee side,
if arbitrary keywords are not accepted, it does not allocate a
hash.  If arbitrary keywords are accepted, it will allocate a
hash, but this commit uses a callinfo flag to indicate whether
the caller already allocated a hash, and if so, the callee can
use the passed hash without duplicating it.  So this commit
should make it so that a maximum of a single hash is allocated
during method calls.
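
One rough way to observe the allocation behaviour described above is to compare `GC.stat(:total_allocated_objects)` before and after a call. This is a minimal sketch, not part of the commit: the helper name `allocations` is hypothetical, and the counts it prints are approximate and depend on the Ruby version and build.

```ruby
# Count objects allocated while the block runs (approximate).
# `allocations` is a hypothetical helper, not part of Ruby or of this commit.
def allocations
  GC.disable
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable
end

def kw(a: 1) a end       # accepts one explicit keyword
def kws(**kw) kw end     # accepts arbitrary keywords

h = {a: 1}

report = {
  "kw(**h)"  => allocations { kw(**h) },
  "kws(**h)" => allocations { kws(**h) },
}
report.each { |call, count| puts "#{call}: #{count} objects" }
```

Comparing the numbers across Ruby builds from before and after this change should show whether the caller-side hash is still allocated.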

To set the callinfo flag appropriately, method call argument
compilation checks if only a single keyword splat is given.
If only one keyword splat is given, the VM_CALL_KW_SPLAT_MUT
callinfo flag is not set, since in that case the keyword
splat is passed directly and not mutable.  If more than one
splat is used, a new hash needs to be generated on the caller
side, and in that case the callinfo flag is set, indicating
the keyword splat is mutable by the callee.

In compile_hash, used for both hash and keyword argument
compilation, if compiling keyword arguments and only a
single keyword splat is used, pass the argument directly.

On the caller side, in vm_args.c, the callinfo flag needs to
be recognized and handled.  Because the keyword splat
argument may not be a hash, it needs to be converted to a
hash first if not.  Then, unless the callinfo flag is set,
the hash needs to be duplicated.  The temporary copy of the
callinfo flag, kw_flag, is updated if a hash was duplicated,
to prevent the need to duplicate it again.  If we are
converting to a hash or duplicating a hash, we need to update
the argument array, which can include duplicating the
positional splat array if one was passed.  CALLER_SETUP_ARG
and a couple of other places need to be modified to handle
similar issues for other types of calls.

This includes fairly comprehensive tests for different ways
keywords are handled internally, checking that you get equal
results but that keyword splats on the caller side result in
distinct objects for keyword rest parameters.
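
The property those tests check can be illustrated at the Ruby level. This is only a small sketch of the observable behaviour, not the test code from the commit:

```ruby
def kws(**kw) kw end

h = {a: 1}
received = kws(**h)

same_contents = (received == h)      # the callee sees equal keywords
same_object   = received.equal?(h)   # but it gets its own hash object

received[:b] = 2                     # mutating the callee's hash...
caller_untouched = (h == {a: 1})     # ...does not affect the caller's hash

puts "equal: #{same_contents}, identical: #{same_object}, caller untouched: #{caller_untouched}"
```

This is why the callee may reuse a caller-allocated hash only when the callinfo flag says that hash is mutable.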

Included are benchmarks for keyword argument calls.
Brief results when compiled without optimization:

  def kw(a: 1) a end
  def kws(**kw) kw end
  h = {a: 1}

  kw(a: 1)       # about same
  kw(**h)        # 2.37x faster
  kws(a: 1)      # 1.30x faster
  kws(**h)       # 2.19x faster
  kw(a: 1, **h)  # 1.03x slower
  kw(**h, **h)   # about same
  kws(a: 1, **h) # 1.16x faster
  kws(**h, **h)  # 1.14x faster
2020-03-17 12:09:43 -07:00
Takashi Kokubun 8562bfd150
Move code to mark jit_unit's cc_entries to mjit.c 2020-03-12 22:21:32 -07:00
Takashi Kokubun da4b97a0e3
Pin and inline cme in JIT-ed method calls
```
$ benchmark-driver benchmark.yml -v --rbenv 'before --jit;after --jit' --repeat-count=12 --output=all
before --jit: ruby 2.8.0dev (2020-03-11T07:43:12Z master e89ebdcb87) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-03-11T07:54:18Z master 143776a0da) +JIT [x86_64-linux]
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    73.86976729561439     77.20184819316513 fps
                            74.46997176460742     78.43493030231805
                            77.59686308754307     78.55714131655935
                            78.53693921126656     79.08984255596820
                            80.10158944910573     79.17751731838183
                            80.12254974411167     79.60853122429181
                            80.28678655204945     79.74674066871896
                            80.38690681095379     79.90624544440300
                            80.79223498756919     80.57881084206193
                            80.82857188422419     80.70677614429169
                            81.06447745878245     81.03868541295149
                            81.21620802278490     82.16354660940607
```
2020-03-11 00:59:34 -07:00
Takashi Kokubun 9511b4c8fa
Optimize away call data refs in JIT-ed method calls
According to ko1, `cd->cc != cc` was a guard for GC.compact.
As we pin cc by rb_gc_mark(), we don't need the check.

```
$ benchmark-driver benchmark.yml -v --rbenv 'before --jit;after --jit' --repeat-count=12 --output=all
before --jit: ruby 2.8.0dev (2020-03-11T05:36:48Z master da6948753e) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-03-11T06:26:34Z master 36b20b8b4a) +JIT [x86_64-linux]
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    74.03480698689405     71.63404803273507 fps
                            74.15085286586992     73.43923328104295
                            75.51738277744781     75.75465268365384
                            76.24922600109410     76.74071607861318
                            76.45513422802325     77.47521029238116
                            76.86617230739330     78.14759496269018
                            77.71509137131933     79.14051571125866
                            77.72839157096146     79.35884822673313
                            78.25218904561633     79.92538876408051
                            78.72521071333249     79.98075556706726
                            78.79950460165091     80.51747831497875
                            79.43884960720381     80.97973166525254
```
2020-03-10 23:29:50 -07:00
Takashi Kokubun 33b78b89ac
Eliminate unnecessary mjit_iseq_cc_entries calls
just in case.
2020-02-26 00:34:02 -08:00
Takashi Kokubun 69f377a3d6
Internalize rb_mjit_unit definition again
Fixed a TODO in b9007b6c54
2020-02-26 00:27:29 -08:00
Koichi Sasada 84d1a99a3f jit_unit->cc_entries should be initialized.
GC can be invoked just after jit_unit->cc_entries is allocated, so
it should be zero-cleared.
2020-02-25 13:37:52 +09:00
Koichi Sasada f744d80106 check USE_MJIT
iseq->body->jit_unit is not available if USE_MJIT == 0.
2020-02-22 11:54:19 +09:00
Koichi Sasada b9007b6c54 Introduce disposable call-cache.
This patch contains several ideas:

(1) Disposable inline method cache (IMC) for race-free inline method cache
    * Make the call-cache (CC) an RVALUE (a GC target object) and allocate a
      new CC on cache miss.
    * This technique allows race-free access from parallel processing
      elements, like RCU.
(2) Introduce per-Class method cache (pCMC)
    * Instead of a fixed-size global method cache (GMC), pCMC allows a
      flexible cache size.
    * Caching CCs reduces CC allocation and allows sharing a CC's fast path
      between call sites with the same call-info (CI).
(3) Invalidate an inline method cache by invalidating corresponding method
    entries (MEs)
    * Instead of using class serials, we set an "invalidated" flag on the
      method entry itself to represent cache invalidation.
    * Compared with using class serials, the impact of method modification
      (add/overwrite/delete) is small.
    * Updating class serials invalidates all method caches of the class and
      its subclasses.
    * The proposed approach invalidates the method cache of only one ME.

See [Feature #16614] for more details.
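
Idea (3) can be pictured with a toy model in Ruby. Everything here (`MethodEntry`, `ToyClass`, `CallSite`) is hypothetical and does not mirror CRuby's actual structs; it only sketches how flagging a single method entry leaves unrelated caches intact.

```ruby
# Toy model only: hypothetical classes, not CRuby's data structures.
MethodEntry = Struct.new(:name, :body, :invalidated)

class ToyClass
  def initialize
    @methods = {}
  end

  # Redefining a method flags the old entry instead of bumping a class-wide serial.
  def define(name, &body)
    old = @methods[name]
    old.invalidated = true if old
    @methods[name] = MethodEntry.new(name, body, false)
  end

  def lookup(name)
    @methods.fetch(name)
  end
end

class CallSite
  attr_reader :misses

  def initialize(name)
    @name = name
    @cache = nil
    @misses = 0
  end

  def call(klass, *args)
    if @cache.nil? || @cache.invalidated
      @misses += 1
      @cache = klass.lookup(@name)   # cache miss: fetch a fresh entry
    end
    @cache.body.call(*args)
  end
end

klass = ToyClass.new
klass.define(:greet) { "hello" }
klass.define(:other) { 1 }

greet_site = CallSite.new(:greet)
other_site = CallSite.new(:other)
greet_site.call(klass)
other_site.call(klass)

klass.define(:greet) { "hi" }        # invalidates only the :greet entry
greeting = greet_site.call(klass)    # misses again and picks up the new body
other_site.call(klass)               # still served from its cache

puts "greeting=#{greeting} greet misses=#{greet_site.misses} other misses=#{other_site.misses}"
```

With a class-serial scheme, the redefinition would have forced `other_site` to miss as well.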
2020-02-22 09:58:59 +09:00
Koichi Sasada f2286925f0 VALUE size packed callinfo (ci).
Now, rb_call_info describes how to call a method with a tuple of
(mid, orig_argc, flags, kwarg). In most cases kwarg == NULL, and
mid+argc+flags requires only 64 bits. So this patch packs
rb_call_info into a VALUE (1 word) in such cases. If we cannot
represent it in a VALUE, we use imemo_callinfo, which contains the
conventional callinfo (rb_callinfo, renamed from rb_call_info).
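
The packing idea can be sketched with plain integer arithmetic. The bit widths and the tag bit below are assumptions chosen for illustration, not a statement of CRuby's real layout, and the function names are hypothetical.

```ruby
# Assumed layout (illustrative only): 1 tag bit, 16 flag bits, 15 argc bits, 32 mid bits.
TAG_BITS   = 1
FLAG_BITS  = 16
ARGC_BITS  = 15
MID_BITS   = 32
FLAG_SHIFT = TAG_BITS
ARGC_SHIFT = FLAG_SHIFT + FLAG_BITS
MID_SHIFT  = ARGC_SHIFT + ARGC_BITS

# A call info fits in one word only without kwarg and with small enough fields.
def packable?(mid, flag, argc, kwarg)
  kwarg.nil? &&
    mid  < (1 << MID_BITS) &&
    flag < (1 << FLAG_BITS) &&
    argc < (1 << ARGC_BITS)
end

# Pack the fields into one integer; the low tag bit marks it as "packed".
def pack_ci(mid, flag, argc)
  (mid << MID_SHIFT) | (argc << ARGC_SHIFT) | (flag << FLAG_SHIFT) | 1
end

def ci_mid(ci)
  (ci >> MID_SHIFT) & ((1 << MID_BITS) - 1)
end

def ci_argc(ci)
  (ci >> ARGC_SHIFT) & ((1 << ARGC_BITS) - 1)
end

def ci_flag(ci)
  (ci >> FLAG_SHIFT) & ((1 << FLAG_BITS) - 1)
end

ci = pack_ci(1234, 0x14, 3)
puts format("packed ci = 0x%016x", ci)
puts "mid=#{ci_mid(ci)} argc=#{ci_argc(ci)} flag=#{ci_flag(ci)}"
```

When `packable?` is false (for example, when kwarg is present), the sketch would fall back to a separately allocated record, which is the role imemo_callinfo plays in the commit.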

iseq->body->ci_kw_size is removed because every callinfo is now VALUE
sized (a packed ci or a pointer to imemo_callinfo).

To access ci information, we need to use these functions:
vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci).

struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg.

rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc()
are temporarily removed because cd->ci should be marked.
2020-02-22 09:58:59 +09:00
卜部昌平 b223a78a71 this ternary operator is an undefined behaviour
Let me quote ISO/IEC 9899:2018 section 6.5.15:

> Constraints
>
> The first operand shall have scalar type.
> One of the following shall hold for the second and third operands:
> — both operands have arithmetic type;
> — both operands have the same structure or union type;
> — both operands have void type;
(snip)

Here, `*option` is a const struct rb_compile_option_struct. OTOH
`COMPILE_OPTION_DEFAULT` is a struct rb_compile_option_struct, without
const.   These two are _not_ the "same structure or union type".  Hence
the expression invokes undefined behaviour.  COMPILE_OPTION_DEFAULT is
not a const because `RubyVM::InstructionSequence.compile_option=`
touches its internals on-the-fly.  There is no way to meet the
constraints quoted above.

Using the ternary operator here was a mistake in the first place.  Let's
just replace it with a normal `if` statement.
2020-02-06 11:46:51 +09:00
0x005c 461db352c2 Rename RUBY_MARK_NO_PIN_UNLESS_NULL to RUBY_MARK_MOVABLE_UNLESS_NULL 2020-01-23 00:11:03 +13:00
卜部昌平 5e22f873ed decouple internal.h headers
Makes committers' daily lives easier by no longer #include-ing
everything from internal.h; each file includes what it needs instead.
This should significantly speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are the win32-related headers, which tend not to be
self-contained (the headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
Nobuyoshi Nakada db16629008
Fixed misspellings
Fixed misspellings reported at [Bug #16437], only in ruby and rubyspec.
2019-12-20 09:32:42 +09:00
Yusuke Endoh 60c53ff6ee vm_core.h (iseq_unique_id): prefer uintptr_t instead of unsigned long
It produced a warning about a type cast on LLP64 platforms (i.e., Windows).
2019-12-10 17:12:21 +09:00
Yusuke Endoh 156fb72d70 vm_args.c (rb_warn_check): Use iseq_unique_id instead of its pointer
(This is the second try of 036bc1da6c6c9b0fa9b7f5968d897a9554dd770e.)

If iseq is GC'ed, the pointer of iseq may be reused, which may hide a
deprecation warning of keyword argument change.

http://ci.rvm.jp/results/trunk-test1@phosphorus-docker/2474221

```
1) Failure:
TestKeywordArguments#test_explicit_super_kwsplat [/tmp/ruby/v2/src/trunk-test1/test/ruby/test_keyword.rb:549]:
--- expected
+++ actual
@@ -1 +1 @@
-/The keyword argument is passed as the last hash parameter.* for `m'/m
+""
```

This change adds an ad hoc iseq_unique_id to each iseq, and uses it
instead of the iseq pointer.  This covers the case where the caller is
GC'ed.  Still, the case where the callee is GC'ed is not covered.
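
The pointer-reuse problem can be shown with a toy model. `ToyIseq`, the addresses, and the "already warned" sets are all hypothetical; the sketch only illustrates why a monotonically increasing id is a safer key than a reusable address.

```ruby
require "set"

# Toy model only: an "iseq" with a reusable address and a never-reused id.
ToyIseq = Struct.new(:address, :unique_id)

next_id = 0
new_iseq = ->(address) { ToyIseq.new(address, next_id += 1) }

warned_by_address = Set.new
warned_by_id      = Set.new

# The first iseq triggers the deprecation warning and is remembered.
first = new_iseq.call(0x1000)
warned_by_address << first.address
warned_by_id      << first.unique_id

# Suppose `first` is garbage collected and a new iseq lands on the same address.
second = new_iseq.call(0x1000)

hidden_with_pointer = warned_by_address.include?(second.address)  # looks "already warned"
hidden_with_id      = warned_by_id.include?(second.unique_id)     # correctly seen as new

puts "hidden when keyed by pointer: #{hidden_with_pointer}"
puts "hidden when keyed by unique id: #{hidden_with_id}"
```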

But anyway, it is very rare that iseq is GC'ed.  Even when it occurs, it
just hides some warnings.  It's no big deal.
2019-12-09 15:22:48 +09:00
Yusuke Endoh 3cdb37d9db Revert "vm_args.c (rb_warn_check): Use iseq_unique_id instead of its pointer"
This reverts commit 036bc1da6c.

This caused a failure on iseq_binary mode.
http://ci.rvm.jp/results/trunk-iseq_binary@silicon-docker/2474587

Numbering iseqs is not trivial due to dump/load.
2019-12-09 13:49:24 +09:00
Yusuke Endoh 036bc1da6c vm_args.c (rb_warn_check): Use iseq_unique_id instead of its pointer
If iseq is GC'ed, the pointer of iseq may be reused, which may hide a
deprecation warning of keyword argument change.

http://ci.rvm.jp/results/trunk-test1@phosphorus-docker/2474221

```
  1) Failure:
TestKeywordArguments#test_explicit_super_kwsplat [/tmp/ruby/v2/src/trunk-test1/test/ruby/test_keyword.rb:549]:
--- expected
+++ actual
@@ -1 +1 @@
-/The keyword argument is passed as the last hash parameter.* for `m'/m
+""
```

This change adds an ad hoc iseq_unique_id to each iseq, and uses it
instead of the iseq pointer.  This covers the case where the caller is
GC'ed.  Still, the case where the callee is GC'ed is not covered.

But anyway, it is very rare that iseq is GC'ed.  Even when it occurs, it
just hides some warnings.  It's no big deal.
2019-12-09 12:04:58 +09:00
Aaron Patterson 2c8d186c6e
Introduce an "Inline IVAR cache" struct
This commit introduces an "inline ivar cache" struct.  The reason we
need this is so compaction can differentiate between an ivar cache and a
regular inline cache.  Regular inline caches contain references to
`VALUE` and ivar caches just contain references to the ivar index.  With
this new struct we can easily update references for inline caches (but
not inline ivar caches, as they just contain an int).
2019-12-05 13:37:02 -08:00
卜部昌平 0e8219f591 make functions static
These functions are used only from within a single compilation unit, so
we can make them static, for better binary size.  This changeset reduces
the size of the generated ruby binary from 26,590,128 bytes to
26,584,472 bytes on my machine.
2019-11-19 12:36:19 +09:00
Samuel Williams 78e266da1d
Clarify documentation for `InstructionSequence#compile`.
We incorrectly assumed that the `file` argument should be the file name,
which caused https://github.com/scoutapp/scout_apm_ruby/issues/307 because
exception backtraces did not contain the correct path. This documentation
clarifies the role of the different arguments and provides extra
examples.
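
A short sketch of how the arguments show up on the resulting iseq, assuming CRuby (the `RubyVM` constant does not exist on other implementations). The path `/tmp/app/example.rb` is a made-up value; the same full path is passed as both `file` and `path`.

```ruby
if defined?(RubyVM::InstructionSequence)
  full_path = "/tmp/app/example.rb"   # hypothetical path, used for both arguments

  iseq = RubyVM::InstructionSequence.compile(
    "__FILE__",   # source: evaluates to the file the code believes it lives in
    full_path,    # file: reported by __FILE__ and in exception backtraces
    full_path,    # path: the real path attached to the iseq
    1             # line: first line number of the source
  )

  reported_file = iseq.eval
  puts "path:          #{iseq.path}"
  puts "absolute_path: #{iseq.absolute_path}"
  puts "__FILE__:      #{reported_file}"
end
```

Passing only a bare file name as `file` is what led to backtraces without a usable path in the linked issue.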
2019-11-19 11:40:00 +09:00
Jeremy Evans c5c05460ac Warn on access/modify of $SAFE, and remove effects of modifying $SAFE
This removes the security features added by $SAFE = 1, and warns for access
or modification of $SAFE from Ruby-level, as well as warning when calling
all public C functions related to $SAFE.

This modifies some internal functions that took a safe level argument
to no longer take the argument.

rb_require_safe now warns, rb_require_string has been added as a
version that takes a VALUE and does not warn.

One public C function that still takes a safe level argument and that
this doesn't warn for is rb_eval_cmd.  We may want to consider
adding an alternative method that does not take a safe level argument,
and warn for rb_eval_cmd.
2019-11-18 01:00:25 +02:00
卜部昌平 c9ffe751d1 delete unused functions
Looking at the list of symbols inside of libruby-static.a, I found
hundreds of functions that are defined, but used from nowhere.

There can be reasons for each of them (e.g. some functions are
specific to some platform, some are useful when debugging, etc).
However it seems the functions deleted here exist for no reason.

This changeset reduces the size of ruby binary from 26,671,456
bytes to 26,592,864 bytes on my machine.
2019-11-14 20:35:48 +09:00
Dylan Thacker-Smith ac112f2b5d Avoid top-level search for nested constant reference from nil in defined?
Fixes [Bug #16332]

Constant access was changed to no longer allow top-level constant access
through `nil`, but `defined?` wasn't changed at the same time to stay
consistent.

Use a separate defined type to distinguish between a constant
referenced from the current lexical scope and one referenced from
another namespace.
2019-11-13 15:36:58 +09:00
Takashi Kokubun 5c168c7e7f
Support RB_BUILTIN in ISeq#to_a 2019-11-09 21:40:38 -08:00
Nobuyoshi Nakada 0ad0a8ff58
builtin.h must be included *AFTER* vm_core.h 2019-11-08 14:26:21 +09:00
Koichi Sasada 46acd0075d support builtin features with Ruby and C.
Support loading builtin features written in Ruby, which are implemented
with C builtin functions.
[Feature #16254]

Several features:

(1) Load .rb file at boottime with native binary.

Currently, prelude.rb is loaded at boottime. However, this file is
embedded in the interpreter as text, and we need to compile it.
This patch adds a feature to load it from a binary format.

(2) __builtin_func() in Ruby call func() written in C.

In a Ruby file, we can write `__builtin_func()` like a method call.
However, this is not a method call but special syntax to call
a function `func()` written in C. C functions should be defined
in the file (same compilation unit) that loads this .rb file.

Functions (`func` in the above example) should be defined with
  (a) 1st parameter: rb_execution_context_t *ec
  (b) rest parameters (0 to 15).
  (c) VALUE return type.
These requirements are very similar to those for functions used by
rb_define_method(); however, `rb_execution_context_t *ec`
is a new requirement.

(3) automatic C code generation from .rb files.

tool/mk_builtin_loader.rb creates C code to load the .rb files
needed by the miniruby and ruby commands. This script is run by
BASERUBY, so *.rb should be written in BASERUBY-compatible
syntax. The script loads a .rb file, finds all method calls with the
__builtin_ prefix, and generates part of the C code to export
functions.

tool/mk_builtin_binary.rb creates C code which contains the
binary-compiled Ruby files needed by the ruby command.
2019-11-08 09:09:29 +09:00
Lourens Naudé 4480d68931 Right size the compile option hash 2019-10-29 11:31:15 +09:00
Aaron Patterson bbf3de22b6
Pin labels during disassembly
We need to ensure that labels are pinned while disassembling.  If the
compactor runs during disassembly, references to these labels could go
bad, so this commit just ensures that the labels can't move until we're
done.
2019-10-28 12:15:05 -07:00
Alan Wu 89e7997622 Combine call info and cache to speed up method invocation
To perform a regular method call, the VM needs two structs,
`rb_call_info` and `rb_call_cache`. At the moment, we allocate these two
structures in separate buffers. In the worst case, the CPU needs to read
4 cache lines to complete a method call. Putting the two structures
together reduces the maximum number of cache line reads to 2.

Combining the structures also saves 8 bytes per call site as the current
layout uses two separate pointers for the call info and the call cache.
This saves about 2 MiB on Discourse.

This change improves the Optcarrot benchmark by at least 3%. For more
details, see the attached bugs.ruby-lang.org ticket.

Complications:
 - A new instruction attribute `comptime_sp_inc` is introduced to
 calculate SP increase at compile time without using call caches. At
 compile time, a `TS_CALLDATA` operand points to a call info struct, but
 at runtime, the same operand points to a call data struct. Instructions
 that explicitly define `sp_inc` also need to define `comptime_sp_inc`.
 - MJIT code for copying call cache becomes slightly more complicated.
 - This changes the bytecode format, which might break existing tools.

[Misc #16258]
2019-10-24 18:03:42 +09:00
卜部昌平 7e0ae1698d avoid overflow in integer multiplication
This changeset basically replaces `ruby_xmalloc(x * y)` with
`ruby_xmalloc2(x, y)`.  Some convenience functions are also
provided, for instance `rb_xmalloc_mul_add(x, y, z)`, which allocates
x * y + z bytes.
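
The size check behind such helpers can be sketched in Ruby. This is an illustration of overflow-checked size arithmetic, not the C implementation: `checked_mul_add` is a hypothetical name, a 64-bit `size_t` is assumed, and the error classes are chosen for the example only.

```ruby
SIZE_MAX = 2**64 - 1   # assumption: 64-bit size_t

# Return x * y + z, or raise if the result would not fit in SIZE_MAX.
# The check uses division so that, as in C, no oversized product is ever formed.
def checked_mul_add(x, y, z = 0, max: SIZE_MAX)
  raise ArgumentError, "negative size" if [x, y, z].any?(&:negative?)
  if z > max || (y != 0 && x > (max - z) / y)
    raise RangeError, "size overflow: #{x} * #{y} + #{z}"
  end
  x * y + z
end

small = checked_mul_add(3, 4, 5)

overflowed =
  begin
    checked_mul_add(2**32, 2**32)   # 2**64 does not fit in 64 bits
    false
  rescue RangeError
    true
  end

puts "3 * 4 + 5 = #{small}, overflow detected: #{overflowed}"
```

A plain `x * y` in C would silently wrap around in the second case and lead to an undersized allocation, which is what passing the factors separately avoids.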
2019-10-09 12:12:28 +09:00
Yusuke Endoh c8a18e25c1 iseq.c (rb_iseq_compile_on_base): Removed
ko1 cannot remember why he introduced the function.  And it is not used.

After it is removed, the argument "base_block" of
rb_iseq_compile_with_option is always zero.
2019-10-04 21:30:32 +09:00
Yusuke Endoh c3dd3b9553 iseq.c (rb_iseq_compile_with_option): dummy parent_iseq for the parser
The parsing of `RubyVM::InstructionSequence.compile` does not support an
outer scope currently.  So it specified NULL as parent_iseq for the
parser.  However, it resulted in the following false-positive warning.

```
RubyVM::InstructionSequence.compile(<<END)
  o = Object.new
  o #=> <compiled>:2: warning: possibly useless use of a variable in void context
END
```

This change specifies a dummy empty parent_iseq instead of NULL, which
suppresses the false positive.
2019-10-04 02:35:10 +09:00
Yusuke Endoh b43afa0a8f Make parser_params have parent_iseq instead of base_block
The parser needs to determine whether a local variable is defined in
an outer scope.  For that purpose, the "base_block" field has kept the
outer block.

However, the whole block was actually unneeded; the parser used only
base_block->iseq.

So, this change lets parser_params have the iseq directly, instead of
the whole block.
2019-10-04 02:30:36 +09:00
Nobuyoshi Nakada 0c6f36668a
Adjusted spaces [ci skip] 2019-09-27 10:20:56 +09:00
Aaron Patterson 9b6460cacc
Remove mark array
We don't use this array anymore so we can remove it
2019-09-26 13:56:42 -07:00
Aaron Patterson 50fadefb7e
Scan the ISEQ arena for markables and mark them
This commit scans the ISEQ arena for objects that can be marked and
marks them.  This should make the mark array unnecessary.
2019-09-26 13:56:41 -07:00
Aaron Patterson 3cd8f76f7f
Introduce a secondary arena
We'll scan the secondary arena during GC mark. So, we should only
allocate "markable" instruction linked list nodes out of the secondary
arena.
2019-09-26 13:56:41 -07:00
Aaron Patterson bd017c633d
Extract allocation and free functions
Now we can allocate and free a secondary arena.
2019-09-26 13:56:41 -07:00
Jeremy Evans 3a23b71f0a Make Method/Proc#parameters handle **nil syntax
Use a [:nokey] entry in this case.
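
For illustration, on a Ruby version that supports the `**nil` syntax (2.7 or later), the entry shows up like this; the method name is made up for the example:

```ruby
# A method that explicitly accepts no keywords.
def no_keywords(a, **nil); end

method_params = method(:no_keywords).parameters
proc_params   = proc { |**nil| }.parameters

p method_params   # [[:req, :a], [:nokey]]
p proc_params     # [[:nokey]]
```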
2019-08-30 12:39:31 -07:00
Nobuyoshi Nakada 896d9f967b
Constified local variable `translator` 2019-08-30 12:06:42 +09:00
Nobuyoshi Nakada e9da4f57b3
Adjust indent [ci skip] 2019-08-30 12:06:42 +09:00
卜部昌平 b8fd2e83e7 decouple compile.c usage of imemo_ifunc
After 5e86b005c0, I now think ANYARGS is
dangerous and should be extinct.  This commit deletes ANYARGS from
struct vm_ifunc, but in doing so we also have to decouple the usage
of this struct in compile.c, which (I think) is an abuse of ANYARGS.
2019-08-27 15:52:26 +09:00
Alan Wu dc0e45e39b Update moved objects in original_iseq
Without doing this, enabling a TracePoint on a method could lead to use
of moved objects. This was found by running
`env RUBY_ISEQ_DUMP_DEBUG=to_binary make test-all`, which sets
original_iseq then runs the compaction tests and the tracepoint tests.

Please excuse the lack of tests. I was not able to figure out how to
reliably trigger a move on a specific iseq imemo to make a good
regression test.

To manually confirm the problem and this fix, you can run:
```
env RUBY_ISEQ_DUMP_DEBUG=to_binary make test-all \
  TESTOPTS="test/ruby/test_gc_compact.rb \
            test/gdbm/test_gdbm.rb \
            test/ruby/test_settracefunc.rb"
```

Or the following script:

```ruby
tp = TracePoint.new(:line) {}
1.times do # put it in a block to not keep these objects alive
  objects = 10_000.times.map { Object.new }
  objects.hash
end

1.times do
  # this allocation pattern can realistically happen in an app
  # at load time
  beek = 10_000.times.map do
    eval(<<-RUBY)
      def foo
        a + b
        1.times {
          4 + 234234
        }
        nil + 234
      end
    RUBY
    Object.new
    Object.new
  end
  beek.hash
end

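# method(:foo) replaces the experimental `self.:foo` syntax, which was
# never part of a released Ruby
tp.enable(target: method(:foo)) { 234 } # allocate original iseq

GC.verify_compaction_references(toward: :empty)
GC.compact

tp.enable(target: method(:foo)) { 234234 } # crash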
```

[Bug #16098]
2019-08-19 12:44:38 -07:00
Benoit Daloze 39a43d9cd0 Make it as clear as possible that RubyVM is MRI-specific and only exists on MRI (#2113) [ci skip]
* Make it as clear as possible that RubyVM is MRI-specific and only exists on MRI

* See [Bug #15743].
* Use "CRuby VM" instead of "Ruby VM" for clarity.

* Use YARV rather than "CRuby VM" for documenting RubyVM::InstructionSequence

* Avoid introducing a new "CRuby VM" term in documentation
2019-08-19 14:51:00 +09:00
Aaron Patterson aac4d9d6c7
Rename rb_gc_mark_no_pin -> rb_gc_mark_movable
Renaming this function.  "No pin" leaks some implementation details.  We
just want users to know that if they mark this object, the reference may
move and they'll need to update the reference accordingly.
2019-08-12 16:44:54 -04:00
git e688ab26c7 * expand tabs. 2019-08-13 01:34:36 +09:00
Aaron Patterson 76a928bac2
Unpin default value objects
We're already updating the location of default values, so we may as well
unpin them.
2019-08-12 12:34:09 -04:00
Jeremy Evans 96cec6b277 Document that RubyVM::InstructionSequence methods are implementation and version dependent
Fixes [Bug #6785]
2019-08-05 16:14:30 -07:00
Koichi Sasada c25ff7bb5d check iseq is executable 2019-07-23 08:42:20 +01:00