Граф коммитов

835 Коммитов

Автор SHA1 Сообщение Дата
Koichi Sasada a635c93fde support MJIT with debug option.
VM_CHECK_MODE > 0 with optflags=-O0 can not run JIT tests because
of link problems. This patch fix them.
2020-02-03 16:57:41 +09:00
Jeremy Evans beae6cbf0f Fully separate positional arguments and keyword arguments
This removes the warnings added in 2.7, and changes the behavior
so that a final positional hash is not treated as keywords or
vice-versa.

To handle the arg_setup_block splat case correctly with keyword
arguments, we need to check if we are taking a keyword hash.
That case didn't have a test, but it affects real-world code,
so add a test for it.

This removes rb_empty_keyword_given_p() and related code, as
that is not needed in Ruby 3.  The empty keyword case is the
same as the no keyword case in Ruby 3.

This changes rb_scan_args to implement keyword argument
separation for C functions when the : character is used.
For backwards compatibility, it returns a duped hash.
This is a bad idea for performance, but not duping the hash
breaks at least Enumerator::ArithmeticSequence#inspect.

Instead of having RB_PASS_CALLED_KEYWORDS be a number,
simplify the code by just making it be rb_keyword_given_p().
2020-01-02 18:40:45 -08:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
卜部昌平 0958e19ffb add several __has_something macro
With these macros implemented we can write codes just like we can assume
the compiler being clang.  MSC_VERSION_SINCE is defined to implement
those macros, but turned out to be handy for other places.  The -fdeclspec
compiler flag is necessary for clang to properly handle __has_declspec().
2019-12-26 20:45:12 +09:00
Nobuyoshi Nakada db16629008
Fixed misspellings
Fixed misspellings reported at [Bug #16437], only in ruby and rubyspec.
2019-12-20 09:32:42 +09:00
卜部昌平 f054f11a38 per-method serial number
Methods and their definitions can be allocated/deallocated on-the-fly.
One pathological situation is when a method is deallocated then another
one is allocated immediately after that.  Address of those old/new method
entries/definitions can be the same then, depending on underlying
malloc/free implementation.

So pointer comparison is insufficient.  We have to check the contents.
To do so we introduce def->method_serial, which is an integer unique to
that specific method definition.

PS: Note that method_serial being uintptr_t rather than rb_serial_t is
intentional.  This is because rb_serial_t can be bigger than a pointer
on a 32bit system (rb_serial_t is at least 64bit).  In order to preserve
old packing of struct rb_call_cache, rb_serial_t is inappropriate.
2019-12-18 12:52:28 +09:00
Koichi Sasada fbe229906b add debug counter to count `call` reusing cases. 2019-12-17 13:15:38 +09:00
卜部昌平 ba11a74745 ensure cc->def == cc->me->def
The equation shall hold for every call cache.  However prior to this
changeset cc->me could be updated without also updating cc->def.  Let's
make it sure by introducing new macro named CC_SET_ME which sets cc->me
and cc->def at once.
2019-12-16 17:52:18 +09:00
Jeremy Evans 55b7ba3686 Make super in instance_eval in method in module raise TypeError
This makes behavior the same as super in instance_eval in method
in class.  The reason this wasn't implemented before is that
there is a check to determine if the self in the current context
is of the expected class, and a module itself can be included
in multiple classes, so it doesn't have an expected class.

Implementing this requires giving iclasses knowledge of which
class created them, so that super call in the module method
knows the expected class for super calls.  This reference
is called includer, and should only be set for iclasses.

Note that the approach Ruby uses in this check is not robust. If
you instance_eval another object of the same class and call super,
instead of an TypeError, you get super called with the
instance_eval receiver instead of the method receiver.  Truly
fixing super would require keeping a reference to the super object
(method receiver) in each frame where scope has changed, and using
that instead of current self when calling super.

Fixes [Bug #11636]
2019-12-12 15:50:19 +09:00
Aaron Patterson 2c8d186c6e
Introduce an "Inline IVAR cache" struct
This commit introduces an "inline ivar cache" struct.  The reason we
need this is so compaction can differentiate from an ivar cache and a
regular inline cache.  Regular inline caches contain references to
`VALUE` and ivar caches just contain references to the ivar index.  With
this new struct we can easily update references for inline caches (but
not inline var caches as they just contain an int)
2019-12-05 13:37:02 -08:00
Koichi Sasada 36da0b3da1 check interrupts at each frame pop timing.
Asynchronous events such as signal trap, finalization timing,
thread switching and so on are managed by "interrupt_flag".
Ruby's threads check this flag periodically and if a thread
does not check this flag, above events doesn't happen.

This checking is CHECK_INTS() (related) macro and it is placed
at some places (laeve instruction and so on). However, at the end
of C methods, C blocks (IMEMO_IFUNC) etc there are no checking
and it can introduce uninterruptible thread.

To modify this situation, we decide to place CHECK_INTS() at
vm_pop_frame(). It increases interrupt checking points.
[Bug #16366]

This patch can introduce unexpected events...
2019-11-29 17:47:02 +09:00
Yusuke Endoh 191ce5344e Reduce duplicated warnings for the change of Ruby 3 keyword arguments
By this change, the following code prints only one warning.

```
def foo(**opt); end
100.times { foo({kw:1}) }
```

A global variable `st_table *caller_to_callees` is a map from caller to
a set of callee methods.  It remembers that a warning is already printed
for each pair of caller and callee.

[Feature #16289]
2019-11-29 17:32:27 +09:00
Koichi Sasada f38b6d197f Revert "export for MJIT"
This reverts commit 2e6f1cf8b2.
2019-11-29 03:22:24 +09:00
Koichi Sasada e4e41840ad Revert "* remove trailing spaces. [ci skip]"
This reverts commit 27d0d7c0d3.
2019-11-29 03:22:13 +09:00
git 27d0d7c0d3 * remove trailing spaces. [ci skip] 2019-11-29 03:18:19 +09:00
Koichi Sasada 2e6f1cf8b2 export for MJIT 2019-11-29 03:17:52 +09:00
Koichi Sasada dd723771c1 fastpath for ivar read of FL_EXIVAR objects.
vm_getivar() provides fastpath for T_OBJECT by caching an index
of ivar. This patch also provides fastpath for FL_EXIVAR objects.
FL_EXIVAR objects have an each ivar array and index can be cached
as T_OBJECT. To access this ivar array, generic_iv_tbl is exposed
by rb_ivar_generic_ivtbl() (declared in variable.h which is newly
introduced).

Benchmark script:

Benchmark.driver(repeat_count: 3){|x|
  x.executable name: 'clean', command: %w'../clean/miniruby'
  x.executable name: 'trunk', command: %w'./miniruby'

  objs = [Object.new, 'str', {a: 1, b: 2}, [1, 2]]

  objs.each.with_index{|obj, i|
    rep = obj.inspect
    rep = 'Object.new' if /\#/ =~ rep
    x.prelude str = %Q{
      v#{i} = #{rep}
      def v#{i}.foo
        @iv # ivar access method (attr_reader)
      end
      v#{i}.instance_variable_set(:@iv, :iv)
    }
    puts str
    x.report %Q{
      v#{i}.foo
    }
  }
}

Result:

      v0.foo # T_OBJECT

               clean:  85387141.8 i/s
               trunk:  85249373.6 i/s - 1.00x  slower

      v1.foo # T_STRING

               trunk:  57894407.5 i/s
               clean:  39957178.6 i/s - 1.45x  slower

      v2.foo # T_HASH

               trunk:  56629413.2 i/s
               clean:  39227088.9 i/s - 1.44x  slower

      v3.foo # T_ARRAY

               trunk:  55797530.2 i/s
               clean:  38263572.9 i/s - 1.46x  slower
2019-11-29 03:11:04 +09:00
Kazuhiro NISHIYAMA 09e76e9828
Improve consistency of bool/true/false 2019-11-25 15:09:09 +09:00
Koichi Sasada e27acb6148 add fast path for argc==0.
If calling builtin functions with no arguments, we don't need to
calculate argv location.
2019-11-25 14:04:21 +09:00
卜部昌平 f6239ce0fc peep-hole optimize VM instructions
Some minor optimizations.

Calculating -------------------------------------
                           ours       trunk
          vm2_regexp     8.479M      8.346M i/s -      6.000M times in 0.707612s 0.718916s
   vm2_regexp_invert     8.605M      8.350M i/s -      6.000M times in 0.697298s 0.718576s

Comparison:
                       vm2_regexp
                ours:   8479223.3 i/s
               trunk:   8345893.8 i/s - 1.02x  slower

                vm2_regexp_invert
                ours:   8604647.4 i/s
               trunk:   8349852.8 i/s - 1.03x  slower

Calculating -------------------------------------
                           ours+jit   trunk+jit
Optcarrot Lan_Master.nes     68.603      64.167 fps

Comparison:
             Optcarrot Lan_Master.nes
                ours+jit:        68.6 fps
               trunk+jit:        64.2 fps - 1.07x  slower
2019-11-19 13:56:13 +09:00
Koichi Sasada 57cd4623cf should not use __func__ 2019-11-18 13:53:32 +09:00
Koichi Sasada 5e34ab5406 add casts.
add casts to avoid compile error.
http://ci.rvm.jp/results/trunk_clang_39@silicon-docker/2402215
2019-11-18 10:36:48 +09:00
Koichi Sasada 71fee9bc72 vm_invoke_builtin_delegate with start index.
opt_invokebuiltin_delegate and opt_invokebuiltin_delegate_leave
invokes builtin functions with same parameters of the method.
This technique eliminate stack push operations. However, delegation
parameters should be completely same as given parameters.
(e.g. `def foo(a, b, c) __builtin_foo(a, b, c)` is okay, but
__builtin_foo(b, c) is not allowed)

This patch relaxes this restriction. ISeq has a local variables
table which includes parameters. For example, the method defined
as `def foo(a, b, c) x=y=nil`, then local variables table contains
[a, b, c, x, y]. If calling builtin-function with arguments which
are sub-array of the lvar table, use opt_invokebuiltin_delegate
instruction with start index. For example, `__builtin_foo(b, c)`,
`__builtin_bar(c, x, y)` is okay, and so on.
2019-11-18 10:16:11 +09:00
Koichi Sasada 179062dd80 move rb_vm_lvar_exposed() correctly.
rb_vm_lvar_exposed() is prepared for __builtin_inline!(), needed for
mini_builtin.c and builtin.c. However, it's only on builtin.c.
So move it to make it as a part of VM.
2019-11-14 04:21:24 +09:00
Dylan Thacker-Smith ac112f2b5d Avoid top-level search for nested constant reference from nil in defined?
Fixes [Bug #16332]

Constant access was changed to no longer allow top-level constant access
through `nil`, but `defined?` wasn't changed at the same time to stay
consistent.

Use a separate defined type to distinguish between a constant
referenced from the current lexical scope and one referenced from
another namespace.
2019-11-13 15:36:58 +09:00
Koichi Sasada 9142f802f1 rewrite comment.
Pointed by nagachika-san.
https://ruby-trunk-changes.hatenablog.com/entry/ruby_trunk_changes_20191109
2019-11-11 16:47:50 +09:00
Koichi Sasada 43ceedecc0 use STACK_ADDR_FROM_TOP()
vm_invoke_builtin() accesses VM stack via cfp->sp. However, MJIT
can use their own stack. To access them appropriately, we need to
use STACK_ADDR_FROM_TOP().
2019-11-09 16:18:58 +09:00
Koichi Sasada 21f7cca2c6 initialize kw special local var.
A method which has keyword parameters has an implicit local variable
to specify which keywords are (un)specified.

vm_call_iseq_setup_kwparm_nokwarg() is special function to invoke
a ISeq method without any keyword arguments. However, it should
also initialize the special local var. Without this initialization,
the implicit lvar can points a freed (T_NONE) object.
2019-11-09 10:04:04 +09:00
卜部昌平 90fc555258 name the result of calccall
This is a pure refactoring for better understanding of what is
happening here.  Should change nothing but readability.
2019-11-08 12:09:01 +09:00
卜部昌平 a1a08ac9aa describe vm_cache_check_for_class_serial [ci skip]
Added comments describing what it is.  Requested by ko1.
2019-11-08 10:31:06 +09:00
Koichi Sasada 46acd0075d support builtin features with Ruby and C.
Support loading builtin features written in Ruby, which implement
with C builtin functions.
[Feature #16254]

Several features:

(1) Load .rb file at boottime with native binary.

Now, prelude.rb is loaded at boottime. However, this file is contained
into the interpreter as a text format and we need to compile it.
This patch contains a feature to load from binary format.

(2) __builtin_func() in Ruby call func() written in C.

In Ruby file, we can write `__builtin_func()` like method call.
However this is not a method call, but special syntax to call
a function `func()` written in C. C functions should be defined
in a file (same compile unit) which load this .rb file.

Functions (`func` in above example) should be defined with
  (a) 1st parameter: rb_execution_context_t *ec
  (b) rest parameters (0 to 15).
  (c) VALUE return type.
This is very similar requirements for functions used by
rb_define_method(), however `rb_execution_context_t *ec`
is new requirement.

(3) automatic C code generation from .rb files.

tool/mk_builtin_loader.rb creates a C code to load .rb files
needed by miniruby and ruby command. This script is run by
BASERUBY, so *.rb should be written in BASERUBY compatbile
syntax. This script load a .rb file and find all of __builtin_
prefix method calls, and generate a part of C code to export
functions.

tool/mk_builtin_binary.rb creates a C code which contains
binary compiled Ruby files needed by ruby command.
2019-11-08 09:09:29 +09:00
卜部昌平 d45a013a1a extend rb_call_cache
Prior to this changeset, majority of inline cache mishits resulted
into the same method entry when rb_callable_method_entry() resolves
a method search.  Let's not call the function at the first place on
such situations.

In doing so we extend the struct rb_call_cache from 44 bytes (in
case of 64 bit machine) to 64 bytes, and fill the gap with
secondary class serial(s).  Call cache's class serials now behavies
as a LRU cache.

Calculating -------------------------------------
                           ours         2.7         2.6
vm2_poly_same_method     2.339M      1.744M      1.369M i/s - 6.000M times in 2.565086s 3.441329s 4.381386s

Comparison:
             vm2_poly_same_method
                ours:   2339103.0 i/s
                 2.7:   1743512.3 i/s - 1.34x  slower
                 2.6:   1369429.8 i/s - 1.71x  slower
2019-11-07 17:41:30 +09:00
卜部昌平 6ff1250739 rb_method_basic_definition_p with CC
Noticed that rb_method_basic_definition_p is frequently called.
Its callers include vm_caller_setup_args_block(),
rb_hash_default_value(), rb_num_neative_int_p(), and a lot more.

It seems worth caching the method resolution part.  Majority of
rb_method_basic_definion_p() usages take fixed class and fixed
method id combinations.

Calculating -------------------------------------
                           ours       trunk
           so_matrix      2.379       2.115 i/s -       1.000 times in 0.420409s 0.472879s

Comparison:
                        so_matrix
                ours:         2.4 i/s
               trunk:         2.1 i/s - 1.12x  slower
2019-11-05 11:39:35 +09:00
卜部昌平 cc5580f175 fix bug in keyword + protected combination
Test included for the situation formerly was not working.
2019-10-28 14:38:05 +09:00
卜部昌平 356e203a3a more on struct rb_call_data
Replacing adjacent struct rb_call_info and struct rb_call_cache
into a struct rb_call_data.
2019-10-25 12:24:22 +09:00
wanabe 4ff2c58f91 retry tailcall optimization (#2529)
Sorry, f62f90367f is push miss.
2019-10-25 04:40:39 +09:00
Jeremy Evans d6a2507e49 Duplicate hash when converting keyword hash to keywords
This mirrors the behavior when manually splatting a hash.  This
mirrors the changes made in setup_parameters_complex in
6081ddd6e6, so that splatting to a
non-iseq method works the same as splatting to an iseq method.
2019-10-24 12:35:04 -07:00
Alan Wu 89e7997622 Combine call info and cache to speed up method invocation
To perform a regular method call, the VM needs two structs,
`rb_call_info` and `rb_call_cache`. At the moment, we allocate these two
structures in separate buffers. In the worst case, the CPU needs to read
4 cache lines to complete a method call. Putting the two structures
together reduces the maximum number of cache line reads to 2.

Combining the structures also saves 8 bytes per call site as the current
layout uses separate two pointers for the call info and the call cache.
This saves about 2 MiB on Discourse.

This change improves the Optcarrot benchmark at least 3%. For more
details, see attached bugs.ruby-lang.org ticket.

Complications:
 - A new instruction attribute `comptime_sp_inc` is introduced to
 calculate SP increase at compile time without using call caches. At
 compile time, a `TS_CALLDATA` operand points to a call info struct, but
 at runtime, the same operand points to a call data struct. Instruction
 that explicitly define `sp_inc` also need to define `comptime_sp_inc`.
 - MJIT code for copying call cache becomes slightly more complicated.
 - This changes the bytecode format, which might break existing tools.

[Misc #16258]
2019-10-24 18:03:42 +09:00
Nobuyoshi Nakada 42edb05626
extracted declare_under 2019-10-10 01:08:42 +09:00
Koichi Sasada ddf5020e4f Revert "tailcall optimization again (#2528)"
This reverts commit f62f90367f.
2019-10-06 17:01:00 +09:00
wanabe f62f90367f tailcall optimization again (#2528)
This is follow up of r67315.
2019-10-06 16:52:09 +09:00
卜部昌平 3ffd98c5cd add debug counters for vm_search_method_slowpath()
Implemented fine-grained inspection of cache misshits.  Handy for
counting the reasons why an inline method cache was evicted.
2019-10-03 15:24:09 +09:00
卜部昌平 eb92159d72 Revert https://github.com/ruby/ruby/pull/2486
This reverts commits: 10d6a3aca7 8ba48c1b85 fba8627dc1 dd883de5ba
6c6a25feca 167e6b48f1 7cb96d41a5 3207979278 595b3c4fdd 1521f7cf89
c11c5e69ac cf33608203 3632a812c0 f56506be0d 86427a3219 .

The reason for the revert is that we observe ABA problem around
inline method cache.  When a cache misshits, we search for a
method entry.  And if the entry is identical to what was cached
before, we reuse the cache.  But the commits we are reverting here
introduced situations where a method entry is freed, then the
identical memory region is used for another method entry.  An
inline method cache cannot detect that ABA.

Here is a code that reproduce such situation:

```ruby
require 'prime'

class << Integer
  alias org_sqrt sqrt
  def sqrt(n)
    raise
  end

  GC.stress = true
  Prime.each(7*37){} rescue nil # <- Here we populate CC
  class << Object.new; end

  # These adjacent remove-then-alias maneuver
  # frees a method entry, then immediately
  # reuses it for another.
  remove_method :sqrt
  alias sqrt org_sqrt
end

Prime.each(7*37).to_a # <- SEGV
```
2019-10-03 12:45:24 +09:00
Jeremy Evans ef697388be
Treat return in block in class/module as LocalJumpError (#2511)
return directly in class/module is an error, so return in
proc in class/module should also be an error.  I believe the
previous behavior was an unintentional oversight during the
addition of top-level return in 2.4.
2019-10-02 07:56:28 -07:00
Nobuyoshi Nakada 10d6a3aca7
Fix assertion
callable_method_entry_p is for rb_callable_method_entry_t.
2019-09-30 17:43:11 +09:00
卜部昌平 fba8627dc1 delete unnecessary branch
At last, not only myself but also your compiler are fully confident
that the method entries pointed from call caches are immutable.  We
don't have to worry about silent updates.  Just delete the branch
that is now always false.

Calculating -------------------------------------
                           ours       trunk
vm2_poly_same_method     2.142M      2.070M i/s -      6.000M times in 2.801148s 2.898994s

Comparison:
             vm2_poly_same_method
                ours:   2141979.2 i/s
               trunk:   2069683.8 i/s - 1.03x  slower
2019-09-30 10:26:38 +09:00
卜部昌平 dd883de5ba refactor constify most of rb_method_entry_t
Now that we have eliminated most destructive operations over the
rb_method_entry_t / rb_callable_method_entry_t, let's make them
mostly immutabe and mark them const.

One exception is rb_export_method(), which destructively modifies
visibilities of method entries.  I have left that operation as is
because I suspect that destructiveness is the nature of that
function.
2019-09-30 10:26:38 +09:00
卜部昌平 6c6a25feca refactor add rb_method_entry_from_template
Tired of rb_method_entry_create(..., rb_method_definition_create(
..., &(rb_method_foo_t) {...})) maneuver.  Provide a function that
does the thing to reduce copy&paste.
2019-09-30 10:26:38 +09:00
卜部昌平 7cb96d41a5 refactor delete rb_method_entry_copy
The deleted function was to destructively overwrite existing method
entries, which is now considered to be a bad idea.  Delete it, and
assign a newly created method entry instead.
2019-09-30 10:26:38 +09:00
卜部昌平 3207979278 refactor delete rb_method_definition_set
Instead of destructively write fields of method entries, create a
new entry and let it overwrite its owner.
2019-09-30 10:26:38 +09:00