Граф коммитов

1127 Коммитов

Автор SHA1 Сообщение Дата
Aaron Patterson 84e4453436 Use a functional red-black tree for indexing the shapes
This is an experimental commit that uses a functional red-black tree to
create an index of the ancestor shapes.  It uses an Okasaki style
functional red black tree:

  https://www.cs.tufts.edu/comp/150FP/archive/chris-okasaki/redblack99.pdf

This tree is advantageous because:

* It offers O(n log n) insertions and O(n log n) lookups.
* It shares memory with previous "versions" of the tree

When we insert a node in the tree, only the parts of the tree that need
to be rebalanced are newly allocated.  Parts of the tree that don't need
to be rebalanced are not reallocated, so "new trees" are able to share
memory with old trees.  This is in contrast to a sorted set where we
would have to duplicate the set, and also resort the set on each
insertion.

I've added a new stat to RubyVM.stat so we can understand how the red
black tree increases.
2023-10-24 10:52:06 -07:00
Takashi Kokubun 62c1d8187b Partly revert a change in #8705
Having this variable actually helps the performance of non-JITed calls.

-----  -----------  ----------  ----------  ----------  -------------  ------------
bench  before (ms)  stddev (%)  after (ms)  stddev (%)  after 1st itr  before/after
fib    241.9        0.5         225.4       1.0         1.06           1.07
-----  -----------  ----------  ----------  ----------  -------------  ------------

(benchmarked with --yjit-cold-threshold=0)
2023-10-19 17:52:03 -07:00
Takashi Kokubun 985370406d Call rb_jit_cont_init() even earlier
To fix https://github.com/ruby/ruby/actions/runs/6581593578/job/17881779994
2023-10-19 17:12:09 -07:00
Takashi Kokubun 6beb09c2c9
YJIT: Add RubyVM::YJIT.enable (#8705) 2023-10-19 10:54:35 -07:00
Maxime Chevalier-Boisvert b2e1ddffa5
YJIT: port call threshold logic from Rust to C for performance (#8628)
* Port call threshold logic from Rust to C for performance

* Prefix global/field names with yjit_

* Fix linker error

* Fix preprocessor condition for rb_yjit_threshold_hit

* Fix third linker issue

* Exclude yjit_calls_at_interv from RJIT bindgen

---------

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2023-10-12 10:05:34 -04:00
Koichi Sasada be1bbd5b7d M:N thread scheduler for Ractors
This patch introduce M:N thread scheduler for Ractor system.

In general, M:N thread scheduler employs N native threads (OS threads)
to manage M user-level threads (Ruby threads in this case).
On the Ruby interpreter, 1 native thread is provided for 1 Ractor
and all Ruby threads are managed by the native thread.

From Ruby 1.9, the interpreter uses 1:1 thread scheduler which means
1 Ruby thread has 1 native thread. M:N scheduler change this strategy.

Because of compatibility issue (and stableness issue of the implementation)
main Ractor doesn't use M:N scheduler on default. On the other words,
threads on the main Ractor will be managed with 1:1 thread scheduler.

There are additional settings by environment variables:

`RUBY_MN_THREADS=1` enables M:N thread scheduler on the main ractor.
Note that non-main ractors use the M:N scheduler without this
configuration. With this configuration, single ractor applications
run threads on M:1 thread scheduler (green threads, user-level threads).

`RUBY_MAX_CPU=n` specifies maximum number of native threads for
M:N scheduler (default: 8).

This patch will be reverted soon if non-easy issues are found.

[Bug #19842]
2023-10-12 14:47:01 +09:00
Maxime Chevalier-Boisvert ea491802fa
YJIT: add heuristic to avoid compiling cold ISEQs (#8522)
* YJIT: Add counter to measure how often we compile "cold" ISEQs (#535)

Fix counter name in DEFAULT_COUNTERS

YJIT: add --yjit-cold-threshold, don't compile cold ISEQs

YJIT: increase default cold threshold to 200_000

Remove rb_yjit_call_threshold()

Remove conflict markers

Fix compilation errors

Threshold 1 should compile immediately

Debug deadlock issue with test_ractor

Fix call threshold issue with tests

* Revert exception threshold logic. Document option in yjid.md

* (void) for 0 parameter functions in C99

* Rename iseq_entry_cold => cold_iseq_entry

* Document --yjit-cold-threshold in ruby.c

* Update doc/yjit/yjit.md

Co-authored-by: Jean byroot Boussier <jean.boussier+github@shopify.com>

* Shorten help string to appease test

* Address bug found by Kokubun. Reorder logic.

---------

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Co-authored-by: Jean byroot Boussier <jean.boussier+github@shopify.com>
2023-10-03 17:45:46 -04:00
Aaron Patterson d3574c117a Move IO#readline to Ruby
This commit moves IO#readline to Ruby.  In order to call C functions,
keyword arguments must be converted to hashes.  Prior to this commit,
code like `io.readline(chomp: true)` would allocate a hash.  This
commits moves the keyword "denaturing" to Ruby, allowing us to send
positional arguments to the C API and avoiding the hash allocation.

Here is an allocation benchmark for the method:

```
x = GC.stat(:total_allocated_objects)
File.open("/usr/share/dict/words") do |f|
  f.readline(chomp: true) until f.eof?
end
p ALLOCATIONS: GC.stat(:total_allocated_objects) - x
```

Before this commit, the output was this:

```
$ make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin22-fake  ./test.rb
{:ALLOCATIONS=>707939}
```

Now it is this:

```
$ make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin22-fake  ./test.rb
{:ALLOCATIONS=>471962}
```

[Bug #19890] [ruby-core:114803]
2023-09-28 10:43:45 -07:00
yui-knk 74c6781153 Change RNode structure from union to struct
All kind of AST nodes use same struct RNode, which has u1, u2, u3 union members
for holding different kind of data.
This has two problems.

1. Low flexibility of data structure

Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand,
NODE_OP_ASGN2 needs more than three union members. However they use same
structure definition, need to allocate three union members for NODE_TRUE and
need to separate NODE_OP_ASGN2 into another node.
This change removes the restriction so make it possible to
change data structure by each node type.

2. No compile time check for union member access

It’s developer’s responsibility for using correct member for each node type when it’s union.
This change clarifies which node has which type of fields and enables compile time check.

This commit also changes node_buffer_elem_struct buf management to handle
different size data with alignment.
2023-09-28 11:58:10 +09:00
Nobuyoshi Nakada ac244938e8 Dump backtraces to an arbitrary stream 2023-09-25 22:57:28 +09:00
Nobuyoshi Nakada 6473903082
[Bug #18257] Register the class path of FrozenCore to mark
ICLASS does not have the path usually, so it needs to be registered
separately.
2023-09-19 14:09:53 +09:00
Nobuyoshi Nakada 4634405f7c
Stop exposing FrozenCore in headers
Revert commit "Directly allocate FrozenCore as an ICLASS",
813a5f4fc4.
2023-09-19 14:08:05 +09:00
Matt Valentine-House 322548180d Prevent rb_gc_mark_values from pinning objects
This is an internal only function not exposed to the C extension API.
It's only use so far is from rb_vm_mark, where it's used to mark the
values in the vm->trap_list.cmd array.

There shouldn't be any reason why these cannot move.

This commit allows them to move by updating their references during the
reference updating step of compaction.

To do this we've introduced another internal function
rb_gc_update_values as a partner to rb_gc_mark_values.

This allows us to refactor rb_gc_mark_values to not pin
2023-08-31 19:31:18 +01:00
Alan Wu ebb9034710 Remove cfp parameter from hook_before_rewind()
It's only used once, and it has to equal `ec->cfp`, so just use that.
2023-08-24 17:37:07 -04:00
Alan Wu 05e827427f Make cfp constant and use it more in vm_exec_handle_exception()
For writing THROW_DATA_VAL, being able to see that it's writing to the
same frame after modifying PC and SP is nice.
2023-08-24 17:37:07 -04:00
Jean Boussier cedb333063 Stop incrementing `jit_entry_calls` once threshold is hit
Otherwise the ISeq page will constantly be written
into preventing it from being shared.
2023-08-23 20:20:17 +02:00
Takashi Kokubun cd8d20cd1f
YJIT: Compile exception handlers (#8171)
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
2023-08-08 16:06:22 -07:00
Nobuyoshi Nakada 694d99dda2
Share duplicate code between Wasm and the others 2023-08-08 08:48:53 +09:00
Nobuyoshi Nakada ef5b1d19c1
Turn `jit_exec` and `jit_compile` into macros if disabled 2023-08-06 18:45:40 +09:00
Takashi Kokubun e176f84138 Just suppress a warning for non-Emscripten Wasm build
Revert "Revert "Skip calling jit_exec on Wasm""

This reverts commit 2e94610f70.

It's not about whether it's optimized away or not. I just don't want to
leave and maintain the callsite (e.g. signature) in the path where
YJIT is never built.
2023-08-04 20:50:46 -07:00
Nobuyoshi Nakada 2e94610f70 Revert "Skip calling jit_exec on Wasm"
This reverts commit e80752f9bb.

RJIT and YJIT are never enabled on Wasm.  When both are disabled,
`jit_exec` is defined to return `Qundef` constantly, and is optimized
away.
2023-08-05 12:12:42 +09:00
Takashi Kokubun e80752f9bb Skip calling jit_exec on Wasm
We often break Wasm build when we modify how jit_exec works. I'm
planning to modify it again soon.

We actually don't support running Ruby JIT on Wasm, so it doesn't seem
worth the maintenance effort.
2023-08-04 15:39:02 -07:00
Takashi Kokubun 44b19b01e2 Remove an obsoleted initialization from Wasm 2023-07-27 17:36:07 -07:00
Takashi Kokubun 8c7cf2de98 Remove an unused argument in vm_exec_core 2023-07-27 17:31:27 -07:00
Takashi Kokubun 38be9a9b72
Clean up OPT_STACK_CACHING (#8132) 2023-07-27 17:27:05 -07:00
Nobuyoshi Nakada c4e893ceb5
Adjust brace nesting 2023-07-25 00:01:52 +09:00
Alan Wu f302e725e1
Remove __bp__ and speed-up bmethod calls (#8060)
Remove rb_control_frame_t::__bp__ and optimize bmethod calls

This commit removes the __bp__ field from rb_control_frame_t. It was
introduced to help MJIT, but since MJIT was replaced by RJIT, we can use
vm_base_ptr() to compute it from the SP of the previous control frame
instead. Removing the field avoids needing to set it up when pushing new
frames.

Simply removing __bp__ would cause crashes since RJIT and YJIT used a
slightly different stack layout for bmethod calls than the interpreter.
At the moment of the call, the two layouts looked as follows:

                   ┌────────────┐    ┌────────────┐
                   │ frame_base │    │ frame_base │
                   ├────────────┤    ├────────────┤
                   │    ...     │    │    ...     │
                   ├────────────┤    ├────────────┤
                   │    args    │    │    args    │
                   ├────────────┤    └────────────┘<─prev_frame_sp
                   │  receiver  │
    prev_frame_sp─>└────────────┘
                     RJIT & YJIT      interpreter

Essentially, vm_base_ptr() needs to compute the address to frame_base
given prev_frame_sp in the diagrams. The presence of the receiver
created an off-by-one situation.

Make the interpreter use the layout the JITs use for iseq-to-iseq
bmethod calls. Doing so removes unnecessary argument shifting and
vm_exec_core() re-entry from the interpreter, yielding a speed
improvement visible through `benchmark/vm_defined_method.yml`:

     patched:   7578743.1 i/s
      master:   4796596.3 i/s - 1.58x  slower

C-to-iseq bmethod calls now store one more VALUE than before, but that
should have negligible impact on overall performance.

Note that re-entering vm_exec_core() used to be necessary for firing
TracePoint events, but that's no longer the case since
9121e57a5f.

Closes ruby/ruby#6428
2023-07-17 13:57:58 -04:00
Maxime Chevalier-Boisvert d70484f0eb
YJIT: refactoring to allow for fancier call threshold logic (#8078)
* YJIT: refactoring to allow for fancier call threshold logic

* Avoid potentially compiling functions multiple times.

* Update vm.c

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

---------

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2023-07-17 10:41:18 -04:00
Nobuyoshi Nakada 3c4d788bfe macos: symbols for `rb_execution_context_t` should be internal 2023-07-08 11:31:17 +09:00
yui-knk 19c62b400d Replace parser & node compile_option from Hash to bit field
This commit reduces dependency to CRuby object.
2023-06-17 16:41:08 +09:00
Peter Zhu 813a5f4fc4 Directly allocate FrozenCore as an ICLASS
It's a bad idea to overwrite the flags as the garbage collector may have
set other flags.
2023-06-14 10:42:40 -04:00
yui-knk b481b673d7 [Feature #19719] Universal Parser
Introduce Universal Parser mode for the parser.
This commit includes these changes:

* Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions
  are passed via `struct rb_parser_config_struct` when this macro is enabled.
* Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.
2023-06-12 18:23:48 +09:00
Nobuyoshi Nakada 2d9bc3efe5 [Bug #19597] Freeze script name 2023-05-10 17:14:20 +09:00
Alan Wu adaff1fc49 [Bug #19592] Fix ext/Setup support
After [1], using ext/Setup to link some, but not all extensions failed
during linking. I did not know about this option, and had assumed that
only `--with-static-linked-ext` builds can include statically linked
extensions.

Include the support code for statically linked extensions in all
configurations like before [1]. Initialize the table lazily to minimize
footprint on builds that have no statically linked extensions.

[1]: 790cf4b6d0 "Fix autoload status of
         statically linked extensions"
2023-04-26 15:02:23 -04:00
Jeremy Evans 99c6d19e50 Generalize cfunc large array splat fix to fix many additional cases raising SystemStackError
Originally, when 2e7bceb34e fixed cfuncs to no
longer use the VM stack for large array splats, it was thought to have fully
fixed Bug #4040, since the issue was fixed for methods defined in Ruby (iseqs)
back in Ruby 2.2.

After additional research, I determined that same issue affects almost all
types of method calls, not just iseq and cfunc calls.  There were two main
types of remaining issues, important cases (where large array splat should
work) and pedantic cases (where large array splat raised SystemStackError
instead of ArgumentError).

Important cases:

```ruby
define_method(:a){|*a|}
a(*1380888.times)

def b(*a); end
send(:b, *1380888.times)

:b.to_proc.call(self, *1380888.times)

def d; yield(*1380888.times) end
d(&method(:b))

def self.method_missing(*a); end
not_a_method(*1380888.times)

```

Pedantic cases:

```ruby
def a; end
a(*1380888.times)
def b(_); end
b(*1380888.times)
def c(_=nil); end
c(*1380888.times)

c = Class.new do
  attr_accessor :a
  alias b a=
end.new
c.a(*1380888.times)
c.b(*1380888.times)

c = Struct.new(:a) do
  alias b a=
end.new
c.a(*1380888.times)
c.b(*1380888.times)
```

This patch fixes all usage of CALLER_SETUP_ARG with splatting a large
number of arguments, and required similar fixes to use a temporary
hidden array in three other cases where the VM would use the VM stack
for handling a large number of arguments.  However, it is possible
there may be additional cases where splatting a large number
of arguments still causes a SystemStackError.

This has a measurable performance impact, as it requires additional
checks for a large number of arguments in many additional cases.

This change is fairly invasive, as there were many different VM
functions that needed to be modified to support this. To avoid
too much API change, I modified struct rb_calling_info to add a
heap_argv member for storing the array, so I would not have to
thread it through many functions.  This struct is always stack
allocated, which helps ensure sure GC doesn't collect it early.

Because of how invasive the changes are, and how rarely large
arrays are actually splatted in Ruby code, the existing test/spec
suites are not great at testing for correct behavior.  To try to
find and fix all issues, I tested this in CI with
VM_ARGC_STACK_MAX to -1, ensuring that a temporary array is used
for all array splat method calls.  This was very helpful in
finding breaking cases, especially ones involving flagged keyword
hashes.

Fixes [Bug #4040]

Co-authored-by: Jimmy Miller <jimmy.miller@shopify.com>
2023-04-25 08:06:16 -07:00
Aaron Patterson c5fc1ce975 Emit special instruction for array literal + .(hash|min|max)
This commit introduces a new instruction `opt_newarray_send` which is
used when there is an array literal followed by either the `hash`,
`min`, or `max` method.

```
[a, b, c].hash
```

Will emit an `opt_newarray_send` instruction.  This instruction falls
back to a method call if the "interested" method has been monkey
patched.

Here are some examples of the instructions generated:

```
$ ./miniruby --dump=insns -e '[@a, @b].max'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
0000 getinstancevariable                    :@a, <is:0>               (   1)[Li]
0003 getinstancevariable                    :@b, <is:1>
0006 opt_newarray_send                      2, :max
0009 leave
$ ./miniruby --dump=insns -e '[@a, @b].min'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
0000 getinstancevariable                    :@a, <is:0>               (   1)[Li]
0003 getinstancevariable                    :@b, <is:1>
0006 opt_newarray_send                      2, :min
0009 leave
$ ./miniruby --dump=insns -e '[@a, @b].hash'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
0000 getinstancevariable                    :@a, <is:0>               (   1)[Li]
0003 getinstancevariable                    :@b, <is:1>
0006 opt_newarray_send                      2, :hash
0009 leave
```

[Feature #18897] [ruby-core:109147]

Co-authored-by: John Hawthorn <jhawthorn@github.com>
2023-04-18 17:16:22 -07:00
Nobuyoshi Nakada 607cd24128
Adjust function style [ci skip] 2023-04-15 11:50:54 +09:00
Jeremy Evans 1f115f141d Speed up rebuilding the loaded feature index
Rebuilding the loaded feature index slowed down with the bug fix
for #17885 in 79a4484a07.  The
slowdown was extreme if realpath emulation was used, but even when
not emulated, it could be about 10x slower.

This adds loaded_features_realpath_map to rb_vm_struct. This is a
hidden hash mapping loaded feature paths to realpaths. When
rebuilding the loaded feature index, look at this hash to get
cached realpath values, and skip calling rb_check_realpath if a
cached value is found.

Fixes [Bug #19246]
2023-04-13 20:22:36 -07:00
Matt Valentine-House d91a82850a Pull the shape tree out of the vm object 2023-04-06 11:07:16 +01:00
Koichi Sasada 2093e4c2db `nt->serial` for `RUBY_DEBUG_LOG`
Show native thread's serial on `RUBY_DEBUG_LOG`.
`nt->serial` is also stored into `ruby_nt_serial` if the compiler
supports `RB_THREAD_LOCAL_SPECIFIER`.
2023-03-31 11:28:18 +09:00
Alan Wu 1b06422767
YJIT: Leave cfp->pc uninitialized for VM_FRAME_MAGIC_CFUNC
C function frames don't need to use the VM-specific pc field to run
properly. When pushing a control frame from output code, save one
instruction by leaving the field uninitialized.

Fix-up rb_vm_svar_lep(), which is used while setting local variables via
Regexp#=~. Use cfp->iseq as a secondary signal so it can stop assuming
that all CFUNC frames always have zero pc's.
2023-03-29 17:57:52 -04:00
Maxime Chevalier-Boisvert 39a34694a0
YJIT: Add `--yjit-pause` and `RubyVM::YJIT.resume` (#7609)
* YJIT: Add --yjit-pause and RubyVM::YJIT.resume

This allows booting YJIT in a suspended state. We chose to add a new
command line option as opposed to simply allowing YJIT.resume to work
without any command line option because it allows for combining with
YJIT tuning command line options. It also simpifies implementation.

Paired with Kokubun and Maxime.

* Update yjit.rb

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>

---------

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2023-03-28 15:21:19 -04:00
Aaron Patterson 1a9e2d20e2 Fix shape allocation limits
We can only allocate enough shapes to fit in the shape buffer.
MAX_SHAPE_ID was based on the theoretical maximum number of shapes we
could have, not on the amount of memory we can actually consume.  This
commit changes the MAX_SHAPE_ID to be based on the amount of memory
we're allowed to consume.

Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
2023-03-22 08:46:12 -07:00
Takashi Kokubun ed18093200 Fix the JIT-unsupported case 2023-03-16 10:48:17 -07:00
Takashi Kokubun 9947574b9c Refactor jit_func_t and jit_exec
I closed https://github.com/ruby/ruby/pull/7543, but part of the diff
seems useful regardless, so I extracted it.
2023-03-16 10:42:17 -07:00
Samuel Williams 7fd53eeb46
Remove SIGCHLD `waidpid`. (#7527)
* Remove `waitpid_lock` and related code.

* Remove un-necessary test.

* Remove `rb_thread_sleep_interruptible` dead code.
2023-03-15 19:48:27 +13:00
Takashi Kokubun 868f03cce1 Remove unused jit_enable_p flag
This was used only by MJIT.
2023-03-14 14:01:53 -07:00
Takashi Kokubun 9a43c63d43
YJIT: Implement throw instruction (#7491)
* Break up jit_exec from vm_sendish

* YJIT: Implement throw instruction

* YJIT: Explain what rb_vm_throw does [ci skip]
2023-03-14 13:39:06 -07:00
Samuel Williams ac65ce16e9
Revert SIGCHLD changes to diagnose CI failures. (#7517)
* Revert "Remove special handling of `SIGCHLD`. (#7482)"

This reverts commit 44a0711eab.

* Revert "Remove prototypes for functions that are no longer used. (#7497)"

This reverts commit 4dce12bead.

* Revert "Remove SIGCHLD `waidpid`. (#7476)"

This reverts commit 1658e7d966.

* Fix change to rjit variable name.
2023-03-14 20:07:59 +13:00
Takashi Kokubun 4ad171bb25 Remove an unused VM option
This seems to be used nowhere today.
2023-03-13 20:54:00 -07:00