Граф коммитов

732 Коммитов

Автор SHA1 Сообщение Дата
Jeremy Evans 162ad65fdd Revert "Do not load file with same realpath twice when requiring"
This reverts commit ddb85c5d2b.

This commit causes unexpected warnings in TestTranscode#test_loading_race
occasionally in CI.
2021-09-18 17:37:35 -07:00
Jeremy Evans ddb85c5d2b Do not load file with same realpath twice when requiring
This fixes issues with paths being loaded twice in certain cases
when symlinks are used.

It took me multiple attempts to get this working.  My original
attempt tried to convert paths to realpaths before adding them
to $LOADED_FEATURES.  Unfortunately, this doesn't work well
with the loaded feature index, which is based off load paths
and not realpaths. While I was able to get require working, I'm
fairly sure the loaded feature index was not being used as
expected, which would have significant performance implications.
Additionally, I was never able to get that approach working with
autoload when autoloading a non-realpath file. It also broke
some specs.

This takes a more conservative approach. Directly before loading the
file, if the file with the same realpath has been required, the
loading of the file is skipped. The realpaths are stored as
fstrings in a hidden hash.

When rebuilding the loaded feature index, the hash of realpaths
is also rebuilt.  I'm guessing this makes rebuilding process
slower, but I don think that is a hot path. In general, modifying
loaded features is only done when reloading, and that tends to be
in non-production environments.

Change test_require_with_loaded_features_pop test to use 30 threads
and 300 iterations, instead of 4 threads and 1000 iterations.
I saw only sporadic failures with 4/1000, but consistent failures
30/300 threads. These failures were due to the fact that the
concurrent deletions from $LOADED_FEATURES in other threads can
result in rb_ary_entry returning nil when rebuilding the loaded
features index.

To avoid concurrency issues when rebuilding the loaded features
index, the building of the index itself is left alone, and
afterwards, a separate loop is done on a copy of the loaded feature
snapshot in order to rebuild the realpaths hash.

Fixes [Bug #17885]
2021-09-18 07:05:23 -09:00
卜部昌平 dddc618d30 suppress GCC's -Wsuggest-attribute=format
I was not aware of this because I use clang these days.
2021-09-10 20:00:06 +09:00
Yusuke Endoh f336a3eb6c Use free instead of xfree to free altstack
The altstack memory of a thread may be free'ed even after the VM is
destructed. After that, GC is no longer available, so calling xfree
may lead to a segfault.

This changeset uses the bare free function to free the altstack memory
instead of xfree. [Bug #18126]
2021-09-06 14:22:24 +09:00
Nobuyoshi Nakada 28d03ee776 Remove root_jmpbuf in rb_thread_struct
It has not been used since 1b82c877df.
2021-08-10 19:08:38 +09:00
Samuel Williams 42130a64f0
Replace copy coroutine with pthread implementation. 2021-07-01 11:23:03 +12:00
eileencodes b91b3bc771 Add a cache for class variables
Redo of 34a2acdac788602c14bf05fb616215187badd504 and
931138b00696419945dc03e10f033b1f53cd50f3 which were reverted.

GitHub PR #4340.

This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up th ancestor tree before returning the cvar value. The deeper the
ancestor tree the slower cvar access will be.

The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows with the cache, this branch
is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the more clear the speed increase. IE if Base had
only one ancestor we'd see no improvement. This benchmark is run on a
vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache /  Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-06-18 10:02:44 -07:00
Nobuyoshi Nakada e03bf76b31
Pack iseq_inline_constant_cache_entry
Reordered iseq_inline_constant_cache_entry members not to exceed
the size of RValue.
2021-06-09 19:15:57 +09:00
Takashi Kokubun c32ce2cbf1
Clarify these are just for MJIT
and not for third-party libraries.

See: e6484a1530
2021-06-02 00:09:47 -07:00
Takashi Kokubun e1b03b0c2b
Enable VM_ASSERT in --jit CIs (#4543) 2021-06-01 00:15:51 -07:00
NARUSE, Yui 46655156dc Add Thread#native_thread_id [Feature #17853] 2021-05-26 15:14:11 +09:00
Takashi Kokubun 141861a222
Update a comment about what 'inline' attr means 2021-05-21 23:27:36 -07:00
Aaron Patterson 07f055bb13
Revert "Filling cache values on cvar write"
This reverts commit 08de37f9fa.
This reverts commit e8ae922b62.
2021-05-11 13:31:00 -07:00
eileencodes e8ae922b62 Add a cache for class variables
This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up th ancestor tree before returning the cvar value. The deeper the
ancestor tree the slower cvar access will be.

The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows with the cache, this branch
is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the more clear the speed increase. IE if Base had
only one ancestor we'd see no improvement. This benchmark is run on a
vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache /  Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-05-11 12:04:27 -07:00
Benoit Daloze 68d6bd0873 Fix trivial -Wundef warnings
* See [Feature #17752]

Co-authored-by: xtkoba (Tee KOBAYASHI) <xtkoba+ruby@gmail.com>
2021-05-04 14:56:55 +02:00
xtkoba 121fa24a34 Adjust struct member offset for i386 Cygwin
Fixes [Bug #17606]
2021-05-01 11:04:17 +09:00
Aaron Patterson 8359821870 Use rb_fstring for "defined" strings.
We can take advantage of fstrings to de-duplicate the defined strings.
This means we don't need to keep the list of defined strings on the VM
(or register them as mark objects)
2021-03-17 10:55:37 -07:00
Koichi Sasada 1ecda21366 global call-cache cache table for rb_funcall*
rb_funcall* (rb_funcall(), rb_funcallv(), ...) functions invokes
Ruby's method with given receiver. Ruby 2.7 introduced inline method
cache with static memory area. However, Ruby 3.0 reimplemented the
method cache data structures and the inline cache was removed.

Without inline cache, rb_funcall* searched methods everytime.
Most of cases per-Class Method Cache (pCMC) will be helped but
pCMC requires VM-wide locking and it hurts performance on
multi-Ractor execution, especially all Ractors calls methods
with rb_funcall*.

This patch introduced Global Call-Cache Cache Table (gccct) for
rb_funcall*. Call-Cache was introduced from Ruby 3.0 to manage
method cache entry atomically and gccct enables method-caching
without VM-wide locking. This table solves the performance issue
on multi-ractor execution.
[Bug #17497]

Ruby-level method invocation does not use gccct because it has
inline-method-cache and the table size is limited. Basically
rb_funcall* is not used frequently, so 1023 entries can be enough.
We will revisit the table size if it is not enough.
2021-01-29 16:22:12 +09:00
Koichi Sasada d968829afa expose some C-APIs for ractor
expose some C-APIs to try to make ractor utilities on external gems.

* add
  * rb_ractor_local_storage_value_lookup() to check availability
* expose
  * rb_ractor_make_shareable()
  * rb_ractor_make_shareable_copy()
  * rb_proc_isolate() (not public)
  * rb_proc_isolate_bang() (not public)
  * rb_proc_ractor_make_shareable() (not public)
2021-01-06 16:03:09 +09:00
Koichi Sasada e7fc353f04 enable constant cache on ractors
constant cache `IC` is accessed by non-atomic manner and there are
thread-safety issues, so Ruby 3.0 disables to use const cache on
non-main ractors.

This patch enables it by introducing `imemo_constcache` and allocates
it by every re-fill of const cache like `imemo_callcache`.
[Bug #17510]

Now `IC` only has one entry `IC::entry` and it points to
`iseq_inline_constant_cache_entry`, managed by T_IMEMO object.

`IC` is atomic data structure so `rb_mjit_before_vm_ic_update()` and
`rb_mjit_after_vm_ic_update()` is not needed.
2021-01-05 02:27:58 +09:00
Koichi Sasada 02d9524cda separate rb_ractor_pub from rb_ractor_t
separate some fields from rb_ractor_t to rb_ractor_pub and put it
at the beggining of rb_ractor_t and declare it in vm_core.h so
vm_core.h can access rb_ractor_pub fields.

Now rb_ec_ractor_hooks() is a complete inline function and no
MJIT related issue.
2020-12-22 00:03:00 +09:00
Koichi Sasada a2950369bd TracePoint.new(&block) should be ractor-local
TracePoint should be ractor-local because the Proc can violate the
Ractor-safe.
2020-12-22 00:03:00 +09:00
Koichi Sasada 91e2f08a6a export rb_eRactorIsolationError for MJIT
https://ci.appveyor.com/project/ruby/ruby/builds/36942168/job/7ugrpk0pndoly9wp
```
_ruby_mjit_p11920u0.c
C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.c(14) : warning C4005: 'GET_SELF' : macro redefinition
        c:\projects\ruby\vm_insnhelper.h(111) : see previous definition of 'GET_SELF'
   Creating library C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.lib and object C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.exp
_ruby_mjit_p11920u0.obj : error LNK2001: unresolved external symbol rb_eRactorIsolationError
C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.so : fatal error LNK1120: 1 unresolved externals
```
2020-12-21 23:28:05 +09:00
Koichi Sasada dca6752fec Introduce Ractor::IsolationError
Ractor has several restrictions to keep each ractor being isolated
and some operation such as `CONST="foo"` in non-main ractor raises
an exception. This kind of operation raises an error but there is
confusion (some code raises RuntimeError and some code raises
NameError).

To make clear we introduce Ractor::IsolationError which is raised
when the isolation between ractors is violated.
2020-12-21 22:29:05 +09:00
Koichi Sasada aa6287cd26 fix inline method cache sync bug
`cd` is passed to method call functions to method invocation
functions, but `cd` can be manipulated by other ractors simultaneously
so it contains thread-safety issue.

To solve this issue, this patch stores `ci` and found `cc` to `calling`
and stops to pass `cd`.
2020-12-15 13:29:30 +09:00
Koichi Sasada 967040ba59 Introduce negative method cache
pCMC doesn't have negative method cache so this patch  implements it.
2020-12-14 11:57:46 +09:00
Nobuyoshi Nakada 0df67a4695 Signal handler type should be void 2020-12-12 17:02:42 +09:00
Koichi Sasada bef3eb5440 fix decl of ruby_single_main_ractor
On windows, MJIT doesn't work without this patch because of
the declaration of ruby_single_main_ractor. This patch fix this
issue and move the definition of it from ractor.c to vm.c to locate
near place of ruby_current_vm_ptr.
2020-12-07 08:28:36 +09:00
Koichi Sasada b67b24d0f5 ruby_single_main_ractor for single ractor mode
ruby_multi_ractor was a flag that indicates the interpreter doesn't
make any additional ractors (single ractor mode).
Instead of boolean flag, ruby_single_main_ractor pointer is introduced
which keeps main ractor's pointer if single ractor mode. If additional
ractors are created, ruby_single_main_ractor becomes NULL.
2020-12-07 08:28:36 +09:00
Koichi Sasada 182fb73c40 rb_ext_ractor_safe() to declare ractor-safe ext
C extensions can violate the ractor-safety, so only ractor-safe
C extensions (C methods) can run on non-main ractors.
rb_ext_ractor_safe(true) declares that the successive
defined methods are ractor-safe. Otherwiwze, defined methods
checked they are invoked in main ractor and raise an error
if invoked at non-main ractors.

[Feature #17307]
2020-12-01 15:44:18 +09:00
Koichi Sasada 1e8abe5d03 introduce USE_VM_CLOCK for windows.
The timer function used on windows system set timer interrupt
flag of current main ractor's executing ec and thread can detect
the end of time slice. However, to set all ec->interrupt_flag for
all running ractors, it is requires to synchronize with other ractors.
However, timer thread can not acquire the ractor-wide lock because
of some limitation.

To solve this issue, this patch introduces USE_VM_CLOCK compile option
to introduce rb_vm_t::clock. This clock will be incremented by the
timer thread and each thread can check the incrementing by comparison
with previous checked clock. At last, on windows platform this patch
introduces some overhead, but I think there is no critical performance
issue because of this modification.
2020-11-11 15:49:02 +09:00
Koichi Sasada 5d97bdc2dc Ractor.make_shareable(a_proc)
Ractor.make_shareable() supports Proc object if
(1) a Proc only read outer local variables (no assignments)
(2) read outer local variables are shareable.

Read local variables are stored in a snapshot, so after making
shareable Proc, any assignments are not affeect like that:

```ruby
a = 1
pr = Ractor.make_shareable(Proc.new{p a})
pr.call #=> 1
a = 2
pr.call #=> 1 # `a = 2` doesn't affect
```

[Feature #17284]
2020-10-30 03:12:09 +09:00
Koichi Sasada 07c03bc309 check isolated Proc more strictly
Isolated Proc prohibit to access outer local variables, but it was
violated by binding and so on, so they should be error.
2020-10-29 23:42:55 +09:00
Jeremy Evans dfb3605bbe
Add Thread.ignore_deadlock accessor
Setting this to true disables the deadlock detector.  It should
only be used in cases where the deadlock could be broken via some
external means, such as via a signal.

Now that $SAFE is no longer used, replace the safe_level_ VM flag
with ignore_deadlock for storing the setting.

Fixes [Bug #13768]
2020-10-28 15:27:00 -07:00
Koichi Sasada 99310e3eb5 Some global variables can be accessed from ractors
Some global variables should be used from non-main Ractors.
[Bug #17268]

```ruby
     # ractor-local (derived from created ractor): debug
     '$DEBUG' => $DEBUG,
     '$-d' => $-d,

     # ractor-local (derived from created ractor): verbose
     '$VERBOSE' => $VERBOSE,
     '$-w' => $-w,
     '$-W' => $-W,
     '$-v' => $-v,

     # process-local (readonly): other commandline parameters
     '$-p' => $-p,
     '$-l' => $-l,
     '$-a' => $-a,

     # process-local (readonly): getpid
     '$$'  => $$,

     # thread local: process result
     '$?'  => $?,

     # scope local: match
     '$~'  => $~.inspect,
     '$&'  => $&,
     '$`'  => $`,
     '$\''  => $',
     '$+'  => $+,
     '$1'  => $1,

     # scope local: last line
     '$_' => $_,

     # scope local: last backtrace
     '$@' => $@,
     '$!' => $!,

     # ractor local: stdin, out, err
     '$stdin'  => $stdin.inspect,
     '$stdout' => $stdout.inspect,
     '$stderr' => $stderr.inspect,
```
2020-10-20 15:38:54 +09:00
Koichi Sasada 319afed20f Use language TLS specifier if it is possible.
To access TLS, it is faster to use language TLS specifier instead
of using pthread_get/setspecific functions.

Original proposal is: Use native thread locals. #3665
2020-10-20 01:05:06 +09:00
Koichi Sasada f6661f5085 sync RClass::ext::iv_index_tbl
iv_index_tbl manages instance variable indexes (ID -> index).
This data structure should be synchronized with other ractors
so introduce some VM locks.

This patch also introduced atomic ivar cache used by
set/getinlinecache instructions. To make updating ivar cache (IVC),
we changed iv_index_tbl data structure to manage (ID -> entry)
and an entry points serial and index. IVC points to this entry so
that cache update becomes atomically.
2020-10-17 08:18:04 +09:00
Koichi Sasada ae693fff74 fix releasing timing.
(1) recorded_lock_rec > current_lock_rec should not be occurred
    on rb_ec_vm_lock_rec_release().
(2) should be release VM lock at EXEC_TAG(), not POP_TAG().
(3) some refactoring.
2020-10-14 16:36:55 +09:00
Koichi Sasada c3ba3fa8d0 support exception when lock_rec > 0
If a ractor getting a VM lock (monitor) raises an exception,
unlock can be skipped. To release VM lock correctly on exception
(or other jumps with JUMP_TAG), EC_POP_TAG() releases VM lock.
2020-10-14 14:02:06 +09:00
Samuel Williams 70f08f1eed Make `Thread#join` non-blocking. 2020-09-21 11:48:44 +12:00
Koichi Sasada 74ddac1c82 relax dependency
vm_sync.h does not need to include vm_core.h and ractor_pub.h.
2020-09-15 00:04:59 +09:00
Koichi Sasada f7ccb8dd88 restart Ractor.select on intterupt
signal can interrupt Ractor.select, but if there is no exception,
Ractor.select should restart automatically.
2020-09-15 00:04:59 +09:00
Koichi Sasada 79df14c04b Introduce Ractor mechanism for parallel execution
This commit introduces Ractor mechanism to run Ruby program in
parallel. See doc/ractor.md for more details about Ractor.
See ticket [Feature #17100] to see the implementation details
and discussions.

[Feature #17100]

This commit does not complete the implementation. You can find
many bugs on using Ractor. Also the specification will be changed
so that this feature is experimental. You will see a warning when
you make the first Ractor with `Ractor.new`.

I hope this feature can help programmers from thread-safety issues.
2020-09-03 21:11:06 +09:00
Jeremy Evans a0273d67d0 Avoid a use after free in VM assertion
If the thread for the current EC has been killed, don't check
the VM ptr for the EC (which gets it via the thread), as that will
have already been freed.

Fixes [Bug #16907]
2020-08-21 14:52:30 -07:00
Alan Wu 1d8b689b9e Remove unused field in rb_iseq_constant_body
This was introduced in 191ce5344e
and has been unused since beae6cbf0f
2020-07-23 01:17:59 -04:00
卜部昌平 215c6fa3d0 RUBY_CONST_ASSERT: use STATIC_ASSERT instead
Static assertions shall be done using STATIC_ASSERT these days.
2020-07-10 12:23:41 +09:00
Takashi Kokubun 7561db8c00
Introduce Primitive.attr! to annotate 'inline' (#3242)
[Feature #15589]
2020-06-20 17:13:03 -07:00
Samuel Williams 0e3b0fcdba
Thread scheduler for light weight concurrency. 2020-05-14 22:10:55 +12:00
卜部昌平 233c2018f1 drop varargs.h support
This header file is simply out of date (for decades since at least
1989).  It's the 21st century.  Just stop using it.
2020-05-11 14:56:51 +09:00
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00