github/ruby - ruby

Commit graph

Author	SHA1	Message	Date
John Hawthorn	679ef34586	New constant caching insn: opt_getconstant_path Previously YARV bytecode implemented constant caching by having a pair of instructions, opt_getinlinecache and opt_setinlinecache, wrapping a series of getconstant calls (with putobject providing supporting arguments). This commit replaces that pattern with a new instruction, opt_getconstant_path, handling both getting/setting the inline cache and fetching the constant on a cache miss. This is implemented by storing the full constant path as a null-terminated array of IDs inside of the IC structure. idNULL is used to signal an absolute constant reference. $ ./miniruby --dump=insns -e '::Foo::Bar::Baz' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE) 0000 opt_getconstant_path <ic:0 ::Foo::Bar::Baz> ( 1)[Li] 0002 leave The motivation for this is that we had increasingly found the need to disassemble the instructions between the opt_getinlinecache and opt_setinlinecache in order to determine the constant we are fetching, or otherwise store metadata. This disassembly was done: * In opt_setinlinecache, to register the IC against the constant names it is using for granular invalidation. * In rb_iseq_free, to unregister the IC from the invalidation table. * In YJIT to find the position of an opt_getinlinecache instruction to invalidate it when the cache is populated * In YJIT to register the constant names being used for invalidation. With this change we no longer need disassembly for these (in fact rb_iseq_each is now unused), as the list of constant names being referenced is held in the IC. This should also make it possible to make more optimizations in the future. This may also reduce the size of iseqs, as previously each segment required 32 bytes (on 64-bit platforms) for each constant segment. This implementation only stores one ID per-segment. There should be no significant performance change between this and the previous implementation. 
Previously opt_getinlinecache was a "leaf" instruction, but it included a jump (almost always to a separate cache line). Now opt_getconstant_path is a non-leaf (it may raise/autoload/call const_missing) but it does not jump. These seem to even out.	2022-09-01 15:20:49 -07:00
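
The path layout described in the commit above can be modelled in plain Ruby. This is only a sketch under stated assumptions: `ABSOLUTE`, `constant_path`, and `resolve` are hypothetical names, a Ruby array of symbols stands in for the null-terminated C array of IDs, `:NULL` stands in for `idNULL`, and relative lookups start from `Object` rather than from the real lexical scope.

```ruby
# Hypothetical stand-in for idNULL: marks an absolute ("::Foo") reference.
ABSOLUTE = :NULL

# Turn source text such as "::Foo::Bar" into a list of path segments.
# A leading "::" becomes the ABSOLUTE marker in the first slot.
def constant_path(source)
  names = source.split("::")
  absolute = names.first == ""
  names.shift if absolute
  path = names.map(&:to_sym)
  absolute ? [ABSOLUTE, *path] : path
end

# Walk the stored path one segment at a time, as a cache miss would.
# Simplification: relative paths are resolved from Object, not from a cref.
def resolve(path, scope = Object)
  path.reduce(scope) do |base, name|
    name == ABSOLUTE ? Object : base.const_get(name)
  end
end

p constant_path("::Foo::Bar::Baz")
p resolve(constant_path("::File::Stat"))
```

Because the whole path lives in one place, code that needs the constant names (for example, for invalidation) can read them directly instead of disassembling instructions.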
Takashi Kokubun	d6f21b308b	Convert catch_except_t to stdbool catch_except_t is a field that exists for MJIT. In the process of rewriting MJIT in Ruby, I added an API to convert 1/0 of _Bool to true/false, and it seemed confusing and hard to maintain if you don't use _Bool for *_p fields.	2022-08-25 23:00:19 -07:00
Takashi Kokubun	485019c2bd	Rename mjit_exec to jit_exec (#6262) * Rename mjit_exec to jit_exec * Rename mjit_exec_slowpath to mjit_check_iseq * Remove mjit_exec references from comments	2022-08-19 23:57:17 -07:00
Nobuyoshi Nakada	ee864beb7c	Simplify around `USE_YJIT` macro (#6240) * Simplify around `USE_YJIT` macro - Use `USE_YJIT` macro only instead of `YJIT_BUILD`. - An intermediate macro `YJIT_SUPPORTED_P` is no longer used. * Bail out if YJIT is enabled on unsupported platforms	2022-08-15 13:05:12 -04:00
Yusuke Endoh	8f7e188822	Add "rb_" prefixes to toplevel enum definitions ... as per ko1's request.	2022-07-22 23:10:24 +09:00
Yusuke Endoh	e763b1118b	Move enum definitions out of struct definition	2022-07-22 23:10:24 +09:00
Takashi Kokubun	5b21e94beb	Expand tabs [ci skip] [Misc #18891]	2022-07-21 09:42:04 -07:00
Jemma Issroff	85ea46730d	Separate TS_IVC and TS_ICVARC in is_entries buffers This allows us to treat cvar caches differently than ivar caches.	2022-07-18 14:06:30 -07:00
John Hawthorn	b8247a1669	Correct comment explaining env flags [ci skip] We use 4 values for env flags now, which also shifted over the frame flags by one bit.	2022-07-14 09:59:34 -07:00
Aaron Patterson	8d63a04703	Flatten bitmap when there is only one element We can avoid allocating a bitmap when the number of elements in the iseq is fewer than the size of an iseq_bits_t	2022-06-23 16:52:00 -07:00
Aaron Patterson	1ccdb1a251	Update vm_core.h Co-authored-by: Tomás Coêlho <36938811+tomascco@users.noreply.github.com>	2022-06-23 14:01:46 -07:00
Aaron Patterson	e23540e566	Speed up ISeq by marking via bitmaps and IC rearranging This commit adds a bitfield to the iseq body that stores offsets inside the iseq buffer that contain values we need to mark. We can use this bitfield to mark objects instead of disassembling the instructions. This commit also groups inline storage entries and adds a counter for each entry. This allows us to iterate and mark each entry without disassembling instructions. Since we have a bitfield and grouped inline caches, we can mark all VALUE objects associated with instructions without actually disassembling the instructions at mark time. [Feature #18875] [ruby-core:109042]	2022-06-23 14:01:46 -07:00
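
The bitmap idea in the commit above can be illustrated with a small Ruby model. This is a sketch only: `TinyISeq`, `push`, and `each_markable` are hypothetical names, and a Ruby Integer stands in for the C bitfield stored on the iseq body.

```ruby
# Hypothetical iseq model: a flat buffer of opcodes and operands, plus an
# Integer used as a bitmap where bit i is set when buffer[i] holds an
# object reference that the GC would need to mark.
class TinyISeq
  attr_reader :buffer, :mark_bits

  def initialize
    @buffer = []
    @mark_bits = 0
  end

  # Append one word; record its offset in the bitmap if it must be marked.
  def push(word, markable: false)
    @mark_bits |= (1 << @buffer.size) if markable
    @buffer << word
    self
  end

  # Visit markable words by scanning the bitmap only; no instruction
  # decoding is needed to find them.
  def each_markable
    return enum_for(:each_markable) unless block_given?

    bits = @mark_bits
    index = 0
    until bits.zero?
      yield @buffer[index] if bits.odd?
      bits >>= 1
      index += 1
    end
  end
end

iseq = TinyISeq.new
iseq.push(:putobject).push("hello", markable: true).push(:leave)
p iseq.each_markable.to_a
```

The scan stops as soon as no set bits remain, so a buffer with few markable operands is cheap to walk.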
Koichi Sasada	08cee2bf80	altstack is native thread's attr Move th->altstack to th->nt->altstack.	2022-05-24 17:50:49 +09:00
Koichi Sasada	f3235ac095	add `rb_th_serial()` `rb_th_serial(th)` returns th's serial for debug print purpose.	2022-05-24 10:06:51 +09:00
Koichi Sasada	d9984f39d3	remove `NON_SCALAR_THREAD_ID` support `NON_SCALAR_THREAD_ID` indicates that `pthread_t` is non-scalar (non-pointer), and s390x is the only known such platform. However, the supporting code is very complex and is used only for debug print information. So this patch removes support for `NON_SCALAR_THREAD_ID` to simplify the code.	2022-05-24 10:06:51 +09:00
Koichi Sasada	eab99b1d4b	`rb_thread_t::serial` for debug `rb_thread_t::serial` is an auto-incremented serial number for threads. It can overflow, which means the serial is not an ID for each thread; it is only for debug printing. `RUBY_DEBUG_LOG` shows this information. Also skip EC related information if EC is NULL. This patch makes it possible to use `RUBY_DEBUG_LOG` without setting up an EC.	2022-05-20 17:37:46 +09:00
Samuel Williams	f626998c4f	Delete autoload data from global features after autoload has completed. (#5910) * Update naming of critical section assertions macros. * Improved locking for autoload.	2022-05-17 00:50:02 +12:00
Samuel Williams	32de6097b2	Fix various autoload race conditions. (#5898) * Add RUBY_VM_CRITICAL_SECTION for detecting unexpected context switch. * Prevent race between GC mark and autoload setup. * Protect race on autoload state. * Avoid potential race condition when allocating `autoload_featuremap`. * Add NEWS entry for autoload fixes.	2022-05-15 16:07:12 +12:00
Alan Wu	f90549cd38	Rust YJIT In December 2021, we opened an [issue] to solicit feedback regarding the porting of the YJIT codebase from C99 to Rust. There were some reservations, but this project was given the go ahead by Ruby core developers and Matz. Since then, we have successfully completed the port of YJIT to Rust. The new Rust version of YJIT has reached parity with the C version, in that it passes all the CRuby tests, is able to run all of the YJIT benchmarks, and performs similarly to the C version (because it works the same way and largely generates the same machine code). We've even incorporated some design improvements, such as a more fine-grained constant invalidation mechanism which we expect will make a big difference in Ruby on Rails applications. Because we want to be careful, YJIT is guarded behind a configure option: ```shell ./configure --enable-yjit # Build YJIT in release mode ./configure --enable-yjit=dev # Build YJIT in dev/debug mode ``` By default, YJIT does not get compiled and cargo/rustc is not required. If YJIT is built in dev mode, then `cargo` is used to fetch development dependencies, but when building in release, `cargo` is not required, only `rustc`. At the moment YJIT requires Rust 1.60.0 or newer. The YJIT command-line options remain mostly unchanged, and more details about the build process are documented in `doc/yjit/yjit.md`. The CI tests have been updated and do not take any more resources than before. The development history of the Rust port is available at the following commit for interested parties: `1fd9573d8b` Our hope is that Rust YJIT will be compiled and included as a part of system packages and compiled binaries of the Ruby 3.2 release. 
We do not anticipate any major problems as Rust is well supported on every platform which YJIT supports, but to make sure that this process works smoothly, we would like to reach out to those who take care of building systems packages before the 3.2 release is shipped and resolve any issues that may come up. [issue]: https://bugs.ruby-lang.org/issues/18481 Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com> Co-authored-by: Noah Gibbs <the.codefolio.guy@gmail.com> Co-authored-by: Kevin Newton <kddnewton@gmail.com>	2022-04-27 11:00:22 -04:00
Koichi Sasada	03d21a4fb0	introduce struct `rb_native_thread` `rb_thread_t` contained `native_thread_data_t` to represent thread implementation dependent data. This patch separates them, renames it `rb_native_thread`, and points to it from `rb_thread_t`. Now, 1 Ruby thread (`rb_thread_t`) has 1 native thread (`rb_native_thread`).	2022-04-23 03:08:27 +09:00
Koichi Sasada	1c4fc0241d	rename thread internal naming The GVL is no longer process-global, so this patch tries to use other words. * `rb_global_vm_lock_t` -> `struct rb_thread_sched` * `gvl->owner` -> `sched->running` * `gvl->waitq` -> `sched->readyq` * `rb_gvl_init` -> `rb_thread_sched_init` * `gvl_destroy` -> `rb_thread_sched_destroy` * `gvl_acquire` -> `thread_sched_to_running` # waiting -> ready -> running * `gvl_release` -> `thread_sched_to_waiting` # running -> waiting * `gvl_yield` -> `thread_sched_yield` * `GVL_UNLOCK_BEGIN` -> `THREAD_BLOCKING_BEGIN` * `GVL_UNLOCK_END` -> `THREAD_BLOCKING_END` * removed * `rb_ractor_gvl` * `rb_vm_gvl_destroy` (not used) There are still GVL functions such as `rb_thread_call_without_gvl()`, but I don't have a good name to replace them. Maybe GVL stands for "Great Valuable Lock" or something like that.	2022-04-22 07:54:09 +09:00
Kevin Newton	6068da8937	Finer-grained constant cache invalidation (take 2) This commit reintroduces finer-grained constant cache invalidation. After `8008fb7` got merged, it was causing issues on token-threaded builds (such as on Windows). The issue was that when you're iterating through instruction sequences and using the translator functions to get back the instruction structs, you're either using `rb_vm_insn_null_translator` or `rb_vm_insn_addr2insn2` depending if it's a direct-threading build. `rb_vm_insn_addr2insn2` does some normalization to always return to you the non-trace version of whatever instruction you're looking at. `rb_vm_insn_null_translator` does not do that normalization. This means that when you're looping through the instructions if you're trying to do an opcode comparison, it can change depending on the type of threading that you're using. This can be very confusing. So, this commit creates a new translator function `rb_vm_insn_normalizing_translator` to always return the non-trace version so that opcode comparisons don't have to worry about different configurations. [Feature #18589]	2022-04-01 14:48:22 -04:00
Nobuyoshi Nakada	42a0bed351	Prefix ccan headers (#4568) * Prefixed ccan headers * Remove unprefixed names in ccan/build_assert * Remove unprefixed names in ccan/check_type * Remove unprefixed names in ccan/container_of * Remove unprefixed names in ccan/list Co-authored-by: Samuel Williams <samuel.williams@oriontransfer.co.nz>	2022-03-30 20:36:31 +13:00
Nobuyoshi Nakada	69967ee64e	Revert "Finer-grained inline constant cache invalidation" This reverts commits for [Feature #18589]: * `8008fb7352` "Update formatting per feedback" * `8f6eaca2e1` "Delete ID from constant cache table if it becomes empty on ISEQ free" * `629908586b` "Finer-grained inline constant cache invalidation" MSWin builds on AppVeyor have been crashing since the merger.	2022-03-25 20:29:09 +09:00
Kevin Newton	629908586b	Finer-grained inline constant cache invalidation Current behavior - caches depend on a global counter. All constant mutations cause caches to be invalidated. ```ruby class A B = 1 end def foo A::B # inline cache depends on global counter end foo # populate inline cache foo # hit inline cache C = 1 # global counter increments, all caches are invalidated foo # misses inline cache due to `C = 1` ``` Proposed behavior - caches depend on name components. Only constant mutations with corresponding names will invalidate the cache. ```ruby class A B = 1 end def foo A::B # inline cache depends on constants named "A" and "B" end foo # populate inline cache foo # hit inline cache C = 1 # caches that depend on the name "C" are invalidated foo # hits inline cache because IC only depends on "A" and "B" ``` Examples of breaking the new cache: ```ruby module C # Breaks `foo` cache because "A" constant is set and the cache in foo depends # on "A" and "B" class A; end end B = 1 ``` We expect the new cache scheme to be invalidated less often because names aren't frequently reused. With the cache being invalidated less, we can rely on its stability more to keep our constant references fast and reduce the need to throw away generated code in YJIT.	2022-03-24 09:14:38 -07:00
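
The name-based scheme described in the commit above can be sketched as a table from constant names to the caches that depend on them. This is an illustrative model, not the VM's data structure: `InlineCache`, `ConstantCacheTable`, `register`, and `constant_set` are hypothetical names.

```ruby
# Hypothetical inline cache: only tracks whether it is currently valid.
class InlineCache
  def initialize
    @valid = false
  end

  def fill
    @valid = true
  end

  def invalidate
    @valid = false
  end

  def valid?
    @valid
  end
end

# Hypothetical invalidation table: constant name => dependent caches.
class ConstantCacheTable
  def initialize
    @caches_by_name = Hash.new { |hash, name| hash[name] = [] }
  end

  # Record that `cache` depends on every name in its constant path.
  def register(cache, names)
    names.each { |name| @caches_by_name[name] << cache }
  end

  # Called when a constant called `name` is assigned anywhere: only caches
  # whose path mentions that name are invalidated.
  def constant_set(name)
    @caches_by_name.fetch(name, []).each(&:invalidate)
  end
end

table = ConstantCacheTable.new
foo_cache = InlineCache.new
foo_cache.fill
table.register(foo_cache, [:A, :B])

table.constant_set(:C)
p foo_cache.valid?   # still valid: the cache does not depend on "C"
table.constant_set(:A)
p foo_cache.valid?   # invalidated: some constant named "A" was set
```

The second call mirrors the "breaking" example in the commit message: setting any constant named `A`, even inside an unrelated module, invalidates caches that mention `A`.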
Peter Zhu	5f10bd634f	Add ISEQ_BODY macro Use ISEQ_BODY macro to get the rb_iseq_constant_body of the ISeq. Using this macro will make it easier for us to change the allocation strategy of rb_iseq_constant_body when using Variable Width Allocation.	2022-03-24 10:03:51 -04:00
Yuta Saito	23de01c7aa	[wasm] eval_inter.h gc.c vm_core.h: include wasm/setjmp.h instead of sysroot header	2022-01-19 11:19:06 +09:00
Koichi Sasada	ad450c9fe5	make `overloaded_cme_table` truly weak key map `overloaded_cme_table` keeps cme -> monly_cme pairs to manage the corresponding `monly_cme` for each `cme`. The lifetime of the `monly_cme` should be longer than that of the `cme`, but the previous patch lost the reference to the living `monly_cme`. Now `overloaded_cme_table` values are always roots (keys are only weak references), which means a `monly_cme` is not freed until the corresponding `cme` is invalidated. To make managing easy, move `overloaded_cme_table` to `rb_vm_t`.	2021-12-21 15:21:30 +09:00
Koichi Sasada	2e6e2fd9da	fix local TP memory leak Free `rb_hook_list_t` itself if needed. To recognize the need, this patch introduces the `rb_hook_list_t::is_local` flag. This patch is a follow-up to https://github.com/ruby/ruby/pull/4652	2021-12-15 02:31:58 +09:00
Yusuke Endoh	8613c0c675	Introduce an option "--dump=insns_without_opt" for debugging purposes	2021-12-13 10:29:08 +09:00
Koichi Sasada	b1b73936c1	`Primitive.mandatory_only?` for fast path Compared with C methods, a built-in method written in Ruby is slower if only mandatory parameters are given, because it needs to check the arguments and fill default values for optional and keyword parameters (C methods can check the number of parameters with `argc`, so there is no overhead). Passing only mandatory arguments is common (optional arguments are exceptional, in many cases), so it is important to provide a fast path for such common cases. `Primitive.mandatory_only?` is a special builtin function used with an `if` expression like this: ```ruby def self.at(time, subsec = false, unit = :microsecond, in: nil) if Primitive.mandatory_only? Primitive.time_s_at1(time) else Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in)) end end ``` and it makes two ISeqs, ``` def self.at(time, subsec = false, unit = :microsecond, in: nil) Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in)) end def self.at(time) Primitive.time_s_at1(time) end ``` and (2), the mandatory-only ISeq, is pointed to by (1). Note that `Primitive.mandatory_only?` should be used only in the condition of an `if` statement, and the `if` statement should be the whole method body (you cannot put any expression before or after the `if` statement). A method entry with `mandatory_only?` (`Time.at` in the above case) is marked as `iseq_overload`. When the method is dispatched with only mandatory arguments (`Time.at(0)` for example), another method entry is made with ISeq (2) as the mandatory-only method entry, and it is cached in an inline method cache. The idea is similar to the one discussed in https://bugs.ruby-lang.org/issues/16254 but it only checks whether mandatory parameters or more are given, because in many cases only mandatory parameters are given. If we find other cases (optional or keyword parameters are used frequently and it hurts performance), we can extend the feature.	2021-11-15 15:58:56 +09:00
Yuta Saito	8590d61ea9	Select including thread impl file at config time	2021-10-30 10:18:33 +09:00
Yusuke Endoh	c1228f833c	vm_core.h: Avoid unaligned access to ic_serial on 32-bit machine This caused Bus error on 32 bit Solaris	2021-10-29 10:57:46 +09:00
Yusuke Endoh	86e3d77abb	Make Coverage suspendable (#4856) * Make Coverage suspendable Add `Coverage.suspend`, `Coverage.resume` and some methods. [Feature #18176] [ruby-core:105321]	2021-10-25 20:00:51 +09:00
Nobuyoshi Nakada	7459a32af3	suppress warnings for probable NULL dereferences	2021-10-24 19:24:50 +09:00
Koichi Sasada	c7550537f1	`RubyVM.keep_script_lines` `RubyVM.keep_script_lines` makes it possible to keep script lines for each ISeq and AST. This feature is for debugger/REPL support. ```ruby RubyVM.keep_script_lines = true RubyVM::keep_script_lines = true eval("def foo = nil\ndef bar = nil") pp RubyVM::InstructionSequence.of(method(:foo)).script_lines ```	2021-10-21 16:17:39 +09:00
Alan Wu	5906a5a732	Add comments about special runtime routines YJIT calls When YJIT makes calls to routines without reconstructing interpreter state through jit_prepare_routine_call(), it relies on the routine to never allocate, raise, and push/pop control frames. Comment about this on the routines that YJIT calls. This is probably something we should dynamically verify on debug builds. It's hard to statically verify this as it requires verifying all functions in the call tree. Maybe something to look at in the future.	2021-10-20 18:19:43 -04:00
Alan Wu	7c08538aa3	Cleanup diff against upstream. Add comments I did a `git diff --stat` against upstream and looked at all the files that are outside of YJIT to come up with these minor changes.	2021-10-20 18:19:42 -04:00
Alan Wu	b626dd7211	YJIT: Fancier opt_getinlinecache Make sure `opt_getinlinecache` is in a block all on its own, and invalidate it from the interpreter when `opt_setinlinecache` runs. It will recompile with a filled cache the second time around. This lets YJIT run well when the IC for a constant is cold.	2021-10-20 18:19:33 -04:00
Jose Narvaez	4e2eb7695e	Yet Another Ruby JIT! Renaming uJIT to YJIT. AKA s/ujit/yjit/g.	2021-10-20 18:19:31 -04:00
Maxime Chevalier-Boisvert	abc016ad2c	WIP refactor block lists to use darray	2021-10-20 18:19:30 -04:00
Alan Wu	c02517bacb	Tie lifetime of uJIT blocks to iseqs * Tie lifetime of uJIT blocks to iseqs Blocks weren't being freed when iseqs are collected. * Add rb_dary. Use it for method dependency table * Keep track of blocks per iseq Remove global version_tbl * Block version bookkeeping fix * dary -> darray * free ujit_blocks * comment about size of ujit_blocks	2021-10-20 18:19:29 -04:00
Maxime Chevalier-Boisvert	9d8cc01b75	WIP JIT-to-JIT returns	2021-10-20 18:19:28 -04:00
Jeremy Evans	79a4484a07	Do not load file with same realpath twice when requiring This fixes issues with paths being loaded twice in certain cases when symlinks are used. It took me multiple attempts to get this working. My original attempt tried to convert paths to realpaths before adding them to $LOADED_FEATURES. Unfortunately, this doesn't work well with the loaded feature index, which is based off load paths and not realpaths. While I was able to get require working, I'm fairly sure the loaded feature index was not being used as expected, which would have significant performance implications. Additionally, I was never able to get that approach working with autoload when autoloading a non-realpath file. It also broke some specs. This takes a more conservative approach. Directly before loading the file, if the file with the same realpath has been required, the loading of the file is skipped. The realpaths are stored as fstrings in a hidden hash. When rebuilding the loaded feature index, the hash of realpaths is also rebuilt. I'm guessing this makes the rebuilding process slower, but I don't think that is a hot path. In general, modifying loaded features is only done when reloading, and that tends to be in non-production environments. Change test_require_with_loaded_features_pop test to use 30 threads and 300 iterations, instead of 4 threads and 1000 iterations. I saw only sporadic failures with 4/1000, but consistent failures with 30/300. These failures were due to the fact that the concurrent deletions from $LOADED_FEATURES in other threads can result in rb_ary_entry returning nil when rebuilding the loaded features index. To avoid concurrency issues when rebuilding the loaded features index, the building of the index itself is left alone, and afterwards, a separate loop is done on a copy of the loaded feature snapshot in order to rebuild the realpaths hash. Fixes [Bug #17885]	2021-10-02 05:51:29 -09:00
Jeremy Evans	162ad65fdd	Revert "Do not load file with same realpath twice when requiring" This reverts commit `ddb85c5d2b`. This commit causes unexpected warnings in TestTranscode#test_loading_race occasionally in CI.	2021-09-18 17:37:35 -07:00
Jeremy Evans	ddb85c5d2b	Do not load file with same realpath twice when requiring This fixes issues with paths being loaded twice in certain cases when symlinks are used. It took me multiple attempts to get this working. My original attempt tried to convert paths to realpaths before adding them to $LOADED_FEATURES. Unfortunately, this doesn't work well with the loaded feature index, which is based off load paths and not realpaths. While I was able to get require working, I'm fairly sure the loaded feature index was not being used as expected, which would have significant performance implications. Additionally, I was never able to get that approach working with autoload when autoloading a non-realpath file. It also broke some specs. This takes a more conservative approach. Directly before loading the file, if the file with the same realpath has been required, the loading of the file is skipped. The realpaths are stored as fstrings in a hidden hash. When rebuilding the loaded feature index, the hash of realpaths is also rebuilt. I'm guessing this makes the rebuilding process slower, but I don't think that is a hot path. In general, modifying loaded features is only done when reloading, and that tends to be in non-production environments. Change test_require_with_loaded_features_pop test to use 30 threads and 300 iterations, instead of 4 threads and 1000 iterations. I saw only sporadic failures with 4/1000, but consistent failures with 30/300. These failures were due to the fact that the concurrent deletions from $LOADED_FEATURES in other threads can result in rb_ary_entry returning nil when rebuilding the loaded features index. To avoid concurrency issues when rebuilding the loaded features index, the building of the index itself is left alone, and afterwards, a separate loop is done on a copy of the loaded feature snapshot in order to rebuild the realpaths hash. Fixes [Bug #17885]	2021-09-18 07:05:23 -09:00
卜部昌平	dddc618d30	suppress GCC's -Wsuggest-attribute=format I was not aware of this because I use clang these days.	2021-09-10 20:00:06 +09:00
Yusuke Endoh	f336a3eb6c	Use free instead of xfree to free altstack The altstack memory of a thread may be free'ed even after the VM is destructed. After that, GC is no longer available, so calling xfree may lead to a segfault. This changeset uses the bare free function to free the altstack memory instead of xfree. [Bug #18126]	2021-09-06 14:22:24 +09:00
Nobuyoshi Nakada	28d03ee776	Remove root_jmpbuf in rb_thread_struct It has not been used since `1b82c877df`.	2021-08-10 19:08:38 +09:00
Samuel Williams	42130a64f0	Replace copy coroutine with pthread implementation.	2021-07-01 11:23:03 +12:00
eileencodes	b91b3bc771	Add a cache for class variables Redo of 34a2acdac788602c14bf05fb616215187badd504 and 931138b00696419945dc03e10f033b1f53cd50f3 which were reverted. GitHub PR #4340. This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master `9e5105c`) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19] \| \|compare-ruby\|built-ruby\| \|:--------\|-----------:\|---------:\| \|vm_cvar \| 5.681M\| 36.980M\| \| \| -\| 6.51x\| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do \|x\| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. 
This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2021-06-18 10:02:44 -07:00
Nobuyoshi Nakada	e03bf76b31	Pack iseq_inline_constant_cache_entry Reordered iseq_inline_constant_cache_entry members not to exceed the size of RValue.	2021-06-09 19:15:57 +09:00
Takashi Kokubun	c32ce2cbf1	Clarify these are just for MJIT and not for third-party libraries. See: `e6484a1530`	2021-06-02 00:09:47 -07:00
Takashi Kokubun	e1b03b0c2b	Enable VM_ASSERT in --jit CIs (#4543)	2021-06-01 00:15:51 -07:00
NARUSE, Yui	46655156dc	Add Thread#native_thread_id [Feature #17853]	2021-05-26 15:14:11 +09:00
Takashi Kokubun	141861a222	Update a comment about what 'inline' attr means	2021-05-21 23:27:36 -07:00
Aaron Patterson	07f055bb13	Revert "Filling cache values on cvar write" This reverts commit `08de37f9fa`. This reverts commit `e8ae922b62`.	2021-05-11 13:31:00 -07:00
eileencodes	e8ae922b62	Add a cache for class variables This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master `9e5105ca45`) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19] \| \|compare-ruby\|built-ruby\| \|:--------\|-----------:\|---------:\| \|vm_cvar \| 5.681M\| 36.980M\| \| \| -\| 6.51x\| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do \|x\| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. 
Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2021-05-11 12:04:27 -07:00
Benoit Daloze	68d6bd0873	Fix trivial -Wundef warnings * See [Feature #17752] Co-authored-by: xtkoba (Tee KOBAYASHI) <xtkoba+ruby@gmail.com>	2021-05-04 14:56:55 +02:00
xtkoba	121fa24a34	Adjust struct member offset for i386 Cygwin Fixes [Bug #17606]	2021-05-01 11:04:17 +09:00
Aaron Patterson	8359821870	Use rb_fstring for "defined" strings. We can take advantage of fstrings to de-duplicate the defined strings. This means we don't need to keep the list of defined strings on the VM (or register them as mark objects)	2021-03-17 10:55:37 -07:00
Koichi Sasada	1ecda21366	global call-cache cache table for rb_funcall* The rb_funcall* functions (rb_funcall(), rb_funcallv(), ...) invoke a Ruby method with a given receiver. Ruby 2.7 introduced an inline method cache with a static memory area. However, Ruby 3.0 reimplemented the method cache data structures and the inline cache was removed. Without the inline cache, rb_funcall* searched for methods every time. In most cases the per-Class Method Cache (pCMC) helps, but pCMC requires VM-wide locking and it hurts performance on multi-Ractor execution, especially when all Ractors call methods with rb_funcall. This patch introduces the Global Call-Cache Cache Table (gccct) for rb_funcall. Call-Cache was introduced in Ruby 3.0 to manage method cache entries atomically, and gccct enables method caching without VM-wide locking. This table solves the performance issue on multi-Ractor execution. [Bug #17497] Ruby-level method invocation does not use gccct because it has an inline method cache and the table size is limited. Basically rb_funcall* is not used frequently, so 1023 entries can be enough. We will revisit the table size if it is not enough.	2021-01-29 16:22:12 +09:00
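
A fixed-size table like the one described in the commit above can be sketched in Ruby as follows. Only the 1023-entry size comes from the commit message; everything else is an assumption made for illustration: the names `GlobalCallCacheTable`, `Entry`, and `lookup`, the choice of hashing the class and method name to pick a slot, and overwriting a slot on collision. Invalidation on method redefinition is not modelled.

```ruby
# Hypothetical model of a global, fixed-size call cache table.
class GlobalCallCacheTable
  SIZE = 1023 # table size mentioned in the commit message

  Entry = Struct.new(:klass, :mid, :callable)

  attr_reader :misses

  def initialize
    @entries = Array.new(SIZE)
    @misses = 0
  end

  # Return the cached method for (klass, mid), doing the slow method
  # search only when the slot is empty or holds a different pair.
  def lookup(klass, mid)
    index = [klass, mid].hash % SIZE # assumed indexing scheme
    entry = @entries[index]
    return entry.callable if entry && entry.klass.equal?(klass) && entry.mid == mid

    @misses += 1
    callable = klass.instance_method(mid) # stands in for the slow search
    @entries[index] = Entry.new(klass, mid, callable)
    callable
  end
end

table = GlobalCallCacheTable.new
upcase = table.lookup(String, :upcase)
table.lookup(String, :upcase)
p table.misses
p upcase.bind("ruby").call
```

Because the table has a fixed number of slots, a colliding pair simply replaces the previous entry, which keeps memory bounded at the cost of an occasional extra search.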
Koichi Sasada	d968829afa	expose some C-APIs for ractor expose some C-APIs to try to make ractor utilities on external gems. * add * rb_ractor_local_storage_value_lookup() to check availability * expose * rb_ractor_make_shareable() * rb_ractor_make_shareable_copy() * rb_proc_isolate() (not public) * rb_proc_isolate_bang() (not public) * rb_proc_ractor_make_shareable() (not public)	2021-01-06 16:03:09 +09:00
Koichi Sasada	e7fc353f04	enable constant cache on ractors constant cache `IC` is accessed in a non-atomic manner and there are thread-safety issues, so Ruby 3.0 disables the const cache on non-main ractors. This patch enables it by introducing `imemo_constcache` and allocating it on every refill of the const cache, like `imemo_callcache`. [Bug #17510] Now `IC` only has one entry `IC::entry` and it points to `iseq_inline_constant_cache_entry`, managed by a T_IMEMO object. `IC` is an atomic data structure so `rb_mjit_before_vm_ic_update()` and `rb_mjit_after_vm_ic_update()` are not needed.	2021-01-05 02:27:58 +09:00
Koichi Sasada	02d9524cda	separate rb_ractor_pub from rb_ractor_t separate some fields from rb_ractor_t to rb_ractor_pub and put it at the beginning of rb_ractor_t and declare it in vm_core.h so vm_core.h can access rb_ractor_pub fields. Now rb_ec_ractor_hooks() is a complete inline function and there is no MJIT-related issue.	2020-12-22 00:03:00 +09:00
Koichi Sasada	a2950369bd	TracePoint.new(&block) should be ractor-local TracePoint should be ractor-local because the Proc can violate the Ractor-safe.	2020-12-22 00:03:00 +09:00
Koichi Sasada	91e2f08a6a	export rb_eRactorIsolationError for MJIT https://ci.appveyor.com/project/ruby/ruby/builds/36942168/job/7ugrpk0pndoly9wp ``` _ruby_mjit_p11920u0.c C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.c(14) : warning C4005: 'GET_SELF' : macro redefinition c:\projects\ruby\vm_insnhelper.h(111) : see previous definition of 'GET_SELF' Creating library C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.lib and object C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.exp _ruby_mjit_p11920u0.obj : error LNK2001: unresolved external symbol rb_eRactorIsolationError C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.so : fatal error LNK1120: 1 unresolved externals ```	2020-12-21 23:28:05 +09:00
Koichi Sasada	dca6752fec	Introduce Ractor::IsolationError Ractor has several restrictions to keep each ractor being isolated and some operation such as `CONST="foo"` in non-main ractor raises an exception. This kind of operation raises an error but there is confusion (some code raises RuntimeError and some code raises NameError). To make clear we introduce Ractor::IsolationError which is raised when the isolation between ractors is violated.	2020-12-21 22:29:05 +09:00
Koichi Sasada	aa6287cd26	fix inline method cache sync bug `cd` is passed to method invocation functions, but `cd` can be manipulated by other ractors simultaneously, so this is a thread-safety issue. To solve this issue, this patch stores `ci` and the found `cc` in `calling` and stops passing `cd`.	2020-12-15 13:29:30 +09:00
Koichi Sasada	967040ba59	Introduce negative method cache pCMC doesn't have negative method cache so this patch implements it.	2020-12-14 11:57:46 +09:00
Nobuyoshi Nakada	0df67a4695	Signal handler type should be void	2020-12-12 17:02:42 +09:00
Koichi Sasada	bef3eb5440	fix decl of ruby_single_main_ractor On Windows, MJIT doesn't work without this patch because of the declaration of ruby_single_main_ractor. This patch fixes this issue and moves its definition from ractor.c to vm.c, near ruby_current_vm_ptr.	2020-12-07 08:28:36 +09:00
Koichi Sasada	b67b24d0f5	ruby_single_main_ractor for single ractor mode ruby_multi_ractor was a flag that indicates the interpreter doesn't make any additional ractors (single ractor mode). Instead of boolean flag, ruby_single_main_ractor pointer is introduced which keeps main ractor's pointer if single ractor mode. If additional ractors are created, ruby_single_main_ractor becomes NULL.	2020-12-07 08:28:36 +09:00
Koichi Sasada	182fb73c40	rb_ext_ractor_safe() to declare ractor-safe ext C extensions can violate ractor-safety, so only ractor-safe C extensions (C methods) can run on non-main ractors. rb_ext_ractor_safe(true) declares that subsequently defined methods are ractor-safe. Otherwise, defined methods check that they are invoked in the main ractor and raise an error if invoked in non-main ractors. [Feature #17307]	2020-12-01 15:44:18 +09:00
Koichi Sasada	1e8abe5d03	introduce USE_VM_CLOCK for windows. The timer function used on Windows sets the timer interrupt flag of the current main ractor's executing ec so that the thread can detect the end of its time slice. However, to set ec->interrupt_flag for all running ractors, it is required to synchronize with other ractors, and the timer thread cannot acquire the ractor-wide lock because of some limitation. To solve this issue, this patch introduces the USE_VM_CLOCK compile option to introduce rb_vm_t::clock. This clock is incremented by the timer thread, and each thread can detect the increment by comparing it with the previously checked clock. As a result, on the Windows platform this patch introduces some overhead, but I think there is no critical performance issue because of this modification.	2020-11-11 15:49:02 +09:00
Koichi Sasada	5d97bdc2dc	Ractor.make_shareable(a_proc) Ractor.make_shareable() supports a Proc object if (1) the Proc only reads outer local variables (no assignments) and (2) the outer local variables it reads are shareable. Read local variables are stored in a snapshot, so after making a Proc shareable, later assignments have no effect, like this: ```ruby a = 1 pr = Ractor.make_shareable(Proc.new{p a}) pr.call #=> 1 a = 2 pr.call #=> 1 # `a = 2` doesn't affect ``` [Feature #17284]	2020-10-30 03:12:09 +09:00
Koichi Sasada	07c03bc309	check isolated Proc more strictly Isolated Procs are prohibited from accessing outer local variables, but this could be violated via binding and so on, so such accesses should be errors.	2020-10-29 23:42:55 +09:00
Jeremy Evans	dfb3605bbe	Add Thread.ignore_deadlock accessor Setting this to true disables the deadlock detector. It should only be used in cases where the deadlock could be broken via some external means, such as via a signal. Now that $SAFE is no longer used, replace the safe_level_ VM flag with ignore_deadlock for storing the setting. Fixes [Bug #13768]	2020-10-28 15:27:00 -07:00
Koichi Sasada	99310e3eb5	Some global variables can be accessed from ractors Some global variables should be used from non-main Ractors. [Bug #17268] ```ruby # ractor-local (derived from created ractor): debug '$DEBUG' => $DEBUG, '$-d' => $-d, # ractor-local (derived from created ractor): verbose '$VERBOSE' => $VERBOSE, '$-w' => $-w, '$-W' => $-W, '$-v' => $-v, # process-local (readonly): other commandline parameters '$-p' => $-p, '$-l' => $-l, '$-a' => $-a, # process-local (readonly): getpid '$$' => $$, # thread local: process result '$?' => $?, # scope local: match '$~' => $~.inspect, '$&' => $&, '$`' => $`, '$\'' => $', '$+' => $+, '$1' => $1, # scope local: last line '$_' => $_, # scope local: last backtrace '$@' => $@, '$!' => $!, # ractor local: stdin, out, err '$stdin' => $stdin.inspect, '$stdout' => $stdout.inspect, '$stderr' => $stderr.inspect, ```	2020-10-20 15:38:54 +09:00
Koichi Sasada	319afed20f	Use language TLS specifier if it is possible. To access TLS, it is faster to use language TLS specifier instead of using pthread_get/setspecific functions. Original proposal is: Use native thread locals. #3665	2020-10-20 01:05:06 +09:00
Koichi Sasada	f6661f5085	sync RClass::ext::iv_index_tbl iv_index_tbl manages instance variable indexes (ID -> index). This data structure should be synchronized with other ractors, so introduce some VM locks. This patch also introduces an atomic ivar cache used by set/getinlinecache instructions. To make updating the ivar cache (IVC) atomic, we changed the iv_index_tbl data structure to manage (ID -> entry), where an entry holds a serial and an index. IVC points to this entry so that cache updates become atomic.	2020-10-17 08:18:04 +09:00
Koichi Sasada	ae693fff74	fix releasing timing. (1) recorded_lock_rec > current_lock_rec should not occur in rb_ec_vm_lock_rec_release(). (2) the VM lock should be released at EXEC_TAG(), not POP_TAG(). (3) some refactoring.	2020-10-14 16:36:55 +09:00
Koichi Sasada	c3ba3fa8d0	support exception when lock_rec > 0 If a ractor getting a VM lock (monitor) raises an exception, unlock can be skipped. To release VM lock correctly on exception (or other jumps with JUMP_TAG), EC_POP_TAG() releases VM lock.	2020-10-14 14:02:06 +09:00
Samuel Williams	70f08f1eed	Make `Thread#join` non-blocking.	2020-09-21 11:48:44 +12:00
Koichi Sasada	74ddac1c82	relax dependency vm_sync.h does not need to include vm_core.h and ractor_pub.h.	2020-09-15 00:04:59 +09:00
Koichi Sasada	f7ccb8dd88	restart Ractor.select on interrupt A signal can interrupt Ractor.select, but if there is no exception, Ractor.select should restart automatically.	2020-09-15 00:04:59 +09:00
Koichi Sasada	79df14c04b	Introduce Ractor mechanism for parallel execution This commit introduces Ractor mechanism to run Ruby program in parallel. See doc/ractor.md for more details about Ractor. See ticket [Feature #17100] to see the implementation details and discussions. [Feature #17100] This commit does not complete the implementation. You can find many bugs on using Ractor. Also the specification will be changed so that this feature is experimental. You will see a warning when you make the first Ractor with `Ractor.new`. I hope this feature can help programmers from thread-safety issues.	2020-09-03 21:11:06 +09:00
Jeremy Evans	a0273d67d0	Avoid a use after free in VM assertion If the thread for the current EC has been killed, don't check the VM ptr for the EC (which gets it via the thread), as that will have already been freed. Fixes [Bug #16907]	2020-08-21 14:52:30 -07:00
Alan Wu	1d8b689b9e	Remove unused field in rb_iseq_constant_body This was introduced in `191ce5344e` and has been unused since `beae6cbf0f`	2020-07-23 01:17:59 -04:00
卜部昌平	215c6fa3d0	RUBY_CONST_ASSERT: use STATIC_ASSERT instead Static assertions shall be done using STATIC_ASSERT these days.	2020-07-10 12:23:41 +09:00
Takashi Kokubun	7561db8c00	Introduce Primitive.attr! to annotate 'inline' (#3242) [Feature #15589]	2020-06-20 17:13:03 -07:00
Samuel Williams	0e3b0fcdba	Thread scheduler for light weight concurrency.	2020-05-14 22:10:55 +12:00
卜部昌平	233c2018f1	drop varargs.h support This header file is simply out of date (for decades since at least 1989). It's the 21st century. Just stop using it.	2020-05-11 14:56:51 +09:00
卜部昌平	9e41a75255	sed -i 's\|ruby/impl\|ruby/internal\|' To fix build failures.	2020-05-11 09:24:08 +09:00
卜部昌平	d7f4d732c1	sed -i s\|ruby/3\|ruby/impl\|g This shall fix compile errors.	2020-05-11 09:24:08 +09:00
卜部昌平	4ff3f20540	add #include guard hack According to the MSVC manual (1), cl.exe can skip including a header file when it: - contains #pragma once, or - starts with #ifndef, or - starts with #if ! defined. GCC has a similar trick (2), but it is stricter (e.g. there must be _no tokens_ outside of #ifndef...#endif). Sun C lacked #pragma once for a long time. Oracle Developer Studio 12.5 finally implemented it, but we cannot assume such a recent version. This changeset modifies header files so that each of them includes exactly one #ifndef...#endif. I believe this is the most portable way to trigger compiler optimizations. [Bug #16770] 1: https://docs.microsoft.com/en-us/cpp/preprocessor/once 2: https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html	2020-04-13 16:06:00 +09:00
卜部昌平	9e6e39c351	Merge pull request #2991 from shyouhei/ruby.h Split ruby.h	2020-04-08 13:28:13 +09:00
Nobuyoshi Nakada	b6833ff50d	Get rid of redefinition of `rb_execution_context_t` Regardless of the order to include "vm_core.h" and "builtin.h".	2020-03-19 13:25:53 +09:00
Yusuke Endoh	0256e4f0f5	thread_pthread.c: allocate sigaltstack before pthread_create A new (not-initialized-yet) pthread attempts to allocate sigaltstack by using xmalloc. It may cause GC, but because the thread is not initialized yet, ruby_native_thread_p() returns false, which leads to "[FATAL] failed to allocate memory" and exit. In fact, we can observe the error message in the log of OpenBSD CI: https://rubyci.org/logs/rubyci.s3.amazonaws.com/openbsd-current/ruby-master/log/20200306T083005Z.log.html.gz This changeset allocates sigaltstack before pthread is created.	2020-03-06 21:41:34 +09:00
Koichi Sasada	b9007b6c54	Introduce disposable call-cache. This patch contains several ideas: (1) Disposable inline method cache (IMC) for race-free inline method cache * Making call-cache (CC) as a RVALUE (GC target object) and allocate new CC on cache miss. * This technique allows race-free access from parallel processing elements like RCU. (2) Introduce per-Class method cache (pCMC) * Instead of fixed-size global method cache (GMC), pCMC allows flexible cache size. * Caching CCs reduces CC allocation and allow sharing CC's fast-path between same call-info (CI) call-sites. (3) Invalidate an inline method cache by invalidating corresponding method entries (MEs) * Instead of using class serials, we set "invalidated" flag for method entry itself to represent cache invalidation. * Compare with using class serials, the impact of method modification (add/overwrite/delete) is small. * Updating class serials invalidate all method caches of the class and sub-classes. * Proposed approach only invalidate the method cache of only one ME. See [Feature #16614] for more details.	2020-02-22 09:58:59 +09:00


826 commits