github/ruby - ruby

Commit graph

Author	SHA1	Message	Date
Peter Zhu	51bd816517	[Feature #20470] Split GC into gc_impl.c This commit splits gc.c into two files: - gc.c now only contains code not specific to Ruby GC. This includes code to mark objects (which the GC implementation may choose not to use) and wrappers for internal APIs that the implementation may need to use (e.g. locking the VM). - gc_impl.c now contains the implementation of Ruby's GC. This includes marking, sweeping, compaction, and statistics. Most importantly, gc_impl.c only uses public APIs in Ruby and a limited set of functions exposed in gc.c. This allows us to build gc_impl.c independently of Ruby and plug Ruby's GC into itself.	2024-07-03 09:03:40 -04:00
Aaron Patterson	cdf33ed5f3	Optimized forwarding callers and callees

This patch optimizes forwarding callers and callees. It only optimizes methods that only take `...` as their parameter, and then pass `...` to other calls.

Calls it optimizes look like this:

```ruby
def bar(a) = a
def foo(...) = bar(...) # optimized
foo(123)
```

```ruby
def bar(*a) = a
def foo(...) = bar(1, 2, ...) # optimized
foo(123)
```

```ruby
def bar(*a) = a
def foo(...)
  list = [1, 2]
  bar(list, ...) # optimized
end
foo(123)
```

All variants of the above but using `super` are also optimized, including a bare super like this:

```ruby
def foo(...)
  super
end
```

This patch eliminates intermediate allocations made when calling methods that accept `...`. We can observe allocation elimination like this:

```ruby
def m
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

def bar(a) = a
def foo(...) = bar(...)

def test
  m { foo(123) }
end

test
p test # allocates 1 object on master, but 0 objects with this patch
```

```ruby
def bar(a, b:) = a + b
def foo(...) = bar(...)

def test
  m { foo(1, b: 2) }
end

test
p test # allocates 2 objects on master, but 0 objects with this patch
```

How does it work?
-----------------

This patch works by using a dynamic stack size when passing forwarded parameters to callees. The caller's info object (known as the "CI") contains the stack size of the parameters, so we pass the CI object itself as a parameter to the callee. When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee. The CI at the forwarded call site is adjusted using information from the caller's CI.

I think this description is kind of confusing, so let's walk through an example with code.

```ruby
def delegatee(a, b) = a + b

def delegator(...)
  delegatee(...)  # CI2 (FORWARDING)
end

def caller
  delegator(1, 2) # CI1 (argc: 2)
end
```

Before we call the delegator method, the stack looks like this:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   |
              5|   delegatee(...)  # CI2 (FORWARDING)  |
              6| end                                   |
              7|                                       |
              8| def caller                            |
          ->  9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls in to `delegator`, it writes `CI1` on to the stack as a local variable for the `delegator` method. The `delegator` method has a special local called `...` that holds the caller's CI object.

Here is the ISeq disasm for `delegator`:

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                ( 1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

The local called `...` will contain the caller's CI: CI1.

Here is the stack when we enter `delegator`:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
          ->  4|   #                                   | CI1 (argc: 2)
              5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            |
              9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to memcopy the caller's stack before calling `delegatee`. In this case, it will memcopy self, 1, and 2 to the stack before calling `delegatee`. It knows how much memory to copy from the caller because `CI1` contains stack size information (argc: 2).

Before executing the `send` instruction, we push `...` on the stack. The `send` instruction pops `...`, and because it is tagged with `FORWARDING`, it knows to memcopy (using the information in the CI it just popped):

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                ( 1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

Instruction 0001 puts the caller's CI on the stack. `send` is tagged with FORWARDING, so it reads the CI and _copies_ the caller's stack to this stack:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   | CI1 (argc: 2)
          ->  5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            | self
              9|   delegator(1, 2) # CI1 (argc: 2)     | 1
             10| end                                   | 2
```

The "FORWARDING" call site combines information from CI1 with CI2 in order to support passing other values in addition to the `...` value, as well as perfectly forward splat args, kwargs, etc.

Since we're able to copy the stack from `caller` in to `delegator`'s stack, we can avoid allocating objects.

I want to do this to eliminate object allocations for delegate methods. My long term goal is to implement `Class#new` in Ruby and it uses `...`.

I was able to implement `Class#new` in Ruby [here](https://github.com/ruby/ruby/pull/9289). If we adopt the technique in this patch, then we can optimize allocating objects that take keyword parameters for `initialize`.

For example, this code will allocate 2 objects: one for `SomeObject`, and one for the kwargs:

```ruby
SomeObject.new(foo: 1)
```

If we combine this technique, plus implement `Class#new` in Ruby, then we can reduce allocations for this common operation.

Co-Authored-By: John Hawthorn <john@hawthorn.email>
Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>	2024-06-18 09:28:25 -07:00
Maxime Chevalier-Boisvert	425e630ce7	YJIT: implement variable-length context encoding scheme (#10888) * Implement BitVector data structure for variable-length context encoding * Rename method to make intent clearer * Rename write_uint => push_uint to make intent clearer * Implement debug trait for BitVector * Fix bug in BitVector::read_uint_at(), enable more tests * Add one more test for good measure * Start sketching Context::encode() * Progress on variable length context encoding * Add tests. Fix bug. * Encode stack state * Add comments. Try to estimate context encoding size. * More compact encoding for stack size * Commit before rebase * Change Context::encode() to take a BitVector as input * Refactor BitVector::read_uint(), add helper read functions * Implement Context::decode() function. Add test. * Fix bug, add tests * Rename methods * Add Context::encode() and decode() methods using global data * Make encode and decode methods use u32 indices * Refactor YJIT to use variable-length context encoding * Tag functions as allow unused * Add a simple caching mechanism and stats for bytes per context etc * Add comments, fix formatting * Grow vector of bytes by 1.2x instead of 2x * Add debug assert to check round-trip encoding-decoding * Take some rustfmt formatting * Add decoded_from field to Context to reuse previous encodings * Remove old context stats * Re-add stack_size assert * Disable decoded_from optimization for now	2024-06-07 16:26:14 -04:00
Alan Wu	83c03cc73a	YJIT: Stop asserting rb_objspace_markable_object_p() Because of the way things are sequenced, it doesn't work properly during auto-compaction.	2024-04-26 18:03:26 -07:00
Alan Wu	73eeb8643b	YJIT: Fix reference update for `Invariants::no_ep_escape_iseqs` Previously, the update was done in the ISEQ callback. That effectively never updated anything because the callback itself is given an intact reference, so it could update its content, and `rb_gc_location(iseq)` never returned a new address. Update the whole table once in the YJIT root instead.	2024-04-26 18:03:26 -07:00
Takashi Kokubun	7ab1a608e7	YJIT: Optimize local variables when EP == BP (take 2) (#10607) * Revert "Revert "YJIT: Optimize local variables when EP == BP" (#10584)" This reverts commit `c878344195`. * YJIT: Take care of GC references in ISEQ invariants Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> --------- Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>	2024-04-25 10:04:53 -04:00
Alan Wu	c878344195	Revert "YJIT: Optimize local variables when EP == BP" (#10584) This reverts commit `4cc58ea0b8`. Since the change landed, call-threshold=1 CI runs have been timing out. There have also been `verify-ctx` violations. Revert for now while we debug.	2024-04-19 16:47:25 +00:00
Takashi Kokubun	4cc58ea0b8	YJIT: Optimize local variables when EP == BP (#10487)	2024-04-17 15:00:03 -04:00
Jean Boussier	d4f3dcf4df	Refactor VM root modules This `st_table` is used to both mark and pin classes defined from the C API. But `vm->mark_object_ary` already does both much more efficiently. Currently a Ruby process starts with 252 rooted classes, which uses `7224B` in an `st_table` or `2016B` in an `RArray`. So a baseline of 5kB saved, but since `mark_object_ary` is preallocated with `1024` slots but only uses `405` of them, it's a net `7kB` save. `vm->mark_object_ary` is also being refactored. Prior to this change, `mark_object_ary` was a regular `RArray`, but since this allows for references to be moved, it was marked a second time from `rb_vm_mark()` to pin these objects. This has the detrimental effect of marking these references on every minor GC even though it's a mostly append-only list. But using a custom TypedData we can save from having to mark all the references on minor GC runs. Additionally, immediate values are now ignored and not appended to `vm->mark_object_ary` as it's just wasted space.	2024-03-06 15:33:43 -05:00
Takashi Kokubun	661f9e6d03	YJIT: Support opt_invokebuiltin_delegate for leaf builtin (#10152)	2024-03-01 13:03:00 -05:00
Alan Wu	558b58d330	YJIT: Reject keywords hash in -1 arity cfunc splat support `test_keyword.rb` caught this issue. Just need to run with `threshold=1`	2024-02-28 15:00:26 -05:00
Alan Wu	11f121364a	YJIT: Support splat with C methods with -1 arity Usually we deal with splats by speculating that they're of a specific size. In this case, the C method takes a pointer and a length, so we can support changing sizes just fine.	2024-02-27 17:50:38 +00:00
Alan Wu	da7b9478d3	YJIT: Pass nil to anonymous kwrest when empty (#9972) This is the same optimization as `e4272fd29` ("Avoid allocation when passing no keywords to anonymous kwrest methods") but for YJIT. For anonymous kwrest parameters, nil is just as good as an empty hash. On the usage side, update `splatkw` to handle `nil` with a leaner path.	2024-02-15 11:59:37 -05:00
Takashi Kokubun	e7b0a01002	YJIT: Add top ISEQ call counts to --yjit-stats (#9906)	2024-02-09 22:12:24 +00:00
Alan Wu	ac1e9e443a	YJIT: Fix ruby2_keywords splat+rest and drop bogus checks YJIT didn't guard for ruby2_keywords hash in case of splat calls that land in methods with a rest parameter, creating incorrect results. The compile-time checks didn't correspond to any actual effects of ruby2_keywords, so it was masking this bug and YJIT was needlessly refusing to compile some code. About 16% of fallback reasons in `lobsters` was due to the ISeq check. We already handle the tagging part with exit_if_supplying_kw_and_has_no_kw() and should now have a dynamic guard for all splat cases. Note for backporting: You also need `7f51959ff1`. [Bug #20195]	2024-01-23 19:22:57 -05:00
Alan Wu	015b0e2e1d	YJIT: Fix unused warnings

```
warning: unused import: `condition::Condition`
  --> src/asm/arm64/arg/mod.rs:13:9
   |
13 | pub use condition::Condition;
   |         ^^^^^^^^^^^^^^^^^^^^
   |
   = note: `#[warn(unused_imports)]` on by default

warning: unused import: `rb_yjit_fix_mul_fix as rb_fix_mul_fix`
   --> src/cruby.rs:188:9
    |
188 | pub use rb_yjit_fix_mul_fix as rb_fix_mul_fix;
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: unused import: `rb_insn_len as raw_insn_len`
   --> src/cruby.rs:142:9
    |
142 | pub use rb_insn_len as raw_insn_len;
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: `#[warn(unused_imports)]` on by default
```

Make asm public so it stops warning about unused public stuff in there.	2024-01-10 13:19:15 -05:00
Takashi Kokubun	bd91c5127f	YJIT: Add stats option to RubyVM::YJIT.enable (#9297)	2023-12-19 11:47:27 -08:00
Takashi Kokubun	6beb09c2c9	YJIT: Add RubyVM::YJIT.enable (#8705)	2023-10-19 10:54:35 -07:00
Alan Wu	07a7c4bdaf	YJIT: Remove duplicate cfp->iseq accessor	2023-10-05 16:40:27 -04:00
ywenc	3dff315ed3	YJIT: Quiet mode when running with `--yjit-stats` (#8251) Quiet mode for running with --yjit-stats	2023-08-18 18:27:59 -04:00
Takashi Kokubun	cd8d20cd1f	YJIT: Compile exception handlers (#8171) Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-08-08 16:06:22 -07:00
Takashi Kokubun	d405410e3c	YJIT: Move ROBJECT_OFFSET_* to yjit.c (#8157)	2023-08-02 10:15:29 -04:00
Takashi Kokubun	cef60e93e6	YJIT: Fallback send instructions to vm_sendish (#8106)	2023-07-24 13:51:46 -07:00
Maxime Chevalier-Boisvert	d70484f0eb	YJIT: refactoring to allow for fancier call threshold logic (#8078) * YJIT: refactoring to allow for fancier call threshold logic * Avoid potentially compiling functions multiple times. * Update vm.c Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> --------- Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>	2023-07-17 10:41:18 -04:00
Peter Zhu	7577c101ed	Unify length field for embedded and heap strings (#7908) * Unify length field for embedded and heap strings The length field is of the same type and position in RString for both embedded and heap allocated strings, so we can unify it. * Remove RSTRING_EMBED_LEN	2023-06-06 10:19:20 -04:00
Nobuyoshi Nakada	8d242a33af	`rb_bug` prints a newline after the message	2023-05-20 21:43:30 +09:00
John Hawthorn	2dff1d4fda	YJIT: Fix raw sample stack lengths in exit traces (#7728) yjit-trace-exits appends a synthetic sample for the instruction being exited, but we didn't increment the size of the stack. Fixing this count correctly lets us successfully generate a flamegraph from the exits. I also replaced the line number for instructions with 0, as I don't think the previous value had meaning. Co-authored-by: Adam Hess <HParker@github.com>	2023-04-18 10:09:16 -04:00
Matt Valentine-House	d91a82850a	Pull the shape tree out of the vm object	2023-04-06 11:07:16 +01:00
Takashi Kokubun	1587494b0b	YJIT: Add codegen for Integer methods (#7665) * YJIT: Add codegen for Integer methods * YJIT: Update dependencies * YJIT: Fix Integer#[] for argc=2	2023-04-05 13:19:31 -07:00
Takashi Kokubun	1b475fcd10	Remove an unneeded function copy	2023-04-01 23:09:05 -07:00
Maxime Chevalier-Boisvert	39a34694a0	YJIT: Add `--yjit-pause` and `RubyVM::YJIT.resume` (#7609) * YJIT: Add --yjit-pause and RubyVM::YJIT.resume This allows booting YJIT in a suspended state. We chose to add a new command line option as opposed to simply allowing YJIT.resume to work without any command line option because it allows for combining with YJIT tuning command line options. It also simplifies implementation. Paired with Kokubun and Maxime. * Update yjit.rb Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> --------- Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>	2023-03-28 15:21:19 -04:00
Alan Wu	35e9b5348d	YJIT: Constify EC to avoid an `as` pointer cast (#7591)	2023-03-24 12:36:06 -04:00
Takashi Kokubun	32e0c97dfa	RJIT: Optimize String#bytesize	2023-03-18 23:35:42 -07:00
Takashi Kokubun	9fd94d6a0c	YJIT: Support entry for multiple PCs per ISEQ (GH-7535)	2023-03-17 11:53:17 -07:00
Takashi Kokubun	9947574b9c	Refactor jit_func_t and jit_exec I closed https://github.com/ruby/ruby/pull/7543, but part of the diff seems useful regardless, so I extracted it.	2023-03-16 10:42:17 -07:00
Alan Wu	de174681f7	YJIT: Assert that we have the VM lock while marking Somewhat important because having the lock is a key part of the soundness reasoning for the `unsafe` usage here.	2023-03-15 15:45:20 -04:00
Takashi Kokubun	70ba310212	YJIT: Introduce no_gc attribute (#7511)	2023-03-14 15:38:58 -07:00
Jimmy Miller	45127c84d9	YJIT: Handle rest+splat where non-splat < required (#7499)	2023-03-13 11:12:23 -04:00
Takashi Kokubun	94da5f7c36	Rename builtin attr :inline to :leaf	2023-03-11 14:25:12 -08:00
Takashi Kokubun	0c0c88d383	Support multiple attributes with Primitive.attr!	2023-03-11 14:19:46 -08:00
Jimmy Miller	56df6d5f9d	YJIT: Handle splat+rest for args pass greater than required (#7468)

For example:

```ruby
def my_func(x, y, *rest)
  p [x, y, rest]
end

my_func(1, 2, 3, *[4, 5])
```
	2023-03-07 17:03:43 -05:00
Nobuyoshi Nakada	ef00c6da88	Adjust `else` style to be consistent in each file [ci skip]	2023-02-26 13:20:43 +09:00
Peter Zhu	3e09822407	Fix incorrect line numbers in GC hook

If the previous instruction is not a leaf instruction, then the PC was incremented before the instruction was run (meaning the currently executing instruction is actually the previous instruction), so we should not increment the PC; otherwise we will calculate the source line for the next instruction.

This bug can be reproduced in the following script:

```
require "objspace"

ObjectSpace.trace_object_allocations_start
a =





  1.0 / 0.0
p [ObjectSpace.allocation_sourceline(a), ObjectSpace.allocation_sourcefile(a)]
```

Which outputs: [4, "test.rb"]

This is incorrect because the object was allocated on line 10 and not line 4. The behaviour is correct when we use a leaf instruction (e.g. if we replaced `1.0 / 0.0` with `"hello"`), then the output is: [10, "test.rb"].

[Bug #19456]	2023-02-24 14:10:09 -05:00
Takashi Kokubun	21f9c92c71	YJIT: Show Context stats on exit (#7327)	2023-02-16 11:32:13 -08:00
Takashi Kokubun	15ef2b2d7c	YJIT: Optimize != for Integers and Strings (#7301)	2023-02-14 16:31:33 -05:00
Matt Valentine-House	72aba64fff	Merge gc.h and internal/gc.h [Feature #19425]	2023-02-09 10:32:29 -05:00
Jimmy Miller	1148fab7ae	YJIT: Handle splat with opt more fully (#7209) * YJIT: Handle splat with opt more fully * Update yjit/src/codegen.rs --------- Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2023-01-31 16:18:56 -05:00
Maxime Chevalier-Boisvert	6bb576fe75	YJIT: implement codegen for `String#empty?` (#7148) YJIT: implement codegen for String#empty?	2023-01-18 15:41:28 -05:00
Takashi Kokubun	c80edc9f98	YJIT: Add object shape count to stats (#6754)	2022-11-17 12:59:59 -08:00
Jimmy Miller	1a65ab20cb	Implement optimize call (#6691) This dispatches to a c func for doing the dynamic lookup. I experimented with chain on the proc but wasn't able to detect which call sites would be monomorphic vs polymorphic. There is definitely room for optimization here, but it does reduce exits.	2022-11-08 15:28:28 -05:00
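
Commit 425e630ce7 in the table above mentions a `BitVector` with `push_uint` and `read_uint_at` as the basis of YJIT's variable-length context encoding. The block below is a minimal Ruby sketch of that general idea: fields of different bit widths packed back to back into a byte array. It is not YJIT's Rust code. The class layout, the least-significant-bit-first order, and the error handling are all assumptions made for illustration; only the three names are taken from the commit message.

```ruby
# Hypothetical illustration of a bit vector for variable-length encoding.
# This is NOT YJIT's implementation; bit order and API shape are assumed.
class BitVector
  attr_reader :num_bits

  def initialize
    @bytes = []    # backing store, one Integer in 0..255 per byte
    @num_bits = 0  # number of bits written so far
  end

  # Append the low `width` bits of `value`, least significant bit first.
  def push_uint(value, width)
    raise ArgumentError, "value does not fit in #{width} bits" if value >> width != 0

    width.times do |i|
      byte_idx, bit_idx = @num_bits.divmod(8)
      @bytes << 0 if byte_idx == @bytes.size # grow by one byte on demand
      @bytes[byte_idx] |= ((value >> i) & 1) << bit_idx
      @num_bits += 1
    end
  end

  # Read back `width` bits starting at bit offset `pos`.
  def read_uint_at(pos, width)
    raise IndexError, "read past end of bit vector" if pos + width > @num_bits

    (0...width).reduce(0) do |acc, i|
      byte_idx, bit_idx = (pos + i).divmod(8)
      acc | (((@bytes[byte_idx] >> bit_idx) & 1) << i)
    end
  end

  def num_bytes
    @bytes.size
  end
end

# Three fields of 3, 1, and 9 bits occupy 13 bits, so 2 bytes rather than 3.
bits = BitVector.new
bits.push_uint(5, 3)
bits.push_uint(1, 1)
bits.push_uint(300, 9)
p [bits.read_uint_at(0, 3), bits.read_uint_at(3, 1), bits.read_uint_at(4, 9), bits.num_bytes]
```

The reader has to know each field's width and offset in advance, which is why an encoder of this kind is normally paired with a decoder that walks the fields in the same order.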


93 commits
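
Commits 6beb09c2c9 and bd91c5127f in the list above add `RubyVM::YJIT.enable` and a stats option for it. The sketch below shows one defensive way to call it from application code. The helper name `enable_yjit_if_available` is hypothetical, and the `stats: true` keyword form is an assumption based on the commit title, so check the documentation for the Ruby version in use before relying on it.

```ruby
# Hypothetical helper: turn YJIT on at run time when the interpreter supports it.
# Returns true if YJIT is enabled afterwards, false otherwise.
def enable_yjit_if_available(stats: false)
  # Builds without YJIT (or other Ruby implementations) do not define the module.
  return false unless defined?(RubyVM::YJIT)
  return true if RubyVM::YJIT.enabled?
  # Older Rubies have the module but no run-time `enable`.
  return false unless RubyVM::YJIT.respond_to?(:enable)

  if stats
    RubyVM::YJIT.enable(stats: true) # assumed keyword, per commit bd91c5127f
  else
    RubyVM::YJIT.enable
  end
  RubyVM::YJIT.enabled?
end

puts "YJIT enabled: #{enable_yjit_if_available}"
```

Guarding on `defined?` and `respond_to?` keeps the same script usable on interpreters that predate these commits, at the cost of silently running without the JIT there.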