Граф коммитов

1496 Коммитов

Автор SHA1 Сообщение Дата
Jean Boussier 7c12169230 Eliminate internal uses of `Data_Wrap_Struct`
Ref: https://github.com/ruby/ruby/pull/10872

These should be the last internal uses of the old `Data` API
inside Ruby itself. Some use remain in a couple default gems.
2024-06-02 13:59:11 +02:00
Nobuyoshi Nakada cedc7737b6 Make interchangeable NODE types aliases 2024-06-02 09:43:33 +09:00
Kevin Newton fd95ba255a Make ensure first lineno the first line of the ensure
Previously, ensure ISEQs took their first line number from the
line number coming from the AST. However, if this is coming from
an empty `begin`..`end` inside of a method, this can be all of the
way back to the method declaration. Instead, this commit changes
it to be the first line number of the ensure block itself.

The first_lineno field is only accessible through manual ISEQ
compilation or through tracepoint. Either way, this will be more
accurate for targeting going forward.
2024-05-28 13:12:21 -04:00
Jean Boussier 9e9f1d9301 Precompute embedded string literals hash code
With embedded strings we often have some space left in the slot, which
we can use to store the string Hash code.

It's probably only worth it for string literals, as they are the ones
likely to be used as hash keys.

We chose to store the Hash code right after the string terminator as to
make it easy/fast to compute, and not require one more union in RString.

```
compare-ruby: ruby 3.4.0dev (2024-04-22T06:32:21Z main f77618c1fa) [arm64-darwin23]
built-ruby: ruby 3.4.0dev (2024-04-22T10:13:03Z interned-string-ha.. 8a1a32331b) [arm64-darwin23]
last_commit=Precompute embedded string literals hash code

|            |compare-ruby|built-ruby|
|:-----------|-----------:|---------:|
|symbol      |     39.275M|   39.753M|
|            |           -|     1.01x|
|dyn_symbol  |     37.348M|   37.704M|
|            |           -|     1.01x|
|small_lit   |     29.514M|   33.948M|
|            |           -|     1.15x|
|frozen_lit  |     27.180M|   33.056M|
|            |           -|     1.22x|
|iseq_lit    |     27.391M|   32.242M|
|            |           -|     1.18x|
```

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
2024-05-28 07:32:41 +02:00
Nobuyoshi Nakada f4b475993e
Apply optimizations for `putstring` to `putchilledstring` as well 2024-05-27 12:41:38 +09:00
Nobuyoshi Nakada 49fcd33e13 Introduce a specialize instruction for Array#pack
Instructions for this code:

```ruby
  # frozen_string_literal: true

[a].pack("C")
```

Before this commit:

```
== disasm: #<ISeq:<main>@test.rb:1 (1,0)-(3,13)>
0000 putself                                                          (   3)[Li]
0001 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0003 newarray                               1
0005 putobject                              "C"
0007 opt_send_without_block                 <calldata!mid:pack, argc:1, ARGS_SIMPLE>
0009 leave
```

After this commit:

```
== disasm: #<ISeq:<main>@test.rb:1 (1,0)-(3,13)>
0000 putself                                                          (   3)[Li]
0001 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0003 putobject                              "C"
0005 opt_newarray_send                      2, :pack
0008 leave
```

Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2024-05-23 12:11:50 -07:00
Nobuyoshi Nakada 2dd46bb82f
[Bug #20468] Fix safe navigation in `for` variable 2024-05-16 16:22:17 +09:00
yui-knk 899d9f79dd Rename `vast` to `ast_value`
There is an English word "vast".
This commit changes the name to be more clear name to avoid confusion.
2024-05-03 12:40:35 +09:00
HASUMI Hitoshi 55a402bb75 Add line_count field to rb_ast_body_t
This patch adds `int line_count` field to `rb_ast_body_t` structure.
Instead, we no longer cast `script_lines` to Fixnum.

## Background

Ref https://github.com/ruby/ruby/pull/10618

In the PR above, we have decoupled IMEMO from `rb_ast_t`.
This means we could lift the five-words-restriction of the structure
that forced us to unionize `rb_ast_t *` and `FIXNUM` in one field.

## Relating refactor

- Remove the second parameter of `rb_ruby_ast_new()` function

## Attention

I will remove a code that assigns -1 to line_count, in `rb_binding_add_dynavars()`
of vm.c, because I don't think it is necessary.
But I will make another PR for this so that we can atomically revert
in case I was wrong (See the comment on the code)
2024-04-27 12:08:26 +09:00
Kevin Newton af800bef21 Remove dependency on NODE from coverage structure 2024-04-26 12:25:45 -04:00
HASUMI Hitoshi 2244c58b00 [Universal parser] Decouple IMEMO from rb_ast_t
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object.

## Background

We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby.
To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.

## Summary (file by file)

- `rubyparser.h`
  - Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
  - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` `in ast_alloc()` so that GC can manage it
    - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
  - rb_ruby_ast_new() which internally `calls ast_alloc()` is to create VALUE vast outside ruby_parser.c
- `iseq.c` and `vm_core.h`
  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
  - This keeps the VALUE of AST on the machine stack to prevent being removed by GC
- `ast.c`
  - Almost all change is replacement `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
  - Fix `node_memsize()`
    - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
  - Follow-up due to the above changes
- `imemo.{c|h}`
  - If an object with `imemo_ast` appears, considers it a bug

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-04-26 11:21:08 +09:00
Zack Deveau 9555a997ac ensure ibf_load_setup is only passed String params
In cases where RubyVM::InstructionSequence.load_from_binary() is
passed a param other than a String, we attempt to call the
RSTRING_LENINT macro on it which can cause a segfault.

ex:
```
var_0 = 0
RubyVM::InstructionSequence.load_from_binary(var_0)
```

This commit adds a type check to raise unless we are provided
a String.
2024-04-20 10:41:01 +09:00
Koichi Sasada 662ce928a7 `RUBY_TRY_UNUSED_BLOCK_WARNING_STRICT`
`RUBY_TRY_UNUSED_BLOCK_WARNING_STRICT=1 ruby ...` will enable
strict check for unused block warning.

This option is only for trial to compare the results so the
envname is not considered well.
Should be removed before Ruby 3.4.0 release.
2024-04-19 14:28:54 +09:00
Koichi Sasada e9d7478ded relax unused block warning for duck typing
if a method `foo` uses a block, other (unrelated) method `foo`
can receives a block. So try to relax the unused block warning
condition.

```ruby
      class C0
        def f = yield
      end

      class C1 < C0
        def f = nil
      end

      [C0, C1].f{ block } # do not warn
```
2024-04-17 20:26:49 +09:00
Koichi Sasada f9f3018001 `ISeq#to_a` respects `use_block` status
```ruby
b = RubyVM::InstructionSequence.compile('def f = yield; def g = nil').to_a
pp b

 #=>
 ...
 {:use_block=>true},
 ...
```
2024-04-17 17:03:46 +09:00
Jean Boussier f06670c5a2 Eliminate usage of OBJ_FREEZE_RAW
Previously it would bypass the `FL_ABLE` check, but
since shapes introduction, it started having a different
behavior than `OBJ_FREEZE`, as it would onyl set the `FL_FREEZE`
flag, but not update the shape.

I have no indication of this causing a bug yet, but it seems
like a trap waiting to happen.
2024-04-16 17:20:35 +02:00
HASUMI Hitoshi 9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)->debug_lines`
  - `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-15 20:51:54 +09:00
Koichi Sasada 9a57b04703 `super{}` doesn't use block
`super(){}`, `super{}` and `super(&b)` doesn't use the given
block so warn unused block warning when calling a method which
doesn't use block with above `super` expressions.

e.g.: `def f = super{B1}` (warn on `f{B2}` because `B2` is not used.
2024-04-15 17:56:49 +09:00
Koichi Sasada 145cced9bc fix incorrect warning.
`super()` (not zsuper) passes the passed block and
it can be used.

```ruby
class C0
  def foo; yield; end
end

class C1 < C0
  def foo; super(); end
end

C1.new.foo{p :block} #=> :block
```
2024-04-15 14:53:41 +09:00
Koichi Sasada 9180e33ca3 show warning for unused block
With verbopse mode (-w), the interpreter shows a warning if
a block is passed to a method which does not use the given block.

Warning on:

* the invoked method is written in C
* the invoked method is not `initialize`
* not invoked with `super`
* the first time on the call-site with the invoked method
  (`obj.foo{}` will be warned once if `foo` is same method)

[Feature #15554]

`Primitive.attr! :use_block` is introduced to declare that primitive
functions (written in C) will use passed block.

For minitest, test needs some tweak, so use
ea9caafc07
for `test-bundled-gems`.
2024-04-15 12:08:07 +09:00
Jean Boussier 1b830740ba compile.c: use rb_enc_interned_str to reduce allocations
The `rb_fstring(rb_enc_str_new())` pattern is inneficient because:

- It passes a mutable string to `rb_fstring` so if it has to be interned
  it will first be duped.
- It an equivalent interned string already exists, we allocated the string
  for nothing.

With `rb_enc_interned_str` we either directly get the pre-existing string
with 0 allocations, or efficiently directly intern the one we create
without first duping it.
2024-04-11 09:04:31 +02:00
Jeremy Evans ad90fdd24c Remove compiler code to handle blocks in attrasgn
Passing blocks is no longer allowed in attrasgn.  This is similar
to 3a674c9c65, but for attrasgn instead
of op_asgn.
2024-04-06 10:33:16 -07:00
yui-knk f022a700bf Remove imemo type check for NODE
In the past, `rb_iseq_compile_node` received `NODE *`
and `struct vm_ifunc *` as `node`. But after e743a35,
the function only receives `NODE *`.
This commit removes imemo type check to reduce the dependence
on `VALUE flags` of `struct RNode`.
2024-04-06 18:20:31 +09:00
Jeremy Evans 3a674c9c65 Remove compiler code to handle keywords and blocks in operator assignment syntax
Code such as:

```ruby
foo[0, &bar] = baz
foo[0, bar: 1] = baz
foo[0, **bar] = baz
```

Is now a syntax error, so all of the removed code is now dead.
2024-04-04 17:13:40 -07:00
HASUMI Hitoshi 8aa8fce320 Fix return-type warning in compile.c
This patch surppresses the warning below:

```console
compile.c:10314:1: warning: control reaches end of non-void function [-Wreturn-type]
10314 | }
      | ^
```
2024-04-04 13:38:26 +09:00
yui-knk f057741c5d NODE_LIT is not used anymore 2024-04-04 13:17:26 +09:00
yui-knk 6056773105 Move shareable_constant_value logic from parse.y to compile.c 2024-04-04 08:44:10 +09:00
KJ Tsanaktsidis 9d0a5148ae Add missing RB_GC_GUARDs related to DATA_PTR
I discovered the problem in `compile.c` from a failing
TestIseqLoad#test_stressful_roundtrip test with ASAN enabled. The other
two changes in array.c and string.c I found by auditing similar usages
of DATA_PTR in the codebase.

[Bug #20402]
2024-03-31 20:33:38 +11:00
Maxime Chevalier-Boisvert bb3cbdfe2f
YJIT: add iseq_alloc_count to stats (#10398)
* YJIT: add iseq_alloc_count to stats

* Remove an empty line

---------

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2024-03-28 15:21:09 -04:00
Étienne Barrié 12be40ae6b Implement chilled strings
[Feature #20205]

As a path toward enabling frozen string literals by default in the future,
this commit introduce "chilled strings". From a user perspective chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than to raise a
`FrozenError`.

Implementation wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.

When code is compiled with frozen string literals neither explictly enabled
or disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.

Chilled strings have the `FL_FREEZE` flag as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.

Notes:
  - `String#freeze`: clears the chilled flag.
  - `String#-@`: acts as if the string was mutable.
  - `String#+@`: acts as if the string was mutable.
  - `String#clone`: copies the chilled flag.

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-19 09:26:49 +01:00
Jeremy Evans 815c7e197c Avoid caller-side hash allocation for f(*a, kw: 1) and f(*a, kw: 1, &block)
Previously, this used:

```
splatarray false
duphash
getlocal/getblockparamproxy # in the block passing case
send ARGS_SPLAT|KW_SPLAT|KW_SPLAT_MUT
```

This changes the duphash to putobject, with putobject using
a frozen version of the hash, and removing the keyword mutability:

```
splatarray false
putobject
getlocal/getblockparamproxy # in the block passing case
send ARGS_SPLAT|KW_SPLAT
```
2024-03-16 09:27:32 -07:00
Jean Boussier 91bf7eb274 Refactor frozen_string_literal check during compilation
In preparation for https://bugs.ruby-lang.org/issues/20205.

The `frozen_string_literal` compilation option will no longer
be a boolean but a tri-state: `on/off/default`.
2024-03-15 15:52:33 +01:00
Jeremy Evans c388784943 Fix array allocation optimization for f(*a, kw: 1)
This was broken during the refactoring in 22e488464a.
2024-03-14 16:11:35 -07:00
Jeremy Evans 6ea01d204e
Fix dump of hidden local variable indexes
This fixes test failures when running tests with
RUBY_ISEQ_DUMP_DEBUG=to_binary, which started after
5899f6aa55 was committed.

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-03-06 09:10:00 -08:00
Jeremy Evans 5899f6aa55 Keep hidden local variables when dumping and loading iseqs
Fixes [Bug #19975]
2024-03-04 09:49:55 -08:00
S-H-GAMELINKS 2d8788e90c Support NODE_ONCE for pattern matching 2024-03-04 12:33:00 +09:00
Jeremy Evans 99384bac28 Correctly set anon_kwrest flag for def f(b: 1, **)
In cases where a method accepts both keywords and an anonymous
keyword splat, the method was not marked as taking an anonymous
keyword splat.  Fix that in the compiler.

Doing that broke handling of nil keyword splats in yjit, so
update yjit to handle that.

Add a test to check that calling a method that accepts both
a keyword argument and an anonymous keyword splat does not
modify a passed keyword splat hash.

Move the anon_kwrest check from setup_parameters_complex to
ignore_keyword_hash_p, and only use it if the keyword hash
is already a hash. This should speed things up slightly as
it avoids a check previously used for all callers of
setup_parameters_complex.

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-03-01 12:36:19 -08:00
Jeremy Evans 334e4c65b3 Fix a couple issues noticed by nobu
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-03-01 07:10:25 -08:00
Jeremy Evans 73371450c3 Avoid 1-2 array allocations for zsuper calls with post arguments
These previously resulted in 2 array allocations, one for newarray
and one for concatarray.  This replaces newarray and concatarray
with pushtoarray, and changes splatarray false to splatarray true,
which reduces it to 1 array allocation, in splatarray true.

This also sets VM_CALL_ARGS_SPLAT_MUT, so if the super method
accepts a positional splat, this will avoid an additional array
allocation on the callee side.
2024-03-01 07:10:25 -08:00
Jeremy Evans 32c58753af Fix splatarray false peephole optimization for f(*ary, **kw, &block)
This optimization stopped being using when the splatkw VM instruction
was added.  This change allows the optimization to apply again. This
also optimizes the following cases:

  super(*ary, **kw, &block)
  f(...)
  super(...)
2024-03-01 07:10:25 -08:00
Jeremy Evans e484ffaf20 Perform splatarray false peephole optimization for invokesuper in addition to send
This optimizes cases such as:

  super(arg, *ary)
  super(arg, *ary, &block)
  super(*ary, **kw)
  super(*ary, kw: 1)
  super(*ary, kw: 1, &block)

The super(*ary, **kw, &block) case does not use the splatarray false
optimization.  This is also true of the send case, since the
introduction of the splatkw VM instruction.  That will be fixed in
a later commit.
2024-03-01 07:10:25 -08:00
Kevin Newton 0f1ca9492c [PRISM] Provide runtime flag for prism in iseq 2024-02-21 11:44:40 -05:00
Alan Wu 2a6917b463 Fix string value in hash literal being forced frozen
We should pass `false` for `hash_key` for value nodes. Credits to
`@kddnewton` for noticing and bisecting.
2024-02-20 21:00:54 -05:00
yui-knk e7ab5d891c Introduce NODE_REGX to manage regexp literal 2024-02-21 08:06:48 +09:00
Jeremy Evans 77c1233f79 Add pushtoarraykwsplat instruction to avoid unnecessary array allocation
This is designed to replace the newarraykwsplat instruction, which is
no longer used in the parse.y compiler after this commit.  This avoids
an unnecessary array allocation in the case where ARGSCAT is followed
by LIST with keyword:

```ruby
a = []
kw = {}
[*a, 1, **kw]
```

Previous Instructions:

```
0000 newarray                               0                         (   1)[Li]
0002 setlocal_WC_0                          a@0
0004 newhash                                0                         (   2)[Li]
0006 setlocal_WC_0                          kw@1
0008 getlocal_WC_0                          a@0                       (   3)[Li]
0010 splatarray                             true
0012 putobject_INT2FIX_1_
0013 putspecialobject                       1
0015 newhash                                0
0017 getlocal_WC_0                          kw@1
0019 opt_send_without_block                 <calldata!mid:core#hash_merge_kwd, argc:2, ARGS_SIMPLE>
0021 newarraykwsplat                        2
0023 concattoarray
0024 leave
```

New Instructions:

```
0000 newarray                               0                         (   1)[Li]
0002 setlocal_WC_0                          a@0
0004 newhash                                0                         (   2)[Li]
0006 setlocal_WC_0                          kw@1
0008 getlocal_WC_0                          a@0                       (   3)[Li]
0010 splatarray                             true
0012 putobject_INT2FIX_1_
0013 pushtoarray                            1
0015 putspecialobject                       1
0017 newhash                                0
0019 getlocal_WC_0                          kw@1
0021 opt_send_without_block                 <calldata!mid:core#hash_merge_kwd, argc:2, ARGS_SIMPLE>
0023 pushtoarraykwsplat
0024 leave
```

pushtoarraykwsplat is designed to be simpler than newarraykwsplat.
It does not take a variable number of arguments from the stack, it
pops the top of the stack, and appends it to the second from the top,
unless the top of the stack is an empty hash.

During this work, I found the ARGSPUSH followed by HASH with keyword
did not compile correctly, as it pushed the generated hash to the
array even if the hash was empty.  This fixes the behavior, to use
pushtoarraykwsplat instead of pushtoarray in that case:

```ruby
a = []
kw = {}
[*a, **kw]

[{}] # Before

[] # After
```

This does not remove the newarraykwsplat instruction, as it is still
referenced in the prism compiler (which should be updated similar
to this), YJIT (only in the bindings, it does not appear to be
implemented), and RJIT (in a couple comments).  After those are
updated, the newarraykwsplat instruction should be removed.
2024-02-20 10:47:44 -08:00
Peter Zhu 2967b7eb76 GC guard catch_table_ary
Using RARRAY_CONST_PTR can cause the array object to not exist on the
stack, which could cause it to be GC'd or be moved by GC compaction. This
can cause RARRAY_CONST_PTR to point to the incorrect location if the
array is embedded and moved by GC compaction.

Fixes ruby/prism#2444.
2024-02-16 15:58:39 -05:00
Yusuke Endoh 25d74b9527 Do not include a backtick in error messages and backtraces
[Feature #16495]
2024-02-15 18:42:31 +09:00
Peter Zhu de7a29ef8d Replace assert with RUBY_ASSERT in compile.c
assert does not print the bug report, only the file and line number of
the assertion that failed. RUBY_ASSERT prints the full bug report, which
makes it much easier to debug.
2024-02-12 15:07:47 -05:00
Kevin Newton f7467e70e1 Split line_no and node_id before new_insn_body
Before this commit, there were many places where we had to generate
dummy line nodes to hold both the line number and the node id that
would then immediately get pulled out from the created node. Now
we pass them explicitly so that we don't have to generate these
nodes.

This makes a clearer line between the parser and compiler, and also
makes it easier to generate instructions when we don't have a
specific node to tie them to. As such, it removes almost every
single place where we needed to previously generate dummy nodes.

This also makes it easier for the prism compiler, because now we
can pass in line number and node id instead of trying to generate
dummy nodes for every instruction that we compile.
2024-02-09 17:01:27 -05:00
yui-knk 33c1e082d0 Remove ruby object from string nodes
String nodes holds ruby string object on `VALUE nd_lit`.
This commit changes it to `struct rb_parser_string *string`
to reduce dependency on ruby object.
Sometimes these strings are concatenated with other string
therefore string concatenate functions are needed.
2024-02-09 14:20:17 +09:00