Commit graph

2569 commits

Author SHA1 Message Date
yui-knk 4fb7e1b6d0 Change `enum rb_parser_ary_data_type` default value to 1 for easy debug
We have recently been hitting `[BUG] unexpected rb_parser_ary_data_type (0) for script lines`
on the master branch.
This commit makes `enum rb_parser_ary_data_type` start at `1`, so that `0`
is an invalid value. That makes it clear that `rb_parser_ary_data_type (0)`
is not intentional.
2024-06-26 07:48:43 +09:00
Nobuyoshi Nakada 250fc1223c [Bug #20457] Do not remove final `return` node
This was an optimization for versions prior to 1.9, which traversed the
AST at runtime.
2024-06-25 11:07:58 +09:00
Nobuyoshi Nakada 22f98bb7ca Parenthesize `nd_fl_newline` macro expressions 2024-06-25 11:07:58 +09:00
Aaron Patterson cdf33ed5f3 Optimized forwarding callers and callees
This patch optimizes forwarding callers and callees. It only optimizes methods that only take `...` as their parameter, and then pass `...` to other calls.

Calls it optimizes look like this:

```ruby
def bar(a) = a
def foo(...) = bar(...) # optimized
foo(123)
```

```ruby
def bar(a) = a
def foo(...) = bar(1, 2, ...) # optimized
foo(123)
```

```ruby
def bar(*a) = a

def foo(...)
  list = [1, 2]
  bar(*list, ...) # optimized
end
foo(123)
```

All variants of the above but using `super` are also optimized, including a bare super like this:

```ruby
def foo(...)
  super
end
```
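
As a small illustration of the bare `super` case, here is a sketch with made-up class and method names (`Base`, `Child`, `greet` are not from the patch). It only shows the observable behavior: positional and keyword arguments received via `...` reach the parent method unchanged.

```ruby
# Hypothetical classes, used only to illustrate bare `super` with `...`.
class Base
  def greet(name, punctuation: "!")
    "Hello, #{name}#{punctuation}"
  end
end

class Child < Base
  def greet(...)
    super # forwards every argument received via `...`
  end
end

greeting = Child.new.greet("Ruby", punctuation: "?")
puts greeting
```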

This patch eliminates intermediate allocations made when calling methods that accept `...`.
We can observe allocation elimination like this:

```ruby
def m
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

def bar(a) = a
def foo(...) = bar(...)

def test
  m { foo(123) }
end

test
p test # allocates 1 object on master, but 0 objects with this patch
```

```ruby
def bar(a, b:) = a + b
def foo(...) = bar(...)

def test
  m { foo(1, b: 2) }
end

test
p test # allocates 2 objects on master, but 0 objects with this patch
```

How does it work?
-----------------

This patch works by using a dynamic stack size when passing forwarded parameters to callees.
The caller's info object (known as the "CI") contains the stack size of the
parameters, so we pass the CI object itself as a parameter to the callee.
When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee.
The CI at the forwarded call site is adjusted using information from the caller's CI.

I think this description is kind of confusing, so let's walk through an example with code.

```ruby
def delegatee(a, b) = a + b

def delegator(...)
  delegatee(...)  # CI2 (FORWARDING)
end

def caller
  delegator(1, 2) # CI1 (argc: 2)
end
```

Before we call the delegator method, the stack looks like this:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   |
              5|   delegatee(...)  # CI2 (FORWARDING)  |
              6| end                                   |
              7|                                       |
              8| def caller                            |
          ->  9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls in
to `delegator`, it writes `CI1` on to the stack as a local variable for the
`delegator` method.  The `delegator` method has a special local called `...`
that holds the caller's CI object.

Here is the ISeq disasm for `delegator`:

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

The local called `...` will contain the caller's CI: CI1.
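
A listing like this can probably be reproduced on CRuby with `RubyVM::InstructionSequence`. This is a sketch only: `RubyVM` is CRuby-specific, and the exact instructions and flags printed depend on the Ruby version, so builds without this patch will not show `FORWARDING`.

```ruby
def delegatee(a, b) = a + b
def delegator(...) = delegatee(...)

# Look up the compiled instruction sequence for `delegator` and dump it.
iseq = RubyVM::InstructionSequence.of(method(:delegator))
disasm = iseq.disasm
puts disasm
```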

Here is the stack when we enter `delegator`:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
           -> 4|   #                                   | CI1 (argc: 2)
              5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            |
              9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to
memcopy the caller's stack before calling `delegatee`.  In this case, it will
memcopy self, 1, and 2 to the stack before calling `delegatee`.  It knows how much
memory to copy from the caller because `CI1` contains stack size information
(argc: 2).
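
The copy step can be modelled in plain Ruby. This is a toy model under loud assumptions: `CallInfo`, `frame_base`, and the symbol placeholders are hypothetical names for illustration and are not the VM's actual data structures.

```ruby
# Toy model of the stack pictures in this message; not CRuby internals.
CallInfo = Struct.new(:mid, :argc, :forwarding)

# `caller` pushes the receiver and two arguments for CI1.
stack = [:self, 1, 2]
frame_base = 0 # index where the caller's receiver sits (hypothetical)
ci1 = CallInfo.new(:delegator, 2, false)

# Entering `delegator`: the `...` local holds CI1, followed by frame data.
stack.push(ci1, :cref_or_me, :specval, :type)

# The FORWARDING call site reads CI1 to learn how many slots to copy:
# the receiver plus `argc` arguments.
ci2 = CallInfo.new(:delegatee, 0, true)
if ci2.forwarding
  count = ci1.argc + 1
  stack.concat(stack[frame_base, count])
end

p stack.last(3)
```

The point of the model is that the number of copied slots comes from the caller's CI rather than from anything fixed at the forwarding call site.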

Before executing the `send` instruction, we push `...` on the stack.  The
`send` instruction pops `...`, and because it is tagged with `FORWARDING`, it
knows to memcopy (using the information in the CI it just popped):

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

Instruction 0001 puts the caller's CI on the stack.  `send` is tagged with
FORWARDING, so it reads the CI and _copies_ the caller's stack to this stack:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   | CI1 (argc: 2)
           -> 5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            | self
              9|   delegator(1, 2) # CI1 (argc: 2)     | 1
             10| end                                   | 2
```

The "FORWARDING" call site combines information from CI1 with CI2 in order
to support passing other values in addition to the `...` value, as well as
perfectly forward splat args, kwargs, etc.
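
At the Ruby level, the behavior this has to preserve looks like the following sketch. The method bodies are illustrative, not from the patch: a leading argument is added at the forwarding call site, and the splat, keyword, and block arguments from the original call all arrive intact.

```ruby
def delegatee(*args, **kwargs, &block)
  [args, kwargs, block&.call]
end

def delegator(...)
  delegatee(0, ...) # extra leading argument plus everything forwarded
end

result = delegator(1, *[2, 3], k: 4) { :from_block }
p result
```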

Since we're able to copy the stack from `caller` in to `delegator`'s stack, we
can avoid allocating objects.

I want to do this to eliminate object allocations for delegate methods.
My long term goal is to implement `Class#new` in Ruby and it uses `...`.

I was able to implement `Class#new` in Ruby
[here](https://github.com/ruby/ruby/pull/9289).
If we adopt the technique in this patch, then we can optimize allocating
objects that take keyword parameters for `initialize`.

For example, this code will allocate 2 objects: one for `SomeObject`, and one
for the kwargs:

```ruby
SomeObject.new(foo: 1)
```

If we combine this technique, plus implement `Class#new` in Ruby, then we can
reduce allocations for this common operation.
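
One way to check the allocation count on a given build is the same `GC.stat` approach used earlier in this message. This is a sketch: `SomeObject` and the `allocations` helper are defined here purely for illustration, and the number printed depends on the Ruby version, so no specific count is claimed.

```ruby
# Hypothetical class standing in for any object with a keyword initializer.
class SomeObject
  def initialize(foo:)
    @foo = foo
  end
end

# Returns how many objects were allocated while the block ran.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

allocations { SomeObject.new(foo: 1) } # warm up call sites and caches
count = allocations { SomeObject.new(foo: 1) }
puts "objects allocated by SomeObject.new(foo: 1): #{count}"
```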

Co-Authored-By: John Hawthorn <john@hawthorn.email>
Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>
2024-06-18 09:28:25 -07:00
Nobuyoshi Nakada a1f72a563b [Bug #20579] ripper: Dispatch spaces at END-OF-INPUT without newline 2024-06-14 17:54:02 +09:00
Nobuyoshi Nakada 7f47469105 Include `__LINE__` in `add_delayed_token` macro 2024-06-14 17:54:02 +09:00
Nobuyoshi Nakada 2e59cf00cc [Bug #20578] ripper: Fix dispatching part at invalid escapes 2024-06-14 15:02:15 +09:00
S-H-GAMELINKS 1fc0763724 Introduce `ident_or_const` inline rule 2024-06-12 15:36:55 +09:00
Nobuyoshi Nakada 206465e84d ripper: Unify `dispatch_end` 2024-06-12 11:49:33 +09:00
Nobuyoshi Nakada 906a86e4de
Use `dllexport` as `RUBY_FUNC_EXPORTED` on Windows 2024-06-09 16:55:27 +09:00
Nobuyoshi Nakada 7612e45306
ripper: Unify formal argument error handling 2024-06-08 15:00:18 +09:00
Nobuyoshi Nakada 9bee49e0e1
ripper: Unify backref error handling 2024-06-08 13:25:44 +09:00
Nobuyoshi Nakada 18fcec23bf
ripper: Introduce `RIPPER_ID` macro instead of `ripper_id_` macros 2024-06-08 13:20:46 +09:00
Nobuyoshi Nakada 9e28354705
ripper: Fix excess `compile_error` at simple backref op_asgn
Fix up 89cfc15207.
2024-06-07 11:28:38 +09:00
Kevin Newton cbc83c4a92 Remove circular parameter syntax error
https://bugs.ruby-lang.org/issues/20478
2024-06-06 16:29:50 -04:00
Nobuyoshi Nakada 27321290d9 [Bug #20521] ripper: Clean up strterm 2024-06-06 20:43:56 +09:00
Nobuyoshi Nakada ae203984ff Ditto for NODE_DOT2 and NODE_DOT3 2024-06-02 09:43:33 +09:00
Nobuyoshi Nakada 2889ed1bcb Use `RNode_DREGX` variable for debuggers
At least LLDB needs an actual variable not only casts to access the
type in debugger sessions.
2024-06-02 09:43:33 +09:00
Nobuyoshi Nakada cedc7737b6 Make interchangeable NODE types aliases 2024-06-02 09:43:33 +09:00
Nobuyoshi Nakada fd74614059
Get rid of type-punning pointer casts 2024-06-01 21:51:27 +09:00
Nobuyoshi Nakada 05553cf22d
[Bug #20517] Make a multibyte character one token at meta escape 2024-06-01 19:33:12 +09:00
Jeremy Evans 89486c79bb
Make error messages clear blocks/keywords are disallowed in index assignment
Blocks and keywords are allowed in regular index.

Also update NEWS to make this more clear.

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-05-31 08:22:40 -07:00
Yusuke Endoh a15e4d405b Revert 528c4501f4
Recently, `TestRubyLiteral#test_float` has been failing randomly.

```
  1) Error:
TestRubyLiteral#test_float:
ArgumentError: SyntaxError#path changed: "(eval at /home/chkbuild/chkbuild/tmp/build/20240527T050036Z/ruby/test/ruby/test_literal.rb:642)"->"(eval at /home/chkbuild/chkbuild/tmp/build/20240527T050036Z/ruby/test/ruby/test_literal.rb:642)"
```
https://rubyci.s3.amazonaws.com/s390x/ruby-master/log/20240527T050036Z.fail.html.gz

According to Launchable, the first failure was on Apr 30.
This is just when 528c4501f4 was
committed. I don't know if the change is really the cause, but I want to
revert it once to see if the random failure disappears.
2024-05-31 18:24:43 +09:00
Kevin Newton 47f0965269 Update duplicated when clause warning message 2024-05-24 12:36:54 -04:00
Nobuyoshi Nakada a99d79dd31 Remove dead code
Since 140512d222, `else` without
`rescue` has been a syntax error.
2024-05-23 19:28:02 +09:00
Yusuke Endoh 1471a160ba Add RB_GC_GUARD for rb_str_to_parser_string
I think this fixes the following random test failure that could not be
fixed for a long time:

```
  1) Failure:
TestSymbol#test_inspect_under_gc_compact_stress [/home/chkbuild/chkbuild/tmp/build/20240522T003003Z/ruby/test/ruby/test_symbol.rb:126]:
<":testing"> expected but was
<":\"\\x00\\x00\\x00\\x00\\x00\\x00\\x00\"">.
```

The value passed to this function is the return value of `rb_id2str`, so
it is never collected.  However, if auto_compact is enabled, the string
may move and `RSTRING_PTR(str)` becomes invalid.

This change prevents the string from being moved by RB_GC_GUARD.
2024-05-23 19:26:45 +09:00
Nobuyoshi Nakada c773453c77
ripper: Splat find patterns 2024-05-21 13:52:30 +09:00
Nobuyoshi Nakada 501dbf2bca
ripper: Splat hash patterns 2024-05-21 13:52:30 +09:00
Nobuyoshi Nakada 978c31f04a
ripper: Splat array patterns with `pre_arg` 2024-05-21 13:52:30 +09:00
Nobuyoshi Nakada 3e81bc3d53
ripper: Splat `$:opt_args_tail` for `params!` 2024-05-21 13:52:30 +09:00
Nobuyoshi Nakada be0f2ab32d
ripper: Splat `$:head` for `defs!` 2024-05-21 13:52:30 +09:00
Nobuyoshi Nakada 5bba5fb739
ripper: Describe `var_ref` for `user_variable` in ripper DSL 2024-05-21 13:52:30 +09:00
Nobuyoshi Nakada 56d2c26c85
ripper: Move `assign_error` call to `assignable`
Prepare `lhs` as `$:$` before `assignable` and update it there.
Remove `ripper_assignable` which is no longer used.
2024-05-21 13:52:30 +09:00
Nobuyoshi Nakada e61e5c3b84
ripper: Move `assign_error` call to `const_decl`
Prepare `path` as `$:$` before `const_decl` and update it there.
Remove `ripper_const_decl` which is no longer used.
2024-05-21 13:52:29 +09:00
Nobuyoshi Nakada 147134b474
ripper: Remove rb_ripper_none
Now it is used only to tell whether `opt_paren_args` is `none`.  Introduce
a new special node to distinguish empty parentheses from it.
2024-05-21 13:52:29 +09:00
Nobuyoshi Nakada ee8bbbabe5
ripper: Show popped TOS in debug mode 2024-05-21 13:52:29 +09:00
Nobuyoshi Nakada 2e765c20db
ripper: Short hand for `rb_ary_new_from_args` 2024-05-21 13:52:29 +09:00
Nobuyoshi Nakada 2d92a4afba
ripper: Make `$:n` to refer each grammar values
Ripper DSL uses these values for callbacks, but does not need indexes.
2024-05-21 13:52:29 +09:00
Nobuyoshi Nakada 5fed63f7b0
ripper: Use ripper DSL in simple dispatch chain cases 2024-05-21 13:52:29 +09:00
Nobuyoshi Nakada dbbaf871de
[DOC] Fix `$<` comment 2024-05-19 00:29:00 +09:00
Nobuyoshi Nakada fd8e6e8c54
Replace cast tags for `tSTRING_DVAR` with typed midrule actions 2024-05-19 00:27:34 +09:00
Nobuyoshi Nakada 232f7b37cf
Replace cast tags with typed midrule actions
* Add types to `tLAMBDA` and `tSTRING_DBEG` to store corresponding
  information when returning these tokens.
* Add `enum lex_state_e state` to `%union` for `tSTRING_DBEG`.
2024-05-18 19:46:05 +09:00
yui-knk 55c62e676f No need to specify tags anymore
In the past, this code was used by both the parser and ripper.
In ripper, the type of the LHS is `<val>`, so a type cast was needed.
However, this code is now used only by the parser, so there is no need
to cast.
2024-05-18 11:26:17 +09:00
Nobuyoshi Nakada 5695c5df95
ripper: Fix opassign when assigning to backref variables 2024-05-12 15:38:22 +09:00
yui-knk 7e604a0263 Fix SEGV when ripper hits `backref_error` on `command_asgn` or `arg` 2024-05-11 20:47:15 +09:00
Nobuyoshi Nakada 5bb656e4f0
[Bug #20474] Keep spaces in leading blank line 2024-05-08 19:25:37 +09:00
yui-knk cf74ff714a Change return value of `gets` function to be `rb_parser_string_t *` instead of `VALUE`
This change reduces parser's dependency on ruby object.
2024-05-04 11:59:10 +09:00
ydah e1905ca180 Use user defined parameterizing rules `f_optarg(value)` 2024-05-03 12:05:21 +09:00
ydah ed5a7a59c0 Use callee side tag specification of parameterizing rules 2024-05-02 15:04:20 +09:00
Peter Zhu 7ef8bb129f Fix memory leak in Ripper.sexp
rb_ast_dispose does not free the rb_ast_t causing it to be leaked. This
commit changes it to use rb_ast_free instead.

For example:

    require "ripper"

    10.times do
      100_000.times do
        Ripper.sexp("")
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    27648
    32512
    37376
    42240
    47232
    52224
    57344
    62208
    67072
    71936

After:

    22784
    22784
    22784
    22784
    22912
    22912
    22912
    22912
    22912
    22912
2024-05-01 11:09:54 -04:00