Invalid escapes are handled at multiple levels. The first level
is in parse.y, so skip invalid unicode escape checks for regexps
in parse.y.
Make rb_reg_preprocess and unescape_nonascii accept the regexp
options. In unescape_nonascii, if the regexp is an extended
regexp, when "#" is encountered, ignore all characters until the
end of line or end of regexp.
Unfortunately, in extended regexps, you can use "#" as a non-comment
character inside a character class, so also parse "[" and "]"
specially for extended regexps, and only skip comments if "#" is
not inside a character class. Handle nested character classes as well.
This issue doesn't just affect extended regexps, it also affects
"(#?" comments inside all regexps. So for those comments, scan
until trailing ")" and ignore content inside.
I'm not sure if there are other corner cases not handled. A
better fix would be to redesign the regexp parser so that it
unescaped during parsing instead of before parsing, so you already
know the current parsing state.
Fixes [Bug #18294]
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
This allows for the following syntax:
```ruby
def foo(*)
bar(*)
end
def baz(**)
quux(**)
end
```
This is a natural addition after the introduction of anonymous
block forwarding. Anonymous rest and keyword rest arguments were
already supported in method parameters, this just allows them to
be used as arguments to other methods. The same advantages of
anonymous block forwarding apply to rest and keyword rest argument
forwarding.
This has some minor changes to #parameters output. Now, instead
of `[:rest], [:keyrest]`, you get `[:rest, :*], [:keyrest, :**]`.
These were already used for `...` forwarding, so I think it makes
it more consistent to include them in other cases. If we want to
use `[:rest], [:keyrest]` in both cases, that is also possible.
I don't think the previous behavior of `[:rest], [:keyrest]` in
the non-... case and `[:rest, :*], [:keyrest, :**]` in the ...
case makes sense, but if we did want that behavior, we'll have to
make more substantial changes, such as using a different ID in the
... forwarding case.
Implements [Feature #18351]
This `NODE` type was used in pre-YARV implementation, to improve
the performance of assignment to dynamic local variable defined at
the innermost scope. It has no longer any actual difference with
`NODE_DASGN`, except for the node dump.
Dumped iseq binary can not have unnamed symbols/IDs, and ID 0 is
stored instead. As `struct rb_id_table` disallows ID 0, also for
the distinction, re-assign a new temporary ID based on the local
variable table index when loading from the binary, as well as the
parser.
The implementation of a local variable tables was represented as `ID*`,
but it was very hacky: the first element is not an ID but the size of
the table, and, the last element is (sometimes) a link to the next local
table only when the id tables are a linked list.
This change converts the hacky implementation to a normal struct.
block to another method without having to provide a name for the
block parameter.
Implements [Feature #11256]
Co-authored-by: Yusuke Endoh mame@ruby-lang.org
Co-authored-by: Nobuyoshi Nakada nobu@ruby-lang.org
`RubyVM.keep_script_lines` enables to keep script lines
for each ISeq and AST. This feature is for debugger/REPL
support.
```ruby
RubyVM.keep_script_lines = true
RubyVM::keep_script_lines = true
eval("def foo = nil\ndef bar = nil")
pp RubyVM::InstructionSequence.of(method(:foo)).script_lines
```
When Bison reports "memory exhausted", it means the parser stack
depth reached the limit `YYMAXDEPTH` which is defaulted to 10_000,
but not memory allocation failed.