In this commit we're splitting up the call nodes that were in target
positions (that is, for loop indices, rescue error captures, and
multi assign targets).
Previously, we would simply leave the call nodes in place. This had
the benefit of keeping the AST relatively simple, but had the
downside of not being very explicit. If a static analysis tool wanted
to only look at call nodes, it could easily be confused because the
method would have 1 fewer argument than it would actually be called
with.
This also brings some consistency to the AST. All of the nodes in
a target position are now *TargetNode nodes. These should all be
treated the same, and the call nodes can now be treated the same.
Finally, there is benefit to memory. Because being in a target
position ensures we don't have some fields, we can strip down the
number of fields on these nodes.
So this commit introduces two new nodes: CallTargetNode and
IndexTargetNode. For CallTargetNode we get to drop the opening_loc,
closing_loc, arguments, and block. Those can never be present. We
also get to mark their fields as non-null, so they will always be
seen as present.
The IndexTargetNode keeps around most of its fields but gets to
drop both the name (because it will always be []=) and the
message_loc (which was always super confusing because it included
the arguments by virtue of being inside the []).
Overall, this adds complexity to the AST at the expense of memory
savings and explicitness. I believe this tradeoff is worth it in
this case, especially because these are very much not common nodes
in the first place.
https://github.com/ruby/prism/commit/3ef71cdb45
This doesn't actually fix the encodings for symbols and regex,
unfortunately. But I wanted to get this change in because it is
the last AST change we're going to make before 3.3 is released.
So, if consumers want, they can start to check these flags to
determine the encoding, even though it will be wrong. Then once we
actually set them correctly, everything should work.
https://github.com/ruby/prism/commit/9b35f7e891
The locals_body_index gives the index in the locals array where
the locals from the body start. This allows compilers to easily
index past the parameters in the locals array.
https://github.com/ruby/prism/commit/5d4627b890
Previously numbered parameters were a field on blocks and lambdas
that indicated the maximum number of numbered parameters in either
the block or lambda, respectively. However they also had a
parameters field that would always be nil in these cases.
This changes it so that we introduce a NumberedParametersNode that
goes in place of parameters, which has a single uint8_t maximum
field on it. That field contains the maximum numbered parameter in
either the block or lambda.
As a part of the PR, I'm introducing a new UInt8Field type that
can be used on nodes, which is just to make it a little more
explicit what the maximum values can be (the maximum is actually 9,
since it only goes up to _9). Plus we can do a couple of nice
things in serialization like just read a single byte.
https://github.com/ruby/prism/commit/2d87303903
Fundamentally, `foo { |bar,| }` is different from `foo { |bar, *| }`
because of arity checks. This PR introduces a new node to handle
that, `ImplicitRestNode`, which goes in the `rest` slot of parameter
nodes instead of `RestParameterNode` instances.
This is also used in a couple of other places, namely:
* pattern matching: `foo in [bar,]`
* multi target: `for foo, in bar do end`
* multi write: `foo, = bar`
Now the only splat nodes with a `NULL` value are when you're
forwarding, as in: `def foo(*) = bar(*)`.
https://github.com/ruby/prism/commit/dba2a3b652
We are aware at parse time how many numbered parameters we have
on a BlockNode or LambdaNode, but prior to this commit, did not
store that information anywhere in its own right.
The numbered parameters were stored as locals, but this does not
distinguish them from other locals that have been set, for example
in `a { b = 1; _1 }` there is nothing on the AST that distinguishes
b from _1.
Consumers such as the compiler need to know information about how
many numbered parameters exist to set up their own tables around
parameters. Since we have this information at parse time, we should
compute it here, instead of deferring the work later on.
https://github.com/ruby/prism/commit/bf4a1e124d
* The same order as in source code.
* CallOrWriteNode, CallOperatorWriteNode, CallAndWriteNode already have
the correct order so it was also inconsistent with them.
https://github.com/ruby/prism/commit/4434e4bc22
Right now when you have a lot of string concats it ends up being
difficult to work with because of the depth of the tree. You end
up descending very far for every string literal that is part of the
concat.
There are already times when we use an interpolated string node to
group together two string segments that are part of the same string
(like when they are interupted by the contents of a heredoc). This
commit takes the same approach and replaces string concats with
interpolated string nodes.
Now that they're a flat list, they should be much easier to work
with. There's still some missing information here that would be
useful to consumers: whether or not there is _actually_ any
interpolation contained in the list. We could remedy this with
another node type that is named something like string list, or we
could add a flag to interpolated string node indicating that there
is interpolation. Either way I want to solve that in a follow-up
commit, since this commit is valuable on its own.
https://github.com/ruby/prism/commit/1e7ae3ad1b
Prior to this commit, KeywordParameterNode included both optional
and required keywords. With this commit, it is split in two, with
`OptionalKeywordParameterNode`s no longer having a value field.
https://github.com/ruby/prism/commit/89084d9af4
Method calls with keyword splat args compile differently than
without since they merge the keyword arg hash with the keyword splat
hash. We know this information at parse time, so can set a flag
which the compiler can use.
https://github.com/ruby/prism/commit/e5f8a9a3cd
Right now, our Call{Operator,And,Or}WriteNode nodes represent two
different concepts:
```ruby
foo.bar += 1
foo[bar] += 1
```
These two statements are different in what they can support. The
former can never have arguments (or an opening_loc or closing_loc).
The former can also never have a block. Also, the former is a
variable method name.
The latter is always going to be []/[]=, it can have any number of
arguments including blocks (`foo[&bar] ||= 1`), and will always
have an opening_loc and closing_loc.
Furthermore, these statements end of having to take different paths
through the various compilers because with the latter you have to
consider the arguments and the block, whereas the former can
perform some additional peephole optimizations since there are
fewer values on the stack.
For these reasons, I'm introducing Index{Operator,And,Or}WriteNode.
These nodes never have a read_name or write_name on them because
they are always []/[]=. They also support blocks, which the previous
write nodes didn't. As a benefit of introducing these nodes, I've
removed the opening_loc, closing_loc, and arguments from the older
write nodes because they will always be null.
For the serialized format, both of these nodes end up being
smaller, and for in-memory we're storing fewer things in general,
so we have savings all around.
I don't love that we are introducing another node that is a call
node since we generally want consumers to only have to handle a
single call, but these nodes are so specific that they would have
to be handled separately anyway since in fact call 2 methods.
https://github.com/ruby/prism/commit/70155db9cd
Moves the common flag bits to the top. This lets us eliminate the `COMMON`
constant, and also allows us to group encoding flags on a nibble so we
can more easily mask them.
https://github.com/ruby/prism/commit/895508659e