Граф коммитов

75 Коммитов

Автор SHA1 Сообщение Дата
yui-knk d8601621ed Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods
Implementation for Language Server Protocol (LSP) sometimes needs token information.
For example both `m(1)` and `m(1, )` has same AST structure other than node locations
then it's impossible to check the existence of `,` from AST. However in later case,
it might be better to suggest variables list for the second argument.
Token information is important for such case.

This commit adds these methods.

* Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of`
* Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node including tokens for descendants nodes.
* Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless the receiver node.

[Feature #19070]

Impacts on memory usage and performance are below:

Memory usage:

```
$ cat test.rb
root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true)

$ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v
ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
11408kb

# keep_tokens :false
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
17508kb

# keep_tokens :true
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
30960kb
```

Performance:

```
$ cat ../ast_keep_tokens.yml
prelude: |
  src = <<~SRC
    module M
      class C
        def m1(a, b)
          1 + a + b
        end
      end
    end
  SRC
benchmark:
  without_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false)
  with_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)

$ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml
/home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
            --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \
            --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common  ../tool/runruby.rb --extout=.ext  -- --disable-gems --disable-gem" \
            --output=markdown --output-compare -v ../ast_keep_tokens.yml
compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
warming up..

|                     |compare-ruby|built-ruby|
|:--------------------|-----------:|---------:|
|without_keep_tokens  |     21.659k|   21.303k|
|                     |       1.02x|         -|
|with_keep_tokens     |      6.220k|    5.691k|
|                     |       1.09x|         -|
```
2022-11-21 09:01:34 +09:00
eileencodes 3391c51eff Add `node_id_for_backtrace_location` function
We want to use error highlight with eval'd code, specifically ERB
templates. We're able to recover the generated code for eval'd templates
and can get a parse tree for the ERB generated code, but we don't have a
way to get the node id from the backtrace location. So we can't pass the
right node into error highlight.

This patch gives us an API to get the node id from the backtrace
location so we can find the node in the AST.

Error Highlight PR: https://github.com/ruby/error_highlight/pull/26

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2022-10-31 13:39:56 +09:00
yui-knk 4bfdf6d06d Move `error` from top_stmts and top_stmt to stmt
By this change, syntax error is recovered smaller units.
In the case below, "DEFN :bar" is same level with "CLASS :Foo"
now.

```
module Z
  class Foo
    foo.
  end

  def bar
  end
end
```

[Feature #19013]
2022-10-08 17:59:11 +09:00
yui-knk fbbdbdd891 Add error_tolerant option to RubyVM::AST
If this option is enabled, SyntaxError is not raised and Node is
returned even if passed script is broken.

[Feature #19013]
2022-10-08 17:59:11 +09:00
Peter Zhu 5f10bd634f Add ISEQ_BODY macro
Use ISEQ_BODY macro to get the rb_iseq_constant_body of the ISeq. Using
this macro will make it easier for us to change the allocation strategy
of rb_iseq_constant_body when using Variable Width Allocation.
2022-03-24 10:03:51 -04:00
Yusuke Endoh 0dc7816c43 Make RubyVM::AST.of work with code written in `-e` command-line option
[Bug #18434]
2021-12-26 20:57:34 +09:00
Yusuke Endoh 6a51c3e80c Make AST.of possible even under eval when keep_script_lines is enabled
Now the following code works without an exception.

```
RubyVM.keep_script_lines = true

eval(<<END)
def foo
end
END

p RubyVM::AbstractSyntaxTree.of(method(:foo))
```
2021-12-19 04:00:51 +09:00
Yusuke Endoh acac2b8128 Make RubyVM::AbstractSyntaxTree.of raise for backtrace location in eval
This check is needed to fix a bug of error_highlight when NameError
occurred in eval'ed code.
https://github.com/ruby/error_highlight/pull/16

The same check for proc/method has been already introduced since
64ac984129.
2021-12-19 03:51:37 +09:00
Nobuyoshi Nakada 54f0e63a8c Remove `NODE_DASGN_CURR` [Feature #18406]
This `NODE` type was used in pre-YARV implementation, to improve
the performance of assignment to dynamic local variable defined at
the innermost scope.  It has no longer any actual difference with
`NODE_DASGN`, except for the node dump.
2021-12-13 12:53:03 +09:00
S.H ec7f14d9fa
Add `nd_type_p` macro 2021-12-04 00:01:24 +09:00
Yusuke Endoh feda058531 Refactor hacky ID tables to struct rb_ast_id_table_t
The implementation of a local variable tables was represented as `ID*`,
but it was very hacky: the first element is not an ID but the size of
the table, and, the last element is (sometimes) a link to the next local
table only when the id tables are a linked list.

This change converts the hacky implementation to a normal struct.
2021-11-21 08:59:24 +09:00
Yusuke Endoh 09fa773e04
ast.c: Use kept script_lines data instead of re-opening the source file (#5019)
ast.c: Use kept script_lines data instead of re-open the source file
2021-10-26 01:58:01 +09:00
Yusuke Endoh 1b300789ff ast.c: AST.of against C method should return nil (as Ruby 2.6--3.0) 2021-09-18 21:52:18 +09:00
Yusuke Endoh ed9d9cee76 ast.c: AST.of checks if a given method object is defined in C
[Bug #18178]
2021-09-18 21:28:35 +09:00
S-H-GAMELINKS bdd6d8746f Replace RBOOL macro 2021-09-05 23:01:27 +09:00
Yusuke Endoh cad83fa3c4 ast.c: Rename "save_script_lines" to "keep_script_lines"
... as per ko1's preference. He is preparing to extend this feature to
ISeq for his new debugger. He prefers "keep" to "save" for this wording.
This API is internal and not included in any released version, so I
change it in advance.
2021-08-20 16:18:36 +09:00
Jeremy Evans 64ac984129 Make RubyVM::AbstractSyntaxTree.of raise for method/proc created in eval
This changes Thread::Location::Backtrace#absolute_path to return
nil for methods/procs defined in eval.  If the realpath of an iseq
is nil, that indicates it was defined in eval, in which case you
cannot use RubyVM::AbstractSyntaxTree.of.

Fixes [Bug #16983]

Co-authored-by: Koichi Sasada <ko1@atdot.net>
2021-07-29 13:51:03 -07:00
Yusuke Endoh ed8e265d4b Experimentally expose RubyVM::AST::Node#node_id
Now ISeq#to_a includes the node_id list for each bytecode instruction.
I want a way to retrieve the AST::Node instance corresponding to an
instruction for a research purpose including TypeProf-based LSP server.
2021-06-21 21:15:25 +09:00
Yusuke Endoh 0a36cab1b5 Enable USE_ISEQ_NODE_ID by default
... which is formally called EXPERIMENTAL_ISEQ_NODE_ID.

See also ff69ef27b0.

https://bugs.ruby-lang.org/issues/17930
2021-06-18 03:35:38 +09:00
Yusuke Endoh dfba87cd62 Make it possible to get AST::Node from Thread::Backtrace::Location
RubyVM::AST.of(Thread::Backtrace::Location) returns a node that
corresponds to the location. Typically, the node is a method call, but
not always.

This change also includes iseq's dump/load support of node_ids for each
instructions.
2021-06-18 03:35:38 +09:00
Yusuke Endoh fb01411ae8 node.h: Reduce struct size to fit with Ruby object size (five VALUEs)
by merging `rb_ast_body_t#line_count` and `#script_lines`.

Fortunately `line_count == RARRAY_LEN(script_lines)` was always
satisfied. When script_lines is saved, it has an array of lines, and
when not saved, it has a Fixnum that represents the old line_count.
2021-06-18 02:34:27 +09:00
Yusuke Endoh acae5f363d ast.rb: RubyVM::AST.parse and .of accepts `save_script_lines: true`
This option makes the parser keep the original source as an array of
the original code lines. This feature exploits the mechanism of
`SCRIPT_LINES__` but records only the specified code that is passed to
RubyVM::AST.of or .parse, instead of recording all parsed program texts.
2021-06-18 02:34:27 +09:00
Yusuke Endoh ff69ef27b0 compile.c: Pass node instead of nd_line(node) to ADD_INSN* functions
... then, new_insn_core extracts nd_line(node).

Also, if a macro "EXPERIMENTAL_ISEQ_NODE_ID" is defined, this changeset
keeps nd_node_id(node) for each instruction. This is intended for
TypeProf to identify what AST::Node corresponds to each instruction.

This patch is originally authored by @yui-knk for showing which column a
NoMethodError occurred.

https://github.com/ruby/ruby/compare/master...yui-knk:feature/node_id

Co-Authored-By: Yuichiro Kaneko <yui-knk@ruby-lang.org>
2021-05-07 17:02:15 +09:00
S.H bf3eaf39df
Remove unused rb_ast_parse_array declaration 2021-03-20 18:07:54 +09:00
Nobuyoshi Nakada 7b2bea42a2
Unfreeze string-literal-only interpolated string-literal
[Feature #17104]
2020-09-30 22:15:28 +09:00
Nobuyoshi Nakada 11af12026e
Hoisted out functions for no name rest argument symbol 2020-07-08 18:35:46 +09:00
Nobuyoshi Nakada 6a05532315
Constified NODE pointer in ASTNodeData 2020-07-08 18:26:12 +09:00
manga_osyo ff5e660340 Added `NODE_SPECIAL_EXCESSIVE_COMMA` info to `ARGS` of `RubyVM::AbstractSyntaxTree`. 2020-07-08 17:43:04 +09:00
manga_osyo 8e189df32c Add operator info to `OP_ASGN2` of `RubyVM::AbstractSyntaxTree`. 2020-07-06 00:48:15 +09:00
卜部昌平 e634a9d1a5 node_children: do not goto into a branch
Was this an autogenerated function?  This tendency of avoiding empty
branches are no longer preserved (see for instance NODE_IVAR).  Let's
just delete those unnecessary jumps into branches.
2020-06-29 11:05:41 +09:00
Kazuki Tsujimoto ddded1157a
Introduce find pattern [Feature #16828] 2020-06-14 09:24:36 +09:00
Nobuyoshi Nakada d7bef803ac Separate builtin initialization calls 2019-12-29 12:34:55 +09:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
Nobuyoshi Nakada fb6a489af2
Revert "Method reference operator"
This reverts commit 67c5747369.
[Feature #16275]
2019-11-12 17:24:48 +09:00
Nobuyoshi Nakada 20971799f2
Renamed `load_*.inc` as `*.rbinc` to utilize a suffix rule 2019-11-08 16:30:28 +09:00
Koichi Sasada a47d058ebf use builtin for RubyVM::AbstractSyntaxTree.
Define RubyVM::AbstractSyntaxTree in ast.rb
with __builtin functions.
2019-11-08 09:09:29 +09:00
Yusuke Endoh 99c9431ea1 Rename NODE_ARRAY to NODE_LIST to reflect its actual use cases
and NODE_ZARRAY to NODE_ZLIST.

NODE_ARRAY is used not only by an Array literal, but also the contents
of Hash literals, method call arguments, dynamic string literals, etc.
In addition, the structure of NODE_ARRAY is a linked list, not an array.

This is very confusing, so I believe `NODE_LIST` is a better name.
2019-09-07 13:56:29 +09:00
Kazuki Tsujimoto 94d6ec1d90
Make pattern matching support **nil syntax 2019-09-01 16:39:34 +09:00
Jeremy Evans fa41a7b260 Make RubyVM::AbstractSyntaxTree handle **nil syntax
Use false instead of nil for the keyword and keyword rest values
in that case.
2019-08-30 12:39:31 -07:00
Benoit Daloze 39a43d9cd0 Make it as clear as possible that RubyVM is MRI-specific and only exists on MRI (#2113) [ci skip]
* Make it clear as possible that RubyVM is MRI-specific and only exists on MRI

* See [Bug #15743].
* Use "CRuby VM" instead of "Ruby VM" for clarity.

* Use YARV rather than "CRuby VM" for documenting RubyVM::InstructionSequence

* Avoid introducing a new "CRuby VM" term in documentation
2019-08-19 14:51:00 +09:00
Nobuyoshi Nakada 33f54da15b
Support memsize of AST 2019-07-23 16:22:53 +09:00
Nobuyoshi Nakada 4d93340d38
ast.c: update inspect results in the documents 2019-05-22 16:33:03 +09:00
Nobuyoshi Nakada c4bad9f74e
Distinguish pre-condition and post-condition loops 2019-05-18 09:35:40 +09:00
ktsj 243842f68a Avoid usage of the dummy empty BEGIN node
Use NODE_SPECIAL_NO_NAME_REST instead.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67629 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-20 03:37:22 +00:00
yui-knk b8b503d657 Add symbol to the result of `RubyVM::AbstractSyntaxTree#children`.
Add symbol to the result to make pattern match easily.

For example:

(1) NODE_MASGN * NODE_SPECIAL_NO_NAME_REST

```
$ ./miniruby -e 'p RubyVM::AbstractSyntaxTree.parse("a, * = b").children[-1].children'
[#<RubyVM::AbstractSyntaxTree::Node:VCALL@1:7-1:8>, #<RubyVM::AbstractSyntaxTree::Node:ARRAY@1:0-1:1>, :NODE_SPECIAL_NO_NAME_REST]
```

(2) NODE_POSTARG * NODE_SPECIAL_NO_NAME_REST

```
$ ./miniruby -e 'p RubyVM::AbstractSyntaxTree.parse("a, *, _ = b").children[-1].children[-1].children'
[:NODE_SPECIAL_NO_NAME_REST, #<RubyVM::AbstractSyntaxTree::Node:ARRAY@1:6-1:7>]
```

(3) NODE_LASGN * NODE_SPECIAL_REQUIRED_KEYWORD

```
$ ./miniruby -e 'p RubyVM::AbstractSyntaxTree.parse("def a(k:) end").children[-1].children[-1].children[1].children[7].children[0].children'
[:k, :NODE_SPECIAL_REQUIRED_KEYWORD]
```

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67627 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-20 03:25:36 +00:00
ktsj 9738f96fcf Introduce pattern matching [EXPERIMENTAL]
[ruby-core:87945] [Feature #14912]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67586 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-17 06:48:03 +00:00
nobu 56557ec28a [DOC] fix markups [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67337 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-22 11:04:59 +00:00
nobu 4d727639b4 ast.c: fix missing head part in dynamic literal
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66819 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-14 10:16:54 +00:00
nobu dd3ed41cd0 ast.c: argument must be a string
[ruby-core:90904] [Bug #15511]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66736 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-06 05:07:10 +00:00
nobu 67c5747369 Method reference operator
Introduce the new operator for method reference, `.:`.
[Feature #12125] [Feature #13581]
[EXPERIMENTAL]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66667 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-31 15:00:37 +00:00