github/ruby - ruby

Commit Graph

Author	SHA1	Message	Date
yui-knk	f2728c3393	Change RESBODY Node structure Extracrt exception variable into `nd_exc_var` field to keep the original grammar structure. For example: ``` begin rescue Error => e1 end ``` Before: ``` @ NODE_RESBODY (id: 8, line: 2, location: (2,0)-(2,18)) +- nd_args: \| @ NODE_LIST (id: 2, line: 2, location: (2,7)-(2,12)) \| +- as.nd_alen: 1 \| +- nd_head: \| \| @ NODE_CONST (id: 1, line: 2, location: (2,7)-(2,12)) \| \| +- nd_vid: :Error \| +- nd_next: \| (null node) +- nd_body: \| @ NODE_BLOCK (id: 6, line: 2, location: (2,13)-(2,18)) \| +- nd_head (1): \| \| @ NODE_LASGN (id: 3, line: 2, location: (2,13)-(2,18)) \| \| +- nd_vid: :e1 \| \| +- nd_value: \| \| @ NODE_ERRINFO (id: 5, line: 2, location: (2,13)-(2,18)) \| +- nd_head (2): \| @ NODE_BEGIN (id: 4, line: 2, location: (2,18)-(2,18)) \| +- nd_body: \| (null node) +- nd_next: (null node) ``` After: ``` @ NODE_RESBODY (id: 6, line: 2, location: (2,0)-(2,18)) +- nd_args: \| @ NODE_LIST (id: 2, line: 2, location: (2,7)-(2,12)) \| +- as.nd_alen: 1 \| +- nd_head: \| \| @ NODE_CONST (id: 1, line: 2, location: (2,7)-(2,12)) \| \| +- nd_vid: :Error \| +- nd_next: \| (null node) +- nd_exc_var: \| @ NODE_LASGN (id: 3, line: 2, location: (2,13)-(2,18)) \| +- nd_vid: :e1 \| +- nd_value: \| @ NODE_ERRINFO (id: 5, line: 2, location: (2,13)-(2,18)) +- nd_body: \| @ NODE_BEGIN (id: 4, line: 2, location: (2,18)-(2,18)) \| +- nd_body: \| (null node) +- nd_next: (null node) ```	2024-07-26 07:29:32 +09:00
yui-knk	57b11be15a	Implement UNLESS NODE keyword locations	2024-07-23 14:35:23 +09:00
yui-knk	f23485a8d6	[Feature #20624 ] Enhance `RubyVM::AbstractSyntaxTree::Node#locations` This commit introduce `RubyVM::AbstractSyntaxTree::Node#locations` method and `RubyVM::AbstractSyntaxTree::Location` class. Ruby AST node will hold multiple locations information. `RubyVM::AbstractSyntaxTree::Node#locations` provides a way to access these locations information. `RubyVM::AbstractSyntaxTree::Location` is a class which holds these location information: * `#first_lineno` * `#first_column` * `#last_lineno` * `#last_column`	2024-07-23 12:36:00 +09:00
yui-knk	6be539aab5	Change UNDEF Node structure Change UNDEF Node to hold their items to keep the original grammar structure. For example: ``` undef a, b ``` Before: ``` @ NODE_BLOCK (id: 4, line: 1, location: (1,6)-(1,10))* +- nd_head (1): \| @ NODE_UNDEF (id: 1, line: 1, location: (1,6)-(1,7)) \| +- nd_undef: \| @ NODE_SYM (id: 0, line: 1, location: (1,6)-(1,7)) \| +- string: :a +- nd_head (2): @ NODE_UNDEF (id: 3, line: 1, location: (1,9)-(1,10)) +- nd_undef: @ NODE_SYM (id: 2, line: 1, location: (1,9)-(1,10)) +- string: :b ``` After: ``` @ NODE_UNDEF (id: 1, line: 1, location: (1,6)-(1,10))* +- nd_undefs: +- length: 2 +- element (0): \| @ NODE_SYM (id: 0, line: 1, location: (1,6)-(1,7)) \| +- string: :a +- element (1): @ NODE_SYM (id: 2, line: 1, location: (1,9)-(1,10)) +- string: :b ```	2024-07-20 11:25:26 +09:00
eileencodes	d25b74b32c	Resize arrays in `rb_ary_freeze` and use it for freezing arrays While working on a separate issue we found that in some cases `ary_heap_realloc` was being called on frozen arrays. To fix this, this change does the following: 1) Updates `rb_ary_freeze` to assert the type is an array, return if already frozen, and shrink the capacity if it is not embedded, shared or a shared root. 2) Replaces `rb_obj_freeze` with `rb_ary_freeze` when the object is always an array. 3) In `ary_heap_realloc`, ensure the new capa is set with `ARY_SET_CAPA`. Previously the change in capa was not set. 4) Adds an assertion to `ary_heap_realloc` that the array is not frozen. Some of this work was originally done in https://github.com/ruby/ruby/pull/2640, referencing this issue https://bugs.ruby-lang.org/issues/16291. There didn't appear to be any objections to this PR, it appears to have simply lost traction. The original PR made changes to arrays and strings at the same time, this PR only does arrays. Also it was old enough that rather than revive that branch I've made a new one. I added Lourens as co-author in addtion to Aaron who helped me with this patch. The original PR made this change for performance reasons, and while that's still true for this PR, the goal of this PR is to avoid calling `ary_heap_realloc` on frozen arrays. The capacity should be shrunk _before_ the array is frozen, not after. Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: methodmissing <lourens@methodmissing.com>	2024-07-02 10:34:23 -07:00
yui-knk	899d9f79dd	Rename `vast` to `ast_value` There is an English word "vast". This commit changes the name to a clearer one to avoid confusion.	2024-05-03 12:40:35 +09:00
HASUMI Hitoshi	55a402bb75	Add line_count field to rb_ast_body_t This patch adds `int line_count` field to `rb_ast_body_t` structure. Instead, we no longer cast `script_lines` to Fixnum. ## Background Ref https://github.com/ruby/ruby/pull/10618 In the PR above, we have decoupled IMEMO from `rb_ast_t`. This means we could lift the five-words-restriction of the structure that forced us to unionize `rb_ast_t *` and `FIXNUM` in one field. ## Relating refactor - Remove the second parameter of `rb_ruby_ast_new()` function ## Attention I will remove a code that assigns -1 to line_count, in `rb_binding_add_dynavars()` of vm.c, because I don't think it is necessary. But I will make another PR for this so that we can atomically revert in case I was wrong (See the comment on the code)	2024-04-27 12:08:26 +09:00
HASUMI Hitoshi	2244c58b00	[Universal parser] Decouple IMEMO from rb_ast_t This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object. ## Background We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby. To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE. ## Summary (file by file) - `rubyparser.h` - Remove the `VALUE flags` member from `rb_ast_t` - `ruby_parser.c` and `internal/ruby_parser.h` - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` `in ast_alloc()` so that GC can manage it - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()` - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t ` to `VALUE` - rb_ruby_ast_new() which internally `calls ast_alloc()` is to create VALUE vast outside ruby_parser.c - `iseq.c` and `vm_core.h` - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t ` to `VALUE` - This keeps the VALUE of AST on the machine stack to prevent being removed by GC - `ast.c` - Almost all change is replacement `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff) - Fix `node_memsize()` - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines - `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c` - Follow-up due to the above changes - `imemo.{c\|h}` - If an object with `imemo_ast` appears, considers it a bug Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>	2024-04-26 11:21:08 +09:00
yui-knk	2992e1074a	Refactor parser compile functions Refactor parser compile functions to reduce the dependence on ruby functions. This commit includes these changes 1. Refactor `gets`, `input` and `gets_` of `parser_params` Parser needs two different data structure to get next line, function (`gets`) and input data (`input`). However `gets_` is used for both function (`call`) and input data (`ptr`). `call` is used for managing general callback function when `rb_ruby_parser_compile_generic` is used. `ptr` is used for managing the current pointer on String when `parser_compile_string` is used. This commit changes parser to used only `gets` and `input` then removes `gets_`. 2. Move parser_compile functions and `gets` functions from parse.y to ruby_parser.c This change reduces the dependence on ruby functions from parser. 3. Change ruby_parser and ripper to take care of `VALUE input` GC mark Move the responsibility of calling `rb_gc_mark` for `VALUE input` from parser to ruby_parser and ripper. `input` is arbitrary data pointer from the viewpoint of parser. 4. Introduce rb_parser_compile_array function Caller of `rb_parser_compile_generic` needs to take care about GC because ruby_parser doesn’t know about the detail of `lex_gets` and `input`. Introduce `rb_parser_compile_array` to reduce the complexity of ast.c.	2024-04-23 07:20:22 +09:00
HASUMI Hitoshi	9b1e97b211	[Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines This patch is part of universal parser work. ## Summary - Decouple VALUE from members below: - `(struct parser_params )->debug_lines` - `(rb_ast_t )->body.script_lines` - Instead, they are now `rb_parser_ary_t ` - They can also be a `(VALUE)FIXNUM` as before to hold line count - `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE - In order to do this, - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()` - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t ` into `VALUE` ## Other details - Extend `rb_parser_ary_t `. It previously could only store `rb_parser_ast_token `, now can store script_lines, too - Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()` - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]` - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines` - Remove the second parameter of `rb_parser_set_script_lines()` to make it simple - Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines - Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called - With regard to this, please see Future tasks below ## Future tasks - Decouple IMEMO from `rb_ast_t *` - This lifts the five-members-restriction of Ruby object, - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple	2024-04-15 20:51:54 +09:00
HASUMI Hitoshi	f5e387a300	Separate SCRIPT_LINES__ from ast.c This patch suggests relocating the code dealing with `SCRIPT_LINES__` from ast.c to ruby_parser.c. ## Background - I guess `AbstractSyntaxTree.of` method used to use `SCRIPT_LINES__` internally for some reason before - However, now it appears `SCRIPT_LINES__` is no longer used meaningfully by the method - As evidence of this, (and as my patch shows,) removing the function call of `rb_script_lines_for()` from `ast_s_of()` does not affect the result of `test/ruby/test_ast.rb` Given the above, I think two possibilities can be considered: - (A) `AbstractSyntaxTree.of` has not needed `SCRIPT_LINES__` already (I pick this) - (B) We lack a test case of `AbstractSyntaxTree.of` that needs to use `SCRIPT_LINES__` ## Besides, The current implementation causes strange behavior: ```console ruby -e"SCRIPT_LINES__ = {__FILE__ => []}; puts RubyVM::AbstractSyntaxTree.of(->{ 1 + 2 }, keep_script_lines: true).script_lines" => `-e:1:in '<main>': undefined method 'script_lines' for nil (NoMethodError)` ``` I think this is a bug because `AbstractSyntaxTree.of` is not supposed to return `nil` even in this case. This happens due to the ast.c's dependence on `SCRIPT_LINES__`. And at the end of the `ast_s_of()`, `node_find()` can not find the target child node obviously because it doesn't make sense to look for a corresponding node made from the parameter of `AbstractSyntaxTree.of` in the AST tree made from the value of `{__FILE__ => []}` ## Solution Since I think it's good enough `SCRIPT_LINES__` to be only referred by ruby.c, I chose the possibility "(A)" and wrote this patch which moves `rb_script_lines_for()` from ast.c to ruby_parser.c. So as the result: - `ast_s_of()` function no longer look up `SCRIPT_LINES__` - Even so, this patched code passes the existing tests - The strange behavior above no longer happens (I also added a test for it) Please correct me if I miss something🙏	2024-04-04 18:29:16 +09:00
yui-knk	f057741c5d	NODE_LIT is not used anymore	2024-04-04 13:17:26 +09:00
HASUMI Hitoshi	9a19cfd4cd	[Universal Parser] Reduce dependence on RArray in parse.y - Introduce `rb_parser_ary_t` structure to partly eliminate RArray from parse.y - In this patch, `parser_params->tokens` and `parser_params->ast->node_buffer->tokens` are now `rb_parser_ary_t ` - Instead, `ast_node_all_tokens()` internally creates a Ruby Array object from the `rb_parser_ary_t` - Also, delete `rb_ast_tokens()` and `rb_ast_set_tokens()` in node.c - Implement `rb_parser_str_escape()` - This is a port of the `rb_str_escape()` function in string.c - `rb_parser_str_escape()` does not depend on `VALUE` (RString) - Instead, it uses `rb_parser_stirng_t ` - This function works when --dump=y option passed - Because WIP of the universal parser, similar functions like `rb_parser_tokens_free()` exist in both node.c and parse.y. Refactoring them may be needed in some way in the future - Although we considered redesigning the structure: `ast->node_buffer->tokens` into `ast->tokens`, we leave it as it is because `rb_ast_t` is an imemo. (We will address it in the future)	2024-03-12 17:17:52 +09:00
Kevin Newton	82a4c3af16	Add error for iseqs compiled by prism	2024-02-21 11:44:40 -05:00
yui-knk	e7ab5d891c	Introduce NODE_REGX to manage regexp literal	2024-02-21 08:06:48 +09:00
yui-knk	89cfc15207	[Feature #20257 ] Rearchitect Ripper Introduce another semantic value stack for Ripper so that Ripper can manage both Node and Ruby Object separately. This rearchitectutre of Ripper solves these issues. Therefore adding test cases for them. * [Bug 10436] https://bugs.ruby-lang.org/issues/10436 * [Bug 18988] https://bugs.ruby-lang.org/issues/18988 * [Bug 20055] https://bugs.ruby-lang.org/issues/20055 Checked the differences of `Ripper.sexp` for files under `/test/ruby` are only on test_pattern_matching.rb. The differences comes from the differences between `new_hash_pattern_tail` functions between parser and Ripper. Ripper `new_hash_pattern_tail` didn’t call `assignable` then `kw_rest_arg` wasn’t marked as local variable. This is also fixed by this commit. ``` --- a/./tmp/before/test_pattern_matching.rb +++ b/./tmp/after/test_pattern_matching.rb @@ -3607,7 +3607,7 @@ [:in, [:hshptn, nil, [], [:var_field, [:@ident, “a”, [984, 13]]]], [[:binary, - [:vcall, [:@ident, “a”, [985, 10]]], + [:var_ref, [:@ident, “a”, [985, 10]]], :==, [:hash, nil]]], nil]]], @@ -3662,7 +3662,7 @@ [:in, [:hshptn, nil, [], [:var_field, [:@ident, “a”, [993, 13]]]], [[:binary, - [:vcall, [:@ident, “a”, [994, 10]]], + [:var_ref, [:@ident, “a”, [994, 10]]], :==, [:hash, [:assoclist_from_args, @@ -3813,7 +3813,7 @@ [:command, [:@ident, “raise”, [1022, 10]], [:args_add_block, - [[:vcall, [:@ident, “b”, [1022, 16]]]], + [[:var_ref, [:@ident, “b”, [1022, 16]]]], false]]], [:else, [[:var_ref, [:@kw, “true”, [1024, 10]]]]]]]], nil, @@ -3876,7 +3876,7 @@ [:@int, “0”, [1033, 15]]], :“&&“, [:binary, - [:vcall, [:@ident, “b”, [1033, 20]]], + [:var_ref, [:@ident, “b”, [1033, 20]]], :==, [:hash, nil]]]], nil]]], @@ -3946,7 +3946,7 @@ [:@int, “0”, [1042, 15]]], :“&&“, [:binary, - [:vcall, [:@ident, “b”, [1042, 20]]], + [:var_ref, [:@ident, “b”, [1042, 20]]], :==, [:hash, [:assoclist_from_args, @@ -5206,7 +5206,7 @@ [[:assoc_new, [:@label, “c:“, [1352, 22]], [:@int, “0”, [1352, 25]]]]]], - 
[:vcall, [:@ident, “r”, [1352, 29]]]], + [:var_ref, [:@ident, “r”, [1352, 29]]]], false]]], [:binary, [:call, @@ -5299,7 +5299,7 @@ [:assoc_new, [:@label, “c:“, [1367, 34]], [:@int, “0”, [1367, 37]]]]]], - [:vcall, [:@ident, “r”, [1367, 41]]]], + [:var_ref, [:@ident, “r”, [1367, 41]]]], false]]], [:binary, [:call, @@ -5931,7 +5931,7 @@ [:in, [:hshptn, nil, [], [:var_field, [:@ident, “r”, [1533, 11]]]], [[:binary, - [:vcall, [:@ident, “r”, [1534, 8]]], + [:var_ref, [:@ident, “r”, [1534, 8]]], :==, [:hash, [:assoclist_from_args, ```	2024-02-20 17:33:58 +09:00
yui-knk	33c1e082d0	Remove ruby object from string nodes String nodes holds ruby string object on `VALUE nd_lit`. This commit changes it to `struct rb_parser_string *string` to reduce dependency on ruby object. Sometimes these strings are concatenated with other string therefore string concatenate functions are needed.	2024-02-09 14:20:17 +09:00
Nobuyoshi Nakada	e018036d89	Rename `nd_head` in `RNode_RESBODY` as `nd_next`	2024-01-28 11:12:22 +09:00
S.H	9b40f42c22	Introduce `NODE_ENCODING` `__ENCODING__` was managed by `NODE_LIT` with an Encoding object. Introduce `NODE_ENCODING` so that 1. `__ENCODING__` is detectable from the AST node 2. parse.y depends less on Ruby objects	2024-01-27 08:11:10 +00:00
yui-knk	db476cc71c	Introduce NODE_SYM to manage symbol literal `:sym` was managed by `NODE_LIT` with `Symbol` object. This commit introduces `NODE_SYM` so that 1. Symbol literal is detectable from AST Node 2. Reduce dependency on ruby object	2024-01-09 16:07:19 +09:00
yui-knk	7ffff3e043	Change numeric node value functions argument to `NODE *` Change the argument to align with other node value functions like `rb_node_line_lineno_val`.	2024-01-08 14:02:48 +09:00
S-H-GAMELINKS	1b8d01136c	Introduce Numeric Node's	2024-01-07 09:24:34 +09:00
yui-knk	7a050638b1	Introduce NODE_FILE `__FILE__` was managed by `NODE_STR` with `String` object. This commit introduces `NODE_FILE` and `struct rb_parser_string` so that 1. `__FILE__` is detectable from AST Node 2. Reduce dependency ruby object	2024-01-02 14:19:42 +09:00
yui-knk	1ade170a6c	Introduce NODE_LINE `__LINE__` was managed by `NODE_LIT` with `Integer` object. This commit introduces `NODE_LINE` so that 1. `__LINE__` is detectable from AST Node 2. Reduce dependency ruby object	2023-12-29 18:32:27 +09:00
Nobuyoshi Nakada	45eee0cd94	Remove duplicate to_path conversion `rb_file_open_str` calls `FilePathValue`, and the converted result is not used in this function.	2023-11-02 10:06:03 +09:00
Nobuyoshi Nakada	13c9cbe09e	Embed `rb_args_info` in `rb_node_args_t`	2023-10-30 00:19:43 +09:00
yui-knk	08e25985d1	Expand OP_ASGN1 nd_args to nd_index and nd_rvalue ARGSCAT has been used for nd_args to hold index and rvalue, because there was limitation on the number of members for Node. We can easily change structure of node now, let's expand it.	2023-10-20 07:56:20 +09:00
yui-knk	3049b5e348	Differentiate VAR nodes	2023-10-09 13:33:36 +09:00
yui-knk	09b33ea15a	Differentiate CALL nodes	2023-10-09 13:33:36 +09:00
yui-knk	529a651f82	Differentiate ASGN nodes	2023-10-07 17:54:35 +09:00
yui-knk	f28d380374	Pass nd_value to NODE_REQUIRED_KEYWORD_P	2023-10-07 17:54:35 +09:00
Nobuyoshi Nakada	a5cc6341c0	Remove `NODE_VALUES` This node type was added for the multi-value experiment back in 2004. The feature itself was removed after a few years, but this is its remnant.	2023-10-06 03:39:58 +09:00
Nobuyoshi Nakada	696022a0cb	Differentiate `NODE_BREAK`/`NODE_NEXT`/`NODE_RETURN`	2023-10-05 14:23:42 +09:00
Nobuyoshi Nakada	70e1635950	Move internal NODE_DEF_TEMP to parse.y	2023-10-05 14:23:42 +09:00
yui-knk	08239fd6af	Use rb_node_args_t and rb_node_args_aux_t instead of NODE	2023-10-01 19:38:03 +09:00
yui-knk	cecd1de2eb	Use rb_node_opt_arg_t and rb_node_kw_arg_t instead of NODE	2023-10-01 09:19:42 +09:00
yui-knk	d293d9e191	Expand pattern_info struct into ARYPTN Node and FNDPTN Node	2023-09-30 13:11:32 +09:00
yui-knk	68ae87546e	Merge NODE_DEF_TEMP and NODE_DEF_TEMP2	2023-09-29 19:36:34 +09:00
yui-knk	37a783a30c	Merge RNode_OP_ASGN2 and RNode_OP_ASGN22	2023-09-29 08:36:39 +09:00
yui-knk	74c6781153	Change RNode structure from union to struct All kind of AST nodes use same struct RNode, which has u1, u2, u3 union members for holding different kind of data. This has two problems. 1. Low flexibility of data structure Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand, NODE_OP_ASGN2 needs more than three union members. However they use same structure definition, need to allocate three union members for NODE_TRUE and need to separate NODE_OP_ASGN2 into another node. This change removes the restriction so make it possible to change data structure by each node type. 2. No compile time check for union member access It’s developer’s responsibility for using correct member for each node type when it’s union. This change clarifies which node has which type of fields and enables compile time check. This commit also changes node_buffer_elem_struct buf management to handle different size data with alignment.	2023-09-28 11:58:10 +09:00
Nobuyoshi Nakada	6aa16f9ec1	Move SCRIPT_LINES__ away from parse.y	2023-08-25 18:23:05 +09:00
yui-knk	b481b673d7	[Feature #19719 ] Universal Parser Introduce Universal Parser mode for the parser. This commit includes these changes: * Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions are passed via `struct rb_parser_config_struct` when this macro is enabled. * Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.	2023-06-12 18:23:48 +09:00
yui-knk	5f65e8c5d5	Rename `rb_node_name` to the original name `98637d421d` changed the name of the function. However, this function is exported as a global, so change the name back to the original one to keep compatibility.	2023-05-24 20:54:48 +09:00
yui-knk	98637d421d	Move `ruby_node_name` to node.c and rename prefix of the function	2023-05-23 18:05:35 +09:00
Nobuyoshi Nakada	2490b2e121	Add utility macros `DECIMAL_SIZE_OF` and `DECIMAL_SIZE_OF_BYTES`	2023-02-14 15:18:21 +09:00
yui-knk	979dd02e2f	Check if the argument is Thread::Backtrace::Location object [Bug #19262]	2023-01-06 09:22:09 +09:00
yui-knk	d8601621ed	Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods Implementation for Language Server Protocol (LSP) sometimes needs token information. For example both `m(1)` and `m(1, )` has same AST structure other than node locations then it's impossible to check the existence of `,` from AST. However in later case, it might be better to suggest variables list for the second argument. Token information is important for such case. This commit adds these methods. * Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of` * Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node including tokens for descendants nodes. * Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless the receiver node. [Feature #19070] Impacts on memory usage and performance are below: Memory usage: ``` $ cat test.rb root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true) $ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] 11408kb # keep_tokens :false $ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb 17508kb # keep_tokens :true $ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb 30960kb ``` Performance: ``` $ cat ../ast_keep_tokens.yml prelude: \| src = <<~SRC module M class C def m1(a, b) 1 + a + b end end end SRC benchmark: without_keep_tokens: \| RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false) with_keep_tokens: \| RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true) $ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml /home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \ --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \ --executables="built-ruby::./miniruby -I../lib -I. 
-I.ext/common ../tool/runruby.rb --extout=.ext -- --disable-gems --disable-gem" \ --output=markdown --output-compare -v ../ast_keep_tokens.yml compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] warming up.. \| \|compare-ruby\|built-ruby\| \|:--------------------\|-----------:\|---------:\| \|without_keep_tokens \| 21.659k\| 21.303k\| \| \| 1.02x\| -\| \|with_keep_tokens \| 6.220k\| 5.691k\| \| \| 1.09x\| -\| ```	2022-11-21 09:01:34 +09:00
eileencodes	3391c51eff	Add `node_id_for_backtrace_location` function We want to use error highlight with eval'd code, specifically ERB templates. We're able to recover the generated code for eval'd templates and can get a parse tree for the ERB generated code, but we don't have a way to get the node id from the backtrace location. So we can't pass the right node into error highlight. This patch gives us an API to get the node id from the backtrace location so we can find the node in the AST. Error Highlight PR: https://github.com/ruby/error_highlight/pull/26 Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2022-10-31 13:39:56 +09:00
yui-knk	4bfdf6d06d	Move `error` from top_stmts and top_stmt to stmt By this change, syntax error is recovered smaller units. In the case below, "DEFN :bar" is same level with "CLASS :Foo" now. ``` module Z class Foo foo. end def bar end end ``` [Feature #19013]	2022-10-08 17:59:11 +09:00
yui-knk	fbbdbdd891	Add error_tolerant option to RubyVM::AST If this option is enabled, SyntaxError is not raised and Node is returned even if passed script is broken. [Feature #19013]	2022-10-08 17:59:11 +09:00


121 commits
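
The `keep_tokens` and `error_tolerant` options described in commits d8601621ed and fbbdbdd891 above can be tried with a short script. This is a minimal sketch, not code from the repository: it assumes CRuby 3.2 or later, and the helper name `comma_token?` and the variable names are hypothetical, chosen only for illustration. The broken snippet is the one quoted in commit 4bfdf6d06d.

```ruby
# Sketch only: assumes CRuby 3.2 or later, where the `keep_tokens` and
# `error_tolerant` options exist on RubyVM::AbstractSyntaxTree.parse.

# Hypothetical helper: true if the source contains a "," token.
# `m(1)` and `m(1, )` differ only in tokens and node locations,
# so the token list is consulted instead of the tree shape.
def comma_token?(src)
  root = RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)
  # Each token is [id, type, text, location].
  root.all_tokens.any? { |_id, _type, text, _loc| text == "," }
end

without_comma = comma_token?("m(1)")
with_comma    = comma_token?("m(1, )")

# Broken source taken from the commit message of 4bfdf6d06d.
broken = <<~RUBY
  module Z
    class Foo
      foo.
    end

    def bar
    end
  end
RUBY

# Without error_tolerant, a broken script raises SyntaxError.
strict_error =
  begin
    RubyVM::AbstractSyntaxTree.parse(broken)
    nil
  rescue SyntaxError => e
    e
  end

# With error_tolerant, a Node is returned even though the script is broken.
tolerant_root = RubyVM::AbstractSyntaxTree.parse(broken, error_tolerant: true)

puts "comma in m(1):    #{without_comma}"
puts "comma in m(1, ):  #{with_comma}"
puts "strict parse:     #{strict_error.class}"
puts "tolerant parse:   #{tolerant_root.class}"
```

Tokens are only retained when `keep_tokens: true` is passed, which per the commit message costs extra memory and parse time, so a tool would likely enable it only where token-level detail is needed.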