github/ruby - ruby

Commit graph

Author	SHA1	Message	Date
yui-knk	db476cc71c	Introduce NODE_SYM to manage symbol literal `:sym` was managed by `NODE_LIT` with `Symbol` object. This commit introduces `NODE_SYM` so that 1. Symbol literal is detectable from AST Node 2. Reduce dependency on ruby object	2024-01-09 16:07:19 +09:00
S-H-GAMELINKS	1b8d01136c	Introduce Numeric Node's	2024-01-07 09:24:34 +09:00
yui-knk	7a050638b1	Introduce NODE_FILE `__FILE__` was managed by `NODE_STR` with `String` object. This commit introduces `NODE_FILE` and `struct rb_parser_string` so that 1. `__FILE__` is detectable from AST Node 2. Reduce dependency ruby object	2024-01-02 14:19:42 +09:00
Nobuyoshi Nakada	13c9cbe09e	Embed `rb_args_info` in `rb_node_args_t`	2023-10-30 00:19:43 +09:00
Nobuyoshi Nakada	a405b28e85	Delete heredoc line mark references	2023-10-14 11:08:43 +09:00
yui-knk	d293d9e191	Expand pattern_info struct into ARYPTN Node and FNDPTN Node	2023-09-30 13:11:32 +09:00
Peter Zhu	97564ddf2b	Fix memory leak in the parser Reproduction script: ``` require "ripper" 10.times do 20_000.times do Ripper.parse("") end puts `ps -o rss= -p #{$$}` end ``` Before: ``` 28032 34432 40704 47232 53632 60032 66432 72832 79232 85632 ``` After: ``` 21760 21760 21760 21760 21760 21760 21760 21760 21760 21760 ```	2023-09-29 16:51:17 -04:00
yui-knk	74c6781153	Change RNode structure from union to struct All kind of AST nodes use same struct RNode, which has u1, u2, u3 union members for holding different kind of data. This has two problems. 1. Low flexibility of data structure Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand, NODE_OP_ASGN2 needs more than three union members. However they use same structure definition, need to allocate three union members for NODE_TRUE and need to separate NODE_OP_ASGN2 into another node. This change removes the restriction so make it possible to change data structure by each node type. 2. No compile time check for union member access It’s developer’s responsibility for using correct member for each node type when it’s union. This change clarifies which node has which type of fields and enables compile time check. This commit also changes node_buffer_elem_struct buf management to handle different size data with alignment.	2023-09-28 11:58:10 +09:00
yui-knk	fb7a2ddb4b	Directly free structure managed by imemo tmpbuf NODE_ARGS, NODE_ARYPTN, NODE_FNDPTN manage memory of their structure by imemo tmpbuf Object. However rb_ast_struct has reference to NODE. Then these memory can be freed directly when rb_ast_struct is freed. This commit reduces parser's dependency on CRuby functions.	2023-09-22 11:25:53 +09:00
yui-knk	19c62b400d	Replace parser & node compile_option from Hash to bit field This commit reduces dependency to CRuby object.	2023-06-17 16:41:08 +09:00
yui-knk	b481b673d7	[Feature #19719 ] Universal Parser Introduce Universal Parser mode for the parser. This commit includes these changes: * Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions are passed via `struct rb_parser_config_struct` when this macro is enabled. * Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.	2023-06-12 18:23:48 +09:00
yui-knk	5f65e8c5d5	Rename `rb_node_name` to the original name `98637d421d` changes the name of the function. However this function is exported as global, then change the name to origin one for keeping compatibility.	2023-05-24 20:54:48 +09:00
yui-knk	98637d421d	Move `ruby_node_name` to node.c and rename prefix of the function	2023-05-23 18:05:35 +09:00
yui-knk	d8601621ed	Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods Implementation for Language Server Protocol (LSP) sometimes needs token information. For example both `m(1)` and `m(1, )` has same AST structure other than node locations then it's impossible to check the existence of `,` from AST. However in later case, it might be better to suggest variables list for the second argument. Token information is important for such case. This commit adds these methods. * Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of` * Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node including tokens for descendants nodes. * Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless the receiver node. [Feature #19070] Impacts on memory usage and performance are below: Memory usage: ``` $ cat test.rb root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true) $ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] 11408kb # keep_tokens :false $ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb 17508kb # keep_tokens :true $ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb 30960kb ``` Performance: ``` $ cat ../ast_keep_tokens.yml prelude: \| src = <<~SRC module M class C def m1(a, b) 1 + a + b end end end SRC benchmark: without_keep_tokens: \| RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false) with_keep_tokens: \| RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true) $ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml /home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \ --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \ --executables="built-ruby::./miniruby -I../lib -I. 
-I.ext/common ../tool/runruby.rb --extout=.ext -- --disable-gems --disable-gem" \ --output=markdown --output-compare -v ../ast_keep_tokens.yml compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] warming up.. \| \|compare-ruby\|built-ruby\| \|:--------------------\|-----------:\|---------:\| \|without_keep_tokens \| 21.659k\| 21.303k\| \| \| 1.02x\| -\| \|with_keep_tokens \| 6.220k\| 5.691k\| \| \| 1.09x\| -\| ```	2022-11-21 09:01:34 +09:00
yui-knk	4bfdf6d06d	Move `error` from top_stmts and top_stmt to stmt By this change, syntax error is recovered smaller units. In the case below, "DEFN :bar" is same level with "CLASS :Foo" now. ``` module Z class Foo foo. end def bar end end ``` [Feature #19013]	2022-10-08 17:59:11 +09:00
Wolf	c69ad738dc	Initialize node_id In some causes node_id might have been left uninitialized leading to undefined behavior on access. So always set it to -1, so we have some valid value in there.	2022-08-01 10:36:36 +09:00
Takashi Kokubun	5b21e94beb	Expand tabs [ci skip] [Misc #18891]	2022-07-21 09:42:04 -07:00
Nobuyoshi Nakada	54f0e63a8c	Remove `NODE_DASGN_CURR` [Feature #18406 ] This `NODE` type was used in pre-YARV implementation, to improve the performance of assignment to dynamic local variable defined at the innermost scope. It has no longer any actual difference with `NODE_DASGN`, except for the node dump.	2021-12-13 12:53:03 +09:00
S.H	ec7f14d9fa	Add `nd_type_p` macro	2021-12-04 00:01:24 +09:00
Yusuke Endoh	feda058531	Refactor hacky ID tables to struct rb_ast_id_table_t The implementation of a local variable tables was represented as `ID*`, but it was very hacky: the first element is not an ID but the size of the table, and, the last element is (sometimes) a link to the next local table only when the id tables are a linked list. This change converts the hacky implementation to a normal struct.	2021-11-21 08:59:24 +09:00
Yusuke Endoh	753cfbdbf3	node.c (dump_node): update format explanation for NODE_ARGS	2021-11-17 23:38:52 +09:00
Yusuke Endoh	5a7b4dba26	node.c (dump_node): trivial refactoring	2021-11-17 23:38:19 +09:00
Nobuyoshi Nakada	6504ca006b	Show node IDs in dump	2021-07-12 12:10:16 +09:00
Yusuke Endoh	acae5f363d	ast.rb: RubyVM::AST.parse and .of accepts `save_script_lines: true` This option makes the parser keep the original source as an array of the original code lines. This feature exploits the mechanism of `SCRIPT_LINES__` but records only the specified code that is passed to RubyVM::AST.of or .parse, instead of recording all parsed program texts.	2021-06-18 02:34:27 +09:00
Yusuke Endoh	e48109d86f	Partially revert `2c7d3b3a72` to make imemo_ast WB-protected again. Only the test is kept.	2021-04-27 17:05:19 +09:00
Yusuke Endoh	2c7d3b3a72	node.c (rb_ast_new): imemo_ast is WB-unprotected Previously imemo_ast was handled as WB-protected which caused a segfault of the following code: # shareable_constant_value: literal M0 = {} M1 = {} ... M100000 = {} My analysis is here: `shareable_constant_value: literal` creates many Hash instances during parsing, and add them to node_buffer of imemo_ast. However, the contents are missed because imemo_ast is incorrectly WB-protected. This changeset makes imemo_ast as WB-unprotected.	2021-04-26 22:46:51 +09:00
Nobuyoshi Nakada	c060bdc2b4	NODE markability should not change by nd_set_type	2021-01-14 16:12:02 +09:00
Kazuki Tsujimoto	e03e1982bd	Change NODE layout for pattern matching I prefer pconst to be the first element of NODE. Before: \| ARYPTN \| FNDPTN \| HSHPTN ---+--------+--------+----------- u1 \| imemo \| imemo \| pkwargs u2 \| pconst \| pconst \| pconst u3 \| apinfo \| fpinfo \| pkwrestarg After: \| ARYPTN \| FNDPTN \| HSHPTN ---+--------+--------+----------- u1 \| pconst \| pconst \| pconst u2 \| imemo \| imemo \| pkwargs u3 \| apinfo \| fpinfo \| pkwrestarg	2020-11-01 16:19:07 +09:00
Nobuyoshi Nakada	081cc4eb28	Dump FrozenCore specially	2020-10-20 23:52:19 +09:00
Nobuyoshi Nakada	7b2bea42a2	Unfreeze string-literal-only interpolated string-literal [Feature #17104]	2020-09-30 22:15:28 +09:00
Kazuki Tsujimoto	fcdbdff631	rb_{ary,fnd}_pattern_info: Remove imemo member to reduce memory usage This is a partial revert commit of `8f096226e1`. NODE layout: Before: \| ARYPTN \| FNDPTN \| HSHPTN ---+--------+--------+----------- u1 \| pconst \| pconst \| pconst u2 \| unused \| unused \| pkwargs u3 \| apinfo \| fpinfo \| pkwrestarg After: \| ARYPTN \| FNDPTN \| HSHPTN ---+--------+--------+----------- u1 \| imemo \| imemo \| pkwargs u2 \| pconst \| pconst \| pconst u3 \| apinfo \| fpinfo \| pkwrestarg	2020-08-02 01:04:06 +09:00
Aaron Patterson	7533519990	NODE_MATCH needs reference updating	2020-07-30 11:11:13 -07:00
Aaron Patterson	35ba2783fe	Use a linked list to eliminate imemo tmp bufs for managing local tables This patch changes local table memory to be managed by a linked list rather than via the garbage collector. It reduces allocations from the GC and also fixes a use-after-free bug in the concurrent-with-sweep compactor I'm working on.	2020-07-27 12:40:01 -07:00
Koichi Sasada	a0f12a0258	Use ID instead of GENTRY for gvars. (#3278 ) Use ID instead of GENTRY for gvars. Global variables are compiled into GENTRY (a pointer to struct rb_global_entry). This patch replace this GENTRY to ID and make the code simple. We need to search GENTRY from ID every time (st_lookup), so additional overhead will be introduced. However, the performance of accessing global variables is not important now a day and this simplicity helps Ractor development.	2020-07-03 16:56:44 +09:00
Kazuki Tsujimoto	ddded1157a	Introduce find pattern [Feature #16828 ]	2020-06-14 09:24:36 +09:00
卜部昌平	5e22f873ed	decouple internal.h headers Saves comitters' daily life by avoid #include-ing everything from internal.h to make each file do so instead. This would significantly speed up incremental builds. We take the following inclusion order in this changeset: 1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very first thing among everything). 2. RUBY_EXTCONF_H if any. 3. Standard C headers, sorted alphabetically. 4. Other system headers, maybe guarded by #ifdef 5. Everything else, sorted alphabetically. Exceptions are those win32-related headers, which tend not be self- containing (headers have inclusion order dependencies).	2019-12-26 20:45:12 +09:00
Nobuyoshi Nakada	fb6a489af2	Revert "Method reference operator" This reverts commit `67c5747369`. [Feature #16275]	2019-11-12 17:24:48 +09:00
Aaron Patterson	7460c884fb	Use an identity hash for pinning Ripper objects Ripper reuses parse.y for its implementation. Ripper changes the grammar productions to sometimes return Ruby objects. This Ruby objects are put in to the parser's stack, so they must be kept alive. This is where the "mark_ary" comes in. The mark array ensures that Ruby objects created and pushed on the stack during the course of parsing will stay alive for the life of the parsing functions. Unfortunately, Arrays do not prevent their contents from moving. If the compactor runs, objects on the parser stack could move because the array won't prevent them from moving. But the GC doesn't know about the parser stack, so it can't update references in that stack (it will update them in the array). This commit changes the mark array to be an identity hash. Since the identity hash relies on memory addresses for the definition of identity, the GC will not allow keys in an identity hash to move. We can prevent movement of objects in the parser stack by sticking them in an identity hash.	2019-11-05 08:24:14 -08:00
卜部昌平	7e0ae1698d	avoid overflow in integer multiplication This changeset basically replaces `ruby_xmalloc(x * y)` into `ruby_xmalloc2(x, y)`. Some convenient functions are also provided for instance `rb_xmalloc_mul_add(x, y, z)` which allocates x * y + z byes.	2019-10-09 12:12:28 +09:00
Nobuyoshi Nakada	0c6f36668a	Adjusted spaces [ci skip]	2019-09-27 10:20:56 +09:00
Aaron Patterson	293c6c8cc3	Add compaction support to `rb_ast_t` This commit adds compaction support to `rb_ast_t`.	2019-09-26 15:41:46 -07:00
Aaron Patterson	414a80d242	`NODE_MATCH` needs to be marked / allocated from marking bucket Fixes a test in RubySpec	2019-09-10 10:44:49 -07:00
Aaron Patterson	4524780d17	Revert "Reverting node marking until I can fix GC problem." This reverts commit `092f31e7e2`.	2019-09-09 14:26:51 -07:00
Yusuke Endoh	99c9431ea1	Rename NODE_ARRAY to NODE_LIST to reflect its actual use cases and NODE_ZARRAY to NODE_ZLIST. NODE_ARRAY is used not only by an Array literal, but also the contents of Hash literals, method call arguments, dynamic string literals, etc. In addition, the structure of NODE_ARRAY is a linked list, not an array. This is very confusing, so I believe `NODE_LIST` is a better name.	2019-09-07 13:56:29 +09:00
Aaron Patterson	092f31e7e2	Reverting node marking until I can fix GC problem. Looks like we're getting WB misses during stressful GC on startup. I am investigating.	2019-09-05 12:44:23 -07:00
Aaron Patterson	f211ab2015	I forgot to add `break` in my case statements Give me a break.	2019-09-05 11:37:03 -07:00
Aaron Patterson	8f096226e1	Stash tmpbuffer inside internal structs I guess those AST node were actually used for something, so we'd better not touch them. Instead this commit just puts the tmpbuffer inside a different internal struct so that we can mark them.	2019-09-05 11:04:43 -07:00
Aaron Patterson	8cd845aa5b	add debugging code to the mark function	2019-09-05 10:13:50 -07:00
Aaron Patterson	01aa2462b5	lazily allocate the mark array	2019-09-05 10:13:50 -07:00
Aaron Patterson	545b6db3fb	Create two buckets for allocating NODE structs This commit adds two buckets for allocating NODE structs, then allocates "markable" NODE objects from one bucket. The reason to do this is so when the AST mark function scans nodes for VALUE objects to mark, we only scan NODE objects that we know to reference VALUE objects. If we did not divide the objects, then the mark function spends too much time scanning objects that don't contain any references.	2019-09-05 10:13:50 -07:00

187 commits