In some cases node_id might have been left uninitialized, leading to
undefined behavior on access. So always set it to -1, so that there is
*some* valid value in there.
This `NODE` type was used in the pre-YARV implementation to improve
the performance of assignment to a dynamic local variable defined at
the innermost scope. It no longer has any actual difference from
`NODE_DASGN`, except in the node dump.
Local variable tables were represented as `ID*`, but the representation
was very hacky: the first element is not an ID but the size of the
table, and the last element is (sometimes) a link to the next local
table, used only when the ID tables form a linked list.
This change converts the hacky representation into a normal struct.
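A minimal sketch of the before/after layouts, with hypothetical names
(the real struct and field names may differ):

#include <stddef.h>

typedef unsigned long ID;   /* stand-in for CRuby's ID type */

/* Old convention: tbl[0] holds the size, tbl[1..size] hold the IDs,
 * and tbl[size + 1] sometimes holds a link to the next table. */
static size_t old_table_size(const ID *tbl) { return (size_t)tbl[0]; }
static const ID *old_table_ids(const ID *tbl) { return tbl + 1; }

/* New convention: an explicit struct with a flexible array member. */
typedef struct ast_id_table {
    int size;
    ID ids[];               /* the local variable IDs themselves */
} ast_id_table_t;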
This option makes the parser keep the original source as an array of
its code lines. This feature exploits the mechanism of
`SCRIPT_LINES__`, but records only the code passed to
RubyVM::AST.of or .parse, instead of recording all parsed program texts.
Previously imemo_ast was handled as WB-protected, which caused a
segfault with the following code:
# shareable_constant_value: literal
M0 = {}
M1 = {}
...
M100000 = {}
My analysis: `shareable_constant_value: literal` creates many
Hash instances during parsing and adds them to the node_buffer of
imemo_ast. However, the contents are missed (no write barrier fires
for them) because imemo_ast is incorrectly WB-protected.
This changeset makes imemo_ast WB-unprotected.
This patch changes local table memory to be managed by a linked list
rather than via the garbage collector. It reduces allocations from the
GC and also fixes a use-after-free bug in the concurrent-with-sweep
compactor I'm working on.
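A rough sketch of the idea, with hypothetical names (the actual patch
uses the AST's own buffer structures):

#include <stdlib.h>

struct table_chunk {
    struct table_chunk *next;       /* next allocation in the chain */
    /* table payload follows this header */
};

struct table_list {
    struct table_chunk *head;
};

/* Allocate local-table memory on a list owned by the AST, not the GC. */
static void *
table_list_alloc(struct table_list *list, size_t payload_size)
{
    struct table_chunk *c = malloc(sizeof(*c) + payload_size);
    if (!c) return NULL;
    c->next = list->head;
    list->head = c;
    return c + 1;                   /* caller uses the payload area */
}

/* Everything is released together when the AST itself goes away. */
static void
table_list_free(struct table_list *list)
{
    struct table_chunk *c = list->head;
    while (c) {
        struct table_chunk *next = c->next;
        free(c);
        c = next;
    }
    list->head = NULL;
}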
Use ID instead of GENTRY for gvars.
Global variables are compiled into GENTRY (a pointer to struct
rb_global_entry). This patch replaces GENTRY with ID and makes the
code simpler.
We need to look up the GENTRY from the ID every time (an st_lookup),
so some additional overhead is introduced.
However, the performance of global variable access is not important
nowadays, and this simplicity helps Ractor development.
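A conceptual sketch of the per-access lookup this trades for
simplicity; `global_tbl` and `lookup_global_entry` are illustrative
names, not the actual internals:

#include <ruby.h>
#include <ruby/st.h>

struct rb_global_entry;                 /* opaque here */
static st_table *global_tbl;            /* hypothetical: maps ID -> entry */

static struct rb_global_entry *
lookup_global_entry(ID id)
{
    st_data_t val;
    if (global_tbl && st_lookup(global_tbl, (st_data_t)id, &val)) {
        return (struct rb_global_entry *)val;   /* existing entry */
    }
    return NULL;                        /* caller would create one here */
}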
Saves committers' daily life by avoiding #include-ing everything from
internal.h and making each file include only what it needs instead.
This would significantly speed up incremental builds.
We take the following inclusion order in this changeset:
1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very
first thing among everything).
2. RUBY_EXTCONF_H if any.
3. Standard C headers, sorted alphabetically.
4. Other system headers, maybe guarded by #ifdef.
5. Everything else, sorted alphabetically.
Exceptions are the win32-related headers, which tend not to be self-
contained (they have inclusion-order dependencies). An example follows.
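For illustration, a file following the rules above might begin like
this (the header names after RUBY_EXTCONF_H are only examples):

#include "ruby/config.h"        /* 1. must be the very first include */

#ifdef RUBY_EXTCONF_H
# include RUBY_EXTCONF_H        /* 2. extension config, if any */
#endif

#include <stdio.h>              /* 3. standard C headers, alphabetical */
#include <string.h>

#ifdef HAVE_UNISTD_H
# include <unistd.h>            /* 4. other system headers, guarded */
#endif

#include "internal.h"           /* 5. everything else, alphabetical */
#include "ruby/ruby.h"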
Ripper reuses parse.y for its implementation. Ripper changes the
grammar productions to sometimes return Ruby objects. These Ruby objects
are put onto the parser's stack, so they must be kept alive. This is
where the "mark_ary" comes in. The mark array ensures that Ruby objects
created and pushed on the stack during the course of parsing will stay
alive for the life of the parsing functions.
Unfortunately, Arrays do not prevent their contents from moving. If the
compactor runs, objects on the parser stack could move because the array
won't prevent them from moving. But the GC doesn't know about the
parser stack, so it can't update references in that stack (it will
update them in the array).
This commit changes the mark array to be an identity hash. Since the
identity hash relies on memory addresses for the definition of identity,
the GC will not allow keys in an identity hash to move. We can prevent
movement of objects in the parser stack by sticking them in an identity
hash.
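A minimal sketch of the trick using only public C API (the real patch
wires the hash into the parser's mark function; `mark_hash` and
`keep_alive` are illustrative names):

#include <ruby.h>

static VALUE mark_hash;         /* hypothetical slot, marked as a GC root */

static void
keep_alive(VALUE obj)
{
    if (!mark_hash) {
        /* an identity hash compares keys by address, so the GC will not
         * move any object that appears among its keys during compaction */
        mark_hash = rb_funcall(rb_hash_new(),
                               rb_intern("compare_by_identity"), 0);
    }
    rb_hash_aset(mark_hash, obj, Qtrue);
}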
This changeset basically replaces `ruby_xmalloc(x * y)` with
`ruby_xmalloc2(x, y)`. Some convenience functions are also
provided, for instance `rb_xmalloc_mul_add(x, y, z)`, which allocates
x * y + z bytes.
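A sketch of the before/after pattern; the prototype shown for
`rb_xmalloc_mul_add` is assumed from the description above, and
`alloc_table` is just an illustrative caller:

#include <ruby.h>

/* assumed prototype, normally provided by an internal header */
void *rb_xmalloc_mul_add(size_t x, size_t y, size_t z);

static void *
alloc_table(size_t count, size_t elem_size, size_t header_size)
{
    /* before: the caller multiplies, which can silently overflow    */
    /*     return ruby_xmalloc(count * elem_size + header_size);     */

    /* after: the helper performs overflow-checked arithmetic itself */
    return rb_xmalloc_mul_add(count, elem_size, header_size);
}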
This renames NODE_ARRAY to NODE_LIST, and NODE_ZARRAY to NODE_ZLIST.
NODE_ARRAY is used not only by an Array literal, but also the contents
of Hash literals, method call arguments, dynamic string literals, etc.
In addition, the structure of NODE_ARRAY is a linked list, not an array.
This is very confusing, so I believe `NODE_LIST` is a better name.
I guess those AST nodes were actually used for something, so we'd better
not touch them. Instead, this commit just puts the tmpbuffer inside a
different internal struct so that we can mark them.
This commit adds two buckets for allocating NODE structs, then allocates
"markable" NODE objects from one bucket. The reason to do this is so
when the AST mark function scans nodes for VALUE objects to mark, we
only scan NODE objects that we know to reference VALUE objects. If we
*did not* divide the objects, then the mark function would spend too
much time scanning objects that don't contain any references.
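A sketch of the bucket selection, with hypothetical type and field
names:

enum node_kind { NODE_PLAIN, NODE_WITH_VALUE };   /* illustrative only */

struct node_bucket {
    void *pages;                    /* growth/chunking details omitted */
};

struct node_buffer_sketch {
    struct node_bucket unmarkable;  /* nodes without VALUE references */
    struct node_bucket markable;    /* nodes that embed VALUEs */
};

/* Only the "markable" bucket needs to be scanned by the mark function. */
static struct node_bucket *
bucket_for(struct node_buffer_sketch *nb, enum node_kind kind)
{
    return (kind == NODE_WITH_VALUE) ? &nb->markable : &nb->unmarkable;
}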
Now we can reach the ID table buffer from the id table itself, so when
SCOPE nodes are marked we can keep the buffers alive. This eliminates
the need for the "mark array" during normal parse / compile (IOW *not*
Ripper).
This patch changes the AST mark function so that it will walk through
nodes in the NODE buffer marking Ruby objects rather than using a mark
array to guarantee liveness. The reason I want to do this is so that
when compaction happens on major GCs, node objects will have their
references pinned (or possibly we can update them correctly).
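A rough sketch of the walk, with an illustrative node layout
(`rb_gc_mark` both marks and pins, which is the pinning behavior
mentioned above):

#include <ruby.h>

typedef struct sketch_node {
    VALUE lit;                   /* e.g. a literal object held by the node */
} sketch_node_t;

static void
mark_node_buffer(sketch_node_t *nodes, long count)
{
    for (long i = 0; i < count; i++) {
        /* marking through rb_gc_mark pins the object, so compaction
         * cannot move something the parser still points at from C */
        rb_gc_mark(nodes[i].lit);
    }
}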
`rb_ast_t` holds a reference to this object, so it should mark the
object. Currently it is relying on the `mark_ary` on `node_buffer` to
ensure that the object stays alive. But since the array internals can
move, this could cause a segv if compaction impacts the array.
`(ID)1` was assigned to NODE_ARGS#rest_arg for `{|x,| }`.
This change removes the magic number by introducing an explicit macro
for it: NODE_SPECIAL_EXCESSED_COMMA.
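The gist of the change (the macro definition follows from the text
above; the exact header it lives in is not shown here):

#define NODE_SPECIAL_EXCESSED_COMMA ((ID)1)

/* before:  args->rest_arg == (ID)1                        */
/* after:   args->rest_arg == NODE_SPECIAL_EXCESSED_COMMA  */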