Commit graph

845 commits

Author SHA1 Message Date
HASUMI Hitoshi 2244c58b00 [Universal parser] Decouple IMEMO from rb_ast_t
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object.

## Background

We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby.
To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.

## Summary (file by file)

- `rubyparser.h`
  - Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
  - Use a `TypedData_Make_Struct` VALUE that wraps `rb_ast_t` in `ast_alloc()` so that the GC can manage it
    - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
  - `rb_ruby_ast_new()`, which internally calls `ast_alloc()`, is used to create a `VALUE vast` outside `ruby_parser.c`
- `iseq.c` and `vm_core.h`
  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
  - This keeps the VALUE of the AST on the machine stack so that the GC does not collect it
- `ast.c`
  - Almost all changes replace `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
  - Fix `node_memsize()`
    - Now it includes `rb_ast_local_table_link`, `tokens` and `script_lines`
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
  - Follow-up due to the above changes
- `imemo.{c|h}`
  - If an object with `imemo_ast` appears, consider it a bug

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-04-26 11:21:08 +09:00
Kevin Newton af24ba4034 [PRISM] Raise LoadError when file cannot be read 2024-04-25 14:59:48 -04:00
Jean Boussier f06670c5a2 Eliminate usage of OBJ_FREEZE_RAW
Previously it would bypass the `FL_ABLE` check, but
since the introduction of shapes, it has behaved differently
from `OBJ_FREEZE`, as it would only set the `FL_FREEZE`
flag, but not update the shape.

I have no indication of this causing a bug yet, but it seems
like a trap waiting to happen.
2024-04-16 17:20:35 +02:00
HASUMI Hitoshi 9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)->debug_lines`
  - `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` is called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-15 20:51:54 +09:00
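The change of tactics for `SCRIPT_LINES__` described in the commit above can be pictured with a toy model. This is only a sketch in plain Ruby: `ToyParser` and `build_script_lines` are hypothetical names, the real work happens in C inside `yycompile0()`, and the array here merely stands in for `p->debug_lines`.

```ruby
# Toy model, not CRuby code: the parser collects source lines in a
# container it owns, and the Ruby-visible table is filled in afterwards.
class ToyParser
  attr_reader :debug_lines

  def initialize
    @debug_lines = [] # stands in for p->debug_lines (rb_parser_ary_t *)
  end

  # While "parsing", only the parser-owned array is touched.
  def parse(source)
    source.each_line { |line| @debug_lines << line }
    :toy_ast # placeholder for the parse result
  end
end

# After parsing has finished, build the table entry from the collected
# lines. Copies are stored so the table does not share the parser's strings.
def build_script_lines(table, path, parser)
  table[path] = parser.debug_lines.map(&:dup)
end

script_lines = {} # stands in for SCRIPT_LINES__
toy_parser = ToyParser.new
toy_parser.parse("x = 1\ny = 2\n")
build_script_lines(script_lines, "example.rb", toy_parser)
puts script_lines["example.rb"].size
```

The point of the "after" strategy in this model is that nothing Ruby-visible is touched during parsing, which matches the stated goal of decoupling VALUE from the parser's internal state.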
Nobuyoshi Nakada b88e0d6653
Merge `push_include` and `ruby_push_include` 2024-04-07 17:29:24 +09:00
Nobuyoshi Nakada 0d93fd0f69
Merge `push_include_cygwin` into `push_include` 2024-04-07 17:29:23 +09:00
Nobuyoshi Nakada 0620f006c2
Remove `translit_char`
It has been used only for DOSISH other than Windows.
2024-04-07 17:29:23 +09:00
Nobuyoshi Nakada df8f1f78f0
[Feature #20329] Separate additional flags from main dump options
Additional flags are a comma-separated list, each preceded by `-` or `+`.

Before:
```sh
$ ruby --dump=insns+without_opt
```

After:
```sh
$ ruby --dump=insns-opt,-optimize
```

At the same time, `parsetree_with_comment` is split to `parsetree`
option and additional `comment` flag.

Before:
```sh
$ ruby --dump=parsetree_with_comment
```

After:
```sh
$ ruby --dump=parsetree,+comment
```

Flags can also be given in separate `--dump` options.
```sh
$ ruby --dump=parsetree --dump=+comment --dump=+error_tolerant
```

Ineffective flags are ignored silently.
```sh
$ ruby --dump=parsetree --dump=+comment --dump=+error_tolerant
```
2024-04-06 20:27:02 +09:00
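The syntax described in the commit above (main dump options plus `+`/`-` flags, possibly spread over several `--dump` arguments) can be illustrated with a small parser. This is a sketch in Ruby under the assumption that a leading `+` turns a flag on and a leading `-` turns it off; `parse_dump_args` is a hypothetical helper and does not reflect how `ruby.c` actually implements option handling.

```ruby
# Hypothetical helper: split the values of one or more `--dump=` arguments
# into main options and additional flags.
def parse_dump_args(values)
  options = []
  flags = {}
  values.each do |value|
    value.split(",").each do |item|
      case item
      when /\A\+(.+)\z/ then flags[$1] = true   # "+comment" enables a flag
      when /\A-(.+)\z/  then flags[$1] = false  # "-optimize" disables a flag
      else options << item                      # anything else is a main option
      end
    end
  end
  { options: options, flags: flags }
end

# Equivalent of: ruby --dump=parsetree --dump=+comment --dump=+error_tolerant
result = parse_dump_args(["parsetree", "+comment", "+error_tolerant"])
p result
```

In this model, passing everything in one comma-separated value and passing it across separate `--dump` arguments produce the same result, which is the property the commit message describes.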
Nobuyoshi Nakada 9b5d4274a2
[Feature #20329] Clean up dump sub-options
Restructure `insns_without_opt` and `parsetree_with_comment` as
`insns+without_opt` and `parsetree+with_comment` respectively, like
`+error-tolerant`.
2024-04-06 20:27:01 +09:00
HASUMI Hitoshi f5e387a300 Separate SCRIPT_LINES__ from ast.c
This patch suggests relocating the code dealing with `SCRIPT_LINES__` from ast.c to ruby_parser.c.

## Background

- I guess `AbstractSyntaxTree.of` method used to use `SCRIPT_LINES__` internally for some reason before
- However, now it appears `SCRIPT_LINES__` is no longer used meaningfully by the method
- As evidence of this, (and as my patch shows,) removing the function call of `rb_script_lines_for()` from `ast_s_of()` does not affect the result of `test/ruby/test_ast.rb`

Given the above, I think two possibilities can be considered:

- (A) `AbstractSyntaxTree.of` no longer needs `SCRIPT_LINES__` (I pick this)
- (B) We lack a test case of `AbstractSyntaxTree.of` that needs to use `SCRIPT_LINES__`

## Besides,

The current implementation causes strange behavior:

```console
ruby -e"SCRIPT_LINES__ = {__FILE__ => []}; puts RubyVM::AbstractSyntaxTree.of(->{ 1 + 2 }, keep_script_lines: true).script_lines"
=> `-e:1:in '<main>': undefined method 'script_lines' for nil (NoMethodError)`
```

I think this is a bug because `AbstractSyntaxTree.of` is not supposed to return `nil` even in this case.
This happens because ast.c depends on `SCRIPT_LINES__`.
At the end of `ast_s_of()`, `node_find()` cannot find the target child node, because it makes no sense to look for the node corresponding to the argument of `AbstractSyntaxTree.of` in an AST built from the value of `{__FILE__ => []}`.

## Solution

Since I think it is enough for `SCRIPT_LINES__` to be referenced only by ruby.c, I chose possibility "(A)" and wrote this patch, which moves `rb_script_lines_for()` from ast.c to ruby_parser.c.

So as the result:

- The `ast_s_of()` function no longer looks up `SCRIPT_LINES__`
- Even so, this patched code passes the existing tests
- The strange behavior above no longer happens (I also added a test for it)

Please correct me if I miss something🙏
2024-04-04 18:29:16 +09:00
Kevin Newton 42d1cd8f7f [PRISM] Pass --enable-frozen-string-literal through to evals 2024-03-27 08:34:42 -04:00
Étienne Barrié 12be40ae6b Implement chilled strings
[Feature #20205]

As a path toward enabling frozen string literals by default in the future,
this commit introduces "chilled strings". From a user perspective chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than raising a
`FrozenError`.

Implementation wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.

When code is compiled with frozen string literals neither explicitly enabled
nor disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.

Chilled strings have the `FL_FREEZE` flag so as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.

Notes:
  - `String#freeze`: clears the chilled flag.
  - `String#-@`: acts as if the string was mutable.
  - `String#+@`: acts as if the string was mutable.
  - `String#clone`: copies the chilled flag.

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-19 09:26:49 +01:00
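The user-visible behaviour described in the commit above can be modelled in a few lines. The following is a toy model only: `ChilledString` is a hypothetical class written in plain Ruby, its two instance variables stand in for the `STR_CHILLED` and `FL_FREEZE` flags, and the warning text is a placeholder rather than the message CRuby emits.

```ruby
# Toy model of a chilled string; not how CRuby implements it.
class ChilledString
  attr_reader :warnings

  def initialize(text)
    @text = text.dup
    @chilled = true  # models the chilled state set by `putchilledstring`
    @frozen = false  # models a real, permanent freeze
    @warnings = []
  end

  # A chilled string pretends to be frozen.
  def frozen?
    @chilled || @frozen
  end

  # Models the note on `String#freeze`: clears the chilled flag, freezes for real.
  def freeze
    @chilled = false
    @frozen = true
    self
  end

  # The first mutation of a chilled string warns and drops the frozen status;
  # a really frozen string raises FrozenError.
  def <<(other)
    raise FrozenError, "can't modify frozen string" if @frozen
    if @chilled
      @warnings << "chilled string mutated (placeholder warning text)"
      @chilled = false
    end
    @text << other
    self
  end

  def to_s
    @text.dup
  end
end

literal = ChilledString.new("abc")
literal << "d" # warns once, then behaves like a mutable string
puts literal.to_s
puts literal.warnings.size
```

The model keeps the two states separate on purpose: code that only asks `frozen?` sees the same answer for chilled and frozen strings, which mirrors the stated reason for setting `FL_FREEZE` on chilled strings.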
Kevin Newton 97810cbbf2 [PRISM] Process encoding on CLI for -K 2024-03-18 11:55:43 -04:00
Jean Boussier 91bf7eb274 Refactor frozen_string_literal check during compilation
In preparation for https://bugs.ruby-lang.org/issues/20205.

The `frozen_string_literal` compilation option will no longer
be a boolean but a tri-state: `on/off/default`.
2024-03-15 15:52:33 +01:00
Nobuyoshi Nakada c843afbf6f Chomp last punctuations from descriptions for `-h`
The trailing parts are not shown for the `-h` option. This also keeps
lines from reaching 80 columns: some terminal emulators (Windows command
prompt at least) wrap the cursor to the next line when it reaches the
rightmost column, before exceeding it.
2024-03-14 01:19:57 +09:00
Nobuyoshi Nakada dec2a8191c
`--dump=prism_parsetree` is no longer provided
It did not make sense without the `--parser=prism` option and was just a
duplication.  Now it is `--parser=prism --dump=parsetree`.
2024-03-13 11:28:50 +09:00
Takashi Kokubun 22708be0d7 Revisions for #10198
This fixes some inconsistencies introduced by that PR.
2024-03-12 13:44:48 -07:00
Burdette Lamar 19da3b4ecf
Revisions for help text (#10198) 2024-03-12 15:14:56 -04:00
Kevin Newton a6dac9bb4f [PRISM] Parse stdin on CLI with prism 2024-03-11 12:51:32 -04:00
Jean Boussier d4f3dcf4df Refactor VM root modules
This `st_table` is used to both mark and pin classes
defined from the C API. But `vm->mark_object_ary` already
does both much more efficiently.

Currently a Ruby process starts with 252 rooted classes,
which uses `7224B` in an `st_table` or `2016B` in an `RArray`.

So a baseline of 5kB is saved, but since `mark_object_ary` is
preallocated with `1024` slots but only uses `405` of them,
it's a net `7kB` saving.

`vm->mark_object_ary` is also being refactored.

Prior to this change, `mark_object_ary` was a regular `RArray`, but
since this allows for references to be moved, it was marked a second
time from `rb_vm_mark()` to pin these objects.

This has the detrimental effect of marking these references on every
minor GC even though it's a mostly append-only list.

But by using a custom TypedData we can avoid having to mark
all the references on minor GC runs.

Additionally, immediate values are now ignored and not appended
to `vm->mark_object_ary` as it's just wasted space.
2024-03-06 15:33:43 -05:00
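The memory figures quoted in the commit above can be checked with a little arithmetic. The numbers are taken from the commit message; the interpretation of the "net 7kB" figure (the 252 entries fit into slots that were already preallocated, so the whole `st_table` cost disappears) is an assumption about what the author meant.

```ruby
# Figures quoted in the commit message.
rooted_classes     = 252
st_table_bytes     = 7224
rarray_bytes       = 2016
preallocated_slots = 1024
used_slots         = 405

# 2016 bytes for 252 references works out to 8 bytes per slot.
bytes_per_slot = rarray_bytes / rooted_classes

# Baseline: replace the st_table with a freshly sized array.
baseline_saving = st_table_bytes - rarray_bytes

# Assumed reading of the "net" figure: the extra entries fit into
# already-allocated spare slots, so they cost no additional memory.
slots_after_move = used_slots + rooted_classes
fits_in_preallocation = slots_after_move <= preallocated_slots
net_saving = fits_in_preallocation ? st_table_bytes : baseline_saving

puts "bytes per slot:   #{bytes_per_slot}"
puts "baseline saving:  #{baseline_saving} B"
puts "slots after move: #{slots_after_move} of #{preallocated_slots}"
puts "net saving:       #{net_saving} B"
```

Under this reading the baseline saving is 5208 B (about 5kB) and the net saving is the full 7224 B (about 7kB), consistent with the rounded figures in the message.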
Kevin Newton f9d5c604c8 [PRISM] Use new command line option flags 2024-02-29 12:05:19 -05:00
Kevin Newton f8355e88d6 [PRISM] Do not load -r until we check if main script can be read 2024-02-28 12:42:57 -05:00
Kevin Newton 1cef366319 [PRISM] Factor in CLI options for prism 2024-02-28 11:09:43 -05:00
Nobuyoshi Nakada 835fa98a62 Update warning flags before dump 2024-02-20 23:56:07 +09:00
Kevin Newton 8414c26f0d [PRISM] Make prism compiler warning experimental 2024-02-16 11:56:48 -05:00
Nobuyoshi Nakada 81752d2097
Abort when streaming code from stdin with Prism
Do not read STDIN as a String instance.
2024-02-16 15:26:10 +09:00
Nobuyoshi Nakada a64e93a896
Use ID without cache and fix conversion of offset 2024-02-16 15:26:10 +09:00
Nobuyoshi Nakada e0d068aa9c
Extract `process_options_global_setup` 2024-02-16 15:26:10 +09:00
Nobuyoshi Nakada 839ccad20b
Extract functions depending on `--parser` option 2024-02-16 15:26:09 +09:00
Nobuyoshi Nakada 574312dead Extract `show_help` function 2024-02-16 11:20:29 +09:00
Nobuyoshi Nakada 7ac8d3d6ee Dispose AST before exit by yydebug 2024-02-16 11:20:29 +09:00
Yusuke Endoh 25d74b9527 Do not include a backtick in error messages and backtraces
[Feature #16495]
2024-02-15 18:42:31 +09:00
Kevin Newton 9933377c34 [PRISM] Correctly hook up line numbers for eval 2024-02-14 15:29:26 -05:00
Nobuyoshi Nakada 84d8dbe7a5 Enable redefinition check for rbinc methods 2024-02-12 11:51:06 -08:00
Kevin Newton 4a40364c62 [PRISM] Run opt init before parsing 2024-02-08 14:36:39 -05:00
Kevin Newton 3ecfc3e33e [PRISM] Support the DATA constant 2024-02-08 14:36:29 -05:00
Kevin Newton 71f16d498d Raise errors for dumping prism parse tree 2024-01-31 14:54:39 -05:00
Kevin Newton 610636fd6b [PRISM] Mirror iseq APIs
Before this commit, we were mixing a lot of concerns with the prism
compile between RubyVM::InstructionSequence and the general entry
points to the prism parser/compiler.

This commit makes all of the various prism-related APIs mirror
their corresponding APIs in the existing parser/compiler. This means
we now have the correct frame naming, and it's much easier to follow
where the logic actually flows. Furthermore this consolidates a lot
of the prism initialization, making it easier to see where we could
potentially be raising errors.
2024-01-31 13:41:36 -05:00
Matt Valentine-House 4592fdc545 [Prism] path and script name are not the same
When loading Ruby from a file, or parsing using
RubyVM::InstructionSequence.
2024-01-22 15:15:32 -08:00
Kevin Newton 6bcbb9a02b Make prism respect dump_without_opt 2024-01-22 10:18:41 -05:00
Nobuyoshi Nakada 6215b5ba98 Fix off-by-one error of argc
Fix ruby/ruby#9562
2024-01-17 18:26:39 +09:00
Nobuyoshi Nakada 9ba2558b76 Fix possible out-of-bounds access 2024-01-13 23:20:05 +09:00
Takashi Kokubun 64c52cd1c2 RJIT: Add --rjit-trace to allow TracePoint during JIT 2023-12-21 21:05:13 -08:00
HParker 7ef90b3978 Correct free_on_exit env var to free_at_exit 2023-12-20 14:36:32 +09:00
Aaron Patterson df0bfde2b2 We need to load builtins so that they work
Before this commit no methods defined in Ruby were being loaded. For
example `class` or `tap` methods would not exist.

[ruby-core:115793] [Bug #20073]
2023-12-19 08:53:52 -08:00
Matt Valentine-House 893fe30ef2 [PRISM] Fix crash when --parser=prism called with stdin
[Bug #20071]

Currently Ruby crashes when the --parser=prism flag is used either with
no input, or with input that is being redirected from stdin. So all of
the following will crash

ruby --parser=prism
ruby --parser=prism < test_code.rb
cat test_code.rb | ruby --parser=prism

This commit checks whether the input is assumed to be from stdin, and
then processes that as a file.

This will fix the second and third cases above, but will cause a slight
behavioural change for the first case: Ruby will treat stdin as an
empty file in this case and exit, rather than waiting for data to be
piped into stdin.
2023-12-18 19:44:44 +00:00
Nobuyoshi Nakada 2f595c744e
Adjust styles [ci skip] 2023-12-17 00:21:00 +09:00
Adam Hess a604fe4262 update message to clarify compiler, not parser
Co-authored-by: Ufuk Kayserilioglu <ufuk@paralaus.com>
2023-12-15 13:42:19 -05:00
HParker 55326a915f Introduce --parser runtime flag
Introduce runtime flag for specifying the parser,

```
ruby --parser=prism
```

also update the description:

```
$ ruby --parser=prism --version
ruby 3.3.0dev (2023-12-08T04:47:14Z add-parser-runtime.. 0616384c9f) +PRISM [x86_64-darwin23]
```

[Bug #20044]
2023-12-15 13:42:19 -05:00
Takashi Kokubun e282d7b880 Avoid warning --jit when only YJIT is enabled 2023-12-13 00:05:12 -08:00