Commit Graph

870 Commits

Author SHA1 Message Date
Peter Zhu 51ffef2819 Fix memory leak in prism when syntax error in iseq compilation
If there's a syntax error during iseq compilation then prism would leak
memory because it would not free the pm_parse_result_t.

This commit changes pm_iseq_new_with_opt to have a rb_protect to catch
when an error is raised, and return NULL and set error_state to a value
that can be raised by calling rb_jump_tag after memory has been freed.

For example:

    10.times do
      10_000.times do
        eval("/[/=~s")
      rescue SyntaxError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    39280
    68736
    99232
    128864
    158896
    188208
    217344
    246304
    275376
    304592

After:

    12192
    13200
    14256
    14848
    16000
    16000
    16000
    16064
    17232
    17952
2024-11-08 15:43:41 -05:00
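The protect-then-re-raise pattern described in commit 51ffef2819 can be sketched in Ruby. This is only an analogy: the actual change is in C and relies on `rb_protect` and `rb_jump_tag`. The names `FakeParseResult` and `compile_with_cleanup` are hypothetical and exist only for this illustration.

```ruby
# Hypothetical stand-in for pm_parse_result_t; it only records whether it was freed.
class FakeParseResult
  attr_reader :freed

  def initialize
    @freed = false
  end

  def free
    @freed = true
  end
end

# Runs the block, captures any exception (the role rb_protect plays in C),
# frees the resource, and only then re-raises (the role of rb_jump_tag).
def compile_with_cleanup(resource)
  error = nil
  result = nil
  begin
    result = yield
  rescue Exception => e # SyntaxError is not a StandardError, so rescue broadly
    error = e
  end
  resource.free
  raise error if error
  result
end

parse_result = FakeParseResult.new
begin
  compile_with_cleanup(parse_result) { raise SyntaxError, "simulated syntax error" }
rescue SyntaxError => e
  puts "freed before re-raise: #{parse_result.freed} (#{e.class})"
end
```

In idiomatic Ruby an `ensure` clause would do the same job; the explicit capture is kept here only to mirror the C control flow.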
Koichi Sasada ab7ab9e450 `Warning[:strict_unused_block]`
to show unused block warning strictly.

```ruby
class C
  def f = nil
end

class D
  def f = yield
end

[C.new, D.new].each{|obj| obj.f{}}
```

In this case, `D#f` accepts a block, but `C#f` does not.
Some call sites, such as `obj.f{}`, pass a block where `obj`
may be either a `C` or a `D`. To avoid warnings in such cases,
the "unused block" warning is emitted only if no method with
the same name accepts a block.
In the example above, `C.new.f{}` does not produce a warning
because a method with the same name, `D#f`, accepts a block.

We call this default behavior "relax mode".

The new `strict_unused_block` warning category switches from
"relax mode" to "strict mode": same-name methods are not
checked, and `C.new.f{}` produces a warning.

[Feature #15554]
2024-11-06 11:06:18 +09:00
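The relax/strict decision described in commit ab7ab9e450 can be modeled in a few lines of Ruby. This is a minimal sketch of the rule as the message states it, not the VM's implementation; the `USES_BLOCK` table and `warn_unused_block?` are hypothetical names, and the table simply mirrors the `C` and `D` classes from the message.

```ruby
# Hypothetical table: "Owner#name" => does the method use a block?
USES_BLOCK = { "C#f" => false, "D#f" => true }.freeze

# Returns true when a call that passes a block to owner#name would be warned.
def warn_unused_block?(owner, name, strict:)
  return false if USES_BLOCK.fetch("#{owner}##{name}") # the block is used
  return true if strict                                 # strict mode: no same-name check
  # Relax mode: stay silent if any method with the same name uses a block.
  USES_BLOCK.none? { |key, uses| uses && key.end_with?("##{name}") }
end

puts warn_unused_block?("C", "f", strict: false)
puts warn_unused_block?("C", "f", strict: true)
puts warn_unused_block?("D", "f", strict: true)
```

Under this model, `C#f` is warned only in strict mode, while `D#f` is never warned because it uses its block.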
Takashi Kokubun 478e0fc710
YJIT: Replace Array#each only when YJIT is enabled (#11955)
* YJIT: Replace Array#each only when YJIT is enabled

* Add comments about BUILTIN_ATTR_C_TRACE

* Make Ruby Array#each available with --yjit as well

* Fix all paths that expect a C location

* Use method_basic_definition_p to detect patches

* Copy a comment about C_TRACE flag to compilers

* Rephrase a comment about add_yjit_hook

* Give METHOD_ENTRY_BASIC flag to Array#each

* Add --yjit-c-builtin option

* Allow inconsistent source_location in test-spec

* Refactor a check of BUILTIN_ATTR_C_TRACE

* Set METHOD_ENTRY_BASIC without touching vm->running
2024-11-04 11:14:28 -05:00
Nobuyoshi Nakada 3e1021b144 Make default parser enum and define getter/setter 2024-10-02 20:43:40 +09:00
Lars Kanis 9b4a497456 Fix loading of nonascii script name on Windows
Since the prism parser was enabled by default, loading scripts with nonascii characters somewhere in the script path is no longer working.
It only works when the codepage was switched to 65001 (UTF-8).

This patch doesn't change the encoding of __FILE__. It is still in locale encoding.
That's why pm_load_file() is called with UTF-8 script name and pm_parse_file() with locale encoding.

The loading of nonascii script names is part of the test-all, but it doesn't trigger the failure on GHA, since it is using cp 65001.
On other codepages it fails with:

[53/71] TestRubyOptions#test_command_line_progname_nonascii = 0.04 s
  1) Failure:
TestRubyOptions#test_command_line_progname_nonascii [C:/Users/Administrator/ruby/test/ruby/test_rubyoptions.rb:1086]:
[ruby-dev:48752] [Bug #10555]
pid 1736 exit 1
| C:\Users\Administrator\ruby\ruby.exe: No such file or directory -- �.rb (LoadError)
.

1. [1/2] Assertion for "stdout"
   | <["\xFF.rb"]> expected but was
   | <[]>.

2. [2/2] Assertion for "stderr"
   | <[]> expected but was
   | <["C:\\Users\\Administrator\\ruby\\ruby.exe: No such file or directory -- \xFF.rb (LoadError)"]>.
2024-09-29 19:01:18 -04:00
Kevin Newton 9afc6a981d [PRISM] Only parse shebang on main script
Fixes [Bug #20730]
2024-09-13 12:51:53 -04:00
Kevin Newton 371432b2d7 [PRISM] Handle RubyVM.keep_script_lines 2024-08-29 20:27:01 -04:00
Alan Wu 554098303d [PRISM] For stdin scripts, use locale encoding
For example:

    $ echo 'p __ENCODING__' | LANG=C ruby
    #<Encoding:US-ASCII>

But, allow -K to override the source encoding.
Found by running spec/ruby/language/magic_comment_spec.rb with LANG=C.
2024-08-29 20:20:26 -04:00
Nobuyoshi Nakada d33e3d47b8
[Bug #20704] Win32: Fix chdir to non-ASCII path
On Windows, `chdir` in compilers' runtime libraries uses the active
code page, but command line arguments in ruby are always UTF-8, since
commit:33ea2646b98adb49ae2e1781753bf22d33729ac0.
2024-08-29 19:41:53 +09:00
Alexander Momchilov f93c27d86b Set encoding index correctly 2024-08-28 08:47:43 -04:00
Alan Wu f2ac013009
Add RB_DEFAULT_PARSER preprocessor macro
This way there is one place to change for switching the default.
This also allows for building the same commit with different cppflags.
2024-08-27 23:15:37 +00:00
Kevin Newton 465cf8d80b [PRISM] Potentially enable coverage on the main script 2024-08-21 16:32:05 -04:00
Kevin Newton de28ef7db4 [PRISM] Use src encoding not ext encoding 2024-08-15 13:34:25 -04:00
Kevin Newton 09bf3c9d6a [PRISM] Trigger moreswitches off shebang 2024-08-14 15:39:03 -04:00
Peter Zhu f69ba5716f Move RUBY_FREE_AT_EXIT check earlier
Things that exit early, like `ruby -v`, could not use RUBY_FREE_AT_EXIT
because the check for RUBY_FREE_AT_EXIT was not executed.
2024-07-24 08:36:40 -04:00
Kevin Newton 49cf042cd2 [PRISM] Define DATA constant when parsing stdin and __END__ 2024-07-19 10:17:50 -04:00
Kevin Newton b1608fc6bc [PRISM] Do not respect xflag when eflag is set 2024-07-18 13:03:33 -04:00
Peter Zhu 8fd2df529b Revert "Load external GC using command line argument"
This reverts commit 8ddb1110c283c5cb59b6582383f36fdbcc43ab19.
2024-07-05 14:05:58 -04:00
Jean Boussier 95ffcd3f9f Fix `--debug-frozen-string-literal` to not apply `--disable-frozen-string-literal`
[Feature #20205]

This was an undesired side effect. Now that this value is a triplet, we can't
assume it's disabled by default.
2024-06-24 12:43:39 +02:00
Peter Zhu 90763e04ba Load external GC using command line argument
This commit changes the external GC to be loaded with the `--gc-library`
command line argument instead of the RUBY_GC_LIBRARY_PATH environment
variable because @nobu pointed out that loading binaries using environment
variables can pose a security risk.
2024-06-21 11:49:01 -04:00
Nobuyoshi Nakada 01b13886dc [Bug #20562] Categorize `RUBY_FREE_AT_EXIT` warning as experimental 2024-06-12 15:36:10 +09:00
Kevin Newton 792e9c46a4 Remove prism compiler warning 2024-06-07 12:24:05 -04:00
Jean Boussier 33f92b3c88 Don't add `+YJIT` to `RUBY_DESCRIPTION` until it's actually enabled
If you start Ruby with `--yjit-disable`, the `+YJIT` shouldn't be
added until `RubyVM::YJIT.enable` is actually called. Otherwise
it's confusing in crash reports etc.
2024-06-05 20:53:49 +02:00
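A script can compare the `+YJIT` tag with the actual YJIT state, which is the relationship commit 33f92b3c88 tightens. The sketch below assumes nothing beyond `RUBY_DESCRIPTION` and, where the build provides it, `RubyVM::YJIT.enabled?`; the helper names `yjit_enabled?` and `yjit_tagged?` are hypothetical.

```ruby
# RubyVM::YJIT is only defined on builds that include YJIT, so guard the lookup.
def yjit_enabled?
  return false unless defined?(RubyVM::YJIT)
  return false unless RubyVM::YJIT.respond_to?(:enabled?)
  RubyVM::YJIT.enabled? ? true : false
end

# True when the interpreter description carries the +YJIT tag.
def yjit_tagged?
  RUBY_DESCRIPTION.include?("+YJIT")
end

puts "YJIT enabled: #{yjit_enabled?}, +YJIT in RUBY_DESCRIPTION: #{yjit_tagged?}"
```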
Kevin Newton a708b6aa65 [PRISM] Respect eval coverage setting 2024-05-20 12:28:47 -04:00
yui-knk 899d9f79dd Rename `vast` to `ast_value`
There is an English word "vast".
This commit changes the name to a clearer one to avoid confusion.
2024-05-03 12:40:35 +09:00
HASUMI Hitoshi 2244c58b00 [Universal parser] Decouple IMEMO from rb_ast_t
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object.

## Background

We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby.
To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.

## Summary (file by file)

- `rubyparser.h`
  - Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
  - Use a `TypedData_Make_Struct` VALUE that wraps `rb_ast_t` in `ast_alloc()` so that GC can manage it
    - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
  - `rb_ruby_ast_new()`, which internally calls `ast_alloc()`, creates the VALUE vast outside ruby_parser.c
- `iseq.c` and `vm_core.h`
  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
  - This keeps the VALUE of AST on the machine stack to prevent being removed by GC
- `ast.c`
  - Almost every change replaces `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
  - Fix `node_memsize()`
    - Now it includes `rb_ast_local_table_link`, `tokens` and `script_lines`
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
  - Follow-up due to the above changes
- `imemo.{c|h}`
  - If an object with `imemo_ast` appears, consider it a bug

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-04-26 11:21:08 +09:00
Kevin Newton af24ba4034 [PRISM] Raise LoadError when file cannot be read 2024-04-25 14:59:48 -04:00
Jean Boussier f06670c5a2 Eliminate usage of OBJ_FREEZE_RAW
Previously it would bypass the `FL_ABLE` check, but
since the introduction of shapes, it has behaved differently
from `OBJ_FREEZE`, as it would only set the `FL_FREEZE`
flag, but not update the shape.

I have no indication of this causing a bug yet, but it seems
like a trap waiting to happen.
2024-04-16 17:20:35 +02:00
HASUMI Hitoshi 9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)->debug_lines`
  - `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` is called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-15 20:51:54 +09:00
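The change of tactics for `SCRIPT_LINES__` in commit 9b1e97b211 (collect lines privately while parsing, publish them afterwards) can be illustrated with a small Ruby model. This is a sketch under stated assumptions, not the parser's code: `parse_and_publish` is a hypothetical helper, a plain array stands in for `p->debug_lines`, and an ordinary hash stands in for `SCRIPT_LINES__`.

```ruby
# Hypothetical model of the "after" strategy: lines are first collected in a
# parser-private buffer and copied into the hash only once "parsing" is done.
def parse_and_publish(path, source, script_lines)
  debug_lines = []
  source.each_line { |line| debug_lines << line } # stand-in for yyparse(p)
  # Publish copies so the hash does not share the parser's buffer.
  script_lines[path] = debug_lines.map(&:dup) if script_lines.is_a?(Hash)
  debug_lines.size
end

registry = {}
count = parse_and_publish("demo.rb", "a = 1\nputs a\n", registry)
puts count
p registry["demo.rb"]
```

Copying on publish loosely corresponds to the role the message gives `rb_parser_build_script_lines_from()`: converting parser-owned data into a Ruby-level value.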
Nobuyoshi Nakada b88e0d6653
Merge `push_include` and `ruby_push_include` 2024-04-07 17:29:24 +09:00
Nobuyoshi Nakada 0d93fd0f69
Merge `push_include_cygwin` into `push_include` 2024-04-07 17:29:23 +09:00
Nobuyoshi Nakada 0620f006c2
Remove `translit_char`
It has been used only for DOSISH other than Windows.
2024-04-07 17:29:23 +09:00
Nobuyoshi Nakada df8f1f78f0
[Feature #20329] Separate additional flags from main dump options
Additional flags are a comma-separated list, each preceded by `-` or `+`.

Before:
```sh
$ ruby --dump=insns+without_opt
```

After:
```sh
$ ruby --dump=insns-opt,-optimize
```

At the same time, `parsetree_with_comment` is split to `parsetree`
option and additional `comment` flag.

Before:
```sh
$ ruby --dump=parsetree_with_comment
```

After:
```sh
$ ruby --dump=parsetree,+comment
```

Flags can also be given in separate `--dump` options.
```sh
$ ruby --dump=parsetree --dump=+comment --dump=+error_tolerant
```

Ineffective flags are ignored silently.
```sh
$ ruby --dump=parsetree --dump=+comment --dump=+error_tolerant
```
2024-04-06 20:27:02 +09:00
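The flag syntax introduced in commit df8f1f78f0 can be sketched as a small splitter. This is an illustration only: Ruby's real option parsing is written in C, `parse_dump_args` is a hypothetical helper, and it does not model which flags are valid for which dump option.

```ruby
# Splits --dump values into main dump options and +/- flags.
def parse_dump_args(args)
  mains = []
  flags = {}
  args.each do |arg|
    arg.split(",").each do |item|
      if item.start_with?("+")
        flags[item[1..-1]] = true   # "+name" turns a flag on
      elsif item.start_with?("-")
        flags[item[1..-1]] = false  # "-name" turns a flag off
      else
        mains << item               # anything else is a main dump option
      end
    end
  end
  [mains, flags]
end

p parse_dump_args(["parsetree,+comment"])
p parse_dump_args(["parsetree", "+comment", "+error_tolerant"])
```

Each array element corresponds to the value of one `--dump=` option, so the second call models the form with separate `--dump` options.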
Nobuyoshi Nakada 9b5d4274a2
[Feature #20329] Clean up dump sub-options
Restructure `insns_without_opt` and `parsetree_with_comment` as
`insns+without_opt` and `parsetree+with_comment` respectively, like
`+error-tolerant`.
2024-04-06 20:27:01 +09:00
HASUMI Hitoshi f5e387a300 Separate SCRIPT_LINES__ from ast.c
This patch suggests relocating the code dealing with `SCRIPT_LINES__` from ast.c to ruby_parser.c.

## Background

- I guess `AbstractSyntaxTree.of` method used to use `SCRIPT_LINES__` internally for some reason before
- However, now it appears `SCRIPT_LINES__` is no longer used meaningfully by the method
- As evidence of this, (and as my patch shows,) removing the function call of `rb_script_lines_for()` from `ast_s_of()` does not affect the result of `test/ruby/test_ast.rb`

Given the above, I think two possibilities can be considered:

- (A) `AbstractSyntaxTree.of` no longer needs `SCRIPT_LINES__` (I pick this)
- (B) We lack a test case of `AbstractSyntaxTree.of` that needs to use `SCRIPT_LINES__`

## Besides,

The current implementation causes strange behavior:

```console
ruby -e"SCRIPT_LINES__ = {__FILE__ => []}; puts RubyVM::AbstractSyntaxTree.of(->{ 1 + 2 }, keep_script_lines: true).script_lines"
=> `-e:1:in '<main>': undefined method 'script_lines' for nil (NoMethodError)`
```

I think this is a bug because `AbstractSyntaxTree.of` is not supposed to return `nil` even in this case.
This happens due to the ast.c's dependence on `SCRIPT_LINES__`.
At the end of `ast_s_of()`, `node_find()` cannot find the target child node, because it makes no sense to look for the node corresponding to the parameter of `AbstractSyntaxTree.of` in an AST built from the value of `{__FILE__ => []}`.

## Solution

Since I think it is enough for `SCRIPT_LINES__` to be referenced only from ruby.c, I chose possibility (A) and wrote this patch, which moves `rb_script_lines_for()` from ast.c to ruby_parser.c.

So as the result:

- The `ast_s_of()` function no longer looks up `SCRIPT_LINES__`
- Even so, this patched code passes the existing tests
- The strange behavior above no longer happens (I also added a test for it)

Please correct me if I miss something🙏
2024-04-04 18:29:16 +09:00
Kevin Newton 42d1cd8f7f [PRISM] Pass --enable-frozen-string-literal through to evals 2024-03-27 08:34:42 -04:00
Étienne Barrié 12be40ae6b Implement chilled strings
[Feature #20205]

As a path toward enabling frozen string literals by default in the future,
this commit introduces "chilled strings". From a user perspective chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than raising a
`FrozenError`.

Implementation wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.

When code is compiled with frozen string literals neither explicitly enabled
or disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.

Chilled strings have the `FL_FREEZE` flag so as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.

Notes:
  - `String#freeze`: clears the chilled flag.
  - `String#-@`: acts as if the string was mutable.
  - `String#+@`: acts as if the string was mutable.
  - `String#clone`: copies the chilled flag.

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-19 09:26:49 +01:00
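The user-visible behavior described in commit 12be40ae6b can be modeled in pure Ruby. This is a hypothetical sketch, not the implementation: the real feature uses flags on `String` objects inside the VM, and the class name `ChilledStringModel` and the warning text are invented for illustration.

```ruby
# Hypothetical pure-Ruby model of a chilled string.
class ChilledStringModel
  attr_reader :warnings

  def initialize(value)
    @value = value.dup
    @chilled = true
    @warnings = []
  end

  # Chilled strings pretend to be frozen until they are mutated.
  def frozen?
    @chilled
  end

  # First mutation: record a warning and drop the frozen status instead of raising.
  def <<(other)
    if @chilled
      @warnings << "mutating a chilled string literal"
      @chilled = false
    end
    @value << other
    self
  end

  def to_s
    @value.dup
  end
end

s = ChilledStringModel.new("foo")
s << "bar" << "baz"
puts s.to_s
puts s.frozen?
puts s.warnings.size
```

Only the first mutation records a warning; after that the model behaves like an ordinary mutable string.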
Kevin Newton 97810cbbf2 [PRISM] Process encoding on CLI for -K 2024-03-18 11:55:43 -04:00
Jean Boussier 91bf7eb274 Refactor frozen_string_literal check during compilation
In preparation for https://bugs.ruby-lang.org/issues/20205.

The `frozen_string_literal` compilation option will no longer
be a boolean but a tri-state: `on/off/default`.
2024-03-15 15:52:33 +01:00
Nobuyoshi Nakada c843afbf6f Chomp last punctuations from descriptions for `-h`
The parts that follow will not be shown for the `-h` option. This also
keeps lines from reaching 80 columns: some terminal emulators (Windows
command prompt at least) wrap the cursor to the next line when reaching
the rightmost column, before exceeding it.
2024-03-14 01:19:57 +09:00
Nobuyoshi Nakada dec2a8191c
`--dump=prism_parsetree` is no longer provided
It did not make sense without the `--parser=prism` option and was just a
duplication.  Now it is `--parser=prism --dump=parsetree`.
2024-03-13 11:28:50 +09:00
Takashi Kokubun 22708be0d7 Revisions for #10198
This fixes some inconsistencies introduced by that PR.
2024-03-12 13:44:48 -07:00
Burdette Lamar 19da3b4ecf
Revisions for help text (#10198) 2024-03-12 15:14:56 -04:00
Kevin Newton a6dac9bb4f [PRISM] Parse stdin on CLI with prism 2024-03-11 12:51:32 -04:00
Jean Boussier d4f3dcf4df Refactor VM root modules
This `st_table` is used to both mark and pin classes
defined from the C API. But `vm->mark_object_ary` already
does both much more efficiently.

Currently a Ruby process starts with 252 rooted classes,
which uses `7224B` in an `st_table` or `2016B` in an `RArray`.

So a baseline of 5kB is saved, but since `mark_object_ary` is
preallocated with `1024` slots but only uses `405` of them,
it's a net `7kB` saving.

`vm->mark_object_ary` is also being refactored.

Prior to this change, `mark_object_ary` was a regular `RArray`, but
since this allows for references to be moved, it was marked a second
time from `rb_vm_mark()` to pin these objects.

This has the detrimental effect of marking these references on every
minor GC even though it's a mostly append-only list.

But using a custom TypedData we can save from having to mark
all the references on minor GC runs.

Additionally, immediate values are now ignored and not appended
to `vm->mark_object_ary` as it's just wasted space.
2024-03-06 15:33:43 -05:00
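The size figures in commit d4f3dcf4df can be checked with a little arithmetic. The sketch below is one reading of the message, not a measurement: it assumes 8 bytes per array slot (2016 B / 252 entries) and assumes the rooted classes fit into the spare preallocated slots, so the array costs nothing extra. All constant and method names are hypothetical.

```ruby
# Figures quoted in the commit message.
ROOTED_CLASSES = 252
ST_TABLE_BYTES = 7224
ARRAY_CAPACITY = 1024
ARRAY_USED     = 405
# Assumption: one array slot is 8 bytes (2016 B / 252 entries).
SLOT_BYTES = 8

def array_bytes
  ROOTED_CLASSES * SLOT_BYTES
end

# Saving if the array had to grow by the full size of the new entries.
def baseline_saving
  ST_TABLE_BYTES - array_bytes
end

# True when the rooted classes fit in the already preallocated slots.
def fits_in_spare_capacity?
  ARRAY_USED + ROOTED_CLASSES <= ARRAY_CAPACITY
end

# If no new memory is needed, the whole st_table is saved.
def net_saving
  fits_in_spare_capacity? ? ST_TABLE_BYTES : baseline_saving
end

puts array_bytes
puts baseline_saving
puts net_saving
```

Under these assumptions the baseline saving is 5208 bytes (about 5kB) and the net saving is the full 7224 bytes (about 7kB), which matches the figures in the message.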
Kevin Newton f9d5c604c8 [PRISM] Use new command line option flags 2024-02-29 12:05:19 -05:00
Kevin Newton f8355e88d6 [PRISM] Do not load -r until we check if main script can be read 2024-02-28 12:42:57 -05:00
Kevin Newton 1cef366319 [PRISM] Factor in CLI options for prism 2024-02-28 11:09:43 -05:00
Nobuyoshi Nakada 835fa98a62 Update warning flags before dump 2024-02-20 23:56:07 +09:00
Kevin Newton 8414c26f0d [PRISM] Make prism compiler warning experimental 2024-02-16 11:56:48 -05:00