Граф коммитов

42 Коммитов

Автор SHA1 Сообщение Дата
Nobuyoshi Nakada 3e1021b144 Make default parser enum and define getter/setter 2024-10-02 20:43:40 +09:00
Peter Zhu 407f8b8716 Fix memory leak in Ripper for indented heredocs
The allocated parser string is never freed, which causes a memory leak.

The following code leaks memory:

    Ripper.sexp_raw(DATA.read)

    __END__
    <<~EOF
      a
        #{1}
      a
    EOF
2024-09-25 08:56:14 -04:00
S-H-GAMELINKS 95d26ee41e Reuse dedent_string function in rb_ruby_ripper_dedent_string function
This change is reduce Ruby C API dependency for Universal Parser.
Reuse dedent_string functions in rb_ruby_ripper_dedent_string functions and remove dependencies on rb_str_modify and rb_str_set_len from the parser.
2024-09-22 12:22:20 +09:00
Kevin Newton ea2af5782d Switch the default parser from parse.y to Prism
This commit switches the default parser to Prism. There are a
couple of additional changes related to this that are a part of
this as well to make this happen.

* Switch the default parser in parse.h
* Remove the Prism-specific workflow and add a parse.y-specific
  workflow to CI so that it continues to be tested
* Update a few test exclusions since Prism has the correct
  behavior but parse.y doesn't per
  https://bugs.ruby-lang.org/issues/20504.
* Skips a couple of tests on RBS which are failing because they
  are using RubyVM::AbstractSyntaxTree.of.

Fixes [Feature #20564]
2024-09-12 13:43:04 -04:00
Alan Wu f2ac013009
Add RB_DEFAULT_PARSER preprocessor macro
This way there is one place to change for switching the default.
This also allows for building the same commit with different cppflags.
2024-08-27 23:15:37 +00:00
Yusuke Endoh a15e4d405b Revert 528c4501f4
Recently, `TestRubyLiteral#test_float` fails randomly.

```
  1) Error:
TestRubyLiteral#test_float:
ArgumentError: SyntaxError#path changed: "(eval at /home/chkbuild/chkbuild/tmp/build/20240527T050036Z/ruby/test/ruby/test_literal.rb:642)"->"(eval at /home/chkbuild/chkbuild/tmp/build/20240527T050036Z/ruby/test/ruby/test_literal.rb:642)"
```
https://rubyci.s3.amazonaws.com/s390x/ruby-master/log/20240527T050036Z.fail.html.gz

According to Launchable, the first failure was on Apr 30.
This is just when 528c4501f4 was
committed. I don't know if the change is really the cause, but I want to
revert it once to see if the random failure disappears.
2024-05-31 18:24:43 +09:00
Nobuyoshi Nakada 3c16d93cd3 Constify encoding type in universal parser
Fixed warning about discarding modifiers.

```
../src/ruby_parser.c:677:48: warning: passing 'rb_encoding *' (aka 'const struct OnigEncodingTypeST *') to parameter of type 'void *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
  677 |     ast = rb_parser_compile(p, gets, ptr, len, enc, input, line);
      |                                                ^~~
../src/internal/parse.h:58:128: note: passing argument to parameter 'fname_enc' here
   58 | rb_ast_t *rb_parser_compile(rb_parser_t *p, rb_parser_lex_gets_func *gets, const char *fname_ptr, long fname_len, rb_encoding *fname_enc, rb_parser_input_data input, int line);
      |                                                                                                                                ^
```
2024-05-13 08:26:54 +09:00
yui-knk cf74ff714a Change return value of `gets` function to be `rb_parser_string_t *` instead of `VALUE`
This change reduces parser's dependency on ruby object.
2024-05-04 11:59:10 +09:00
yui-knk 528c4501f4 Use `rb_parser_string_t *` as `ruby_sourcefile_string`
This reduces dependency on VALUE.
2024-04-30 09:00:05 +09:00
yui-knk 33929ef995 Move encoding object conversion outside of parser
Reduce the parser's dependence on `VALUE` and `rb_enc_from_encoding`.
2024-04-23 13:11:46 +09:00
yui-knk 2992e1074a Refactor parser compile functions
Refactor parser compile functions to reduce the dependence
on ruby functions.
This commit includes these changes

1. Refactor `gets`, `input` and `gets_` of `parser_params`

Parser needs two different data structure to get next line, function (`gets`) and input data (`input`).
However `gets_` is used for both function (`call`) and input data (`ptr`).
`call` is used for managing general callback function when `rb_ruby_parser_compile_generic` is used.
`ptr` is used for managing the current pointer on String when `parser_compile_string` is used.
This commit changes parser to used only `gets` and `input` then removes `gets_`.

2. Move parser_compile functions and `gets` functions from parse.y to ruby_parser.c

This change reduces the dependence on ruby functions from parser.

3. Change ruby_parser and ripper to take care of `VALUE input` GC mark

Move the responsibility of calling `rb_gc_mark` for `VALUE input` from parser to ruby_parser and ripper.
`input` is arbitrary data pointer from the viewpoint of parser.

4. Introduce rb_parser_compile_array function

Caller of `rb_parser_compile_generic` needs to take care about GC because ruby_parser doesn’t know
about the detail of `lex_gets` and `input`.
Introduce `rb_parser_compile_array` to reduce the complexity of ast.c.
2024-04-23 07:20:22 +09:00
yui-knk d07df8567e Parser and universal parser share wrapper functions 2024-04-20 18:08:33 +09:00
HASUMI Hitoshi 9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)->debug_lines`
  - `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-15 20:51:54 +09:00
yui-knk 515e52a0b1 Emit `warn` event for duplicated hash keys on ripper
Need to use `rb_warn` macro instead of calling `rb_compile_warn`
directly to emit `warn` event on ripper.
2024-04-15 06:29:25 +09:00
Samuel Williams b473d304d4
Revert "Enumerator should use a non-blocking fiber. (#10478)" (#10480)
This reverts commit dfa0897de8.

This commit accidentally included some change in `parse.h`. Reverting
and re-applying the relevant changes.
2024-04-07 10:20:22 +00:00
Samuel Williams dfa0897de8
Enumerator should use a non-blocking fiber. (#10478) 2024-04-07 19:18:09 +12:00
yui-knk 6bfabd076b Remove undefined function's prototype declaration 2024-04-07 11:21:04 +09:00
yui-knk 7767db2379 Fix ripper to dispatch warning event for duplicated when clause
Need to separate `check_literal_when` function for parser and
ripper otherwise warning event is not dispatched because
parser `rb_warning1` is used in ripper.
2024-04-07 11:15:09 +09:00
yui-knk 799e854897 [Feature #20331] Simplify parser warnings for hash keys duplication and when clause duplication
This commit simplifies warnings for hash keys duplication and when clause duplication,
based on the discussion of https://bugs.ruby-lang.org/issues/20331.
Warnings are reported only when strings are same to ohters.
2024-04-02 08:26:58 +09:00
yui-knk 8066e3ea6e Remove VALUE from `struct rb_strterm_struct`
In the past, it was imemo. However a075c55 changed it.
Therefore no need to use `VALUE` for the first field.
2024-04-02 07:26:35 +09:00
S-H-GAMELINKS 060a71d4e7 Fix Ripper memory allocation size when enabled Universal Parser
The size of `struct parser_params` is 8 bytes difference in `ripper_s_allocate` and `rb_ruby_parser_allocate` when the universal parser is
enabled.
This causes a situation where `*r->p` is not fully initialized in `ripper_s_allocate` as shown below.

```console
(gdb) p *r->p
$2 = {heap = 0x0, lval = 0x0, yylloc = 0x0, lex = {strterm = 0x0, gets = 0x0, input = 0, string_buffer = {head = 0x0, last = 0x0}, lastlin
e = 0x0,
    nextline = 0x0, pbeg = 0x0, pcur = 0x0, pend = 0x0, ptok = 0x0, gets_ = {ptr = 0, call = 0x0}, state = EXPR_NONE, paren_nest = 0, lpar
_seen = 0,
    debug = 0, has_shebang = 0, token_seen = 0, token_info_enabled = 0, error_p = 0, cr_seen = 0, value = 0, result = 0, parsing_thread = 0, s_value = 0,
    s_lvalue = 0, s_value_stack = 2097}
````

This seems to cause `double free or corruption (!prev)` and SEGV.
So, fixing this by introduce `rb_ripper_parser_params_allocate` and `rb_ruby_parser_config` functions for Ripper, and `struct parser_params` same size is returned.
2024-03-21 18:10:02 +09:00
yui-knk 89cfc15207 [Feature #20257] Rearchitect Ripper
Introduce another semantic value stack for Ripper so that
Ripper can manage both Node and Ruby Object separately.
This rearchitectutre of Ripper solves these issues.
Therefore adding test cases for them.

* [Bug 10436] https://bugs.ruby-lang.org/issues/10436
* [Bug 18988] https://bugs.ruby-lang.org/issues/18988
* [Bug 20055] https://bugs.ruby-lang.org/issues/20055

Checked the differences of `Ripper.sexp` for files under `/test/ruby`
are only on test_pattern_matching.rb.
The differences comes from the differences between
`new_hash_pattern_tail` functions between parser and Ripper.
Ripper `new_hash_pattern_tail` didn’t call `assignable` then
`kw_rest_arg` wasn’t marked as local variable.
This is also fixed by this commit.

```
--- a/./tmp/before/test_pattern_matching.rb
+++ b/./tmp/after/test_pattern_matching.rb
@@ -3607,7 +3607,7 @@
                  [:in,
                   [:hshptn, nil, [], [:var_field, [:@ident, “a”, [984, 13]]]],
                   [[:binary,
-                    [:vcall, [:@ident, “a”, [985, 10]]],
+                    [:var_ref, [:@ident, “a”, [985, 10]]],
                     :==,
                     [:hash, nil]]],
                   nil]]],
@@ -3662,7 +3662,7 @@
                  [:in,
                   [:hshptn, nil, [], [:var_field, [:@ident, “a”, [993, 13]]]],
                   [[:binary,
-                    [:vcall, [:@ident, “a”, [994, 10]]],
+                    [:var_ref, [:@ident, “a”, [994, 10]]],
                     :==,
                     [:hash,
                      [:assoclist_from_args,
@@ -3813,7 +3813,7 @@
                    [:command,
                     [:@ident, “raise”, [1022, 10]],
                     [:args_add_block,
-                     [[:vcall, [:@ident, “b”, [1022, 16]]]],
+                     [[:var_ref, [:@ident, “b”, [1022, 16]]]],
                      false]]],
                   [:else, [[:var_ref, [:@kw, “true”, [1024, 10]]]]]]]],
                nil,
@@ -3876,7 +3876,7 @@
                      [:@int, “0”, [1033, 15]]],
                     :“&&“,
                     [:binary,
-                     [:vcall, [:@ident, “b”, [1033, 20]]],
+                     [:var_ref, [:@ident, “b”, [1033, 20]]],
                      :==,
                      [:hash, nil]]]],
                   nil]]],
@@ -3946,7 +3946,7 @@
                      [:@int, “0”, [1042, 15]]],
                     :“&&“,
                     [:binary,
-                     [:vcall, [:@ident, “b”, [1042, 20]]],
+                     [:var_ref, [:@ident, “b”, [1042, 20]]],
                      :==,
                      [:hash,
                       [:assoclist_from_args,
@@ -5206,7 +5206,7 @@
                      [[:assoc_new,
                        [:@label, “c:“, [1352, 22]],
                        [:@int, “0”, [1352, 25]]]]]],
-                   [:vcall, [:@ident, “r”, [1352, 29]]]],
+                   [:var_ref, [:@ident, “r”, [1352, 29]]]],
                   false]]],
                [:binary,
                 [:call,
@@ -5299,7 +5299,7 @@
                       [:assoc_new,
                        [:@label, “c:“, [1367, 34]],
                        [:@int, “0”, [1367, 37]]]]]],
-                   [:vcall, [:@ident, “r”, [1367, 41]]]],
+                   [:var_ref, [:@ident, “r”, [1367, 41]]]],
                   false]]],
                [:binary,
                 [:call,
@@ -5931,7 +5931,7 @@
              [:in,
               [:hshptn, nil, [], [:var_field, [:@ident, “r”, [1533, 11]]]],
               [[:binary,
-                [:vcall, [:@ident, “r”, [1534, 8]]],
+                [:var_ref, [:@ident, “r”, [1534, 8]]],
                 :==,
                 [:hash,
                  [:assoclist_from_args,
```
2024-02-20 17:33:58 +09:00
yui-knk ee7f63ebba Make lastline and nextline to be rb_parser_string
This commit changes `struct parser_params` lastline and nextline
from `VALUE` (String object) to `rb_parser_string_t *` so that
dependency on Ruby Object is reduced.
`parser_string_buffer_t string_buffer` is added to `struct parser_params`
to manage `rb_parser_string_t` pointers of each line. All allocated line
strings are freed in `rb_ruby_parser_free`.
2024-01-23 08:58:16 +09:00
Nobuyoshi Nakada 5fc9810bf3 Shorten `rb_strterm_literal_t` members 2023-10-14 11:08:43 +09:00
Nobuyoshi Nakada a075c55d0c Manage `rb_strterm_t` without imemo 2023-10-14 11:08:43 +09:00
Nobuyoshi Nakada cb06b6632a Remove unions in `rb_strterm` structs for alignment 2023-10-14 11:08:43 +09:00
Nobuyoshi Nakada 6aa16f9ec1 Move SCRIPT_LINES__ away from parse.y 2023-08-25 18:23:05 +09:00
Nobuyoshi Nakada 81836c6cb9
Fix duplicate symbol errors when statically linking ripper 2023-06-12 20:22:01 +09:00
yui-knk b481b673d7 [Feature #19719] Universal Parser
Introduce Universal Parser mode for the parser.
This commit includes these changes:

* Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions
  are passed via `struct rb_parser_config_struct` when this macro is enabled.
* Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.
2023-06-12 18:23:48 +09:00
Takashi Kokubun 526111290b Revert "reuse open(2) from rb_file_load_ok on POSIX-like system"
This reverts commit 35136e1e9c.

test-spec has been failing since this revision.

.github/workflows/compilers.yml:82
https://github.com/ruby/ruby/actions/runs/4276884159/jobs/7445299562

```
env:
  # Minimal flags to pass the check.
  default_cc: 'gcc-11 -fcf-protection -Wa,--generate-missing-build-notes=yes'
  optflags: '-O2'
  LDFLAGS: '-Wl,-z,now'
  # FIXME: Drop skipping options
  # https://bugs.ruby-lang.org/issues/18061
  # https://sourceware.org/annobin/annobin.html/Test-pie.html
  TEST_ANNOCHECK_OPTS: "--skip-pie --skip-gaps"
```

Failure:

```
  1)
  An exception occurred during: Kernel#require (file extensions) does not load a C-extension file if a complex-extensioned .rb file is already loaded
  /__w/ruby/ruby/src/spec/ruby/core/kernel/shared/require.rb:317
  Kernel#require (file extensions) does not load a C-extension file if a complex-extensioned .rb file is already loaded ERROR
  LeakError: Leaked file descriptor: 8 : #<File:/__w/ruby/ruby/src/spec/ruby/fixtures/code/load_fixture.ext.rb>
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_spec.rb:5:in `<top (required)>'

  2)
  An exception occurred during: Kernel#require ($LOADED_FEATURES) stores an absolute path
  /__w/ruby/ruby/src/spec/ruby/core/kernel/shared/require.rb:330
  Kernel#require ($LOADED_FEATURES) stores an absolute path ERROR
  LeakError: Closed file descriptor: 8
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_spec.rb:5:in `<top (required)>'

  3)
  An exception occurred during: Kernel#require ($LOADED_FEATURES) does not load a non-canonical path for a file already loaded
  /__w/ruby/ruby/src/spec/ruby/core/kernel/shared/require.rb:535
  Kernel#require ($LOADED_FEATURES) does not load a non-canonical path for a file already loaded ERROR
  LeakError: Leaked file descriptor: 8 : #<File:/__w/ruby/ruby/src/spec/ruby/fixtures/code/load_fixture.rb>
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_spec.rb:5:in `<top (required)>'

  4)
  An exception occurred during: Kernel#require ($LOADED_FEATURES) does not load a ../ relative path for a file already loaded
  /__w/ruby/ruby/src/spec/ruby/core/kernel/shared/require.rb:551
  Kernel#require ($LOADED_FEATURES) does not load a ../ relative path for a file already loaded ERROR
  LeakError: Leaked file descriptor: 9 : #<File:../code/load_fixture.rb>
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_spec.rb:5:in `<top (required)>'

  5)
  An exception occurred during: Kernel#require ($LOADED_FEATURES) complex, enumerator, rational, thread, ruby2_keywords are already required
  /__w/ruby/ruby/src/spec/ruby/core/kernel/shared/require.rb:563
  Kernel#require ($LOADED_FEATURES) complex, enumerator, rational, thread, ruby2_keywords are already required ERROR
  LeakError: Closed file descriptor: 8
  Closed file descriptor: 9
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_spec.rb:5:in `<top (required)>'

  6)
  An exception occurred during: Kernel.require (file extensions) does not load a C-extension file if a complex-extensioned .rb file is already loaded
  /__w/ruby/ruby/src/spec/ruby/core/kernel/shared/require.rb:317
  Kernel.require (file extensions) does not load a C-extension file if a complex-extensioned .rb file is already loaded ERROR
  LeakError: Leaked file descriptor: 8 : #<File:/__w/ruby/ruby/src/spec/ruby/fixtures/code/load_fixture.ext.rb>
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_spec.rb:23:in `<top (required)>'

  7)
  An exception occurred during: Kernel.require ($LOADED_FEATURES) stores an absolute path
  /__w/ruby/ruby/src/spec/ruby/core/kernel/shared/require.rb:330
  Kernel.require ($LOADED_FEATURES) stores an absolute path ERROR
  LeakError: Closed file descriptor: 8
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_spec.rb:23:in `<top (required)>'

  8)
  An exception occurred during: Kernel.require ($LOADED_FEATURES) does not load a non-canonical path for a file already loaded
  /__w/ruby/ruby/src/spec/ruby/core/kernel/shared/require.rb:535
  Kernel.require ($LOADED_FEATURES) does not load a non-canonical path for a file already loaded ERROR
  LeakError: Leaked file descriptor: 8 : #<File:/__w/ruby/ruby/src/spec/ruby/fixtures/code/load_fixture.rb>
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_spec.rb:23:in `<top (required)>'

  9)
  An exception occurred during: Kernel.require ($LOADED_FEATURES) does not load a ../ relative path for a file already loaded
  /__w/ruby/ruby/src/spec/ruby/core/kernel/shared/require.rb:551
  Kernel.require ($LOADED_FEATURES) does not load a ../ relative path for a file already loaded ERROR
  LeakError: Leaked file descriptor: 9 : #<File:../code/load_fixture.rb>
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_spec.rb:23:in `<top (required)>'

  10)
  An exception occurred during: Kernel.require ($LOADED_FEATURES) complex, enumerator, rational, thread, ruby2_keywords are already required
  /__w/ruby/ruby/src/spec/ruby/core/kernel/shared/require.rb:563
  Kernel.require ($LOADED_FEATURES) complex, enumerator, rational, thread, ruby2_keywords are already required ERROR
  LeakError: Closed file descriptor: 8
  Closed file descriptor: 9
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_spec.rb:23:in `<top (required)>'

  11)
  An exception occurred during: Kernel#require_relative with a relative path (file extensions) does not load a C-extension file if a complex-extensioned .rb file is already loaded
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_relative_spec.rb:197
  Kernel#require_relative with a relative path (file extensions) does not load a C-extension file if a complex-extensioned .rb file is already loaded ERROR
  LeakError: Leaked file descriptor: 8 : #<File:/__w/ruby/ruby/src/spec/ruby/fixtures/code/load_fixture.ext.rb>
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_relative_spec.rb:4:in `<top (required)>'

  12)
  An exception occurred during: Kernel#require_relative with a relative path ($LOADED_FEATURES) stores an absolute path
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_relative_spec.rb:205
  Kernel#require_relative with a relative path ($LOADED_FEATURES) stores an absolute path ERROR
  LeakError: Closed file descriptor: 8
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_relative_spec.rb:4:in `<top (required)>'

  13)
  An exception occurred during: Kernel#require_relative with an absolute path (file extensions) does not load a C-extension file if a complex-extensioned .rb file is already loaded
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_relative_spec.rb:399
  Kernel#require_relative with an absolute path (file extensions) does not load a C-extension file if a complex-extensioned .rb file is already loaded ERROR
  LeakError: Leaked file descriptor: 8 : #<File:/__w/ruby/ruby/src/spec/ruby/fixtures/code/load_fixture.ext.rb>
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_relative_spec.rb:277:in `<top (required)>'

  14)
  An exception occurred during: Kernel#require_relative with an absolute path ($LOAD_FEATURES) stores an absolute path
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_relative_spec.rb:407
  Kernel#require_relative with an absolute path ($LOAD_FEATURES) stores an absolute path ERROR
  LeakError: Closed file descriptor: 8
  /__w/ruby/ruby/src/spec/ruby/core/kernel/require_relative_spec.rb:277:in `<top (required)>'
```
2023-02-27 09:24:45 -08:00
Eric Wong 35136e1e9c reuse open(2) from rb_file_load_ok on POSIX-like system
When loading Ruby source files, we can save the result of
successful opens as open(2)/openat(2) are a fairly expensive
syscalls.  This also avoids a time-of-check-to-time-of-use
(TOCTTOU) problem.

This reduces open(2) syscalls during `require'; but should be
most apparent when users have a small $LOAD_PATH.  Users with
large $LOAD_PATH will benefit less since there'll be more
open(2) failures due to ENOENT.

With `strace -c -e openat ruby -e exit' under Linux, this
results in a ~14% reduction of openat(2) syscalls
(glibc uses openat(2) to implement open(2)).

 % time     seconds  usecs/call     calls    errors syscall
 ------ ----------- ----------- --------- --------- ----------------
   0.00    0.000000           0       296       110 openat
   0.00    0.000000           0       254       110 openat

Additionally, the introduction of `struct ruby_file_load_state'
may make future optimizations more apparent.

This change cannot benefit binary (.so) loading since the
dlopen(3) API requires a filename and I'm not aware of an
alternative that takes a pre-existing FD.  In typical
situations, Ruby source files outnumber the mount of .so
files.
2023-02-26 20:39:41 +00:00
yui-knk d8601621ed Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods
Implementation for Language Server Protocol (LSP) sometimes needs token information.
For example both `m(1)` and `m(1, )` has same AST structure other than node locations
then it's impossible to check the existence of `,` from AST. However in later case,
it might be better to suggest variables list for the second argument.
Token information is important for such case.

This commit adds these methods.

* Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of`
* Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node including tokens for descendants nodes.
* Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless the receiver node.

[Feature #19070]

Impacts on memory usage and performance are below:

Memory usage:

```
$ cat test.rb
root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true)

$ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v
ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
11408kb

# keep_tokens :false
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
17508kb

# keep_tokens :true
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
30960kb
```

Performance:

```
$ cat ../ast_keep_tokens.yml
prelude: |
  src = <<~SRC
    module M
      class C
        def m1(a, b)
          1 + a + b
        end
      end
    end
  SRC
benchmark:
  without_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false)
  with_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)

$ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml
/home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
            --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \
            --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common  ../tool/runruby.rb --extout=.ext  -- --disable-gems --disable-gem" \
            --output=markdown --output-compare -v ../ast_keep_tokens.yml
compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
warming up..

|                     |compare-ruby|built-ruby|
|:--------------------|-----------:|---------:|
|without_keep_tokens  |     21.659k|   21.303k|
|                     |       1.02x|         -|
|with_keep_tokens     |      6.220k|    5.691k|
|                     |       1.09x|         -|
```
2022-11-21 09:01:34 +09:00
yui-knk fbbdbdd891 Add error_tolerant option to RubyVM::AST
If this option is enabled, SyntaxError is not raised and Node is
returned even if passed script is broken.

[Feature #19013]
2022-10-08 17:59:11 +09:00
卜部昌平 daf0c04a47 internal/*.h: skip doxygen
These contents are purely implementation details, not worth appearing in
CAPI documents. [ci skip]
2021-09-10 20:00:06 +09:00
Yusuke Endoh cad83fa3c4 ast.c: Rename "save_script_lines" to "keep_script_lines"
... as per ko1's preference. He is preparing to extend this feature to
ISeq for his new debugger. He prefers "keep" to "save" for this wording.
This API is internal and not included in any released version, so I
change it in advance.
2021-08-20 16:18:36 +09:00
Yusuke Endoh acae5f363d ast.rb: RubyVM::AST.parse and .of accepts `save_script_lines: true`
This option makes the parser keep the original source as an array of
the original code lines. This feature exploits the mechanism of
`SCRIPT_LINES__` but records only the specified code that is passed to
RubyVM::AST.of or .parse, instead of recording all parsed program texts.
2021-06-18 02:34:27 +09:00
卜部昌平 4ff3f20540 add #include guard hack
According to MSVC manual (*1), cl.exe can skip including a header file
when that:

- contains #pragma once, or
- starts with #ifndef, or
- starts with #if ! defined.

GCC has a similar trick (*2), but it acts more stricter (e. g. there
must be _no tokens_ outside of #ifndef...#endif).

Sun C lacked #pragma once for a looong time.  Oracle Developer Studio
12.5 finally implemented it, but we cannot assume such recent version.

This changeset modifies header files so that each of them include
strictly one #ifndef...#endif.  I believe this is the most portable way
to trigger compiler optimizations. [Bug #16770]

*1: https://docs.microsoft.com/en-us/cpp/preprocessor/once
*2: https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html
2020-04-13 16:06:00 +09:00
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
卜部昌平 bf53d6c7d1 other minior internal header tweaks
These headers need no rewrite.  Just add some minor tweaks, like
addition of #include lines.  Mainly cosmetic.

TIMET_MAX_PLUS_ONE was deleted because the macro was used from only
one place (directly write expression there).
2019-12-26 20:45:12 +09:00
卜部昌平 ce2c97d738 internal/symbol.h rework
Some declatations are moved from internal/parse.h, to reflect the fact
that they are defined in symbol.c.
2019-12-26 20:45:12 +09:00
卜部昌平 b739a63eb4 split internal.h into files
One day, I could not resist the way it was written.  I finally started
to make the code clean.  This changeset is the beginning of a series of
housekeeping commits.  It is a simple refactoring; split internal.h into
files, so that we can divide and concur in the upcoming commits.  No
lines of codes are either added or removed, except the obvious file
headers/footers.  The generated binary is identical to the one before.
2019-12-26 20:45:12 +09:00