Commit graph

273 commits

Author SHA1 Message Date
Nobuyoshi Nakada ceec988f2e ripper: Support member references in the DSL 2023-10-10 00:09:52 +09:00
yui-knk cecd1de2eb Use rb_node_opt_arg_t and rb_node_kw_arg_t instead of NODE 2023-10-01 09:19:42 +09:00
Nobuyoshi Nakada d647709d1a Extract `ripper_parser_params` 2023-09-30 20:17:38 +09:00
yui-knk 74c6781153 Change RNode structure from union to struct
All kinds of AST nodes use the same struct RNode, which has u1, u2, u3 union members
for holding different kinds of data.
This has two problems.

1. Low flexibility of data structure

Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand,
NODE_OP_ASGN2 needs more than three union members. However, because they share the same
structure definition, NODE_TRUE has to allocate three unused union members and
NODE_OP_ASGN2 has to be split into another node.
This change removes that restriction, making it possible to
change the data structure for each node type.

2. No compile time check for union member access

With a union, it is the developer’s responsibility to use the correct member for each node type.
This change clarifies which node has which type of fields and enables compile time checks.

This commit also changes node_buffer_elem_struct buf management to handle
data of different sizes with alignment.
2023-09-28 11:58:10 +09:00
Nobuyoshi Nakada fbe4db5182 ripper: Support named references in the DSL 2023-09-25 23:04:09 +09:00
Nobuyoshi Nakada 69d7871b02
ripper: Preprocess ripper-dispatchable types only
Keep the other types, which do not have setter macros for ripper.
2023-09-17 16:22:01 +09:00
Nobuyoshi Nakada f2102e4015
Set ripper_init.c.tmpl to C mode [ci skip] 2023-09-10 19:20:31 +09:00
卜部昌平 d9cba2fc74 include missing header 2023-08-25 17:27:53 +09:00
卜部昌平 eec85a6309 tool/update-deps --fix 2023-08-25 17:27:53 +09:00
yui-knk 0a570a0069 Fix `#line` directive filename of ripper.c
Before:

```c
/* First part of user prologue.  */
#line 14 "parse.y"
```

After:

```c
/* First part of user prologue.  */
#line 14 "ripper.y"
```
2023-07-16 19:27:08 +09:00
Nobuyoshi Nakada 5c77402d88
Fix null pointer access in Ripper#initialize
In `rb_ruby_ripper_parser_allocate`, `r->p` is NULL between creating
`self` and `parser_params` assignment.  As GC can happen there, the
typed-data functions for it need to consider the case.
2023-07-16 15:41:10 +09:00
yui-knk 82cd70ef93 Use functions defined by parser_st.c to reduce dependency on st.c 2023-07-15 12:50:40 +09:00
yui-knk b2bccf053b Include ripper.h into `$distcleanfiles` 2023-07-09 13:02:25 +09:00
Nobuyoshi Nakada c89f519170
More dependencies for ripper 2023-06-29 18:47:56 +09:00
Peter Zhu a500eb9f8c Fix memory leak in Ripper
The following script leaks memory in Ripper:

```ruby
require "ripper"

20.times do
  100_000.times do
    Ripper.parse("")
  end

  puts `ps -o rss= -p #{$$}`
end
```
2023-06-28 09:50:51 -04:00
Nobuyoshi Nakada 70483f6ca4
Add missing dependencies 2023-06-12 19:10:29 +09:00
yui-knk b481b673d7 [Feature #19719] Universal Parser
Introduce Universal Parser mode for the parser.
This commit includes these changes:

* Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions
  are passed via `struct rb_parser_config_struct` when this macro is enabled.
* Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.
2023-06-12 18:23:48 +09:00
yui-knk 7b803eafa2 Ripper does not depend on Bison [ci skip]
It also uses Lrama, so there is no dependency on Bison.
2023-06-03 10:34:24 +09:00
yui-knk 3a4206c7a1 No need to define "BISON" on extconf.rb
"BISON" is defined in "ext/ripper/depend".
2023-06-02 09:28:30 +09:00
Nobuyoshi Nakada 3fe45a3123
Process parse.y without temporary files 2023-05-15 19:10:24 +09:00
Nobuyoshi Nakada bdaa491565 Add user argument to some macros used by bison 2023-05-14 15:38:48 +09:00
Nobuyoshi Nakada 3150516aab Preprocess input parse.y from stdin 2023-05-14 15:38:48 +09:00
Yuichiro Kaneko a1b01e7701
Use Lrama LALR parser generator instead of Bison
https://bugs.ruby-lang.org/issues/19637

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2023-05-12 18:25:10 +09:00
Matt Valentine-House 2a34bcaa10 Update VPATH for socket, & dependencies
The socket extension's rubysocket.h pulls in the "private" include/gc.h,
which now depends on vm_core.h. vm_core.h pulls in id.h.

When tool/update-deps generates the dependencies for the makefiles, it
generates the line for id.h to be based on VPATH, which is configured in
the extconf.rb for each of the extensions. By default VPATH does not
include the actual source directory of the current Ruby, so the
dependency fails to resolve and linking fails.

We need to append the topdir and top_srcdir to VPATH to have the
dependency picked up correctly (and I believe we need both of these to
cope with in-tree and out-of-tree builds).

I copied this from the approach taken in
https://github.com/ruby/ruby/blob/master/ext/objspace/extconf.rb#L3
2023-04-06 11:07:16 +01:00
Matt Valentine-House 5e4b80177e Update the depend files 2023-02-28 09:09:00 -08:00
Matt Valentine-House f38c6552f9 Remove intern/gc.h from Make deps 2023-02-27 10:11:56 -08:00
Nobuyoshi Nakada 899ea35035
Extract include/ruby/internal/attr/packed_struct.h
Split `PACKED_STRUCT` and `PACKED_STRUCT_UNALIGNED` macros into the
macros below:
* `RBIMPL_ATTR_PACKED_STRUCT_BEGIN`
* `RBIMPL_ATTR_PACKED_STRUCT_END`
* `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_BEGIN`
* `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_END`
2023-02-08 12:34:13 +09:00
Nobuyoshi Nakada fad48fefe1 [Bug #19399] Parsing invalid heredoc inside block parameter
Although this is of course invalid as Ruby code, allow it to be parsed
and tokenized.
2023-02-02 12:20:10 +09:00
S-H-GAMELINKS 1a64d45c67 Introduce encoding check macro 2022-12-02 01:31:27 +09:00
yui-knk d8601621ed Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods
Implementations of the Language Server Protocol (LSP) sometimes need token information.
For example, `m(1)` and `m(1, )` have the same AST structure apart from node locations,
so it is impossible to detect the `,` from the AST. However, in the latter case,
it might be better to suggest a list of variables for the second argument.
Token information is important for such cases.

This commit adds these methods.

* Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of`
* Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node, including tokens for descendant nodes.
* Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless of the receiver node.

[Feature #19070]
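
The `m(1)` versus `m(1, )` case described above can be sketched as follows. This is a minimal illustration, assuming a Ruby build that includes this commit (3.2 or later) and that each token is an array of the form `[id, type, text, location]`; the helper name `token_texts` is hypothetical.

```ruby
# Return the text of every token in `src`.
# `token_texts` is a hypothetical helper, not part of Ruby.
def token_texts(src)
  root = RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)
  # Assumed token layout: [id, type, text, location].
  root.all_tokens.map { |_id, _type, text, _loc| text }
end

without_comma = token_texts("m(1)")
with_comma    = token_texts("m(1, )")

# The trailing comma is visible only through the token list.
puts without_comma.include?(",")
puts with_comma.include?(",")
```

A tool such as a language server could use this difference to decide whether to offer completions for a second argument.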

Impacts on memory usage and performance are below:

Memory usage:

```
$ cat test.rb
root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true)

$ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v
ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
11408kb

# keep_tokens :false
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
17508kb

# keep_tokens :true
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
30960kb
```

Performance:

```
$ cat ../ast_keep_tokens.yml
prelude: |
  src = <<~SRC
    module M
      class C
        def m1(a, b)
          1 + a + b
        end
      end
    end
  SRC
benchmark:
  without_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false)
  with_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)

$ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml
/home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
            --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \
            --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common  ../tool/runruby.rb --extout=.ext  -- --disable-gems --disable-gem" \
            --output=markdown --output-compare -v ../ast_keep_tokens.yml
compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
warming up..

|                     |compare-ruby|built-ruby|
|:--------------------|-----------:|---------:|
|without_keep_tokens  |     21.659k|   21.303k|
|                     |       1.02x|         -|
|with_keep_tokens     |      6.220k|    5.691k|
|                     |       1.09x|         -|
```
2022-11-21 09:01:34 +09:00
Jemma Issroff 5246f4027e Transition shape when object's capacity changes
This commit adds a `capacity` field to shapes, and adds shape
transitions whenever an object's capacity changes. Objects which are
allocated out of a bigger size pool will also make a transition from the
root shape to the shape with the correct capacity for their size pool
when they are allocated.

This commit will allow us to remove numiv from objects completely, and
will also mean we can guarantee that if two objects share shapes, their
IVs are in the same positions (an embedded and extended object cannot
share shapes). This will enable us to implement ivar sets in YJIT using
object shapes.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-11-10 10:11:34 -05:00
Nobuyoshi Nakada b7504af8fc Preprocess for older bison is no longer needed 2022-11-10 09:51:50 +09:00
yui-knk f7db1affd1 Set default %printer for NODE nterms
Before:

```
Reducing stack by rule 639 (line 5062):
   $1 = token "integer literal" (1.0-1.1: 1)
-> $$ = nterm simple_numeric (1.0-1.1: )
```

After:

```
Reducing stack by rule 641 (line 5078):
   $1 = token "integer literal" (1.0-1.1: 1)
-> $$ = nterm simple_numeric (1.0-1.1: NODE_LIT)
```

`"<*>"` is supported by Bison 2.3b (2008-05-27) or later.
https://git.savannah.gnu.org/cgit/bison.git/commit/?id=12e3584054c16ab255672c07af0ffc7bb220e8bc

Therefore developers need to install Bison 2.3b+ to build ruby from
source code if their Bison is older.

Minimum version requirement for Bison is changed to 3.0.

See: https://bugs.ruby-lang.org/issues/19068 [Feature #19068]
2022-11-08 12:30:03 +09:00
Nobuyoshi Nakada bcf82b7c26
Process token IDs from id.def without id.h
Fixes id.h error during updating ripper.c by `make after-update`.

It used to update id.h in the build directory, but was trying to
update ripper.c in the source directory.  In principle, files in the
source directory cannot or should not depend on files in the build
directory.
2022-09-08 18:22:47 +09:00
Peter Zhu 2d5ecd60a5 [Feature #18249] Update dependencies 2022-02-22 09:55:21 -05:00
Yusuke Endoh 17e7219679 ext/ripper/lib/ripper/lexer.rb: Do not deprecate Ripper::Lexer::State#[]
The old code of IRB still uses this method. The warning is noisy on
rails console.
In principle, Ruby 3.1 deprecates nothing, so let's avoid the
deprecation for the time being.
I think it is not so hard to continue to maintain it, as it is a trivial
shim.

https://github.com/ruby/ruby/pull/5093
2021-12-09 00:30:17 +09:00
Nobuyoshi Nakada 524a808d23
Define Ripper::Lexer::Elem#to_s
Alias `#inspect` as `#to_s` also in the new `Ripper::Lexer::Elem`
class, so that `puts Ripper::Lexer.new(code).scan` shows the
attributes.
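
A minimal sketch of the usage mentioned above, assuming a Ruby version that includes this commit. The sample source string is arbitrary, and the exact text printed for each element is not shown because it depends on the Ruby version.

```ruby
require "ripper"

code = "a = 1"
elems = Ripper::Lexer.new(code).scan

# With `#to_s` aliased to `#inspect`, `puts` prints the attributes of each
# element rather than a bare object identifier.
elems.each { |elem| puts elem }
```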
2021-12-02 18:29:45 +09:00
schneems 8944009be7 Deprecate `Lexer::Elem#[]` and `Lexer::State#[]`
Discussed in https://github.com/ruby/ruby/pull/5093#issuecomment-964426481. 

> it would be enough to mimic only [] for almost all cases

This adds back the `Lexer::Elem#[]` and `Lexer::State#[]` and adds deprecation warnings for them.
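
A minimal sketch of the compatibility behaviour, assuming the `[]` shims accept the old Struct member positions (0 for `pos`, 1 for `event` on `Elem`; 0 for `to_int` on `State`). Depending on the Ruby version and warning settings, the indexed calls may print a deprecation warning.

```ruby
require "ripper"

elem = Ripper::Lexer.new("a = 1").scan.first

# Struct-style indexing, kept for backward compatibility.
pos_by_index   = elem[0]
event_by_index = elem[1]

# Reader methods, the direct form after the change to classes.
pos_by_reader   = elem.pos
event_by_reader = elem.event

puts pos_by_index == pos_by_reader
puts event_by_index == event_by_reader

# The same shim exists on the lexer state object.
state = elem.state
puts state[0] == state.to_int
```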
2021-12-02 15:55:42 +09:00
schneems 3685b5af95 Only iterate Lexer heredoc arrays
The last element in the `@buf` may be either an array or an `Elem`. In the case it is an `Elem` we iterate over every element, when we do not need to. This check guards that case by ensuring that we only iterate over an array of elements.
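
The guard can be sketched as below. This is not the lexer's actual code: `Elem` here is a hypothetical three-member Struct standing in for the pre-change element type, and `heredoc_toks` is an illustrative helper. The point is that a Struct also responds to `each` (yielding its members), so without a type check a single element would be iterated needlessly.

```ruby
# Hypothetical stand-in for the Struct-based element used before the
# switch to classes; a Struct responds to `each` and yields its members.
Elem = Struct.new(:pos, :event, :tok)

# Return the token strings of a trailing heredoc body, or an empty list
# when the last buffer entry is a single element rather than an array.
def heredoc_toks(buf)
  last = buf.last
  return [] unless Array === last
  last.map(&:tok)
end

single  = Elem.new([1, 0], :on_ident, "a")
heredoc = [Elem.new([2, 0], :on_tstring_content, "  x\n"),
           Elem.new([3, 0], :on_heredoc_end, "EOS\n")]

p heredoc_toks([single])
p heredoc_toks([single, heredoc])
```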
2021-12-02 15:55:42 +09:00
schneems 3f74eaa7a8 ~1.10x faster Change Ripper.lex structs to classes
## Concept

I am proposing we replace the Struct implementation of data structures inside of ripper with real classes.

This will improve performance and the implementation is not meaningfully more complicated.

## Example

Struct versus class comparison:

```ruby
Elem = Struct.new(:pos, :event, :tok, :state, :message) do
  def initialize(pos, event, tok, state, message = nil)
    super(pos, event, tok, State.new(state), message)
  end

  # ...

  def to_a
    a = super
    a.pop unless a.empty?
    a
  end
end

class ElemClass
  attr_accessor :pos, :event, :tok, :state, :message

  def initialize(pos, event, tok, state, message = nil)
    @pos = pos
    @event = event
    @tok = tok
    @state = State.new(state)
    @message = message
  end

  def to_a
    if @message
      [@pos, @event, @tok, @state, @message]
    else
      [@pos, @event, @tok, @state]
    end
  end
end

# stub state class creation for now
class State; def initialize(val); end; end
```

## MicroBenchmark creation

```ruby
require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

Benchmark.ips do |x|
  x.report("struct") { Elem.new(pos, event, tok, state) }
  x.report("class ") { ElemClass.new(pos, event, tok, state) }
  x.compare!
end; nil
```

Gives ~1.2x faster creation:

```
Warming up --------------------------------------
              struct   263.983k i/100ms
              class    303.367k i/100ms
Calculating -------------------------------------
              struct      2.638M (± 5.9%) i/s -     13.199M in   5.023460s
              class       3.171M (± 4.6%) i/s -     16.078M in   5.082369s

Comparison:
              class :  3170690.2 i/s
              struct:  2638493.5 i/s - 1.20x  (± 0.00) slower
```

## MicroBenchmark `to_a` (Called by Ripper.lex for every element)

```ruby
require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

struct =  Elem.new(pos, event, tok, state)
from_class = ElemClass.new(pos, event, tok, state)

Benchmark.ips do |x|
  x.report("struct") { struct.to_a }
  x.report("class ") { from_class.to_a }
  x.compare!
end; nil
```

Gives 1.46x faster `to_a`:

```
Warming up --------------------------------------
              struct   612.094k i/100ms
              class    893.233k i/100ms
Calculating -------------------------------------
              struct      6.121M (± 5.4%) i/s -     30.605M in   5.015851s
              class       8.931M (± 7.9%) i/s -     44.662M in   5.039733s

Comparison:
              class :  8930619.0 i/s
              struct:  6121358.9 i/s - 1.46x  (± 0.00) slower
```
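
For context on why `to_a` matters, here is a short sketch of what `Ripper.lex` hands back: one plain array per token, in the order position, event, token text, state. The sample source string is arbitrary.

```ruby
require "ripper"

tokens = Ripper.lex("a = 1")

# Each entry is an array produced from a lexer element.
tokens.each do |pos, event, tok, state|
  puts "#{pos.inspect} #{event} #{tok.inspect} #{state}"
end
```

Because every token goes through this conversion, a faster `to_a` affects the whole run of `Ripper.lex`.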

## MicroBenchmark data access

```ruby
require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

struct =  Elem.new(pos, event, tok, state)
from_class = ElemClass.new(pos, event, tok, state)

Benchmark.ips do |x|
  x.report("struct") { struct.pos[1] }
  x.report("class ") { from_class.pos[1] }
  x.compare!
end; nil
```

Gives ~1.17x faster data access:

```
Warming up --------------------------------------
              struct     1.694M i/100ms
              class      1.868M i/100ms
Calculating -------------------------------------
              struct     16.149M (± 6.8%) i/s -     81.318M in   5.060633s
              class      18.886M (± 2.9%) i/s -     95.262M in   5.048359s

Comparison:
              class : 18885669.6 i/s
              struct: 16149255.8 i/s - 1.17x  (± 0.00) slower
```

## Full benchmark integration of this inside of Ripper.lex

Inside of this repo with this commit

```
$ cd ext/ripper
$ make
$ cat test.rb
file = File.join(__dir__, "../../array.rb")
source = File.read(file)

bench = Benchmark.measure do
  10_000.times.each do
    Ripper.lex(source)
  end
end

puts bench
```

Then execute with and without this change 50 times:

```
rm new.txt
rm old.txt
for i in {0..50}
do
  `ruby -Ilib -rripper -rbenchmark ./test.rb >> new.txt`
  `ruby -rripper -rbenchmark ./test.rb >> old.txt`
done
```

I used derailed benchmarks internals to compare the results:

```
dir = Pathname(".")
branch_info = {}
branch_info["old"]  = { desc: "Struct lex", time: Time.now, file: dir.join("old.txt"), name: "old" }
branch_info["new"]  = { desc: "Class lex", time: Time.now, file: dir.join("new.txt"), name: "new" }
stats = DerailedBenchmarks::StatsFromDir.new(branch_info)
stats.call.banner
```

Which gave us:

```
❤️ ❤️ ❤️  (Statistically Significant) ❤️ ❤️ ❤️

[new] (3.3139 seconds) "Class lex" ref: "new"
  FASTER 🚀🚀🚀 by:
    1.1046x [older/newer]
    9.4700% [(older - newer) / older * 100]
[old] (3.6606 seconds) "Struct lex" ref: "old"

Iterations per sample:
Samples: 51

Test type: Kolmogorov Smirnov
Confidence level: 99.0 %
Is significant? (max > critical): true
D critical: 0.30049534876137013
D max: 0.9607843137254902

Histograms (time ranges are in seconds):

   [new] description:                                        [old] description:
     "Class lex"                                               "Struct lex"
              ┌                                        ┐                ┌                                        ┐
   [3.0, 3.3) ┤▇ 1                                           [3.0, 3.3) ┤ 0
   [3.3, 3.6) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 47       [3.3, 3.6) ┤ 0
   [3.5, 3.8) ┤▇▇ 2                                          [3.5, 3.8) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 46
   [3.8, 4.1) ┤▇ 1                                           [3.8, 4.1) ┤▇▇▇ 4
   [4.0, 4.3) ┤ 0                                            [4.0, 4.3) ┤ 0
   [4.3, 4.6) ┤ 0                                            [4.3, 4.6) ┤▇ 1
              └                                        ┘                └                                        ┘
                         # of runs in range                                        # of runs in range
```

To sum this up, the "new" version of this code (using real classes instead of structs) is 10% faster across 50 runs with a statistical significance confidence level of 99%. Histograms are for visual checksum.
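
The roughly 10% figure follows from the two mean times reported above. A small worked check, using only those two numbers (the variable names are illustrative):

```ruby
old_mean = 3.6606 # seconds, "Struct lex"
new_mean = 3.3139 # seconds, "Class lex"

# Ratio of old time to new time, matching the "[older/newer]" line.
speedup = old_mean / new_mean

# Share of the old time saved, matching "[(older - newer) / older * 100]".
percent_faster = (old_mean - new_mean) / old_mean * 100

puts format("%.4fx", speedup)         # prints 1.1046x
puts format("%.2f%%", percent_faster) # prints 9.47%
```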
2021-12-02 15:55:42 +09:00
Nobuyoshi Nakada d896746d69
Keep the generated source files when clean [Bug #18363] 2021-11-25 19:16:39 +09:00
Nobuyoshi Nakada ac152b3cac
Update dependencies 2021-11-21 16:21:18 +09:00
卜部昌平 5c167a9778 ruby tool/update-deps --fix 2021-10-05 14:18:23 +09:00
卜部昌平 6413dc27dc dependency updates 2021-04-13 14:30:21 +09:00
Shugo Maeda 5de38c41ae
ripper: fix a bug of Ripper::Lexer with syntax error and heredoc [Bug #17644] 2021-02-19 16:40:29 +09:00
manga_osyo b84b253a69 Fix Ripper with heredoc. 2021-01-17 12:58:13 +09:00
Nobuyoshi Nakada 433a3be86a
ripper: call #pretty_print on also `state` 2021-01-04 23:37:00 +09:00
Nobuhiro IMAI e33eb09b76 ripper: fix `#tok` on some error events [Bug #17345]
Sort alias targets by event arity, and set up a suitable `Elem` for errors.
2020-12-19 17:32:39 +09:00
Nobuyoshi Nakada e0bdd54348 Ripper: Refined error callbacks [Bug #17345] 2020-12-15 21:36:23 +09:00
Nobuyoshi Nakada 7898f4243f
ripper: return pushed new token instead of the token list 2020-12-15 10:26:50 +09:00