Граф коммитов

256 Коммитов

Автор SHA1 Сообщение Дата
yui-knk 7b803eafa2 Ripper does not depend on Bison [ci skip]
It also uses Lrama then no dependency on Bison.
2023-06-03 10:34:24 +09:00
yui-knk 3a4206c7a1 No need to define "BISON" on extconf.rb
"BISON" is defined in "ext/ripper/depend".
2023-06-02 09:28:30 +09:00
Nobuyoshi Nakada 3fe45a3123
Process parse.y without temporary files 2023-05-15 19:10:24 +09:00
Nobuyoshi Nakada bdaa491565 Add user argument to some macros used by bison 2023-05-14 15:38:48 +09:00
Nobuyoshi Nakada 3150516aab Preprocess input parse.y from stdin 2023-05-14 15:38:48 +09:00
Yuichiro Kaneko a1b01e7701
Use Lrama LALR parser generator instead of Bison
https://bugs.ruby-lang.org/issues/19637

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2023-05-12 18:25:10 +09:00
Matt Valentine-House 2a34bcaa10 Update VPATH for socket, & dependencies
The socket extensions rubysocket.h pulls in the "private" include/gc.h,
which now depends on vm_core.h. vm_core.h pulls in id.h

when tool/update-deps generates the dependencies for the makefiles, it
generates the line for id.h to be based on VPATH, which is configured in
the extconf.rb for each of the extensions. By default VPATH does not
include the actual source directory of the current Ruby so the
dependency fails to resolve and linking fails.

We need to append the topdir and top_srcdir to VPATH to have the
dependancy picked up correctly (and I believe we need both of these to
cope with in-tree and out-of-tree builds).

I copied this from the approach taken in
https://github.com/ruby/ruby/blob/master/ext/objspace/extconf.rb#L3
2023-04-06 11:07:16 +01:00
Matt Valentine-House 5e4b80177e Update the depend files 2023-02-28 09:09:00 -08:00
Matt Valentine-House f38c6552f9 Remove intern/gc.h from Make deps 2023-02-27 10:11:56 -08:00
Nobuyoshi Nakada 899ea35035
Extract include/ruby/internal/attr/packed_struct.h
Split `PACKED_STRUCT` and `PACKED_STRUCT_UNALIGNED` macros into the
macros bellow:
* `RBIMPL_ATTR_PACKED_STRUCT_BEGIN`
* `RBIMPL_ATTR_PACKED_STRUCT_END`
* `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_BEGIN`
* `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_END`
2023-02-08 12:34:13 +09:00
Nobuyoshi Nakada fad48fefe1 [Bug #19399] Parsing invalid heredoc inside block parameter
Although this is of course invalid as Ruby code, allow to just parse
and tokenize.
2023-02-02 12:20:10 +09:00
S-H-GAMELINKS 1a64d45c67 Introduce encoding check macro 2022-12-02 01:31:27 +09:00
yui-knk d8601621ed Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods
Implementation for Language Server Protocol (LSP) sometimes needs token information.
For example both `m(1)` and `m(1, )` has same AST structure other than node locations
then it's impossible to check the existence of `,` from AST. However in later case,
it might be better to suggest variables list for the second argument.
Token information is important for such case.

This commit adds these methods.

* Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of`
* Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node including tokens for descendants nodes.
* Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless the receiver node.

[Feature #19070]

Impacts on memory usage and performance are below:

Memory usage:

```
$ cat test.rb
root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true)

$ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v
ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
11408kb

# keep_tokens :false
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
17508kb

# keep_tokens :true
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
30960kb
```

Performance:

```
$ cat ../ast_keep_tokens.yml
prelude: |
  src = <<~SRC
    module M
      class C
        def m1(a, b)
          1 + a + b
        end
      end
    end
  SRC
benchmark:
  without_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false)
  with_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)

$ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml
/home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
            --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \
            --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common  ../tool/runruby.rb --extout=.ext  -- --disable-gems --disable-gem" \
            --output=markdown --output-compare -v ../ast_keep_tokens.yml
compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
warming up..

|                     |compare-ruby|built-ruby|
|:--------------------|-----------:|---------:|
|without_keep_tokens  |     21.659k|   21.303k|
|                     |       1.02x|         -|
|with_keep_tokens     |      6.220k|    5.691k|
|                     |       1.09x|         -|
```
2022-11-21 09:01:34 +09:00
Jemma Issroff 5246f4027e Transition shape when object's capacity changes
This commit adds a `capacity` field to shapes, and adds shape
transitions whenever an object's capacity changes. Objects which are
allocated out of a bigger size pool will also make a transition from the
root shape to the shape with the correct capacity for their size pool
when they are allocated.

This commit will allow us to remove numiv from objects completely, and
will also mean we can guarantee that if two objects share shapes, their
IVs are in the same positions (an embedded and extended object cannot
share shapes). This will enable us to implement ivar sets in YJIT using
object shapes.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-11-10 10:11:34 -05:00
Nobuyoshi Nakada b7504af8fc Preprocess for older bison is no longer needed 2022-11-10 09:51:50 +09:00
yui-knk f7db1affd1 Set default %printer for NODE nterms
Before:

```
Reducing stack by rule 639 (line 5062):
   $1 = token "integer literal" (1.0-1.1: 1)
-> $$ = nterm simple_numeric (1.0-1.1: )
```

After:

```
Reducing stack by rule 641 (line 5078):
   $1 = token "integer literal" (1.0-1.1: 1)
-> $$ = nterm simple_numeric (1.0-1.1: NODE_LIT)
```

`"<*>"` is supported by Bison 2.3b (2008-05-27) or later.
https://git.savannah.gnu.org/cgit/bison.git/commit/?id=12e3584054c16ab255672c07af0ffc7bb220e8bc

Therefore developers need to install Bison 2.3b+ to build ruby from
source codes if their Bison is older.

Minimum version requirement for Bison is changed to 3.0.

See: https://bugs.ruby-lang.org/issues/19068 [Feature #19068]
2022-11-08 12:30:03 +09:00
Nobuyoshi Nakada bcf82b7c26
Process token IDs from id.def without id.h
Fixes id.h error during updating ripper.c by `make after-update`.

While it used to update id.h in the build directory, but was trying to
update ripper.c in the source directory.  In principle, files in the
source directory can or should not depend on files in the build
directory.
2022-09-08 18:22:47 +09:00
Peter Zhu 2d5ecd60a5 [Feature #18249] Update dependencies 2022-02-22 09:55:21 -05:00
Yusuke Endoh 17e7219679 ext/ripper/lib/ripper/lexer.rb: Do not deprecate Ripper::Lexer::State#[]
The old code of IRB still uses this method. The warning is noisy on
rails console.
In principle, Ruby 3.1 deprecates nothing, so let's avoid the
deprecation for the while.
I think It is not so hard to continue to maintain it as it is a trivial
shim.

https://github.com/ruby/ruby/pull/5093
2021-12-09 00:30:17 +09:00
Nobuyoshi Nakada 524a808d23
Define Ripper::Lexer::Elem#to_s
Alias `#inspect` as `#to_s` also in the new `Ripper::Lexer::Elem`
class, so that `puts Ripper::Lexer.new(code).scan` shows the
attributes.
2021-12-02 18:29:45 +09:00
schneems 8944009be7 Deprecate `Lexer::Elem#[]` and `Lexer::State#[]`
Discussed in https://github.com/ruby/ruby/pull/5093#issuecomment-964426481. 

> it would be enough to mimic only [] for almost all cases

This adds back the `Lexer::Elem#[]` and `Lexer::State#[]` and adds deprecation warnings for them.
2021-12-02 15:55:42 +09:00
schneems 3685b5af95 Only iterate Lexer heredoc arrays
The last element in the `@buf` may be either an array or an `Elem`. In the case it is an `Elem` we iterate over every element, when we do not need to. This check guards that case by ensuring that we only iterate over an array of elements.
2021-12-02 15:55:42 +09:00
schneems 3f74eaa7a8 ~1.10x faster Change Ripper.lex structs to classes
## Concept

I am proposing we replace the Struct implementation of data structures inside of ripper with real classes.

This will improve performance and the implementation is not meaningfully more complicated.

## Example

Struct versus class comparison:

```ruby
Elem = Struct.new(:pos, :event, :tok, :state, :message) do
  def initialize(pos, event, tok, state, message = nil)
    super(pos, event, tok, State.new(state), message)
  end

  # ...

  def to_a
    a = super
    a.pop unless a.empty?
    a
  end
end

class ElemClass
  attr_accessor :pos, :event, :tok, :state, :message

  def initialize(pos, event, tok, state, message = nil)
    @pos = pos
    @event = event
    @tok = tok
    @state = State.new(state)
    @message = message
  end

  def to_a
    if @message
      [@pos, @event, @tok, @state, @message]
    else
      [@pos, @event, @tok, @state]
    end
  end
end

# stub state class creation for now
class State; def initialize(val); end; end
```

## MicroBenchmark creation

```ruby
require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

Benchmark.ips do |x|
  x.report("struct") { Elem.new(pos, event, tok, state) }
  x.report("class ") { ElemClass.new(pos, event, tok, state) }
  x.compare!
end; nil
```

Gives ~1.2x faster creation:

```
Warming up --------------------------------------
              struct   263.983k i/100ms
              class    303.367k i/100ms
Calculating -------------------------------------
              struct      2.638M (± 5.9%) i/s -     13.199M in   5.023460s
              class       3.171M (± 4.6%) i/s -     16.078M in   5.082369s

Comparison:
              class :  3170690.2 i/s
              struct:  2638493.5 i/s - 1.20x  (± 0.00) slower
```

## MicroBenchmark `to_a` (Called by Ripper.lex for every element)

```ruby
require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

struct =  Elem.new(pos, event, tok, state)
from_class = ElemClass.new(pos, event, tok, state)

Benchmark.ips do |x|
  x.report("struct") { struct.to_a }
  x.report("class ") { from_class.to_a }
  x.compare!
end; nil
```

Gives 1.46x faster `to_a`:

```
Warming up --------------------------------------
              struct   612.094k i/100ms
              class    893.233k i/100ms
Calculating -------------------------------------
              struct      6.121M (± 5.4%) i/s -     30.605M in   5.015851s
              class       8.931M (± 7.9%) i/s -     44.662M in   5.039733s

Comparison:
              class :  8930619.0 i/s
              struct:  6121358.9 i/s - 1.46x  (± 0.00) slower
```

## MicroBenchmark data access

```ruby
require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

struct =  Elem.new(pos, event, tok, state)
from_class = ElemClass.new(pos, event, tok, state)

Benchmark.ips do |x|
  x.report("struct") { struct.pos[1] }
  x.report("class ") { from_class.pos[1] }
  x.compare!
end; nil
```

Gives ~1.17x faster data access:

```
Warming up --------------------------------------
              struct     1.694M i/100ms
              class      1.868M i/100ms
Calculating -------------------------------------
              struct     16.149M (± 6.8%) i/s -     81.318M in   5.060633s
              class      18.886M (± 2.9%) i/s -     95.262M in   5.048359s

Comparison:
              class : 18885669.6 i/s
              struct: 16149255.8 i/s - 1.17x  (± 0.00) slower
```

## Full benchmark integration of this inside of Ripper.lex

Inside of this repo with this commit

```
$ cd ext/ripper
$ make
$ cat test.rb
file = File.join(__dir__, "../../array.rb")
source = File.read(file)

bench = Benchmark.measure do
  10_000.times.each do
    Ripper.lex(source)
  end
end

puts bench
```

Then execute with and without this change 50 times:

```
rm new.txt
rm old.txt
for i in {0..50}
do
  `ruby -Ilib -rripper -rbenchmark ./test.rb >> new.txt`
  `ruby -rripper -rbenchmark ./test.rb >> old.txt`
done
```

I used derailed benchmarks internals to compare the results:

```
dir = Pathname(".")
branch_info = {}
branch_info["old"]  = { desc: "Struct lex", time: Time.now, file: dir.join("old.txt"), name: "old" }
branch_info["new"]  = { desc: "Class lex", time: Time.now, file: dir.join("new.txt"), name: "new" }
stats = DerailedBenchmarks::StatsFromDir.new(branch_info)
stats.call.banner
```

Which gave us:

```
❤️ ❤️ ❤️  (Statistically Significant) ❤️ ❤️ ❤️

[new] (3.3139 seconds) "Class lex" ref: "new"
  FASTER 🚀🚀🚀 by:
    1.1046x [older/newer]
    9.4700% [(older - newer) / older * 100]
[old] (3.6606 seconds) "Struct lex" ref: "old"

Iterations per sample:
Samples: 51

Test type: Kolmogorov Smirnov
Confidence level: 99.0 %
Is significant? (max > critical): true
D critical: 0.30049534876137013
D max: 0.9607843137254902

Histograms (time ranges are in seconds):

   [new] description:                                        [old] description:
     "Class lex"                                               "Struct lex"
              ┌                                        ┐                ┌                                        ┐
   [3.0, 3.3) ┤▇ 1                                           [3.0, 3.3) ┤ 0
   [3.3, 3.6) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 47       [3.3, 3.6) ┤ 0
   [3.5, 3.8) ┤▇▇ 2                                          [3.5, 3.8) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 46
   [3.8, 4.1) ┤▇ 1                                           [3.8, 4.1) ┤▇▇▇ 4
   [4.0, 4.3) ┤ 0                                            [4.0, 4.3) ┤ 0
   [4.3, 4.6) ┤ 0                                            [4.3, 4.6) ┤▇ 1
              └                                        ┘                └                                        ┘
                         # of runs in range                                        # of runs in range
```

To sum this up, the "new" version of this code (using real classes instead of structs) is 10% faster across 50 runs with a statistical significance confidence level of 99%. Histograms are for visual checksum.
2021-12-02 15:55:42 +09:00
Nobuyoshi Nakada d896746d69
Keep the generated source files when clean [Bug #18363] 2021-11-25 19:16:39 +09:00
Nobuyoshi Nakada ac152b3cac
Update dependencies 2021-11-21 16:21:18 +09:00
卜部昌平 5c167a9778 ruby tool/update-deps --fix 2021-10-05 14:18:23 +09:00
卜部昌平 6413dc27dc dependency updates 2021-04-13 14:30:21 +09:00
Shugo Maeda 5de38c41ae
ripper: fix a bug of Ripper::Lexer with syntax error and heredoc [Bug #17644] 2021-02-19 16:40:29 +09:00
manga_osyo b84b253a69 Fix Ripper with heredoc. 2021-01-17 12:58:13 +09:00
Nobuyoshi Nakada 433a3be86a
ripper: call #pretty_print on also `state` 2021-01-04 23:37:00 +09:00
Nobuhiro IMAI e33eb09b76 ripper: fix `#tok` on some error events [Bug 17345]
sorting alias target by event arity, and setup suitable `Elem` for error.
2020-12-19 17:32:39 +09:00
Nobuyoshi Nakada e0bdd54348 Ripper: Refined error callbacks [Bug #17345] 2020-12-15 21:36:23 +09:00
Nobuyoshi Nakada 7898f4243f
ripper: return pushed new token instead of the token list 2020-12-15 10:26:50 +09:00
Nobuyoshi Nakada f5ca3ff4db
Store all kinds of syntax errors [Bug #17345] 2020-11-26 20:14:34 +09:00
Nobuhiro IMAI 4f5d14eb8c [DOC] Ripper.{lex,tokenize} now always return full tokens. [ci skip] 2020-11-20 15:46:17 -08:00
Nobuyoshi Nakada 69d871eeeb
[Feature #17276] Moved raise_errors support to Ripper::Lexer#parse 2020-11-20 17:18:27 +09:00
Nobuhiro IMAI 1800f3fa5c
Ripper.{lex,tokenize} return full tokens even if syntax error
yet another implements [Feature #17276]
2020-11-20 11:44:57 +09:00
Jeremy Evans 1301bd8ca9 Update documentation for Ripper.{lex,tokenize,sexp,sexp_raw} [ci skip] 2020-11-17 21:26:56 -08:00
Jeremy Evans cd0877a93e
Support raise_errors keyword for Ripper.{lex,tokenize,sexp,sexp_raw}
Implements [Feature #17276]
2020-11-17 21:15:50 -08:00
Koichi Sasada 5e3259ea74 fix public interface
To make some kind of Ractor related extensions, some functions
should be exposed.

* include/ruby/thread_native.h
  * rb_native_mutex_*
  * rb_native_cond_*
* include/ruby/ractor.h
  * RB_OBJ_SHAREABLE_P(obj)
  * rb_ractor_shareable_p(obj)
  * rb_ractor_std*()
  * rb_cRactor

and rm ractor_pub.h
and rename srcdir/ractor.h to srcdir/ractor_core.h
    (to avoid conflict with include/ruby/ractor.h)
2020-11-18 03:52:41 +09:00
Koichi Sasada 79df14c04b Introduce Ractor mechanism for parallel execution
This commit introduces Ractor mechanism to run Ruby program in
parallel. See doc/ractor.md for more details about Ractor.
See ticket [Feature #17100] to see the implementation details
and discussions.

[Feature #17100]

This commit does not complete the implementation. You can find
many bugs on using Ractor. Also the specification will be changed
so that this feature is experimental. You will see a warning when
you make the first Ractor with `Ractor.new`.

I hope this feature can help programmers from thread-safety issues.
2020-09-03 21:11:06 +09:00
卜部昌平 490010084e sed -i '/rmodule.h/d' 2020-08-27 16:42:06 +09:00
卜部昌平 756403d775 sed -i '/r_cast.h/d' 2020-08-27 15:03:36 +09:00
卜部昌平 0da2a3f1fc sed -i '\,2/extern.h,d' 2020-08-27 14:07:49 +09:00
Nobuyoshi Nakada d32e2bb02d
Allow references to $$ in Ripper DSL 2020-05-29 09:41:27 +09:00
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平 d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
Nobuyoshi Nakada b7e1eda932
Suppress warnings by gcc 10.1.0-RC-20200430
* Folding results should not be empty.

  If `OnigCodePointCount(to->n)` were 0, `for` loop using `fn`
  wouldn't execute and `ncs` elements are not initialized.

  ```
  enc/unicode.c:557:21: warning: 'ncs[0]' may be used uninitialized in this function [-Wmaybe-uninitialized]
    557 |  for (i = 0; i < ncs[0]; i++) {
        |                  ~~~^~~
  ```

* Cast to `enum yytokentype`

  Additional enums for scanner events by ripper are not included
  in `yytokentype`.

  ```
  ripper.y:7274:28: warning: implicit conversion from 'enum <anonymous>' to 'enum yytokentype' [-Wenum-conversion]
  ```
2020-05-04 12:28:24 +09:00
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
Hiroshi SHIBATA 05485868cb Workaround for bison provided by scoop on mswin environment 2020-02-15 21:20:25 +09:00