Commit Graph

640 Commits

Author SHA1 Message Date
git 2dd1a037de * expand tabs. [ci skip]
Tabs were expanded because the file did not have any tab indentation in unedited lines.
Please update your editor config, and use misc/expand_tabs.rb in the pre-commit hook.
2022-10-10 13:22:15 +09:00
Nobuyoshi Nakada 0a98dd1cff
Should use dedicated function `Check_Type` 2022-10-10 13:21:57 +09:00
Vladimir Dementyev 4954c9fc0f Add MatchData#deconstruct/deconstruct_keys 2022-10-10 12:41:13 +09:00
Nobuyoshi Nakada c53667691a
[DOC] `offset` argument of Regexp#match 2022-08-18 23:25:05 +09:00
Aaron Patterson e4e054e3ce Speed up setting the backref match object
This patch speeds up setting the backref match object by avoiding some
memcopies.  Take the following code for example:

```ruby
"hello world" =~ /hello/
p $~
```

When the RE matches the string, we have to set the Match object in the
backref global.  So we would allocate a match object[^1] and use
`rb_reg_region_copy`[^2] to make a deep copy of the stack allocated
`re_registers` struct[^3] in to the newly created Ruby object.  This
could possibly trigger GC[^4], and would allocate new memory.

This patch makes a shallow copy of the `re_registers` struct on to the
Match object allowing the match object to manage the `re_registers`
pointer and also avoiding some calls to `xmalloc` and some manual
memcopy.

Benchmark looks like this:

```ruby

require "benchmark/ips"

def test_re thing
  thing =~ /hello/
end

Benchmark.ips do |x|
  x.report("re hit") do
    test_re "hello world"
  end

  x.report("re miss") do
    test_re "world"
  end
end
```

Before this patch:

```
$ ruby -v test.rb
ruby 3.2.0dev (2022-07-27T22:29:00Z master 4ad69899b7) [arm64-darwin21]
Ignoring bcrypt-3.1.16 because its extensions are not built. Try: gem pristine bcrypt --version 3.1.16
Warming up --------------------------------------
              re hit   345.401k i/100ms
             re miss   673.584k i/100ms
Calculating -------------------------------------
              re hit      3.452M (± 0.5%) i/s -     17.270M in   5.002535s
             re miss      6.736M (± 0.4%) i/s -     34.353M in   5.099593s
```

After this patch:

```
$ ./ruby -v test.rb
ruby 3.2.0dev (2022-08-01T21:24:12Z less-memcpy 0ff2a56606) [arm64-darwin21]
Warming up --------------------------------------
              re hit   419.578k i/100ms
             re miss   673.251k i/100ms
Calculating -------------------------------------
              re hit      4.201M (± 0.7%) i/s -     21.398M in   5.093593s
             re miss      6.716M (± 0.4%) i/s -     33.663M in   5.012756s
```

Matches get faster and misses maintain the same speed.

[^1]: 24204d54ab/re.c (L1737)
[^2]: 24204d54ab/re.c (L1738)
[^3]: 24204d54ab/re.c (L1686)
[^4]: 24204d54ab/re.c (L981)
2022-08-02 09:04:04 -07:00
Takashi Kokubun 5b21e94beb Expand tabs [ci skip]
[Misc #18891]
2022-07-21 09:42:04 -07:00
Kazuhiro NISHIYAMA 846a6bb60f
[DOC] Fix a typo [ci skip] 2022-06-26 14:17:14 +09:00
Jeremy Evans 596f4b0d3a Document that Regexp#source does not retain lexer escapes
Related to [Feature #18838]
2022-06-20 15:56:28 -07:00
Nobuyoshi Nakada 4a6facc2d6 [Feature #18788] [DOC] String options to `Regexp.new`
Co-Authored-By: Janosch Müller <janosch.mueller@betterplace.org>
2022-06-20 19:35:12 +09:00
Nobuyoshi Nakada 1e9939dae2 [Feature #18788] Support options as `String` to `Regexp.new`
`Regexp.new` now supports passing the regexp flags not only as an
`Integer`, but also as a `String`.  Unknown flags raise errors.
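
A minimal sketch of the behaviour described above, assuming Ruby 3.2 or later. The helper name `build_regexp` is hypothetical, and rescuing `ArgumentError` is an assumption about which error class is raised for unknown flags:

```ruby
# Hypothetical helper: build a Regexp from a source string and string flags.
def build_regexp(source, flags)
  Regexp.new(source, flags)
rescue ArgumentError => e
  # Assumption: unknown flag characters raise ArgumentError.
  warn "rejected flags #{flags.inspect}: #{e.message}"
  nil
end

case_insensitive = build_regexp("hello", "i")
p case_insensitive.match?("HELLO")   # "i" behaves like Regexp::IGNORECASE
p build_regexp("hello", "z")         # unknown flag, so the helper returns nil
```

Passing the flags as a string keeps call sites close to the literal syntax (`/hello/i`) without needing the `Regexp::IGNORECASE` constants.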
2022-06-20 19:35:12 +09:00
Nobuyoshi Nakada ab2a43265c Warn suspicious flag to `Regexp.new`
The second argument should now be `true`, `false`, `nil`, or an Integer.
This flag is sometimes confused with the third argument.
2022-06-20 19:35:12 +09:00
Nobuyoshi Nakada 7f8a915715
[DOC] Refine Regexp.new argument descriptions 2022-06-20 18:39:50 +09:00
Nobuyoshi Nakada 914c26eab3
[DOC] Regexp timeout is float or nil 2022-06-20 17:47:44 +09:00
Nobuyoshi Nakada cd3a5cd0e3
[DOC] Fixed omissions in Regexp.new arguments 2022-06-20 09:26:11 +09:00
Jeremy Evans ec3542229b
Ignore invalid escapes in regexp comments
Invalid escapes are handled at multiple levels.  The first level
is in parse.y, so skip invalid unicode escape checks for regexps
in parse.y.

Make rb_reg_preprocess and unescape_nonascii accept the regexp
options.  In unescape_nonascii, if the regexp is an extended
regexp, when "#" is encountered, ignore all characters until the
end of line or end of regexp.

Unfortunately, in extended regexps, you can use "#" as a non-comment
character inside a character class, so also parse "[" and "]"
specially for extended regexps, and only skip comments if "#" is
not inside a character class. Handle nested character classes as well.

This issue doesn't just affect extended regexps, it also affects
"(?#" comments inside all regexps.  So for those comments, scan
until trailing ")" and ignore content inside.

I'm not sure if there are other corner cases not handled.  A
better fix would be to redesign the regexp parser so that it
unescapes during parsing instead of before parsing, so you already
know the current parsing state.

Fixes [Bug #18294]

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2022-06-06 13:50:03 -07:00
Burdette Lamar b41de3a1e8
[DOC] Enhanced RDoc for MatchData (#5822)
Treats:
    #to_s
    #named_captures
    #string
    #inspect
    #hash
    #==
2022-04-18 18:19:10 -05:00
Burdette Lamar 6db3f7c405
Enhanced RDoc for MatchData (#5821)
Treats:
    #[]
    #values_at
2022-04-18 15:52:07 -05:00
Burdette Lamar 86e23529ad
Enhanced RDoc for MatchData (#5820)
Treats:
    #pre_match
    #post_match
    #to_a
    #captures
2022-04-18 14:34:40 -05:00
Burdette Lamar b074bc3d61
[DOC] Enhanced RDoc for MatchData (#5819)
Treats:
    #begin
    #end
    #match
    #match_length
2022-04-18 13:02:35 -05:00
Burdette Lamar 9d1dd7a9ed
[DOC] Enhanced RDoc for MatchData (#5818)
Treats:
    #regexp
    #names
    #size
    #offset
2022-04-18 11:31:30 -05:00
Burdette Lamar 51ea67698e
[DOC] Enhanced RDoc for Regexp (#5815)
Treats:
    ::new
    ::escape
    ::try_convert
    ::union
    ::last_match
2022-04-18 10:45:29 -05:00
Burdette Lamar 2b4b513ef0
[DOC] Enhanced RDoc for Regexp (#5812)
Treats:

    #fixed_encoding?
    #hash
    #==
    #=~
    #match
    #match?

Also, in regexp.rdoc:

    Changes heading from 'Special Global Variables' to 'Regexp Global Variables'.
    Add tiny section 'Regexp Interpolation'.
2022-04-16 15:20:03 -05:00
Burdette Lamar e021754db0
[DOC] Enhanced RDoc for Regexp (#5807)
Treats:

    #source
    #inspect
    #to_s
    #casefold?
    #options
    #names
    #named_captures
2022-04-15 13:31:15 -05:00
Nobuyoshi Nakada d8189ed23f
Return only captured range in `MatchData` [Bug #18670] 2022-03-31 18:01:15 +09:00
Yusuke Endoh c499a4c28a re.c: stop a wrong warning of "flags ignored" on Regexp.new(//)
[Bug #18669]
2022-03-31 10:07:09 +09:00
Yusuke Endoh 5df2589b64 internal/ractor.h: Added
Currently it has only one function prototype.
2022-03-30 16:50:46 +09:00
Yusuke Endoh 2ade40276b re.c: raise Regexp::TimeoutError instead of RuntimeError 2022-03-30 16:50:46 +09:00
Yusuke Endoh ce87bb8bd6 re.c: Add `timeout` keyword for Regexp.new and Regexp#timeout 2022-03-30 16:50:46 +09:00
Yusuke Endoh ffc3b37f96 re.c: Add Regexp.timeout= and Regexp.timeout
[Feature #17837]
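
Taken together, the three timeout commits listed here can be exercised roughly as follows. This is a sketch assuming Ruby 3.2 or later; the helper names `regexp_with_timeout` and `safe_match?` are hypothetical:

```ruby
# Hypothetical helper: a Regexp with its own timeout, given in seconds.
def regexp_with_timeout(source, seconds)
  Regexp.new(source, timeout: seconds)
end

# Hypothetical helper: treat a timed-out match as "no match".
def safe_match?(re, input)
  re.match?(input)
rescue Regexp::TimeoutError
  false
end

re = regexp_with_timeout("a+b", 0.5)
p re.timeout              # per-regexp timeout set by the keyword
p Regexp.timeout          # process-wide default; nil unless Regexp.timeout= was called
p safe_match?(re, "aaab")
```

Rescuing the dedicated `Regexp::TimeoutError` class rather than `RuntimeError` lets callers distinguish a slow pattern from other failures.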
2022-03-30 16:50:46 +09:00
Shugo Maeda c8817d6a3e
Add String#byteindex, String#byterindex, and MatchData#byteoffset (#5518)
* Add String#byteindex, String#byterindex, and MatchData#byteoffset [Feature #13110]

Co-authored-by: NARUSE, Yui <naruse@airemix.jp>
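
A short illustration of how the byte-based methods differ from their character-based counterparts, assuming Ruby 3.2 or later and UTF-8 source encoding. The helper name `offsets` is hypothetical:

```ruby
# Hypothetical helper: compare character offsets with byte offsets for a match.
def offsets(text, pattern)
  m = pattern.match(text)
  { char: m.offset(0), byte: m.byteoffset(0) }
end

text = "こんにちは" # five characters, three bytes each in UTF-8
p text.index("に")      # => 2 (character index)
p text.byteindex("に")  # => 6 (byte index)
p text.byterindex("に") # => 6 (searching from the end, still a byte index)
p offsets(text, /に/)   # => {char: [2, 3], byte: [6, 9]} (inspect format varies by Ruby version)
```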
2022-02-19 19:10:00 +09:00
Shugo Maeda cda5aee74e
LONG2NUM() should be used for rmatch_offset::{beg,end}
https://github.com/ruby/ruby/pull/5518#discussion_r809645406
2022-02-18 22:13:45 +09:00
Nobuyoshi Nakada 16fdc1ff46
[DOC] Fix broken links to literals.rdoc 2022-02-08 01:27:52 +09:00
S-H-GAMELINKS 804a714971 Replace to RBOOL macro 2022-01-17 13:49:37 +09:00
Burdette Lamar 28fb6d6b9e
Adding links to literals and Kernel (#5192)
* Adding links to literals and Kernel
2021-12-03 07:12:28 -06:00
S.H dc9112cf10
Using NIL_P macro instead of `== Qnil` 2021-10-03 22:34:45 +09:00
Jeremy Evans abc0304cb2 Avoid race condition in Regexp#match
In certain conditions, Regexp#match could return a MatchData with
missing captures.  This seems to require at the least, multiple
threads calling a method that calls the same block/proc/lambda
which calls Regexp#match.

The race condition happens because the MatchData is passed
indirectly via the backref, and other threads can modify the
backref.

Fix the issue by:

1. Not reusing the existing MatchData from the backref, and always
   allocating a new MatchData.
2. Passing the MatchData directly to the caller using a VALUE*,
   instead of indirectly through the backref.

It's likely that variants of this issue exist for other Regexp
methods.  Anywhere that MatchData is passed implicitly through
the backref is probably vulnerable to this issue.

Fixes [Bug #17507]
2021-10-01 19:50:19 -09:00
Nobuyoshi Nakada f2cb6288bc
[Feature #18172] Add MatchData#match_length
The method to return the length of the matched substring
corresponding to the given argument.
2021-09-16 19:55:06 +09:00
Nobuyoshi Nakada 09d724e6f8
[Feature #18172] Add MatchData#match
The method to return the single matched substring corresponding to
the given argument.
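
A small sketch of `MatchData#match` together with `MatchData#match_length` from the neighbouring commit, assuming Ruby 3.1 or later. The constant `VERSION_RE` and the helper `describe_group` are hypothetical names used only for illustration:

```ruby
# Hypothetical pattern: major.minor with an optional patch component.
VERSION_RE = /(?<major>\d+)\.(?<minor>\d+)(?:\.(?<patch>\d+))?/

# Hypothetical helper: the matched substring and its length for one group.
def describe_group(match_data, group)
  [match_data.match(group), match_data.match_length(group)]
end

m = VERSION_RE.match("3.2")
p describe_group(m, 0)       # => ["3.2", 3]
p describe_group(m, :major)  # => ["3", 1]
p describe_group(m, :patch)  # => [nil, nil] because the group did not participate
```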
2021-09-16 19:55:06 +09:00
S.H b8c3a84bdd
Refactor and Using RBOOL macro 2021-09-15 08:11:05 +09:00
Nobuyoshi Nakada c5570a7c11 Extract backref_number_check 2021-09-12 11:16:51 +09:00
Nobuyoshi Nakada 99d8c4832a Preserve the encoding of the argument in IndexError [Bug #18160] 2021-09-12 11:16:51 +09:00
Martin Dürst f2ffa88964 Show default argument explicitly for Regexp#match? [ci skip] 2021-09-01 09:37:13 +09:00
Martin Dürst 45b8846bec Fix minor grammar issue in documentation of Regexp#match? [ci skip] 2021-09-01 09:24:34 +09:00
S.H 378e8cdad6
Using RBOOL macro 2021-08-02 12:06:44 +09:00
Nobuyoshi Nakada 9f3888d6a3 Warn more duplicate literal hash keys
Following non-special_const literals:
* T_REGEXP
2021-06-03 15:11:18 +09:00
S.H d627b75e01
Add static modifier to C function in re.c (#3153)
* add static modifier for rb_reg_eqq func

* add static modifier for rb_check_regexp_type func
2021-06-01 00:59:33 -07:00
Nobuyoshi Nakada 947d93b715
[DOC] {Array,MatchData}#values_at understand ranges [ci skip] 2021-02-07 10:30:43 +09:00
Marcus Stollsteimer 3108ad7bf3 [DOC] Fix grammar: "is same as" -> "is the same as" 2021-01-05 15:13:53 +01:00
Jeremy Evans 05313c914b Use category: :deprecated in warnings that are related to deprecation
Also document that both :deprecated and :experimental are supported
:category option values.

The locations where warnings were marked as deprecation warnings
was previously reviewed by shyouhei.

Comment a couple locations where deprecation warnings should probably
be used but are not currently used because deprecation warning
enablement has not occurred at the time they are called
(RUBY_FREE_MIN, RUBY_HEAP_MIN_SLOTS, -K).

Add assert_deprecated_warn to test assertions.  Use this to simplify
some tests, and fix failing tests after marking some warnings with
deprecated category.
2020-12-18 09:54:11 -08:00
Nobuyoshi Nakada 85aabef023 [Feature #17136] Remove special behavior from $KCODE 2020-11-28 18:51:36 +09:00
Koichi Sasada 7ad56fd87b freeze dynamic regexp literals
Regexp literals are frozen, and also dynamically compiled Regexp
literals (/#{expr}/) are frozen.
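
The effect can be checked with a short sketch, assuming Ruby 3.0 or later. The helper name `frozen_report` is hypothetical:

```ruby
# Hypothetical helper: report which kinds of Regexp objects are frozen.
def frozen_report(word)
  {
    literal: /abc/.frozen?,                # static literal
    interpolated: /#{word}/.frozen?,       # dynamically compiled literal
    constructed: Regexp.new(word).frozen?  # built via Regexp.new, not a literal
  }
end

p frozen_report("abc")
```

Only literals are affected; a `Regexp` created with `Regexp.new` remains unfrozen.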
2020-10-27 01:45:57 +09:00
Koichi Sasada 99310e3eb5 Some global variables can be accessed from ractors
Some global variables should be used from non-main Ractors.
[Bug #17268]

```ruby
     # ractor-local (derived from created ractor): debug
     '$DEBUG' => $DEBUG,
     '$-d' => $-d,

     # ractor-local (derived from created ractor): verbose
     '$VERBOSE' => $VERBOSE,
     '$-w' => $-w,
     '$-W' => $-W,
     '$-v' => $-v,

     # process-local (readonly): other commandline parameters
     '$-p' => $-p,
     '$-l' => $-l,
     '$-a' => $-a,

     # process-local (readonly): getpid
     '$$'  => $$,

     # thread local: process result
     '$?'  => $?,

     # scope local: match
     '$~'  => $~.inspect,
     '$&'  => $&,
     '$`'  => $`,
     '$\''  => $',
     '$+'  => $+,
     '$1'  => $1,

     # scope local: last line
     '$_' => $_,

     # scope local: last backtrace
     '$@' => $@,
     '$!' => $!,

     # ractor local: stdin, out, err
     '$stdin'  => $stdin.inspect,
     '$stdout' => $stdout.inspect,
     '$stderr' => $stderr.inspect,
```
2020-10-20 15:38:54 +09:00
Kazuhiro NISHIYAMA 1c138327e0
Try to fix compile error on windows
https://github.com/ruby/ruby/runs/1041040167?check_suite_focus=true#step:11:177
```
compiling ../src/re.c
re.c
../src/re.c(317): error C2057: expected constant expression
../src/re.c(317): error C2466: cannot allocate an array of constant size 0
../src/re.c(467): error C2057: expected constant expression
../src/re.c(467): error C2466: cannot allocate an array of constant size 0
../src/re.c(467): error C2133: 'opts': unknown size
../src/re.c(559): error C2057: expected constant expression
../src/re.c(559): error C2466: cannot allocate an array of constant size 0
../src/re.c(559): error C2133: 'optbuf': unknown size
../src/re.c(673): error C2057: expected constant expression
../src/re.c(673): error C2466: cannot allocate an array of constant size 0
../src/re.c(673): error C2133: 'opts': unknown size
NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.27.29110\bin\HostX64\x64\cl.EXE"' : return code '0x2'
Stop.
```
2020-08-28 22:03:06 +09:00
Nobuyoshi Nakada 75c4e9b72e
Named the magic number for regexp option buffer size
In `rb_enc_reg_error_desc`, the kcode option is no longer added.
2020-08-28 19:29:16 +09:00
Nobuyoshi Nakada e658040266
RSTRING_LEN was not used 2020-08-14 16:12:58 +09:00
Yusuke Endoh 4318aba9c9 re.c: prevent "warning: variable 'n' set but not used"
by adding MAYBE_UNUSED.
2020-08-14 08:51:14 +09:00
Nobuyoshi Nakada 787cb0fd86
Replace repeated RSTRING_PTR and RSTRING_LEN with RSTRING_GETMEM
Now that RSTRING_PTR and RSTRING_LEN are functions, stepping in and out
of them during debugging is a nuisance.
2020-08-13 20:56:23 +09:00
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平 d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
Kazuhiro NISHIYAMA c79e3a5957
Add {Regexp,String}#match with block to call-seq [ci skip] 2020-04-14 12:39:16 +09:00
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
Nobuyoshi Nakada bc646e6715
[DOC] get rid of parsing as TIDYLINK unintentionally 2020-04-07 13:59:38 +09:00
Nobuyoshi Nakada 4f19666e8b
`Regexp` in `MatchData` can be `nil`
`String#sub` with a string pattern defers creating a `Regexp`
until `MatchData#regexp` creates a `Regexp` from the matched
string.  `Regexp#last_match(group_name)` accessed its content
without creating the `Regexp` though.  [Bug #16508]
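
The deferred `Regexp` creation can be observed from Ruby, sketched here with a hypothetical helper named `regexp_after_string_sub`. It assumes the pattern string actually occurs in the text, so that `$~` is set:

```ruby
# Hypothetical helper: substitute with a String pattern, then ask the
# resulting MatchData for its Regexp, which is built on demand.
def regexp_after_string_sub(text, needle)
  text.sub(needle, "_")
  $~.regexp
end

re = regexp_after_string_sub("a.b and axb", "a.b")
p re.class              # => Regexp
p re.match?("a.b")      # => true
p re.match?("axb")      # => false, the "." is matched literally
```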
2020-01-16 11:32:11 +09:00
Jean Boussier 98ef38ada4 Freeze Regexp literals
[Feature #8948] [Feature #16377]

Since Regexp literals always reference the same instance,
allowing them to be mutated can lead to state leaks.
2020-01-15 10:38:47 +09:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves committers' daily life by avoiding #include-ing everything from
internal.h and making each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not to be
self-contained (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
NARUSE, Yui 8852fa8760 Revert "Regexp#match{?} with nil raises TypeError as String, Symbol (#1506)"
This reverts commit 2a22a6b2d8.
Revert [Feature #13083]
2019-12-04 06:40:54 +09:00
NARUSE, Yui 08074eb712 Revert "Revert nil error and adding deprecation message"
This reverts commit 452bee3ee8.
2019-12-04 06:40:54 +09:00
NARUSE, Yui a705f6472c Revert "Improve warning message"
This reverts commit 31110d820c.
2019-12-04 06:40:54 +09:00
Jeremy Evans ffd0820ab3 Deprecate taint/trust and related methods, and make the methods no-ops
This removes the related tests, and puts the related specs behind
version guards.  This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
2019-11-18 01:00:25 +02:00
Nobuyoshi Nakada aa94245a09
Undefine MatchData.allocate [Feature #16294] 2019-11-06 08:54:32 +09:00
Kenichi Kamiya 31110d820c Improve warning message
https://github.com/ruby/ruby/pull/2637#discussion_r341812475
2019-11-03 11:03:04 +01:00
Kenichi Kamiya 452bee3ee8 Revert nil error and adding deprecation message 2019-11-03 11:03:04 +01:00
Alan Wu c56d8deaff Mention correct class name in uninitialized error
I think this was meant to mention `MatchData`? This is a breaking change, but
should be a minor one.
2019-11-01 18:37:57 +09:00
Kenichi Kamiya 2a22a6b2d8 Regexp#match{?} with nil raises TypeError as String, Symbol (#1506)
* {String|Symbol}#match{?} with nil returns falsy

To improve consistency with Regexp#match{?}

* String#match(nil) returns `nil` instead of TypeError
* String#match?(nil) returns `false` instead of TypeError
* Symbol#match(nil) returns `nil` instead of TypeError
* Symbol#match?(nil) returns `false` instead of TypeError

* Prefer exception

* Follow empty ENV

* Drop outdated specs

* Write ruby/spec for above

https://github.com/ruby/ruby/pull/1506/files#r183242981

* Fix merge miss
2019-10-17 17:44:46 +09:00
Yusuke Endoh ebc2198d9f re.c (match_set_string): add a check for memory allocation
Found by Coverity Scan
2019-10-12 22:44:23 +09:00
卜部昌平 3df37259d8 drop-in type check for rb_define_singleton_method
We can check the function pointer passed to
rb_define_singleton_method like how we do so in rb_define_method.
Doing so revealed many arity mismatches.
2019-08-29 18:34:09 +09:00
卜部昌平 1663d347c9 delete `$` sign from C identifiers
They lack portability. See also
https://travis-ci.org/shyouhei/ruby/jobs/577164015
2019-08-27 15:52:26 +09:00
卜部昌平 ae2dc3f217 rb_define_hooked_variable now free from ANYARGS
After 5e86b005c0, I now think ANYARGS is
dangerous and should be extinct.  This commit uses rb_gvar_getter_t /
rb_gvar_setter_t for rb_define_hooked_variable /
rb_define_virtual_variable which revealed lots of function prototype
inconsistencies.  Some of them were literally decades old, going back
to dda5dc00cf.
2019-08-27 15:52:26 +09:00
Nobuyoshi Nakada 1d1f98d49c
Reuse match data
* string.c (rb_str_split_m): reuse occupied match data.  [Bug #16024]
2019-07-28 07:33:21 +09:00
Jeremy Evans 32ec6dd5c7 Document encoding of string returned by Regexp.quote [ci skip]
Also, remove documentation about returning self, which makes no
sense as self would be the Regexp class. It could be interpreted
as returning the argument if no changes were made, but that hasn't
been the behavior at least since 1.8.7 (and probably before).

Fixes [Bug #10239]
2019-07-22 14:43:36 -07:00
Lourens Naudé cf930985da Remove member char_offset_updated from struct rmatch as member char_offset_num_allocated can serve the same purpose as that predicate 2019-04-24 02:02:05 +09:00
nobu de0ef1a9df [DOC] fix markups [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67354 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-28 03:33:35 +00:00
stomar 5c1fd79f1d re.c: [DOC] fix typos
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66387 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-13 20:25:36 +00:00
kazu 069b730f96 [DOC] Fix typos [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-13 09:51:05 +00:00
naruse 3a637971a2 Enhance MatchData docs [Bug #14450]
From: Victor Shepelev <zverok.offline@gmail.com>

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66350 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-12 06:10:29 +00:00
nobu 4b85e88174 Prefer rb_check_arity when 0 or 1 arguments
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66179 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-04 02:24:15 +00:00
shyouhei 953091a4b1 char is not unsigned
It seems that decades ago, ruby was written under assumption that
char is unsigned.  Which is of course a false assumption.  We
need to explicitly store a numeric value into an unsigned char
variable to tell we expect 0..255 value.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65900 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-21 08:51:39 +00:00
shyouhei 21e1260fb9 char is neither signed nor unsigned
read_escaped_byte() returns values of range -1...256. -1 indicates
error.  So the function basically expects char to be 0..255 range.
There is no such guarantee. `char` is not always unsigned.  We
need to explicitly declare chbuf to be unsigned char.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65677 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-12 02:39:24 +00:00
kazu 4d3c254ebe Fix call-seq [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65591 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-07 05:07:56 +00:00
naruse ef69efef1d no-op if it is T_STRING
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64887 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-09-29 17:49:33 +00:00
svn 19e6af9f00 * expand tabs.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64885 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-09-29 17:49:06 +00:00
naruse 7bcc535a05 Remove unnecessary use of function pointer
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64884 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-09-29 17:49:03 +00:00
nobu 7e9ee35fb8 Remove -Wno-parentheses flag.
[Fix GH-1958]

From: Jun Aruga <jaruga@redhat.com>

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64806 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-09-21 10:19:10 +00:00
nobu 8a8f542c43 re.c: do not escape terminator in Regexp.union
* re.c (rb_reg_str_with_term): change terminator.

* re.c (rb_reg_s_union): terminator in source string does not need
  to be escaped.  terminators are outside of regexp source itself.
  [ruby-core:86149] [Bug #14608]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62779 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-03-16 13:37:44 +00:00
nobu 5fade63482 re.c: fixed escaped multibyte char
* re.c (unescape_nonascii): escaped multibyte character should be
  copied as-is, just with checking if the encoding matches.
  https://twitter.com/sakuro/status/972014409986883584

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62718 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-03-11 00:05:12 +00:00
k0kubun ed935aa5be mjit_compile.c: merge initial JIT compiler
which has been developed by Takashi Kokubun <takashikkbn@gmail> as
YARV-MJIT. Many of its bugs are fixed by wanabe <s.wanabe@gmail.com>.

This JIT compiler is designed to be a safe migration path to introduce
JIT compiler to MRI. So this commit does not include any bytecode
changes or dynamic instruction modifications, which are done in original
MJIT.

This commit even strips off some aggressive optimizations from
YARV-MJIT, and thus it's slower than YARV-MJIT too. But it's still
fairly faster than Ruby 2.5 in some benchmarks (attached below).

Note that this JIT compiler passes `make test`, `make test-all`, `make
test-spec` without JIT, and even with JIT. Not only is it perfectly safe
with JIT disabled, because unlike MJIT it does not replace VM
instructions, but with JIT enabled it also stably runs Ruby applications,
including Rails applications.

I'm expecting this version to be just an "initial" JIT compiler. I have many
optimization ideas which are skipped for initial merging, and you may
easily replace this JIT compiler with a faster one by just replacing
mjit_compile.c. `mjit_compile` interface is designed for the purpose.

common.mk: update dependencies for mjit_compile.c.

internal.h: declare `rb_vm_insn_addr2insn` for MJIT.

vm.c: exclude some definitions if `-DMJIT_HEADER` is provided to the
compiler. This avoids including some functions which take a long time
to compile, e.g. vm_exec_core. Some of the purpose is achieved in
transform_mjit_header.rb (see `IGNORED_FUNCTIONS`) but others are
manually resolved for now. Load mjit_helper.h for MJIT header.
mjit_helper.h: New. This is a file used only by JIT-ed code. I'll
refactor `mjit_call_cfunc` later.
vm_eval.c: add some #ifdef switches to skip compiling some functions
like Init_vm_eval.

win32/mkexports.rb: export thread/ec functions, which are used by MJIT.

include/ruby/defines.h: add MJIT_FUNC_EXPORTED macro alias to clarify
that a function is exported only for MJIT.

array.c: export a function used by MJIT.
bignum.c: ditto.
class.c: ditto.
compile.c: ditto.
error.c: ditto.
gc.c: ditto.
hash.c: ditto.
iseq.c: ditto.
numeric.c: ditto.
object.c: ditto.
proc.c: ditto.
re.c: ditto.
st.c: ditto.
string.c: ditto.
thread.c: ditto.
variable.c: ditto.
vm_backtrace.c: ditto.
vm_insnhelper.c: ditto.
vm_method.c: ditto.

I would like to improve maintainability of function exports, but I
believe this way is acceptable as initial merging if we clarify the
new exports are for MJIT (so that we can use them as TODO list to fix)
and add unit tests to detect unresolved symbols.
I'll add unit tests of JIT compilations in succeeding commits.

Author: Takashi Kokubun <takashikkbn@gmail.com>
Contributor: wanabe <s.wanabe@gmail.com>

Part of [Feature #14235]

---

* Known issues
  * Code generated by gcc is faster than clang. The benchmark may be worse
    in macOS. Following benchmark result is provided by gcc w/ Linux.
  * Performance is decreased when Google Chrome is running
  * JIT can work on MinGW, but it doesn't improve performance at least
    in short running benchmark.
  * Currently it doesn't perform well with Rails. We'll try to fix this
    before release.

---

* Benchmark results

Benchmarked with:
Intel 4.0GHz i7-4790K with 16GB memory under x86-64 Ubuntu 8 Cores

- 2.0.0-p0: Ruby 2.0.0-p0
- r62186: Ruby trunk (early 2.6.0), before MJIT changes
- JIT off: On this commit, but without `--jit` option
- JIT on: On this commit, and with `--jit` option

** Optcarrot fps

Benchmark: https://github.com/mame/optcarrot

|         |2.0.0-p0 |r62186   |JIT off  |JIT on   |
|:--------|:--------|:--------|:--------|:--------|
|fps      |37.32    |51.46    |51.31    |58.88    |
|vs 2.0.0 |1.00x    |1.38x    |1.37x    |1.58x    |

** MJIT benchmarks

Benchmark: https://github.com/benchmark-driver/mjit-benchmarks
(Original: https://github.com/vnmakarov/ruby/tree/rtl_mjit_branch/MJIT-benchmarks)

|           |2.0.0-p0 |r62186   |JIT off  |JIT on   |
|:----------|:--------|:--------|:--------|:--------|
|aread      |1.00     |1.09     |1.07     |2.19     |
|aref       |1.00     |1.13     |1.11     |2.22     |
|aset       |1.00     |1.50     |1.45     |2.64     |
|awrite     |1.00     |1.17     |1.13     |2.20     |
|call       |1.00     |1.29     |1.26     |2.02     |
|const2     |1.00     |1.10     |1.10     |2.19     |
|const      |1.00     |1.11     |1.10     |2.19     |
|fannk      |1.00     |1.04     |1.02     |1.00     |
|fib        |1.00     |1.32     |1.31     |1.84     |
|ivread     |1.00     |1.13     |1.12     |2.43     |
|ivwrite    |1.00     |1.23     |1.21     |2.40     |
|mandelbrot |1.00     |1.13     |1.16     |1.28     |
|meteor     |1.00     |2.97     |2.92     |3.17     |
|nbody      |1.00     |1.17     |1.15     |1.49     |
|nest-ntimes|1.00     |1.22     |1.20     |1.39     |
|nest-while |1.00     |1.10     |1.10     |1.37     |
|norm       |1.00     |1.18     |1.16     |1.24     |
|nsvb       |1.00     |1.16     |1.16     |1.17     |
|red-black  |1.00     |1.02     |0.99     |1.12     |
|sieve      |1.00     |1.30     |1.28     |1.62     |
|trees      |1.00     |1.14     |1.13     |1.19     |
|while      |1.00     |1.12     |1.11     |2.41     |

** Discourse's script/bench.rb

Benchmark: https://github.com/discourse/discourse/blob/v1.8.7/script/bench.rb

NOTE: Rails performance was somehow a little degraded with JIT for now.
We should fix this.
(At least I know opt_aref is performing badly in JIT and I have an idea
 to fix it. Please wait for the fix.)

*** JIT off
Your Results: (note for timings- percentile is first, duration is second in millisecs)

categories_admin:
  50: 17
  75: 18
  90: 22
  99: 29
home_admin:
  50: 21
  75: 21
  90: 27
  99: 40
topic_admin:
  50: 17
  75: 18
  90: 22
  99: 32
categories:
  50: 35
  75: 41
  90: 43
  99: 77
home:
  50: 39
  75: 46
  90: 49
  99: 95
topic:
  50: 46
  75: 52
  90: 56
  99: 101

*** JIT on
Your Results: (note for timings- percentile is first, duration is second in millisecs)

categories_admin:
  50: 19
  75: 21
  90: 25
  99: 33
home_admin:
  50: 24
  75: 26
  90: 30
  99: 35
topic_admin:
  50: 19
  75: 20
  90: 25
  99: 30
categories:
  50: 40
  75: 44
  90: 48
  99: 76
home:
  50: 42
  75: 48
  90: 51
  99: 89
topic:
  50: 49
  75: 55
  90: 58
  99: 99

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-02-04 11:22:28 +00:00
shyouhei cdff88b8b4 rb_reg_raise_str marked as NORETURN
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61920 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-01-18 09:44:42 +00:00
shyouhei 8691515246 rb_enc_reg_raise marked as NORETURN
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61919 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-01-18 09:44:41 +00:00
shyouhei 8bc3615950 rb_reg_enc_error marked as NORETURN
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61918 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-01-18 09:44:41 +00:00
shyouhei f41b1d07ab rb_reg_raise marked as NORETURN
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61917 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-01-18 09:44:40 +00:00