Граф коммитов

1538 Коммитов

Автор SHA1 Сообщение Дата
Nobuyoshi Nakada d5c33364e3
Fixed heap-use-after-free
* string.c (rb_str_sub_bang): retrieves a pointer to the
  replacement string buffer just before using it, for the case of
  replacement with the receiver string itself.  [Bug #16105]
2019-08-15 23:39:14 +09:00
git 132b7eb104 * expand tabs. [ci skip] 2019-08-15 08:16:14 +09:00
Jeremy Evans 082424ef58 Fold to lowercase intead of uppercase for String#casecmp
strcasecmp(3) and String#casecmp? both fold to lowercase.
2019-08-14 14:11:39 -07:00
Aaron Patterson 957bdfbab8
Update docs to use more natural English
Just a few updates to make the English sound a bit more natural
2019-08-12 12:21:37 -04:00
Yusuke Endoh 8d302c914c string.c (rb_str_sub, _gsub): improve the rdoc
This change:

* Added an explanation about back references except \n and \k<n>
  (\` \& \' \+ \0)
* Added an explanation about an escape (\\)
* Added some rdoc references
* Rephrased and clarified the reason why double escape is needed, added
  some examples, and moved the note to the last (because it is not
  specific to the method itself).
2019-08-12 23:28:35 +09:00
卜部昌平 b5146e375a
leafify opt_plus
Inspired by 346aa557b3

Closes: https://github.com/ruby/ruby/pull/2321
2019-08-06 20:59:19 +09:00
Takashi Kokubun 346aa557b3
Make opt_eq and opt_neq insns leaf
# Benchmark zero?

```
require 'benchmark/ips'

Numeric.class_eval do
  def ruby_zero?
    self == 0
  end
end

Benchmark.ips do |x|
  x.report('0.zero?') { 0.ruby_zero? }
  x.report('1.zero?') { 1.ruby_zero? }
  x.compare!
end
```

## VM
No significant impact for VM.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) [x86_64-linux]

  0.zero?: 21855445.5 i/s
  1.zero?: 21770817.3 i/s - same-ish: difference falls within error

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) [x86_64-linux]

  1.zero?: 21958912.3 i/s
  0.zero?: 21881625.9 i/s - same-ish: difference falls within error

## JIT
The performance improves about 1.23x.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) +JIT [x86_64-linux]

  0.zero?: 36343111.6 i/s
  1.zero?: 36295153.3 i/s - same-ish: difference falls within error

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) +JIT [x86_64-linux]

  0.zero?: 44740467.2 i/s
  1.zero?: 44363616.1 i/s - same-ish: difference falls within error

# Benchmark str == str / str != str

```
# frozen_string_literal: true
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report('a == a') { 'a' == 'a' }
  x.report('a == b') { 'a' == 'b' }
  x.report('a != a') { 'a' != 'a' }
  x.report('a != b') { 'a' != 'b' }
  x.compare!
end
```

## VM
No significant impact for VM.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) [x86_64-linux]

  a == a: 27286219.0 i/s
  a != a: 24892389.5 i/s - 1.10x  slower
  a == b: 23623635.8 i/s - 1.16x  slower
  a != b: 21800958.0 i/s - 1.25x  slower

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) [x86_64-linux]

  a == a: 27224016.2 i/s
  a != a: 24490109.5 i/s - 1.11x  slower
  a == b: 23391052.4 i/s - 1.16x  slower
  a != b: 21811321.7 i/s - 1.25x  slower

## JIT
The performance improves on JIT a little.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) +JIT [x86_64-linux]

  a == a: 42010674.7 i/s
  a != a: 38920311.2 i/s - same-ish: difference falls within error
  a == b: 32574262.2 i/s - 1.29x  slower
  a != b: 32099790.3 i/s - 1.31x  slower

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) +JIT [x86_64-linux]

  a == a: 46902738.8 i/s
  a != a: 43097258.6 i/s - 1.09x  slower
  a == b: 35822018.4 i/s - 1.31x  slower
  a != b: 33377257.8 i/s - 1.41x  slower

This is needed towards Bug#15589.
Closes: https://github.com/ruby/ruby/pull/2318
2019-08-04 22:20:12 +09:00
Nobuyoshi Nakada 1d1f98d49c
Reuse match data
* string.c (rb_str_split_m): reuse occupied match data.  [Bug #16024]
2019-07-28 07:33:21 +09:00
Nobuyoshi Nakada f1b76ea63c
Occupy match data
* string.c (rb_str_split_m): occupy match data not to be modified
  during yielding the block.  [Bug #16024]
2019-07-27 21:54:34 +09:00
Yusuke Endoh 43c337dfc1 string.c (str_succ): refactoring
Use more communicative variable name
2019-07-14 23:09:24 +09:00
Yusuke Endoh 3fd086ed56 string.c (str_succ): remove a unnecessary assignment
This change will suppress Coverity Scan warnings
2019-07-14 23:09:24 +09:00
git 9987296b8b * expand tabs. 2019-07-14 17:16:35 +09:00
Yusuke Endoh 934e6b2aeb Prefer `rb_error_arity` to `rb_check_arity` when it can be used 2019-07-14 17:16:19 +09:00
Jeremy Evans 0f283054e7 Check that String#scrub block does not modify receiver
Similar to the check used for String#gsub.  Can fix possible
segfault.

Fixes [Bug #15941]
2019-07-02 08:34:01 -07:00
Jeremy Evans 7582287eb2 Make String#-@ not freeze receiver if called on unfrozen subclass instance
rb_fstring behavior in this case is to freeze the receiver.  I'm
not sure if that should be changed, so this takes the conservative
approach of duping the receiver in String#-@ before passing
to rb_fstring.

Fixes [Bug #15926]
2019-07-02 08:26:50 -07:00
git a88107c44d * expand tabs. 2019-06-29 10:17:37 +09:00
Nobuyoshi Nakada 2f6cc15cdb
Fixed String#grapheme_clusters with wide encodings
* string.c (get_reg_grapheme_cluster): make regexp from properly
  encoded sources fro wide-char encodings.  [Bug #15965]

* regparse.c (node_extended_grapheme_cluster): suppress false
  duplicated range warning for the time being.
2019-06-29 10:10:17 +09:00
John Hawthorn 04bc4c0662 Resize capacity for fstring
When a string is #frozen, it's capacity is resized to fit (if it is much
larger), since we know it will no longer be mutated.

    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "bytesize":30, "capacity":1000, "value":"...
    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000).freeze)
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "bytesize":30, "value":"...

(ObjectSpace.dump doesn't show capacity if capacity is equal to bytesize)

Previously, if we dedup into an fstring, using String#-@, capacity would
not be reduced.

    > puts ObjectSpace.dump(-String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "fstring":true, "bytesize":30, "capacity":1000, "value":"...

This commit makes rb_fstring call rb_str_resize, the same as
rb_str_freeze does.

Closes: https://github.com/ruby/ruby/pull/2256
2019-06-26 15:01:48 +09:00
git 551ef27490 * expand tabs. 2019-06-21 22:48:50 +09:00
Nobuyoshi Nakada 8f51da5d41
Get rid of undefined behavior
* string.c (rb_str_sub_bang): str and repl can be same.
  [Bug #15946]
2019-06-21 22:42:35 +09:00
Nobuyoshi Nakada 8797f48373
New buffer for shared string
* string.c (rb_str_init): allocate new buffer if the string is
  shared.  [Bug #15937]
2019-06-19 14:39:19 +09:00
Nobuyoshi Nakada 28678997e4
Preserve the string content at self-copying
* string.c (rb_str_init): preserve the embedded content when
  self-copying with a capacity.  [Bug #15937]
2019-06-19 09:44:26 +09:00
Nobuyoshi Nakada 8b3774be3d
Fix memory leak
* string.c (str_make_independent_expand): free independent buffer.
  [Bug# 15935]

Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
2019-06-18 13:40:04 +09:00
git c770c98ac4 * expand tabs. 2019-06-18 12:21:38 +09:00
Alan Wu 9dec4e8fc3
String#b: Don't depend on dependent string
Registering a string that depend on a dependent string as fstring
can lead to use-after-free. See c06ddfe and 3f95620 for details.

The following script triggers use-after-free on trunk, 2.4.6, 2.5.5
and 2.6.3. Credits to @wanabe for using eval as a cross-version way
of registering a fstring.

```ruby
a = ('j' * 24).b.b
eval('', binding, a)

p a
4.times { GC.start }
p a
```

 - string.c (str_replace_shared_without_enc): when given a
   dependent string, depend on the root of the dependent
   string.

[Bug #15934]
2019-06-18 12:18:13 +09:00
Nobuyoshi Nakada 53e9908d8a
Fix memory leak
* string.c (str_replace_shared_without_enc): free previous buffer
  before replaced.

* parse.y (gettable): make sure in advance that the `__FILE__`
  object shares a fstring, to get rid of replacement with the
  fstring later.
  TODO: this hack may be needed in other places.

[Bug #15916]

Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
2019-06-16 23:51:22 +09:00
Nobuyoshi Nakada d2003a6d39
Symbol just represents a name 2019-05-14 00:30:08 +09:00
Alan Wu c06ddfee87
str_duplicate: Don't share with a frozen shared string
This is a follow up for 3f9562015e.
Before this commit, it was possible to create a shared string which
shares with another shared string by passing a frozen shared string
to `str_duplicate`.

Such string looks like:

```
 --------                    -----------------
 | root | ------ owns -----> | root's buffer |
 --------                    -----------------
     ^                             ^   ^
 -----------                       |   |
 | shared1 | ------ references -----   |
 -----------                           |
     ^                                 |
 -----------                           |
 | shared2 | ------ references ---------
 -----------
```

This is bad news because `rb_fstring(shared2)` can make `shared1`
independent, which severs the reference from `shared1` to `root`:

```c
/* from fstr_update_callback() */
str = str_new_frozen(rb_cString, shared2);  /* can return shared1 */
if (STR_SHARED_P(str)) { /* shared1 is also a shared string */
    str_make_independent(str);  /* no frozen check */
}
```

If `shared1` was the only reference to `root`, then `root` can be
reclaimed by the GC, leaving `shared2` in a corrupted state:

```
 -----------                         --------------------
 | shared1 | -------- owns --------> | shared1's buffer |
 -----------                         --------------------
      ^
      |
 -----------                         -------------------------
 | shared2 | ------ references ----> | root's buffer (freed) |
 -----------                         -------------------------
```

Here is a reproduction script for the situation this commit fixes.

```ruby
a = ('a' * 24).strip.freeze.strip
-a
p a
4.times { GC.start }
p a
```

 - string.c (str_duplicate): always share with the root string when
   the original is a shared string.
 - test_rb_str_dup.rb: specifically test `rb_str_dup` to make
   sure it does not try to share with a shared string.

[Bug #15792]

Closes: https://github.com/ruby/ruby/pull/2159
2019-05-09 10:04:19 +09:00
Nobuyoshi Nakada f1b0db2c70
Revert "UTF-8 is one of byte based encodings"
This reverts commit 5776ae3475.

Mistaken `max` as `min`.
2019-05-06 11:02:12 +09:00
Marcus Stollsteimer 35ff4ed47f Improve documentation for String#{dump,undump} 2019-05-05 09:51:40 +02:00
git 04fd98d596 * expand tabs. 2019-05-03 23:59:58 +09:00
Nobuyoshi Nakada 77440e949b
Improve performance of case-conversion methods 2019-05-03 23:59:18 +09:00
Nobuyoshi Nakada 5776ae3475
UTF-8 is one of byte based encodings 2019-05-03 15:33:59 +09:00
git 5c87bb3b90 * expand tabs. 2019-05-02 22:44:43 +09:00
Nobuyoshi Nakada 5e23b1138f
Fix potential memory leak 2019-05-02 22:44:20 +09:00
Urabe, Shyouhei f4c68640d6 this variable is not guaranteed aligned
No problem for unaligned-ness because we never dereference.
2019-04-29 21:52:44 +09:00
Urabe, Shyouhei 7c0f513e97 fix typo 2019-04-29 21:52:44 +09:00
Nobuyoshi Nakada 3f9562015e
Get rid of indirect sharing
* string.c (str_duplicate): share the root shared string if the
  original string is already sharing, so that all shared strings
  refer the root shared string directly.  indirect sharing can
  cause a dangling pointer.

[Bug #15792]
2019-04-27 21:26:42 +09:00
nobu 4d1f86a1ff string.c: warn non-nil $;
* string.c (rb_str_split_m): warn use of non-nil $;.

* string.c (rb_fs_setter): warn when set to non-nil value.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67603 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-18 09:34:40 +00:00
nobu e1eb54b99d string.c: improve splitting into chars
* string.c (rb_str_split_m): improve splitting into chars by an
  empty string, without a regexp.

    Comparison:
                           to_chars-1
              built-ruby:   1273527.6 i/s
            compare-ruby:    189423.3 i/s - 6.72x  slower

                          to_chars-10
              built-ruby:    120993.5 i/s
            compare-ruby:     37075.8 i/s - 3.26x  slower

                         to_chars-100
              built-ruby:     15646.4 i/s
            compare-ruby:      4012.1 i/s - 3.90x  slower

                        to_chars-1000
              built-ruby:      1295.1 i/s
            compare-ruby:       408.5 i/s - 3.17x  slower

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-17 05:34:46 +00:00
nobu 46968fab0a string.c: [DOC] fix reference to sprintf [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67312 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-20 01:35:27 +00:00
nobu 8b49e5b47d string.c: [DOC] remove unnecessary markups [ci skip]
* string.c: remove <code> markups, which are not only unnecessary
  but also prevented cross-references.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67311 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-20 01:31:44 +00:00
nobu a265141c84 string.c: [DOC] fix indent [ci skip]
* string.c (rb_str_crypt): fix indent not to make the whole list
  verbatim entirely.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-20 01:17:16 +00:00
nobu 593505ac6f string.c: respect the actual encoding
* string.c (rb_enc_str_coderange): respect the actual encoding of
  if a BOM presents, and scan for the actual code range.
  [ruby-core:91662] [Bug #15635]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67167 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-05 00:32:15 +00:00
nobu fb84b86be0 * string.c (chopped_length): early return for empty strings
[Bug #11391]

From: Franck Verrot <franck@verrot.fr>

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-02-07 07:39:47 +00:00
kazu bdbc8a8f12 Add more example of `String#dump`
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66906 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-22 12:43:57 +00:00
samuel 502f159421 Improvements to documentation.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66897 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-21 10:56:29 +00:00
mame 6891a1cd5d string.c (rb_str_dump): Fix the rdoc
* Officially states that String#dump is intended for round-trip.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66894 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-21 08:51:51 +00:00
nobu d7976d1451 Use `&` instead of `modulo`
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66830 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-15 12:05:46 +00:00
shyouhei d154bec0d5 setbyte / ungetbyte allow out-of-range integers
* string.c: String#setbyte to accept arbitrary integers [Bug #15460]

* io.c: ditto for IO#ungetbyte

* ext/strringio/stringio.c: ditto for StringIO#ungetbyte



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66824 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-15 06:41:58 +00:00