Граф коммитов

1653 Коммитов

Автор SHA1 Сообщение Дата
Koichi Sasada e81d7189a0 sync fstring pool
fstring pool should be sync with other Ractors.
2020-09-15 00:04:59 +09:00
Soutaro Matsumoto f0ddbd502c
Let String#slice! return nil (#3533)
Returns `nil` instead of an empty string when non-integer number is given (to make it 2.7 compatible).
2020-09-11 14:34:10 +09:00
Nobuyoshi Nakada eb67c603ca
Added Symbol#name
https://bugs.ruby-lang.org/issues/16150#change-87446
2020-09-04 22:18:59 +09:00
Burdette Lamar 51525557fd
Partial compliance with doc/method_documentation.rdoc in string.c (#3436)
Removes references to *-convertible thingies.
2020-08-20 12:09:49 -05:00
Jean Boussier aaf0e33c0a register_fstring: avoid duping the passed string when possible
If the passed string is frozen, bare and not shared, then there
is no need to duplicate it.

Ref: 4ab69ebbd7
Ref: https://bugs.ruby-lang.org/issues/11386
2020-08-19 08:08:56 -07:00
Nobuyoshi Nakada d75433ae19
[DOC] fixed a missing markup 2020-08-15 14:17:02 +09:00
Kasumi Hanazuki 014a4fda54 rb_str_{index,rindex}_m: Handle /\K/ in pattern
When the pattern Regexp given to String#index and String#rindex
contain a /\K/ (lookbehind) operator, these methods return the
position where the beginning of the lookbehind pattern matches, while
they are expected to return the position where the \K matches.

```
# without patch
"abcdbce".index(/b\Kc/)  # => 1
"abcdbce".rindex(/b\Kc/)  # => 4
```

This patch fixes this problem by using BEG(0) instead of the return
value of rb_reg_search.

```
# with patch
"abcdbce".index(/b\Kc/)  # => 2
"abcdbce".rindex(/b\Kc/)  # => 5
```

Fixes [Bug #17118]
2020-08-13 20:54:12 +09:00
Kasumi Hanazuki 5d71eed1a7 rb_str_{partition,rpartition}_m: Handle /\K/ in pattern
When the pattern given to String#partition and String#rpartition
contain a /\K/ (lookbehind) operator, the methods return strings
sliced at incorrect positions.

```
# without patch
"abcdbce".partition(/b\Kc/)  # => ["a", "c", "cdbce"]
"abcdbce".rpartition(/b\Kc/)  # => ["abcd", "c", "ce"]
```

This patch fixes the problem by using BEG(0) instead of the return
value of rb_reg_search.

```
# with patch
"abcdbce".partition(/b\Kc/)  # => ["ab", "c", "dbce"]
"abcdbce".rpartition(/b\Kc/)  # => ["abcdb", "c", "e"]
```

As a side-effect this patch makes String#partition 2x faster when the
pattern is a costly Regexp by performing Regexp search only once,
which was unexpectedly done twice in the original implementation.

Fixes [Bug #17119]
2020-08-13 20:50:50 +09:00
Kasumi Hanazuki e79cdcf61b string.c(rb_str_split_m): Handle /\K/ correctly
Use BEG(0) instead of the result of rb_reg_search to handle the cases
when the separator Regexp contains /\K/ (lookbehind) operator.

Fixes [Bug #17113]
2020-08-12 10:01:39 +09:00
Nobuyoshi Nakada 0ca6b973e8
Removed non-ASCII code to suppress warnings by localized compilers 2020-08-10 19:46:13 +09:00
Nobuyoshi Nakada fac62f094e
Adjust indent 2020-08-10 16:35:42 +09:00
Kazuhiro NISHIYAMA 946cd6c534
Use https instead of http 2020-07-28 19:51:54 +09:00
卜部昌平 de3e931df7 add UNREACHABLE_RETURN
Not every compilers understand that rb_raise does not return.  When a
function does not end with a return statement, such compilers can issue
warnings.  We would better tell them about reachabilities.
2020-06-29 11:05:41 +09:00
卜部昌平 5f926b2b00 rb_str_partition: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 e3d821a36c rb_str_crypt: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 a5ae9aebbc trnext: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 c7a4073154 chompped_length: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 fdae2063fb get_pat_quoted: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 673ddea934 get_pat: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 31e5d138d7 rb_str_slice_bang: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 19f2cabed8 rb_str_aset: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 841eea4bcb rb_str_subpat_set: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 0358846f8c rb_str_update: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 d49924ed81 rb_str_match: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 c422fc4bbc rb_str_rindex_m: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 c29ec1ef1a rb_str_index_m: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 6df790f22e rb_enc_cr_str_buf_cat: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
S-H-GAMELINKS 0fcb2dd51d
add static modifier for rb_str_ord func 2020-05-27 11:26:44 +09:00
Kazuhiro NISHIYAMA fa7addebb4
Fix typos [ci skip] 2020-05-17 21:01:29 +09:00
Burdette Lamar cc525d764b
[ci skip] Enhanced rdoc for String.new (#3067)
* Per @nobu review

* Enhanced rdoc for String.new

* Respond to review
2020-05-15 14:14:50 -07:00
Nobuyoshi Nakada 693f7ab315 Optimize String#split
Optimized `String#split` with `/ /` (single space regexp) as
simple string splitting.  [ruby-core:98272]

|               |compare-ruby|built-ruby|
|:--------------|-----------:|---------:|
|re_space-1     |    432.786k|    1.539M|
|               |           -|     3.56x|
|re_space-10    |     76.231k|  191.547k|
|               |           -|     2.51x|
|re_space-100   |      8.152k|   19.557k|
|               |           -|     2.40x|
|re_space-1000  |     837.405|    2.022k|
|               |           -|     2.41x|

ruby-core:98272: https://bugs.ruby-lang.org/issues/15771#change-85511
2020-05-12 19:58:58 +09:00
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平 d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
S.H 034b8472ba
remove unused rb_str_clear define (#3059) 2020-04-25 20:39:44 -07:00
Nobuyoshi Nakada d4215dafea
Use UNREACHABLE_RETURN for non-void function 2020-04-16 17:59:55 +09:00
Kazuhiro NISHIYAMA c79e3a5957
Add {Regexp,String}#match with block to call-seq [ci skip] 2020-04-14 12:39:16 +09:00
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
Nobuyoshi Nakada 8a7e0aaaef
Warn non-nil `$/` [Feature #14240] 2020-02-23 13:37:40 +09:00
Nobuyoshi Nakada fce667ed08
Get rid of warnings/exceptions at cleanup
After the encoding index instance variable is removed when all
instance variables are removed in `obj_free`, then `rb_str_free`
causes uninitialized instance variable warning and nil-to-integer
conversion exception.  Both cases result in object allocation
during GC, and crashes.
2020-02-13 12:46:48 +09:00
Nobuyoshi Nakada 160d3165eb
Copy non-inlined encoding index 2020-02-12 20:09:57 +09:00
Nobuyoshi Nakada bdf3032e35
Make temporary lock string encoding free
As a temporary lock string is hidden, it can not have instance
variables, including non-inlined encoding index.
2020-02-12 19:58:22 +09:00
Nobuyoshi Nakada 05229cef45
Improve `String#slice!` performance
Instead of searching twice to extract and to delete, extract and
delete the found position at the first search.

This makes faster nearly twice, for regexps and strings.

|              |compare-ruby|built-ruby|
|:-------------|-----------:|---------:|
|regexp-short  |      2.143M|    3.918M|
|regexp-long   |    105.162k|  205.410k|
|string-short  |      3.789M|    7.964M|
|string-long   |      1.301M|    2.457M|
2020-01-31 17:12:05 +09:00
Nobuyoshi Nakada 0dd6f020fc
Make `empty_string` a fake string 2020-01-31 14:24:07 +09:00
Jean Boussier 52dc0632fa Avoid allocating a temporary empty string in String#slice! 2020-01-31 09:22:20 +09:00
Nobuyoshi Nakada aefb13eb63
Added rb_warn_deprecated_to_remove
Warn the deprecation and future removal, with obeying the warning
flag.
2020-01-23 21:42:15 +09:00
Jeremy Evans e18b817b1f Make taint warnings non-verbose instead of verbose 2020-01-22 11:19:13 -08:00
Nobuyoshi Nakada fce54a5404
Fix `String#partition`
Split with the matched part when the separator matches the empty
part at the beginning.  [Bug #11014]
2020-01-16 15:36:38 +09:00
Marcus Stollsteimer 1d09acd82b [DOC] Improve docs for String#match
Fix invalid code to make it syntax highlighted; other small fixes.
2020-01-08 20:53:31 +01:00
Marcus Stollsteimer f74021e12b Improve docs for String#=~
Move existing example to the corresponding paragraph and
add an example for `string =~ regexp` vs. `regexp =~ string`;
avoid using the receiver's identifier from the call-seq
because it does not appear in rendered HTML docs;
mention deprecation of Object#=~; fix some markup and typos.
2020-01-08 20:47:10 +01:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
Nobuyoshi Nakada 2b2030f265
Refined the warning message for $, and $;
[Bug #16438]
2019-12-20 15:09:23 +09:00
NARUSE, Yui b5fbefbf2c Added Symbol#start_with? and Symbol#end_with? method. [Feature #16348] 2019-11-28 23:49:28 +09:00
卜部昌平 7a9b2039b7 delete unused codes
Suppress compiler warnings.
2019-11-18 18:28:03 +09:00
Nobuyoshi Nakada 22c9504905
rb_tainted_str_new_with_enc is no longer used 2019-11-18 11:05:22 +09:00
Jeremy Evans ffd0820ab3 Deprecate taint/trust and related methods, and make the methods no-ops
This removes the related tests, and puts the related specs behind
version guards.  This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
2019-11-18 01:00:25 +02:00
卜部昌平 c9ffe751d1 delete unused functions
Looking at the list of symbols inside of libruby-static.a, I found
hundreds of functions that are defined, but used from nowhere.

There can be reasons for each of them (e.g. some functions are
specific to some platform, some are useful when debugging, etc).
However it seems the functions deleted here exist for no reason.

This changeset reduces the size of ruby binary from 26,671,456
bytes to 26,592,864 bytes on my machine.
2019-11-14 20:35:48 +09:00
NARUSE, Yui bea322a352 Revert "[EXPERIMENTAL] Make Symbol#to_s return a frozen String [Feature #16150]"
This reverts commit 6ffc045a81.
2019-11-05 17:30:54 +09:00
zverok bddb31bb37 Documentation improvements for Ruby core
* Top-level `return`;
* Documentation for comments syntax;
* `rescue` inside blocks;
* Enhance `Object#to_enum` docs;
* Make `chomp:` option more obvious for `String#each_line` and
  `#lines`;
* Enhance `Proc#>>` and `#<<` docs;
* Enhance `Processs` class docs.
2019-10-26 14:58:08 +09:00
Lourens Naudé 9c24ce551d Reduce the minimum string buffer size from 127 to 63 bytes 2019-10-11 11:16:16 +09:00
卜部昌平 7e0ae1698d avoid overflow in integer multiplication
This changeset basically replaces `ruby_xmalloc(x * y)` into
`ruby_xmalloc2(x, y)`.  Some convenient functions are also
provided for instance `rb_xmalloc_mul_add(x, y, z)` which allocates
x * y + z byes.
2019-10-09 12:12:28 +09:00
Benoit Daloze 6ffc045a81 [EXPERIMENTAL] Make Symbol#to_s return a frozen String
* Always the same frozen String for a given Symbol.
* Avoids extra allocations whenever calling Symbol#to_s.
* See [Feature #16150]
2019-09-26 10:23:02 +02:00
Alan Wu 47a234954a Rename STR_IS_SHARED_M to STR_BORROWED
Since the introduction of STR_SHARED_ROOT, the word "shared"
has become very overloaded with respect to String's internal
states. Use a different name for STR_IS_SHARED_M and explain
its purpose.
2019-09-26 15:30:18 +09:00
Alan Wu 93faa011d3 Tag string shared roots to fix use-after-free
The buffer deduplication codepath in rb_fstring can be used to free the buffer
of shared string roots, which leads to use-after-free.

Introudce a new flag to tag strings that at one point have been a shared root.
Check for it in rb_fstring to avoid freeing buffers that are shared by
multiple strings. This change is based on nobu's idea in [ruby-core:94838].

The included test case test for the sequence of calls to internal functions
that lead to this bug. See attached ticket for Ruby level repros.

[Bug #16151]
2019-09-26 15:30:18 +09:00
Jeremy Evans 6f9b86616a Make Symbol#to_proc calls handle keyword arguments
Make rb_sym_proc_call take a flag for whether a keyword argument
is used, and use the new rb_funcall_with_block_kw function to
pass that information.
2019-09-05 17:47:12 -07:00
卜部昌平 3df37259d8 drop-in type check for rb_define_singleton_method
We can check the function pointer passed to
rb_define_singleton_method like how we do so in rb_define_method.
Doing so revealed many arity mismatches.
2019-08-29 18:34:09 +09:00
Nobuyoshi Nakada d5c33364e3
Fixed heap-use-after-free
* string.c (rb_str_sub_bang): retrieves a pointer to the
  replacement string buffer just before using it, for the case of
  replacement with the receiver string itself.  [Bug #16105]
2019-08-15 23:39:14 +09:00
git 132b7eb104 * expand tabs. [ci skip] 2019-08-15 08:16:14 +09:00
Jeremy Evans 082424ef58 Fold to lowercase intead of uppercase for String#casecmp
strcasecmp(3) and String#casecmp? both fold to lowercase.
2019-08-14 14:11:39 -07:00
Aaron Patterson 957bdfbab8
Update docs to use more natural English
Just a few updates to make the English sound a bit more natural
2019-08-12 12:21:37 -04:00
Yusuke Endoh 8d302c914c string.c (rb_str_sub, _gsub): improve the rdoc
This change:

* Added an explanation about back references except \n and \k<n>
  (\` \& \' \+ \0)
* Added an explanation about an escape (\\)
* Added some rdoc references
* Rephrased and clarified the reason why double escape is needed, added
  some examples, and moved the note to the last (because it is not
  specific to the method itself).
2019-08-12 23:28:35 +09:00
卜部昌平 b5146e375a
leafify opt_plus
Inspired by 346aa557b3

Closes: https://github.com/ruby/ruby/pull/2321
2019-08-06 20:59:19 +09:00
Takashi Kokubun 346aa557b3
Make opt_eq and opt_neq insns leaf
# Benchmark zero?

```
require 'benchmark/ips'

Numeric.class_eval do
  def ruby_zero?
    self == 0
  end
end

Benchmark.ips do |x|
  x.report('0.zero?') { 0.ruby_zero? }
  x.report('1.zero?') { 1.ruby_zero? }
  x.compare!
end
```

## VM
No significant impact for VM.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) [x86_64-linux]

  0.zero?: 21855445.5 i/s
  1.zero?: 21770817.3 i/s - same-ish: difference falls within error

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) [x86_64-linux]

  1.zero?: 21958912.3 i/s
  0.zero?: 21881625.9 i/s - same-ish: difference falls within error

## JIT
The performance improves about 1.23x.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) +JIT [x86_64-linux]

  0.zero?: 36343111.6 i/s
  1.zero?: 36295153.3 i/s - same-ish: difference falls within error

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) +JIT [x86_64-linux]

  0.zero?: 44740467.2 i/s
  1.zero?: 44363616.1 i/s - same-ish: difference falls within error

# Benchmark str == str / str != str

```
# frozen_string_literal: true
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report('a == a') { 'a' == 'a' }
  x.report('a == b') { 'a' == 'b' }
  x.report('a != a') { 'a' != 'a' }
  x.report('a != b') { 'a' != 'b' }
  x.compare!
end
```

## VM
No significant impact for VM.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) [x86_64-linux]

  a == a: 27286219.0 i/s
  a != a: 24892389.5 i/s - 1.10x  slower
  a == b: 23623635.8 i/s - 1.16x  slower
  a != b: 21800958.0 i/s - 1.25x  slower

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) [x86_64-linux]

  a == a: 27224016.2 i/s
  a != a: 24490109.5 i/s - 1.11x  slower
  a == b: 23391052.4 i/s - 1.16x  slower
  a != b: 21811321.7 i/s - 1.25x  slower

## JIT
The performance improves on JIT a little.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) +JIT [x86_64-linux]

  a == a: 42010674.7 i/s
  a != a: 38920311.2 i/s - same-ish: difference falls within error
  a == b: 32574262.2 i/s - 1.29x  slower
  a != b: 32099790.3 i/s - 1.31x  slower

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) +JIT [x86_64-linux]

  a == a: 46902738.8 i/s
  a != a: 43097258.6 i/s - 1.09x  slower
  a == b: 35822018.4 i/s - 1.31x  slower
  a != b: 33377257.8 i/s - 1.41x  slower

This is needed towards Bug#15589.
Closes: https://github.com/ruby/ruby/pull/2318
2019-08-04 22:20:12 +09:00
Nobuyoshi Nakada 1d1f98d49c
Reuse match data
* string.c (rb_str_split_m): reuse occupied match data.  [Bug #16024]
2019-07-28 07:33:21 +09:00
Nobuyoshi Nakada f1b76ea63c
Occupy match data
* string.c (rb_str_split_m): occupy match data not to be modified
  during yielding the block.  [Bug #16024]
2019-07-27 21:54:34 +09:00
Yusuke Endoh 43c337dfc1 string.c (str_succ): refactoring
Use more communicative variable name
2019-07-14 23:09:24 +09:00
Yusuke Endoh 3fd086ed56 string.c (str_succ): remove a unnecessary assignment
This change will suppress Coverity Scan warnings
2019-07-14 23:09:24 +09:00
git 9987296b8b * expand tabs. 2019-07-14 17:16:35 +09:00
Yusuke Endoh 934e6b2aeb Prefer `rb_error_arity` to `rb_check_arity` when it can be used 2019-07-14 17:16:19 +09:00
Jeremy Evans 0f283054e7 Check that String#scrub block does not modify receiver
Similar to the check used for String#gsub.  Can fix possible
segfault.

Fixes [Bug #15941]
2019-07-02 08:34:01 -07:00
Jeremy Evans 7582287eb2 Make String#-@ not freeze receiver if called on unfrozen subclass instance
rb_fstring behavior in this case is to freeze the receiver.  I'm
not sure if that should be changed, so this takes the conservative
approach of duping the receiver in String#-@ before passing
to rb_fstring.

Fixes [Bug #15926]
2019-07-02 08:26:50 -07:00
git a88107c44d * expand tabs. 2019-06-29 10:17:37 +09:00
Nobuyoshi Nakada 2f6cc15cdb
Fixed String#grapheme_clusters with wide encodings
* string.c (get_reg_grapheme_cluster): make regexp from properly
  encoded sources fro wide-char encodings.  [Bug #15965]

* regparse.c (node_extended_grapheme_cluster): suppress false
  duplicated range warning for the time being.
2019-06-29 10:10:17 +09:00
John Hawthorn 04bc4c0662 Resize capacity for fstring
When a string is #frozen, it's capacity is resized to fit (if it is much
larger), since we know it will no longer be mutated.

    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "bytesize":30, "capacity":1000, "value":"...
    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000).freeze)
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "bytesize":30, "value":"...

(ObjectSpace.dump doesn't show capacity if capacity is equal to bytesize)

Previously, if we dedup into an fstring, using String#-@, capacity would
not be reduced.

    > puts ObjectSpace.dump(-String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "fstring":true, "bytesize":30, "capacity":1000, "value":"...

This commit makes rb_fstring call rb_str_resize, the same as
rb_str_freeze does.

Closes: https://github.com/ruby/ruby/pull/2256
2019-06-26 15:01:48 +09:00
git 551ef27490 * expand tabs. 2019-06-21 22:48:50 +09:00
Nobuyoshi Nakada 8f51da5d41
Get rid of undefined behavior
* string.c (rb_str_sub_bang): str and repl can be same.
  [Bug #15946]
2019-06-21 22:42:35 +09:00
Nobuyoshi Nakada 8797f48373
New buffer for shared string
* string.c (rb_str_init): allocate new buffer if the string is
  shared.  [Bug #15937]
2019-06-19 14:39:19 +09:00
Nobuyoshi Nakada 28678997e4
Preserve the string content at self-copying
* string.c (rb_str_init): preserve the embedded content when
  self-copying with a capacity.  [Bug #15937]
2019-06-19 09:44:26 +09:00
Nobuyoshi Nakada 8b3774be3d
Fix memory leak
* string.c (str_make_independent_expand): free independent buffer.
  [Bug# 15935]

Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
2019-06-18 13:40:04 +09:00
git c770c98ac4 * expand tabs. 2019-06-18 12:21:38 +09:00
Alan Wu 9dec4e8fc3
String#b: Don't depend on dependent string
Registering a string that depend on a dependent string as fstring
can lead to use-after-free. See c06ddfe and 3f95620 for details.

The following script triggers use-after-free on trunk, 2.4.6, 2.5.5
and 2.6.3. Credits to @wanabe for using eval as a cross-version way
of registering a fstring.

```ruby
a = ('j' * 24).b.b
eval('', binding, a)

p a
4.times { GC.start }
p a
```

 - string.c (str_replace_shared_without_enc): when given a
   dependent string, depend on the root of the dependent
   string.

[Bug #15934]
2019-06-18 12:18:13 +09:00
Nobuyoshi Nakada 53e9908d8a
Fix memory leak
* string.c (str_replace_shared_without_enc): free previous buffer
  before replaced.

* parse.y (gettable): make sure in advance that the `__FILE__`
  object shares a fstring, to get rid of replacement with the
  fstring later.
  TODO: this hack may be needed in other places.

[Bug #15916]

Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
2019-06-16 23:51:22 +09:00
Nobuyoshi Nakada d2003a6d39
Symbol just represents a name 2019-05-14 00:30:08 +09:00
Alan Wu c06ddfee87
str_duplicate: Don't share with a frozen shared string
This is a follow up for 3f9562015e.
Before this commit, it was possible to create a shared string which
shares with another shared string by passing a frozen shared string
to `str_duplicate`.

Such string looks like:

```
 --------                    -----------------
 | root | ------ owns -----> | root's buffer |
 --------                    -----------------
     ^                             ^   ^
 -----------                       |   |
 | shared1 | ------ references -----   |
 -----------                           |
     ^                                 |
 -----------                           |
 | shared2 | ------ references ---------
 -----------
```

This is bad news because `rb_fstring(shared2)` can make `shared1`
independent, which severs the reference from `shared1` to `root`:

```c
/* from fstr_update_callback() */
str = str_new_frozen(rb_cString, shared2);  /* can return shared1 */
if (STR_SHARED_P(str)) { /* shared1 is also a shared string */
    str_make_independent(str);  /* no frozen check */
}
```

If `shared1` was the only reference to `root`, then `root` can be
reclaimed by the GC, leaving `shared2` in a corrupted state:

```
 -----------                         --------------------
 | shared1 | -------- owns --------> | shared1's buffer |
 -----------                         --------------------
      ^
      |
 -----------                         -------------------------
 | shared2 | ------ references ----> | root's buffer (freed) |
 -----------                         -------------------------
```

Here is a reproduction script for the situation this commit fixes.

```ruby
a = ('a' * 24).strip.freeze.strip
-a
p a
4.times { GC.start }
p a
```

 - string.c (str_duplicate): always share with the root string when
   the original is a shared string.
 - test_rb_str_dup.rb: specifically test `rb_str_dup` to make
   sure it does not try to share with a shared string.

[Bug #15792]

Closes: https://github.com/ruby/ruby/pull/2159
2019-05-09 10:04:19 +09:00
Nobuyoshi Nakada f1b0db2c70
Revert "UTF-8 is one of byte based encodings"
This reverts commit 5776ae3475.

Mistaken `max` as `min`.
2019-05-06 11:02:12 +09:00
Marcus Stollsteimer 35ff4ed47f Improve documentation for String#{dump,undump} 2019-05-05 09:51:40 +02:00
git 04fd98d596 * expand tabs. 2019-05-03 23:59:58 +09:00
Nobuyoshi Nakada 77440e949b
Improve performance of case-conversion methods 2019-05-03 23:59:18 +09:00
Nobuyoshi Nakada 5776ae3475
UTF-8 is one of byte based encodings 2019-05-03 15:33:59 +09:00
git 5c87bb3b90 * expand tabs. 2019-05-02 22:44:43 +09:00
Nobuyoshi Nakada 5e23b1138f
Fix potential memory leak 2019-05-02 22:44:20 +09:00