Граф коммитов

1565 Коммитов

Автор SHA1 Сообщение Дата
Nobuyoshi Nakada fce667ed08
Get rid of warnings/exceptions at cleanup
After the encoding index instance variable is removed when all
instance variables are removed in `obj_free`, then `rb_str_free`
causes uninitialized instance variable warning and nil-to-integer
conversion exception.  Both cases result in object allocation
during GC, and crashes.
2020-02-13 12:46:48 +09:00
Nobuyoshi Nakada 160d3165eb
Copy non-inlined encoding index 2020-02-12 20:09:57 +09:00
Nobuyoshi Nakada bdf3032e35
Make temporary lock string encoding free
As a temporary lock string is hidden, it can not have instance
variables, including non-inlined encoding index.
2020-02-12 19:58:22 +09:00
Nobuyoshi Nakada 05229cef45
Improve `String#slice!` performance
Instead of searching twice to extract and to delete, extract and
delete the found position at the first search.

This makes faster nearly twice, for regexps and strings.

|              |compare-ruby|built-ruby|
|:-------------|-----------:|---------:|
|regexp-short  |      2.143M|    3.918M|
|regexp-long   |    105.162k|  205.410k|
|string-short  |      3.789M|    7.964M|
|string-long   |      1.301M|    2.457M|
2020-01-31 17:12:05 +09:00
Nobuyoshi Nakada 0dd6f020fc
Make `empty_string` a fake string 2020-01-31 14:24:07 +09:00
Jean Boussier 52dc0632fa Avoid allocating a temporary empty string in String#slice! 2020-01-31 09:22:20 +09:00
Nobuyoshi Nakada aefb13eb63
Added rb_warn_deprecated_to_remove
Warn the deprecation and future removal, with obeying the warning
flag.
2020-01-23 21:42:15 +09:00
Jeremy Evans e18b817b1f Make taint warnings non-verbose instead of verbose 2020-01-22 11:19:13 -08:00
Nobuyoshi Nakada fce54a5404
Fix `String#partition`
Split with the matched part when the separator matches the empty
part at the beginning.  [Bug #11014]
2020-01-16 15:36:38 +09:00
Marcus Stollsteimer 1d09acd82b [DOC] Improve docs for String#match
Fix invalid code to make it syntax highlighted; other small fixes.
2020-01-08 20:53:31 +01:00
Marcus Stollsteimer f74021e12b Improve docs for String#=~
Move existing example to the corresponding paragraph and
add an example for `string =~ regexp` vs. `regexp =~ string`;
avoid using the receiver's identifier from the call-seq
because it does not appear in rendered HTML docs;
mention deprecation of Object#=~; fix some markup and typos.
2020-01-08 20:47:10 +01:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
Nobuyoshi Nakada 2b2030f265
Refined the warning message for $, and $;
[Bug #16438]
2019-12-20 15:09:23 +09:00
NARUSE, Yui b5fbefbf2c Added Symbol#start_with? and Symbol#end_with? method. [Feature #16348] 2019-11-28 23:49:28 +09:00
卜部昌平 7a9b2039b7 delete unused codes
Suppress compiler warnings.
2019-11-18 18:28:03 +09:00
Nobuyoshi Nakada 22c9504905
rb_tainted_str_new_with_enc is no longer used 2019-11-18 11:05:22 +09:00
Jeremy Evans ffd0820ab3 Deprecate taint/trust and related methods, and make the methods no-ops
This removes the related tests, and puts the related specs behind
version guards.  This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
2019-11-18 01:00:25 +02:00
卜部昌平 c9ffe751d1 delete unused functions
Looking at the list of symbols inside of libruby-static.a, I found
hundreds of functions that are defined, but used from nowhere.

There can be reasons for each of them (e.g. some functions are
specific to some platform, some are useful when debugging, etc).
However it seems the functions deleted here exist for no reason.

This changeset reduces the size of ruby binary from 26,671,456
bytes to 26,592,864 bytes on my machine.
2019-11-14 20:35:48 +09:00
NARUSE, Yui bea322a352 Revert "[EXPERIMENTAL] Make Symbol#to_s return a frozen String [Feature #16150]"
This reverts commit 6ffc045a81.
2019-11-05 17:30:54 +09:00
zverok bddb31bb37 Documentation improvements for Ruby core
* Top-level `return`;
* Documentation for comments syntax;
* `rescue` inside blocks;
* Enhance `Object#to_enum` docs;
* Make `chomp:` option more obvious for `String#each_line` and
  `#lines`;
* Enhance `Proc#>>` and `#<<` docs;
* Enhance `Processs` class docs.
2019-10-26 14:58:08 +09:00
Lourens Naudé 9c24ce551d Reduce the minimum string buffer size from 127 to 63 bytes 2019-10-11 11:16:16 +09:00
卜部昌平 7e0ae1698d avoid overflow in integer multiplication
This changeset basically replaces `ruby_xmalloc(x * y)` into
`ruby_xmalloc2(x, y)`.  Some convenient functions are also
provided for instance `rb_xmalloc_mul_add(x, y, z)` which allocates
x * y + z byes.
2019-10-09 12:12:28 +09:00
Benoit Daloze 6ffc045a81 [EXPERIMENTAL] Make Symbol#to_s return a frozen String
* Always the same frozen String for a given Symbol.
* Avoids extra allocations whenever calling Symbol#to_s.
* See [Feature #16150]
2019-09-26 10:23:02 +02:00
Alan Wu 47a234954a Rename STR_IS_SHARED_M to STR_BORROWED
Since the introduction of STR_SHARED_ROOT, the word "shared"
has become very overloaded with respect to String's internal
states. Use a different name for STR_IS_SHARED_M and explain
its purpose.
2019-09-26 15:30:18 +09:00
Alan Wu 93faa011d3 Tag string shared roots to fix use-after-free
The buffer deduplication codepath in rb_fstring can be used to free the buffer
of shared string roots, which leads to use-after-free.

Introudce a new flag to tag strings that at one point have been a shared root.
Check for it in rb_fstring to avoid freeing buffers that are shared by
multiple strings. This change is based on nobu's idea in [ruby-core:94838].

The included test case test for the sequence of calls to internal functions
that lead to this bug. See attached ticket for Ruby level repros.

[Bug #16151]
2019-09-26 15:30:18 +09:00
Jeremy Evans 6f9b86616a Make Symbol#to_proc calls handle keyword arguments
Make rb_sym_proc_call take a flag for whether a keyword argument
is used, and use the new rb_funcall_with_block_kw function to
pass that information.
2019-09-05 17:47:12 -07:00
卜部昌平 3df37259d8 drop-in type check for rb_define_singleton_method
We can check the function pointer passed to
rb_define_singleton_method like how we do so in rb_define_method.
Doing so revealed many arity mismatches.
2019-08-29 18:34:09 +09:00
Nobuyoshi Nakada d5c33364e3
Fixed heap-use-after-free
* string.c (rb_str_sub_bang): retrieves a pointer to the
  replacement string buffer just before using it, for the case of
  replacement with the receiver string itself.  [Bug #16105]
2019-08-15 23:39:14 +09:00
git 132b7eb104 * expand tabs. [ci skip] 2019-08-15 08:16:14 +09:00
Jeremy Evans 082424ef58 Fold to lowercase intead of uppercase for String#casecmp
strcasecmp(3) and String#casecmp? both fold to lowercase.
2019-08-14 14:11:39 -07:00
Aaron Patterson 957bdfbab8
Update docs to use more natural English
Just a few updates to make the English sound a bit more natural
2019-08-12 12:21:37 -04:00
Yusuke Endoh 8d302c914c string.c (rb_str_sub, _gsub): improve the rdoc
This change:

* Added an explanation about back references except \n and \k<n>
  (\` \& \' \+ \0)
* Added an explanation about an escape (\\)
* Added some rdoc references
* Rephrased and clarified the reason why double escape is needed, added
  some examples, and moved the note to the last (because it is not
  specific to the method itself).
2019-08-12 23:28:35 +09:00
卜部昌平 b5146e375a
leafify opt_plus
Inspired by 346aa557b3

Closes: https://github.com/ruby/ruby/pull/2321
2019-08-06 20:59:19 +09:00
Takashi Kokubun 346aa557b3
Make opt_eq and opt_neq insns leaf
# Benchmark zero?

```
require 'benchmark/ips'

Numeric.class_eval do
  def ruby_zero?
    self == 0
  end
end

Benchmark.ips do |x|
  x.report('0.zero?') { 0.ruby_zero? }
  x.report('1.zero?') { 1.ruby_zero? }
  x.compare!
end
```

## VM
No significant impact for VM.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) [x86_64-linux]

  0.zero?: 21855445.5 i/s
  1.zero?: 21770817.3 i/s - same-ish: difference falls within error

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) [x86_64-linux]

  1.zero?: 21958912.3 i/s
  0.zero?: 21881625.9 i/s - same-ish: difference falls within error

## JIT
The performance improves about 1.23x.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) +JIT [x86_64-linux]

  0.zero?: 36343111.6 i/s
  1.zero?: 36295153.3 i/s - same-ish: difference falls within error

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) +JIT [x86_64-linux]

  0.zero?: 44740467.2 i/s
  1.zero?: 44363616.1 i/s - same-ish: difference falls within error

# Benchmark str == str / str != str

```
# frozen_string_literal: true
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report('a == a') { 'a' == 'a' }
  x.report('a == b') { 'a' == 'b' }
  x.report('a != a') { 'a' != 'a' }
  x.report('a != b') { 'a' != 'b' }
  x.compare!
end
```

## VM
No significant impact for VM.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) [x86_64-linux]

  a == a: 27286219.0 i/s
  a != a: 24892389.5 i/s - 1.10x  slower
  a == b: 23623635.8 i/s - 1.16x  slower
  a != b: 21800958.0 i/s - 1.25x  slower

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) [x86_64-linux]

  a == a: 27224016.2 i/s
  a != a: 24490109.5 i/s - 1.11x  slower
  a == b: 23391052.4 i/s - 1.16x  slower
  a != b: 21811321.7 i/s - 1.25x  slower

## JIT
The performance improves on JIT a little.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) +JIT [x86_64-linux]

  a == a: 42010674.7 i/s
  a != a: 38920311.2 i/s - same-ish: difference falls within error
  a == b: 32574262.2 i/s - 1.29x  slower
  a != b: 32099790.3 i/s - 1.31x  slower

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) +JIT [x86_64-linux]

  a == a: 46902738.8 i/s
  a != a: 43097258.6 i/s - 1.09x  slower
  a == b: 35822018.4 i/s - 1.31x  slower
  a != b: 33377257.8 i/s - 1.41x  slower

This is needed towards Bug#15589.
Closes: https://github.com/ruby/ruby/pull/2318
2019-08-04 22:20:12 +09:00
Nobuyoshi Nakada 1d1f98d49c
Reuse match data
* string.c (rb_str_split_m): reuse occupied match data.  [Bug #16024]
2019-07-28 07:33:21 +09:00
Nobuyoshi Nakada f1b76ea63c
Occupy match data
* string.c (rb_str_split_m): occupy match data not to be modified
  during yielding the block.  [Bug #16024]
2019-07-27 21:54:34 +09:00
Yusuke Endoh 43c337dfc1 string.c (str_succ): refactoring
Use more communicative variable name
2019-07-14 23:09:24 +09:00
Yusuke Endoh 3fd086ed56 string.c (str_succ): remove a unnecessary assignment
This change will suppress Coverity Scan warnings
2019-07-14 23:09:24 +09:00
git 9987296b8b * expand tabs. 2019-07-14 17:16:35 +09:00
Yusuke Endoh 934e6b2aeb Prefer `rb_error_arity` to `rb_check_arity` when it can be used 2019-07-14 17:16:19 +09:00
Jeremy Evans 0f283054e7 Check that String#scrub block does not modify receiver
Similar to the check used for String#gsub.  Can fix possible
segfault.

Fixes [Bug #15941]
2019-07-02 08:34:01 -07:00
Jeremy Evans 7582287eb2 Make String#-@ not freeze receiver if called on unfrozen subclass instance
rb_fstring behavior in this case is to freeze the receiver.  I'm
not sure if that should be changed, so this takes the conservative
approach of duping the receiver in String#-@ before passing
to rb_fstring.

Fixes [Bug #15926]
2019-07-02 08:26:50 -07:00
git a88107c44d * expand tabs. 2019-06-29 10:17:37 +09:00
Nobuyoshi Nakada 2f6cc15cdb
Fixed String#grapheme_clusters with wide encodings
* string.c (get_reg_grapheme_cluster): make regexp from properly
  encoded sources fro wide-char encodings.  [Bug #15965]

* regparse.c (node_extended_grapheme_cluster): suppress false
  duplicated range warning for the time being.
2019-06-29 10:10:17 +09:00
John Hawthorn 04bc4c0662 Resize capacity for fstring
When a string is #frozen, it's capacity is resized to fit (if it is much
larger), since we know it will no longer be mutated.

    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "bytesize":30, "capacity":1000, "value":"...
    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000).freeze)
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "bytesize":30, "value":"...

(ObjectSpace.dump doesn't show capacity if capacity is equal to bytesize)

Previously, if we dedup into an fstring, using String#-@, capacity would
not be reduced.

    > puts ObjectSpace.dump(-String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "fstring":true, "bytesize":30, "capacity":1000, "value":"...

This commit makes rb_fstring call rb_str_resize, the same as
rb_str_freeze does.

Closes: https://github.com/ruby/ruby/pull/2256
2019-06-26 15:01:48 +09:00
git 551ef27490 * expand tabs. 2019-06-21 22:48:50 +09:00
Nobuyoshi Nakada 8f51da5d41
Get rid of undefined behavior
* string.c (rb_str_sub_bang): str and repl can be same.
  [Bug #15946]
2019-06-21 22:42:35 +09:00
Nobuyoshi Nakada 8797f48373
New buffer for shared string
* string.c (rb_str_init): allocate new buffer if the string is
  shared.  [Bug #15937]
2019-06-19 14:39:19 +09:00
Nobuyoshi Nakada 28678997e4
Preserve the string content at self-copying
* string.c (rb_str_init): preserve the embedded content when
  self-copying with a capacity.  [Bug #15937]
2019-06-19 09:44:26 +09:00
Nobuyoshi Nakada 8b3774be3d
Fix memory leak
* string.c (str_make_independent_expand): free independent buffer.
  [Bug# 15935]

Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
2019-06-18 13:40:04 +09:00