Граф коммитов

1760 Коммитов

Автор SHA1 Сообщение Дата
Nobuyoshi Nakada 842b0008c1 Skip broken strings as the locale encoding 2021-10-01 20:28:44 +09:00
Kazuhiro NISHIYAMA e0c6e8c64a
[DOC] Use `unpack1` instead of `unpack(template)[0]` [ci skip] 2021-09-23 09:20:00 +09:00
Nobuyoshi Nakada cbbda3e648
Adjust indent in string.c [ci skip] 2021-09-16 23:49:16 +09:00
S.H b8c3a84bdd
Refactor and Using RBOOL macro 2021-09-15 08:11:05 +09:00
Nobuyoshi Nakada cd829bb078 Remove printf family from the mjit header
Linking printf family functions makes mjit objects to link
unnecessary code.
2021-09-11 08:41:32 +09:00
卜部昌平 091faca99c include/ruby/internal/intern/string.h: add doygen
Must not be a bad idea to improve documents. [ci skip]
2021-09-10 20:00:06 +09:00
Peter Zhu 5d81554281 [Bug #18154] Fix memory leak in String#initialize
String#initialize can leak memory when called on a string that is marked
with STR_NOFREE because it does not unset the STR_NOFREE flag.
2021-09-08 10:20:12 -04:00
Nobuyoshi Nakada edf01d4e82
Treat NULL fake string as an empty string
And the NULL string must be of size 0.
2021-08-17 18:45:36 +09:00
Jeremy Evans 84bf4d2ce5 Term fill in String#{,l,r}strip! even when SHARABLE_MIDDLE_SUBSTRING
Each of these methods calls str_modify_keep_cr before
term filling, which should ensure the backing string
uses private memory, and therefore term filling should
not affect other strings.

Skipping the term filling was added in
a707ab4bc8.

Fixes [Bug #12540]
2021-08-11 13:40:49 +09:00
Peter Zhu c463a5e008 Fix indentation in string.c
7 spaces were used for 2 levels of indentation. This commit changes it
to use 8 spaces.
2021-08-03 16:39:02 -04:00
Troy Chance 7f4e86804d
Fix documentation of #<=> and #casecmp [ci skip]
Descriptions for return values of -1 and 1 were reversed.
2021-08-02 12:09:07 +09:00
S.H 378e8cdad6
Using RBOOL macro 2021-08-02 12:06:44 +09:00
Nobuyoshi Nakada eec45a93ef
Escape unprintable chars only, without surrounding quotes 2021-07-24 14:31:41 +09:00
S-H-GAMELINKS 9952e9358e Refactor rb_str_export and rb_str_export_locale function's 2021-07-07 12:31:43 +09:00
Nobuyoshi Nakada 94bd3bde81 Specify version to remove as bare numbers 2021-06-30 10:47:01 +09:00
Nobuyoshi Nakada 8118d435d0 rb_warn_deprecated_to_remove_at [Feature #17432]
At compilation time with RUBY_DEBUG enabled, check if the removal
version has been reached.
2021-06-30 10:47:01 +09:00
Nobuyoshi Nakada 391abc543c
Scan the coderange in the given encoding 2021-06-26 16:05:15 +09:00
Ketan Bhatt 2fb435b3ab Add Related link from String#hash to Object#hash
We came across a bug in our code because we assumed `String#hash` to be consistent across Ruby processes, which was incorrect.

Our search lead us to `Object#hash` which has the right warning that `String#hash` doesn't. We also noticed that a previous version of the documentation for `String#hash` pointed to `Object#hash` that was removed by https://github.com/ruby/ruby/pull/3565.
We think this removal might not be intended and just got missed amidst other changes.
2021-06-23 07:42:02 -07:00
Burdette Lamar c1741df1a1 What's Here for Numeric and Comparable 2021-06-21 10:38:16 -07:00
Nobuyoshi Nakada e4f891ce8d
Adjust styles [ci skip]
* --braces-after-func-def-line
* --dont-cuddle-else
* --procnames-start-lines
* --space-after-for
* --space-after-if
* --space-after-while
2021-06-17 10:13:40 +09:00
Nobuyoshi Nakada 12f7ba5ed4
Make String#crypt ractor-safe 2021-04-13 12:05:31 +09:00
Nobuyoshi Nakada df7efdcb6b
Get rid of LONG_LONG redefinition 2021-04-12 22:47:07 +09:00
Jean Boussier 7e8a9af9db rb_enc_interned_str: handle autoloaded encodings
If called with an autoloaded encoding that was not yet
initialized, `rb_enc_interned_str` would crash with
a NULL pointer exception.

See: https://github.com/ruby/ruby/pull/4119#issuecomment-800189841
2021-03-22 21:37:48 +09:00
Jeremy Evans cfd162d535 Make String#{strip,lstrip}{,!} strip leading NUL bytes
The documentation already specifies that they strip whitespace
and defines whitespace to include null.

This wraps the new behavior in the appropriate guards in the specs,
but does not specify behavior for previous versions, because this
is a bug that could be backported.

Fixes [Bug #17467]
2021-02-20 11:17:47 +09:00
Sarun Rattanasiri 1a3b68e7c1
correct the result of casecmp? examples [ci skip] 2021-02-12 06:56:51 +09:00
Nobuyoshi Nakada 81f17857a7
Merged too-short salt conditions instead of UNREACHABLE_RETURN 2021-02-11 22:25:31 +09:00
S-H-GAMELINKS 9e66c511ff Fix 404 link 2021-02-11 13:33:21 +09:00
S-H-GAMELINKS 90f008f569 Remove unsued str_new_shared function declaration 2021-02-04 16:25:55 +09:00
Nobuyoshi Nakada 1f5b8f7084
Constified pointers in str_casecmp 2021-01-30 20:08:18 +09:00
Burdette Lamar d7a844cb08
Fix broken link in RDoc for String (#4123)
Link was correct; its target was incorrect; now fixed.
2021-01-26 11:22:13 -06:00
Burdette Lamar 6e44de752e
What's Here for String RDoc (#4093)
* What's Here for String RDoc
2021-01-22 15:01:09 -06:00
Eric Schneider 11b8bb99e6 Minor grammar fix in String#chomp documentation 2020-12-30 01:25:00 -05:00
Nobuyoshi Nakada 6083fed366 Use `size_t` for `RSTRING_LEN` in String#count
https://hackerone.com/reports/1042722
2020-12-26 01:40:51 +09:00
zverok 4728c0d900 Add Symbol#name and freezing explanation to #to_s 2020-12-21 19:22:38 -05:00
Nobuyoshi Nakada c7a5cc2c30
Replaced magic numbers tr table 2020-12-21 23:45:38 +09:00
Jeremy Evans 05313c914b Use category: :deprecated in warnings that are related to deprecation
Also document that both :deprecated and :experimental are supported
:category option values.

The locations where warnings were marked as deprecation warnings
was previously reviewed by shyouhei.

Comment a couple locations where deprecation warnings should probably
be used but are not currently used because deprecation warning
enablement has not occurred at the time they are called
(RUBY_FREE_MIN, RUBY_HEAP_MIN_SLOTS, -K).

Add assert_deprecated_warn to test assertions.  Use this to simplify
some tests, and fix failing tests after marking some warnings with
deprecated category.
2020-12-18 09:54:11 -08:00
Koichi Sasada 344ec26a99 tuning trial: newobj with current ec
Passing current ec can improve performance of newobj. This patch
tries it for Array and String literals ([] and '').
2020-12-07 08:28:36 +09:00
Koichi Sasada 764de7566f should not use rb_str_modify(), too
Same as 8247b8edde, should not use rb_str_modify() here.

https://bugs.ruby-lang.org/issues/17343#change-88858
2020-12-01 18:16:23 +09:00
Jean Boussier 6bef49427a Fix rb_interned_str_* functions to not assume static strings
Fixes [Feature #13381]

When passed a `fake_str`, `register_fstring` would create new strings
with `str_new_static`. That's not what was expected, and answer
almost no use cases.
2020-11-30 17:33:28 +09:00
Nobuyoshi Nakada 02c32b2e92
Get rid of allocation when the capacity is small 2020-11-29 15:01:41 +09:00
Takashi Kokubun 3f8c60cf09
Remove obsoleted str_new_empty
since 58325daae3.

../string.c:1339:1: warning: ‘str_new_empty’ defined but not used [-Wunused-function]
 1339 | str_new_empty(VALUE str)
      | ^~~~~~~~~~~~~
2020-11-20 22:22:29 -08:00
Jeremy Evans 58325daae3 Make String methods return String instances when called on a subclass instance
This modifies the following String methods to return String instances
instead of subclass instances:

* String#*
* String#capitalize
* String#center
* String#chomp
* String#chop
* String#delete
* String#delete_prefix
* String#delete_suffix
* String#downcase
* String#dump
* String#each/#each_line
* String#gsub
* String#ljust
* String#lstrip
* String#partition
* String#reverse
* String#rjust
* String#rpartition
* String#rstrip
* String#scrub
* String#slice!
* String#slice/#[]
* String#split
* String#squeeze
* String#strip
* String#sub
* String#succ/#next
* String#swapcase
* String#tr
* String#tr_s
* String#upcase

This also fixes a bug in String#swapcase where it would return the
receiver instead of a copy of the receiver if the receiver was the
empty string.

Some string methods were left to return subclass instances:

* String#+@
* String#-@

Both of these methods will return the receiver (subclass instance)
in some cases, so it is best to keep the returned class consistent.

Fixes [#10845]
2020-11-20 16:30:23 -08:00
Jean Boussier ef19fb111a Expose the rb_interned_str_* family of functions
Fixes [Feature #13381]
2020-11-17 09:39:25 +09:00
Alan Wu 520b86caf1 Move variable closer to usage 2020-10-30 19:34:41 -04:00
Stefan Stüben 8c2e5bbf58 Don't redefine #rb_intern over and over again 2020-10-21 12:45:18 +09:00
Burdette Lamar 33776598f7
Enhanced RDoc for String#insert (#3643)
* Enhanced RDoc for String#insert
2020-10-08 15:35:13 -05:00
Burdette Lamar 4bc6190a34
Enhanced RDoc for String#[] (#3607)
* Enhanced RDoc for String#[]
2020-09-30 14:58:12 -05:00
Burdette Lamar 48b94b7919
Enhanced RDoc for String#upto (#3603)
* Enhanced RDoc for String#upto
2020-09-29 19:15:39 -05:00
Burdette Lamar 0555bd8435
Enhanced RDoc for String#succ! (#3596)
* Enhanced RDoc for String#succ!
2020-09-28 11:58:39 -05:00
Burdette Lamar 8b42474a26
Enhanced RDoc for String#succ (#3590)
* Enhanced RDoc for String#succ
2020-09-25 15:13:10 -05:00
Burdette Lamar 83ff0f74bf
Enhanced RDoc for String#match? (#3576)
* Enhanced RDoc for String#match?
2020-09-24 18:38:11 -05:00
Burdette Lamar 38385d28df
Enhanced RDoc for String (#3574)
Methods:

    =~
    match
2020-09-24 13:23:26 -05:00
Burdette Lamar 6fe2a9fcda
Enhanced RDoc for String (#3569)
Makes some methods doc compliant with https://github.com/ruby/ruby/blob/master/doc/method_documentation.rdoc. Also, other minor revisions to make more consistent.
Methods:

    ==
    ===
    eql?
    <=>
    casecmp
    casecmp?
    index
    rindex
2020-09-24 10:55:43 -05:00
Kazuhiro NISHIYAMA 9a8f5f0a9a
Fix call-seq [ci skip]
`encoding` can be not only an encoding name, but also an Encoding object.

```
s = String.new('foo', encoding: Encoding::US_ASCII)
s.encoding # => #<Encoding:US-ASCII>
```
2020-09-23 11:44:06 +09:00
Burdette Lamar b904b72960
Enhanced RDoc for String (#3565)
Makes some methods doc compliant with https://github.com/ruby/ruby/blob/master/doc/method_documentation.rdoc. Also, other minor revisions to make more consistent.
Methods:

    try_convert
    +string
    -string
    concat
    <<
    prepend
    hash
2020-09-22 16:32:17 -05:00
Burdette Lamar c6c5d4b3fa
Comply with guide for method doc: string.c (#3528)
Methods:

    ::new
    #length
    #bytesize
    #empty?
    #+
    #*
    #%
2020-09-21 11:27:54 -05:00
Koichi Sasada dd5db6f5fe sync fstring_table for deletion
Ractors can access this table simultaneously so we need to sync
accesses.
2020-09-18 14:17:49 +09:00
Koichi Sasada e81d7189a0 sync fstring pool
fstring pool should be sync with other Ractors.
2020-09-15 00:04:59 +09:00
Soutaro Matsumoto f0ddbd502c
Let String#slice! return nil (#3533)
Returns `nil` instead of an empty string when non-integer number is given (to make it 2.7 compatible).
2020-09-11 14:34:10 +09:00
Nobuyoshi Nakada eb67c603ca
Added Symbol#name
https://bugs.ruby-lang.org/issues/16150#change-87446
2020-09-04 22:18:59 +09:00
Burdette Lamar 51525557fd
Partial compliance with doc/method_documentation.rdoc in string.c (#3436)
Removes references to *-convertible thingies.
2020-08-20 12:09:49 -05:00
Jean Boussier aaf0e33c0a register_fstring: avoid duping the passed string when possible
If the passed string is frozen, bare and not shared, then there
is no need to duplicate it.

Ref: 4ab69ebbd7
Ref: https://bugs.ruby-lang.org/issues/11386
2020-08-19 08:08:56 -07:00
Nobuyoshi Nakada d75433ae19
[DOC] fixed a missing markup 2020-08-15 14:17:02 +09:00
Kasumi Hanazuki 014a4fda54 rb_str_{index,rindex}_m: Handle /\K/ in pattern
When the pattern Regexp given to String#index and String#rindex
contain a /\K/ (lookbehind) operator, these methods return the
position where the beginning of the lookbehind pattern matches, while
they are expected to return the position where the \K matches.

```
# without patch
"abcdbce".index(/b\Kc/)  # => 1
"abcdbce".rindex(/b\Kc/)  # => 4
```

This patch fixes this problem by using BEG(0) instead of the return
value of rb_reg_search.

```
# with patch
"abcdbce".index(/b\Kc/)  # => 2
"abcdbce".rindex(/b\Kc/)  # => 5
```

Fixes [Bug #17118]
2020-08-13 20:54:12 +09:00
Kasumi Hanazuki 5d71eed1a7 rb_str_{partition,rpartition}_m: Handle /\K/ in pattern
When the pattern given to String#partition and String#rpartition
contain a /\K/ (lookbehind) operator, the methods return strings
sliced at incorrect positions.

```
# without patch
"abcdbce".partition(/b\Kc/)  # => ["a", "c", "cdbce"]
"abcdbce".rpartition(/b\Kc/)  # => ["abcd", "c", "ce"]
```

This patch fixes the problem by using BEG(0) instead of the return
value of rb_reg_search.

```
# with patch
"abcdbce".partition(/b\Kc/)  # => ["ab", "c", "dbce"]
"abcdbce".rpartition(/b\Kc/)  # => ["abcdb", "c", "e"]
```

As a side-effect this patch makes String#partition 2x faster when the
pattern is a costly Regexp by performing Regexp search only once,
which was unexpectedly done twice in the original implementation.

Fixes [Bug #17119]
2020-08-13 20:50:50 +09:00
Kasumi Hanazuki e79cdcf61b string.c(rb_str_split_m): Handle /\K/ correctly
Use BEG(0) instead of the result of rb_reg_search to handle the cases
when the separator Regexp contains /\K/ (lookbehind) operator.

Fixes [Bug #17113]
2020-08-12 10:01:39 +09:00
Nobuyoshi Nakada 0ca6b973e8
Removed non-ASCII code to suppress warnings by localized compilers 2020-08-10 19:46:13 +09:00
Nobuyoshi Nakada fac62f094e
Adjust indent 2020-08-10 16:35:42 +09:00
Kazuhiro NISHIYAMA 946cd6c534
Use https instead of http 2020-07-28 19:51:54 +09:00
卜部昌平 de3e931df7 add UNREACHABLE_RETURN
Not every compilers understand that rb_raise does not return.  When a
function does not end with a return statement, such compilers can issue
warnings.  We would better tell them about reachabilities.
2020-06-29 11:05:41 +09:00
卜部昌平 5f926b2b00 rb_str_partition: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 e3d821a36c rb_str_crypt: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 a5ae9aebbc trnext: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 c7a4073154 chompped_length: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 fdae2063fb get_pat_quoted: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 673ddea934 get_pat: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 31e5d138d7 rb_str_slice_bang: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 19f2cabed8 rb_str_aset: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 841eea4bcb rb_str_subpat_set: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 0358846f8c rb_str_update: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 d49924ed81 rb_str_match: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 c422fc4bbc rb_str_rindex_m: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 c29ec1ef1a rb_str_index_m: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 6df790f22e rb_enc_cr_str_buf_cat: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
S-H-GAMELINKS 0fcb2dd51d
add static modifier for rb_str_ord func 2020-05-27 11:26:44 +09:00
Kazuhiro NISHIYAMA fa7addebb4
Fix typos [ci skip] 2020-05-17 21:01:29 +09:00
Burdette Lamar cc525d764b
[ci skip] Enhanced rdoc for String.new (#3067)
* Per @nobu review

* Enhanced rdoc for String.new

* Respond to review
2020-05-15 14:14:50 -07:00
Nobuyoshi Nakada 693f7ab315 Optimize String#split
Optimized `String#split` with `/ /` (single space regexp) as
simple string splitting.  [ruby-core:98272]

|               |compare-ruby|built-ruby|
|:--------------|-----------:|---------:|
|re_space-1     |    432.786k|    1.539M|
|               |           -|     3.56x|
|re_space-10    |     76.231k|  191.547k|
|               |           -|     2.51x|
|re_space-100   |      8.152k|   19.557k|
|               |           -|     2.40x|
|re_space-1000  |     837.405|    2.022k|
|               |           -|     2.41x|

ruby-core:98272: https://bugs.ruby-lang.org/issues/15771#change-85511
2020-05-12 19:58:58 +09:00
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平 d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
S.H 034b8472ba
remove unused rb_str_clear define (#3059) 2020-04-25 20:39:44 -07:00
Nobuyoshi Nakada d4215dafea
Use UNREACHABLE_RETURN for non-void function 2020-04-16 17:59:55 +09:00
Kazuhiro NISHIYAMA c79e3a5957
Add {Regexp,String}#match with block to call-seq [ci skip] 2020-04-14 12:39:16 +09:00
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
Nobuyoshi Nakada 8a7e0aaaef
Warn non-nil `$/` [Feature #14240] 2020-02-23 13:37:40 +09:00
Nobuyoshi Nakada fce667ed08
Get rid of warnings/exceptions at cleanup
After the encoding index instance variable is removed when all
instance variables are removed in `obj_free`, then `rb_str_free`
causes uninitialized instance variable warning and nil-to-integer
conversion exception.  Both cases result in object allocation
during GC, and crashes.
2020-02-13 12:46:48 +09:00
Nobuyoshi Nakada 160d3165eb
Copy non-inlined encoding index 2020-02-12 20:09:57 +09:00
Nobuyoshi Nakada bdf3032e35
Make temporary lock string encoding free
As a temporary lock string is hidden, it can not have instance
variables, including non-inlined encoding index.
2020-02-12 19:58:22 +09:00
Nobuyoshi Nakada 05229cef45
Improve `String#slice!` performance
Instead of searching twice to extract and to delete, extract and
delete the found position at the first search.

This makes faster nearly twice, for regexps and strings.

|              |compare-ruby|built-ruby|
|:-------------|-----------:|---------:|
|regexp-short  |      2.143M|    3.918M|
|regexp-long   |    105.162k|  205.410k|
|string-short  |      3.789M|    7.964M|
|string-long   |      1.301M|    2.457M|
2020-01-31 17:12:05 +09:00
Nobuyoshi Nakada 0dd6f020fc
Make `empty_string` a fake string 2020-01-31 14:24:07 +09:00
Jean Boussier 52dc0632fa Avoid allocating a temporary empty string in String#slice! 2020-01-31 09:22:20 +09:00
Nobuyoshi Nakada aefb13eb63
Added rb_warn_deprecated_to_remove
Warn the deprecation and future removal, with obeying the warning
flag.
2020-01-23 21:42:15 +09:00
Jeremy Evans e18b817b1f Make taint warnings non-verbose instead of verbose 2020-01-22 11:19:13 -08:00
Nobuyoshi Nakada fce54a5404
Fix `String#partition`
Split with the matched part when the separator matches the empty
part at the beginning.  [Bug #11014]
2020-01-16 15:36:38 +09:00
Marcus Stollsteimer 1d09acd82b [DOC] Improve docs for String#match
Fix invalid code to make it syntax highlighted; other small fixes.
2020-01-08 20:53:31 +01:00
Marcus Stollsteimer f74021e12b Improve docs for String#=~
Move existing example to the corresponding paragraph and
add an example for `string =~ regexp` vs. `regexp =~ string`;
avoid using the receiver's identifier from the call-seq
because it does not appear in rendered HTML docs;
mention deprecation of Object#=~; fix some markup and typos.
2020-01-08 20:47:10 +01:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
Nobuyoshi Nakada 2b2030f265
Refined the warning message for $, and $;
[Bug #16438]
2019-12-20 15:09:23 +09:00
NARUSE, Yui b5fbefbf2c Added Symbol#start_with? and Symbol#end_with? method. [Feature #16348] 2019-11-28 23:49:28 +09:00
卜部昌平 7a9b2039b7 delete unused codes
Suppress compiler warnings.
2019-11-18 18:28:03 +09:00
Nobuyoshi Nakada 22c9504905
rb_tainted_str_new_with_enc is no longer used 2019-11-18 11:05:22 +09:00
Jeremy Evans ffd0820ab3 Deprecate taint/trust and related methods, and make the methods no-ops
This removes the related tests, and puts the related specs behind
version guards.  This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
2019-11-18 01:00:25 +02:00
卜部昌平 c9ffe751d1 delete unused functions
Looking at the list of symbols inside of libruby-static.a, I found
hundreds of functions that are defined, but used from nowhere.

There can be reasons for each of them (e.g. some functions are
specific to some platform, some are useful when debugging, etc).
However it seems the functions deleted here exist for no reason.

This changeset reduces the size of ruby binary from 26,671,456
bytes to 26,592,864 bytes on my machine.
2019-11-14 20:35:48 +09:00
NARUSE, Yui bea322a352 Revert "[EXPERIMENTAL] Make Symbol#to_s return a frozen String [Feature #16150]"
This reverts commit 6ffc045a81.
2019-11-05 17:30:54 +09:00
zverok bddb31bb37 Documentation improvements for Ruby core
* Top-level `return`;
* Documentation for comments syntax;
* `rescue` inside blocks;
* Enhance `Object#to_enum` docs;
* Make `chomp:` option more obvious for `String#each_line` and
  `#lines`;
* Enhance `Proc#>>` and `#<<` docs;
* Enhance `Processs` class docs.
2019-10-26 14:58:08 +09:00
Lourens Naudé 9c24ce551d Reduce the minimum string buffer size from 127 to 63 bytes 2019-10-11 11:16:16 +09:00
卜部昌平 7e0ae1698d avoid overflow in integer multiplication
This changeset basically replaces `ruby_xmalloc(x * y)` into
`ruby_xmalloc2(x, y)`.  Some convenient functions are also
provided for instance `rb_xmalloc_mul_add(x, y, z)` which allocates
x * y + z byes.
2019-10-09 12:12:28 +09:00
Benoit Daloze 6ffc045a81 [EXPERIMENTAL] Make Symbol#to_s return a frozen String
* Always the same frozen String for a given Symbol.
* Avoids extra allocations whenever calling Symbol#to_s.
* See [Feature #16150]
2019-09-26 10:23:02 +02:00
Alan Wu 47a234954a Rename STR_IS_SHARED_M to STR_BORROWED
Since the introduction of STR_SHARED_ROOT, the word "shared"
has become very overloaded with respect to String's internal
states. Use a different name for STR_IS_SHARED_M and explain
its purpose.
2019-09-26 15:30:18 +09:00
Alan Wu 93faa011d3 Tag string shared roots to fix use-after-free
The buffer deduplication codepath in rb_fstring can be used to free the buffer
of shared string roots, which leads to use-after-free.

Introudce a new flag to tag strings that at one point have been a shared root.
Check for it in rb_fstring to avoid freeing buffers that are shared by
multiple strings. This change is based on nobu's idea in [ruby-core:94838].

The included test case test for the sequence of calls to internal functions
that lead to this bug. See attached ticket for Ruby level repros.

[Bug #16151]
2019-09-26 15:30:18 +09:00
Jeremy Evans 6f9b86616a Make Symbol#to_proc calls handle keyword arguments
Make rb_sym_proc_call take a flag for whether a keyword argument
is used, and use the new rb_funcall_with_block_kw function to
pass that information.
2019-09-05 17:47:12 -07:00
卜部昌平 3df37259d8 drop-in type check for rb_define_singleton_method
We can check the function pointer passed to
rb_define_singleton_method like how we do so in rb_define_method.
Doing so revealed many arity mismatches.
2019-08-29 18:34:09 +09:00
Nobuyoshi Nakada d5c33364e3
Fixed heap-use-after-free
* string.c (rb_str_sub_bang): retrieves a pointer to the
  replacement string buffer just before using it, for the case of
  replacement with the receiver string itself.  [Bug #16105]
2019-08-15 23:39:14 +09:00
git 132b7eb104 * expand tabs. [ci skip] 2019-08-15 08:16:14 +09:00
Jeremy Evans 082424ef58 Fold to lowercase intead of uppercase for String#casecmp
strcasecmp(3) and String#casecmp? both fold to lowercase.
2019-08-14 14:11:39 -07:00
Aaron Patterson 957bdfbab8
Update docs to use more natural English
Just a few updates to make the English sound a bit more natural
2019-08-12 12:21:37 -04:00
Yusuke Endoh 8d302c914c string.c (rb_str_sub, _gsub): improve the rdoc
This change:

* Added an explanation about back references except \n and \k<n>
  (\` \& \' \+ \0)
* Added an explanation about an escape (\\)
* Added some rdoc references
* Rephrased and clarified the reason why double escape is needed, added
  some examples, and moved the note to the last (because it is not
  specific to the method itself).
2019-08-12 23:28:35 +09:00
卜部昌平 b5146e375a
leafify opt_plus
Inspired by 346aa557b3

Closes: https://github.com/ruby/ruby/pull/2321
2019-08-06 20:59:19 +09:00
Takashi Kokubun 346aa557b3
Make opt_eq and opt_neq insns leaf
# Benchmark zero?

```
require 'benchmark/ips'

Numeric.class_eval do
  def ruby_zero?
    self == 0
  end
end

Benchmark.ips do |x|
  x.report('0.zero?') { 0.ruby_zero? }
  x.report('1.zero?') { 1.ruby_zero? }
  x.compare!
end
```

## VM
No significant impact for VM.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) [x86_64-linux]

  0.zero?: 21855445.5 i/s
  1.zero?: 21770817.3 i/s - same-ish: difference falls within error

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) [x86_64-linux]

  1.zero?: 21958912.3 i/s
  0.zero?: 21881625.9 i/s - same-ish: difference falls within error

## JIT
The performance improves about 1.23x.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) +JIT [x86_64-linux]

  0.zero?: 36343111.6 i/s
  1.zero?: 36295153.3 i/s - same-ish: difference falls within error

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) +JIT [x86_64-linux]

  0.zero?: 44740467.2 i/s
  1.zero?: 44363616.1 i/s - same-ish: difference falls within error

# Benchmark str == str / str != str

```
# frozen_string_literal: true
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report('a == a') { 'a' == 'a' }
  x.report('a == b') { 'a' == 'b' }
  x.report('a != a') { 'a' != 'a' }
  x.report('a != b') { 'a' != 'b' }
  x.compare!
end
```

## VM
No significant impact for VM.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) [x86_64-linux]

  a == a: 27286219.0 i/s
  a != a: 24892389.5 i/s - 1.10x  slower
  a == b: 23623635.8 i/s - 1.16x  slower
  a != b: 21800958.0 i/s - 1.25x  slower

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) [x86_64-linux]

  a == a: 27224016.2 i/s
  a != a: 24490109.5 i/s - 1.11x  slower
  a == b: 23391052.4 i/s - 1.16x  slower
  a != b: 21811321.7 i/s - 1.25x  slower

## JIT
The performance improves on JIT a little.

### before
ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) +JIT [x86_64-linux]

  a == a: 42010674.7 i/s
  a != a: 38920311.2 i/s - same-ish: difference falls within error
  a == b: 32574262.2 i/s - 1.29x  slower
  a != b: 32099790.3 i/s - 1.31x  slower

### after
ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) +JIT [x86_64-linux]

  a == a: 46902738.8 i/s
  a != a: 43097258.6 i/s - 1.09x  slower
  a == b: 35822018.4 i/s - 1.31x  slower
  a != b: 33377257.8 i/s - 1.41x  slower

This is needed towards Bug#15589.
Closes: https://github.com/ruby/ruby/pull/2318
2019-08-04 22:20:12 +09:00
Nobuyoshi Nakada 1d1f98d49c
Reuse match data
* string.c (rb_str_split_m): reuse occupied match data.  [Bug #16024]
2019-07-28 07:33:21 +09:00
Nobuyoshi Nakada f1b76ea63c
Occupy match data
* string.c (rb_str_split_m): occupy match data not to be modified
  during yielding the block.  [Bug #16024]
2019-07-27 21:54:34 +09:00
Yusuke Endoh 43c337dfc1 string.c (str_succ): refactoring
Use more communicative variable name
2019-07-14 23:09:24 +09:00
Yusuke Endoh 3fd086ed56 string.c (str_succ): remove a unnecessary assignment
This change will suppress Coverity Scan warnings
2019-07-14 23:09:24 +09:00
git 9987296b8b * expand tabs. 2019-07-14 17:16:35 +09:00
Yusuke Endoh 934e6b2aeb Prefer `rb_error_arity` to `rb_check_arity` when it can be used 2019-07-14 17:16:19 +09:00
Jeremy Evans 0f283054e7 Check that String#scrub block does not modify receiver
Similar to the check used for String#gsub.  Can fix possible
segfault.

Fixes [Bug #15941]
2019-07-02 08:34:01 -07:00
Jeremy Evans 7582287eb2 Make String#-@ not freeze receiver if called on unfrozen subclass instance
rb_fstring behavior in this case is to freeze the receiver.  I'm
not sure if that should be changed, so this takes the conservative
approach of duping the receiver in String#-@ before passing
to rb_fstring.

Fixes [Bug #15926]
2019-07-02 08:26:50 -07:00
git a88107c44d * expand tabs. 2019-06-29 10:17:37 +09:00
Nobuyoshi Nakada 2f6cc15cdb
Fixed String#grapheme_clusters with wide encodings
* string.c (get_reg_grapheme_cluster): make regexp from properly
  encoded sources fro wide-char encodings.  [Bug #15965]

* regparse.c (node_extended_grapheme_cluster): suppress false
  duplicated range warning for the time being.
2019-06-29 10:10:17 +09:00
John Hawthorn 04bc4c0662 Resize capacity for fstring
When a string is #frozen, it's capacity is resized to fit (if it is much
larger), since we know it will no longer be mutated.

    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "bytesize":30, "capacity":1000, "value":"...
    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000).freeze)
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "bytesize":30, "value":"...

(ObjectSpace.dump doesn't show capacity if capacity is equal to bytesize)

Previously, if we dedup into an fstring, using String#-@, capacity would
not be reduced.

    > puts ObjectSpace.dump(-String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "fstring":true, "bytesize":30, "capacity":1000, "value":"...

This commit makes rb_fstring call rb_str_resize, the same as
rb_str_freeze does.

Closes: https://github.com/ruby/ruby/pull/2256
2019-06-26 15:01:48 +09:00
git 551ef27490 * expand tabs. 2019-06-21 22:48:50 +09:00
Nobuyoshi Nakada 8f51da5d41
Get rid of undefined behavior
* string.c (rb_str_sub_bang): str and repl can be same.
  [Bug #15946]
2019-06-21 22:42:35 +09:00
Nobuyoshi Nakada 8797f48373
New buffer for shared string
* string.c (rb_str_init): allocate new buffer if the string is
  shared.  [Bug #15937]
2019-06-19 14:39:19 +09:00
Nobuyoshi Nakada 28678997e4
Preserve the string content at self-copying
* string.c (rb_str_init): preserve the embedded content when
  self-copying with a capacity.  [Bug #15937]
2019-06-19 09:44:26 +09:00
Nobuyoshi Nakada 8b3774be3d
Fix memory leak
* string.c (str_make_independent_expand): free independent buffer.
  [Bug# 15935]

Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
2019-06-18 13:40:04 +09:00
git c770c98ac4 * expand tabs. 2019-06-18 12:21:38 +09:00
Alan Wu 9dec4e8fc3
String#b: Don't depend on dependent string
Registering a string that depend on a dependent string as fstring
can lead to use-after-free. See c06ddfe and 3f95620 for details.

The following script triggers use-after-free on trunk, 2.4.6, 2.5.5
and 2.6.3. Credits to @wanabe for using eval as a cross-version way
of registering a fstring.

```ruby
a = ('j' * 24).b.b
eval('', binding, a)

p a
4.times { GC.start }
p a
```

 - string.c (str_replace_shared_without_enc): when given a
   dependent string, depend on the root of the dependent
   string.

[Bug #15934]
2019-06-18 12:18:13 +09:00
Nobuyoshi Nakada 53e9908d8a
Fix memory leak
* string.c (str_replace_shared_without_enc): free previous buffer
  before replaced.

* parse.y (gettable): make sure in advance that the `__FILE__`
  object shares a fstring, to get rid of replacement with the
  fstring later.
  TODO: this hack may be needed in other places.

[Bug #15916]

Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
2019-06-16 23:51:22 +09:00
Nobuyoshi Nakada d2003a6d39
Symbol just represents a name 2019-05-14 00:30:08 +09:00
Alan Wu c06ddfee87
str_duplicate: Don't share with a frozen shared string
This is a follow up for 3f9562015e.
Before this commit, it was possible to create a shared string which
shares with another shared string by passing a frozen shared string
to `str_duplicate`.

Such string looks like:

```
 --------                    -----------------
 | root | ------ owns -----> | root's buffer |
 --------                    -----------------
     ^                             ^   ^
 -----------                       |   |
 | shared1 | ------ references -----   |
 -----------                           |
     ^                                 |
 -----------                           |
 | shared2 | ------ references ---------
 -----------
```

This is bad news because `rb_fstring(shared2)` can make `shared1`
independent, which severs the reference from `shared1` to `root`:

```c
/* from fstr_update_callback() */
str = str_new_frozen(rb_cString, shared2);  /* can return shared1 */
if (STR_SHARED_P(str)) { /* shared1 is also a shared string */
    str_make_independent(str);  /* no frozen check */
}
```

If `shared1` was the only reference to `root`, then `root` can be
reclaimed by the GC, leaving `shared2` in a corrupted state:

```
 -----------                         --------------------
 | shared1 | -------- owns --------> | shared1's buffer |
 -----------                         --------------------
      ^
      |
 -----------                         -------------------------
 | shared2 | ------ references ----> | root's buffer (freed) |
 -----------                         -------------------------
```

Here is a reproduction script for the situation this commit fixes.

```ruby
a = ('a' * 24).strip.freeze.strip
-a
p a
4.times { GC.start }
p a
```

 - string.c (str_duplicate): always share with the root string when
   the original is a shared string.
 - test_rb_str_dup.rb: specifically test `rb_str_dup` to make
   sure it does not try to share with a shared string.

[Bug #15792]

Closes: https://github.com/ruby/ruby/pull/2159
2019-05-09 10:04:19 +09:00