Adds to doc for String.new, also making it compliant with documentation_guide.rdoc.
Fixes some broken links in io.c (that I failed to correct yesterday).
Method references is not only able to be marked up as code, also
reflects `--show-hash` option.
The bug that prevented the old rdoc from correctly parsing these
methods was fixed last month.
Treats:
#chars
#codepoints
#each_char
#each_codepoint
#each_grapheme_cluster
#grapheme_clusters
Also, corrects a passage in #unicode_normalize that mentioned module UnicodeNormalize, whose doc (:nodoc:, actually) says not to mention it.
As @peterzhu2118 and @duerst have pointed out, putting string method's RDoc into doc/ (which allows non-ASCII in examples) makes the "click to toggle source" feature not work for that method.
This PR moves the primary method doc back into string.c, then includes RDoc from doc/string/*.rdoc, and also removes doc/string.rdoc.
The affected methods are:
::new
#bytes
#each_byte
#each_line
#split
The call-seq is in string.c because it works there; it did not work when the call-seq is in doc/string/*.rdoc.
This PR also updates the relevant guidance in doc/documentation_guide.rdoc.
* Enhanced RDoc for String#split
* Enhanced RDoc for String#split
* Enhanced RDoc for String#split
* Enhanced RDoc for String#split
* Enhanced RDoc for String#split
* String#getbyte returns `nil` if `index` is out of range.
* Add String#getbyte example with nil output.
* Modify String#getbyte example to use negative index.
I used this regex:
([A-Za-z]+)\.html#(?:class|module)-[A-Za-z]+-label-([A-Za-z0-9\-\+]+)
And performed a global find & replace for this:
rdoc-ref:$1@$2
Adds file doc/case_mapping.rdoc, which describes case mapping and provides a link target that methods doc can link to.
Revises:
String#capitalize
String#capitalize!
String#casecmp
String#casecmp?
String#downcase
String#downcase!
String#swapcase
String#swapcase!
String#upcase
String#upcase!
Symbol#capitalize
Symbol#casecmp
Symbol#casecmp?
Symbol#downcase
Symbol#swapcase
Symbol#upcase
This commit adds a Ractor cache for every size pool. Previously, all VWA
allocated objects used the slowpath and locked the VM.
On a micro-benchmark that benchmarks String allocation:
VWA turned off:
29.196591 0.889709 30.086300 ( 9.434059)
VWA before this commit:
29.279486 41.477869 70.757355 ( 12.527379)
VWA after this commit:
16.782903 0.557117 17.340020 ( 4.255603)
Also, check if a suffix is empty, to guarantee the assumption of
`onigenc_get_left_adjust_char_head` that `*s` is always accessible,
even in the case of `SHARABLE_MIDDLE_SUBSTRING`.
Each of these methods calls str_modify_keep_cr before
term filling, which should ensure the backing string
uses private memory, and therefore term filling should
not affect other strings.
Skipping the term filling was added in
a707ab4bc8.
Fixes [Bug #12540]
We came across a bug in our code because we assumed `String#hash` to be consistent across Ruby processes, which was incorrect.
Our search lead us to `Object#hash` which has the right warning that `String#hash` doesn't. We also noticed that a previous version of the documentation for `String#hash` pointed to `Object#hash` that was removed by https://github.com/ruby/ruby/pull/3565.
We think this removal might not be intended and just got missed amidst other changes.
The documentation already specifies that they strip whitespace
and defines whitespace to include null.
This wraps the new behavior in the appropriate guards in the specs,
but does not specify behavior for previous versions, because this
is a bug that could be backported.
Fixes [Bug #17467]
Also document that both :deprecated and :experimental are supported
:category option values.
The locations where warnings were marked as deprecation warnings
was previously reviewed by shyouhei.
Comment a couple locations where deprecation warnings should probably
be used but are not currently used because deprecation warning
enablement has not occurred at the time they are called
(RUBY_FREE_MIN, RUBY_HEAP_MIN_SLOTS, -K).
Add assert_deprecated_warn to test assertions. Use this to simplify
some tests, and fix failing tests after marking some warnings with
deprecated category.
Fixes [Feature #13381]
When passed a `fake_str`, `register_fstring` would create new strings
with `str_new_static`. That's not what was expected, and answer
almost no use cases.
since 58325daae3.
../string.c:1339:1: warning: ‘str_new_empty’ defined but not used [-Wunused-function]
1339 | str_new_empty(VALUE str)
| ^~~~~~~~~~~~~
This modifies the following String methods to return String instances
instead of subclass instances:
* String#*
* String#capitalize
* String#center
* String#chomp
* String#chop
* String#delete
* String#delete_prefix
* String#delete_suffix
* String#downcase
* String#dump
* String#each/#each_line
* String#gsub
* String#ljust
* String#lstrip
* String#partition
* String#reverse
* String#rjust
* String#rpartition
* String#rstrip
* String#scrub
* String#slice!
* String#slice/#[]
* String#split
* String#squeeze
* String#strip
* String#sub
* String#succ/#next
* String#swapcase
* String#tr
* String#tr_s
* String#upcase
This also fixes a bug in String#swapcase where it would return the
receiver instead of a copy of the receiver if the receiver was the
empty string.
Some string methods were left to return subclass instances:
* String#+@
* String#-@
Both of these methods will return the receiver (subclass instance)
in some cases, so it is best to keep the returned class consistent.
Fixes [#10845]
`encoding` can be not only an encoding name, but also an Encoding object.
```
s = String.new('foo', encoding: Encoding::US_ASCII)
s.encoding # => #<Encoding:US-ASCII>
```
When the pattern Regexp given to String#index and String#rindex
contain a /\K/ (lookbehind) operator, these methods return the
position where the beginning of the lookbehind pattern matches, while
they are expected to return the position where the \K matches.
```
# without patch
"abcdbce".index(/b\Kc/) # => 1
"abcdbce".rindex(/b\Kc/) # => 4
```
This patch fixes this problem by using BEG(0) instead of the return
value of rb_reg_search.
```
# with patch
"abcdbce".index(/b\Kc/) # => 2
"abcdbce".rindex(/b\Kc/) # => 5
```
Fixes [Bug #17118]
When the pattern given to String#partition and String#rpartition
contain a /\K/ (lookbehind) operator, the methods return strings
sliced at incorrect positions.
```
# without patch
"abcdbce".partition(/b\Kc/) # => ["a", "c", "cdbce"]
"abcdbce".rpartition(/b\Kc/) # => ["abcd", "c", "ce"]
```
This patch fixes the problem by using BEG(0) instead of the return
value of rb_reg_search.
```
# with patch
"abcdbce".partition(/b\Kc/) # => ["ab", "c", "dbce"]
"abcdbce".rpartition(/b\Kc/) # => ["abcdb", "c", "e"]
```
As a side-effect this patch makes String#partition 2x faster when the
pattern is a costly Regexp by performing Regexp search only once,
which was unexpectedly done twice in the original implementation.
Fixes [Bug #17119]
Use BEG(0) instead of the result of rb_reg_search to handle the cases
when the separator Regexp contains /\K/ (lookbehind) operator.
Fixes [Bug #17113]
Not every compilers understand that rb_raise does not return. When a
function does not end with a return statement, such compilers can issue
warnings. We would better tell them about reachabilities.
After the encoding index instance variable is removed when all
instance variables are removed in `obj_free`, then `rb_str_free`
causes uninitialized instance variable warning and nil-to-integer
conversion exception. Both cases result in object allocation
during GC, and crashes.
Instead of searching twice to extract and to delete, extract and
delete the found position at the first search.
This makes faster nearly twice, for regexps and strings.
| |compare-ruby|built-ruby|
|:-------------|-----------:|---------:|
|regexp-short | 2.143M| 3.918M|
|regexp-long | 105.162k| 205.410k|
|string-short | 3.789M| 7.964M|
|string-long | 1.301M| 2.457M|
Move existing example to the corresponding paragraph and
add an example for `string =~ regexp` vs. `regexp =~ string`;
avoid using the receiver's identifier from the call-seq
because it does not appear in rendered HTML docs;
mention deprecation of Object#=~; fix some markup and typos.
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead. This would significantly
speed up incremental builds.
We take the following inclusion order in this changeset:
1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very
first thing among everything).
2. RUBY_EXTCONF_H if any.
3. Standard C headers, sorted alphabetically.
4. Other system headers, maybe guarded by #ifdef
5. Everything else, sorted alphabetically.
Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
This removes the related tests, and puts the related specs behind
version guards. This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
Looking at the list of symbols inside of libruby-static.a, I found
hundreds of functions that are defined, but used from nowhere.
There can be reasons for each of them (e.g. some functions are
specific to some platform, some are useful when debugging, etc).
However it seems the functions deleted here exist for no reason.
This changeset reduces the size of ruby binary from 26,671,456
bytes to 26,592,864 bytes on my machine.
This changeset basically replaces `ruby_xmalloc(x * y)` into
`ruby_xmalloc2(x, y)`. Some convenient functions are also
provided for instance `rb_xmalloc_mul_add(x, y, z)` which allocates
x * y + z byes.
Since the introduction of STR_SHARED_ROOT, the word "shared"
has become very overloaded with respect to String's internal
states. Use a different name for STR_IS_SHARED_M and explain
its purpose.
The buffer deduplication codepath in rb_fstring can be used to free the buffer
of shared string roots, which leads to use-after-free.
Introudce a new flag to tag strings that at one point have been a shared root.
Check for it in rb_fstring to avoid freeing buffers that are shared by
multiple strings. This change is based on nobu's idea in [ruby-core:94838].
The included test case test for the sequence of calls to internal functions
that lead to this bug. See attached ticket for Ruby level repros.
[Bug #16151]