Commit Graph

1889 Commits

Author SHA1 Message Date
Jean Boussier 65122d09d5 [Feature #18595] Alias String#-@ as String#dedup 2022-05-20 11:31:59 -07:00
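As a rough illustration of the alias introduced by the commit above (a sketch that assumes Ruby 3.2 or later, where `String#dedup` is available; the guard and the variable names are only for this example), both spellings should hand back the same frozen, deduplicated object:

```ruby
# Sketch only: String#dedup is assumed to exist (Ruby 3.2+); the guard
# keeps the snippet harmless on older interpreters.
dedup_available = String.method_defined?(:dedup)

if dedup_available
  via_minus = -"commit graph"            # unary minus (String#-@)
  via_dedup = "commit graph".dup.dedup   # the alias added by this commit

  puts via_minus.frozen?                 # the result is frozen
  puts via_minus.equal?(via_dedup)       # both names return the same object
end
```

The named method reads more naturally in method chains, where a leading unary minus is awkward to place.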
Nobuyoshi Nakada 5d45afdbbf
[DOC] Move the documentations of moved Symbol methods 2022-04-14 11:17:37 +09:00
Burdette Lamar dfdc03248f
[DOC] Enhanced RDoc for Symbol (#5796)
Treats:
    #[]
    #length
    #empty?
    #upcase
    #downcase
    #capitalize
    #swapcase
    #start_with?
    #end_with?
    #encoding
    ::all_symbols
2022-04-13 13:45:18 -05:00
Nobuyoshi Nakada 7e97ebb6eb
Enforce literals on the second arguments 2022-04-13 18:33:34 +09:00
Burdette Lamar b21026cb1a
Enhanced RDoc for Symbol (#5795)
Treats:

    #==
    #inspect
    #name
    #to_s
    #to_sym
    #to_proc
    #succ
    #<=>
    #casecmp
    #casecmp?
    #=~
    #match
    #match?
2022-04-12 17:27:18 -05:00
Burdette Lamar 70415071e8
Fix some RDoc links (#5778) 2022-04-08 14:25:38 -05:00
Burdette Lamar 9ca3d537b9
All-in-one RDoc for class String (#5777) 2022-04-07 14:29:04 -05:00
Burdette Lamar 717b20ee30
[DOC] Enhanced RDoc for string slices (#5769)
Creates file doc/string/slices.rdoc that the string slicing methods can link to.
2022-04-06 15:47:22 -05:00
Burdette Lamar 4a4485adbd
Enhanced RDoc for String#index (#5759) 2022-04-04 14:18:10 -05:00
Burdette Lamar 0b0ae583f4
[DOC] Enhanced RDoc for String (#5753)
Treats:
    #length
    #bytesize
2022-04-03 10:09:34 -05:00
Burdette Lamar 7be4d900f0
[DOC] Enhanced RDoc for String (#5751)
Adds to doc for String.new, also making it compliant with documentation_guide.rdoc.
    Fixes some broken links in io.c (that I failed to correct yesterday).
2022-04-02 14:26:49 -05:00
Burdette Lamar 056b7a8633
[DOC] Enhanced RDoc for String (#5742)
Treats:
    #force_encoding
    #b
    #valid_encoding?
    #ascii_only?
    #scrub
    #scrub!
    #unicode_normalized?
Plus a couple of minor tweaks.
2022-03-31 15:09:25 -05:00
Burdette Lamar ffcdbedbfb
Repaired What's Here sections for Range, String, Symbol, Struct (#5735)
Repaired What's Here sections for Range, String, Symbol, Struct.
2022-03-30 13:46:24 -05:00
Burdette Lamar b257034ae5
[DOC] Enhanced RDoc for String (#5730)
Treats:

    #start_with?
    #end_with?
    #delete_prefix
    #delete_prefix!
    #delete_suffix
    #delete_suffix!
2022-03-29 09:54:29 -05:00
Burdette Lamar 5525e47a0b
[DOC] Enhanced RDoc for String (#5726)
Treats:

    #ljust
    #rjust
    #center
    #partition
    #rpartition
2022-03-28 15:49:18 -05:00
Burdette Lamar d52cf1013f
[DOC] Enhanced RDoc for String (#5724)
Treats:

    #scan
    #hex
    #oct
    #crypt
    #ord
    #sum
2022-03-27 14:45:14 -05:00
Nobuyoshi Nakada 1b0f05168d
[DOC] Fix references to unary operator 2022-03-27 11:24:06 +09:00
Burdette Lamar e699e2d9bf
Enhanced RDoc for String (#5723)
Treats:

    #lstrip
    #lstrip!
    #rstrip
    #rstrip!
    #strip
    #strip!

Adds section Whitespace in Strings.
2022-03-26 12:42:44 -05:00
Nobuyoshi Nakada 300f4677c9
[DOC] Use simple references to operator methods
Method references can not only be marked up as code, but also
reflect the `--show-hash` option.
The bug that prevented the old rdoc from correctly parsing these
methods was fixed last month.
2022-03-26 21:13:16 +09:00
Burdette Lamar 465edb96f0
[DOC] Enhanced RDoc for String (#5707)
Treated:

    #chomp
    #chomp!
    #chop
    #chop!
2022-03-24 19:40:58 -05:00
Burdette Lamar 0140e6c41e
[DOC] Enhanced RDoc for String (#5685)
Treats:

    #chars
    #codepoints
    #each_char
    #each_codepoint
    #each_grapheme_cluster
    #grapheme_clusters

Also, corrects a passage in #unicode_normalize that mentioned module UnicodeNormalize, whose doc (:nodoc:, actually) says not to mention it.
2022-03-22 14:51:05 -05:00
Burdette Lamar c129b6119d
[DOC] Use RDoc inclusions in string.c (#5683)
As @peterzhu2118 and @duerst have pointed out, putting string method's RDoc into doc/ (which allows non-ASCII in examples) makes the "click to toggle source" feature not work for that method.

This PR moves the primary method doc back into string.c, then includes RDoc from doc/string/*.rdoc, and also removes doc/string.rdoc.

The affected methods are:

    ::new
    #bytes
    #each_byte
    #each_line
    #split

The call-seq is in string.c because it works there; it did not work when the call-seq is in doc/string/*.rdoc.

This PR also updates the relevant guidance in doc/documentation_guide.rdoc.
2022-03-21 14:58:00 -05:00
Burdette Lamar d52f41b765
[DOC] Enhanced RDoc for String (#5675)
Treats:
    #split
    #each_line
    #lines
    #each_byte
    #bytes
2022-03-18 17:17:00 -05:00
Shugo Maeda 1107839a7f Add String#bytesplice 2022-03-18 11:51:03 +09:00
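A minimal sketch of what the new method does, assuming Ruby 3.2 or later and the `bytesplice(index, length, str)` call form; the sample string and variable names are made up for illustration:

```ruby
# Sketch only: String#bytesplice is assumed to exist (Ruby 3.2+).
bytesplice_available = String.method_defined?(:bytesplice)

greeting = +"hello world"   # unary plus yields a mutable string
if bytesplice_available
  # Replace the 5 bytes starting at byte offset 0, in place.
  greeting.bytesplice(0, 5, "HELLO")
end
puts greeting
```

Offsets are counted in bytes, not characters, so with multibyte text the range has to fall on character boundaries.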
Burdette Lamar 59a1a8185f
[DOC] Enhanced RDoc for String#split (#5644)
* Enhanced RDoc for String#split
2022-03-16 14:45:48 -05:00
Nobuyoshi Nakada 4d93b6299c
Initialize mutex for crypt(3) statically
Assuming that all platforms where only `crypt` is available, but
not `crypt_r`, are POSIX-based.
2022-03-16 18:51:34 +09:00
Burdette Lamar 561dda9934
[DOC] Enhanced RDoc for String (#5635)
Treats:

    #count
    #delete
    #delete!
    #squeeze
    #squeeze!

Adds section "Multiple Character Selectors" to doc/character_selectors.rdoc.

Co-authored-by: Peter Zhu <peter@peterzhu.ca>
2022-03-09 19:53:51 -06:00
Burdette Lamar 72c038a8f5
[DOC] Enhanced RDoc for String (#5633)
Treats:

    #tr (revised to link to "Character Selectors" document)
    #tr!
    #tr_s
    #tr_s!

Also renames doc/character_selector.rdoc to match its title.
2022-03-09 08:42:12 -06:00
Kazuhiro NISHIYAMA b068a53dc9
[DOC] Fix default offset of String#byterindex 2022-03-09 15:15:11 +09:00
Burdette Lamar faff37da57
[DOC] Enhanced RDoc for String #tr and #tr! (#5626) 2022-03-07 12:58:29 -06:00
Nobuyoshi Nakada 7f7f07a600
[DOC] mark `rb_str_init` as `:nodoc:`
Otherwise, an empty entry will be generated as `String::new` along
with the one from doc/string.rb.
2022-03-03 13:39:07 +09:00
Mau Magnaguagno 347c3faf8e
[DOC] Fix String#getbyte doc
* String#getbyte returns `nil` if `index` is out of range.

* Add String#getbyte example with nil output.

* Modify String#getbyte example to use negative index.
2022-03-01 10:05:49 +09:00
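The behaviour documented by the `String#getbyte` fix above can be checked with a short sketch (the sample string and variable names are illustrative only):

```ruby
s = "abc"

first_byte   = s.getbyte(0)    # => 97, the byte value of "a"
last_byte    = s.getbyte(-1)   # => 99, a negative index counts from the end
out_of_range = s.getbyte(3)    # => nil, the index is past the last byte

p [first_byte, last_byte, out_of_range]
```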
Nobuyoshi Nakada 3e5d7e3176
[DOC] Move String.new to allow non US-ASCII characters 2022-02-26 21:50:46 +09:00
Burdette Lamar 26ffda2fd2
[DOC] Enhanced RDoc for some encoding methods (#5598)
In String, treats:

    #b
    #scrub
    #scrub!
    #unicode_normalize
    #unicode_normalize!
    #encode
    #encode!

Also adds a note to IO.new (suggested by @jeremyevans).
2022-02-25 13:12:59 -06:00
Shugo Maeda 63401b1384
Rename the wrong variable name `beg` to `len` 2022-02-23 11:23:33 +09:00
Nobuyoshi Nakada 8f0e3a97f9
rb_debug_rstring_null_ptr: add newlines in the message [ci skip]
The message should end with a newline, and break the long
paragraph.
2022-02-21 16:22:23 +09:00
Shugo Maeda c8817d6a3e
Add String#byteindex, String#byterindex, and MatchData#byteoffset (#5518)
* Add String#byteindex, String#byterindex, and MatchData#byteoffset [Feature #13110]

Co-authored-by: NARUSE, Yui <naruse@airemix.jp>
2022-02-19 19:10:00 +09:00
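To show how the byte-oriented methods added above differ from their character-oriented counterparts, here is a hedged sketch. It assumes Ruby 3.2 or later and UTF-8 strings; the sample text and variable names are invented for the example:

```ruby
# Sketch only: the byte-based methods are assumed to exist (Ruby 3.2+).
byte_api_available = String.method_defined?(:byteindex)

text = "\u3042\u3044\u3046"          # three characters, three bytes each in UTF-8
char_index = text.index("\u3046")    # => 2, counted in characters

if byte_api_available
  byte_index  = text.byteindex("\u3046")             # => 6, counted in bytes
  byte_rindex = text.byterindex("\u3046")            # => 6, searching from the end
  byte_range  = /\u3044/.match(text).byteoffset(0)   # => [3, 6]
  p [char_index, byte_index, byte_rindex, byte_range]
end
```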
Nobuyoshi Nakada 6e65e04186
[DOC] Remove unnecessary `rdoc-ref:` schemes 2022-02-12 12:38:37 +09:00
Nobuyoshi Nakada 50c972a1ae
[DOC] Simplify operator method references 2022-02-12 12:38:36 +09:00
Paarth Madan 2a30ddd9f3 Remove extraneous "." in String#+@ documentation 2022-02-08 10:33:49 +09:00
Nobuyoshi Nakada 8ca7b0b68a
[DOC] Fix broken links to operator methods
Once https://github.com/ruby/rdoc/pull/865 is merged, these hacks
are no longer needed.
2022-02-08 01:39:37 +09:00
Nobuyoshi Nakada 07bf65858d
[DOC] Fix broken links to case_mapping.rdoc 2022-02-08 01:28:08 +09:00
Nobuyoshi Nakada 16fdc1ff46
[DOC] Fix broken links to literals.rdoc 2022-02-08 01:27:52 +09:00
Nobuyoshi Nakada bc5662d9d8
[DOC] Simplify links to global methods 2022-02-08 01:18:56 +09:00
Peter Zhu a32e5e1b97 [DOC] Use RDoc link style for links in the same class/module
I used this regex:

(?<=\[)#(?:class|module)-([A-Za-z]+)-label-([A-Za-z0-9\-\+]+)

And performed a global find & replace for this:

rdoc-ref:$1@$2
2022-02-07 09:52:06 -05:00
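The find-and-replace described in the commit above can be reproduced in Ruby as a sketch. The regex is the one quoted in the message; the sample line, the constant name, and the use of `gsub` (with `\1`/`\2` standing in for `$1`/`$2`) are assumptions made for illustration:

```ruby
# Regex copied from the commit message; the lookbehind requires a "[" just
# before the "#" so only link targets are rewritten.
SAME_CLASS_LINK = /(?<=\[)#(?:class|module)-([A-Za-z]+)-label-([A-Za-z0-9\-\+]+)/

# Hypothetical RDoc source line using the old anchor-style link.
line = 'See {Whitespace in Strings}[#class-String-label-Whitespace+in+Strings].'

converted = line.gsub(SAME_CLASS_LINK, 'rdoc-ref:\1@\2')
puts converted
# => See {Whitespace in Strings}[rdoc-ref:String@Whitespace+in+Strings].
```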
Peter Zhu f9a2802bc5 [DOC] Use RDoc link style for links to other classes/modules
I used this regex:

([A-Za-z]+)\.html#(?:class|module)-[A-Za-z]+-label-([A-Za-z0-9\-\+]+)

And performed a global find & replace for this:

rdoc-ref:$1@$2
2022-02-07 09:52:06 -05:00
Burdette Lamar a07fa198a6
Improve links to labels in string.c and struct.c (#5531) 2022-02-06 09:44:40 -06:00
Peter Zhu be68b3a490 Change termlen when changing encoding during concatenation
After changing the encoding, we should update the terminator length.
2022-01-07 10:50:03 -05:00
Nobuyoshi Nakada 3f9af8a9dc
[DOC] Fix typos in a doxygen comment [ci skip] 2022-01-07 23:55:59 +09:00
Peter Zhu ae0d67d762 Revert "Set encoding before concatenating to string"
This reverts commit 44368b5f8b.
2022-01-06 17:23:05 -05:00
Peter Zhu 5f55b03716 Set correct termlen for frozen strings
Frozen strings should have the same termlen as the original string when
copy_encoding is true.
2022-01-06 14:33:35 -05:00
Peter Zhu 44368b5f8b Set encoding before concatenating to string
If we set encoding after the call to rb_str_buf_cat, then rb_str_buf_cat
will not set the correct terminator length.
2022-01-06 14:33:35 -05:00
Nobuyoshi Nakada 39bc5de833
Remove tainted and trusted features
Already these had been announced to be removed in 3.2.
2021-12-26 23:28:54 +09:00
Nobuyoshi Nakada e2ec97c4b8
[DOC] How to get the longest last match [Bug #18415] 2021-12-19 20:27:31 +09:00
Burdette Lamar 5588aa79d4
What's Here for Symbol (#5289)
* What's Here for Symbol
2021-12-17 17:02:12 -06:00
Burdette Lamar f7e266e6d2
Enhanced RDoc for case mapping (#5245)
Adds file doc/case_mapping.rdoc, which describes case mapping and provides a link target that methods doc can link to.

Revises:

    String#capitalize
    String#capitalize!
    String#casecmp
    String#casecmp?
    String#downcase
    String#downcase!
    String#swapcase
    String#swapcase!
    String#upcase
    String#upcase!
    Symbol#capitalize
    Symbol#casecmp
    Symbol#casecmp?
    Symbol#downcase
    Symbol#swapcase
    Symbol#upcase
2021-12-17 06:05:31 -06:00
Burdette Lamar e5ff030f60
Enhanced RDoc for String (#5234)
Treated:

    #to_i
    #to_f
    #to_s
    #inspect
    #dump
    #undump
2021-12-10 10:50:13 -06:00
Burdette Lamar 9a2ecddf32
Enhanced RDoc for String (#5227)
Treats:

    #replace
    #clear
    #chr
    #getbyte
    #setbyte
    #byteslice
    #reverse
    #reverse!
    #include?
2021-12-08 12:29:56 -06:00
Burdette Lamar 7fc9d83bd1
Fix link (#5208) 2021-12-03 10:46:35 -06:00
Burdette Lamar 28fb6d6b9e
Adding links to literals and Kernel (#5192)
* Adding links to literals and Kernel
2021-12-03 07:12:28 -06:00
Peter Zhu 7cfacbcad2 Improve performance of embedded string allocation
Non-VWA embedded string allocation had a performance regression. This
commit improves performance of non-VWA embedded string allocation.
2021-11-26 13:27:32 -05:00
Peter Zhu 9aded89f40 Speed up Ractors for Variable Width Allocation
This commit adds a Ractor cache for every size pool. Previously, all VWA
allocated objects used the slowpath and locked the VM.

On a micro-benchmark that benchmarks String allocation:

VWA turned off:
  29.196591   0.889709  30.086300 (  9.434059)

VWA before this commit:
  29.279486  41.477869  70.757355 ( 12.527379)

VWA after this commit:
  16.782903   0.557117  17.340020 (  4.255603)
2021-11-23 10:51:27 -05:00
Peter Zhu aeae6e2842 [Feature #18290] Remove all usages of rb_gc_force_recycle
This commit removes usages of rb_gc_force_recycle since it is a burden
to maintain and makes changes to the GC difficult.
2021-11-08 14:05:54 -05:00
Yusuke Endoh 4b248e7994 string.c: Follow up to ae2359f602
* Mention `\0`
* Make the example of hash replacement meaningful
2021-11-03 03:52:28 +09:00
Burdette Lamar ae2359f602
Enhanced RDoc for String (#5060)
Treated:

    #slice!
    #sub
    #sub!
    #gsub
    #gsub!
2021-11-02 13:04:58 -05:00
Burdette Lamar 3e743d3147
Cleanup some RDoc (#5050)
Mostly adding a blank line before and after each code segment, to improve compliance with doc/documentation_guide.rdoc.
2021-10-28 17:01:49 -05:00
Yusuke Endoh acb2f86caa string.c: Add some comments about STR flags 2021-10-29 01:57:29 +09:00
Peter Zhu a5b6598192 [Feature #18239] Implement VWA for strings
This commit adds support for embedded strings with variable capacity and
uses Variable Width Allocation to allocate strings.
2021-10-25 13:26:23 -04:00
Peter Zhu 46b66eb9e8 [Feature #18239] Add struct for embedded strings 2021-10-25 13:26:23 -04:00
Jeremy Evans 2a5c3a4d0f Update documentation for String and Symbol to discuss differences
Implements [Feature #14347]
2021-10-15 13:54:03 -07:00
Nobuyoshi Nakada 78ff9b719c
Add tests for the edge cases of `String#end_with?`
Also, check if a suffix is empty, to guarantee the assumption of
`onigenc_get_left_adjust_char_head` that `*s` is always accessible,
even in the case of `SHARABLE_MIDDLE_SUBSTRING`.
2021-10-08 14:08:03 +09:00
git 1bf3f3f4da * remove trailing spaces. [ci skip] 2021-10-06 00:40:54 +09:00
Jeremy Evans c6706f15af Fix documentation for String#{<<,concat,prepend}
These methods mutate and return the receiver, they don't create
and return a new string.

Fixes [Bug #18241]
2021-10-05 08:39:27 -07:00
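The point of the documentation fix above, that these methods mutate and return the receiver, can be sketched as follows (variable names are illustrative):

```ruby
s = +"b"                  # unary plus yields a mutable string

r1 = s << "c"             # s is now "bc"
r2 = s.concat("d")        # s is now "bcd"
r3 = s.prepend("a")       # s is now "abcd"

# Every call returned the receiver itself, not a new string.
all_same_object = [r1, r2, r3].all? { |r| r.equal?(s) }

puts s
puts all_same_object
```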
Nobuyoshi Nakada cd182c5ee1
Adjust types to rb_enc_left_char_head
I dislike unnatural casts.
2021-10-05 17:14:29 +09:00
Nobuyoshi Nakada 5a961c3768
Remove a redundant cast between the exact same types 2021-10-05 15:56:34 +09:00
卜部昌平 f032c09bca rb_enc_left_char_head(): take void*
Nobu doesn't like (char*) cast.
2021-10-05 14:18:23 +09:00
卜部昌平 499660b04f downcase_single/upcase_single: assume ASCII
These functions assume ASCII compatibility.  That has to be ensured in
their caller.
2021-10-05 14:18:23 +09:00
卜部昌平 5112a54846 include/ruby/encoding.h: convert macros into inline functions
Less macros == huge win.
2021-10-05 14:18:23 +09:00
卜部昌平 e42c8c160d add undeclared variables
Why did they even exist?
2021-10-05 14:18:23 +09:00
Nobuyoshi Nakada 842b0008c1 Skip broken strings as the locale encoding 2021-10-01 20:28:44 +09:00
Kazuhiro NISHIYAMA e0c6e8c64a
[DOC] Use `unpack1` instead of `unpack(template)[0]` [ci skip] 2021-09-23 09:20:00 +09:00
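For readers unfamiliar with the two spellings mentioned in the commit above, a small sketch (the packed data is made up for the example):

```ruby
data = [1, 2, 3].pack("C*")         # three unsigned 8-bit integers

first_old = data.unpack("C*")[0]    # builds the whole array, then takes one element
first_new = data.unpack1("C*")      # returns only the first unpacked value

p [first_old, first_new]
```

Both expressions yield the same value; `unpack1` simply avoids building the intermediate array.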
Nobuyoshi Nakada cbbda3e648
Adjust indent in string.c [ci skip] 2021-09-16 23:49:16 +09:00
S.H b8c3a84bdd
Refactor and Using RBOOL macro 2021-09-15 08:11:05 +09:00
Nobuyoshi Nakada cd829bb078 Remove printf family from the mjit header
Linking printf family functions makes mjit objects link
unnecessary code.
2021-09-11 08:41:32 +09:00
卜部昌平 091faca99c include/ruby/internal/intern/string.h: add doxygen
Must not be a bad idea to improve documents. [ci skip]
2021-09-10 20:00:06 +09:00
Peter Zhu 5d81554281 [Bug #18154] Fix memory leak in String#initialize
String#initialize can leak memory when called on a string that is marked
with STR_NOFREE because it does not unset the STR_NOFREE flag.
2021-09-08 10:20:12 -04:00
Nobuyoshi Nakada edf01d4e82
Treat NULL fake string as an empty string
And the NULL string must be of size 0.
2021-08-17 18:45:36 +09:00
Jeremy Evans 84bf4d2ce5 Term fill in String#{,l,r}strip! even when SHARABLE_MIDDLE_SUBSTRING
Each of these methods calls str_modify_keep_cr before
term filling, which should ensure the backing string
uses private memory, and therefore term filling should
not affect other strings.

Skipping the term filling was added in
a707ab4bc8.

Fixes [Bug #12540]
2021-08-11 13:40:49 +09:00
Peter Zhu c463a5e008 Fix indentation in string.c
7 spaces were used for 2 levels of indentation. This commit changes it
to use 8 spaces.
2021-08-03 16:39:02 -04:00
Troy Chance 7f4e86804d
Fix documentation of #<=> and #casecmp [ci skip]
Descriptions for return values of -1 and 1 were reversed.
2021-08-02 12:09:07 +09:00
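The return values whose descriptions were corrected above can be checked with a short sketch (the operands are arbitrary examples):

```ruby
smaller = "a" <=> "b"          # => -1, receiver sorts before the argument
larger  = "b" <=> "a"          # =>  1, receiver sorts after the argument
equal   = "a" <=> "a"          # =>  0

ci_smaller = "a".casecmp("B")  # => -1, compared case-insensitively
ci_larger  = "B".casecmp("a")  # =>  1
ci_equal   = "a".casecmp("A")  # =>  0

p [smaller, larger, equal, ci_smaller, ci_larger, ci_equal]
```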
S.H 378e8cdad6
Using RBOOL macro 2021-08-02 12:06:44 +09:00
Nobuyoshi Nakada eec45a93ef
Escape unprintable chars only, without surrounding quotes 2021-07-24 14:31:41 +09:00
S-H-GAMELINKS 9952e9358e Refactor rb_str_export and rb_str_export_locale functions 2021-07-07 12:31:43 +09:00
Nobuyoshi Nakada 94bd3bde81 Specify version to remove as bare numbers 2021-06-30 10:47:01 +09:00
Nobuyoshi Nakada 8118d435d0 rb_warn_deprecated_to_remove_at [Feature #17432]
At compilation time with RUBY_DEBUG enabled, check if the removal
version has been reached.
2021-06-30 10:47:01 +09:00
Nobuyoshi Nakada 391abc543c
Scan the coderange in the given encoding 2021-06-26 16:05:15 +09:00
Ketan Bhatt 2fb435b3ab Add Related link from String#hash to Object#hash
We came across a bug in our code because we assumed `String#hash` to be consistent across Ruby processes, which was incorrect.

Our search led us to `Object#hash` which has the right warning that `String#hash` doesn't. We also noticed that a previous version of the documentation for `String#hash` pointed to `Object#hash` that was removed by https://github.com/ruby/ruby/pull/3565.
We think this removal might not be intended and just got missed amidst other changes.
2021-06-23 07:42:02 -07:00
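The pitfall described in the commit above can be observed with a sketch like the following. It assumes a second interpreter can be started via `RbConfig.ruby`; the variable names are illustrative, and no particular hash values are implied:

```ruby
require "rbconfig"

# Within one process, equal strings always hash identically.
parent_hash = "ruby".hash
same_within_process = parent_hash == "ruby".hash

# Ask a second interpreter process for the hash of the same content.
child_hash = IO.popen([RbConfig.ruby, "-e", 'print "ruby".hash'], &:read).to_i

puts same_within_process
puts "parent=#{parent_hash} child=#{child_hash}"
```

The two printed numbers should not be expected to match, which is why a `String#hash` value is unsuitable as a key that has to survive across processes.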
Burdette Lamar c1741df1a1 What's Here for Numeric and Comparable 2021-06-21 10:38:16 -07:00
Nobuyoshi Nakada e4f891ce8d
Adjust styles [ci skip]
* --braces-after-func-def-line
* --dont-cuddle-else
* --procnames-start-lines
* --space-after-for
* --space-after-if
* --space-after-while
2021-06-17 10:13:40 +09:00
Nobuyoshi Nakada 12f7ba5ed4
Make String#crypt ractor-safe 2021-04-13 12:05:31 +09:00
Nobuyoshi Nakada df7efdcb6b
Get rid of LONG_LONG redefinition 2021-04-12 22:47:07 +09:00
Jean Boussier 7e8a9af9db rb_enc_interned_str: handle autoloaded encodings
If called with an autoloaded encoding that was not yet
initialized, `rb_enc_interned_str` would crash with
a NULL pointer exception.

See: https://github.com/ruby/ruby/pull/4119#issuecomment-800189841
2021-03-22 21:37:48 +09:00
Jeremy Evans cfd162d535 Make String#{strip,lstrip}{,!} strip leading NUL bytes
The documentation already specifies that they strip whitespace
and defines whitespace to include null.

This wraps the new behavior in the appropriate guards in the specs,
but does not specify behavior for previous versions, because this
is a bug that could be backported.

Fixes [Bug #17467]
2021-02-20 11:17:47 +09:00
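A hedged sketch of the behaviour change above. It assumes the fix is present from Ruby 3.1 onward (the commit landed after 3.0 and may or may not have been backported), so the version check is only an approximation; names are illustrative:

```ruby
padded = "\0 abc"

left_stripped = padded.lstrip
both_stripped = padded.strip

# Assumption: interpreters from 3.1 onward include this fix.
has_fix = (RUBY_VERSION.split(".").map(&:to_i) <=> [3, 1]) >= 0

p left_stripped   # with the fix, the leading NUL and space are both removed
p both_stripped
```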
Sarun Rattanasiri 1a3b68e7c1
correct the result of casecmp? examples [ci skip] 2021-02-12 06:56:51 +09:00
Nobuyoshi Nakada 81f17857a7
Merged too-short salt conditions instead of UNREACHABLE_RETURN 2021-02-11 22:25:31 +09:00
S-H-GAMELINKS 9e66c511ff Fix 404 link 2021-02-11 13:33:21 +09:00
S-H-GAMELINKS 90f008f569 Remove unused str_new_shared function declaration 2021-02-04 16:25:55 +09:00
Nobuyoshi Nakada 1f5b8f7084
Constified pointers in str_casecmp 2021-01-30 20:08:18 +09:00
Burdette Lamar d7a844cb08
Fix broken link in RDoc for String (#4123)
Link was correct; its target was incorrect; now fixed.
2021-01-26 11:22:13 -06:00
Burdette Lamar 6e44de752e
What's Here for String RDoc (#4093)
* What's Here for String RDoc
2021-01-22 15:01:09 -06:00
Eric Schneider 11b8bb99e6 Minor grammar fix in String#chomp documentation 2020-12-30 01:25:00 -05:00
Nobuyoshi Nakada 6083fed366 Use `size_t` for `RSTRING_LEN` in String#count
https://hackerone.com/reports/1042722
2020-12-26 01:40:51 +09:00
zverok 4728c0d900 Add Symbol#name and freezing explanation to #to_s 2020-12-21 19:22:38 -05:00
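The difference that the documentation change above explains can be sketched as follows, assuming Ruby 3.0 or later, where `Symbol#name` exists (variable names are illustrative):

```ruby
sym = :commit

name_string = sym.name    # frozen string, safe to share
to_s_string = sym.to_s    # a separate string with the same content

puts name_string.frozen?
puts name_string == to_s_string
```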
Nobuyoshi Nakada c7a5cc2c30
Replaced magic numbers tr table 2020-12-21 23:45:38 +09:00
Jeremy Evans 05313c914b Use category: :deprecated in warnings that are related to deprecation
Also document that both :deprecated and :experimental are supported
:category option values.

The locations where warnings were marked as deprecation warnings
was previously reviewed by shyouhei.

Comment a couple locations where deprecation warnings should probably
be used but are not currently used because deprecation warning
enablement has not occurred at the time they are called
(RUBY_FREE_MIN, RUBY_HEAP_MIN_SLOTS, -K).

Add assert_deprecated_warn to test assertions.  Use this to simplify
some tests, and fix failing tests after marking some warnings with
deprecated category.
2020-12-18 09:54:11 -08:00
Koichi Sasada 344ec26a99 tuning trial: newobj with current ec
Passing current ec can improve performance of newobj. This patch
tries it for Array and String literals ([] and '').
2020-12-07 08:28:36 +09:00
Koichi Sasada 764de7566f should not use rb_str_modify(), too
Same as 8247b8edde, should not use rb_str_modify() here.

https://bugs.ruby-lang.org/issues/17343#change-88858
2020-12-01 18:16:23 +09:00
Jean Boussier 6bef49427a Fix rb_interned_str_* functions to not assume static strings
Fixes [Feature #13381]

When passed a `fake_str`, `register_fstring` would create new strings
with `str_new_static`. That's not what was expected, and answers
almost no use cases.
2020-11-30 17:33:28 +09:00
Nobuyoshi Nakada 02c32b2e92
Get rid of allocation when the capacity is small 2020-11-29 15:01:41 +09:00
Takashi Kokubun 3f8c60cf09
Remove obsoleted str_new_empty
since 58325daae3.

../string.c:1339:1: warning: ‘str_new_empty’ defined but not used [-Wunused-function]
 1339 | str_new_empty(VALUE str)
      | ^~~~~~~~~~~~~
2020-11-20 22:22:29 -08:00
Jeremy Evans 58325daae3 Make String methods return String instances when called on a subclass instance
This modifies the following String methods to return String instances
instead of subclass instances:

* String#*
* String#capitalize
* String#center
* String#chomp
* String#chop
* String#delete
* String#delete_prefix
* String#delete_suffix
* String#downcase
* String#dump
* String#each/#each_line
* String#gsub
* String#ljust
* String#lstrip
* String#partition
* String#reverse
* String#rjust
* String#rpartition
* String#rstrip
* String#scrub
* String#slice!
* String#slice/#[]
* String#split
* String#squeeze
* String#strip
* String#sub
* String#succ/#next
* String#swapcase
* String#tr
* String#tr_s
* String#upcase

This also fixes a bug in String#swapcase where it would return the
receiver instead of a copy of the receiver if the receiver was the
empty string.

Some string methods were left to return subclass instances:

* String#+@
* String#-@

Both of these methods will return the receiver (subclass instance)
in some cases, so it is best to keep the returned class consistent.

Fixes [#10845]
2020-11-20 16:30:23 -08:00
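A minimal sketch of the change described above, assuming Ruby 3.0 or later; the `Title` subclass and the variable names are hypothetical and exist only for this example:

```ruby
# Hypothetical String subclass used only for illustration.
class Title < String; end

title = Title.new("ruby string")

upcased    = title.upcase   # one of the methods listed in the commit
sliced     = title[0, 4]    # String#slice / #[]
unary_plus = +title         # String#+@ was left unchanged

p upcased.class      # String on Ruby 3.0+
p sliced.class       # String on Ruby 3.0+
p unary_plus.class   # Title, because an unfrozen receiver is returned as-is
```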
Jean Boussier ef19fb111a Expose the rb_interned_str_* family of functions
Fixes [Feature #13381]
2020-11-17 09:39:25 +09:00
Alan Wu 520b86caf1 Move variable closer to usage 2020-10-30 19:34:41 -04:00
Stefan Stüben 8c2e5bbf58 Don't redefine #rb_intern over and over again 2020-10-21 12:45:18 +09:00
Burdette Lamar 33776598f7
Enhanced RDoc for String#insert (#3643)
* Enhanced RDoc for String#insert
2020-10-08 15:35:13 -05:00
Burdette Lamar 4bc6190a34
Enhanced RDoc for String#[] (#3607)
* Enhanced RDoc for String#[]
2020-09-30 14:58:12 -05:00
Burdette Lamar 48b94b7919
Enhanced RDoc for String#upto (#3603)
* Enhanced RDoc for String#upto
2020-09-29 19:15:39 -05:00
Burdette Lamar 0555bd8435
Enhanced RDoc for String#succ! (#3596)
* Enhanced RDoc for String#succ!
2020-09-28 11:58:39 -05:00
Burdette Lamar 8b42474a26
Enhanced RDoc for String#succ (#3590)
* Enhanced RDoc for String#succ
2020-09-25 15:13:10 -05:00
Burdette Lamar 83ff0f74bf
Enhanced RDoc for String#match? (#3576)
* Enhanced RDoc for String#match?
2020-09-24 18:38:11 -05:00
Burdette Lamar 38385d28df
Enhanced RDoc for String (#3574)
Methods:

    =~
    match
2020-09-24 13:23:26 -05:00
Burdette Lamar 6fe2a9fcda
Enhanced RDoc for String (#3569)
Makes some methods doc compliant with https://github.com/ruby/ruby/blob/master/doc/method_documentation.rdoc. Also, other minor revisions to make more consistent.
Methods:

    ==
    ===
    eql?
    <=>
    casecmp
    casecmp?
    index
    rindex
2020-09-24 10:55:43 -05:00
Kazuhiro NISHIYAMA 9a8f5f0a9a
Fix call-seq [ci skip]
`encoding` can be not only an encoding name, but also an Encoding object.

```
s = String.new('foo', encoding: Encoding::US_ASCII)
s.encoding # => #<Encoding:US-ASCII>
```
2020-09-23 11:44:06 +09:00
Burdette Lamar b904b72960
Enhanced RDoc for String (#3565)
Makes some methods doc compliant with https://github.com/ruby/ruby/blob/master/doc/method_documentation.rdoc. Also, other minor revisions to make more consistent.
Methods:

    try_convert
    +string
    -string
    concat
    <<
    prepend
    hash
2020-09-22 16:32:17 -05:00
Burdette Lamar c6c5d4b3fa
Comply with guide for method doc: string.c (#3528)
Methods:

    ::new
    #length
    #bytesize
    #empty?
    #+
    #*
    #%
2020-09-21 11:27:54 -05:00
Koichi Sasada dd5db6f5fe sync fstring_table for deletion
Ractors can access this table simultaneously so we need to sync
accesses.
2020-09-18 14:17:49 +09:00
Koichi Sasada e81d7189a0 sync fstring pool
fstring pool should be sync with other Ractors.
2020-09-15 00:04:59 +09:00
Soutaro Matsumoto f0ddbd502c
Let String#slice! return nil (#3533)
Returns `nil` instead of an empty string when non-integer number is given (to make it 2.7 compatible).
2020-09-11 14:34:10 +09:00
Nobuyoshi Nakada eb67c603ca
Added Symbol#name
https://bugs.ruby-lang.org/issues/16150#change-87446
2020-09-04 22:18:59 +09:00
Burdette Lamar 51525557fd
Partial compliance with doc/method_documentation.rdoc in string.c (#3436)
Removes references to *-convertible thingies.
2020-08-20 12:09:49 -05:00
Jean Boussier aaf0e33c0a register_fstring: avoid duping the passed string when possible
If the passed string is frozen, bare and not shared, then there
is no need to duplicate it.

Ref: 4ab69ebbd7
Ref: https://bugs.ruby-lang.org/issues/11386
2020-08-19 08:08:56 -07:00
Nobuyoshi Nakada d75433ae19
[DOC] fixed a missing markup 2020-08-15 14:17:02 +09:00
Kasumi Hanazuki 014a4fda54 rb_str_{index,rindex}_m: Handle /\K/ in pattern
When the pattern Regexp given to String#index and String#rindex
contain a /\K/ (lookbehind) operator, these methods return the
position where the beginning of the lookbehind pattern matches, while
they are expected to return the position where the \K matches.

```
# without patch
"abcdbce".index(/b\Kc/)  # => 1
"abcdbce".rindex(/b\Kc/)  # => 4
```

This patch fixes this problem by using BEG(0) instead of the return
value of rb_reg_search.

```
# with patch
"abcdbce".index(/b\Kc/)  # => 2
"abcdbce".rindex(/b\Kc/)  # => 5
```

Fixes [Bug #17118]
2020-08-13 20:54:12 +09:00
Kasumi Hanazuki 5d71eed1a7 rb_str_{partition,rpartition}_m: Handle /\K/ in pattern
When the pattern given to String#partition and String#rpartition
contain a /\K/ (lookbehind) operator, the methods return strings
sliced at incorrect positions.

```
# without patch
"abcdbce".partition(/b\Kc/)  # => ["a", "c", "cdbce"]
"abcdbce".rpartition(/b\Kc/)  # => ["abcd", "c", "ce"]
```

This patch fixes the problem by using BEG(0) instead of the return
value of rb_reg_search.

```
# with patch
"abcdbce".partition(/b\Kc/)  # => ["ab", "c", "dbce"]
"abcdbce".rpartition(/b\Kc/)  # => ["abcdb", "c", "e"]
```

As a side-effect this patch makes String#partition 2x faster when the
pattern is a costly Regexp by performing Regexp search only once,
which was unexpectedly done twice in the original implementation.

Fixes [Bug #17119]
2020-08-13 20:50:50 +09:00
Kasumi Hanazuki e79cdcf61b string.c(rb_str_split_m): Handle /\K/ correctly
Use BEG(0) instead of the result of rb_reg_search to handle the cases
when the separator Regexp contains /\K/ (lookbehind) operator.

Fixes [Bug #17113]
2020-08-12 10:01:39 +09:00
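In the style of the examples in the neighbouring `\K` commits, the effect on `String#split` can be sketched like this, assuming a Ruby version that includes the fix (3.0 or later); the input string is borrowed from those commits:

```ruby
# /b\Kc/ matches a "c" that directly follows a "b"; \K drops the "b" from
# the reported match, so only the "c" acts as the separator.
parts = "abcdbce".split(/b\Kc/)

p parts   # with the patch: ["ab", "db", "e"]
```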
Nobuyoshi Nakada 0ca6b973e8
Removed non-ASCII code to suppress warnings by localized compilers 2020-08-10 19:46:13 +09:00
Nobuyoshi Nakada fac62f094e
Adjust indent 2020-08-10 16:35:42 +09:00
Kazuhiro NISHIYAMA 946cd6c534
Use https instead of http 2020-07-28 19:51:54 +09:00
卜部昌平 de3e931df7 add UNREACHABLE_RETURN
Not every compiler understands that rb_raise does not return.  When a
function does not end with a return statement, such compilers can issue
warnings.  We had better tell them about reachability.
2020-06-29 11:05:41 +09:00
卜部昌平 5f926b2b00 rb_str_partition: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 e3d821a36c rb_str_crypt: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 a5ae9aebbc trnext: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 c7a4073154 chompped_length: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 fdae2063fb get_pat_quoted: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 673ddea934 get_pat: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 31e5d138d7 rb_str_slice_bang: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 19f2cabed8 rb_str_aset: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 841eea4bcb rb_str_subpat_set: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 0358846f8c rb_str_update: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 d49924ed81 rb_str_match: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 c422fc4bbc rb_str_rindex_m: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 c29ec1ef1a rb_str_index_m: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
卜部昌平 6df790f22e rb_enc_cr_str_buf_cat: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
S-H-GAMELINKS 0fcb2dd51d
add static modifier for rb_str_ord func 2020-05-27 11:26:44 +09:00
Kazuhiro NISHIYAMA fa7addebb4
Fix typos [ci skip] 2020-05-17 21:01:29 +09:00
Burdette Lamar cc525d764b
[ci skip] Enhanced rdoc for String.new (#3067)
* Per @nobu review

* Enhanced rdoc for String.new

* Respond to review
2020-05-15 14:14:50 -07:00
Nobuyoshi Nakada 693f7ab315 Optimize String#split
Optimized `String#split` with `/ /` (single space regexp) as
simple string splitting.  [ruby-core:98272]

|               |compare-ruby|built-ruby|
|:--------------|-----------:|---------:|
|re_space-1     |    432.786k|    1.539M|
|               |           -|     3.56x|
|re_space-10    |     76.231k|  191.547k|
|               |           -|     2.51x|
|re_space-100   |      8.152k|   19.557k|
|               |           -|     2.40x|
|re_space-1000  |     837.405|    2.022k|
|               |           -|     2.41x|

ruby-core:98272: https://bugs.ruby-lang.org/issues/15771#change-85511
2020-05-12 19:58:58 +09:00
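
As a rough illustration of the case this optimization targets, the sketch below contrasts the single-space regexp with the other common ways to split on spaces. The method name `split_demo` is hypothetical and not part of the commit. The point is that `/ /` splits at every individual space, so it can be handled like a plain one-character separator, unlike the awk-style `" "` form.

```ruby
# Illustrative only: compares three ways of splitting on spaces.
def split_demo(text)
  {
    regexp_space: text.split(/ /),  # single-space regexp: splits at every space
    string_space: text.split(" "),  # awk-style: collapses runs of whitespace
    regexp_run:   text.split(/ +/)  # general regexp: one or more spaces
  }
end

result = split_demo("a b  c")
p result[:regexp_space]  # => ["a", "b", "", "c"]
p result[:string_space]  # => ["a", "b", "c"]
p result[:regexp_run]    # => ["a", "b", "c"]
```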
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平 d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
S.H 034b8472ba
remove unused rb_str_clear define (#3059) 2020-04-25 20:39:44 -07:00
Nobuyoshi Nakada d4215dafea
Use UNREACHABLE_RETURN for non-void function 2020-04-16 17:59:55 +09:00
Kazuhiro NISHIYAMA c79e3a5957
Add {Regexp,String}#match with block to call-seq [ci skip] 2020-04-14 12:39:16 +09:00
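
For reference, a minimal sketch of the block form this call-seq entry documents: when the pattern matches, the block receives the MatchData and its value is returned; when it does not match, the block is skipped and `nil` is returned. The method name `first_word_upcased` is hypothetical.

```ruby
# Illustrative only: String#match with a block.
def first_word_upcased(text)
  # The block runs only on a successful match; its value becomes the result.
  text.match(/\w+/) { |m| m[0].upcase }
end

p first_word_upcased("hello world")  # => "HELLO"
p first_word_upcased("   ")          # => nil
```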
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
Nobuyoshi Nakada 8a7e0aaaef
Warn non-nil `$/` [Feature #14240] 2020-02-23 13:37:40 +09:00
Nobuyoshi Nakada fce667ed08
Get rid of warnings/exceptions at cleanup
Once `obj_free` has removed all instance variables, including the
encoding index instance variable, `rb_str_free` triggers an
uninitialized instance variable warning and a nil-to-integer
conversion exception.  Both cases allocate objects during GC and
crash.
2020-02-13 12:46:48 +09:00
Nobuyoshi Nakada 160d3165eb
Copy non-inlined encoding index 2020-02-12 20:09:57 +09:00
Nobuyoshi Nakada bdf3032e35
Make temporary lock string encoding free
Because a temporary lock string is hidden, it cannot have instance
variables, including a non-inlined encoding index.
2020-02-12 19:58:22 +09:00
Nobuyoshi Nakada 05229cef45
Improve `String#slice!` performance
Instead of searching twice, once to extract and once to delete,
extract and delete at the position found by the first search.

This makes it nearly twice as fast, for both regexps and strings.

|              |compare-ruby|built-ruby|
|:-------------|-----------:|---------:|
|regexp-short  |      2.143M|    3.918M|
|regexp-long   |    105.162k|  205.410k|
|string-short  |      3.789M|    7.964M|
|string-long   |      1.301M|    2.457M|
2020-01-31 17:12:05 +09:00
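
A short sketch of the observable behaviour that this change speeds up, for both the regexp and the string forms benchmarked above. The method name `slice_demo` is hypothetical. `slice!` returns the removed part and mutates the receiver, or returns `nil` and leaves it untouched when nothing matches.

```ruby
# Illustrative only: String#slice! extracts and deletes in one call.
def slice_demo
  by_regexp = "hello world".dup
  removed_by_regexp = by_regexp.slice!(/o w/)  # remove the first regexp match

  by_string = "hello world".dup
  removed_by_string = by_string.slice!("lo")   # remove the first substring match
  missing = by_string.slice!("xyz")            # no match: nil, receiver unchanged

  [removed_by_regexp, by_regexp, removed_by_string, by_string, missing]
end

p slice_demo  # => ["o w", "hellorld", "lo", "hel world", nil]
```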
Nobuyoshi Nakada 0dd6f020fc
Make `empty_string` a fake string 2020-01-31 14:24:07 +09:00
Jean Boussier 52dc0632fa Avoid allocating a temporary empty string in String#slice! 2020-01-31 09:22:20 +09:00
Nobuyoshi Nakada aefb13eb63
Added rb_warn_deprecated_to_remove
Warn about the deprecation and future removal, obeying the warning
flag.
2020-01-23 21:42:15 +09:00
Jeremy Evans e18b817b1f Make taint warnings non-verbose instead of verbose 2020-01-22 11:19:13 -08:00
Nobuyoshi Nakada fce54a5404
Fix `String#partition`
Split at the matched part even when the separator matches an empty
string at the beginning.  [Bug #11014]
2020-01-16 15:36:38 +09:00
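
The following sketch shows the case this fix concerns, assuming a Ruby that includes the change. The method name `partition_demo` is hypothetical. A regexp that matches the empty string at position 0 now counts as a match, so the head and the separator are both empty and the whole string lands in the tail, which is distinct from the no-match result.

```ruby
# Illustrative only: String#partition with an empty match at the start.
def partition_demo
  {
    plain:          "key=value".partition("="),  # ordinary separator
    empty_at_start: "foo".partition(/^=*/),      # empty match at index 0
    no_match:       "foo".partition("=")         # separator absent
  }
end

result = partition_demo
p result[:plain]           # => ["key", "=", "value"]
p result[:empty_at_start]  # => ["", "", "foo"]
p result[:no_match]        # => ["foo", "", ""]
```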
Marcus Stollsteimer 1d09acd82b [DOC] Improve docs for String#match
Fix invalid code to make it syntax highlighted; other small fixes.
2020-01-08 20:53:31 +01:00
Marcus Stollsteimer f74021e12b Improve docs for String#=~
Move existing example to the corresponding paragraph and
add an example for `string =~ regexp` vs. `regexp =~ string`;
avoid using the receiver's identifier from the call-seq
because it does not appear in rendered HTML docs;
mention deprecation of Object#=~; fix some markup and typos.
2020-01-08 20:47:10 +01:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves committers' daily life by no longer #include-ing everything from
internal.h; each file includes what it needs instead.  This would
significantly speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

The exceptions are the win32-related headers, which tend not to be
self-contained (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
Nobuyoshi Nakada 2b2030f265
Refined the warning message for $, and $;
[Bug #16438]
2019-12-20 15:09:23 +09:00
NARUSE, Yui b5fbefbf2c Added Symbol#start_with? and Symbol#end_with? method. [Feature #16348] 2019-11-28 23:49:28 +09:00
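
A small usage sketch for the two new methods, assuming a Ruby that ships them. The method name `classify` and its category names are hypothetical. The methods let a symbol be tested directly, without converting it to a String first.

```ruby
# Illustrative only: classify method names by prefix or suffix.
def classify(sym)
  if sym.end_with?("?")
    :predicate
  elsif sym.end_with?("!")
    :bang
  elsif sym.start_with?("_")
    :internal
  else
    :plain
  end
end

p classify(:empty?)   # => :predicate
p classify(:upcase!)  # => :bang
p classify(:_helper)  # => :internal
p classify(:map)      # => :plain
```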
卜部昌平 7a9b2039b7 delete unused codes
Suppress compiler warnings.
2019-11-18 18:28:03 +09:00
Nobuyoshi Nakada 22c9504905
rb_tainted_str_new_with_enc is no longer used 2019-11-18 11:05:22 +09:00
Jeremy Evans ffd0820ab3 Deprecate taint/trust and related methods, and make the methods no-ops
This removes the related tests, and puts the related specs behind
version guards.  This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
2019-11-18 01:00:25 +02:00
卜部昌平 c9ffe751d1 delete unused functions
Looking at the list of symbols inside of libruby-static.a, I found
hundreds of functions that are defined but never used.

There can be reasons for each of them (e.g. some functions are
specific to some platform, some are useful when debugging, etc).
However it seems the functions deleted here exist for no reason.

This changeset reduces the size of ruby binary from 26,671,456
bytes to 26,592,864 bytes on my machine.
2019-11-14 20:35:48 +09:00
NARUSE, Yui bea322a352 Revert "[EXPERIMENTAL] Make Symbol#to_s return a frozen String [Feature #16150]"
This reverts commit 6ffc045a81.
2019-11-05 17:30:54 +09:00
zverok bddb31bb37 Documentation improvements for Ruby core
* Top-level `return`;
* Documentation for comments syntax;
* `rescue` inside blocks;
* Enhance `Object#to_enum` docs;
* Make `chomp:` option more obvious for `String#each_line` and
  `#lines`;
* Enhance `Proc#>>` and `#<<` docs;
* Enhance `Process` class docs.
2019-10-26 14:58:08 +09:00
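
As an illustration of the `chomp:` option that this documentation change highlights, the sketch below compares the default and chomped results. The method name `line_forms` is hypothetical.

```ruby
# Illustrative only: the chomp: option strips line terminators from each line.
def line_forms(text)
  {
    raw:     text.lines,                         # terminators kept
    chomped: text.lines(chomp: true),            # terminators removed
    each:    text.each_line(chomp: true).to_a    # same option on each_line
  }
end

result = line_forms("a\nb\n")
p result[:raw]      # => ["a\n", "b\n"]
p result[:chomped]  # => ["a", "b"]
p result[:each]     # => ["a", "b"]
```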
Lourens Naudé 9c24ce551d Reduce the minimum string buffer size from 127 to 63 bytes 2019-10-11 11:16:16 +09:00
卜部昌平 7e0ae1698d avoid overflow in integer multiplication
This changeset basically replaces `ruby_xmalloc(x * y)` with
`ruby_xmalloc2(x, y)`.  Some convenience functions are also
provided, for instance `rb_xmalloc_mul_add(x, y, z)`, which allocates
x * y + z bytes.
2019-10-09 12:12:28 +09:00
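
The hazard being removed can be modelled in Ruby, with the caveat that this is only a model: Ruby integers do not overflow, so the wrap-around of a C `size_t` is simulated by masking. All three method names are hypothetical, a 64-bit `size_t` is assumed, and the raised error class is a choice made for this sketch rather than a statement about the C functions.

```ruby
# Illustrative model only; assumes a 64-bit size_t.
def size_max
  2**64 - 1
end

# What an unchecked C multiplication does: wrap modulo 2**64.
def wrapping_mul(x, y)
  (x * y) & size_max
end

# A checked multiply refuses to produce a size that does not fit.
def checked_mul(x, y)
  product = x * y
  raise ArgumentError, "size overflow: #{x} * #{y}" if product > size_max
  product
end

p wrapping_mul(2**32, 2**32)  # => 0, so an allocation would be far too small
p checked_mul(3, 4)           # => 12
```

Passing the two factors separately, as the two-argument form does, is what makes the check possible: once `x * y` has been computed in C, the overflow is no longer detectable from the product alone.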
Benoit Daloze 6ffc045a81 [EXPERIMENTAL] Make Symbol#to_s return a frozen String
* Always the same frozen String for a given Symbol.
* Avoids extra allocations whenever calling Symbol#to_s.
* See [Feature #16150]
2019-09-26 10:23:02 +02:00
Alan Wu 47a234954a Rename STR_IS_SHARED_M to STR_BORROWED
Since the introduction of STR_SHARED_ROOT, the word "shared"
has become very overloaded with respect to String's internal
states. Use a different name for STR_IS_SHARED_M and explain
its purpose.
2019-09-26 15:30:18 +09:00
Alan Wu 93faa011d3 Tag string shared roots to fix use-after-free
The buffer deduplication codepath in rb_fstring can be used to free the buffer
of shared string roots, which leads to use-after-free.

Introduce a new flag to tag strings that at one point have been a shared root.
Check for it in rb_fstring to avoid freeing buffers that are shared by
multiple strings. This change is based on nobu's idea in [ruby-core:94838].

The included test case tests for the sequence of calls to internal functions
that leads to this bug. See the attached ticket for Ruby-level repros.

[Bug #16151]
2019-09-26 15:30:18 +09:00
Jeremy Evans 6f9b86616a Make Symbol#to_proc calls handle keyword arguments
Make rb_sym_proc_call take a flag for whether a keyword argument
is used, and use the new rb_funcall_with_block_kw function to
pass that information.
2019-09-05 17:47:12 -07:00
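
At the Ruby level, the effect of this change can be sketched as follows, assuming a Ruby that includes it. The helper names `build_greeter` and `greet_via_symbol_proc` and the `greet` method are hypothetical. Keyword arguments given to the proc created by `Symbol#to_proc` reach the target method as keywords.

```ruby
# Illustrative only: builds an object whose method takes keyword arguments.
def build_greeter
  greeter = Object.new
  def greeter.greet(name:, punctuation: "!")
    "Hello, #{name}#{punctuation}"
  end
  greeter
end

# Calls #greet through the proc produced by Symbol#to_proc.
def greet_via_symbol_proc(greeter, **options)
  :greet.to_proc.call(greeter, **options)
end

p greet_via_symbol_proc(build_greeter, name: "Ruby")
# => "Hello, Ruby!"
p greet_via_symbol_proc(build_greeter, name: "Ruby", punctuation: "?")
# => "Hello, Ruby?"
```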