github/ruby - ruby

Commit Graph

Author	SHA1	Message	Date
Nobuyoshi Nakada	50520cc193	[DOC] Missing comment markers	2023-09-27 16:18:05 +09:00
Nobuyoshi Nakada	6b66b5fded	[Bug #19902] Update the coderange regarding the changed region	2023-09-26 15:35:40 +09:00
John Hawthorn	d89b15cdce	Use end of char boundary in start_with? Previously we used the next character following the found prefix to determine if the match ended on a broken character. This had caused surprising behaviour when a valid character was followed by a UTF-8 continuation byte. This commit changes the behaviour to instead look for the end of the last character in the prefix. [Bug #19784] Co-authored-by: ywenc <ywenc@github.com> Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>	2023-09-01 16:23:28 -07:00
Nobuyoshi Nakada	b054c2fe06	[Bug #19784] Fix behaviors against prefix with broken encoding - String#start_with? - String#delete_prefix - String#delete_prefix!	2023-08-26 08:58:02 +09:00
Nobuyoshi Nakada	00ac3a64ba	Introduce `at_char_boundary` function	2023-08-26 08:58:02 +09:00
Alan Wu	2214bcb70d	Fix premature string collection during append Previously, the following crashed due to use-after-free with AArch64 Alpine Linux 3.18.3 (aarch64-linux-musl): ```ruby str = 'a' * (32*1024*1024) p({z: str}) ``` 32 MiB is the default for `GC_MALLOC_LIMIT_MAX`, and the crash could be dodged by setting `RUBY_GC_MALLOC_LIMIT_MAX` to large values. Under a debugger, one can see the `str2` of rb_str_buf_append() getting prematurely collected while str_buf_cat4() allocates capacity. Add GC guards so the buffer of `str2` lives across the GC run initiated in str_buf_cat4(). [Bug #19792]	2023-08-23 18:07:49 -04:00
Peter Zhu	837c12b0c8	Use STR_EMBED_P instead of testing STR_NOEMBED	2023-08-22 16:31:36 -04:00
Peter Zhu	724223b4ca	Don't check for STR_NOEMBED in rb_fstring We don't need to check for STR_NOEMBED because the check above for STR_EMBED_P means that it can never be false.	2023-08-18 09:24:45 -04:00
Burdette Lamar	0e162457d6	[DOC] Don't suppress autolinks (#8208)	2023-08-11 19:22:21 -04:00
Kunshan Wang	132f097149	No computing embed_capa_max in str_subseq Fix str_subseq so that it does not attempt to predict the size of the object returned by str_alloc_heap.	2023-08-03 14:52:44 -04:00
Nobuyoshi Nakada	af04e26924	Fill terminator properly	2023-07-28 22:17:53 +09:00
alexandre184	e5825de7c9	[Bug #19769] Fix range of size 1 in `String#tr`	2023-07-15 16:36:53 +09:00
Nobuyoshi Nakada	9dcdffb8bf	Make the string index functions closer to symmetric So that irregular parts may be more noticeable.	2023-07-09 18:45:51 +09:00
Nobuyoshi Nakada	5e79d5a560	Make `rb_str_rindex` return byte index Leave callers to convert byte index to char index, as well as `rb_str_index`, so that `rb_str_rpartition` does not need to re-convert char index to byte index.	2023-07-09 16:39:28 +09:00
Nobuyoshi Nakada	e2257831ab	[Bug #19763] Raise same message exception for regexp	2023-07-09 16:21:02 +09:00
Nobuyoshi Nakada	3d7a6bbc12	Ensure the byte position is a valid boundary	2023-06-28 22:42:04 +09:00
Nobuyoshi Nakada	bc3ac1872e	[Bug #19748] Fix out-of-bound access in `String#byteindex`	2023-06-28 17:23:32 +09:00
Nobuyoshi Nakada	0cbfeb8210	[Bug #19746] `String#index` with regexp should clear `$~` unless matched	2023-06-28 14:06:28 +09:00
Burdette Lamar	932dd9f10e	[DOC] Regexp doc (#7923)	2023-06-20 09:28:21 -04:00
Matt Valentine-House	d54f66d1b4	Assign into optimal size pools using String#split("") When String#split is used with an empty string as the field separator it effectively splits the original string into chars, and there is a pre-existing fast path for this using SPLIT_TYPE_CHARS. However this path creates an empty array in the smallest size pool and grows from there, despite already knowing the size of the desired array. This commit pre-allocates the correct size array in this case in order to allow the arrays to be embedded and avoid being allocated in the transient heap	2023-06-09 10:54:40 +01:00
Peter Zhu	7577c101ed	Unify length field for embedded and heap strings (#7908) * Unify length field for embedded and heap strings The length field is of the same type and position in RString for both embedded and heap allocated strings, so we can unify it. * Remove RSTRING_EMBED_LEN	2023-06-06 10:19:20 -04:00
Peter Zhu	1a7ee14578	[DOC] Update flags doc for strings The length of an embedded string is no longer in the flags.	2023-06-05 09:49:35 -04:00
Peter Zhu	a16cffe384	Simplify duplicated code The capacity of the string can be calculated using the str_capacity function.	2023-06-01 08:32:29 -04:00
Peter Zhu	8a8618d4f3	Don't refetch ptr and len The call to RSTRING_GETMEM already fetched the pointer and length, so we don't need to fetch it again.	2023-06-01 08:32:29 -04:00
Peter Zhu	c37ebfe08f	Remove dead code in string.c The STR_DEC_LEN macro is not used.	2023-05-26 13:34:26 -04:00
Matt Valentine-House	026321c5b9	[Feature #19474] Refactor NEWOBJ macros NEWOBJ_OF is now our canonical newobj macro. It takes an optional ec	2023-04-06 11:07:16 +01:00
Peter Zhu	1da2e7fca3	[Feature #19579] Remove !USE_RVARGC code (#7655) Remove !USE_RVARGC code [Feature #19579] The Variable Width Allocation feature was turned on by default in Ruby 3.2. Since then, we haven't received bug reports or backports to the non-Variable Width Allocation code paths, so we assume that nobody is using it. We also don't plan on maintaining the non-Variable Width Allocation code, so we are going to remove it.	2023-04-04 17:30:06 -04:00
Takashi Kokubun	32e0c97dfa	RJIT: Optimize String#bytesize	2023-03-18 23:35:42 -07:00
Takashi Kokubun	233ddfac54	Stop exporting symbols for MJIT	2023-03-06 21:59:23 -08:00
Takashi Kokubun	f0218303e0	Optimize String#getbyte	2023-03-05 23:28:59 -08:00
Rômulo Ceccon	d78ae78fd7	rb_str_modify_expand: clear the string coderange [Bug #19468] `b0b9f7201a` erroneously stopped clearing the coderange. Since `rb_str_modify` clears it, `rb_str_modify_expand` should too.	2023-03-03 15:32:25 +01:00
John Bampton	2f7270c681	Fix spelling (#7389)	2023-02-27 09:56:06 -08:00
Adam Daniels	2535b1819f	Symbol#end_with? accepts Strings only Regular expressions are not supported (same as String#end_with?).	2023-02-27 09:26:17 +09:00
BurdetteLamar	3b239d2480	Remove (newly unneeded) remarks about aliases	2023-02-19 14:26:34 -08:00
zverok	51bb5b23d4	[DOC] Small adjustment for String method docs * Hide freeze method (no useful docs, same as Object#freeze) * Add dedup to call-seq of str_uminus	2023-02-19 22:32:52 +02:00
Matt Valentine-House	d620855101	Rename rb_str_splice_{0,1} -> rb_str_update_{0,1}	2023-02-09 15:02:26 -05:00
Matt Valentine-House	601b83dcfc	Remove alias macro rb_str_splice	2023-02-09 15:02:26 -05:00
Matt Valentine-House	72aba64fff	Merge gc.h and internal/gc.h [Feature #19425]	2023-02-09 10:32:29 -05:00
Jean Boussier	c6b90e5e9c	Mark "mapping_buffer" as write barrier protected It doesn't have any reference so it can be marked as protected.	2023-02-03 19:10:42 +01:00
Shugo Maeda	cce3960964	[Feature #19314] Add new arguments of String#bytesplice bytesplice(index, length, str, str_index, str_length) -> string bytesplice(range, str, str_range) -> string In these forms, the content of +self+ is replaced by str.byteslice(str_index, str_length) or str.byteslice(str_range); however the substring of +str+ is not allocated as a new string.	2023-01-20 18:02:37 +09:00
Shugo Maeda	f7b72462aa	String#bytesplice should return self In Feature #19314, we concluded that the return value of String#bytesplice should be changed from the source string to the receiver, because the source string is useless and confusing when extra arguments are added. This change should be included in Ruby 3.2.1.	2023-01-19 17:13:07 +09:00
Matt Valentine-House	8a93e5d01b	Use str_enc_copy_direct to improve performance str_enc_copy_direct copies the string encoding over without checking the frozen status of the string. Because we know that we're safe here (we only use this function when interpolating strings on the stack via a concatstrings instruction) we can safely skip this check	2023-01-13 10:31:35 -05:00
Matt Valentine-House	bb5fddd070	Remove MIN_PRE_ALLOC_SIZE from Strings. This optimisation is no longer helpful now that we use VWA to allocate strings in larger size pools where they can be embedded.	2023-01-13 10:31:35 -05:00
Peter Zhu	bfc887f391	Add str_enc_copy_direct This commit adds str_enc_copy_direct, which is like str_enc_copy but does not check the frozen status of str1 and does not check the validity of the encoding of str2. This makes certain string operations ~5% faster. ```ruby puts(Benchmark.measure do 100_000_000.times do "a".downcase end end) ``` Before this patch: ``` 7.587598 0.040858 7.628456 ( 7.669022) ``` After this patch: ``` 7.133128 0.039809 7.172937 ( 7.183124) ```	2023-01-12 09:06:15 -05:00
Peter Zhu	9726736006	Set STR_SHARED_ROOT flag on root of string	2023-01-09 08:49:29 -05:00
Peter Zhu	3be2acfafd	Fix re-embedding of strings during compaction The reference updating code for strings is not re-embedding strings because the code is incorrectly wrapped inside of a `if (STR_SHARED_P(obj))` clause. Shared strings can't be re-embedded so this ends up being a no-op. This means that strings can be moved to a large size pool during compaction, but won't be re-embedded, which would waste the space.	2023-01-09 08:49:29 -05:00
Peter Zhu	d8ef0a98c6	[Bug #19319] Fix crash in rb_str_casemap The following code crashes on my machine: ``` GC.stress = true str = "testing testing testing" puts str.capitalize ``` We need to ensure that the object `buffer_anchor` remains on the stack so it does not get GC'd.	2023-01-06 11:36:28 -05:00
Nobuyoshi Nakada	98fbebf110	[DOC] Fix typo	2022-12-22 00:01:18 +09:00
S-H-GAMELINKS	1a64d45c67	Introduce encoding check macro	2022-12-02 01:31:27 +09:00
Jeremy Evans	571d21fd4a	Make String#rstrip{,!} raise Encoding::CompatibilityError for broken coderange It's questionable whether we want to allow rstrip to work for strings where the broken coderange occurs before the trailing whitespace and not after, but this approach is probably simpler, and I don't think users should expect string operations like rstrip to work on broken strings. In some cases, this changes rstrip to raise Encoding::CompatibilityError instead of ArgumentError. However, as the problem is related to an encoding issue in the receiver, and not due to an issue with an argument, I think Encoding::CompatibilityError is the more appropriate error. Fixes [Bug #18931]	2022-11-24 18:24:42 -08:00


1826 Commits
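
Several commits in this log touch the internals of everyday String and Symbol methods (String#split with an empty separator, String#getbyte, String#bytesize, Symbol#end_with?). As a rough illustration only, the sketch below calls those methods and shows long-standing behaviour that should not depend on any of the commits above. The helper name `split_into_chars` is hypothetical and does not come from the Ruby source tree.

```ruby
# Illustrative sketch: exercises a few methods named in the commit log.
# `split_into_chars` is a made-up helper, not part of Ruby itself.

def split_into_chars(str)
  # An empty-string separator splits the receiver into its characters;
  # commit d54f66d1b4 changes how the result array is allocated,
  # not what it contains.
  str.split("")
end

puts split_into_chars("ruby").inspect  # the individual characters
puts "ruby".getbyte(0)                 # integer value of the first byte
puts "ruby".bytesize                   # number of bytes in the string
puts :ruby.end_with?("by")             # Symbol#end_with? takes String suffixes
```

The example deliberately avoids the behaviours that the listed bug fixes change (for example prefixes with broken encodings, or the return value of String#bytesplice), since those differ between Ruby versions.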