Profiling of `JSON.dump` shows a significant amount of time spent
in `rb_enc_str_asciionly_p`, in large part because it fetches the
encoding.
It can be made twice as fast in this scenario by first checking the
coderange and only falling back to fetching the encoding if the
coderange is unknown.
Additionally we can skip fetching the encoding for the common
popular encodings.
On OSX, String#rindex is slow due to the lack of `memrchr`.
The fallback implementation finds a match by instead doing
a `memcmp` on every single character in the search string
looking for a substring match.
For OSX hosts, this changeset introduces a simple `memrchr`
implementation, `rb_memrchr`, that can be used instead. An
example benchmark below demonstrates an 8000 char long
search string with a 10 char substring near the end.
```
ruby-master | substring near the end | osx
UTF-8
user system total real
index 0.000111 0.000000 0.000111 ( 0.000110)
rindex 0.000446 0.000005 0.000451 ( 0.000454)
```
```
ruby-patched | substring near the end | osx
UTF-8
user system total real
index 0.000112 0.000000 0.000112 ( 0.000111)
rindex 0.000057 0.000001 0.000058 ( 0.000057)
```
`rb_enc_from_index` is a costly operation so it is worth avoiding
to call it for the common encodings.
Also in the case of UTF-8, it's more efficient to scan the
coderange if it is unknown that to fallback to the slower
algorithms.
If we assume that most strings we modify are not frozen and
are independent, then we can optimize this case by replacing
multiple flag checks by a single mask check.
* Document why we need to explicitly spill registers.
* Simplify passing a byte value to `str_buf_cat`.
* YJIT: Enhance the `String#<<` method substitution to handle integer codepoint values.
* YJIT: Move runtime type check into YJIT.
Performing the check in YJIT means we can make assumptions about the type. It also improves correctness of stack traces in cases where the codepoint argument is not a String or a Fixnum.
[Bug #20585]
This was changed in 36a06efdd9f0604093dccbaf96d4e2cb17874dc8 because
`String.new(1024)` would end up allocating `1025` bytes, but the problem
with this change is that the caller may be trying to right size a String.
So instead, we should just better document the behavior of `capacity:`.
Previously, on common platforms, this code made a pointer to a union of
8 byte alignment out of a char pointer that is not guaranteed to satisfy
the alignment requirement. That is undefined behavior according
to [C99 6.3.2.3p7](https://port70.net/~nsz/c/c99/n1256.html#6.3.2.3p7).
Use memcpy() to do the unaligned read instead.
[Feature #20205]
Now that chilled strings no longer appear as frozen, there is no
need to offer an API to check for chilled strings.
We however need to change `rb_check_frozen_internal` to no
longer be a macro, as it needs to check for chilled strings.
With embedded strings we often have some space left in the slot, which
we can use to store the string Hash code.
It's probably only worth it for string literals, as they are the ones
likely to be used as hash keys.
We chose to store the Hash code right after the string terminator as to
make it easy/fast to compute, and not require one more union in RString.
```
compare-ruby: ruby 3.4.0dev (2024-04-22T06:32:21Z main f77618c1fa) [arm64-darwin23]
built-ruby: ruby 3.4.0dev (2024-04-22T10:13:03Z interned-string-ha.. 8a1a32331b) [arm64-darwin23]
last_commit=Precompute embedded string literals hash code
| |compare-ruby|built-ruby|
|:-----------|-----------:|---------:|
|symbol | 39.275M| 39.753M|
| | -| 1.01x|
|dyn_symbol | 37.348M| 37.704M|
| | -| 1.01x|
|small_lit | 29.514M| 33.948M|
| | -| 1.15x|
|frozen_lit | 27.180M| 33.056M|
| | -| 1.22x|
|iseq_lit | 27.391M| 32.242M|
| | -| 1.18x|
```
Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
They were initially made frozen to avoid false positives for cases such
as:
str = str.dup if str.frozen?
But this may cause bugs and is generally confusing for users.
[Feature #20205]
Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
[Feature #18576]
Since outright renaming `ASCII-8BIT` is deemed to backward incompatible,
the next best thing would be to only change its `#inspect`, particularly
in exception messages.
Previously it would bypass the `FL_ABLE` check, but
since shapes introduction, it started having a different
behavior than `OBJ_FREEZE`, as it would onyl set the `FL_FREEZE`
flag, but not update the shape.
I have no indication of this causing a bug yet, but it seems
like a trap waiting to happen.
I discovered the problem in `compile.c` from a failing
TestIseqLoad#test_stressful_roundtrip test with ASAN enabled. The other
two changes in array.c and string.c I found by auditing similar usages
of DATA_PTR in the codebase.
[Bug #20402]
Some extensions (like stringio) may need to differentiate between
chilled strings and frozen strings.
They can now use rb_str_chilled_p but must check for its presence since
the function will be removed when chilled strings are removed.
[Bug #20389]
[Feature #20205]
Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
[Feature #20205]
As a path toward enabling frozen string literals by default in the future,
this commit introduce "chilled strings". From a user perspective chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than to raise a
`FrozenError`.
Implementation wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.
When code is compiled with frozen string literals neither explictly enabled
or disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.
Chilled strings have the `FL_FREEZE` flag as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.
Notes:
- `String#freeze`: clears the chilled flag.
- `String#-@`: acts as if the string was mutable.
- `String#+@`: acts as if the string was mutable.
- `String#clone`: copies the chilled flag.
Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
The documentation for `rb_enc_interned_str_cstr` notes that `enc` can be
a null pointer, but this currently causes a segmentation fault when
trying to autoload the encoding. This commit fixes the issue by checking
for NULL before calling `rb_enc_autoload`.
* YJIT: Lazily push a frame for specialized C funcs
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
* Fix a comment on pc_to_cfunc
* Rename rb_yjit_check_pc to rb_yjit_lazy_push_frame
* Rename it to jit_prepare_lazy_frame_call
* Fix a typo
* Optimize String#getbyte as well
* Optimize String#byteslice as well
---------
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
* Specialize String#byteslice(a, b)
This adds a specialization for String#byteslice when there are two
parameters.
This makes our protobuf parser go from 5.84x slower to 5.33x slower
```
Comparison:
decode upstream (53738 bytes): 7228.5 i/s
decode protobuff (53738 bytes): 1236.8 i/s - 5.84x slower
Comparison:
decode upstream (53738 bytes): 7024.8 i/s
decode protobuff (53738 bytes): 1318.5 i/s - 5.33x slower
```
* Update yjit/src/codegen.rs
---------
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
assert does not print the bug report, only the file and line number of
the assertion that failed. RUBY_ASSERT prints the full bug report, which
makes it much easier to debug.
[Bug #20169]
Embedded strings are not safe for system calls without the GVL because
compaction can cause pages to be locked causing the operation to fail
with EFAULT. This commit changes io_fwrite to use rb_str_tmp_frozen_no_embed_acquire,
which guarantees that the return string is not embedded.