Fix update_coderange for binary strings

Although a binary (aka ASCII-8BIT) string will never have a broken
coderange, it still has to differentiate between "valid" and "7bit".
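
For illustration, the distinction can be observed from Ruby itself. This is a minimal sketch that is not part of the commit; the variable names are made up for the example, and it relies only on `String#b`, `String#ascii_only?` and `String#valid_encoding?`.

```ruby
# Binary (ASCII-8BIT) strings are never "broken", but they can still be
# either 7bit (only bytes below 0x80) or valid (at least one high byte).
seven_bit = "abc".b    # illustrative name: only ASCII bytes
high_byte = "\xe2".b   # illustrative name: one byte >= 0x80

p seven_bit.ascii_only?      # true  (7bit coderange)
p high_byte.ascii_only?      # false (valid coderange, not 7bit)
p high_byte.valid_encoding?  # true  (binary strings are never broken)
```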

On Ruby 3.4/trunk this problem is masked because we now clear the
coderange more aggressively in rb_str_resize, and we happened to always
be shrinking this string, but we should not assume that.

On Ruby 3.3 this created strings for which `ascii_only?` was true when
it should not have been, as well as other problems.
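
A rough way to check a given Ruby build for the symptom, modeled on the test added in this commit. `binary_pad` is a hypothetical helper introduced only for this sketch; on a fixed build `ascii_only` is expected to be false for every width, while an affected 3.3 build may report true for some widths.

```ruby
# Hypothetical helper (not part of the commit): pad a single high byte to
# `width` using a binary format string and report properties of the result.
def binary_pad(width)
  str = sprintf("%*s".b, width, "\xe2".b)
  { bytesize: str.bytesize, ascii_only: str.ascii_only?, encoding: str.encoding }
end

# Print any widths for which the result wrongly claims to be ASCII-only.
suspicious = (1..500).select { |w| binary_pad(w)[:ascii_only] }
p suspicious
```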

Fixes [Bug #20883]

Co-authored-by: Daniel Colson <danieljamescolson@gmail.com>
Co-authored-by: Matthew Draper <matthew@trebex.net>
John Hawthorn 2024-10-22 19:22:37 -07:00
Parent 51ffef2819
Commit 1f6dd9071c
2 changed files: 9 additions and 2 deletions

@@ -247,8 +247,7 @@ rb_str_format(int argc, const VALUE *argv, VALUE fmt)
 }
 #define update_coderange(partial) do { \
-    if (coderange != ENC_CODERANGE_BROKEN && scanned < blen \
-        && rb_enc_to_index(enc) /* != ENCINDEX_ASCII_8BIT */) { \
+    if (coderange != ENC_CODERANGE_BROKEN && scanned < blen) { \
         int cr = coderange; \
         scanned += rb_str_coderange_scan_restartable(buf+scanned, buf+blen, enc, &cr); \
         ENC_CODERANGE_SET(result, \

@@ -546,4 +546,12 @@ class TestSprintf < Test::Unit::TestCase
       sprintf("%*s", RbConfig::LIMITS["INT_MIN"], "")
     end
   end
+  def test_binary_format_coderange
+    1.upto(500) do |i|
+      str = sprintf("%*s".b, i, "\xe2".b)
+      refute_predicate str, :ascii_only?
+      assert_equal i, str.bytesize
+    end
+  end
 end