Ignore platforms where `CHAR_BIT` > 8, since `ch` indexes
`escape_table`, which is hard-coded to 256 elements.
```
../../../../src/ext/json/generator/generator.c(121): warning C4333: '>>': right shift by too large amount, data loss
../../../../src/ext/json/generator/generator.c(122): warning C4333: '>>': right shift by too large amount, data loss
../../../../src/ext/json/generator/generator.c(243): warning C4333: '>>': right shift by too large amount, data loss
../../../../src/ext/json/generator/generator.c(244): warning C4333: '>>': right shift by too large amount, data loss
../../../../src/ext/json/generator/generator.c(291): warning C4333: '>>': right shift by too large amount, data loss
../../../../src/ext/json/generator/generator.c(292): warning C4333: '>>': right shift by too large amount, data loss
```
https://github.com/ruby/json/commit/fb82373612
Fix: https://github.com/ruby/json/issues/667
This is yet another behavior on which the various implementations
differed, but the C implementation used to call `to_json` on String
subclasses used as keys.
This was optimized out in e125072130229e54a651f7b11d7d5a782ae7fb65
but there is an Active Support test case for it, so it's best to
make all 3 implementations respect this behavior.
Given we expect these to almost always be null, we might as
well keep them in RString.
And even when provided, assuming we're passed frozen strings
we'll save on copying them.
This also reduces the size of the struct from 112B to 72B.
https://github.com/ruby/json/commit/6382c231b0
Ref: https://github.com/ruby/json/issues/655
Followup: https://github.com/ruby/json/issues/657
Assuming the generator might be used for fairly small documents,
we can start with a reasonably sized buffer on the stack, and if
we outgrow it, we can spill to the heap.
In a way this is optimizing for micro-benchmarks, but there are
valid use cases for fairly small JSON documents in real-world
scenarios, so thrashing the GC less in such cases makes sense.
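The stack-then-heap idea can be sketched in plain C as follows. This is only an illustration under stated assumptions: the `sketch_buffer` type and its functions are hypothetical names, not the gem's actual `FBuffer` API, and the doubling growth policy is a choice made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type for illustration; not the gem's FBuffer. */
typedef struct {
    char *ptr;     /* current storage: caller-provided array or heap block */
    size_t len;    /* bytes in use */
    size_t capa;   /* bytes available */
    int on_heap;   /* non-zero once the contents have spilled to the heap */
} sketch_buffer;

/* Start on storage supplied by the caller, typically a stack array. */
static void sketch_buffer_init(sketch_buffer *b, char *stack_mem, size_t stack_capa)
{
    b->ptr = stack_mem;
    b->len = 0;
    b->capa = stack_capa;
    b->on_heap = 0;
}

/* Append n bytes. The first time the stack storage is outgrown, the
 * contents are copied to a heap block; later growth uses realloc.
 * Returns 0 on success, -1 if allocation fails. */
static int sketch_buffer_append(sketch_buffer *b, const char *src, size_t n)
{
    if (n > b->capa - b->len) {
        size_t new_capa = b->capa ? b->capa : 1;
        char *mem;

        while (new_capa - b->len < n) new_capa *= 2;

        if (b->on_heap) {
            mem = realloc(b->ptr, new_capa);
            if (!mem) return -1;
        } else {
            mem = malloc(new_capa);
            if (!mem) return -1;
            if (b->len) memcpy(mem, b->ptr, b->len);
            b->on_heap = 1;
        }
        b->ptr = mem;
        b->capa = new_capa;
    }
    if (n) memcpy(b->ptr + b->len, src, n);
    b->len += n;
    assert(b->len <= b->capa);
    return 0;
}

/* Release heap storage, if any; stack storage needs no cleanup. */
static void sketch_buffer_free(sketch_buffer *b)
{
    if (b->on_heap) free(b->ptr);
    b->ptr = NULL;
    b->len = b->capa = 0;
    b->on_heap = 0;
}
```

A caller would declare a `char` array as a local variable, pass it to `sketch_buffer_init`, and call `sketch_buffer_free` before returning. For documents that fit in the array, no allocation happens at all.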
Before:
```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Oj 518.700k i/100ms
JSON reuse 483.370k i/100ms
Calculating -------------------------------------
Oj 5.722M (± 1.8%) i/s (174.76 ns/i) - 29.047M in 5.077823s
JSON reuse 5.278M (± 1.5%) i/s (189.46 ns/i) - 26.585M in 5.038172s
Comparison:
Oj: 5722283.8 i/s
JSON reuse: 5278061.7 i/s - 1.08x slower
```
After:
```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Oj 517.837k i/100ms
JSON reuse 548.871k i/100ms
Calculating -------------------------------------
Oj 5.693M (± 1.6%) i/s (175.65 ns/i) - 28.481M in 5.004056s
JSON reuse 5.855M (± 1.2%) i/s (170.80 ns/i) - 29.639M in 5.063004s
Comparison:
Oj: 5692985.6 i/s
JSON reuse: 5854857.9 i/s - 1.03x faster
```
https://github.com/ruby/json/commit/fe607f4806
Fix: https://github.com/ruby/json/issues/646
Since both `json` and `json_pure` expose the same files, if the
versions don't match, the native extension may be loaded with Ruby
code that doesn't match and is incompatible.
By requiring `json/ext/generator/state` from C, we ensure
we're at least loading that file.
But this is a dirty workaround for the 2.7.x branch, we should
find a better way to fully isolate the two gems.
https://github.com/ruby/json/commit/dfdd4acf36
Ref: https://github.com/ruby/json/issues/647
Ref: https://github.com/rubygems/rubygems/pull/6490
Older rubygems are executing `extconf.rb` with a broken `$LOAD_PATH`
causing the `json` gem native extension to be loaded with the stdlib
version of the `.rb` files.
This fails with
```
json/common.rb:82:in `initialize': wrong number of arguments (given 1, expected 0) (ArgumentError)
```
Since this is just for `extconf.rb` we can probably just accept that
extra argument and ignore it.
The bug was fixed in RubyGems 3.4.9 (2023-03-20).
https://github.com/ruby/json/commit/1f5e849fe0
Profiling revealed that we were spending lots of time growing the buffer.
Buffer operations are definitely something we want to optimize, but for
this specific benchmark what we're interested in is UTF-8 scanning performance.
Each iteration of the two scanning benchmarks was producing 20MB of JSON;
now they only produce 5MB.
Now:
```
== Encoding mostly utf8 (5001001 bytes)
ruby 3.4.0dev (2024-10-18T19:01:45Z master 7be9a333ca) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
json 35.000 i/100ms
oj 36.000 i/100ms
rapidjson 10.000 i/100ms
Calculating -------------------------------------
json 359.161 (± 1.4%) i/s (2.78 ms/i) - 1.820k in 5.068542s
oj 359.699 (± 0.6%) i/s (2.78 ms/i) - 1.800k in 5.004291s
rapidjson 99.687 (± 2.0%) i/s (10.03 ms/i) - 500.000 in 5.017321s
Comparison:
json: 359.2 i/s
oj: 359.7 i/s - same-ish: difference falls within error
rapidjson: 99.7 i/s - 3.60x slower
```
https://github.com/ruby/json/commit/1a338532d2
The purpose of this change is to take advantage of `fbuffer_append_char`,
which is faster than `fbuffer_append`.
`array_delim` was a buffer that concatenated a single comma with
`array_nl`. However, in the typical use case (`JSON.generate(data)`),
`array_nl` is empty. This means that `array_delim` was a
single-character buffer in many cases.
`fbuffer_append(buffer, array_delim)` used `memcpy` to copy one byte,
which is inefficient.
Rather, this change uses `fbuffer_append_char(buffer, ',')` and then
`fbuffer_append(buffer, array_nl)` only when `array_nl` is not NULL.
This speeds up `JSON.generate` by about 9% in a benchmark.
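The shape of that change can be sketched in standalone C. The `sk_` names below are hypothetical stand-ins for the `fbuffer_*` functions, the `array_nl` parameter is modelled as a plain pointer and length, and the buffer is fixed-capacity to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity buffer; the real FBuffer grows on demand. */
typedef struct {
    char *ptr;
    size_t len;
    size_t capa;
} sk_buf;

/* General-purpose append: pays for a memcpy call even for one byte. */
static void sk_append(sk_buf *b, const char *src, size_t n)
{
    assert(n <= b->capa - b->len);
    memcpy(b->ptr + b->len, src, n);
    b->len += n;
}

/* Single-byte append: a plain store, no memcpy. */
static void sk_append_char(sk_buf *b, char c)
{
    assert(b->len < b->capa);
    b->ptr[b->len++] = c;
}

/* Emit the separator between two array elements.
 * array_nl is NULL in the common case where no newline is configured. */
static void sk_array_delim(sk_buf *b, const char *array_nl, size_t array_nl_len)
{
    sk_append_char(b, ',');
    if (array_nl) sk_append(b, array_nl, array_nl_len);
}
```

In the default configuration the separator path reduces to a single byte store, which is the case the commit targets.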
https://github.com/ruby/json/commit/445de6e459
... instead of `generate_json`.
Since the object key is already confirmed to be a string, using a
generic dispatch function brings an unnecessary overhead.
This speeds up `JSON.generate` by about 3% in a benchmark.
https://github.com/ruby/json/commit/e125072130
Dispatching based on Ruby's VALUE structure is more efficient than
a simple cascade of "if ... else if ..." checks.
This speeds up `JSON.generate` by about 5% in a benchmark.
https://github.com/ruby/json/commit/4f9180debb
It is safe to use `RARRAY_AREF` here because no Ruby code is executed
between `RARRAY_LEN` and `RARRAY_AREF`.
This speeds up `JSON.generate` by about 4% in a benchmark.
https://github.com/ruby/json/commit/c5d80f9fd4
```
generator.c:69:27: warning: comparison of integers of different signs: 'short' and 'unsigned long' [-Wsign-compare]
for (i = 1; i < ch_len; i++) {
```
https://github.com/ruby/json/commit/ff8edcd47c
I, Luke T. Shumaker, am the sole author of the added code.
I did not reference CVTUTF when writing it. I did reference the
Unicode standard (15.0.0), the Wikipedia article on UTF-8, and the
Wikipedia article on UTF-16. When I saw some tests fail, I did
reference the old deleted code (but a JSON-specific part, inherently
not as based on CVTUTF) to determine that script_safe should also
escape U+2028 and U+2029.
I targeted simplicity and clarity when writing the code--it can likely
be optimized. In my mind, the obvious next optimization is to have it
combine contiguous non-escaped characters into just one call to
fbuffer_append(), instead of calling fbuffer_append() for each
character.
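The batching optimization described above could look roughly like the following. This is a sketch under simplifying assumptions, not the code from this commit: only `"` and `\` are treated as needing an escape, the `sk_` names are hypothetical stand-ins for `fbuffer_append()` and friends, and the `calls` counter exists only to make the saving visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity buffer that counts append calls. */
typedef struct {
    char *ptr;
    size_t len;
    size_t capa;
    unsigned calls;
} sk_buf;

static void sk_append(sk_buf *b, const char *src, size_t n)
{
    assert(n <= b->capa - b->len);
    memcpy(b->ptr + b->len, src, n);
    b->len += n;
    b->calls++;
}

/* Simplified rule for this sketch; a real generator also escapes control
 * characters and, in script_safe mode, U+2028 and U+2029. */
static int sk_needs_escape(unsigned char ch)
{
    return ch == '"' || ch == '\\';
}

/* Copy str into b, escaping as needed. Runs of characters that need no
 * escaping are flushed with one append call instead of one call each. */
static void sk_escape(sk_buf *b, const char *str, size_t len)
{
    size_t start = 0; /* first byte of the pending unescaped run */
    size_t pos;

    for (pos = 0; pos < len; pos++) {
        if (sk_needs_escape((unsigned char)str[pos])) {
            char esc[2];
            if (pos > start) sk_append(b, str + start, pos - start);
            esc[0] = '\\';
            esc[1] = str[pos];
            sk_append(b, esc, 2);
            start = pos + 1;
        }
    }
    if (len > start) sk_append(b, str + start, len - start);
}
```

For the 8-byte input `say "hi"` this makes 4 append calls (two runs, two escapes), where a per-character loop would make 8.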
Regarding the use of the "modern" types `uint32_t`, `uint16_t`, and
`bool`:
- ruby.h is guaranteed to give us uint32_t and uint16_t.
- Since Ruby 3.0.0, ruby.h is guaranteed to give us bool... but we
support down to Ruby 2.3. But, ruby.h is guaranteed to give us
HAVE_STDBOOL_H for the C99 stdbool.h; so use that to include
stdbool.h if we can, and if not then fall back to a copy of the
same bool definition that Ruby 3.0.5 uses with C89.
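The `bool` fallback pattern can be sketched as below. Two assumptions apply: this standalone example does not include ruby.h, so `HAVE_STDBOOL_H` has to be supplied by the build (for example `-DHAVE_STDBOOL_H`), and the fallback branch is a simplified stand-in, not the definition copied from Ruby 3.0.5. The helper `sk_is_line_separator` is a hypothetical name used only to exercise the type.

```c
/* In the extension, ruby.h provides HAVE_STDBOOL_H when the C99 header
 * exists. Here the macro must come from the compiler command line. */
#ifdef HAVE_STDBOOL_H
# include <stdbool.h>
#else
/* Simplified stand-in fallback for compilers without stdbool.h. */
# ifndef bool
#  define bool int
# endif
# ifndef true
#  define true 1
# endif
# ifndef false
#  define false 0
# endif
#endif

/* Example use: the two code points that script_safe also escapes. */
static bool sk_is_line_separator(unsigned long codepoint)
{
    return codepoint == 0x2028UL || codepoint == 0x2029UL;
}
```

Either branch gives the rest of the file the same `bool`, `true`, and `false` spellings, so the code using them does not need its own conditionals.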
https://github.com/ruby/json/commit/c96351f874
Rather than checking the class we can check the type.
This is very subtly different for String subclasses, but I think it's
OK.
We also save on checking the type again in the fast path.
https://github.com/flori/json/commit/772a0201ab
Given that we called `rb_enc_str_asciionly_p`, if the string encoding
isn't valid UTF-8, we can know it very cheaply by checking the
encoding and coderange that was just computed by Ruby, rather than
doing it ourselves.
Also, Ruby might have already computed that earlier.
https://github.com/flori/json/commit/4b04c469d5