The granularity is calculated as `10 ** ndigits.abs` rather than
`ndigits.abs * 10`. For example, if `ndigits` is `-2`, it should be
`10 ** 2 == 100` rather than `2 * 10 == 20`.
Speeds up ChunkyPNG.
The interpreter is about 70% faster:
```
before: ruby 3.4.0dev (2024-07-03T15:16:17Z master 786cf9db48) [arm64-darwin23]
after: ruby 3.4.0dev (2024-07-03T15:32:25Z ruby-downto 0b8b744ce2) [arm64-darwin23]
---------- ----------- ---------- ---------- ---------- ------------- ------------
bench before (ms) stddev (%) after (ms) stddev (%) after 1st itr before/after
chunky-png 892.2 0.1 526.3 1.0 1.65 1.70
---------- ----------- ---------- ---------- ---------- ------------- ------------
Legend:
- after 1st itr: ratio of before/after time for the first benchmarking iteration.
- before/after: ratio of before/after time. Higher is better for after. Above 1 represents a speedup.
```
YJIT is 2.5x faster:
```
before: ruby 3.4.0dev (2024-07-03T15:16:17Z master 786cf9db48) +YJIT [arm64-darwin23]
after: ruby 3.4.0dev (2024-07-03T15:32:25Z ruby-downto 0b8b744ce2) +YJIT [arm64-darwin23]
---------- ----------- ---------- ---------- ---------- ------------- ------------
bench before (ms) stddev (%) after (ms) stddev (%) after 1st itr before/after
chunky-png 709.4 0.1 278.8 0.3 2.35 2.54
---------- ----------- ---------- ---------- ---------- ------------- ------------
Legend:
- after 1st itr: ratio of before/after time for the first benchmarking iteration.
- before/after: ratio of before/after time. Higher is better for after. Above 1 represents a speedup.
```
If a GC runs right during creating a rb_fix_to_s_static, it may cause
the previous ones to become swept by the GC because they have not been
registered by rb_vm_register_global_object.
This `st_table` is used to both mark and pin classes
defined from the C API. But `vm->mark_object_ary` already
does both much more efficiently.
Currently a Ruby process starts with 252 rooted classes,
which uses `7224B` in an `st_table` or `2016B` in an `RArray`.
So a baseline of 5kB saved, but since `mark_object_ary` is
preallocated with `1024` slots but only use `405` of them,
it's a net `7kB` save.
`vm->mark_object_ary` is also being refactored.
Prior to this changes, `mark_object_ary` was a regular `RArray`, but
since this allows for references to be moved, it was marked a second
time from `rb_vm_mark()` to pin these objects.
This has the detrimental effect of marking these references on every
minors even though it's a mostly append only list.
But using a custom TypedData we can save from having to mark
all the references on minor GC runs.
Addtionally, immediate values are now ignored and not appended
to `vm->mark_object_ary` as it's just wasted space.
assert does not print the bug report, only the file and line number of
the assertion that failed. RUBY_ASSERT prints the full bug report, which
makes it much easier to debug.
In ae0ceafb0c0d05cc80623b525070255e3abb34ef ruby_dtoa was switched to
use malloc instead of xmalloc, which means that consumers should be
using free instead of xfree. Otherwise we will artificially shrink
oldmalloc_increase_bytes.
Method references is not only able to be marked up as code, also
reflects `--show-hash` option.
The bug that prevented the old rdoc from correctly parsing these
methods was fixed last month.