addr2line.c: fix DW_FORM_ref_addr parsing for DWARF 2
This fixes a crash when retrieving backtrace info with YJIT enabled on
macOS with Rust 1.71.0. Since Rust 1.71.0, the DWARF info generated by
the Rust compiler uses DW_FORM_ref_addr instead of DW_FORM_ref4 for
pointers to other DIEs.
DW_FORM_ref_addr representation in DWARF 2 is different from DWARF 3+,
so we need to handle it separately.
This patch fixes the parsing of DW_FORM_ref_addr for DWARF 2, which is
the default DWARF version Rustc uses on macOS.
See the DWARF 2.0.0 spec, section 7.5.4 Attribute Encodings
https://dwarfstd.org/doc/dwarf-2.0.0.pdfhttps://bugs.ruby-lang.org/issues/19789
DW_FORM_GNU_ref_alt and DW_FORM_GNU_strp_alt refer to data stored in an
external ELF file specified by a .gnu_debugaltlink attribute. These
attributes are generated by dwz(1), which extracts DWARF data common
amongst several files and stores it in a single, new file. It leaves
behind these two forms in the original file to point at the new, common
data.
We don't support actually reading the .gnu_debugaltlink file in
addr2line.c (and maybe we don't really need to), but we do need to know
how to read the actual value of these forms so we can skip over the
right number of bytes and not lose track of where we are in the CU.
While trying to fix YJIT's symbol hygiene issue over at GH-7115, I found
that addr2line.c's DWARF 5 parsing is half-disabled when building with
GCC. Rust's output contains some DW_AT_rnglists_base records, which the
disabled code reads. Without DW_AT_rnglists_base, it crashes when
generating a backtrace.
In common Ruby build configurations, GCC opts to only use
DW_FORM_sec_offset for the range lists, and so it doesn't generate
DW_AT_rnglists_base records, so consuming GCC's DWARF 5 while building
with GCC was not a problem.
However, even when building with GCC, we might need to parse DWARF 5
generated by other compilers at runtime. They could come from C
extensions built by Clang, or come from Rust extensions. This
can happen even when building without YJIT.
We need to manually strip pointer authentication bits on M1 mac because
libunwind leaks them out.
Co-Authored-By: NARUSE, Yui <naruse@airemix.jp>
Co-Authored-By: Yuta Saito <kateinoigakukun@gmail.com>
Background: GCC 12 generates DWARF 5 with .debug_rnglists, while rustc
generates DWARF 4 with .debug_ranges.
The previous logic always used .debug_rnglists if there is the section.
However, we need to refer .debug_ranges for DWARF 4.
This change keeps DWARF version of the current compilation unit and use
a proper section depending on the version.
Currently, addr2line.c supports only one path format of debuglink:
"/usr/lib/debug/usr/bin/ruby.debug".
However, recent debian packages seem to use another format by build_id:
"/usr/lib/debug/.build-id/ab/cdef1234.debug".
5d1bb29841/dh_strip (L292)5d1bb29841/dh_strip (L353)
This changeset makes ruby backtrace support the second format.
When ruby is compiled by GCC 8 or later, some frames of C level
backtrace information lacks.
```
$ ./miniruby -e '1.times { Process.kill(:SEGV, $$) }'
...
-- C level backtrace information
-------------------------------------------
/home/mame/work/ruby-gcc-9/miniruby(rb_vm_bugreport+0x611) [0x558a5fdcbc21] ../ruby/vm_dump.c:758
[0x558a5fbc789a]
/home/mame/work/ruby-gcc-9/miniruby(sigsegv+0x4d) [0x558a5fd1eaed] ../ruby/signal.c:959
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x7f687e6713c0]
/lib/x86_64-linux-gnu/libc.so.6(kill+0xb) [0x7f687e31355b] ../sysdeps/unix/syscall-template.S:78
/home/mame/work/ruby-gcc-9/miniruby(rb_f_kill+0x350) [0x558a5fd1fe60] ../ruby/signal.c:480
[0x558a5fda50d3]
[0x558a5fdb085c]
[0x558a5fdb0fe7]
[0x558a5fdbae1a]
[0x558a5fdaf484]
/home/mame/work/ruby-gcc-9/miniruby(rb_yield_1+0x29f) [0x558a5fdb2fbf] ../ruby/vm.c:1265
/home/mame/work/ruby-gcc-9/miniruby(int_dotimes+0x5c) [0x558a5fc72f2c] ../ruby/numeric.c:5198
[0x558a5fda50d3]
[0x558a5fdb085c]
[0x558a5fdb0fe7]
[0x558a5fdbaf21]
[0x558a5fdaf484]
/home/mame/work/ruby-gcc-9/miniruby(rb_ec_exec_node+0xed) [0x558a5fbcc4fd] ../ruby/eval.c:317
/home/mame/work/ruby-gcc-9/miniruby(ruby_run_node+0x4f) [0x558a5fbd110f] ../ruby/eval.c:375
/home/mame/work/ruby-gcc-9/miniruby(main+0x73) [0x558a5fb2c083] ../ruby/main.c:50
```
By this one-line change, it shows all locations.
```
$ ./miniruby -e '1.times { Process.kill(:SEGV, $$) }'
...
-- C level backtrace information -------------------------------------------
/home/mame/work/ruby-gcc-9/miniruby(rb_print_backtrace+0x11) [0x558247adec21] ../ruby/vm_dump.c:758
/home/mame/work/ruby-gcc-9/miniruby(rb_vm_bugreport) ../ruby/vm_dump.c:956
/home/mame/work/ruby-gcc-9/miniruby(rb_bug_for_fatal_signal+0x15a) [0x5582478da89a] ../ruby/error.c:773
/home/mame/work/ruby-gcc-9/miniruby(sigsegv+0x4d) [0x558247a31aed] ../ruby/signal.c:959
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x7f82202f73c0]
/lib/x86_64-linux-gnu/libc.so.6(kill+0xb) [0x7f821ff9955b] ../sysdeps/unix/syscall-template.S:78
/home/mame/work/ruby-gcc-9/miniruby(rb_f_kill+0x350) [0x558247a32e60] ../ruby/signal.c:480
/home/mame/work/ruby-gcc-9/miniruby(vm_call_cfunc_with_frame+0x123) [0x558247ab80d3] ../ruby/vm_insnhelper.c:2821
/home/mame/work/ruby-gcc-9/miniruby(vm_call_method_each_type+0x7c) [0x558247ac385c] ../ruby/vm_insnhelper.c:3324
/home/mame/work/ruby-gcc-9/miniruby(vm_call_method+0xc7) [0x558247ac3fe7] ../ruby/vm_insnhelper.c:3428
/home/mame/work/ruby-gcc-9/miniruby(vm_sendish+0x14) [0x558247acde1a] ../ruby/vm_insnhelper.c:4412
/home/mame/work/ruby-gcc-9/miniruby(vm_exec_core) ../ruby/insns.def:789
/home/mame/work/ruby-gcc-9/miniruby(rb_vm_exec+0x1a4) [0x558247ac2484] ../ruby/vm.c:2165
/home/mame/work/ruby-gcc-9/miniruby(rb_yield_1+0x29f) [0x558247ac5fbf] ../ruby/vm.c:1265
/home/mame/work/ruby-gcc-9/miniruby(int_dotimes+0x5c) [0x558247985f2c] ../ruby/numeric.c:5198
/home/mame/work/ruby-gcc-9/miniruby(vm_call_cfunc_with_frame+0x123) [0x558247ab80d3] ../ruby/vm_insnhelper.c:2821
/home/mame/work/ruby-gcc-9/miniruby(vm_call_method_each_type+0x7c) [0x558247ac385c] ../ruby/vm_insnhelper.c:3324
/home/mame/work/ruby-gcc-9/miniruby(vm_call_method+0xc7) [0x558247ac3fe7] ../ruby/vm_insnhelper.c:3428
/home/mame/work/ruby-gcc-9/miniruby(vm_sendish+0x14) [0x558247acdf21] ../ruby/vm_insnhelper.c:4412
/home/mame/work/ruby-gcc-9/miniruby(vm_exec_core) ../ruby/insns.def:770
/home/mame/work/ruby-gcc-9/miniruby(rb_vm_exec+0x1a4) [0x558247ac2484] ../ruby/vm.c:2165
/home/mame/work/ruby-gcc-9/miniruby(rb_ec_exec_node+0xed) [0x5582478df4fd] ../ruby/eval.c:317
/home/mame/work/ruby-gcc-9/miniruby(ruby_run_node+0x4f) [0x5582478e410f] ../ruby/eval.c:375
/home/mame/work/ruby-gcc-9/miniruby(main+0x73) [0x55824783f083] ../ruby/main.c:50
```
Details:
In short, it is an uninitialized variable bug.
Until GCC 7, all function locations are represented by a pair of
DW_AT_low_pc and DW_AT_high_pc in DWARF information.
But since GCC 8, some functions are split to multiple chunks, which are
represented by DW_AT_ranges.
DW_AT_ranges are represented as offsets from a base address.
According to DWARF specification, it is the base address of the
compilation unit, but GCC seems to use zero as default.
The function "di_read_cu" in addr2line.c had a comment about the fact.
However, the base address wasn't initialized as zero.
Noticed that internal/stdbool.h and addr2line.c are the only two place
where missing/stdbool.h is included. Why not delete the file so that
we can merge internal/stdbool.h and missing/stdbool.h into one.
to suppress Coverity Scan warning.
This expression converted uint8_t to int, and then int to unsigned long.
Now it directly converts uint8_t to unsigned long.