Since universal-parser and prism support, prelude code used functions
inaccessible from outside libruby shared library.
```
linking goruby
/usr/bin/ld: goruby.o: in function `prelude_eval':
/home/runner/work/ruby/ruby/build/golf_prelude.c:221: undefined reference to `rb_ruby_prism_ptr'
/usr/bin/ld: goruby.o: in function `pm_prelude_load':
/home/runner/work/ruby/ruby/build/golf_prelude.c:192: undefined reference to `pm_options_line_set'
/usr/bin/ld: /home/runner/work/ruby/ruby/build/golf_prelude.c:193: undefined reference to `pm_parse_string'
/usr/bin/ld: goruby.o: in function `prelude_eval':
/home/runner/work/ruby/ruby/build/golf_prelude.c:224: undefined reference to `pm_iseq_new_with_opt'
/usr/bin/ld: /home/runner/work/ruby/ruby/build/golf_prelude.c:226: undefined reference to `pm_parse_result_free'
/usr/bin/ld: goruby.o: in function `prelude_ast_value':
/home/runner/work/ruby/ruby/build/golf_prelude.c:181: undefined reference to `rb_ruby_ast_data_get'
/usr/bin/ld: goruby.o: in function `prelude_eval':
/home/runner/work/ruby/ruby/build/golf_prelude.c:231: undefined reference to `rb_ruby_ast_data_get'
/usr/bin/ld: goruby.o: in function `pm_prelude_load':
/home/runner/work/ruby/ruby/build/golf_prelude.c:196: undefined reference to `pm_parse_result_free'
collect2: error: ld returned 1 exit status
```
This commit creates a new directory `gc` to put different GC
implementations and moves the default GC from gc_impl.c to gc/gc_impl.c.
The default GC can be easily switched using the `BUILTIN_GC` variable
in Makefile.in.
The dtrace python script from systemtap on Linux actually looks at the
CFLAGS environment variable when invoking gcc to make the probes.o file.
If we don't pass the CFLAGS we're using, this probes.o file can wind up
without the required annotations indicating that it supports e.g. Intel
CET.
Fix this by explicitly exporting our build flags to the environment for
this script.
[Bug #18061]
We already assemble our assembly files using the $(CC) compiler driver,
rather than the actual $(AS) assembler. This means that
* The C preprocessor gets run on the assembly file
* It's valid to pass gcc-style flags to it, like e.g.
-mbranch-protection or -fcf-protection
* If you do so, the relevant preprocessor macros like __CET__ get set
* If you really wanted to pass assembler flags, you would need to do
that using -Wa,... anyway
So I think it makes sense to pass "$(XCFLAGS) $(CFLAGS) $(CPPFLAGS)" to
gcc/clang/etc when assembling, rather than passing $(ASFLAGS) (since
the flags are not actually passed to `as`, but `cc`!).
The side effect of this is that if there are mitigation flags like
-fcf-protection in $CFLAGS, then the relevant macros like __CET__ will
be defined when assembling the files.
[Bug #20601]
This changes the automatic detection of -fstack-protector,
-D_FORTIFY_SOURCE, and -mbranch-protection to write to $hardenflags
instead of $XCFLAGS. The definition of $cflags is changed to
"$hardenflags $orig_cflags $optflags $debugflags $warnflags" to match.
Furthermore, these flags are _prepended_ to $hardenflags, rather than
appended.
The implications of doing this are as follows:
* If a CRuby builder specifies cflags="-mbranch-protection=foobar" at
the ./configure script, and the configure script detects that
-mbranch-protection=pac-ret is accepted, then GCC will be invoked as
"gcc -mbranch-protection=pac-ret -mbranch-protection=foobar". Since
the last flags take precedence, that means that user-supplied values
of these flags in $cflags will take priority.
* Likewise, if a CRuby builder explicitly specifies
"hardenflags=-mbranch-protection=foobar", because we _prepend_ to
$hardenflags in our autoconf script, we will still invoke GCC as
"gcc -mbranch-protection=pac-ret -mbranch-protection=foobar".
* If a CRuby builder specifies CFLAGS="..." at the configure line,
automatic detection of hardening flags is ignored as before.
* C extensions will _also_ be built with hardening flags now as well
(this was not the case by default before because the detected flags
went into $XCFLAGS).
Additionally, as part of this work, I changed how the detection of
PAC/BTI in Context.S works. Rather than appending the autodetected
option to ASFLAGS, we simply compile a set of test programs with the
actual CFLAGS in use to determine what PAC/BTI settings were actually
chosen by the builder. Context.S is made aware of these choices through
some custom macros.
The result of this work is that:
* Ruby will continue to choose some sensible defaults for hardening
options for the C compiler
* Distributors are able to specify CFLAGS that are consistent with their
distribution and override these defaults
* Context.S will react to whatever -mbranch-protection is actually in
use, not what was autodetected
* Extensions get built with hardening flags too.
[Bug #20154]
[Bug #20520]
Since #10209 we've been noticing that on macos after running `make
clean` the `coroutine/arm64/Context.S` file is missing, causing
subsequent make calls to fail because `Context.S` is needed to build
`Context.o`.
The reason this is happening is because macos is case-insensitive so the
`.s` looks for `coroutine/arm64/Context.s` and finds
`coroutine/arm64/Context.s`. This does not happen on linux because the
filesystem is case sensitive.
I attempted to use `find` because it is case sensitive regardless of
filesystem, but it was a lot slower than `rm` since we can't pass
multiple file names the same way to `find`.
Reverting this small part of #10209 fixes the issue for macos and it
wasn't clear that those changes were strictly necessary for the rest of
the PR.
We changed the original code to use `rm` instead of `delete` because it
is not standarized on POSIX.
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object.
## Background
We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby.
To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.
## Summary (file by file)
- `rubyparser.h`
- Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
- Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` `in ast_alloc()` so that GC can manage it
- You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
- Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
- rb_ruby_ast_new() which internally `calls ast_alloc()` is to create VALUE vast outside ruby_parser.c
- `iseq.c` and `vm_core.h`
- Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
- This keeps the VALUE of AST on the machine stack to prevent being removed by GC
- `ast.c`
- Almost all change is replacement `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
- Fix `node_memsize()`
- Now it includes `rb_ast_local_table_link`, `tokens` and script_lines
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
- Follow-up due to the above changes
- `imemo.{c|h}`
- If an object with `imemo_ast` appears, considers it a bug
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
This patch is part of universal parser work.
## Summary
- Decouple VALUE from members below:
- `(struct parser_params *)->debug_lines`
- `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
- They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
- In order to do this,
- Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
- Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`
## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
- Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
- After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called
- With regard to this, please see *Future tasks* below
## Future tasks
- Decouple IMEMO from `rb_ast_t *`
- This lifts the five-members-restriction of Ruby object,
- So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
- Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
- Simplify globbed file names.
- Prefer `File.open` over `Kernel#open`.
- Swallow initializer blocks instead of line by line with flip-flop.
- Re-structure converter list.
When multiple libraries exist in a subdirectory under `ext`, `rmdir
-p` may fail, because other directories have not been removed yet or
the parent directory has been removed by other `distclean`. `rmdir`
in GNU coreutils has `--ignore-fail-on-non-empty` option for the
former case but others may not, and the latter race condition is not
avoidable anyway.
By replacing `ALLOBJS` suffix with intermediate file suffixes instead
of roughly removing by wildcards. Made `cleanlibs` append `.dSYM`
suffix for each word in `TARGET_SO`, not the end of the entire list.
The extension library has each initialization function named "Init_" +
basename. If multiple extensions have the same base name (such as
cgi/escape and erb/escape), the same function will be registered for
both names.
To fix this conflict, rename the initialization functions under sub
directories using using parent names, when statically linking.
During the build, Ruby has special logic to serialize its own builtin
module to disk using the binary iseq format during the build (I assume
for speed so it doesn't have to parse builtin every time it starts
up).
However, since iseq format is architecture-specific, when building on
x86_64 for universal x86_64 + arm64, the serialized builtin module is
written with the x86_64 architecture of the build machine, which fails
this check whenever ruby imports the builtin module on arm64:
1fdaa06660/compile.c (L13243)
Thankfully, there's logic to disable this feature for cross-compiled builds:
1fdaa06660/builtin.c (L6)
This disables the iseq logic for universal builds as well.
Fixes [Bug #18286]
On some platforms, such as FreeBSD and Oracle Linux, symbols defined
in the crt0 setup routine are exported from shared libraries. So
ignore the symbols that would be exported even in an empty shared
library.