Co-authored-by: Nobuyoshi Nakada [nobu@ruby-lang.org](mailto:nobu@ruby-lang.org)
See https://github.com/ruby/ruby/pull/6451 and
https://bugs.ruby-lang.org/issues/19007.
This keeps the Unicode version at 14.0.0, so this commit
is suited for backporting where applicable.
At the time of this commit, the reason for the wrong properties
which we fix here is still not completely known, so issue 19007
should be kept open.
The reason why this was commented out was because of gperf 3.0 vs 3.1
differences (see [Feature #13883]). Five years passed, I am pretty
confident that we can drop support of old versions here.
Ditto for uniname2ctype_p(), onig_jis_property(), and zonetab().
Previously, newline: :lf was accepted but ignored. Where it
should have been used was commented out code that didn't work,
but unlike all other invalid values, using newline: :lf did
not raise an error.
This adds support for newline: :lf and :lf_newline, for consistency
with newline: :cr and :cr_newline. This is basically the same as
universal_newline, except that it only affects writing and not
reading due to RUBY_ECONV_NEWLINE_DECORATOR_WRITE_MASK.
Add tests for the File.open :newline option while here.
Fixes [Bug #12436]
Adding `ruby` to `PREP` causes the following circular dependencies
because `PREP` is used as a prerequisite by some targets required to
build `ruby` target itself.
```
make: Circular .rbconfig.time <- ruby dependency dropped.
make: Circular builtin_binary.inc <- ruby dependency dropped.
make: Circular ext/extinit.c <- ruby dependency dropped.
make: Circular ruby <- ruby dependency dropped.
```
Adding a new Make variable like `EXTPREP` only for exts may be also
reasonable, but it would introduce another complexity into our build
system. `-bundle_loader` doesn't care that link-time and run-time
loader executables are different as long as bound symbols are provided,
so it's ok to resolve from miniruby to simplify our build.
ld64 shipped with Xcode 14 emits a warning when using `-undefined
dynamic_lookup`.
```
ld: warning: -undefined dynamic_lookup may not work with chained fixups
```
Actually, `-undefined dynamic_lookup` doesn't work when:
1. Link a *shared library* with the option
2. Link it with a program that uses the chained-fixup introduced from
macOS 12 and iOS 15
because `-undefined dynamic_lookup` uses lazy-bindings and they won't be
bound while dyld fixes-up by traversing chained-fixup info.
However, we build exts as *bundles* and they are loaded only through
`dlopen`, so it's safe to use `-undefined dynamic_lookup` in theory.
So the warning produced by ld64 is false-positive, and it results
failure of option checking in configuration. Therefore, it would be an
option to ignore the warning during our configuration.
On the other hand, `-undefined dynamic_lookup` is already deprecated on
all darwin platforms except for macOS, so it's good time to get rid of
the option. ld64 also provides `-bundle_loader <executable>` option,
which allows to resolve symbols defined in the executable symtab while
linking. It behaves almost the same with `-undefined dynamic_lookup`,
but it makes the following changes:
1. Require that unresolved symbols among input objects must be defined
in the executable.
2. Lazy symbol binding will lookup only the symtab of the bundle loader
executable. (`-undefined dynamic_lookup` lookups all symtab as flat
namespace)
This patch adds `-bundle_loader $(RUBY)` when non-EXTSTATIC
configuration by assuming ruby executable can be linked before building
exts.
See "New Features" subsection under "Linking" section for chained fixup
https://developer.apple.com/documentation/xcode-release-notes/xcode-13-release-notes
When out-of-src build, at the beginning of a build, `make -f enc.mk
srcs` generates trans C sources under build dir.
On the other hand, enc/trans/*.o were built from trans C sources
generated under srcdir due to the following auto-generated rules from
enc/depend.
```
encsrcdir = ../src/enc
...
enc/trans/big5.$(OBJEXT): $(encsrcdir)/trans/big5.c
```
Therefore, trans C sources are generated twice under srcdir and build
dir during a build.
Ideally, trans C sources have always been built before compilation of
enc/trans/*.o because the source generation is prereq, so making
enc/trans/*.o doesn't trigger trans C source generation and shouldn't
require MINIRUBY as a make arg for enc.mk. However, the second trans C
source gen is unintentionally triggered by enc/trans/*.o, so `make -f
enc.mk libencs` requires MINIRUBY for now.
When no `--with-static-linked-ext`, `make -f enc.mk libencs` is
triggered from common.mk with MINIRUBY, so there is no problem.
But when `--with-static-linked-ext`, libencs should be statically-linked
to ruby, so `make -f enc.mk libencs` is triggered from exts.mk, and
exts.mk invokes it without MINIRUBY.
Therefore, when out-of-src build and with `--with-static-linked-ext`,
the second trans C source gen fails due to missing MINIRUBY.
This issue is deterministically reproducible without -j because
common.mk's `main` rule also has libencs prerequisite.
This patch supresses the second trans C source gen.
* Folding results should not be empty.
If `OnigCodePointCount(to->n)` were 0, `for` loop using `fn`
wouldn't execute and `ncs` elements are not initialized.
```
enc/unicode.c:557:21: warning: 'ncs[0]' may be used uninitialized in this function [-Wmaybe-uninitialized]
557 | for (i = 0; i < ncs[0]; i++) {
| ~~~^~~
```
* Cast to `enum yytokentype`
Additional enums for scanner events by ripper are not included
in `yytokentype`.
```
ripper.y:7274:28: warning: implicit conversion from 'enum <anonymous>' to 'enum yytokentype' [-Wenum-conversion]
```
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead. This would significantly
speed up incremental builds.
We take the following inclusion order in this changeset:
1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very
first thing among everything).
2. RUBY_EXTCONF_H if any.
3. Standard C headers, sorted alphabetically.
4. Other system headers, maybe guarded by #ifdef
5. Everything else, sorted alphabetically.
Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
This reverts commit c9eb8f82e9.
The change caused "implicit declaration" warning and actual segfault.
```
/tmp/ruby/v2/src/trunk-gc-asserts/enc/gb2312.c: In function ‘Init_gb2312’:
/tmp/ruby/v2/src/trunk-gc-asserts/enc/gb2312.c:6:31: warning: implicit declaration of function ‘rb_enc_find’ [-Wimplicit-function-declaration]
rb_enc_register("GB2312", rb_enc_find("EUC-KR"));
^~~~~~~~~~~
/tmp/ruby/v2/src/trunk-gc-asserts/enc/gb2312.c:6:31: warning: passing argument 2 of ‘rb_enc_register’ makes pointer from integer without a cast [-Wint-conversion]
<command-line>:0:19: note: expected ‘OnigEncoding {aka const struct OnigEncodingTypeST *}’ but argument is of type ‘int’
/tmp/ruby/v2/src/trunk-gc-asserts/regenc.h:231:12: note: in expansion of macro ‘ONIG_ENC_REGISTER’
extern int ONIG_ENC_REGISTER(const char *, OnigEncoding);
^~~~~~~~~~~~~~~~~
```
Add encoding conversion (transcoding) from UTF-8 to CESU-8
and back. CESU-8 is an encoding similar to UTF-8, but encodes
codepoints above U+FFFF as two surrogates, these surrogates
again being encoded as if they were UTF-8 codepoints. This
preserves the same binary sorting order as in UTF-16. It is
also somewhat similar (although not exactly identical) to an
encoding used internally by Java.
This completes issue #15995.
enc/trans/cesu_8.trans: Add encoding conversion from/to CESU-8
test/ruby/test_transcode.rb: Add tests for above