Граф коммитов

661 Коммитов

Автор SHA1 Сообщение Дата
Martin Dürst 56d9d78f14 Remove Unicode 13.0.0 related files 2022-03-16 08:30:04 +09:00
Martin Dürst 45e0711f29 update Unicode Version to 14.0.0 and Emoji version to 14.0 2022-03-13 09:19:52 +09:00
Nobuyoshi Nakada db740b7e5c
Revert "enc/depend: fix out-of-src build with --with-static-linked-ext" (#5616)
This reverts commit 32ad8df9d1,
which broke out-of-src build with the pre-generated transcoder sources.
2022-03-02 18:19:01 +09:00
Yuta Saito 32ad8df9d1 enc/depend: fix out-of-src build with --with-static-linked-ext
When out-of-src build, at the beginning of a build, `make -f enc.mk
srcs` generates trans C sources under build dir.

On the other hand, enc/trans/*.o were built from trans C sources
generated under srcdir due to the following auto-generated rules from
enc/depend.

```
encsrcdir = ../src/enc
...
enc/trans/big5.$(OBJEXT): $(encsrcdir)/trans/big5.c
```

Therefore, trans C sources are generated twice under srcdir and build
dir during a build.

Ideally, trans C sources have always been built before compilation of
enc/trans/*.o because the source generation is prereq, so making
enc/trans/*.o doesn't trigger trans C source generation and shouldn't
require MINIRUBY as a make arg for enc.mk. However, the second trans C
source gen is unintentionally triggered by enc/trans/*.o, so `make -f
enc.mk libencs` requires MINIRUBY for now.

When no `--with-static-linked-ext`, `make -f enc.mk libencs` is
triggered from common.mk with MINIRUBY, so there is no problem.
But when `--with-static-linked-ext`, libencs should be statically-linked
to ruby, so `make -f enc.mk libencs` is triggered from exts.mk, and
exts.mk invokes it without MINIRUBY.

Therefore, when out-of-src build and with `--with-static-linked-ext`,
the second trans C source gen fails due to missing MINIRUBY.
This issue is deterministically reproducible without -j because
common.mk's `main` rule also has libencs prerequisite.

This patch supresses the second trans C source gen.
2022-03-02 09:40:58 +09:00
Peter Zhu 2d5ecd60a5 [Feature #18249] Update dependencies 2022-02-22 09:55:21 -05:00
Peter Zhu 638fd8774b [Feature #18249] Include ruby.h in extensions to have ABI version
All shared libraries must have `include/ruby/internal/abi.h` to include
the ABI version. Including `ruby.h` will guarantee that.
2022-02-22 09:55:21 -05:00
Nobuyoshi Nakada ac152b3cac
Update dependencies 2021-11-21 16:21:18 +09:00
卜部昌平 5c167a9778 ruby tool/update-deps --fix 2021-10-05 14:18:23 +09:00
Martin Dürst 6072239121 Remove no longer needed include files (Unicode Version 12.1.0) 2021-07-09 16:22:38 +09:00
Martin Dürst 323ff38c04 Add directory and include files for Unicode version 13.0.0
- Add directory enc/unicode/13.0.0
- Add include files casefold.h and name2ctype.h for Unicode
  version 13.0.0
2021-07-08 14:45:03 +09:00
Nobuyoshi Nakada 1ac228378c
Use $ignore_error defined in mkmf.rb 2021-07-03 12:52:46 +09:00
卜部昌平 6413dc27dc dependency updates 2021-04-13 14:30:21 +09:00
Nobuyoshi Nakada d57c5a7b61 transcode-tblgen.rb: make silent a little when just -v 2020-12-29 17:45:19 +09:00
Lars Kanis d403591b34
Add string encoding IBM720 alias CP720 (#3803)
The mapping table is generated from the ICU project:
  https://github.com/unicode-org/icu/blob/master/icu4c/source/data/mappings/ibm-720_P100-1997.ucm

Fixes bug 16233 : https://bugs.ruby-lang.org/issues/16233
2020-11-22 22:23:40 +09:00
卜部昌平 490010084e sed -i '/rmodule.h/d' 2020-08-27 16:42:06 +09:00
卜部昌平 756403d775 sed -i '/r_cast.h/d' 2020-08-27 15:03:36 +09:00
卜部昌平 0da2a3f1fc sed -i '\,2/extern.h,d' 2020-08-27 14:07:49 +09:00
Kazuhiro NISHIYAMA 946cd6c534
Use https instead of http 2020-07-28 19:51:54 +09:00
Jeremy Evans ddd9704ae9 Encode ' as ' when using encode(xml: :attr)
Fixes [Bug #16922]
2020-07-10 09:34:08 -07:00
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平 d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
Nobuyoshi Nakada b7e1eda932
Suppress warnings by gcc 10.1.0-RC-20200430
* Folding results should not be empty.

  If `OnigCodePointCount(to->n)` were 0, `for` loop using `fn`
  wouldn't execute and `ncs` elements are not initialized.

  ```
  enc/unicode.c:557:21: warning: 'ncs[0]' may be used uninitialized in this function [-Wmaybe-uninitialized]
    557 |  for (i = 0; i < ncs[0]; i++) {
        |                  ~~~^~~
  ```

* Cast to `enum yytokentype`

  Additional enums for scanner events by ripper are not included
  in `yytokentype`.

  ```
  ripper.y:7274:28: warning: implicit conversion from 'enum <anonymous>' to 'enum yytokentype' [-Wenum-conversion]
  ```
2020-05-04 12:28:24 +09:00
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
Kazuki Tsujimoto b25ef4bf70
Suppress warnings: reserved for numbered parameter 2020-04-05 18:24:59 +09:00
Nobuyoshi Nakada 21d0b40de2 Added tooldir variable 2020-04-05 09:26:57 +09:00
卜部昌平 115fec062c more on NULL versus functions.
Function pointers are not void*.  See also
ce4ea956d2
8427fca49b
2020-02-07 14:24:19 +09:00
卜部昌平 0c2d731ef2 update dependencies 2019-12-26 20:45:12 +09:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
Nobuyoshi Nakada 992aa2cda5
enc/x_emoji.h: fixed dead-links [ci skip]
English version pages seem no longer provided.
2019-12-24 10:33:32 +09:00
Nobuyoshi Nakada cc87037f1c
Fixed misspellings
Fixed misspellings reported at [Bug #16437], missed and a new
typo.
2019-12-22 22:49:17 +09:00
Nobuyoshi Nakada e1b2341488
Update dependencies 2019-11-18 23:16:22 +09:00
Nobuyoshi Nakada 162cf2879a
Init function is need to link statically 2019-08-10 01:41:50 +09:00
Nobuyoshi Nakada cecae8593a
Removed unnecessary headers 2019-08-10 01:05:09 +09:00
Nobuyoshi Nakada 88db6fa479
Use ENC_REPLICATE to copy an encoding 2019-08-10 01:04:39 +09:00
Yusuke Endoh a8ba22cd32 Revert "Removed unused includes"
This reverts commit c9eb8f82e9.

The change caused "implicit declaration" warning and actual segfault.

```
/tmp/ruby/v2/src/trunk-gc-asserts/enc/gb2312.c: In function ‘Init_gb2312’:
/tmp/ruby/v2/src/trunk-gc-asserts/enc/gb2312.c:6:31: warning: implicit declaration of function ‘rb_enc_find’ [-Wimplicit-function-declaration]
     rb_enc_register("GB2312", rb_enc_find("EUC-KR"));
                               ^~~~~~~~~~~
/tmp/ruby/v2/src/trunk-gc-asserts/enc/gb2312.c:6:31: warning: passing argument 2 of ‘rb_enc_register’ makes pointer from integer without a cast [-Wint-conversion]
<command-line>:0:19: note: expected ‘OnigEncoding {aka const struct OnigEncodingTypeST *}’ but argument is of type ‘int’
/tmp/ruby/v2/src/trunk-gc-asserts/regenc.h:231:12: note: in expansion of macro ‘ONIG_ENC_REGISTER’
 extern int ONIG_ENC_REGISTER(const char *, OnigEncoding);
            ^~~~~~~~~~~~~~~~~
```
2019-08-10 00:01:36 +09:00
Nobuyoshi Nakada c9eb8f82e9
Removed unused includes 2019-08-09 23:08:30 +09:00
Nobuyoshi Nakada 715955ff27
Include ruby/assert.h in ruby/ruby.h so that assertions can be there 2019-07-14 17:58:03 +09:00
Takashi Kokubun 18603e9046
Update dependencies for 369ff79394
Just copy-pasting diff from
https://travis-ci.org/ruby/ruby/jobs/558407687
2019-07-14 12:55:58 +09:00
Martin Dürst 369ff79394 add encoding conversion from/to CESU-8
Add encoding conversion (transcoding) from UTF-8 to CESU-8
and back. CESU-8 is an encoding similar to UTF-8, but encodes
codepoints above U+FFFF as two surrogates, these surrogates
again being encoded as if they were UTF-8 codepoints. This
preserves the same binary sorting order as in UTF-16. It is
also somewhat similar (although not exactly identical) to an
encoding used internally by Java.

This completes issue #15995.

enc/trans/cesu_8.trans: Add encoding conversion from/to CESU-8
test/ruby/test_transcode.rb: Add tests for above
2019-07-14 10:58:50 +09:00
Nobuyoshi Nakada d905ff61e6
Update dependencies 2019-07-09 13:47:07 +09:00
NARUSE, Yui 4275f09015 remove UNREACHABLE 2019-06-24 16:01:46 +09:00
NARUSE, Yui 7f64a0b4db Add new encoding CESU-8 [Feature #15931] 2019-06-24 12:58:33 +09:00
duerst 50eea44a27 remove Unicode 12.0.0 related directory and generated files
This completes issue #15195.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67453 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-05 23:52:15 +00:00
duerst 7fe64d17d3 update to Unicode Version 12.1.0 (beta)
Unicode Version 12.1.0 adds one single character, U+32FF SQUARE ERA NAME REIWA,
for the new Japanese era starting on May 1st. 12.1.0 will be finalized only on
May 7th, so we go with the beta version because further changes in the data we
need are highly unlikely, and we want to make sure Ruby is ready for the new era.

* common.mk: change UNICODE_VERSION to 12.1.0, UNICODE_BETA to YES

* enc/unicode/12.1.0, enc/unicode/12.1.0/casefold.h, enc/unicode/12.1.0/name2ctype.h:
  add directory and generated data files for new version

* lib/unicode_normalize/tables.rb: update for new character

* test/ruby/test_regexp.rb: add test for character property age=12.1

* test/test_unicode_normalize.rb: add test for NFKC decomposition of new character

This (mostly) completes issue #15195.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67441 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-05 00:58:51 +00:00
duerst f831ca6764 delete directory and files related to Unicode version 11.0.0
this completes and closes feature #15321

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67174 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-06 03:19:10 +00:00
duerst cff7eefa07 update Unicode version (and Emoji version) to 12.0.0
- common.mk: set UNICODE_VERSION and UNICODE_EMOJI_VERSION to 12.0.0

- lib/unicode_normalize/tables.rb: update table data to Unicode version 12.0.0

- enc/unicode/12.0.0/casefold.h, enc/unicode/12.0.0/name2ctype.h: add generated
  files for Unicode version 12.0.0

This is the main commit for #15321.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67169 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-03-06 01:55:19 +00:00
nobu 4fc656a2f3 Removed duplicate dependents
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-02-14 05:40:08 +00:00
nobu b2f4241509 Update dependencies, internal.h includes ruby.h
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67057 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-02-12 12:31:55 +00:00
duerst 3628eae2e7 implement special behavior for Georgian for String#capitalize
The modern Georgian script is special in that it has an 'uppercase'
variant called MTAVRULI which can be used for emphasis of whole words,
for screamy headlines, and so on. However, in contrast to all other
bicameral scripts, there is no usage of capitalizing the first letter
in a word or a sentence. Words with mixed capitalization are not used
at all.

We therefore implement special behavior for String#capitalize. Formally,
we define String#capitalize as first applying String#downcase for the
whole string, then using titlecase on the first letter. Because Georgian
defines titlecase as the identity function both for MTAVRULI ('uppercase')
and Mkhedruli (lowercase), this results in String#capitalize being
equivalent to String#downcase for Georgian. This avoids undesirable
mixed case.

* enc/unicode.c: Actual implementation

* string.c: Add mention of this special case for documentation

* test/ruby/enc/test_case_mapping.rb: Add two tests, a general one
  that uses String#capitalize on some (including nonsensical)
  combinations of MTAVRULI and Mkhedruli, and a canary test to
  detect the potential assignment of characters to the currently
  open slots (holes) at U+1CBB and U+1CBC.

* test/ruby/enc/test_case_comprehensive.rb: Tweak generation of
  expectation data.

Together with r65933, this closes issue #14839.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-09 23:14:29 +00:00
duerst c2d8078e3d delete Unicode 10.0.0 related files, no longer needed [#14802]
This line, and those below, will be ignored--

D    enc/unicode/10.0.0


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-09 02:02:45 +00:00