* enc/unicode/case-folding.rb (CaseFolding#load): read in binary
mode to deal with non-ASCII charater in CaseFolding.txt.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55496 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* common.mk (lib/unicode_normalize/tables.rb): should not depend
on Unicode data files unless ALWAYS_UPDATE_UNICODE=yes, to get
rid of downloading Unicode data unnecessary. [ruby-dev:49681]
* common.mk (enc/unicode/casefold.h): update Unicode files in a
sub-make, not to let the header depend on the files always.
* enc/unicode/case-folding.rb: if gperf is not usable, assume the
existing file is OK.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55492 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/iso_8859.h (SHARP_s): name frequently used codepoint.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55375 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
case mapping methods.
* enc/unicode.c: Check for invalid string and signal with negative
length value.
* test/ruby/enc/test_case_mapping.rb: Add tests for above.
* test/ruby/test_m17n_comb.rb: Add a message to clarify test failure.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55253 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
swapcase functionality for titlecase characters. Swapcase isn't defined
by Unicode, because the purpose/usage of swapcase is unclear anyway.
The implementation follows a proposal from Nobu, swaping the case of
each component of a titlecase character individually.
This means that the titlecase characters have to be decomposed.
* enc/unicode.c: Code using the above data.
* test/ruby/enc/test_case_mapping.rb: Tests for the above.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
special cases in CaseUnfold_11_Table.
* enc/unicode.c: Adjustments for above.
* test/ruby/enc/test_case_mapping.rb: Tests for the above: Some tests in
test_titlecase activated; test_greek added. A test in test_cherokee fixed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/unicode/case-folding.rb, casefold.h: Using above flag in data.
* enc/unicode.c: Marking capitalized character as unmodified if it is
already titlecase.
* test/ruby/enc/test_case_mapping.rb: Tests for above functionality.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54229 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* test/ruby/enc/test_case_mapping.rb: Test cases that detected
the above bugs.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54140 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
and macros to work with unified CaseMappingSpecials array.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54101 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
case mapping data not available from case folding by unifying all
three cases (special title, special upper, special lower).
* enc/unicode.c: Adjust macro names for above (macros are currently inactive).
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54085 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
space for titlecase indices; adding additional macros to add or
extract titlecase index; adding comments for better documentation.
* enc/unicode.c: Moving some macros to include/ruby/oniguruma.h;
activating use of titlecase indices.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
data (new table, with indices from other tables).
* enc/unicode.c: Ignoring titlecase data indices for the moment.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53906 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rather than all) of target in CaseUnfold_11 array.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53843 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
single-letter; use flags in casefold.h for logic.
* enc/unicode/case-folding.rb: Added flag for case folding.
Changed parameter passing.
* enc/unicode/casefold.h: New flags added.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53775 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/unicode.c: Added shortening macros for enc/unicode/casefold.h
* enc/unicode/case-folding.rb: Fixed file encoding for CaseFolding.txt
to ASCII-8BIT (should fix some ci errors). Clarified usage. Created
class MapItem. Partially implemented class CaseMapping.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53767 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
to pass as parameters; not yet implemented or used.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
String#downcase :fold.
* enc/unicode.c: Fixed a range error (lowest non-ASCII character affected
by case operations is U+00B5, MICRO SIGN)
* test/ruby/enc/test_case_mapping.rb: Explicit test for case folding of
MICRO SIGN to Greek mu.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53749 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
option for String#downcase by using case folding data from
regular expression engine, and added a few simple tests.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53747 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/depend: make timestamps for each work directory, instead of
making for each compilation and link.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53714 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
be able to use the remaining bits for flags.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
added hand-coded support for Turkic, fixed logic for swapcase.
* string.c: Made use of new case mapping code possible from upcase,
capitalize, and swapcase (with :lithuanian as a guard).
* test/ruby/enc/test_case_mapping.rb: Adjusted for above.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53562 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
case mapping. The code path is currently guarded by the :lithuanian
option to avoid accidental problems in daily use.
* test/ruby/enc/test_case_mapping.rb: Test for above.
* string.c: function 'check_case_options': fixed logical errors
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53548 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/Makefile.in (ECHO1): expand NULLCMD by configured value to
get rid of a bug of nmake, that it can expand bare single name
variable but cannot in substition.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53437 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/depend (enc, trans): fix version dependency, let encoding
and transcoding shared object files depend on config.status,
instead of enc.mk which is regenerated at each build, for the
RUBY_SO_NAME value used at runtime link.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53325 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/depend (enc, trans): fix version dependency, shared object
files depend on the RUBY_SO_NAME value for runtime link.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53324 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
from ISO-8859-2 to fix 0x80..0x9e range (from Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53198 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
test/ruby/test_transcode.rb: Fixed encoding name
to the correct one in the IANA registry (IBM037)
and added an alias (ebcdic-cp-us)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53124 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/trans/ebcdic.trans: transcodings between EBCDIC-US
and iso-8859-1 [with code from Andrea Ribuoli]
* test/ruby/test_transcode.rb: tests for above
* tool/transcode_tablegen.rb: additional argument for
method transcode_tblgen
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53112 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/windows_1252.c: separate from ISO-8859-1 to fix 0x80..0x9e
range. [ruby-core:64049] [Bug #10097]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53046 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
regular expressions from 7.0.0 to 8.0.0
(with help from Kimihito Matsui) [Feature #11563]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52612 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/{ascii,us_ascii,utf_8}.c: set encoding indexes of
fundamental built-in encodings so that usable as well as
allocated rb_encoding before rb_enc_init().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51862 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/make_encmake.rb: @srcdir@ in enc/Makefile.in needs to be
expanded.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51770 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
not include encinit.c itself. It caused "undefined reference to
Init_encinit".
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@50978 b2dd03c8-39d4-4d8f-98ff-823fe69b080e