byte sequence exception. store the part as an instance variable.
(ecerr_readagain_bytes): new method to access the readagain part.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18850 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
additional transcoders.
(econv_description): extracted from rb_econv_open_exc.
(rb_econv_open_exc): use econv_description.
(econv_inspect): use econv_description.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18843 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
last_error. num_trans may be zero.
(rb_econv_convert0): num_trans may be zero.
(rb_econv_putbackable): ditto.
(rb_econv_putback): ditto.
(rb_econv_convert): input_ptr and output_ptr may be NULL.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18835 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rb_econv_option_t*.
* transcode.c (transcode_loop): take rb_econv_option_t* as an argument.
(str_transcode0): ditto.
(str_transcode): make rb_econv_option_t and call str_transcode0 with
it.
(rb_str_transcode): take rb_econv_option_t*.
* io.c (io_fwrite): follow the rb_str_transcode change.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18814 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (econv_opts): extracted from str_transcode.
(str_transcode_enc_args): extracted from str_transcode.
(str_transcode0): extracted from str_transcode.
(str_transcode): use econv_opts, str_transcode_enc_args,
str_transcode0.
(rb_str_transcode): call str_transcode0.
(econv_primitive_insert_output): give the additional argument for
rb_str_transcode.
* io.c (make_writeconv): use invalid/undef flags.
(io_fwrite): ditto.
(rb_scan_open_args): give the additional argument for
rb_str_transcode.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18808 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* include/ruby/encoding.h (rb_econv_t): new field: flags.
(rb_econv_binmode): declared.
* io.c (io_unread): text mode hack removed.
(NEED_NEWLINE_DECODER): defined.
(NEED_NEWLINE_ENCODER): defined.
(NEED_READCONV): defined.
(NEED_WRITECONV): defined.
(TEXTMODE_NEWLINE_ENCODER): defined for windows.
(make_writeconv): setup converter with TEXTMODE_NEWLINE_ENCODER for
text mode.
(io_fwrite): use NEED_WRITECONV. character code conversion is
disabled if fptr->writeconv_stateless is nil.
(make_readconv): setup converter with
ECONV_UNIVERSAL_NEWLINE_DECODER for text mode.
(read_all): use NEED_READCONV.
(appendline): use NEED_READCONV.
(rb_io_getline_1): use NEED_READCONV.
(io_getc): use NEED_READCONV.
(rb_io_ungetc): use NEED_READCONV.
(rb_io_binmode): OS-level text mode test removed. call
rb_econv_binmode.
(rb_io_binmode_m): call rb_io_binmode with write_io as well.
(rb_io_flags_mode): return mode string including "t".
(rb_io_mode_flags): detect "t" for text mode.
(rb_sysopen): always specify O_BINARY.
* transcode.c (rb_econv_open_by_transcoder_entries): initialize flags.
(rb_econv_open): if the source and destination encodings are
both empty strings, open a newline converter. last_tc will be NULL in
this case.
(rb_econv_encoding_to_insert_output): last_tc may be NULL now.
(rb_econv_string): ditto.
(output_replacement_character): ditto.
(transcode_loop): ditto.
(econv_init): ditto.
(econv_inspect): ditto.
(rb_econv_binmode): new function.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18780 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
to exception object.
(ecerr_source_encoding): new method:
Encoding::ConversionUndefined#source_encoding and
Encoding::InvalidByteSequence#source_encoding.
(ecerr_destination_encoding): new method:
Encoding::ConversionUndefined#destination_encoding and
Encoding::InvalidByteSequence#destination_encoding.
(econverr_error_char): new method:
Encoding::ConversionUndefined#error_char.
(econverr_error_bytes): new method:
Encoding::ConversionUndefined#error_bytes.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
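The accessors added in this entry can be tried from Ruby. The following is a minimal sketch, assuming a current Ruby release, where the exception class is named Encoding::UndefinedConversionError rather than Encoding::ConversionUndefined as in this log. The helper name describe_undefined_conversion is hypothetical.

```ruby
# Hypothetical helper: attempt a conversion and, if a character has no
# mapping in the destination encoding, report what the exception exposes.
# Assumes current Ruby, where the class is Encoding::UndefinedConversionError.
def describe_undefined_conversion(str, to)
  str.encode(to)
  nil # conversion succeeded, nothing to report
rescue Encoding::UndefinedConversionError => e
  {
    source: e.source_encoding,           # encoding the failing step read from
    destination: e.destination_encoding, # encoding the failing step wrote to
    char: e.error_char                   # the character that could not be mapped
  }
end

# HIRAGANA LETTER A (U+3042) has no ISO-8859-1 equivalent.
info = describe_undefined_conversion("\u3042", "ISO-8859-1")
p info
```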
to.
(rb_econv_t): new fields: source_encoding_name and
destination_encoding_name.
* transcode.c (rb_econv_open_by_transcoder_entries): initialize the
new fields.
(rb_econv_open): set up the new fields.
(econv_inspect): use the new fields.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18655 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
in_data_start, in_data_end, in_buf_end and last_trans_index.
(rb_econv_output): removed.
(rb_econv_insert_output): declared.
(rb_econv_encoding_to_insert_output): declared.
* enc/trans/newline.trans (rb_universal_newline): stateful_type
changed.
* transcode.c (transcode_restartable0): initialize inchar_start,
tc->recognized_len and next_table at beginning of the loop.
(rb_econv_open_by_transcoder_entries): initialize new fields.
(rb_econv_open): setup last_trans_index.
(trans_sweep): last out_buf_start can be non-NULL now.
(rb_econv_convert): check last out_buf_start and in_buf_start at
first.
(rb_econv_output_with_destination_encoding): removed.
(econv_just_convert): removed.
(rb_econv_output): removed.
(econv_primitive_output): method removed.
(rb_econv_encoding_to_insert_output): new function.
(allocate_converted_string): new function.
(rb_econv_insert_output): new function.
(econv_primitive_insert_output): new method.
(output_replacement_character): use rb_econv_insert_output. unused
arguments removed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18654 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_econv_output): use econv_just_convert.
(econv_primitive_output): new method.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18647 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (rb_trans_conv): new argument: result_position_ptr.
(rb_econv_convert): fill last_error.
(econv_result_to_symbol): extracted from econv_primitive_convert.
(econv_primitive_errinfo): new method.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18643 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rb_econv_open fails.
(make_dummy_encoding): new function extracted from make_encoding.
(make_encoding): removed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18634 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode_data.h (rb_transcoder): add resetsize_func field.
* enc/trans/iso2022.trans (iso2022jp_reset_sequence_size): defined.
(rb_EUC_JP_to_ISO_2022_JP): provide resetsize_func.
* tool/transcode-tblgen.rb: set NULL for resetsize_func.
* transcode.c (rb_econv_output): new function for inserting output.
(output_replacement_character): use rb_econv_output.
(transcode_loop): check return value of
output_replacement_character.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18628 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
transcode_data.h.
(rb_econv_elem_t): ditto.
(rb_econv_t): ditto. source_encoding and destination_encoding fields
are added.
(rb_econv_open): declared.
(rb_econv_convert): ditto.
(rb_econv_close): ditto.
* transcode.c (rb_econv_open_by_transcoder_entries): initialize the
source_encoding and destination_encoding fields as NULL.
(rb_econv_open): make it external linkage.
(rb_econv_close): ditto.
(rb_econv_convert): ditto. renamed from rb_econv_conv.
(make_encoding): new function.
(econv_init): use make_encoding and store rb_encoding* in
rb_econv_t.
(econv_source_encoding): new method
Encoding::Converter#source_encoding.
(econv_destination_encoding): new method
Encoding::Converter#destination_encoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18625 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
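As a brief illustration of the two methods named in this entry, the sketch below opens a converter and reads back both encodings. It assumes a current Ruby release. The helper name converter_endpoints is hypothetical.

```ruby
# Hypothetical helper: open a converter and return the encodings it
# reports via #source_encoding and #destination_encoding.
def converter_endpoints(from, to)
  ec = Encoding::Converter.new(from, to)
  [ec.source_encoding, ec.destination_encoding]
end

src, dst = converter_endpoints("UTF-8", "EUC-JP")
puts "#{src} -> #{dst}"

# The same kind of converter object also performs the conversion itself.
ec = Encoding::Converter.new("UTF-8", "EUC-JP")
p ec.convert("\u3042").bytes
```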
transcode_invalid_input.
(transcode_destination_buffer_full): renamed from transcode_obuf_full.
(transcode_source_buffer_empty): renamed from transcode_ibuf_empty.
(rb_econv_result_t): renamed from rb_trans_result_t.
(rb_econv_elem_t): renamed from rb_trans_elem_t.
(rb_econv_t): renamed from rb_trans_t.
* transcode.c (UNIVERSAL_NEWLINE_DECODER): renamed from
UNIVERSAL_NEWLINE.
(CRLF_NEWLINE_ENCODER): renamed from CRLF_NEWLINE.
(CR_NEWLINE_ENCODER): renamed from CR_NEWLINE.
(rb_econv_open): renamed from rb_trans_open.
(rb_econv_close): renamed from rb_trans_close.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
too much, even for multilevel conversion.
(transcode_loop): use rb_econv_conv.
(econv_primitive_convert): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18610 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
transcode_output_followed_by_input.
* transcode.c (OUTPUT_FOLLOWED_BY_INPUT): new flag.
(transcode_restartable0): suspend when output followed by input if
OUTPUT_FOLLOWED_BY_INPUT is specified.
(trans_sweep): check OUTPUT_FOLLOWED_BY_INPUT.
(rb_trans_conv): support OUTPUT_FOLLOWED_BY_INPUT.
(econv_primitive_convert): return :output_followed_by_input for
transcode_output_followed_by_input.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18608 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_cr_newline): new transcoder.
* transcode.c (trans_open_i): one more extra room for input newline
converter.
(rb_trans_open): crlf newline and cr newline implemented.
(Init_transcode): Encoding::Converter::CRLF_NEWLINE and
Encoding::Converter::CR_NEWLINE defined.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18557 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
path.
(load_transcoder_entry): renamed from load_transcoder.
(load_transcoder): new function for loading transcoder by encoding
names.
(rb_transcoding_open_by_transcoder): extracted from
rb_transcoding_open.
(rb_transcoding_open): use load_transcoder and
rb_transcoding_open_by_transcoder.
(rb_trans_open_by_transcoder_entries): new function.
(trans_open_i): construct entries array.
(rb_trans_open): use rb_trans_open_by_transcoder_entries.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18551 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (trans_open_i): just record from and to.
(rb_trans_open): load transcodings.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18531 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* tool/transcode-tblgen.rb: 8bit byte of ASCII-8BIT is a valid
(but unique to ASCII-8BIT) character.
* transcode.c (rb_eConversionUndefined): new error.
(rb_eInvalidByteSequence): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18524 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
resetting a state of stateful encoding.
* enc/trans/iso2022.trans (rb_EUC_JP_to_ISO_2022_JP): specify
finish_eucjp_to_iso2022jp for resetstate_func.
* tool/transcode-tblgen.rb: specify NULL for resetstate_func.
* transcode.c (output_replacement_character): call resetstate_func
before appending the replacement character.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18503 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_trans_elem_t): new type.
(rb_trans_t): new type.
* transcode.c (transcode_dispatch_cb): removed.
(transcode_dispatch): removed.
(rb_transcoding_result_t): moved to rb_trans_result_t in
transcode_data.h.
(transcode_restartable0): goto follow_info when FUNsi.
(rb_transcoding_open): use get_transcoder_entry.
(rb_trans_open): new function.
(rb_trans_conv): ditto.
(rb_trans_close): ditto.
(trans_open_i): ditto.
(trans_sweep): ditto.
(more_output_buffer): take rb_trans_t instead of rb_transcoding as
an argument.
(transcode_loop): take from_encoding and to_encoding instead of tr
as arguments. use rb_trans_open/rb_trans_conv/rb_trans_close.
(str_transcode): don't use transcode_dispatch.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18498 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(transcode_restartable): use PARTIAL_INPUT for converting buffered
input.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18476 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (load_transcoder): extracted from transcode_dispatch_cb.
(rb_transcoding_result_t): renamed from transcode_result_t.
(rb_transcoding_open): new function.
(rb_transcoding_convert): ditto.
(rb_transcoding_close): ditto.
(transcode_loop): use rb_transcoding_open, rb_transcoding_convert
and rb_transcoding_close.
(str_transcode): don't need rb_transcoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18474 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (transcode_restartable0): renamed from
transcode_restartable.
save the input buffer into the feed buffer if the next character starts
at a point before the input buffer. for example, "\x00\xd8\x01" then
"\x02" in UTF-16LE. \x02 makes the sequence invalid and the next
character starts from \x01.
(transcode_restartable): new function to call
transcode_restartable0. if feed buffer is not empty, convert it at
first.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18467 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
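The UTF-16LE case in this entry can be checked by hand. The sketch below is not the transcoder's logic. It only splits the bytes into 16-bit little-endian code units to show why the error is known only once "\x02" arrives: "\x00\xd8" is the high surrogate U+D800, and the following unit "\x01\x02" (0x0201) is not a low surrogate, so the already-consumed "\x01" has to be read again. The helper name dangling_high_surrogate? is hypothetical.

```ruby
# Hypothetical helper: report whether a UTF-16LE byte string begins with
# a high surrogate that is not followed by a low surrogate.
def dangling_high_surrogate?(bytes)
  first, second = bytes.unpack("v*") # 16-bit little-endian code units
  return false unless first && (0xD800..0xDBFF).cover?(first)
  !(second && (0xDC00..0xDFFF).cover?(second))
end

# "\x00\xd8" is 0xD800 and "\x01\x02" is 0x0201, which is not a low surrogate.
p dangling_high_surrogate?("\x00\xd8\x01\x02".b)
# A well-formed pair for comparison: 0xD800 followed by 0xDC00.
p dangling_high_surrogate?("\x00\xd8\x00\xdc".b)
```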
as parameters.
(more_output_buffer): ditto.
(str_transcoding_resize): argument changed from rb_transcoding* to
VALUE.
(str_transcode): call transcode_loop with destination string and its
resize function.
* transcode_data.h (rb_transcoding): move ruby_string_dest and
flush_func to transcode_loop parameters.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18458 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(transcode_restartable): arguments changed so that decreasing *in_pos
cannot make it point outside the buffer.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18455 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Thu Aug 7 23:43:11 2008 Tanaka Akira <akr@fsij.org>
* transcode_data.h (rb_transcoding): new field "stateful".
(rb_transcoder): preprocessor and postprocessor field removed.
change arguments of func_ii, func_si, func_io and func_so.
new field "finish_func".
* tool/transcode-tblgen.rb: make FUNii, FUNsi and FUNio
generatable.
* transcode.c (transcoder_lib_table): removed.
(transcoder_table): change structure.
(transcoder_key): removed because of the above structure change.
(make_transcoder_entry): new function.
(get_transcoder_entry): ditto.
(rb_register_transcoder): follow the structure change.
(declare_transcoder): ditto.
(transcode_search_path): new function for breadth first search to
find a list of converters.
(transcode_search_path_i): new function.
(transcode_dispatch_cb): ditto.
(transcode_dispatch): use transcode_search_path.
(transcode_loop): follow the argument change.
(str_transcode): preprocessor and postprocessor stuff removed.
* enc/trans/iso2022.erb.c: new file. ISO-2022-JP conversion
re-implemented.
* enc/trans/japanese.erb.c: ISO-2022-JP stuff removed.
* enc/trans/utf_16_32.erb.c: follow argument change of FUNso.
[ruby-dev:35798]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18419 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
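The breadth-first search mentioned for transcode_search_path can be sketched in Ruby as follows. This is an illustration of the technique only, not the C implementation: the DIRECT table and the search_path method are hypothetical, and the set of direct converters shown is made up for the example.

```ruby
# Hypothetical table of direct converters: source => [destinations].
DIRECT = {
  "ISO-8859-1" => ["UTF-8"],
  "UTF-8"      => ["ISO-8859-1", "EUC-JP", "UTF-16BE"],
  "EUC-JP"     => ["UTF-8", "ISO-2022-JP"]
}.freeze

# Breadth-first search from `from` to `to`. Returns the shortest chain of
# [source, destination] steps, [] if no conversion is needed, or nil if
# `to` cannot be reached.
def search_path(from, to, table = DIRECT)
  return [] if from == to
  prev = { from => nil } # visited set plus back-pointers
  queue = [from]
  until queue.empty?
    cur = queue.shift
    (table[cur] || []).each do |nxt|
      next if prev.key?(nxt)
      prev[nxt] = cur
      if nxt == to
        # Walk the back-pointers to rebuild the list of converters.
        path = []
        node = to
        while prev[node]
          path.unshift([prev[node], node])
          node = prev[node]
        end
        return path
      end
      queue << nxt
    end
  end
  nil
end

p search_path("ISO-8859-1", "ISO-2022-JP")
```

Because the search proceeds level by level, the first path that reaches the destination uses the fewest intermediate conversions.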
string.c (rb_str_replace), transcode.c (transcode_dispatch): fixed
memory leaks. based on patches from shinichiro.h <shinichiro.hamaji
AT gmail.com> at [ruby-dev:35751].
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18341 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
character when converting to Unicode.
* test/ruby/test_transcode.rb (test_unicode_public_review_issue_121):
rename from test_public_review_issue_121.
* test/ruby/test_transcode.rb (test_unicode_public_review_issue_121):
enable option2.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18294 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
suppress warnings.
* util.c (quorem, rv_alloc, nrv_alloc): only used in dtoa().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16873 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
C API of encoding conversion for Ruby object.
VALUE rb_str_transcode(VALUE str, VALUE to).
* transcode.c (str_encode, str_encode_bang):
rename from rb_tr_transcode or rb_str_transcode_bang.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16496 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (init_transcoder_table): moved to enc/trans/transdb.c.
* enc/depend (enc/encdb.o enc/trans/transdb.o): depend on
corresponding headers.
* common.mk (COMMONOBJS): moved transcode.o from OBJS
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c: Added basic support for passing options to String#encode
via a hash. Currently only one option, with one value, is supported:
invalid: :ignore (dropping invalid byte sequences instead of
producing an error). Option naming is not yet stable!
* test/ruby/test_transcode.rb: Added a single test for invalid: :ignore
option. Not more tests because most data does not yet distinguish
between INVALID and UNKNOWN.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15565 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
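As the entry warns, the option naming was not stable. The sketch below assumes a current Ruby release, where dropping invalid byte sequences is written as invalid: :replace with an empty replacement string instead of invalid: :ignore. The helper name drop_invalid_bytes is hypothetical.

```ruby
# Hypothetical helper: convert UTF-8 input to ISO-8859-1, silently
# dropping bytes that are not valid UTF-8. Assumes current Ruby option
# names (invalid: :replace, replace: ""), not the invalid: :ignore
# spelling described in this log entry.
def drop_invalid_bytes(str)
  str.encode("ISO-8859-1", "UTF-8", invalid: :replace, replace: "")
end

# "\xFF" can never appear in well-formed UTF-8.
cleaned = drop_invalid_bytes("abc\xFFdef")
p cleaned
```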
* transcode_data.h (rb_transcoding): include pointer to rb_transcoder
and auxiliary data.
* transcode_data.h (rb_transcoder): all callback functions should have
their own parameters.
* enc/trans/{japanese,single_byte}.c: constified.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15148 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/trans/utf_16_32.c: new file, currently implementing
UTF-16BE conversions only.
* test/ruby/test_transcode.rb: Added tests for UTF-16BE;
made check_both_ways() use force_encoding differently.
* transcode_data.h, transcode.c: Support for more conversion
functions.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15142 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (transcode_dispatch): reverted some of the changes
in r14746.
* transcode.c, enc/trans/single_byte.c: Added conversions to/from
US-ASCII and ASCII-8BIT (using data tables).
* enc/trans/single_byte.c: Some spacing/ordering changes due to
automatic data file generation.
* transcode_data.h, transcode.c: Preliminary code for using
micro-conversion functions.
* test/ruby/test_transcode.rb: Added some tests for US-ASCII and
ASCII-8BIT conversions.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14766 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c: Moving a static counter from inside register_transcoder()
and register_functional_transcoder() to outside the functions, renaming
from n to next_transcoder_position. Fixes 3) in [ruby-dev:32715].
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14651 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
ruby.c, transcode.c: rename rb_ascii_encoding to
rb_ascii8bit_encoding. rb_ascii_encoding is ambiguous with
ASCII-8BIT and US-ASCII.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14504 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode_data_one_byte: slightly optimized
* transcode_data_japanese: new data file for EUC-JP and SHIFT_JIS
(not yet optimized; tests to follow; data from
http://nkf.sourceforge.jp/ucm/{SJIS|eucJP}-nkf.ucm)
* common.mk, transcode.c: Adjusted for transcode_data_japanese
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14472 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
encoding even if no conversion is done because of 7bit only.
[ruby-dev:32591]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14293 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode_data_iso_8859.c: Changed from character constants
('\xC2') to integer constants (0xC2) for shorter files and
better readability; eliminated duplicated tables; changed
from -1 offset to actual UNDEF entry (not yet distinguishing
UNDEF and ILLEGAL correctly).
* test/ruby/test_transcode.rb: added a test for UNDEF conversion.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14251 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c: some minor formatting fixes
* transcode_data.h, transcode_data_iso_8859.c: Shortened
extremely frequently used macros to shorten file length.
* test/ruby/test_transcode.rb: Fixed name of test class;
added setup method to ensure all necessary encodings exist;
split tests into more test methods; added tests; fixed ordering
of arguments in assert_equal to have expected result first.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14236 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (transcoding): added a pointer to function to flush.
* transcode.c (transcode_loop): do not use string internal.
[ruby-dev:32512]
* transcode.c (str_transcode): allow Encoding objects.
* transcode_data.h (BYTE_LOOKUP): use actual struct name.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14176 b2dd03c8-39d4-4d8f-98ff-823fe69b080e