Encoding::Converter.stateless_encoding("html-attr-escaped") should be
nil.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19174 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
allocated by caller.
(rb_econv_insert_output): provide caller allocated buffer to
allocate_converted_string.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19159 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(econv_primitive_convert): accept a hash as 5th argument as well.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
it tend to cause security problem. If the behaviour is really
required, ECONV_INVALID_REPLACE with empty string can be used.
For example, CVE-2006-2313, CVE-2008-1036, [ruby-core:15645]
(ECONV_UNDEF_IGNORE): ditto.
* transcode.c (rb_econv_convert): follow the above change.
(econv_opts): ditto.
(Init_transcode): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19123 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (rb_econv_t): new fields: replacement_str,
replacement_len, replacement_enc and replacement_allocated.
(get_replacement_character): make len as size_t.
(rb_econv_open_by_transcoder_entries): initialize the new fields.
(rb_econv_close): deallocate replacement_str if it allocated.
(make_replacement): new function.
(output_replacement_character): use make_replacement.
(rb_econv_set_replacemenet): defined.
(econv_get_replacement): new method.
(econv_set_replacement): new method.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19108 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rb_econv_option_t has only one field, int flags, rb_econv_option_t is
replaced by int.
* include/ruby/io.h: follow the above change.
* io.c: ditto.
* transcode.c: ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19103 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
add state field.
(TRANSCODING_STATE): defined.
(rb_transcoder): add fields: state_size, state_init_func,
state_fini_func.
change rb_transcoding* argument to void*.
* transcode.c (transcode_restartable0): use TRANSCODING_STATE for
first arguments of transcoder functions.
(rb_transcoding_open_by_transcoder): initialize state field.
(rb_transcoding_close): finalize state field.
* tool/transcode-tblgen.rb: provide state size/init/fini.
* enc/trans/newline.trans (universal_newline_init): defined.
(fun_so_universal_newline): take void* as a state pointer.
(rb_universal_newline): provide state size/init/fini.
(rb_crlf_newline): ditto.
(rb_cr_newline): ditto.
* enc/trans/iso2022.trans (iso2022jp_init): defined.
(fun_si_iso2022jp_to_eucjp): take void* as a state pointer.
(fun_so_iso2022jp_to_eucjp): ditto.
(fun_so_eucjp_to_iso2022jp): ditto.
(iso2022jp_reset_sequence_size): ditto.
(finish_eucjp_to_iso2022jp): ditto.
(rb_ISO_2022_JP_to_EUC_JP): provide state size/init/fini.
(rb_EUC_JP_to_ISO_2022_JP): ditto.
* enc/trans/utf_16_32.trans (fun_so_from_utf_16be): take void* as a
state pointer.
(fun_so_to_utf_16be): ditto.
(fun_so_from_utf_16le): ditto.
(fun_so_to_utf_16le): ditto.
(fun_so_from_utf_32be): ditto.
(fun_so_to_utf_32be): ditto.
(fun_so_from_utf_32le): ditto.
(fun_so_to_utf_32le): ditto.
(rb_from_UTF_16BE): provide state size/init/fini.
(rb_to_UTF_16BE): ditto.
(rb_from_UTF_16LE): ditto.
(rb_to_UTF_16LE): ditto.
(rb_from_UTF_32BE): ditto.
(rb_to_UTF_32BE): ditto.
(rb_from_UTF_32LE): ditto.
(rb_to_UTF_32LE): ditto.
* enc/trans/japanese.trans (fun_so_eucjp2sjis): take void* as a state
pointer.
(fun_so_sjis2eucjp): ditto.
(rb_eucjp2sjis): provide state size/init/fini.
(rb_sjis2eucjp): provide state size/init/fini.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19096 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(WORD_ADDR): ditto.
(BL_BASE): use BYTE_ADDR and WORD_ADDR.
(BL_INFO): use WORD_ADDR.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19089 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
word_array to avoid relocation.
* transcode.c (transcode_restartable0): add word_array to get infos
and BYTE_LOOKUPs.
* transcode_data.h (BYTE_LOOKUP_INFO): change return type to
uintptr_t.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19045 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_econv_t): defined as an incomplete type.
* transcode.c (rb_econv_elem_t): moved from encoding.h.
(rb_econv_t): complete type defined.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18872 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
byte sequence exception. store the part as an instance variable.
(ecerr_readagain_bytes): new method to access the readagain part.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18850 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
additional transcoders.
(econv_description): extracted from rb_econv_open_exc.
(rb_econv_open_exc): use econv_description.
(econv_inspect): use econv_description.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18843 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
last_error. num_trans may be zero.
(rb_econv_convert0): num_trans may be zero.
(rb_econv_putbackable): ditto.
(rb_econv_putback): ditto.
(rb_econv_convert): input_ptr and output_ptr may be NULL.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18835 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rb_econv_option_t*.
* transcode.c (transcode_loop): take rb_econv_option_t* as a argument.
(str_transcode0): ditto.
(str_transcode): make rb_econv_option_t and call str_transcode0 with
it.
(rb_str_transcode): take rb_econv_option_t*.
* io.c (io_fwrite): follow the rb_str_transcode change.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18814 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (econv_opts): extracted from str_transcode.
(str_transcode_enc_args): extracted from str_transcode.
(str_transcode0): extracted from str_transcode.
(str_transcode): use econv_opts, str_transcode_enc_args,
str_transcode0.
(rb_str_transcode): call str_transcode0.
(econv_primitive_insert_output): give the additional argument for
rb_str_transcode.
* io.c (make_writeconv): use invalid/undef flags.
(io_fwrite): ditto.
(rb_scan_open_args): give the additional argument for
rb_str_transcode.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18808 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* include/ruby/encoding.h (rb_econv_t): new field: flags.
(rb_econv_binmode): declared.
* io.c (io_unread): text mode hack removed.
(NEED_NEWLINE_DECODER): defined.
(NEED_NEWLINE_ENCODER): defined.
(NEED_READCONV): defined.
(NEED_WRITECONV): defined.
(TEXTMODE_NEWLINE_ENCODER): defined for windows.
(make_writeconv): setup converter with TEXTMODE_NEWLINE_ENCODER for
text mode.
(io_fwrite): use NEED_WRITECONV. character code conversion is
disabled if fptr->writeconv_stateless is nil.
(make_readconv): setup converter with
ECONV_UNIVERSAL_NEWLINE_DECODER for text mode.
(read_all): use NEED_READCONV.
(appendline): use NEED_READCONV.
(rb_io_getline_1): use NEED_READCONV.
(io_getc): use NEED_READCONV.
(rb_io_ungetc): use NEED_READCONV.
(rb_io_binmode): OS-level text mode test removed. call
rb_econv_binmode.
(rb_io_binmode_m): call rb_io_binmode_m with write_io as well.
(rb_io_flags_mode): return mode string including "t".
(rb_io_mode_flags): detect "t" for text mode.
(rb_sysopen): always specify O_BINARY.
* transcode.c (rb_econv_open_by_transcoder_entries): initialize flags.
(rb_econv_open): if source and destination encoding is
both empty string, open newline converter. last_tc will be NULL in
this case.
(rb_econv_encoding_to_insert_output): last_tc may be NULL now.
(rb_econv_string): ditto.
(output_replacement_character): ditto.
(transcode_loop): ditto.
(econv_init): ditto.
(econv_inspect): ditto.
(rb_econv_binmode): new function.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18780 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
to exception object.
(ecerr_source_encoding): new method:
Encoding::ConversionUndefined#source_encoding and
Encoding::InvalidByteSequence#source_encoding.
(ecerr_destination_encoding): new method:
Encoding::ConversionUndefined#destination_encoding and
Encoding::InvalidByteSequence#destination_encoding.
(econverr_error_char): new method:
Encoding::ConversionUndefined#error_char.
(econverr_error_bytes): new method:
Encoding::ConversionUndefined#error_bytes.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
to.
(rb_econv_t): new fields: source_encoding_name and
destination_encoding_name.
* transcode.c (rb_econv_open_by_transcoder_entries): initialize the
new fields.
(rb_econv_open): set up the new fields.
(econv_inspect): use the new fields.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18655 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
in_data_start, in_data_end, in_buf_end and last_trans_index.
(rb_econv_output): removed.
(rb_econv_insert_output): declared.
(rb_econv_encoding_to_insert_output): declared.
* enc/trans/newline.trans (rb_universal_newline): stateful_type
changed.
* transcode.c (transcode_restartable0): initialize inchar_start,
tc->recognized_len and next_table at beginning of the loop.
(rb_econv_open_by_transcoder_entries): initialize new fields.
(rb_econv_open): setup last_trans_index.
(trans_sweep): last out_buf_start can be non-NULL now.
(rb_econv_convert): check last out_buf_start and in_buf_start at
first.
(rb_econv_output_with_destination_encoding): removed.
(econv_just_convert): removed.
(rb_econv_output): removed.
(econv_primitive_output): method removed.
(rb_econv_encoding_to_insert_output): new function.
(allocate_converted_string): new function.
(rb_econv_insert_output): new function.
(econv_primitive_insert_output): new method.
(output_replacement_character): use rb_econv_insert_output. unused
arguments removed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18654 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_econv_output): use econv_just_convert.
(econv_primitive_output): new method.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18647 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (rb_trans_conv): new argument: result_position_ptr.
(rb_econv_convert): fill last_error.
(econv_result_to_symbol): extracted from econv_primitive_convert.
(econv_primitive_errinfo): new method.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18643 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rb_econv_open is failed.
(make_dummy_encoding): new function extracted from make_encoding.
(make_encoding): removed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18634 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode_data.h (rb_transcoder): add resetsize_func field.
* enc/trans/iso2022.trans (iso2022jp_reset_sequence_size): defined.
(rb_EUC_JP_to_ISO_2022_JP): provede resetsize_func.
* tool/transcode-tblgen.rb: set NULL for resetsize_func.
* transcode.c (rb_econv_output): new function for inserting output.
(output_replacement_character): use rb_econv_output.
(transcode_loop): check return value of
output_replacement_character.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18628 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
transcode_data.h.
(rb_econv_elem_t): ditto.
(rb_econv_t): ditto. source_encoding and destination_encoding field
is added.
(rb_econv_open): declared.
(rb_econv_convert): ditto.
(rb_econv_close): ditto.
* transcode.c (rb_econv_open_by_transcoder_entries): initialize
source_encoding and destination_encoding field as NULL.
(rb_econv_open): make it external linkage.
(rb_econv_close): ditto.
(rb_econv_convert): ditto. renamed from rb_econv_conv.
(make_encoding): new function.
(econv_init): use make_encoding and store rb_encoding* in
rb_econv_t.
(econv_source_encoding): new method
Encoding::Converter#source_encoding.
(econv_destination_encoding): new method
Encoding::Converter#destination_encoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18625 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
transcode_invalid_input.
(transcode_destination_buffer_full): renamed from transcode_obuf_full.
(transcode_source_buffer_empty): renamed from transcode_ibuf_empty.
(rb_econv_result_t): renamed from rb_trans_result_t.
(rb_econv_elem_t): renamed from rb_trans_elem_t.
(rb_econv_t): renamed from rb_trans_t.
* transcode.c (UNIVERSAL_NEWLINE_DECODER): renamed from
UNIVERSAL_NEWLINE.
(CRLF_NEWLINE_ENCODER): renamed from CRLF_NEWLINE.
(CR_NEWLINE_ENCODER): renamed from CR_NEWLINE.
(rb_econv_open): renamed from rb_trans_open.
(rb_econv_close): renamed from rb_trans_close.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
too much, even for multilevel conversion.
(transcode_loop): use rb_econv_conv.
(econv_primitive_convert): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18610 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
transcode_output_followed_by_input.
* transcode.c (OUTPUT_FOLLOWED_BY_INPUT): new flag.
(transcode_restartable0): suspend when output followed by input if
OUTPUT_FOLLOWED_BY_INPUT is specified.
(trans_sweep): check OUTPUT_FOLLOWED_BY_INPUT.
(rb_trans_conv): support OUTPUT_FOLLOWED_BY_INPUT.
(econv_primitive_convert): return :output_followed_by_input for
transcode_output_followed_by_input.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18608 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_cr_newline): new transcoder.
* transcode.c (trans_open_i): one more exra room for input newline
converter.
(rb_trans_open): crlf newline and cr newline implemented.
(Init_transcode): Encoding::Converter::CRLF_NEWLINE and
Encoding::Converter::LF_NEWLINE defined.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18557 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
path.
(load_transcoder_entry): renamed from load_transcoder.
(load_transcoder): new function for loding transcoder by encoding
names.
(rb_transcoding_open_by_transcoder): extracted from
rb_transcoding_open.
(rb_transcoding_open): use load_transcoder and
rb_transcoding_open_by_transcoder.
(rb_trans_open_by_transcoder_entries): new function.
(trans_open_i): construct entries array.
(rb_trans_open): use rb_trans_open_by_transcoder_entries.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18551 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (trans_open_i): just record from and to.
(rb_trans_open): load transcodings.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18531 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* tool/transcode-tblgen.rb: 8bit byte of ASCII-8BIT is a valid
(but unique to ASCII-8BIT) character.
* transcode.c (rb_eConversionUndefined): new error.
(rb_eInvalidByteSequence): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18524 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
resetting a state of stateful encoding.
* enc/trans/iso2022.trans (rb_EUC_JP_to_ISO_2022_JP): specify
finish_eucjp_to_iso2022jp for resetstate_func.
* tool/transcode-tblgen.rb: specify NULL for resetstate_func.
* transcode.c (output_replacement_character): call resetstate_func
before appending the replacement character.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18503 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_trans_elem_t): new type.
(rb_trans_t): new type.
* transcode.c (transcode_dispatch_cb): removed.
(transcode_dispatch): removed.
(rb_transcoding_result_t): moved to rb_trans_result_t in
transcode_data.h.
(transcode_restartable0): goto follow_info when FUNsi.
(rb_transcoding_open): use get_transcoder_entry.
(rb_trans_open): new function.
(rb_trans_conv): ditto.
(rb_trans_close): ditto.
(trans_open_i): ditto.
(trans_sweep): ditto.
(more_output_buffer): take rb_trans_t instead of rb_transcoding as
an argument.
(transcode_loop): take from_encoding and to_encoding instead of tr
as arguments. use rb_trans_open/rb_trans_conv/rb_trans_close.
(str_transcode): don't use transcode_dispatch.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18498 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(transcode_restartable): use PARTIAL_INPUT for converting buffered
input.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18476 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (load_transcoder): extracted from transcode_dispatch_cb.
(rb_transcoding_result_t): renamed from transcode_result_t.
(rb_transcoding_open): new function.
(rb_transcoding_convert): ditto.
(rb_transcoding_close): ditto.
(transcode_loop): use rb_transcoding_open, rb_transcoding_convert
and rb_transcoding_close.
(str_transcode): don't need rb_transcoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18474 b2dd03c8-39d4-4d8f-98ff-823fe69b080e