given explicit but the same destination and source encoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21047 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
behave as if :undef => :replace, :invalid => :replace were specified.
* transcode.c (rb_econv_prepare_opts): should preserve options in
any case.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19818 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
specified transcode the string into Encoding.default_internal.
inspired by [ruby-core:19298].
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
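
The opening of the entry above is truncated, so the following Ruby sketch rests on an assumption: that it describes String#encode called without a destination encoding. ISO-8859-1 is an arbitrary choice of internal encoding for the demonstration.

```ruby
# Sketch (assumption: the entry is about String#encode with no arguments).
# ISO-8859-1 is an arbitrary internal encoding chosen for illustration.
saved_internal = Encoding.default_internal
saved_verbose = $VERBOSE
$VERBOSE = nil                        # silence any warning about changing the default
begin
  Encoding.default_internal = Encoding::ISO_8859_1
  converted = "\u00e9".encode         # no destination encoding given
ensure
  Encoding.default_internal = saved_internal
  $VERBOSE = saved_verbose
end

p converted.encoding
p converted.bytes
```

The defaults are restored in the ensure clause so that the sketch does not leak a changed Encoding.default_internal into the rest of the process.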
superclass for all encoding related exception classes,
e.g. Encoding::CompatibilityError. [ruby-dev:36371]
* transcode.c (Init_transcode): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19570 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
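
A common superclass lets callers rescue every encoding-related failure in one clause. The Ruby sketch below assumes the superclass is EncodingError, as in current Ruby; the two sample strings are arbitrary.

```ruby
# Sketch: rescue an encoding failure through the common superclass.
# Assumes the superclass introduced here is EncodingError.
utf8  = "\u3042"                                   # non-ASCII UTF-8 string
eucjp = "\xA4\xA2".dup.force_encoding("EUC-JP")    # non-ASCII EUC-JP string

caught = begin
  utf8 + eucjp            # concatenating incompatible encodings raises
  nil
rescue EncodingError => e
  e
end

p caught.class
p Encoding::CompatibilityError.ancestors.include?(EncodingError)
```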
problem. StringValueCStr modifies the argument and it should be
preserved while the string StringValueCStr returns is used.
Since the string is used by the caller, the modified argument should be
held by the caller. Actually
GC.stress = true
def (o=Object.new).to_str()
"universal"+"_newline"
end
"\u3042".encode(o, "")
causes a curious warning:
rb_define_const: invalid name `' for constant
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19408 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
a patch from Tadashi Saito <shiba at mail2.accsnet.ne.jp> in
[ruby-dev:36346].
* encoding.c (Init_Encoding): rename EncodingCompatibilityError to
Encoding::CompatibilityError. [ruby-dev:36366]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19407 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
raising.
* transcode.c (enc_arg): need not take a pointer argument.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19406 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(makeSTR1LEN): defined.
* tool/transcode-tblgen.rb: use makeSTR1LEN. generate STR1 for 4 to
259 bytes.
* transcode.c (rb_transcoding): new field: output_index.
(transcode_restartable0): use STR1_LENGTH.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19366 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (transcode_restartable0): don't need to cast the result
of output functions.
* enc/trans/newline.trans: follow the type change.
* enc/trans/escape.trans: ditto.
* enc/trans/utf_16_32.trans: ditto.
* enc/trans/iso2022.trans: ditto.
* enc/trans/japanese.trans: ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19351 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_econv_init_by_convpath): new function.
(econv_init): use rb_econv_init_by_convpath.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_econv_open): use decorator_names.
(econv_args): extracted from econv_init.
(econv_init): use econv_args.
(decorate_convpath): new function.
(search_convpath_i): new function.
(econv_s_search_convpath): new method.
(Init_transcode): new method defined.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19305 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
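
The new method from the entry above can be exercised from Ruby roughly as follows. This sketch assumes it is exposed as Encoding::Converter.search_convpath, as in current Ruby; the encoding pair is an arbitrary example.

```ruby
# Sketch: ask for the chain of conversions between two encodings.
# Assumes the method is exposed as Encoding::Converter.search_convpath.
path = Encoding::Converter.search_convpath("ISO-8859-1", "EUC-JP")

# Each step is a [source, destination] pair.
path.each { |from, to| puts "#{from} -> #{to}" }
```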
(rb_econv_alloc): extracted from
rb_econv_open_by_transcoder_entries.
(rb_econv_add_transcoder_at): extracted from rb_econv_decorate_at
and generalized
(rb_econv_open_by_transcoder_entries): use rb_econv_alloc and
rb_econv_add_transcoder_at.
(rb_econv_add_converter): extracted from rb_econv_decorate_at.
(rb_econv_decorate_at): use rb_econv_add_converter.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19304 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_econv_open_by_transcoder_entries): initialize started field.
(rb_econv_convert): set started field.
(rb_econv_insert_output): ditto.
(rb_econv_decorate_at): check started field instead of num_finished.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19303 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(make_encobj): new function.
(econv_s_asciicompat_encoding): use make_encoding.
(rb_econv_open_exc): use SUPPLEMENTAL_CONVERSION.
(econv_convpath): use encoding object in the result.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19288 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
incompatible replacements.
(make_replacement): don't convert the result of
get_replacement_character.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19277 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
ECONV_ENCODER_MASK and ECONV_DECORATOR_MASK.
(ECONV_UNIVERSAL_NEWLINE_DECORATOR): renamed from
ECONV_UNIVERSAL_NEWLINE_DECODER.
(ECONV_CRLF_NEWLINE_DECORATOR): renamed from
ECONV_CRLF_NEWLINE_ENCODER.
(ECONV_CR_NEWLINE_DECORATOR): renamed from ECONV_CR_NEWLINE_ENCODER.
(ECONV_XML_TEXT_DECORATOR): renamed from ECONV_XML_TEXT_ENCODER.
(ECONV_XML_ATTR_CONTENT_DECORATOR): renamed from
ECONV_XML_ATTR_CONTENT_ENCODER.
(ECONV_STATEFUL_DECORATOR_MASK): renamed from
ECONV_STATEFUL_ENCODER_MASK.
(ECONV_XML_ATTR_QUOTE_DECORATOR): renamed from
ECONV_XML_ATTR_QUOTE_ENCODER.
* io.c: follow the renaming.
* transcode.c: ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_econv_open_by_transcoder_entries): follow the type change.
(rb_econv_open0): ditto.
(rb_econv_decorate_at): ditto.
(rb_econv_binmode): ditto.
(rb_econv_insert_output): simplified because there are no decorators
at last.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19267 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_econv_decorate_at_last): declared.
* transcode.c (rb_econv_open_by_transcoder_entries): initialize
replacement_enc. allocate outbuf for the last transcoder.
(rb_econv_open0): extracted from rb_econv_open.
(rb_econv_open): use rb_econv_open0 and decorate the result using
rb_econv_decorate_at_first and rb_econv_decorate_at_last.
(rb_econv_decorate_at): new function.
(rb_econv_decorate_at_first): ditto.
(rb_econv_decorate_at_last): ditto.
(rb_econv_binmode): fix iteration end condition.
(econv_init): don't set source_encoding_name and
destination_encoding_name because they are set in rb_econv_open0.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19262 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
from rb_econv_stateless_encoding to apply stateless ASCII
incompatible encodings such as UTF-16BE.
* io.c (make_writeconv): use rb_econv_asciicompat_encoding.
* transcode_data.h (rb_transcoder_asciicompat_type_t): renamed from
rb_transcoder_stateful_type_t.
(rb_transcoder): use rb_transcoder_asciicompat_type_t.
* transcode.c: follow the type change.
(asciicompat_encoding_i): renamed from stateless_encoding_i.
(rb_econv_asciicompat_encoding): renamed from
rb_econv_stateless_encoding.
(econv_s_asciicompat_encoding): method renamed.
* tool/transcode-tblgen.rb: follow the type change.
* enc/trans/utf_16_32.trans: follow the type change.
rb_from_UTF_16BE to UTF-8 is asciicompat_decoder.
rb_from_UTF_16LE to UTF-8 is asciicompat_decoder.
rb_from_UTF_32BE to UTF-8 is asciicompat_decoder.
rb_from_UTF_32LE to UTF-8 is asciicompat_decoder.
UTF-8 to rb_to_UTF_16BE is asciicompat_encoder.
UTF-8 to rb_to_UTF_16LE is asciicompat_encoder.
UTF-8 to rb_to_UTF_32BE is asciicompat_encoder.
UTF-8 to rb_to_UTF_32LE is asciicompat_encoder.
* enc/trans/newline.trans: follow the type change. universal newline
decoder is asciicompat_converter.
* enc/trans/escape.trans: follow the type change.
* enc/trans/iso2022.trans: ditto.
* enc/trans/japanese.trans: ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19249 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
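
A short Ruby sketch of the renamed query, assuming it is reachable as Encoding::Converter.asciicompat_encoding as in current Ruby. The three encoding names are arbitrary samples: one stateless ASCII-incompatible encoding, one stateful encoding, and one that is already ASCII-compatible.

```ruby
# Sketch: look up the ASCII-compatible counterpart of an encoding.
# Assumes the method is exposed as Encoding::Converter.asciicompat_encoding.
names = %w[UTF-16BE ISO-2022-JP UTF-8]
results = names.to_h { |name| [name, Encoding::Converter.asciicompat_encoding(name)] }

results.each { |name, enc| puts "#{name}: #{enc.inspect}" }
```

An encoding that is already ASCII-compatible has no counterpart, so the lookup returns nil for it.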
Encoding::Converter.stateless_encoding("html-attr-escaped") should be
nil.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19174 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
allocated by caller.
(rb_econv_insert_output): provide caller allocated buffer to
allocate_converted_string.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19159 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(econv_primitive_convert): accept a hash as 5th argument as well.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
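
The hash form of the fifth argument can be sketched in Ruby as below. The assumptions are that the method is Encoding::Converter#primitive_convert and that the hash key is :partial_input, as in current Ruby; the input is one UTF-8 character split across two calls.

```ruby
# Sketch: pass options to primitive_convert as a hash instead of flag bits.
# Assumes the :partial_input key, as in current Ruby.
ec  = Encoding::Converter.new("UTF-8", "EUC-JP")
dst = String.new

# First two bytes of U+3042; more input is promised via partial_input.
first  = ec.primitive_convert("\xE3\x81".dup, dst, nil, nil, partial_input: true)
# Final byte of the same character; no options, so this call ends the input.
second = ec.primitive_convert("\x82".dup, dst)

p first
p second
p dst.bytes
```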
it tends to cause security problems. If the behaviour is really
required, ECONV_INVALID_REPLACE with empty string can be used.
For example, CVE-2006-2313, CVE-2008-1036, [ruby-core:15645]
(ECONV_UNDEF_IGNORE): ditto.
* transcode.c (rb_econv_convert): follow the above change.
(econv_opts): ditto.
(Init_transcode): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19123 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
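
The alternative mentioned in the entry above, replacing with an empty string, looks roughly like this at the Ruby level. The sketch assumes the String#encode options :invalid and :replace as in current Ruby; the input string and the UTF-16LE round trip are arbitrary choices for the demonstration.

```ruby
# Sketch: drop invalid bytes by replacing them with an empty string.
# The caller has to opt in explicitly, which is the point of removing "ignore".
broken  = "abc\xFFdef"                # \xFF is never valid in UTF-8
cleaned = broken.encode("UTF-16LE", "UTF-8", invalid: :replace, replace: "")
restored = cleaned.encode("UTF-8")

p restored
```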
* transcode.c (rb_econv_t): new fields: replacement_str,
replacement_len, replacement_enc and replacement_allocated.
(get_replacement_character): make len as size_t.
(rb_econv_open_by_transcoder_entries): initialize the new fields.
(rb_econv_close): deallocate replacement_str if it was allocated.
(make_replacement): new function.
(output_replacement_character): use make_replacement.
(rb_econv_set_replacement): defined.
(econv_get_replacement): new method.
(econv_set_replacement): new method.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19108 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
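
The two new methods can be tried from Ruby as in the sketch below, assuming they are exposed as Encoding::Converter#replacement and #replacement= as in current Ruby. The replacement text "<undef>" is an arbitrary example.

```ruby
# Sketch: read and change the replacement string of a converter.
# Assumes the methods are Encoding::Converter#replacement and #replacement=.
ec = Encoding::Converter.new("UTF-8", "US-ASCII", undef: :replace)
default_replacement = ec.replacement

ec.replacement = "<undef>"
converted = ec.convert("a \u3042 b")   # U+3042 has no US-ASCII mapping

p default_replacement
p converted
```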
rb_econv_option_t has only one field, int flags, rb_econv_option_t is
replaced by int.
* include/ruby/io.h: follow the above change.
* io.c: ditto.
* transcode.c: ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19103 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
add state field.
(TRANSCODING_STATE): defined.
(rb_transcoder): add fields: state_size, state_init_func,
state_fini_func.
change rb_transcoding* argument to void*.
* transcode.c (transcode_restartable0): use TRANSCODING_STATE for
first arguments of transcoder functions.
(rb_transcoding_open_by_transcoder): initialize state field.
(rb_transcoding_close): finalize state field.
* tool/transcode-tblgen.rb: provide state size/init/fini.
* enc/trans/newline.trans (universal_newline_init): defined.
(fun_so_universal_newline): take void* as a state pointer.
(rb_universal_newline): provide state size/init/fini.
(rb_crlf_newline): ditto.
(rb_cr_newline): ditto.
* enc/trans/iso2022.trans (iso2022jp_init): defined.
(fun_si_iso2022jp_to_eucjp): take void* as a state pointer.
(fun_so_iso2022jp_to_eucjp): ditto.
(fun_so_eucjp_to_iso2022jp): ditto.
(iso2022jp_reset_sequence_size): ditto.
(finish_eucjp_to_iso2022jp): ditto.
(rb_ISO_2022_JP_to_EUC_JP): provide state size/init/fini.
(rb_EUC_JP_to_ISO_2022_JP): ditto.
* enc/trans/utf_16_32.trans (fun_so_from_utf_16be): take void* as a
state pointer.
(fun_so_to_utf_16be): ditto.
(fun_so_from_utf_16le): ditto.
(fun_so_to_utf_16le): ditto.
(fun_so_from_utf_32be): ditto.
(fun_so_to_utf_32be): ditto.
(fun_so_from_utf_32le): ditto.
(fun_so_to_utf_32le): ditto.
(rb_from_UTF_16BE): provide state size/init/fini.
(rb_to_UTF_16BE): ditto.
(rb_from_UTF_16LE): ditto.
(rb_to_UTF_16LE): ditto.
(rb_from_UTF_32BE): ditto.
(rb_to_UTF_32BE): ditto.
(rb_from_UTF_32LE): ditto.
(rb_to_UTF_32LE): ditto.
* enc/trans/japanese.trans (fun_so_eucjp2sjis): take void* as a state
pointer.
(fun_so_sjis2eucjp): ditto.
(rb_eucjp2sjis): provide state size/init/fini.
(rb_sjis2eucjp): provide state size/init/fini.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19096 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(WORD_ADDR): ditto.
(BL_BASE): use BYTE_ADDR and WORD_ADDR.
(BL_INFO): use WORD_ADDR.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19089 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
word_array to avoid relocation.
* transcode.c (transcode_restartable0): add word_array to get infos
and BYTE_LOOKUPs.
* transcode_data.h (BYTE_LOOKUP_INFO): change return type to
uintptr_t.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19045 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_econv_t): defined as an incomplete type.
* transcode.c (rb_econv_elem_t): moved from encoding.h.
(rb_econv_t): complete type defined.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18872 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
byte sequence exception. store the part as an instance variable.
(ecerr_readagain_bytes): new method to access the readagain part.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18850 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
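
A Ruby sketch of reading the stored part from the exception, assuming the exception class is Encoding::InvalidByteSequenceError and the accessor is readagain_bytes, as in current Ruby. The input is an arbitrary EUC-JP byte sequence whose lead byte \xA1 is followed by an invalid trail byte.

```ruby
# Sketch: inspect the bytes that were read ahead when the error was detected.
# Assumes Encoding::InvalidByteSequenceError#readagain_bytes.
ec = Encoding::Converter.new("EUC-JP", "ISO-8859-1")

err = begin
  ec.convert("abc\xA1z")     # "z" cannot follow the lead byte \xA1
  nil
rescue Encoding::InvalidByteSequenceError => e
  e
end

p err.error_bytes
p err.readagain_bytes
```

The readagain bytes are not part of the invalid sequence itself; they are input that the converter had already consumed and that should be converted again after the error is handled.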
additional transcoders.
(econv_description): extracted from rb_econv_open_exc.
(rb_econv_open_exc): use econv_description.
(econv_inspect): use econv_description.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18843 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
last_error. num_trans may be zero.
(rb_econv_convert0): num_trans may be zero.
(rb_econv_putbackable): ditto.
(rb_econv_putback): ditto.
(rb_econv_convert): input_ptr and output_ptr may be NULL.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18835 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rb_econv_option_t*.
* transcode.c (transcode_loop): take rb_econv_option_t* as an argument.
(str_transcode0): ditto.
(str_transcode): make rb_econv_option_t and call str_transcode0 with
it.
(rb_str_transcode): take rb_econv_option_t*.
* io.c (io_fwrite): follow the rb_str_transcode change.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18814 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (econv_opts): extracted from str_transcode.
(str_transcode_enc_args): extracted from str_transcode.
(str_transcode0): extracted from str_transcode.
(str_transcode): use econv_opts, str_transcode_enc_args,
str_transcode0.
(rb_str_transcode): call str_transcode0.
(econv_primitive_insert_output): give the additional argument for
rb_str_transcode.
* io.c (make_writeconv): use invalid/undef flags.
(io_fwrite): ditto.
(rb_scan_open_args): give the additional argument for
rb_str_transcode.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18808 b2dd03c8-39d4-4d8f-98ff-823fe69b080e