"locale encoding" is misleading since it doesn't mean Encoding.find("locale")
but the encoding used to interpret the script file. It's therefore better to
call it "script encoding" as in the paragraphs above.
Closes: https://github.com/ruby/ruby/pull/1611
* encoding.c (rb_enc_get_index): external encoding may not be Data
object. [ruby-core:89016] [Bug #15122]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64753 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Clarify logic and add spec.
* Now passes test-all with the JSON fix.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64178 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* This reverts commit fb253d2032.
* The CI is failing, this seems a bug in the JSON C extension.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64174 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (rb_enc_capable): make it extern to check enc_capable.
enc_index can be set to limited types such as T_STRING, T_REGEX
and so on. This function check an object is this kind of types.
* include/ruby/encoding.h: ditto.
* encoding.c (enc_set_index): check a given object is enc_capable.
* include/ruby/encoding.h (PUREFUNC):
* marshal.c (encoding_name): check `rb_enc_capable` first.
* marshal.c (r_ivar): ditto. If it is not enc_capable, it should be
malformed data.
* spec/ruby/optional/capi/encoding_spec.rb: remove tests depending
on the wrong feature: all objects can set enc_index.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63777 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rb_gc_mark_encodings has been empty for a decade
(since r17875 / 28b216ac45).
Just remove it and its only caller in gc.c
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
These strings already exist (or will exist soon) in the fstring
table, so avoid the duplicate, sooner.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62030 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c (str_undump): use rb_enc_find_index2 to find encoding
by unterminated string. check the format before encoding name.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61396 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rb_enc_ascget() erroneously reports success even if the given byte
sequence is incomplete, for non-ASCII compatible encoding strings.
rb_enc_precise_mbclen() may return a negative value on error, and thus
rb_enc_ascget() must not store the return value in 'unsigned int';
otherwise the subsequent MBCLEN_CHARFOUND_P() check won't catch the
error. [ruby-core:78646] [Bug #13034]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57078 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
termlen and resize heap for the terminator. This is split from
rb_str_fill_terminator (str_fill_term) because filling terminator
and changing terminator length are different things. [Bug #12536]
* internal.h: declaration for rb_str_change_terminator_length.
* string.c (str_fill_term): Simplify only to zero-fill the terminator.
For non-shared strings, it assumes that (capa + termlen) bytes of
heap is allocated. This partially reverts r55557.
* encoding.c (rb_enc_associate_index): rb_str_change_terminator_length
is used, and it should be called whenever the termlen is changed.
* string.c (str_capacity): New static function to return capacity
of a string with the given termlen, because the termlen may
sometimes be different from TERM_LEN(str) especially during
changing termlen or filling terminator with specific termlen.
* string.c (rb_str_capacity): Use str_capacity.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rb_isalnum, rb_isxdigit, rb_isblank, rb_isspace, rb_isblank,
rb_iscntrl, rb_isprint, rb_ispunct, rb_isgraph,
rb_tolower, rb_toupper): use inline function to avoid function call.
* include/ruby/ruby.h (rb_isascii): use inline function to clarify
the logic.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54391 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* error.c (rb_assert_failure): assertion with stack dump.
* ruby_assert.h (RUBY_ASSERT): new header for the assertion.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53615 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
and a pre-compilation/runtime loader sample.
[Feature #11788]
* iseq.c: add new methods:
* RubyVM::InstructionSequence#to_binary_format(extra_data = nil)
* RubyVM::InstructionSequence.from_binary_format(binary)
* RubyVM::InstructionSequence.from_binary_format_extra_data(binary)
* compile.c: implement body of this new feature.
* load.c (rb_load_internal0), iseq.c (rb_iseq_load_iseq):
call RubyVM::InstructionSequence.load_iseq(fname) with
loading script name if this method is defined.
We can return any ISeq object as a result value.
Otherwise loading will be continue as usual.
This interface is not matured and is not extensible.
So that we don't guarantee the future compatibility of this method.
Basically, you should'nt use this method.
* iseq.h: move ISEQ_MAJOR/MINOR_VERSION (and some definitions)
from iseq.c.
* encoding.c (rb_data_is_encoding), internal.h: added.
* vm_core.h: add several supports for lazy load.
* add USE_LAZY_LOAD macro to specify enable or disable of
this feature.
* add several fields to rb_iseq_t.
* introduce new macro rb_iseq_check().
* insns.def: some check for lazy loading feature.
* vm_insnhelper.c: ditto.
* proc.c: ditto.
* vm.c: ditto.
* test/lib/iseq_loader_checker.rb: enabled iff suitable
environment variables are provided.
* test/runner.rb: enable lib/iseq_loader_checker.rb.
* sample/iseq_loader.rb: add sample compiler and loader.
$ ruby sample/iseq_loader.rb [dir]
will compile all ruby scripts in [dir].
With default setting, this compile creates *.rb.yarb files
in same directory of target .rb scripts.
$ ruby -r sample/iseq_loader.rb [app]
will run with enable to load compiled binary data.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (enc_name, rb_enc_name_list_i, rb_enc_aliases_enc_i):
make fstring instead of making each copies.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52859 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (enc_m_loader): defer finding encoding object not to
be infected by marshal source. [ruby-core:71793] [Bug #11760]
* marshal.c (r_object0): enable compatible loader on USERDEF
class. the loader function is called with the class itself,
instead of an allocated object, and the loaded data.
* marshal.c (compat_allocator_table): intialize
compat_allocator_tbl on demand.
* object.c (rb_undefined_alloc): extract from rb_obj_alloc.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52856 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (enc_dump): use rb_check_arity to just check number
of arguments.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52854 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (ENC_ASSERT): make an expression, and prevent the
argument from further expansions.
* encoding.c (rb_enc_check_str): assert before using.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52359 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This function only accept T_STRING (and T_REGEXP).
This patch improves performance of a tiny_segmenter benchmark
(num=2) 2.54sec -> 2.42sec on my machine.
https://github.com/chezou/TinySegmenter.jl/blob/master/benchmark/benchmark.rb
* encoding.c: add ENC_DEBUG and ENC_ASSERT() macros.
* internal.h: add a decl. of rb_enc_check_str().
* string.c (rb_str_plus): use rb_enc_check_str().
* string.c (rb_str_subpat_set): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52350 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encindex.h: separate encoding index constants from internal.h.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51861 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (rb_enc_unicode_p): fix document. predicate
functions may return non-zero values other than 1 as true.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51727 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (rb_locale_encindex): find encoding index without
making a string object every time. [ruby-core:58160] [Bug #9080]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51670 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (enc_autoload): drop dummy encoding flag from
the loaded encoding index. this flag is used only in this
source.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51251 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* use rb_funcallv() for no arguments call instead of variadic
rb_funcall().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@49612 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (load_encoding): use rb_require_internal instead of
calling rb_require_safe with protection.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@48700 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
use 0 for rb_data_type_t::reserved instead of NULL, since its type
may be changed in the future and possibly not a pointer type.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@48662 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c: move `ruby_encoding_index` stuff from
include/ruby/encoding.h to hide the extra field.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46323 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (rb_enc_default_internal): fix rdoc. `__FILE__` is
in filesystem encoding but not `default_internal`.
[ruby-core:61894] [Bug #9713]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@45539 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* parse.y (rb_str_dynamic_intern): associate proper encoding with
the result symbol.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@45439 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (rb_enc_get_index): the encoding of dynamic symbol
comes from fstr.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@45438 b2dd03c8-39d4-4d8f-98ff-823fe69b080e