* hash.c (env_enc_str_new): convert to the expected encoding
without intermediate string, and set econv flags if default
internal encoding is set too.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59449 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
These caused numerous CI failures I haven't been able to
reproduce [ruby-core:82102]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59364 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
The same hash keys may be loaded from tainted data sources
frequently (e.g. parsing headers from socket or loading
YAML data from a file). If a non-tainted fstring already
exists (because the application expects the hash key),
cache and deduplicate the tainted version in the new
tainted_frozen_strings table.
For non-embedded strings, this also allows sharing with the
underlying malloc-ed data.
* vm_core.h (rb_vm_struct): add tainted_frozen_strings
* vm.c (ruby_vm_destruct): free tainted_frozen_strings
(Init_vm_objects): initialize tainted_frozen_strings
(rb_vm_tfstring_table): accessor for tainted_frozen_strings
* internal.h: declare rb_fstring_existing, rb_vm_tfstring_table
* hash.c (fstring_existing_str): remove (moved to string.c)
(hash_aset_str): use rb_fstring_existing
* string.c (rb_fstring_existing): new, based on fstring_existing_str
(tainted_fstr_update): new
(rb_fstring_existing0): new, based on fstring_existing_str
(rb_tainted_fstring_existing): new, special case for tainted strings
(rb_str_free): delete from tainted_frozen_strings table
* test/ruby/test_optimization.rb (test_hash_reuse_fstring): new test
[ruby-core:82012] [Bug #13737]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59354 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Fix up r59328. It is possible that the given block abuses
ObjectSpace.each_object to shrink the temporary array.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59334 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (hash_aset_str): create frozen string for tainted objects.
(should not use fsting table on this case).
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
In typical applications, hash entries are read after being
written to. Blindly writing to hashes which are never read
makes little sense. So, for any hash which is read from, an
fstring entry for the key should already exist for the key.
We no longer blindly create fstrings if the code is blindly
setting random hash keys, preventing the performance regression
in the reverted r43870.
Regarding <https://bugs.ruby-lang.org/issues/9188>, this has a
minimum impact on the bm_so_k_nucleotide where hash keys are set
and not reused, performance is within 1-2% of existing cases.
* hash.c: #include gc.h for rb_objspace_garbage_object_p
(hash_aset_str): do read-only check of fstring table and
reuse fstring if it exists and is still alive (not garbage)
[ruby-core:81942] [Feature #13725]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59304 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
To convert the object implicitly, it has had two parts in convert_type() which are
1. lookink up the method's id
2. calling the method
Seems that strncmp() and strcmp() in convert_type() are slightly heavy to look up
the method's id for type conversion.
This patch will add and use internal APIs (rb_convert_type_with_id, rb_check_convert_type_with_id)
to call the method without looking up the method's id when convert the object.
Array#flatten -> 19 % up
Array#+ -> 3 % up
[ruby-dev:50024] [Bug #13341] [Fix GH-1537]
### Before
Array#flatten 104.119k (± 1.1%) i/s - 525.690k in 5.049517s
Array#+ 1.993M (± 1.8%) i/s - 10.010M in 5.024258s
### After
Array#flatten 124.005k (± 1.0%) i/s - 624.240k in 5.034477s
Array#+ 2.058M (± 4.8%) i/s - 10.302M in 5.019328s
### Test Code
require 'benchmark/ips'
class Foo
def to_ary
[1,2,3]
end
end
Benchmark.ips do |x|
ary = []
100.times { |i| ary << i }
array = [ary]
x.report "Array#flatten" do |i|
i.times { array.flatten }
end
x.report "Array#+" do |i|
obj = Foo.new
i.times { array + obj }
end
end
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58978 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c: [DOC] fix return value in call-seq of Hash#transform_values;
other small fixes.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58888 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Hash#transform_values! returns the receiver rather than a new Hash
object.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58845 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Explicitly says that the methods return a new hash rather than just
stating it return a new something we don't know.
[ci skip]
[Fix GH-1619]
Author: Nicolas Cavigneaux <nico@bounga.org>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58830 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (rb_hash_merge): use rb_hash_dup() instead of rb_obj_dup() to duplicate
Hash object. rb_hash_dup() is faster duplicating function for Hash object
which got rid of Hash#initialize_dup method calling.
Hash#merge will be faster around 60%.
[ruby-dev:50026] [Bug #13343] [Fix GH-1533]
### Before
user system total real
Hash#merge 0.160000 0.020000 0.180000 ( 0.182357)
### After
user system total real
Hash#merge 0.110000 0.010000 0.120000 ( 0.114404)
### Test code
require 'benchmark'
Benchmark.bmbm do |x|
hash1 = {}
100.times { |i| hash1[i.to_s] = i }
hash2 = {}
100.times { |i| hash2[(i*2).to_s] = i*2 }
x.report "Hash#merge" do
10000.times do
hash1.merge(hash2)
end
end
end
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58811 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* st.c (rb_hash_bulk_insert): new API to bulk insert entries
into a hash. Given arguments are first inserted into the
table at once, then reindexed. This is faster than inserting
things using rb_hash_aset() one by one.
This arrangement (rb_ prefixed function placed in st.c) is
unavoidable because it both touches table internal and write
barrier at once.
* internal.h: delcare the new function.
* hash.c (rb_hash_s_create): use the new function.
* vm.c (core_hash_merge): ditto.
* insns.def (newhash): ditto.
* test/ruby/test_hash.rb: more coverage on hash creation.
* test/ruby/test_literal.rb: ditto.
-----------------------------------------------------------
benchmark results:
minimum results in each 7 measurements.
Execution time (sec)
name before after
loop_whileloop2 0.136 0.137
vm2_bighash* 1.249 0.623
Speedup ratio: compare with the result of `before' (greater is better)
name after
loop_whileloop2 0.996
vm2_bighash* 2.004
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58492 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
We need to fix GC bug before merging this. Revert revisions
58452, 58435, 58434, 58428, 58427 in this order.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58463 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (rb_hash_new_from_object): same as r58434.
Newly created frozen objects are not referred from any roots/objects.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58452 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (rb_hash_new_from_values_with_klass): before this fix,
only a st table are filled with passed values. However, newly
created frozen strings are not marked correctly only reference
from st table. This patch marks such created frozen strings
by Hash object which refers to the st table.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58434 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Same as rb_ary_tmp_new_from_values(), it reduces vm_exec_core binary
size from 26,176 bytes to 26,080 bytes. But this time, also with a
bit of optimizations:
- Because we are allocating a new hash and no back references are
introduced at all, we can safely skip write barriers.
- Also, the iteration never recurs. We can avoid complicated
function callbacks by using st_insert instead of st_update.
----
* hash.c (rb_hash_new_from_values): refactor
extract the bulk insert into a function.
* hash.c (rb_hash_new_from_object): also refactor.
* hash.c (rb_hash_s_create): use the new functions.
* insns.def (newhash): ditto.
* vm.c (core_hash_from_ary): ditto.
* iternal.h: export the new function.
-----------------------------------------------------------
benchmark results:
minimum results in each 7 measurements.
Execution time (sec)
name before after
loop_whileloop2 0.135 0.134
vm2_bighash* 1.236 0.687
Speedup ratio: compare with the result of `before' (greater is better)
name after
loop_whileloop2 1.008
vm2_bighash* 1.798
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58427 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Patch by Eric Wong [ruby-core:78797].
I don't like the idea of making insns.def any bigger to support
a corner case, and "test_hash_aref_fstring_identity" shows
how contrived this is.
[ruby-core:78783] [Bug #12855]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57278 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (prime1, prime2): split long long literals for platforms
where LL suffix is not supported, e.g., VC6.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57174 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (any_hash): should return `long', because ruby assumes
the hash value of the object id of an object is `long'.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57012 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* st.h (struct st_hash_type): Remove strong_hash.
(struct st_table): Remove inside_rebuild_p and curr_hash.
* st.c (do_hash): Use type->hash instead of curr_hash.
(make_tab_empty): Remove setting up curr_hash.
(st_init_table_with_size): Remove setting up inside_rebuild_p.
(rebuild_table): Remove clearing inside_rebuild_p.
(reset_entry_hashes, HIT_THRESHOULD_FOR_STRONG_HASH): Remove code
recognizing a denial attack and switching to strong hash.
* hash.c (rb_dbl_long_hash, rb_objid_hash, rb_ident_hash): Use
rb_hash_start to randomize the hash.
(str_seed): Remove.
(any_hash): Remove strong_p and use always rb_str_hash for
strings.
(any_hash_weak, rb_any_hash_weak): Remove.
(st_hash_type objhash): Remove rb_any_hash_weak.
based on the patch by Vladimir N Makarov <vmakarov@redhat.com> at
[ruby-core:78490]. [Bug #13002]
* test/ruby/test_hash.rb (test_wrapper): objects other than special
constants should be able to be wrapped.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56992 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Always add a space between a comma and the next element. These spaces
were there sometimes, but not always. This keeps to code consistent.
Patch by: Herwin Weststrate <herwin@snt.utwente.nl>
[ruby-core:78297] [Misc #12977] [GH-1492]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56971 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
[Feature #12142]
See header of st.c for improvment details.
You can see all of code history here:
<https://github.com/vnmakarov/ruby/tree/hash_tables_with_open_addressing>
This improvement is discussed at
<https://bugs.ruby-lang.org/issues/12142>
with many people, especially with Yura Sokolov.
* st.c: improve st_table.
* include/ruby/st.h: ditto.
* internal.h, numeric.c, hash.c (rb_dbl_long_hash): extract a function.
* ext/-test-/st/foreach/foreach.c: catch up this change.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56650 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (rb_hash_compact_bang): should return nil if no elements
is deleted. [ruby-core:77709] [Bug #12863]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56473 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (rb_hash_compact, rb_hash_compact_bang): Removes nil
values from the original hash, to port Active Support behavior.
[Feature #11818]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56414 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
a hash value of Object might be Bignum, but it causes many troubles
expecially the Object is used as a key of a hash. so I've gave up
to do so.
* array.c (rb_ary_hash): use above macro.
* bignum.c (rb_big_hash): ditto.
* hash.c (rb_obj_hash, rb_hash_hash): ditto.
* numeric.c (rb_dbl_hash): ditto.
* proc.c (proc_hash): ditto.
* re.c (rb_reg_hash, match_hash): ditto.
* string.c (rb_str_hash_m): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56340 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (env_enc_str_new): make string for an environment
variable name or value.
* hash.c (env_name_new): make environment value string with the
encoding for its name.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55819 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (w32_getenv): call rb_w32_getenv and rb_w32_ugetenv via
this pointer without further comparisons.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55813 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (env_assoc): the encoding of the value should be the
locale, as well as other methods, [], fetch, values, etc.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55811 b2dd03c8-39d4-4d8f-98ff-823fe69b080e