Previous implementation had an issues:
- macros murmur1 assumes murmur_step takes rotation value
as a second argument
- but murmur_step second argument is "next block"
- this makes st_hash_uint and st_hash_end to not mix high bits of
hash value into lower bits
- this leads to pure hash behavior on doubles and mixing hashes using
st_hash_uint.
It didn't matter when bins amount were prime numbers, but it hurts
when bins are powers of two.
Mistake were created cause of attempt to co-exist Murmur1 and Murmur2
in a same code.
Change it to single hash-function implementation.
- block function is in a spirit of Murmur functions,
but handles inter-block dependency a bit better (imho).
- final block is read in bit more optimal way on CPU with unaligned word access,
- final block is mixed in simple way,
- finalizer is taken from MurmurHash3 (it makes most of magic :) )
(64bit finalizer is taken from
http://zimbry.blogspot.ru/2011/09/better-bit-mixing-improving-on.html)
Also remove ST_USE_FNV1: it lacks implementation of many functions,
and looks to be abandoned
Author: Sokolov Yura aka funny_falcon <funny.falcon@gmail.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57134 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* st.h (struct st_hash_type): Remove strong_hash.
(struct st_table): Remove inside_rebuild_p and curr_hash.
* st.c (do_hash): Use type->hash instead of curr_hash.
(make_tab_empty): Remove setting up curr_hash.
(st_init_table_with_size): Remove setting up inside_rebuild_p.
(rebuild_table): Remove clearing inside_rebuild_p.
(reset_entry_hashes, HIT_THRESHOULD_FOR_STRONG_HASH): Remove code
recognizing a denial attack and switching to strong hash.
* hash.c (rb_dbl_long_hash, rb_objid_hash, rb_ident_hash): Use
rb_hash_start to randomize the hash.
(str_seed): Remove.
(any_hash): Remove strong_p and use always rb_str_hash for
strings.
(any_hash_weak, rb_any_hash_weak): Remove.
(st_hash_type objhash): Remove rb_any_hash_weak.
based on the patch by Vladimir N Makarov <vmakarov@redhat.com> at
[ruby-core:78490]. [Bug #13002]
* test/ruby/test_hash.rb (test_wrapper): objects other than special
constants should be able to be wrapped.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56992 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
From: Vladimir Makarov <vmakarov@redhat.com>
By Vladimir's estimation, this requires at least 64 GB of memory
to reproduce this bug due to the hash sizes required. So there
is no new test case (and I am unable to test it, myself).
* st.c (get_bins_num): avoid out-of-bounds on shift by using correct type
[ruby-core:78139] [Bug #12939]
* st.c (get_allocated_entries): ditto
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56793 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
[Feature #12142]
See header of st.c for improvment details.
You can see all of code history here:
<https://github.com/vnmakarov/ruby/tree/hash_tables_with_open_addressing>
This improvement is discussed at
<https://bugs.ruby-lang.org/issues/12142>
with many people, especially with Yura Sokolov.
* st.c: improve st_table.
* include/ruby/st.h: ditto.
* internal.h, numeric.c, hash.c (rb_dbl_long_hash): extract a function.
* ext/-test-/st/foreach/foreach.c: catch up this change.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56650 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
and define CONSTFUNC and PUREFUNC if available.
Note that I don't add those options as default because
it still shows many false-positive (it seems not to consider
longjmp).
* vm_eval.c (stack_check): get rb_thread_t* as an argument
to avoid duplicate call of GET_THREAD().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54952 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* st.c (EQUAL, st_delete_safe): fix arguments order to compare
function, searching key is the first and stored key is the
second always.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51364 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
variable used). I think that these are wrong, but should shut them
up.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51105 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This improves the bm_vm2_bighash benchmark significantly by
removing branches during insert, but slows down anything
requiring iteration with the more complex loop termination
checking.
Speedup ratio of 1.10 - 1.20 is typical for the vm2_bighash
benchmark.
v3 - st_head calculates list_head address in two steps
to avoid a bug in old gcc 4.4 (Debian 4.4.7-2)
bug which incorrectly warned with:
warning: dereferencing pointer ‘({anonymous})’ does break
strict-aliasing rules
* include/ruby/st.h (struct st_table): hide struct list_head
* st.c (struct st_table_entry): adjust struct
(head, tail): remove shortcut macros
(st_head): new wrapper function
(st_init_table_with_size): adjust to new struct and API
(st_clear): ditto
(add_direct): ditto
(unpack_entries): ditto
(rehash): ditto
(st_copy): ditto
(remove_entry): ditto
(st_shift): ditto
(st_foreach_check): ditto
(st_foreach): ditto
(get_keys): ditto
(get_values): ditto
(st_values_check): ditto
(st_reverse_foreach_check): ditto (unused)
(st_reverse_foreach): ditto (unused)
[ruby-core:69726] [Misc #10278]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51064 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This reverts commit r51044
Still getting failure notices from ko1's CI machine.
ref: g3qkqn
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51045 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
I suspect the build failures with r51034 ("st.c: use ccan linked-list")
on ko1's CI machine was due to me forgetting to update common.mk :x
Lets see what happens when I only include ccan/list/list.h
Will followup with appropriate reverts or reinstating the
rest of r51034 along with a common.mk update as I watch CI
build.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51041 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Reduces object size slightly on x86-64:
text data bss dec hex filename
2782359 22400 71880 2876639 2be4df ruby.orig
2781831 22400 71880 2876111 2be2cf ruby.pow2
And on 32-bit x86:
text data bss dec hex filename
2814751 12100 30552 2857403 2b99bb ruby.orig
2814051 12100 30552 2856703 2b96ff ruby.pow2
This is not a performance-critical function, but the
smaller icache footprint seems worth it.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@47767 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* st.c (st_update): old_key is uninitialized by jump to the label
unpacked.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46725 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* st.c (st_update): remove equality checks, callers should ensure
the equality, otherwise the behavior is undefined.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46723 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* st.c (st_update): re-calculate hash_val before adding if key was
changed, otherwise cannot access the newly added element if it
has different hash value.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46722 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* st.c (st_update): fix a bug that the key was not updated even if
it was changed by the callback function.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46720 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* configure.in (with-jemalloc): also check for header, for ABIs
which JEMALLOC_MANGLE is needed, i.e., Mach-O and PE-COFF
platforms. [ruby-core:62939] [Feature #9113]
* include/ruby/missing.h: include alternative malloc header to
replace memory management functions.
* dln.c, io.c, parse.y, st.c: undef malloc family before
re-definition to suppress warnings.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46354 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* st.c (hash_pos): use bitwise AND to avoid slow modulo op
(new_size): power-of-two sizes for hash_pos change
(st_numhash): adjust for common keys due to lack of prime modulo
[Feature #9425]
* hash.c (rb_any_hash): right shift for symbols
* benchmark/bm_hash_aref_miss.rb: added to show improvement
* benchmark/bm_hash_aref_sym_long.rb: ditto
* benchmark/bm_hash_aref_str.rb: ditto
* benchmark/bm_hash_aref_sym.rb: ditto
* benchmark/bm_hash_ident_num.rb: added to prevent regression
* benchmark/bm_hash_ident_obj.rb: ditto
* benchmark/bm_hash_ident_str.rb: ditto
* benchmark/bm_hash_ident_sym.rb: ditto
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@45384 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* include/ruby/st.h: add prototypes for above.
* hash.c (rb_hash_values): use st_values_check() for performance
improvement if VALUE and st_data_t are compatible.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43895 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (rb_hash_keys): use st_keys() for performance improvement
if st_data_t and VALUE are compatible.
* st.h: define macro ST_DATA_COMPATIBLE_P() to predicate whether
st_data_t and passed type are compatible.
* configure.in: check existence of builtin function to use in
ST_DATA_COMPATIBLE_P().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43885 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c (foreach_safe_i, hash_foreach_iter): deal with error detected
by ST_CHECK.
* st.c (st_foreach_check): call with non-error argument in normal case.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43674 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* st.c: revert st_keys() at r43238. VALUE cannot be in st.c.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43243 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* include/ruby/ruby.h: use strcasecmp() as st_strcasecmp() if it
exists.
* st.c (st_strcasecmp): define the function only if strcasecmp()
doesn't exist.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42008 b2dd03c8-39d4-4d8f-98ff-823fe69b080e