The core loop of Iterator::Next() requires multiple branches: one to
check for entry liveness and one to check for wraparound. We can
rewrite it to use masking instead, iterate only over the hashes, and
reconstruct the entry pointer once we know we've reached a live entry.
This change cuts the time taken by the iteration portion of the
collections benchmark in half.
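In rough code, the new step looks something like the following (a
minimal sketch with illustrative member names; the real iterator also
has to track how many live entries remain):

#include <cstdint>

using PLDHashNumber = uint32_t;

struct IterSketch {
  PLDHashNumber* mHashes;  // the packed hash array
  char* mEntryStart;       // the packed entry storage
  uint32_t mEntrySize;
  uint32_t mCapacityMask;  // capacity - 1; capacity is a power of two
  uint32_t mIndex;
  void* mEntry;

  static bool EntryIsLive(PLDHashNumber aHash) {
    return aHash > 1;  // 0 = free, 1 = removed (illustrative encoding)
  }

  void Next() {
    do {
      // Masking handles wraparound without a second branch.
      mIndex = (mIndex + 1) & mCapacityMask;
    } while (!EntryIsLive(mHashes[mIndex]));
    // Reconstruct the entry pointer only for the live entry we found.
    mEntry = mEntryStart + mIndex * mEntrySize;
  }
};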
As discussed in the previous commit message, PLDHashTable's storage
wastes space for common entry types. This commit reorganizes the
storage to store all the hashes packed together, followed by all the
entries, which eliminates said waste.
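A minimal sketch of the address arithmetic under the new layout
(illustrative helper names; aStore points at the table's storage):

#include <cstdint>

using PLDHashNumber = uint32_t;
struct PLDHashEntryHdr { PLDHashNumber mKeyHash; };

// All of the hashes come first, then all of the entries, so there is
// no per-entry padding between a hash and its entry.
PLDHashNumber* HashAt(char* aStore, uint32_t aIndex) {
  return reinterpret_cast<PLDHashNumber*>(aStore) + aIndex;
}

PLDHashEntryHdr* EntryAt(char* aStore, uint32_t aCapacity,
                         uint32_t aEntrySize, uint32_t aIndex) {
  char* entries = aStore + aCapacity * sizeof(PLDHashNumber);
  return reinterpret_cast<PLDHashEntryHdr*>(entries +
                                            aIndex * aEntrySize);
}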
PLDHashTable requires that all items stored therein inherit from
PLDHashEntryHdr:
struct PLDHashEntryHdr {
// PLDHashNumber is a uint32_t.
PLDHashNumber mKeyHash; // Cached hash key for object.
};
class MyType : public PLDHashEntryHdr {
// Data members, etc.
};
PLDHashEntryHdr::mKeyHash is used to cache the computed hash value of
the entry, so we aren't rehashing entries on every lookup/add/etc.
Because of structure layout requirements on 64-bit platforms, the data
members of MyType will typically start at offset 8:
MyType, offset 0: mKeyHash
MyType, offset 4: padding required by alignment
MyType, offset 8: first data member of MyType
MyType, offset N: ...
The padding at offset 4 is dead, unused space.
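The waste is easy to demonstrate with a hypothetical entry type
holding a single pointer (the asserts assume a typical 64-bit ABI):

#include <cstdint>

using PLDHashNumber = uint32_t;

struct PLDHashEntryHdr {
  PLDHashNumber mKeyHash;
};

struct MyType : public PLDHashEntryHdr {
  void* mData;  // first data member
};

// 4 bytes of mKeyHash, 4 bytes of padding so that mData can be
// 8-byte aligned, then 8 bytes of mData.
static_assert(sizeof(PLDHashEntryHdr) == 4, "header is just the hash");
static_assert(alignof(MyType) == 8, "pointer member forces alignment");
static_assert(sizeof(MyType) == 16, "4 of the 16 bytes are padding");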
We'd like to change this state of affairs by having PLDHashTable manage
the cached hash key itself, which would eliminate the dead space in the
object and would enable packing the table storage more tightly. But
PLDHashTable pervasively assumes that its internal storage is an array
of PLDHashEntryHdrs (with an associated object size to account for
subclassing).
As a first step to laying out the hash table storage differently, we
have to make the internals of PLDHashTable not access PLDHashEntryHdr
items directly, but layer an abstraction on top. We call this
abstraction "slots": a slot represents storage for the cached hash key
and the associated entry, and can produce a PLDHashEntryHdr* on demand.
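A minimal sketch of the abstraction (the real class differs in
detail):

#include <cstdint>

using PLDHashNumber = uint32_t;
struct PLDHashEntryHdr { PLDHashNumber mKeyHash; };

// A slot is a view over one entry's storage, wherever the cached hash
// and the entry object actually live.
class Slot {
public:
  Slot(PLDHashNumber* aKeyHash, PLDHashEntryHdr* aEntry)
      : mKeyHash(aKeyHash), mEntry(aEntry) {}

  PLDHashNumber KeyHash() const { return *mKeyHash; }
  void SetKeyHash(PLDHashNumber aHash) { *mKeyHash = aHash; }

  // The internals traffic in Slots; only this accessor hands out a
  // raw PLDHashEntryHdr*.
  PLDHashEntryHdr* ToEntry() const { return mEntry; }

private:
  PLDHashNumber* mKeyHash;  // today &mEntry->mKeyHash; later, a slot
                            // in the packed hash array
  PLDHashEntryHdr* mEntry;
};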
The only place this is used where the const-ness matters is
AddressEntry, and that use const_casts away the const-ness. So let's
just ditch the method that attempts to return const pointers.
This change is satisfying insofar as it removes a load from every
iteration step over the hashtable, but it doesn't seem to affect
our collection benchmarks in any way. The change does not make
iterators any larger, as there is padding at the end of the iterator
structure on both 32- and 64-bit platforms.
We only use its value in one place, and said value is easily computable
from readily available information. This change makes iterators
slightly smaller.
Currently we have three ways of representing hash values.
- uint32_t: used in HashFunctions.h.
- PLDHashNumber: defined in PLDHashTable.{h,cpp}.
- js::HashNumber: defined in js/public/Utility.h.
Code that creates hash values with the functions from HashFunctions.h
uses a mix of these three types. It's a bit of a mess.
This patch introduces mozilla::HashNumber, and redefines PLDHashNumber and
js::HashNumber as synonyms. It also changes HashFunctions.h to use
mozilla::HashNumber throughout instead of uint32_t.
This leaves plenty of places that still use uint32_t that should use
mozilla::HashNumber or one of its synonyms, but I didn't want to tackle that
now.
The patch also:
- Does similar things for the constants defining the number of bits in each
hash number type.
- Moves js::HashNumber from Utility.h to HashTable.h, which is a better spot
for it. (This required changing the signature of ScrambleHashCode(); that's
ok, it'll get moved by the next patch anyway.)
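After this patch the three names all denote the same type, roughly as
follows (a sketch; the constant name is illustrative):

#include <cstdint>

// mozilla/HashFunctions.h (sketch)
namespace mozilla {
using HashNumber = uint32_t;
static const uint32_t kHashNumberBits = 32;
}

// PLDHashTable.h (sketch)
using PLDHashNumber = mozilla::HashNumber;

// js/public/HashTable.h (sketch)
namespace js {
using HashNumber = mozilla::HashNumber;
}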
MozReview-Commit-ID: EdoWlCm7OUC
This speeds up BenchCollections.PLDHash as follows:
Before:
> succ_lookups fail_lookups insert_remove iterate
> 42.8 ms 51.6 ms 21.0 ms 34.9 ms
> 41.8 ms 51.9 ms 20.0 ms 34.6 ms
> 41.6 ms 51.3 ms 19.3 ms 34.5 ms
> 41.6 ms 51.3 ms 19.8 ms 35.0 ms
> 41.7 ms 50.8 ms 20.5 ms 35.2 ms
After:
> succ_lookups fail_lookups insert_remove iterate
> 37.7 ms 33.1 ms 19.7 ms 35.0 ms
> 37.0 ms 32.5 ms 19.1 ms 35.4 ms
> 37.6 ms 33.6 ms 19.2 ms 36.2 ms
> 36.7 ms 33.3 ms 19.1 ms 35.3 ms
> 37.1 ms 33.1 ms 19.1 ms 35.0 ms
Successful lookups are about 1.13x faster, failing lookups are about 1.54x
faster, and insertions/removals are about 1.05x faster.
On Linux64, this increases the size of libxul (as measured by `size`) by a mere
16 bytes.
The current code for resizing PLDHashTable modifies the cached hash for
all entries in the old hash table. This is unnecessary, because we're
going to throw away the old hash table shortly, and inefficient, because
writing to memory we're never going to use again just wastes time and
memory bandwidth. Instead, let's avoid the write by pulling out the
cached key and doing the necessary manipulation on local variables,
which is probably slightly faster.
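Something like the following sketch (the helper names are
illustrative, not the real code):

#include <cstdint>

using PLDHashNumber = uint32_t;
static const PLDHashNumber kCollisionFlag = 1;

bool EntryIsLive(PLDHashNumber aHash);
void InsertIntoNewStore(char* aNewStore, PLDHashNumber aHash,
                        void* aOldEntry);
void* OldEntryAt(uint32_t aIndex);

void Rehash(char* aNewStore, PLDHashNumber* aOldHashes,
            uint32_t aOldCapacity) {
  for (uint32_t i = 0; i < aOldCapacity; ++i) {
    PLDHashNumber hash = aOldHashes[i];
    if (EntryIsLive(hash)) {
      // Clear the collision flag in a local; the old store is about
      // to be freed, so storing the result back is wasted work.
      hash &= ~kCollisionFlag;
      InsertIntoNewStore(aNewStore, hash, OldEntryAt(i));
    }
  }
}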
PLDHashTable::Search() does not modify any members. So, this method and
methods called by it should be marked as const.
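That is, roughly (a sketch of the changed declaration):

struct PLDHashEntryHdr;

class PLDHashTable {
public:
  // Search never mutates the table, so it and the helper methods it
  // calls can all be marked const.
  PLDHashEntryHdr* Search(const void* aKey) const;
};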
MozReview-Commit-ID: 6g4jrYK1j9E
This was done automatically, by replacing:
s/mozilla::Move/std::move/
s/ Move(/ std::move(/
s/(Move(/(std::move(/
and removing the 'using mozilla::Move;' lines, followed by a few
manual fixups; see the bug for the split series.
MozReview-Commit-ID: Jxze3adipUh
This patch reduces sizeof(PLDHashTable) as follows.
- 64-bit: from 40 bytes to 32
- 32-bit: from 28 bytes to 20
It does this by doing the following.
- It moves mGeneration from EntryStore to PLDHashTable, to avoid unnecessary
padding on 64-bit. This requires tweaking EntryStore::Set() as explained in a
comment.
- It also shrinks mGeneration from uint32_t to uint16_t, saving 2 bytes of
data.
- It shrinks mEntrySize from uint32_t to uint8_t, to cut 3 bytes of data.
- It shrinks mHashShift from int16_t to uint8_t, trimming another byte of data,
and moves it, saving another 2 bytes of padding.
And it reorders the fields so the word-sized ones are at the start, which makes
it easier to imagine the memory layout.
The patch also adds a test, and fixes some misordered function arguments in
existing tests.
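The resulting layout is roughly as follows (a sketch; the field order
and the mix of members are approximate):

#include <cstdint>

struct PLDHashTableOps;

// 64-bit: 8 + 8 + 4 + 4 + 2 + 1 + 1 = 28, padded to 32 bytes.
// 32-bit: 4 + 4 + 4 + 4 + 2 + 1 + 1 = 20 bytes.
class PLDHashTable {
  const PLDHashTableOps* mOps;  // word-sized members first
  char* mEntryStore;
  uint32_t mEntryCount;
  uint32_t mRemovedCount;
  uint16_t mGeneration;  // was uint32_t, and lived in EntryStore
  uint8_t mEntrySize;    // was uint32_t
  uint8_t mHashShift;    // was int16_t
};

static_assert(sizeof(PLDHashTable) == (sizeof(void*) == 8 ? 32 : 20),
              "matches the sizes quoted above");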
The current code replaces one PLDHashTable's mGeneration with another. If the
two generation values happen to be equal, it will look as though the storage
hasn't changed when really it has.
The fix is to use EntryStore::Set() to update mEntryStore, which increments
mGeneration appropriately.
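A minimal sketch of why routing the update through Set() matters
(assuming the post-move signature that takes the generation by
pointer):

#include <cstdint>

class EntryStore {
public:
  char* Get() const { return mStore; }
  void Set(char* aStore, uint16_t* aGeneration) {
    mStore = aStore;
    // Always bump the generation: even if the old and new values
    // happen to compare equal, the storage has logically changed.
    ++*aGeneration;
  }
private:
  char* mStore = nullptr;
};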
The simplistic shift-based hashing function creates a lot of
collisions for pointers to arrays, as it doesn't do a great job of
distributing the data randomly based on the input bytes.
PLDHashTable takes the result of the hash function and multiplies it by
kGoldenRatio to ensure that it has a good distribution of bits across
the 32-bit hash value, and then zeroes out the low bit so that it can be
used for the collision flag. This result is called hash0. From hash0
it computes two different numbers used to find entries in the table
storage: hash1 is used to find an initial position in the table to
begin searching for an entry; hash2 is then used to repeatedly offset
that position (mod the size of the table) to build a chain of positions
to search.
In a table with capacity 2^c entries, hash1 is simply the upper c bits
of hash0. This patch does not change this.
Prior to this patch, hash2 was the c bits below hash1, padded at the low
end with zeroes when c > 16. (Note that bug 927705, changeset
1a02bec165e16f370cace3da21bb2b377a0a7242, increased the maximum capacity
from 2^23 to 2^26 since 2^23 was sometimes insufficient!) This manner
of computing hash2 is problematic because it increases the risk of long
chains for very large tables, since there is less variation in the hash2
result due to the zero padding.
So this patch changes the hash2 computation by using the low bits of
hash0 instead of shifting it around, thus avoiding 0 bits in parts of
the hash2 value that are significant.
Note that this changes what hash2 is in all cases except when the table
capacity is exactly 2^16, so it does change our hashing characteristics.
For tables with capacity less than 2^16, it should be using a different
second hash, but with the same amount of random-ish data. For tables
with capacity greater than 2^16, it should be using more random-ish
data.
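In code, the change is roughly the following (a sketch, where c is the
capacity exponent and the helper names are illustrative):

#include <cstdint>

static const uint32_t kGoldenRatio = 0x9E3779B9u;
static const uint32_t kCollisionFlag = 1u;

// hash0: the scrambled hash with the low (collision-flag) bit cleared.
uint32_t Hash0(uint32_t aRawHash) {
  return (aRawHash * kGoldenRatio) & ~kCollisionFlag;
}

// hash1: the upper c bits of hash0; unchanged by this patch.
uint32_t Hash1(uint32_t aHash0, uint32_t c) {
  return aHash0 >> (32 - c);
}

// Before: the c bits just below hash1. Once c > 16 the two c-bit
// fields overrun the 32-bit value, so the low end gets zero padding.
uint32_t Hash2Before(uint32_t aHash0, uint32_t c) {
  uint32_t mask = (1u << c) - 1;
  return c <= 16 ? (aHash0 >> (32 - 2 * c)) & mask
                 : (aHash0 << (2 * c - 32)) & mask;
}

// After: simply the low c bits of hash0; every bit is meaningful.
uint32_t Hash2After(uint32_t aHash0, uint32_t c) {
  return aHash0 & ((1u << c) - 1);
}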
Note that this patch depends on the patch for bug 1353458 in order to
avoid causing test failures.
MozReview-Commit-ID: JvnxAMBY711
PLDHashTable's entry store has two types of unoccupied entries: free
entries and removed entries. A search along a chain of entries
(determined by the hash value) can stop at a free entry, but it must
continue across removed entries, because a removed entry may have been
skipped over when the value we're searching for was added to the
table, and has since been removed. For live entries, we also maintain
this distinction by using one bit of storage for a collision flag,
which notes that if the hashtable entry is removed, its place in the
entry store must become a removed entry rather than a free entry.
When we add a new entry to the table, Add's semantics require that we
return an existing entry if there is one, and only create a new entry
if none exists. (Bug 1352198 suggests the possibility of a faster
alternative Add API where the caller guarantees that the key is not
already in the hashtable.) When we search for the existing entry, we
must therefore continue the search across removed entries, though we
record the first removed entry found so we can return it if the search
for an existing entry fails.
The existing code adds the collision flag through the entire table
search during an Add. This patch changes that behavior so that we only
add the collision flag prior to finding the first removed entry. Adding
it after we find the first removed entry is unnecessary, since we are
not making that entry part of a path to a new entry. If it is part of a
path to an existing entry, it will already have the collision flag set.
This patch effectively puts an if (!firstRemoved) around the else branch
of the if (MOZ_UNLIKELY(EntryIsRemoved(entry))), and then refactors that
condition outwards since it is now around the contents of both the if
and else branches.
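Concretely, inside the probe loop the restructuring looks like this (a
fragment; the names follow the description above, and the flag-setting
line is a sketch):

// Before: the collision flag is set on every live entry along the
// chain, even past the first removed entry.
if (MOZ_UNLIKELY(EntryIsRemoved(entry))) {
  if (!firstRemoved) {
    firstRemoved = entry;
  }
} else {
  entry->mKeyHash |= kCollisionFlag;
}

// After: once firstRemoved is recorded, neither branch has any work
// left to do, so the check hoists outward.
if (!firstRemoved) {
  if (MOZ_UNLIKELY(EntryIsRemoved(entry))) {
    firstRemoved = entry;
  } else {
    entry->mKeyHash |= kCollisionFlag;
  }
}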
MozReview-Commit-ID: CsXnMYttHVy