* Remote dealloc refactor.
* Improve remote dealloc
Change the remote counter to count down to 0, so the fast path does not
need a constant. Use a signed value so that the branch does not depend on
the addition.
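A minimal sketch of one reading of the count-down change, not the actual implementation. `RemoteBudget`, `BATCH`, `remaining` and `flush` are hypothetical names. The branch tests the sign of the stored value, so it does not wait on the arithmetic, and the counter is allowed to go negative.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; this is not the allocator's real interface.
struct RemoteBudget
{
  static constexpr int64_t BATCH = 256; // assumed batch size in bytes

  int64_t remaining = BATCH; // counts down towards (and possibly past) zero
  size_t flushes = 0;        // how often the slow path ran

  // Returns true if the fast path was taken.
  bool dealloc(int64_t size)
  {
    // The branch only inspects the stored sign; the subtraction happens
    // afterwards, and the signed counter may overshoot below zero.
    if (remaining > 0)
    {
      remaining -= size;
      return true;
    }
    flush();
    remaining -= size; // account for this deallocation in the new batch
    return false;
  }

  // Slow path: stand-in for posting the batch, then reset the budget.
  void flush()
  {
    ++flushes;
    remaining = BATCH;
  }
};
```

The reset constant only appears on the slow path; the fast path compares against zero.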
* Inline remote_dealloc
The fast path of remote_dealloc is sufficiently compact that it can be
inlined.
* Improve fast path in Slab::alloc
Turn the internal structure into tail calls, to improve fast path.
Should be no algorithmic changes.
* Refactor initialisation to help fast path.
Break lazy initialisation into two functions, so it is easier to codegen
fast paths.
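As an illustration only, the split of lazy initialisation might look like the sketch below. `ThreadAlloc`, `alloc` and `alloc_slow` are hypothetical names, and the "allocation" is just a counter, so that the shape of the two functions is visible.

```cpp
#include <cstddef>

// Hypothetical per-thread state; names are illustrative only.
struct ThreadAlloc
{
  bool needs_init = true;
  size_t init_count = 0;
  size_t allocs = 0;
};

// Cold half: declared first so the hot half can tail call it.
// Real code would typically keep this out of line.
size_t alloc_slow(ThreadAlloc& t);

// Hot half: a single test, then the common-case work.
inline size_t alloc(ThreadAlloc& t)
{
  if (!t.needs_init)
    return ++t.allocs;
  return alloc_slow(t); // tail call into the cold half
}

size_t alloc_slow(ThreadAlloc& t)
{
  // One-off set-up, then retry through the hot half.
  t.needs_init = false;
  ++t.init_count;
  return alloc(t);
}
```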
* Minor tidy to statically sized dealloc.
* Refactor semi-slow path for alloc
Make the backup path a bit faster. Only algorithmic change is to delay
checking for first allocation. Otherwise, should be unchanged.
* Test initial operation of a thread
The first operation a new thread takes is special. It results in
allocating an allocator and swinging it into the TLS. This makes
it a very special path that is rarely tested. This test generates
a lot of threads to cover the first alloc and dealloc operations.
* Correctly handle reusing get_noncachable
* Fix large alloc stats
Large alloc stats aren't necessarily balanced on a thread. This changes
to tracking individual pushes and pops, rather than the net effect
(with an unsigned value).
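A small sketch of why separate counters help, assuming hypothetical names (`LargeStats`, `on_push`, `on_pop`, `merge`). A thread that only pops would drive an unsigned net count below zero and wrap, whereas two monotonic counters can be summed across threads safely.

```cpp
#include <cstdint>

// Hypothetical per-thread counters; not the allocator's real stats type.
struct LargeStats
{
  uint64_t pushes = 0;
  uint64_t pops = 0;

  void on_push() { ++pushes; }
  void on_pop() { ++pops; }
};

// Combine per-thread counters; each total only ever grows.
inline LargeStats merge(const LargeStats& a, const LargeStats& b)
{
  LargeStats r;
  r.pushes = a.pushes + b.pushes;
  r.pops = a.pops + b.pops;
  return r;
}
```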
* Fix TLS init on large alloc path
* Add Bump ptrs to allocator
Each allocator has a bump ptr for each size class. This is no longer
slab local.
Slabs that haven't been fully allocated no longer need to be in the DLL
for this sizeclass.
* Change to a cyclic non-empty list
This change reduces the branching in the case of finding a new free
list. Using a non-empty cyclic list enables a branch-free add, and a
single branch in remove to detect the empty case.
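A minimal sketch of the idea, assuming a sentinel-style doubly linked list. `Node`, `insert_after` and `pop_front` are hypothetical names and the real list may differ in detail. Because the head node always exists and links to itself, insertion never has to test for null neighbours.

```cpp
// Hypothetical cyclic list node; the head is itself a Node, so the
// cycle is never empty.
struct Node
{
  Node* next;
  Node* prev;

  Node() : next(this), prev(this) {} // a lone node is a cycle of one
  Node(const Node&) = delete;        // copying would break the self links
  Node& operator=(const Node&) = delete;

  bool is_empty() const { return next == this; }

  // Branch-free add: both neighbours always exist.
  void insert_after(Node* n)
  {
    n->next = next;
    n->prev = this;
    next->prev = n;
    next = n;
  }

  // One branch: detect that only the head remains.
  Node* pop_front()
  {
    Node* n = next;
    if (n == this)
      return nullptr;
    next = n->next;
    n->next->prev = this;
    n->next = n; // leave the removed node as its own cycle
    n->prev = n;
    return n;
  }
};
```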
* Update differences
* Rename first allocation
Use "needs initialisation", as it makes more sense for other scenarios.
* Use a ptrdiff to help with zero init.
* Make GlobalPlaceholder zero init
The GlobalPlaceholder allocator is now a zero init block of memory.
This removes various issues around when things are initialised. It is made
read-only so that we detect writes to it on some platforms.