Convert code that divides and rounds up to use DIV_ROUND_UP to make the
intent clearer and reduce the number of magic constants.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we're built around xmalloc and friends, we can use
helpers like REALLOC_ARRAY, ALLOC_GROW, and so on to make
the code shorter and protect against integer overflow.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code was originally written with the idea that it could
be spun off into its own ewah library, and uses the
overrideable ewah_malloc to do allocations.
We plug in xmalloc as our ewah_malloc, of course. But over
the years the ewah code itself has become more entangled
with git, and the return value of many ewah_malloc sites is
not checked.
Let's just drop the level of indirection and use xmalloc and
friends directly. This saves a few lines, and will let us
adapt these sites to our more advanced malloc helpers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ewah/ewok.h header pollutes the global namespace with
"BITS_IN_WORD", without any specific notion that we are
talking about the bits in an eword_t. We can give this the
more specific name "BITS_IN_EWORD".
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When buffer_grow changes the size of the buffer using realloc,
it first computes and saves the rlw pointer's offset into the
buffer using (uint8_t *) math before the realloc but then
restores it using (eword_t *) math.
In order to do this it's necessary to convert the (uint8_t *)
offset into an (eword_t *) offset. It was doing this by
dividing by the sizeof(size_t). Unfortunately sizeof(size_t)
is not same as sizeof(eword_t) on all platforms.
This causes illegal memory accesses and other bad things to
happen when attempting to use bitmaps on those platforms.
Fix this by dividing by the sizeof(eword_t) instead which
will always be correct for all platforms.
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
EWAH is a word-aligned compressed variant of a bitset (i.e. a data
structure that acts as a 0-indexed boolean array for many entries).
It uses a 64-bit run-length encoding (RLE) compression scheme,
trading some compression for better processing speed.
The goal of this word-aligned implementation is not to achieve
the best compression, but rather to improve query processing time.
As it stands right now, this EWAH implementation will always be more
efficient storage-wise than its uncompressed alternative.
EWAH arrays will be used as the on-disk format to store reachability
bitmaps for all objects in a repository while keeping reasonable sizes,
in the same way that JGit does.
This EWAH implementation is a mostly straightforward port of the
original `javaewah` library that JGit currently uses. The library is
self-contained and has been embedded whole (4 files) inside the `ewah`
folder to ease redistribution.
The library is re-licensed under the GPLv2 with the permission of Daniel
Lemire, the original author. The source code for the C version can
be found on GitHub:
https://github.com/vmg/libewok
The original Java implementation can also be found on GitHub:
https://github.com/lemire/javaewah
[jc: stripped debug-only code per Peff's $gmane/239768]
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>