Merge branch 'jk/pack-bitmap'

Borrow the bitmap index into packfiles from JGit to speed up enumeration of objects involved in a commit range without having to fully traverse the history. * jk/pack-bitmap: (26 commits) ewah: unconditionally ntohll ewah data ewah: support platforms that require aligned reads read-cache: use get_be32 instead of hand-rolled ntoh_l block-sha1: factor out get_be and put_be wrappers do not discard revindex when re-preparing packfiles pack-bitmap: implement optional name_hash cache t/perf: add tests for pack bitmaps t: add basic bitmap functionality tests count-objects: recognize .bitmap in garbage-checking repack: consider bitmaps when performing repacks repack: handle optional files created by pack-objects repack: turn exts array into array-of-struct repack: stop using magic number for ARRAY_SIZE(exts) pack-objects: implement bitmap writing rev-list: add bitmap mode to speed up object lists pack-objects: use bitmaps when packing objects pack-objects: split add_object_entry pack-bitmap: add support for bitmap indexes documentation: add documentation for the bitmap format ewah: compressed bitmap implementation ...
2014-02-27 14:01:48 -08:00 · 2014-02-27 14:01:48 -08:00 · 0f9e62e084
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@ -1870,6 +1870,31 @@ pack.packSizeLimit::
 	Common unit suffixes of 'k', 'm', or 'g' are
 	supported.

+pack.useBitmaps::
+	When true, git will use pack bitmaps (if available) when packing
+	to stdout (e.g., during the server side of a fetch). Defaults to
+	true. You should not generally need to turn this off unless
+	you are debugging pack bitmaps.
+
+pack.writebitmaps::
+	When true, git will write a bitmap index when packing all
+	objects to disk (e.g., when `git repack -a` is run).  This
+	index can speed up the "counting objects" phase of subsequent
+	packs created for clones and fetches, at the cost of some disk
+	space and extra time spent on the initial repack.  Defaults to
+	false.
+
+pack.writeBitmapHashCache::
+	When true, git will include a "hash cache" section in the bitmap
+	index (if one is written). This cache can be used to feed git's
+	delta heuristics, potentially leading to better deltas between
+	bitmapped and non-bitmapped objects (e.g., when serving a fetch
+	between an older, bitmapped pack and objects that have been
+	pushed since the last gc). The downside is that it consumes 4
+	bytes per object of disk space, and that JGit's bitmap
+	implementation does not understand it, causing it to complain if
+	Git and JGit are used on the same repository. Defaults to false.
+
 pager.<cmd>::
 	If the value is boolean, turns on or off pagination of the
 	output of a particular Git subcommand when writing to a tty.
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@ -9,7 +9,7 @@ git-repack - Pack unpacked objects in a repository
 SYNOPSIS
 --------
 [verse]
-'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [--window=<n>] [--depth=<n>]
+'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [--window=<n>] [--depth=<n>]

 DESCRIPTION
 -----------
@ -110,6 +110,13 @@ other objects in that pack they already have locally.
 	The default is unlimited, unless the config variable
 	`pack.packSizeLimit` is set.

+-b::
+--write-bitmap-index::
+	Write a reachability bitmap index as part of the repack. This
+	only makes sense when used with `-a` or `-A`, as the bitmaps
+	must be able to refer to all reachable objects. This option
+	overrides the setting of `pack.writebitmaps`.
+

 Configuration
 -------------
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@ -55,6 +55,7 @@ SYNOPSIS
 	     [ \--reverse ]
 	     [ \--walk-reflogs ]
 	     [ \--no-walk ] [ \--do-walk ]
+	     [ \--use-bitmap-index ]
 	     <commit>... [ \-- <paths>... ]

 DESCRIPTION
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@ -257,6 +257,14 @@ See also linkgit:git-reflog[1].
 	Output excluded boundary commits. Boundary commits are
 	prefixed with `-`.

+ifdef::git-rev-list[]
+--use-bitmap-index::
+
+	Try to speed up the traversal using the pack bitmap index (if
+	one is available). Note that when traversing with `--objects`,
+	trees and blobs will not have their associated path printed.
+endif::git-rev-list[]
+
 --

 History Simplification
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@ -0,0 +1,164 @@
+GIT bitmap v1 format
+====================
+
+	- A header appears at the beginning:
+
+		4-byte signature: {'B', 'I', 'T', 'M'}
+
+		2-byte version number (network byte order)
+			The current implementation only supports version 1
+			of the bitmap index (the same one as JGit).
+
+		2-byte flags (network byte order)
+
+			The following flags are supported:
+
+			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
+			This flag must always be present. It implies that the bitmap
+			index has been generated for a packfile with full closure
+			(i.e. where every single object in the packfile can find
+			 its parent links inside the same packfile). This is a
+			requirement for the bitmap index format, also present in JGit,
+			that greatly reduces the complexity of the implementation.
+
+			- BITMAP_OPT_HASH_CACHE (0x4)
+			If present, the end of the bitmap file contains
+			`N` 32-bit name-hash values, one per object in the
+			pack. The format and meaning of the name-hash is
+			described below.
+
+		4-byte entry count (network byte order)
+
+			The total count of entries (bitmapped commits) in this bitmap index.
+
+		20-byte checksum
+
+			The SHA1 checksum of the pack this bitmap index belongs to.
+
+	- 4 EWAH bitmaps that act as type indexes
+
+		Type indexes are serialized after the hash cache in the shape
+		of four EWAH bitmaps stored consecutively (see Appendix A for
+		the serialization format of an EWAH bitmap).
+
+		There is a bitmap for each Git object type, stored in the following
+		order:
+
+			- Commits
+			- Trees
+			- Blobs
+			- Tags
+
+		In each bitmap, the `n`th bit is set to true if the `n`th object
+		in the packfile is of that type.
+
+		The obvious consequence is that the OR of all 4 bitmaps will result
+		in a full set (all bits set), and the AND of all 4 bitmaps will
+		result in an empty bitmap (no bits set).
+
+	- N entries with compressed bitmaps, one for each indexed commit
+
+		Where `N` is the total amount of entries in this bitmap index.
+		Each entry contains the following:
+
+		- 4-byte object position (network byte order)
+			The position **in the index for the packfile** where the
+			bitmap for this commit is found.
+
+		- 1-byte XOR-offset
+			The xor offset used to compress this bitmap. For an entry
+			in position `x`, a XOR offset of `y` means that the actual
+			bitmap representing this commit is composed by XORing the
+			bitmap for this entry with the bitmap in entry `x-y` (i.e.
+			the bitmap `y` entries before this one).
+
+			Note that this compression can be recursive. In order to
+			XOR this entry with a previous one, the previous entry needs
+			to be decompressed first, and so on.
+
+			The hard-limit for this offset is 160 (an entry can only be
+			xor'ed against one of the 160 entries preceding it). This
+			number is always positive, and hence entries are always xor'ed
+			with **previous** bitmaps, not bitmaps that will come afterwards
+			in the index.
+
+		- 1-byte flags for this bitmap
+			At the moment the only available flag is `0x1`, which hints
+			that this bitmap can be re-used when rebuilding bitmap indexes
+			for the repository.
+
+		- The compressed bitmap itself, see Appendix A.
+
+== Appendix A: Serialization format for an EWAH bitmap
+
+Ewah bitmaps are serialized in the same protocol as the JAVAEWAH
+library, making them backwards compatible with the JGit
+implementation:
+
+	- 4-byte number of bits of the resulting UNCOMPRESSED bitmap
+
+	- 4-byte number of words of the COMPRESSED bitmap, when stored
+
+	- N x 8-byte words, as specified by the previous field
+
+		This is the actual content of the compressed bitmap.
+
+	- 4-byte position of the current RLW for the compressed
+		bitmap
+
+All words are stored in network byte order for their corresponding
+sizes.
+
+The compressed bitmap is stored in a form of run-length encoding, as
+follows.  It consists of a concatenation of an arbitrary number of
+chunks.  Each chunk consists of one or more 64-bit words
+
+     H  L_1  L_2  L_3 .... L_M
+
+H is called RLW (run length word).  It consists of (from lower to higher
+order bits):
+
+     - 1 bit: the repeated bit B
+
+     - 32 bits: repetition count K (unsigned)
+
+     - 31 bits: literal word count M (unsigned)
+
+The bitstream represented by the above chunk is then:
+
+     - K repetitions of B
+
+     - The bits stored in `L_1` through `L_M`.  Within a word, bits at
+       lower order come earlier in the stream than those at higher
+       order.
+
+The next word after `L_M` (if any) must again be a RLW, for the next
+chunk.  For efficient appending to the bitstream, the EWAH stores a
+pointer to the last RLW in the stream.
+
+
+== Appendix B: Optional Bitmap Sections
+
+These sections may or may not be present in the `.bitmap` file; their
+presence is indicated by the header flags section described above.
+
+Name-hash cache
+---------------
+
+If the BITMAP_OPT_HASH_CACHE flag is set, the end of the bitmap contains
+a cache of 32-bit values, one per object in the pack. The value at
+position `i` is the hash of the pathname at which the `i`th object
+(counting in index order) in the pack can be found.  This can be fed
+into the delta heuristics to compare objects with similar pathnames.
+
+The hash algorithm used is:
+
+    hash = 0;
+    while ((c = *name++))
+	    if (!isspace(c))
+		    hash = (hash >> 2) + (c << 24);
+
+Note that this hashing scheme is tied to the BITMAP_OPT_HASH_CACHE flag.
+If implementations want to choose a different hashing scheme, they are
+free to do so, but MUST allocate a new header flag (because comparing
+hashes made under two different schemes would be pointless).
--- a/16
+++ b/16
@ -664,6 +664,8 @@ LIB_H += diff.h
 LIB_H += diffcore.h
 LIB_H += dir.h
 LIB_H += exec_cmd.h
+LIB_H += ewah/ewok.h
+LIB_H += ewah/ewok_rlw.h
 LIB_H += fetch-pack.h
 LIB_H += fmt-merge-msg.h
 LIB_H += fsck.h
@ -691,8 +693,10 @@ LIB_H += notes-merge.h
 LIB_H += notes-utils.h
 LIB_H += notes.h
 LIB_H += object.h
+LIB_H += pack-objects.h
 LIB_H += pack-revindex.h
 LIB_H += pack.h
+LIB_H += pack-bitmap.h
 LIB_H += parse-options.h
 LIB_H += patch-ids.h
 LIB_H += pathspec.h
@ -796,6 +800,10 @@ LIB_OBJS += dir.o
 LIB_OBJS += editor.o
 LIB_OBJS += entry.o
 LIB_OBJS += environment.o
+LIB_OBJS += ewah/bitmap.o
+LIB_OBJS += ewah/ewah_bitmap.o
+LIB_OBJS += ewah/ewah_io.o
+LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
@ -827,7 +835,10 @@ LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
 LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
+LIB_OBJS += pack-bitmap.o
+LIB_OBJS += pack-bitmap-write.o
 LIB_OBJS += pack-check.o
+LIB_OBJS += pack-objects.o
 LIB_OBJS += pack-revindex.o
 LIB_OBJS += pack-write.o
 LIB_OBJS += pager.o
@ -2480,8 +2491,9 @@ profile-clean:
 	$(RM) $(addsuffix *.gcno,$(addprefix $(PROFILE_DIR)/, $(object_dirs)))

 clean: profile-clean coverage-clean
-	$(RM) *.o *.res block-sha1/*.o ppc/*.o compat/*.o compat/*/*.o xdiff/*.o vcs-svn/*.o \
-		builtin/*.o $(LIB_FILE) $(XDIFF_LIB) $(VCSSVN_LIB)
+	$(RM) *.o *.res block-sha1/*.o ppc/*.o compat/*.o compat/*/*.o
+	$(RM) xdiff/*.o vcs-svn/*.o ewah/*.o builtin/*.o
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(VCSSVN_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
 	$(RM) $(TEST_PROGRAMS) $(NO_INSTALL)
 	$(RM) -r bin-wrappers $(dep_dirs)
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@ -62,38 +62,6 @@
  #define setW(x, val) (W(x) = (val))
 #endif

-/*
- * Performance might be improved if the CPU architecture is OK with
- * unaligned 32-bit loads and a fast ntohl() is available.
- * Otherwise fall back to byte loads and shifts which is portable,
- * and is faster on architectures with memory alignment issues.
- */
-
-#if defined(__i386__) || defined(__x86_64__) || \
-    defined(_M_IX86) || defined(_M_X64) || \
-    defined(__ppc__) || defined(__ppc64__) || \
-    defined(__powerpc__) || defined(__powerpc64__) || \
-    defined(__s390__) || defined(__s390x__)
-
-#define get_be32(p)	ntohl(*(unsigned int *)(p))
-#define put_be32(p, v)	do { *(unsigned int *)(p) = htonl(v); } while (0)
-
-#else
-
-#define get_be32(p)	( \
-	(*((unsigned char *)(p) + 0) << 24) | \
-	(*((unsigned char *)(p) + 1) << 16) | \
-	(*((unsigned char *)(p) + 2) <<  8) | \
-	(*((unsigned char *)(p) + 3) <<  0) )
-#define put_be32(p, v)	do { \
-	unsigned int __v = (v); \
-	*((unsigned char *)(p) + 0) = __v >> 24; \
-	*((unsigned char *)(p) + 1) = __v >> 16; \
-	*((unsigned char *)(p) + 2) = __v >>  8; \
-	*((unsigned char *)(p) + 3) = __v >>  0; } while (0)
-
-#endif
-
 /* This "rolls" over the 512-bit array */
 #define W(x) (array[(x)&15])

--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@ -14,10 +14,12 @@
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "pack-objects.h"
 #include "progress.h"
 #include "refs.h"
 #include "streaming.h"
 #include "thread-utils.h"
+#include "pack-bitmap.h"

 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [options...] [< ref-list | < object-list]"),
@ -25,42 +27,15 @@ static const char *pack_usage[] = {
 	NULL
 };

-struct object_entry {
-	struct pack_idx_entry idx;
-	unsigned long size;	/* uncompressed size */
-	struct packed_git *in_pack; 	/* already in pack */
-	off_t in_pack_offset;
-	struct object_entry *delta;	/* delta base object */
-	struct object_entry *delta_child; /* deltified objects who bases me */
-	struct object_entry *delta_sibling; /* other deltified objects who
-					     * uses the same base as me
-					     */
-	void *delta_data;	/* cached delta (uncompressed) */
-	unsigned long delta_size;	/* delta data size (uncompressed) */
-	unsigned long z_delta_size;	/* delta data size (compressed) */
-	enum object_type type;
-	enum object_type in_pack_type;	/* could be delta */
-	uint32_t hash;			/* name hint hash */
-	unsigned char in_pack_header_size;
-	unsigned preferred_base:1; /*
-				    * we do not pack this, but is available
-				    * to be used as the base object to delta
-				    * objects against.
-				    */
-	unsigned no_try_delta:1;
-	unsigned tagged:1; /* near the very tip of refs */
-	unsigned filled:1; /* assigned write-order */
-};
-
 /*
- * Objects we are going to pack are collected in objects array (dynamically
- * expanded).  nr_objects & nr_alloc controls this array.  They are stored
- * in the order we see -- typically rev-list --objects order that gives us
- * nice "minimum seek" order.
+ * Objects we are going to pack are collected in the `to_pack` structure.
+ * It contains an array (dynamically expanded) of the object data, and a map
+ * that can resolve SHA1s to their position in the array.
 */
-static struct object_entry *objects;
+static struct packing_data to_pack;
+
 static struct pack_idx_entry **written_list;
-static uint32_t nr_objects, nr_alloc, nr_result, nr_written;
+static uint32_t nr_result, nr_written;

 static int non_empty;
 static int reuse_delta = 1, reuse_object = 1;
@ -83,27 +58,43 @@ static struct progress *progress_state;
 static int pack_compression_level = Z_DEFAULT_COMPRESSION;
 static int pack_compression_seen;

+static struct packed_git *reuse_packfile;
+static uint32_t reuse_packfile_objects;
+static off_t reuse_packfile_offset;
+
+static int use_bitmap_index = 1;
+static int write_bitmap_index;
+static uint16_t write_bitmap_options;
+
 static unsigned long delta_cache_size = 0;
 static unsigned long max_delta_cache_size = 256 * 1024 * 1024;
 static unsigned long cache_max_small_delta_size = 1000;

 static unsigned long window_memory_limit = 0;

-/*
- * The object names in objects array are hashed with this hashtable,
- * to help looking up the entry by object name.
- * This hashtable is built after all the objects are seen.
- */
-static int *object_ix;
-static int object_ix_hashsz;
-static struct object_entry *locate_object_entry(const unsigned char *sha1);
-
 /*
 * stats
 */
 static uint32_t written, written_delta;
 static uint32_t reused, reused_delta;

+/*
+ * Indexed commits
+ */
+static struct commit **indexed_commits;
+static unsigned int indexed_commits_nr;
+static unsigned int indexed_commits_alloc;
+
+static void index_commit_for_bitmap(struct commit *commit)
+{
+	if (indexed_commits_nr >= indexed_commits_alloc) {
+		indexed_commits_alloc = (indexed_commits_alloc + 32) * 2;
+		indexed_commits = xrealloc(indexed_commits,
+			indexed_commits_alloc * sizeof(struct commit *));
+	}
+
+	indexed_commits[indexed_commits_nr++] = commit;
+}

 static void *get_delta(struct object_entry *entry)
 {
@ -553,12 +544,12 @@ static int mark_tagged(const char *path, const unsigned char *sha1, int flag,
 		       void *cb_data)
 {
 	unsigned char peeled[20];
-	struct object_entry *entry = locate_object_entry(sha1);
+	struct object_entry *entry = packlist_find(&to_pack, sha1, NULL);

 	if (entry)
 		entry->tagged = 1;
 	if (!peel_ref(path, peeled)) {
-		entry = locate_object_entry(peeled);
+		entry = packlist_find(&to_pack, peeled, NULL);
 		if (entry)
 			entry->tagged = 1;
 	}
@ -633,9 +624,10 @@ static struct object_entry **compute_write_order(void)
 {
 	unsigned int i, wo_end, last_untagged;

-	struct object_entry **wo = xmalloc(nr_objects * sizeof(*wo));
+	struct object_entry **wo = xmalloc(to_pack.nr_objects * sizeof(*wo));
+	struct object_entry *objects = to_pack.objects;

-	for (i = 0; i < nr_objects; i++) {
+	for (i = 0; i < to_pack.nr_objects; i++) {
 		objects[i].tagged = 0;
 		objects[i].filled = 0;
 		objects[i].delta_child = NULL;
@ -647,7 +639,7 @@ static struct object_entry **compute_write_order(void)
 	 * Make sure delta_sibling is sorted in the original
 	 * recency order.
 	 */
-	for (i = nr_objects; i > 0;) {
+	for (i = to_pack.nr_objects; i > 0;) {
 		struct object_entry *e = &objects[--i];
 		if (!e->delta)
 			continue;
@ -665,7 +657,7 @@ static struct object_entry **compute_write_order(void)
 	 * Give the objects in the original recency order until
 	 * we see a tagged tip.
 	 */
-	for (i = wo_end = 0; i < nr_objects; i++) {
+	for (i = wo_end = 0; i < to_pack.nr_objects; i++) {
 		if (objects[i].tagged)
 			break;
 		add_to_write_order(wo, &wo_end, &objects[i]);
@ -675,7 +667,7 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * Then fill all the tagged tips.
 	 */
-	for (; i < nr_objects; i++) {
+	for (; i < to_pack.nr_objects; i++) {
 		if (objects[i].tagged)
 			add_to_write_order(wo, &wo_end, &objects[i]);
 	}
@ -683,7 +675,7 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * And then all remaining commits and tags.
 	 */
-	for (i = last_untagged; i < nr_objects; i++) {
+	for (i = last_untagged; i < to_pack.nr_objects; i++) {
 		if (objects[i].type != OBJ_COMMIT &&
 		    objects[i].type != OBJ_TAG)
 			continue;
@ -693,7 +685,7 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * And then all the trees.
 	 */
-	for (i = last_untagged; i < nr_objects; i++) {
+	for (i = last_untagged; i < to_pack.nr_objects; i++) {
 		if (objects[i].type != OBJ_TREE)
 			continue;
 		add_to_write_order(wo, &wo_end, &objects[i]);
@ -702,17 +694,57 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * Finally all the rest in really tight order
 	 */
-	for (i = last_untagged; i < nr_objects; i++) {
+	for (i = last_untagged; i < to_pack.nr_objects; i++) {
 		if (!objects[i].filled)
 			add_family_to_write_order(wo, &wo_end, &objects[i]);
 	}

-	if (wo_end != nr_objects)
-		die("ordered %u objects, expected %"PRIu32, wo_end, nr_objects);
+	if (wo_end != to_pack.nr_objects)
+		die("ordered %u objects, expected %"PRIu32, wo_end, to_pack.nr_objects);

 	return wo;
 }

+static off_t write_reused_pack(struct sha1file *f)
+{
+	unsigned char buffer[8192];
+	off_t to_write;
+	int fd;
+
+	if (!is_pack_valid(reuse_packfile))
+		die("packfile is invalid: %s", reuse_packfile->pack_name);
+
+	fd = git_open_noatime(reuse_packfile->pack_name);
+	if (fd < 0)
+		die_errno("unable to open packfile for reuse: %s",
+			  reuse_packfile->pack_name);
+
+	if (lseek(fd, sizeof(struct pack_header), SEEK_SET) == -1)
+		die_errno("unable to seek in reused packfile");
+
+	if (reuse_packfile_offset < 0)
+		reuse_packfile_offset = reuse_packfile->pack_size - 20;
+
+	to_write = reuse_packfile_offset - sizeof(struct pack_header);
+
+	while (to_write) {
+		int read_pack = xread(fd, buffer, sizeof(buffer));
+
+		if (read_pack <= 0)
+			die_errno("unable to read from reused packfile");
+
+		if (read_pack > to_write)
+			read_pack = to_write;
+
+		sha1write(f, buffer, read_pack);
+		to_write -= read_pack;
+	}
+
+	close(fd);
+	written += reuse_packfile_objects;
+	return reuse_packfile_offset - sizeof(struct pack_header);
+}
+
 static void write_pack_file(void)
 {
 	uint32_t i = 0, j;
@ -724,7 +756,7 @@ static void write_pack_file(void)

 	if (progress > pack_to_stdout)
 		progress_state = start_progress("Writing objects", nr_result);
-	written_list = xmalloc(nr_objects * sizeof(*written_list));
+	written_list = xmalloc(to_pack.nr_objects * sizeof(*written_list));
 	write_order = compute_write_order();

 	do {
@ -737,8 +769,17 @@ static void write_pack_file(void)
 			f = create_tmp_packfile(&pack_tmp_name);

 		offset = write_pack_header(f, nr_remaining);
+
+		if (reuse_packfile) {
+			off_t packfile_size;
+			assert(pack_to_stdout);
+
+			packfile_size = write_reused_pack(f);
+			offset += packfile_size;
+		}
+
 		nr_written = 0;
-		for (; i < nr_objects; i++) {
+		for (; i < to_pack.nr_objects; i++) {
 			struct object_entry *e = write_order[i];
 			if (write_one(f, e, &offset) == WRITE_ONE_BREAK)
 				break;
@ -789,9 +830,31 @@ static void write_pack_file(void)
 			if (sizeof(tmpname) <= strlen(base_name) + 50)
 				die("pack base name '%s' too long", base_name);
 			snprintf(tmpname, sizeof(tmpname), "%s-", base_name);
+
+			if (write_bitmap_index) {
+				bitmap_writer_set_checksum(sha1);
+				bitmap_writer_build_type_index(written_list, nr_written);
+			}
+
 			finish_tmp_packfile(tmpname, pack_tmp_name,
 					    written_list, nr_written,
 					    &pack_idx_opts, sha1);
+
+			if (write_bitmap_index) {
+				char *end_of_name_prefix = strrchr(tmpname, 0);
+				sprintf(end_of_name_prefix, "%s.bitmap", sha1_to_hex(sha1));
+
+				stop_progress(&progress_state);
+
+				bitmap_writer_show_progress(progress);
+				bitmap_writer_reuse_bitmaps(&to_pack);
+				bitmap_writer_select_commits(indexed_commits, indexed_commits_nr, -1);
+				bitmap_writer_build(&to_pack);
+				bitmap_writer_finish(written_list, nr_written,
+						     tmpname, write_bitmap_options);
+				write_bitmap_index = 0;
+			}
+
 			free(pack_tmp_name);
 			puts(sha1_to_hex(sha1));
 		}
@ -801,7 +864,7 @@ static void write_pack_file(void)
 			written_list[j]->offset = (off_t)-1;
 		}
 		nr_remaining -= nr_written;
-	} while (nr_remaining && i < nr_objects);
+	} while (nr_remaining && i < to_pack.nr_objects);

 	free(written_list);
 	free(write_order);
@ -811,73 +874,6 @@ static void write_pack_file(void)
 			written, nr_result);
 }

-static int locate_object_entry_hash(const unsigned char *sha1)
-{
-	int i;
-	unsigned int ui;
-	memcpy(&ui, sha1, sizeof(unsigned int));
-	i = ui % object_ix_hashsz;
-	while (0 < object_ix[i]) {
-		if (!hashcmp(sha1, objects[object_ix[i] - 1].idx.sha1))
-			return i;
-		if (++i == object_ix_hashsz)
-			i = 0;
-	}
-	return -1 - i;
-}
-
-static struct object_entry *locate_object_entry(const unsigned char *sha1)
-{
-	int i;
-
-	if (!object_ix_hashsz)
-		return NULL;
-
-	i = locate_object_entry_hash(sha1);
-	if (0 <= i)
-		return &objects[object_ix[i]-1];
-	return NULL;
-}
-
-static void rehash_objects(void)
-{
-	uint32_t i;
-	struct object_entry *oe;
-
-	object_ix_hashsz = nr_objects * 3;
-	if (object_ix_hashsz < 1024)
-		object_ix_hashsz = 1024;
-	object_ix = xrealloc(object_ix, sizeof(int) * object_ix_hashsz);
-	memset(object_ix, 0, sizeof(int) * object_ix_hashsz);
-	for (i = 0, oe = objects; i < nr_objects; i++, oe++) {
-		int ix = locate_object_entry_hash(oe->idx.sha1);
-		if (0 <= ix)
-			continue;
-		ix = -1 - ix;
-		object_ix[ix] = i + 1;
-	}
-}
-
-static uint32_t name_hash(const char *name)
-{
-	uint32_t c, hash = 0;
-
-	if (!name)
-		return 0;
-
-	/*
-	 * This effectively just creates a sortable number from the
-	 * last sixteen non-whitespace characters. Last characters
-	 * count "most", so things that end in ".c" sort together.
-	 */
-	while ((c = *name++) != 0) {
-		if (isspace(c))
-			continue;
-		hash = (hash >> 2) + (c << 24);
-	}
-	return hash;
-}
-
 static void setup_delta_attr_check(struct git_attr_check *check)
 {
 	static struct git_attr *attr_delta;
@ -900,42 +896,69 @@ static int no_try_delta(const char *path)
 	return 0;
 }

-static int add_object_entry(const unsigned char *sha1, enum object_type type,
-			    const char *name, int exclude)
+/*
+ * When adding an object, check whether we have already added it
+ * to our packing list. If so, we can skip. However, if we are
+ * being asked to excludei t, but the previous mention was to include
+ * it, make sure to adjust its flags and tweak our numbers accordingly.
+ *
+ * As an optimization, we pass out the index position where we would have
+ * found the item, since that saves us from having to look it up again a
+ * few lines later when we want to add the new entry.
+ */
+static int have_duplicate_entry(const unsigned char *sha1,
+				int exclude,
+				uint32_t *index_pos)
 {
 	struct object_entry *entry;
-	struct packed_git *p, *found_pack = NULL;
-	off_t found_offset = 0;
-	int ix;
-	uint32_t hash = name_hash(name);

-	ix = nr_objects ? locate_object_entry_hash(sha1) : -1;
-	if (ix >= 0) {
-		if (exclude) {
-			entry = objects + object_ix[ix] - 1;
-			if (!entry->preferred_base)
-				nr_result--;
-			entry->preferred_base = 1;
-		}
+	entry = packlist_find(&to_pack, sha1, index_pos);
+	if (!entry)
 		return 0;
+
+	if (exclude) {
+		if (!entry->preferred_base)
+			nr_result--;
+		entry->preferred_base = 1;
 	}

+	return 1;
+}
+
+/*
+ * Check whether we want the object in the pack (e.g., we do not want
+ * objects found in non-local stores if the "--local" option was used).
+ *
+ * As a side effect of this check, we will find the packed version of this
+ * object, if any. We therefore pass out the pack information to avoid having
+ * to look it up again later.
+ */
+static int want_object_in_pack(const unsigned char *sha1,
+			       int exclude,
+			       struct packed_git **found_pack,
+			       off_t *found_offset)
+{
+	struct packed_git *p;
+
 	if (!exclude && local && has_loose_object_nonlocal(sha1))
 		return 0;

+	*found_pack = NULL;
+	*found_offset = 0;
+
 	for (p = packed_git; p; p = p->next) {
 		off_t offset = find_pack_entry_one(sha1, p);
 		if (offset) {
-			if (!found_pack) {
+			if (!*found_pack) {
 				if (!is_pack_valid(p)) {
 					warning("packfile %s cannot be accessed", p->pack_name);
 					continue;
 				}
-				found_offset = offset;
-				found_pack = p;
+				*found_offset = offset;
+				*found_pack = p;
 			}
 			if (exclude)
-				break;
+				return 1;
 			if (incremental)
 				return 0;
 			if (local && !p->pack_local)
@ -945,14 +968,21 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 		}
 	}

-	if (nr_objects >= nr_alloc) {
-		nr_alloc = (nr_alloc  + 1024) * 3 / 2;
-		objects = xrealloc(objects, nr_alloc * sizeof(*entry));
-	}
+	return 1;
+}

-	entry = objects + nr_objects++;
-	memset(entry, 0, sizeof(*entry));
-	hashcpy(entry->idx.sha1, sha1);
+static void create_object_entry(const unsigned char *sha1,
+				enum object_type type,
+				uint32_t hash,
+				int exclude,
+				int no_try_delta,
+				uint32_t index_pos,
+				struct packed_git *found_pack,
+				off_t found_offset)
+{
+	struct object_entry *entry;
+
+	entry = packlist_alloc(&to_pack, sha1, index_pos);
 	entry->hash = hash;
 	if (type)
 		entry->type = type;
@ -965,16 +995,43 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 		entry->in_pack_offset = found_offset;
 	}

-	if (object_ix_hashsz * 3 <= nr_objects * 4)
-		rehash_objects();
-	else
-		object_ix[-1 - ix] = nr_objects;
+	entry->no_try_delta = no_try_delta;
+}

-	display_progress(progress_state, nr_objects);
+static int add_object_entry(const unsigned char *sha1, enum object_type type,
+			    const char *name, int exclude)
+{
+	struct packed_git *found_pack;
+	off_t found_offset;
+	uint32_t index_pos;

-	if (name && no_try_delta(name))
-		entry->no_try_delta = 1;
+	if (have_duplicate_entry(sha1, exclude, &index_pos))
+		return 0;

+	if (!want_object_in_pack(sha1, exclude, &found_pack, &found_offset))
+		return 0;
+
+	create_object_entry(sha1, type, pack_name_hash(name),
+			    exclude, name && no_try_delta(name),
+			    index_pos, found_pack, found_offset);
+
+	display_progress(progress_state, to_pack.nr_objects);
+	return 1;
+}
+
+static int add_object_entry_from_bitmap(const unsigned char *sha1,
+					enum object_type type,
+					int flags, uint32_t name_hash,
+					struct packed_git *pack, off_t offset)
+{
+	uint32_t index_pos;
+
+	if (have_duplicate_entry(sha1, 0, &index_pos))
+		return 0;
+
+	create_object_entry(sha1, type, name_hash, 0, 0, index_pos, pack, offset);
+
+	display_progress(progress_state, to_pack.nr_objects);
 	return 1;
 }

@ -1175,7 +1232,7 @@ static void add_preferred_base_object(const char *name)
 {
 	struct pbase_tree *it;
 	int cmplen;
-	unsigned hash = name_hash(name);
+	unsigned hash = pack_name_hash(name);

 	if (!num_preferred_base || check_pbase_path(hash))
 		return;
@ -1327,7 +1384,7 @@ static void check_object(struct object_entry *entry)
 			break;
 		}

-		if (base_ref && (base_entry = locate_object_entry(base_ref))) {
+		if (base_ref && (base_entry = packlist_find(&to_pack, base_ref, NULL))) {
 			/*
 			 * If base_ref was set above that means we wish to
 			 * reuse delta data, and we even found that base
@ -1401,12 +1458,12 @@ static void get_object_details(void)
 	uint32_t i;
 	struct object_entry **sorted_by_offset;

-	sorted_by_offset = xcalloc(nr_objects, sizeof(struct object_entry *));
-	for (i = 0; i < nr_objects; i++)
-		sorted_by_offset[i] = objects + i;
-	qsort(sorted_by_offset, nr_objects, sizeof(*sorted_by_offset), pack_offset_sort);
+	sorted_by_offset = xcalloc(to_pack.nr_objects, sizeof(struct object_entry *));
+	for (i = 0; i < to_pack.nr_objects; i++)
+		sorted_by_offset[i] = to_pack.objects + i;
+	qsort(sorted_by_offset, to_pack.nr_objects, sizeof(*sorted_by_offset), pack_offset_sort);

-	for (i = 0; i < nr_objects; i++) {
+	for (i = 0; i < to_pack.nr_objects; i++) {
 		struct object_entry *entry = sorted_by_offset[i];
 		check_object(entry);
 		if (big_file_threshold < entry->size)
@ -2032,7 +2089,7 @@ static int add_ref_tag(const char *path, const unsigned char *sha1, int flag, vo

 	if (starts_with(path, "refs/tags/") && /* is a tag? */
 	    !peel_ref(path, peeled)        && /* peelable? */
-	    locate_object_entry(peeled))      /* object packed? */
+	    packlist_find(&to_pack, peeled, NULL))      /* object packed? */
 		add_object_entry(sha1, OBJ_TAG, NULL, 0);
 	return 0;
 }
@ -2055,14 +2112,14 @@ static void prepare_pack(int window, int depth)
 	if (!pack_to_stdout)
 		do_check_packed_object_crc = 1;

-	if (!nr_objects || !window || !depth)
+	if (!to_pack.nr_objects || !window || !depth)
 		return;

-	delta_list = xmalloc(nr_objects * sizeof(*delta_list));
+	delta_list = xmalloc(to_pack.nr_objects * sizeof(*delta_list));
 	nr_deltas = n = 0;

-	for (i = 0; i < nr_objects; i++) {
-		struct object_entry *entry = objects + i;
+	for (i = 0; i < to_pack.nr_objects; i++) {
+		struct object_entry *entry = to_pack.objects + i;

 		if (entry->delta)
 			/* This happens if we decided to reuse existing
@ -2140,6 +2197,20 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 		cache_max_small_delta_size = git_config_int(k, v);
 		return 0;
 	}
+	if (!strcmp(k, "pack.writebitmaps")) {
+		write_bitmap_index = git_config_bool(k, v);
+		return 0;
+	}
+	if (!strcmp(k, "pack.writebitmaphashcache")) {
+		if (git_config_bool(k, v))
+			write_bitmap_options |= BITMAP_OPT_HASH_CACHE;
+		else
+			write_bitmap_options &= ~BITMAP_OPT_HASH_CACHE;
+	}
+	if (!strcmp(k, "pack.usebitmaps")) {
+		use_bitmap_index = git_config_bool(k, v);
+		return 0;
+	}
 	if (!strcmp(k, "pack.threads")) {
 		delta_search_threads = git_config_int(k, v);
 		if (delta_search_threads < 0)
@ -2198,6 +2269,9 @@ static void show_commit(struct commit *commit, void *data)
 {
 	add_object_entry(commit->object.sha1, OBJ_COMMIT, NULL, 0);
 	commit->object.flags |= OBJECT_ADDED;
+
+	if (write_bitmap_index)
+		index_commit_for_bitmap(commit);
 }

 static void show_object(struct object *obj,
@ -2340,7 +2414,7 @@ static void loosen_unused_packed_objects(struct rev_info *revs)

 		for (i = 0; i < p->num_objects; i++) {
 			sha1 = nth_packed_object_sha1(p, i);
-			if (!locate_object_entry(sha1) &&
+			if (!packlist_find(&to_pack, sha1, NULL) &&
 				!has_sha1_pack_kept_or_nonlocal(sha1))
 				if (force_object_loose(sha1, p->mtime))
 					die("unable to force loose object");
@ -2348,6 +2422,29 @@ static void loosen_unused_packed_objects(struct rev_info *revs)
 	}
 }

+static int get_object_list_from_bitmap(struct rev_info *revs)
+{
+	if (prepare_bitmap_walk(revs) < 0)
+		return -1;
+
+	if (!reuse_partial_packfile_from_bitmap(
+			&reuse_packfile,
+			&reuse_packfile_objects,
+			&reuse_packfile_offset)) {
+		assert(reuse_packfile_objects);
+		nr_result += reuse_packfile_objects;
+
+		if (progress) {
+			fprintf(stderr, "Reusing existing pack: %d, done.\n",
+				reuse_packfile_objects);
+			fflush(stderr);
+		}
+	}
+
+	traverse_bitmap_commit_list(&add_object_entry_from_bitmap);
+	return 0;
+}
+
 static void get_object_list(int ac, const char **av)
 {
 	struct rev_info revs;
@ -2367,6 +2464,7 @@ static void get_object_list(int ac, const char **av)
 		if (*line == '-') {
 			if (!strcmp(line, "--not")) {
 				flags ^= UNINTERESTING;
+				write_bitmap_index = 0;
 				continue;
 			}
 			die("not a rev '%s'", line);
@ -2375,6 +2473,9 @@ static void get_object_list(int ac, const char **av)
 			die("bad revision '%s'", line);
 	}

+	if (use_bitmap_index && !get_object_list_from_bitmap(&revs))
+		return;
+
 	if (prepare_revision_walk(&revs))
 		die("revision walk setup failed");
 	mark_edges_uninteresting(&revs, show_edge);
@ -2504,6 +2605,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			    N_("pack compression level")),
 		OPT_SET_INT(0, "keep-true-parents", &grafts_replace_parents,
 			    N_("do not hide commits by grafts"), 0),
+		OPT_BOOL(0, "use-bitmap-index", &use_bitmap_index,
+			 N_("use a bitmap index if available to speed up counting objects")),
+		OPT_BOOL(0, "write-bitmap-index", &write_bitmap_index,
+			 N_("write a bitmap index together with the pack index")),
 		OPT_END(),
 	};

@ -2570,6 +2675,12 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (keep_unreachable && unpack_unreachable)
 		die("--keep-unreachable and --unpack-unreachable are incompatible.");

+	if (!use_internal_rev_list || !pack_to_stdout || is_repository_shallow())
+		use_bitmap_index = 0;
+
+	if (pack_to_stdout || !rev_list_all)
+		write_bitmap_index = 0;
+
 	if (progress && all_progress_implied)
 		progress = 2;

--- a/builtin/repack.c
+++ b/builtin/repack.c
@ -94,7 +94,7 @@ static void get_non_kept_pack_filenames(struct string_list *fname_list)

 static void remove_redundant_pack(const char *dir_name, const char *base_name)
 {
-	const char *exts[] = {".pack", ".idx", ".keep"};
+	const char *exts[] = {".pack", ".idx", ".keep", ".bitmap"};
 	int i;
 	struct strbuf buf = STRBUF_INIT;
 	size_t plen;
@ -115,7 +115,14 @@ static void remove_redundant_pack(const char *dir_name, const char *base_name)

 int cmd_repack(int argc, const char **argv, const char *prefix)
 {
-	const char *exts[2] = {".pack", ".idx"};
+	struct {
+		const char *name;
+		unsigned optional:1;
+	} exts[] = {
+		{".pack"},
+		{".idx"},
+		{".bitmap", 1},
+	};
 	struct child_process cmd;
 	struct string_list_item *item;
 	struct argv_array cmd_args = ARGV_ARRAY_INIT;
@ -137,6 +144,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	int no_update_server_info = 0;
 	int quiet = 0;
 	int local = 0;
+	int write_bitmap = -1;

 	struct option builtin_repack_options[] = {
 		OPT_BIT('a', NULL, &pack_everything,
@ -155,6 +163,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		OPT__QUIET(&quiet, N_("be quiet")),
 		OPT_BOOL('l', "local", &local,
 				N_("pass --local to git-pack-objects")),
+		OPT_BOOL('b', "write-bitmap-index", &write_bitmap,
+				N_("write bitmap index")),
 		OPT_STRING(0, "unpack-unreachable", &unpack_unreachable, N_("approxidate"),
 				N_("with -A, do not loosen objects older than this")),
 		OPT_STRING(0, "window", &window, N_("n"),
@ -196,6 +206,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		argv_array_pushf(&cmd_args, "--no-reuse-delta");
 	if (no_reuse_object)
 		argv_array_pushf(&cmd_args, "--no-reuse-object");
+	if (write_bitmap >= 0)
+		argv_array_pushf(&cmd_args, "--%swrite-bitmap-index",
+				 write_bitmap ? "" : "no-");

 	if (pack_everything & ALL_INTO_ONE) {
 		get_non_kept_pack_filenames(&existing_packs);
@ -256,17 +269,17 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	 */
 	failed = 0;
 	for_each_string_list_item(item, &names) {
-		for (ext = 0; ext < 2; ext++) {
+		for (ext = 0; ext < ARRAY_SIZE(exts); ext++) {
 			char *fname, *fname_old;
 			fname = mkpathdup("%s/pack-%s%s", packdir,
-						item->string, exts[ext]);
+						item->string, exts[ext].name);
 			if (!file_exists(fname)) {
 				free(fname);
 				continue;
 			}

 			fname_old = mkpath("%s/old-%s%s", packdir,
-						item->string, exts[ext]);
+						item->string, exts[ext].name);
 			if (file_exists(fname_old))
 				if (unlink(fname_old))
 					failed = 1;
@ -313,19 +326,23 @@ int cmd_repack(int argc, const char **argv, const char *prefix)

 	/* Now the ones with the same name are out of the way... */
 	for_each_string_list_item(item, &names) {
-		for (ext = 0; ext < 2; ext++) {
+		for (ext = 0; ext < ARRAY_SIZE(exts); ext++) {
 			char *fname, *fname_old;
 			struct stat statbuffer;
+			int exists = 0;
 			fname = mkpathdup("%s/pack-%s%s",
-					packdir, item->string, exts[ext]);
+					packdir, item->string, exts[ext].name);
 			fname_old = mkpathdup("%s-%s%s",
-					packtmp, item->string, exts[ext]);
+					packtmp, item->string, exts[ext].name);
 			if (!stat(fname_old, &statbuffer)) {
 				statbuffer.st_mode &= ~(S_IWUSR | S_IWGRP | S_IWOTH);
 				chmod(fname_old, statbuffer.st_mode);
+				exists = 1;
+			}
+			if (exists || !exts[ext].optional) {
+				if (rename(fname_old, fname))
+					die_errno(_("renaming '%s' failed"), fname_old);
 			}
-			if (rename(fname_old, fname))
-				die_errno(_("renaming '%s' failed"), fname_old);
 			free(fname);
 			free(fname_old);
 		}
@ -333,12 +350,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix)

 	/* Remove the "old-" files */
 	for_each_string_list_item(item, &names) {
-		for (ext = 0; ext < 2; ext++) {
+		for (ext = 0; ext < ARRAY_SIZE(exts); ext++) {
 			char *fname;
 			fname = mkpath("%s/old-%s%s",
 					packdir,
 					item->string,
-					exts[ext]);
+					exts[ext].name);
 			if (remove_path(fname))
 				warning(_("removing '%s' failed"), fname);
 		}
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@ -3,6 +3,8 @@
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "pack.h"
+#include "pack-bitmap.h"
 #include "builtin.h"
 #include "log-tree.h"
 #include "graph.h"
@ -257,6 +259,18 @@ static int show_bisect_vars(struct rev_list_info *info, int reaches, int all)
 	return 0;
 }

+static int show_object_fast(
+	const unsigned char *sha1,
+	enum object_type type,
+	int exclude,
+	uint32_t name_hash,
+	struct packed_git *found_pack,
+	off_t found_offset)
+{
+	fprintf(stdout, "%s\n", sha1_to_hex(sha1));
+	return 1;
+}
+
 int cmd_rev_list(int argc, const char **argv, const char *prefix)
 {
 	struct rev_info revs;
@ -265,6 +279,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	int bisect_list = 0;
 	int bisect_show_vars = 0;
 	int bisect_find_all = 0;
+	int use_bitmap_index = 0;

 	git_config(git_default_config, NULL);
 	init_revisions(&revs, prefix);
@ -306,6 +321,14 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			bisect_show_vars = 1;
 			continue;
 		}
+		if (!strcmp(arg, "--use-bitmap-index")) {
+			use_bitmap_index = 1;
+			continue;
+		}
+		if (!strcmp(arg, "--test-bitmap")) {
+			test_bitmap_walk(&revs);
+			return 0;
+		}
 		usage(rev_list_usage);

 	}
@ -333,6 +356,22 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (bisect_list)
 		revs.limited = 1;

+	if (use_bitmap_index) {
+		if (revs.count && !revs.left_right && !revs.cherry_mark) {
+			uint32_t commit_count;
+			if (!prepare_bitmap_walk(&revs)) {
+				count_bitmap_commit_list(&commit_count, NULL, NULL, NULL);
+				printf("%d\n", commit_count);
+				return 0;
+			}
+		} else if (revs.tag_objects && revs.tree_objects && revs.blob_objects) {
+			if (!prepare_bitmap_walk(&revs)) {
+				traverse_bitmap_commit_list(&show_object_fast);
+				return 0;
+			}
+		}
+	}
+
 	if (prepare_revision_walk(&revs))
 		die("revision walk setup failed");
 	if (revs.tree_objects)
--- a/cache.h
+++ b/cache.h
@ -809,6 +809,7 @@ extern int hash_sha1_file(const void *buf, unsigned long len, const char *type,
 extern int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *return_sha1);
 extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned char *);
 extern int force_object_loose(const unsigned char *sha1, time_t mtime);
+extern int git_open_noatime(const char *name);
 extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
--- a/compat/bswap.h
+++ b/compat/bswap.h
@ -17,7 +17,20 @@ static inline uint32_t default_swab32(uint32_t val)
 		((val & 0x000000ff) << 24));
 }

+static inline uint64_t default_bswap64(uint64_t val)
+{
+	return (((val & (uint64_t)0x00000000000000ffULL) << 56) |
+		((val & (uint64_t)0x000000000000ff00ULL) << 40) |
+		((val & (uint64_t)0x0000000000ff0000ULL) << 24) |
+		((val & (uint64_t)0x00000000ff000000ULL) <<  8) |
+		((val & (uint64_t)0x000000ff00000000ULL) >>  8) |
+		((val & (uint64_t)0x0000ff0000000000ULL) >> 24) |
+		((val & (uint64_t)0x00ff000000000000ULL) >> 40) |
+		((val & (uint64_t)0xff00000000000000ULL) >> 56));
+}
+
 #undef bswap32
+#undef bswap64

 #if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

@ -32,15 +45,42 @@ static inline uint32_t git_bswap32(uint32_t x)
 	return result;
 }

+#define bswap64 git_bswap64
+#if defined(__x86_64__)
+static inline uint64_t git_bswap64(uint64_t x)
+{
+	uint64_t result;
+	if (__builtin_constant_p(x))
+		result = default_bswap64(x);
+	else
+		__asm__("bswap %q0" : "=r" (result) : "0" (x));
+	return result;
+}
+#else
+static inline uint64_t git_bswap64(uint64_t x)
+{
+	union { uint64_t i64; uint32_t i32[2]; } tmp, result;
+	if (__builtin_constant_p(x))
+		result.i64 = default_bswap64(x);
+	else {
+		tmp.i64 = x;
+		result.i32[0] = git_bswap32(tmp.i32[1]);
+		result.i32[1] = git_bswap32(tmp.i32[0]);
+	}
+	return result.i64;
+}
+#endif
+
 #elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))

 #include <stdlib.h>

 #define bswap32(x) _byteswap_ulong(x)
+#define bswap64(x) _byteswap_uint64(x)

 #endif

-#ifdef bswap32
+#if defined(bswap32)

 #undef ntohl
 #undef htonl
@ -48,3 +88,73 @@ static inline uint32_t git_bswap32(uint32_t x)
 #define htonl(x) bswap32(x)

 #endif
+
+#if defined(bswap64)
+
+#undef ntohll
+#undef htonll
+#define ntohll(x) bswap64(x)
+#define htonll(x) bswap64(x)
+
+#else
+
+#undef ntohll
+#undef htonll
+
+#if !defined(__BYTE_ORDER)
+# if defined(BYTE_ORDER) && defined(LITTLE_ENDIAN) && defined(BIG_ENDIAN)
+#  define __BYTE_ORDER BYTE_ORDER
+#  define __LITTLE_ENDIAN LITTLE_ENDIAN
+#  define __BIG_ENDIAN BIG_ENDIAN
+# endif
+#endif
+
+#if !defined(__BYTE_ORDER)
+# error "Cannot determine endianness"
+#endif
+
+#if __BYTE_ORDER == __BIG_ENDIAN
+# define ntohll(n) (n)
+# define htonll(n) (n)
+#else
+# define ntohll(n) default_bswap64(n)
+# define htonll(n) default_bswap64(n)
+#endif
+
+#endif
+
+/*
+ * Performance might be improved if the CPU architecture is OK with
+ * unaligned 32-bit loads and a fast ntohl() is available.
+ * Otherwise fall back to byte loads and shifts which is portable,
+ * and is faster on architectures with memory alignment issues.
+ */
+
+#if defined(__i386__) || defined(__x86_64__) || \
+    defined(_M_IX86) || defined(_M_X64) || \
+    defined(__ppc__) || defined(__ppc64__) || \
+    defined(__powerpc__) || defined(__powerpc64__) || \
+    defined(__s390__) || defined(__s390x__)
+
+#define get_be16(p)	ntohs(*(unsigned short *)(p))
+#define get_be32(p)	ntohl(*(unsigned int *)(p))
+#define put_be32(p, v)	do { *(unsigned int *)(p) = htonl(v); } while (0)
+
+#else
+
+#define get_be16(p)	( \
+	(*((unsigned char *)(p) + 0) << 8) | \
+	(*((unsigned char *)(p) + 1) << 0) )
+#define get_be32(p)	( \
+	(*((unsigned char *)(p) + 0) << 24) | \
+	(*((unsigned char *)(p) + 1) << 16) | \
+	(*((unsigned char *)(p) + 2) <<  8) | \
+	(*((unsigned char *)(p) + 3) <<  0) )
+#define put_be32(p, v)	do { \
+	unsigned int __v = (v); \
+	*((unsigned char *)(p) + 0) = __v >> 24; \
+	*((unsigned char *)(p) + 1) = __v >> 16; \
+	*((unsigned char *)(p) + 2) = __v >>  8; \
+	*((unsigned char *)(p) + 3) = __v >>  0; } while (0)
+
+#endif
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@ -0,0 +1,221 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#include "git-compat-util.h"
+#include "ewok.h"
+
+#define MASK(x) ((eword_t)1 << (x % BITS_IN_WORD))
+#define BLOCK(x) (x / BITS_IN_WORD)
+
+struct bitmap *bitmap_new(void)
+{
+	struct bitmap *bitmap = ewah_malloc(sizeof(struct bitmap));
+	bitmap->words = ewah_calloc(32, sizeof(eword_t));
+	bitmap->word_alloc = 32;
+	return bitmap;
+}
+
+void bitmap_set(struct bitmap *self, size_t pos)
+{
+	size_t block = BLOCK(pos);
+
+	if (block >= self->word_alloc) {
+		size_t old_size = self->word_alloc;
+		self->word_alloc = block * 2;
+		self->words = ewah_realloc(self->words,
+			self->word_alloc * sizeof(eword_t));
+
+		memset(self->words + old_size, 0x0,
+			(self->word_alloc - old_size) * sizeof(eword_t));
+	}
+
+	self->words[block] |= MASK(pos);
+}
+
+void bitmap_clear(struct bitmap *self, size_t pos)
+{
+	size_t block = BLOCK(pos);
+
+	if (block < self->word_alloc)
+		self->words[block] &= ~MASK(pos);
+}
+
+int bitmap_get(struct bitmap *self, size_t pos)
+{
+	size_t block = BLOCK(pos);
+	return block < self->word_alloc &&
+		(self->words[block] & MASK(pos)) != 0;
+}
+
+struct ewah_bitmap *bitmap_to_ewah(struct bitmap *bitmap)
+{
+	struct ewah_bitmap *ewah = ewah_new();
+	size_t i, running_empty_words = 0;
+	eword_t last_word = 0;
+
+	for (i = 0; i < bitmap->word_alloc; ++i) {
+		if (bitmap->words[i] == 0) {
+			running_empty_words++;
+			continue;
+		}
+
+		if (last_word != 0)
+			ewah_add(ewah, last_word);
+
+		if (running_empty_words > 0) {
+			ewah_add_empty_words(ewah, 0, running_empty_words);
+			running_empty_words = 0;
+		}
+
+		last_word = bitmap->words[i];
+	}
+
+	ewah_add(ewah, last_word);
+	return ewah;
+}
+
+struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
+{
+	struct bitmap *bitmap = bitmap_new();
+	struct ewah_iterator it;
+	eword_t blowup;
+	size_t i = 0;
+
+	ewah_iterator_init(&it, ewah);
+
+	while (ewah_iterator_next(&blowup, &it)) {
+		if (i >= bitmap->word_alloc) {
+			bitmap->word_alloc *= 1.5;
+			bitmap->words = ewah_realloc(
+				bitmap->words, bitmap->word_alloc * sizeof(eword_t));
+		}
+
+		bitmap->words[i++] = blowup;
+	}
+
+	bitmap->word_alloc = i;
+	return bitmap;
+}
+
+void bitmap_and_not(struct bitmap *self, struct bitmap *other)
+{
+	const size_t count = (self->word_alloc < other->word_alloc) ?
+		self->word_alloc : other->word_alloc;
+
+	size_t i;
+
+	for (i = 0; i < count; ++i)
+		self->words[i] &= ~other->words[i];
+}
+
+void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other)
+{
+	size_t original_size = self->word_alloc;
+	size_t other_final = (other->bit_size / BITS_IN_WORD) + 1;
+	size_t i = 0;
+	struct ewah_iterator it;
+	eword_t word;
+
+	if (self->word_alloc < other_final) {
+		self->word_alloc = other_final;
+		self->words = ewah_realloc(self->words,
+			self->word_alloc * sizeof(eword_t));
+		memset(self->words + original_size, 0x0,
+			(self->word_alloc - original_size) * sizeof(eword_t));
+	}
+
+	ewah_iterator_init(&it, other);
+
+	while (ewah_iterator_next(&word, &it))
+		self->words[i++] |= word;
+}
+
+void bitmap_each_bit(struct bitmap *self, ewah_callback callback, void *data)
+{
+	size_t pos = 0, i;
+
+	for (i = 0; i < self->word_alloc; ++i) {
+		eword_t word = self->words[i];
+		uint32_t offset;
+
+		if (word == (eword_t)~0) {
+			for (offset = 0; offset < BITS_IN_WORD; ++offset)
+				callback(pos++, data);
+		} else {
+			for (offset = 0; offset < BITS_IN_WORD; ++offset) {
+				if ((word >> offset) == 0)
+					break;
+
+				offset += ewah_bit_ctz64(word >> offset);
+				callback(pos + offset, data);
+			}
+			pos += BITS_IN_WORD;
+		}
+	}
+}
+
+size_t bitmap_popcount(struct bitmap *self)
+{
+	size_t i, count = 0;
+
+	for (i = 0; i < self->word_alloc; ++i)
+		count += ewah_bit_popcount64(self->words[i]);
+
+	return count;
+}
+
+int bitmap_equals(struct bitmap *self, struct bitmap *other)
+{
+	struct bitmap *big, *small;
+	size_t i;
+
+	if (self->word_alloc < other->word_alloc) {
+		small = self;
+		big = other;
+	} else {
+		small = other;
+		big = self;
+	}
+
+	for (i = 0; i < small->word_alloc; ++i) {
+		if (small->words[i] != big->words[i])
+			return 0;
+	}
+
+	for (; i < big->word_alloc; ++i) {
+		if (big->words[i] != 0)
+			return 0;
+	}
+
+	return 1;
+}
+
+void bitmap_reset(struct bitmap *bitmap)
+{
+	memset(bitmap->words, 0x0, bitmap->word_alloc * sizeof(eword_t));
+}
+
+void bitmap_free(struct bitmap *bitmap)
+{
+	if (bitmap == NULL)
+		return;
+
+	free(bitmap->words);
+	free(bitmap);
+}
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@ -0,0 +1,714 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#include "git-compat-util.h"
+#include "ewok.h"
+#include "ewok_rlw.h"
+
+static inline size_t min_size(size_t a, size_t b)
+{
+	return a < b ? a : b;
+}
+
+static inline size_t max_size(size_t a, size_t b)
+{
+	return a > b ? a : b;
+}
+
+static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
+{
+	size_t rlw_offset = (uint8_t *)self->rlw - (uint8_t *)self->buffer;
+
+	if (self->alloc_size >= new_size)
+		return;
+
+	self->alloc_size = new_size;
+	self->buffer = ewah_realloc(self->buffer,
+		self->alloc_size * sizeof(eword_t));
+	self->rlw = self->buffer + (rlw_offset / sizeof(size_t));
+}
+
+static inline void buffer_push(struct ewah_bitmap *self, eword_t value)
+{
+	if (self->buffer_size + 1 >= self->alloc_size)
+		buffer_grow(self, self->buffer_size * 3 / 2);
+
+	self->buffer[self->buffer_size++] = value;
+}
+
+static void buffer_push_rlw(struct ewah_bitmap *self, eword_t value)
+{
+	buffer_push(self, value);
+	self->rlw = self->buffer + self->buffer_size - 1;
+}
+
+static size_t add_empty_words(struct ewah_bitmap *self, int v, size_t number)
+{
+	size_t added = 0;
+	eword_t runlen, can_add;
+
+	if (rlw_get_run_bit(self->rlw) != v && rlw_size(self->rlw) == 0) {
+		rlw_set_run_bit(self->rlw, v);
+	} else if (rlw_get_literal_words(self->rlw) != 0 ||
+			rlw_get_run_bit(self->rlw) != v) {
+		buffer_push_rlw(self, 0);
+		if (v) rlw_set_run_bit(self->rlw, v);
+		added++;
+	}
+
+	runlen = rlw_get_running_len(self->rlw);
+	can_add = min_size(number, RLW_LARGEST_RUNNING_COUNT - runlen);
+
+	rlw_set_running_len(self->rlw, runlen + can_add);
+	number -= can_add;
+
+	while (number >= RLW_LARGEST_RUNNING_COUNT) {
+		buffer_push_rlw(self, 0);
+		added++;
+		if (v) rlw_set_run_bit(self->rlw, v);
+		rlw_set_running_len(self->rlw, RLW_LARGEST_RUNNING_COUNT);
+		number -= RLW_LARGEST_RUNNING_COUNT;
+	}
+
+	if (number > 0) {
+		buffer_push_rlw(self, 0);
+		added++;
+
+		if (v) rlw_set_run_bit(self->rlw, v);
+		rlw_set_running_len(self->rlw, number);
+	}
+
+	return added;
+}
+
+size_t ewah_add_empty_words(struct ewah_bitmap *self, int v, size_t number)
+{
+	if (number == 0)
+		return 0;
+
+	self->bit_size += number * BITS_IN_WORD;
+	return add_empty_words(self, v, number);
+}
+
+static size_t add_literal(struct ewah_bitmap *self, eword_t new_data)
+{
+	eword_t current_num = rlw_get_literal_words(self->rlw);
+
+	if (current_num >= RLW_LARGEST_LITERAL_COUNT) {
+		buffer_push_rlw(self, 0);
+
+		rlw_set_literal_words(self->rlw, 1);
+		buffer_push(self, new_data);
+		return 2;
+	}
+
+	rlw_set_literal_words(self->rlw, current_num + 1);
+
+	/* sanity check */
+	assert(rlw_get_literal_words(self->rlw) == current_num + 1);
+
+	buffer_push(self, new_data);
+	return 1;
+}
+
+void ewah_add_dirty_words(
+	struct ewah_bitmap *self, const eword_t *buffer,
+	size_t number, int negate)
+{
+	size_t literals, can_add;
+
+	while (1) {
+		literals = rlw_get_literal_words(self->rlw);
+		can_add = min_size(number, RLW_LARGEST_LITERAL_COUNT - literals);
+
+		rlw_set_literal_words(self->rlw, literals + can_add);
+
+		if (self->buffer_size + can_add >= self->alloc_size)
+			buffer_grow(self, (self->buffer_size + can_add) * 3 / 2);
+
+		if (negate) {
+			size_t i;
+			for (i = 0; i < can_add; ++i)
+				self->buffer[self->buffer_size++] = ~buffer[i];
+		} else {
+			memcpy(self->buffer + self->buffer_size,
+				buffer, can_add * sizeof(eword_t));
+			self->buffer_size += can_add;
+		}
+
+		self->bit_size += can_add * BITS_IN_WORD;
+
+		if (number - can_add == 0)
+			break;
+
+		buffer_push_rlw(self, 0);
+		buffer += can_add;
+		number -= can_add;
+	}
+}
+
+static size_t add_empty_word(struct ewah_bitmap *self, int v)
+{
+	int no_literal = (rlw_get_literal_words(self->rlw) == 0);
+	eword_t run_len = rlw_get_running_len(self->rlw);
+
+	if (no_literal && run_len == 0) {
+		rlw_set_run_bit(self->rlw, v);
+		assert(rlw_get_run_bit(self->rlw) == v);
+	}
+
+	if (no_literal && rlw_get_run_bit(self->rlw) == v &&
+		run_len < RLW_LARGEST_RUNNING_COUNT) {
+		rlw_set_running_len(self->rlw, run_len + 1);
+		assert(rlw_get_running_len(self->rlw) == run_len + 1);
+		return 0;
+	} else {
+		buffer_push_rlw(self, 0);
+
+		assert(rlw_get_running_len(self->rlw) == 0);
+		assert(rlw_get_run_bit(self->rlw) == 0);
+		assert(rlw_get_literal_words(self->rlw) == 0);
+
+		rlw_set_run_bit(self->rlw, v);
+		assert(rlw_get_run_bit(self->rlw) == v);
+
+		rlw_set_running_len(self->rlw, 1);
+		assert(rlw_get_running_len(self->rlw) == 1);
+		assert(rlw_get_literal_words(self->rlw) == 0);
+		return 1;
+	}
+}
+
+size_t ewah_add(struct ewah_bitmap *self, eword_t word)
+{
+	self->bit_size += BITS_IN_WORD;
+
+	if (word == 0)
+		return add_empty_word(self, 0);
+
+	if (word == (eword_t)(~0))
+		return add_empty_word(self, 1);
+
+	return add_literal(self, word);
+}
+
+void ewah_set(struct ewah_bitmap *self, size_t i)
+{
+	const size_t dist =
+		(i + BITS_IN_WORD) / BITS_IN_WORD -
+		(self->bit_size + BITS_IN_WORD - 1) / BITS_IN_WORD;
+
+	assert(i >= self->bit_size);
+
+	self->bit_size = i + 1;
+
+	if (dist > 0) {
+		if (dist > 1)
+			add_empty_words(self, 0, dist - 1);
+
+		add_literal(self, (eword_t)1 << (i % BITS_IN_WORD));
+		return;
+	}
+
+	if (rlw_get_literal_words(self->rlw) == 0) {
+		rlw_set_running_len(self->rlw,
+			rlw_get_running_len(self->rlw) - 1);
+		add_literal(self, (eword_t)1 << (i % BITS_IN_WORD));
+		return;
+	}
+
+	self->buffer[self->buffer_size - 1] |=
+		((eword_t)1 << (i % BITS_IN_WORD));
+
+	/* check if we just completed a stream of 1s */
+	if (self->buffer[self->buffer_size - 1] == (eword_t)(~0)) {
+		self->buffer[--self->buffer_size] = 0;
+		rlw_set_literal_words(self->rlw,
+			rlw_get_literal_words(self->rlw) - 1);
+		add_empty_word(self, 1);
+	}
+}
+
+void ewah_each_bit(struct ewah_bitmap *self, void (*callback)(size_t, void*), void *payload)
+{
+	size_t pos = 0;
+	size_t pointer = 0;
+	size_t k;
+
+	while (pointer < self->buffer_size) {
+		eword_t *word = &self->buffer[pointer];
+
+		if (rlw_get_run_bit(word)) {
+			size_t len = rlw_get_running_len(word) * BITS_IN_WORD;
+			for (k = 0; k < len; ++k, ++pos)
+				callback(pos, payload);
+		} else {
+			pos += rlw_get_running_len(word) * BITS_IN_WORD;
+		}
+
+		++pointer;
+
+		for (k = 0; k < rlw_get_literal_words(word); ++k) {
+			int c;
+
+			/* todo: zero count optimization */
+			for (c = 0; c < BITS_IN_WORD; ++c, ++pos) {
+				if ((self->buffer[pointer] & ((eword_t)1 << c)) != 0)
+					callback(pos, payload);
+			}
+
+			++pointer;
+		}
+	}
+}
+
+struct ewah_bitmap *ewah_new(void)
+{
+	struct ewah_bitmap *self;
+
+	self = ewah_malloc(sizeof(struct ewah_bitmap));
+	if (self == NULL)
+		return NULL;
+
+	self->buffer = ewah_malloc(32 * sizeof(eword_t));
+	self->alloc_size = 32;
+
+	ewah_clear(self);
+	return self;
+}
+
+void ewah_clear(struct ewah_bitmap *self)
+{
+	self->buffer_size = 1;
+	self->buffer[0] = 0;
+	self->bit_size = 0;
+	self->rlw = self->buffer;
+}
+
+void ewah_free(struct ewah_bitmap *self)
+{
+	if (!self)
+		return;
+
+	if (self->alloc_size)
+		free(self->buffer);
+
+	free(self);
+}
+
+static void read_new_rlw(struct ewah_iterator *it)
+{
+	const eword_t *word = NULL;
+
+	it->literals = 0;
+	it->compressed = 0;
+
+	while (1) {
+		word = &it->buffer[it->pointer];
+
+		it->rl = rlw_get_running_len(word);
+		it->lw = rlw_get_literal_words(word);
+		it->b = rlw_get_run_bit(word);
+
+		if (it->rl || it->lw)
+			return;
+
+		if (it->pointer < it->buffer_size - 1) {
+			it->pointer++;
+		} else {
+			it->pointer = it->buffer_size;
+			return;
+		}
+	}
+}
+
+int ewah_iterator_next(eword_t *next, struct ewah_iterator *it)
+{
+	if (it->pointer >= it->buffer_size)
+		return 0;
+
+	if (it->compressed < it->rl) {
+		it->compressed++;
+		*next = it->b ? (eword_t)(~0) : 0;
+	} else {
+		assert(it->literals < it->lw);
+
+		it->literals++;
+		it->pointer++;
+
+		assert(it->pointer < it->buffer_size);
+
+		*next = it->buffer[it->pointer];
+	}
+
+	if (it->compressed == it->rl && it->literals == it->lw) {
+		if (++it->pointer < it->buffer_size)
+			read_new_rlw(it);
+	}
+
+	return 1;
+}
+
+void ewah_iterator_init(struct ewah_iterator *it, struct ewah_bitmap *parent)
+{
+	it->buffer = parent->buffer;
+	it->buffer_size = parent->buffer_size;
+	it->pointer = 0;
+
+	it->lw = 0;
+	it->rl = 0;
+	it->compressed = 0;
+	it->literals = 0;
+	it->b = 0;
+
+	if (it->pointer < it->buffer_size)
+		read_new_rlw(it);
+}
+
+void ewah_not(struct ewah_bitmap *self)
+{
+	size_t pointer = 0;
+
+	while (pointer < self->buffer_size) {
+		eword_t *word = &self->buffer[pointer];
+		size_t literals, k;
+
+		rlw_xor_run_bit(word);
+		++pointer;
+
+		literals = rlw_get_literal_words(word);
+		for (k = 0; k < literals; ++k) {
+			self->buffer[pointer] = ~self->buffer[pointer];
+			++pointer;
+		}
+	}
+}
+
+void ewah_xor(
+	struct ewah_bitmap *ewah_i,
+	struct ewah_bitmap *ewah_j,
+	struct ewah_bitmap *out)
+{
+	struct rlw_iterator rlw_i;
+	struct rlw_iterator rlw_j;
+	size_t literals;
+
+	rlwit_init(&rlw_i, ewah_i);
+	rlwit_init(&rlw_j, ewah_j);
+
+	while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
+		while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
+			struct rlw_iterator *prey, *predator;
+			size_t index;
+			int negate_words;
+
+			if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
+				prey = &rlw_i;
+				predator = &rlw_j;
+			} else {
+				prey = &rlw_j;
+				predator = &rlw_i;
+			}
+
+			negate_words = !!predator->rlw.running_bit;
+			index = rlwit_discharge(prey, out,
+				predator->rlw.running_len, negate_words);
+
+			ewah_add_empty_words(out, negate_words,
+				predator->rlw.running_len - index);
+
+			rlwit_discard_first_words(predator,
+				predator->rlw.running_len);
+		}
+
+		literals = min_size(
+			rlw_i.rlw.literal_words,
+			rlw_j.rlw.literal_words);
+
+		if (literals) {
+			size_t k;
+
+			for (k = 0; k < literals; ++k) {
+				ewah_add(out,
+					rlw_i.buffer[rlw_i.literal_word_start + k] ^
+					rlw_j.buffer[rlw_j.literal_word_start + k]
+				);
+			}
+
+			rlwit_discard_first_words(&rlw_i, literals);
+			rlwit_discard_first_words(&rlw_j, literals);
+		}
+	}
+
+	if (rlwit_word_size(&rlw_i) > 0)
+		rlwit_discharge(&rlw_i, out, ~0, 0);
+	else
+		rlwit_discharge(&rlw_j, out, ~0, 0);
+
+	out->bit_size = max_size(ewah_i->bit_size, ewah_j->bit_size);
+}
+
+void ewah_and(
+	struct ewah_bitmap *ewah_i,
+	struct ewah_bitmap *ewah_j,
+	struct ewah_bitmap *out)
+{
+	struct rlw_iterator rlw_i;
+	struct rlw_iterator rlw_j;
+	size_t literals;
+
+	rlwit_init(&rlw_i, ewah_i);
+	rlwit_init(&rlw_j, ewah_j);
+
+	while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
+		while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
+			struct rlw_iterator *prey, *predator;
+
+			if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
+				prey = &rlw_i;
+				predator = &rlw_j;
+			} else {
+				prey = &rlw_j;
+				predator = &rlw_i;
+			}
+
+			if (predator->rlw.running_bit == 0) {
+				ewah_add_empty_words(out, 0,
+					predator->rlw.running_len);
+				rlwit_discard_first_words(prey,
+					predator->rlw.running_len);
+				rlwit_discard_first_words(predator,
+					predator->rlw.running_len);
+			} else {
+				size_t index = rlwit_discharge(prey, out,
+					predator->rlw.running_len, 0);
+				ewah_add_empty_words(out, 0,
+					predator->rlw.running_len - index);
+				rlwit_discard_first_words(predator,
+					predator->rlw.running_len);
+			}
+		}
+
+		literals = min_size(
+			rlw_i.rlw.literal_words,
+			rlw_j.rlw.literal_words);
+
+		if (literals) {
+			size_t k;
+
+			for (k = 0; k < literals; ++k) {
+				ewah_add(out,
+					rlw_i.buffer[rlw_i.literal_word_start + k] &
+					rlw_j.buffer[rlw_j.literal_word_start + k]
+				);
+			}
+
+			rlwit_discard_first_words(&rlw_i, literals);
+			rlwit_discard_first_words(&rlw_j, literals);
+		}
+	}
+
+	if (rlwit_word_size(&rlw_i) > 0)
+		rlwit_discharge_empty(&rlw_i, out);
+	else
+		rlwit_discharge_empty(&rlw_j, out);
+
+	out->bit_size = max_size(ewah_i->bit_size, ewah_j->bit_size);
+}
+
+void ewah_and_not(
+	struct ewah_bitmap *ewah_i,
+	struct ewah_bitmap *ewah_j,
+	struct ewah_bitmap *out)
+{
+	struct rlw_iterator rlw_i;
+	struct rlw_iterator rlw_j;
+	size_t literals;
+
+	rlwit_init(&rlw_i, ewah_i);
+	rlwit_init(&rlw_j, ewah_j);
+
+	while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
+		while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
+			struct rlw_iterator *prey, *predator;
+
+			if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
+				prey = &rlw_i;
+				predator = &rlw_j;
+			} else {
+				prey = &rlw_j;
+				predator = &rlw_i;
+			}
+
+			if ((predator->rlw.running_bit && prey == &rlw_i) ||
+				(!predator->rlw.running_bit && prey != &rlw_i)) {
+				ewah_add_empty_words(out, 0,
+					predator->rlw.running_len);
+				rlwit_discard_first_words(prey,
+					predator->rlw.running_len);
+				rlwit_discard_first_words(predator,
+					predator->rlw.running_len);
+			} else {
+				size_t index;
+				int negate_words;
+
+				negate_words = (&rlw_i != prey);
+				index = rlwit_discharge(prey, out,
+					predator->rlw.running_len, negate_words);
+				ewah_add_empty_words(out, negate_words,
+					predator->rlw.running_len - index);
+				rlwit_discard_first_words(predator,
+					predator->rlw.running_len);
+			}
+		}
+
+		literals = min_size(
+			rlw_i.rlw.literal_words,
+			rlw_j.rlw.literal_words);
+
+		if (literals) {
+			size_t k;
+
+			for (k = 0; k < literals; ++k) {
+				ewah_add(out,
+					rlw_i.buffer[rlw_i.literal_word_start + k] &
+					~(rlw_j.buffer[rlw_j.literal_word_start + k])
+				);
+			}
+
+			rlwit_discard_first_words(&rlw_i, literals);
+			rlwit_discard_first_words(&rlw_j, literals);
+		}
+	}
+
+	if (rlwit_word_size(&rlw_i) > 0)
+		rlwit_discharge(&rlw_i, out, ~0, 0);
+	else
+		rlwit_discharge_empty(&rlw_j, out);
+
+	out->bit_size = max_size(ewah_i->bit_size, ewah_j->bit_size);
+}
+
+void ewah_or(
+	struct ewah_bitmap *ewah_i,
+	struct ewah_bitmap *ewah_j,
+	struct ewah_bitmap *out)
+{
+	struct rlw_iterator rlw_i;
+	struct rlw_iterator rlw_j;
+	size_t literals;
+
+	rlwit_init(&rlw_i, ewah_i);
+	rlwit_init(&rlw_j, ewah_j);
+
+	while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
+		while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
+			struct rlw_iterator *prey, *predator;
+
+			if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
+				prey = &rlw_i;
+				predator = &rlw_j;
+			} else {
+				prey = &rlw_j;
+				predator = &rlw_i;
+			}
+
+			if (predator->rlw.running_bit) {
+				ewah_add_empty_words(out, 0,
+					predator->rlw.running_len);
+				rlwit_discard_first_words(prey,
+					predator->rlw.running_len);
+				rlwit_discard_first_words(predator,
+					predator->rlw.running_len);
+			} else {
+				size_t index = rlwit_discharge(prey, out,
+					predator->rlw.running_len, 0);
+				ewah_add_empty_words(out, 0,
+					predator->rlw.running_len - index);
+				rlwit_discard_first_words(predator,
+					predator->rlw.running_len);
+			}
+		}
+
+		literals = min_size(
+			rlw_i.rlw.literal_words,
+			rlw_j.rlw.literal_words);
+
+		if (literals) {
+			size_t k;
+
+			for (k = 0; k < literals; ++k) {
+				ewah_add(out,
+					rlw_i.buffer[rlw_i.literal_word_start + k] |
+					rlw_j.buffer[rlw_j.literal_word_start + k]
+				);
+			}
+
+			rlwit_discard_first_words(&rlw_i, literals);
+			rlwit_discard_first_words(&rlw_j, literals);
+		}
+	}
+
+	if (rlwit_word_size(&rlw_i) > 0)
+		rlwit_discharge(&rlw_i, out, ~0, 0);
+	else
+		rlwit_discharge(&rlw_j, out, ~0, 0);
+
+	out->bit_size = max_size(ewah_i->bit_size, ewah_j->bit_size);
+}
+
+
+#define BITMAP_POOL_MAX 16
+static struct ewah_bitmap *bitmap_pool[BITMAP_POOL_MAX];
+static size_t bitmap_pool_size;
+
+struct ewah_bitmap *ewah_pool_new(void)
+{
+	if (bitmap_pool_size)
+		return bitmap_pool[--bitmap_pool_size];
+
+	return ewah_new();
+}
+
+void ewah_pool_free(struct ewah_bitmap *self)
+{
+	if (self == NULL)
+		return;
+
+	if (bitmap_pool_size == BITMAP_POOL_MAX ||
+		self->alloc_size == 0) {
+		ewah_free(self);
+		return;
+	}
+
+	ewah_clear(self);
+	bitmap_pool[bitmap_pool_size++] = self;
+}
+
+uint32_t ewah_checksum(struct ewah_bitmap *self)
+{
+	const uint8_t *p = (uint8_t *)self->buffer;
+	uint32_t crc = (uint32_t)self->bit_size;
+	size_t size = self->buffer_size * sizeof(eword_t);
+
+	while (size--)
+		crc = (crc << 5) - crc + (uint32_t)*p++;
+
+	return crc;
+}
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@ -0,0 +1,204 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#include "git-compat-util.h"
+#include "ewok.h"
+
+int ewah_serialize_native(struct ewah_bitmap *self, int fd)
+{
+	uint32_t write32;
+	size_t to_write = self->buffer_size * 8;
+
+	/* 32 bit -- bit size for the map */
+	write32 = (uint32_t)self->bit_size;
+	if (write(fd, &write32, 4) != 4)
+		return -1;
+
+	/** 32 bit -- number of compressed 64-bit words */
+	write32 = (uint32_t)self->buffer_size;
+	if (write(fd, &write32, 4) != 4)
+		return -1;
+
+	if (write(fd, self->buffer, to_write) != to_write)
+		return -1;
+
+	/** 32 bit -- position for the RLW */
+	write32 = self->rlw - self->buffer;
+	if (write(fd, &write32, 4) != 4)
+		return -1;
+
+	return (3 * 4) + to_write;
+}
+
+int ewah_serialize_to(struct ewah_bitmap *self,
+		      int (*write_fun)(void *, const void *, size_t),
+		      void *data)
+{
+	size_t i;
+	eword_t dump[2048];
+	const size_t words_per_dump = sizeof(dump) / sizeof(eword_t);
+	uint32_t bitsize, word_count, rlw_pos;
+
+	const eword_t *buffer;
+	size_t words_left;
+
+	/* 32 bit -- bit size for the map */
+	bitsize =  htonl((uint32_t)self->bit_size);
+	if (write_fun(data, &bitsize, 4) != 4)
+		return -1;
+
+	/** 32 bit -- number of compressed 64-bit words */
+	word_count =  htonl((uint32_t)self->buffer_size);
+	if (write_fun(data, &word_count, 4) != 4)
+		return -1;
+
+	/** 64 bit x N -- compressed words */
+	buffer = self->buffer;
+	words_left = self->buffer_size;
+
+	while (words_left >= words_per_dump) {
+		for (i = 0; i < words_per_dump; ++i, ++buffer)
+			dump[i] = htonll(*buffer);
+
+		if (write_fun(data, dump, sizeof(dump)) != sizeof(dump))
+			return -1;
+
+		words_left -= words_per_dump;
+	}
+
+	if (words_left) {
+		for (i = 0; i < words_left; ++i, ++buffer)
+			dump[i] = htonll(*buffer);
+
+		if (write_fun(data, dump, words_left * 8) != words_left * 8)
+			return -1;
+	}
+
+	/** 32 bit -- position for the RLW */
+	rlw_pos = (uint8_t*)self->rlw - (uint8_t *)self->buffer;
+	rlw_pos = htonl(rlw_pos / sizeof(eword_t));
+
+	if (write_fun(data, &rlw_pos, 4) != 4)
+		return -1;
+
+	return (3 * 4) + (self->buffer_size * 8);
+}
+
+static int write_helper(void *fd, const void *buf, size_t len)
+{
+	return write((intptr_t)fd, buf, len);
+}
+
+int ewah_serialize(struct ewah_bitmap *self, int fd)
+{
+	return ewah_serialize_to(self, write_helper, (void *)(intptr_t)fd);
+}
+
+int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len)
+{
+	uint8_t *ptr = map;
+	size_t i;
+
+	self->bit_size = get_be32(ptr);
+	ptr += sizeof(uint32_t);
+
+	self->buffer_size = self->alloc_size = get_be32(ptr);
+	ptr += sizeof(uint32_t);
+
+	self->buffer = ewah_realloc(self->buffer,
+		self->alloc_size * sizeof(eword_t));
+
+	if (!self->buffer)
+		return -1;
+
+	/*
+	 * Copy the raw data for the bitmap as a whole chunk;
+	 * if we're in a little-endian platform, we'll perform
+	 * the endianness conversion in a separate pass to ensure
+	 * we're loading 8-byte aligned words.
+	 */
+	memcpy(self->buffer, ptr, self->buffer_size * sizeof(uint64_t));
+	ptr += self->buffer_size * sizeof(uint64_t);
+
+	for (i = 0; i < self->buffer_size; ++i)
+		self->buffer[i] = ntohll(self->buffer[i]);
+
+	self->rlw = self->buffer + get_be32(ptr);
+
+	return (3 * 4) + (self->buffer_size * 8);
+}
+
+int ewah_deserialize(struct ewah_bitmap *self, int fd)
+{
+	size_t i;
+	eword_t dump[2048];
+	const size_t words_per_dump = sizeof(dump) / sizeof(eword_t);
+	uint32_t bitsize, word_count, rlw_pos;
+
+	eword_t *buffer = NULL;
+	size_t words_left;
+
+	ewah_clear(self);
+
+	/* 32 bit -- bit size for the map */
+	if (read(fd, &bitsize, 4) != 4)
+		return -1;
+
+	self->bit_size = (size_t)ntohl(bitsize);
+
+	/** 32 bit -- number of compressed 64-bit words */
+	if (read(fd, &word_count, 4) != 4)
+		return -1;
+
+	self->buffer_size = self->alloc_size = (size_t)ntohl(word_count);
+	self->buffer = ewah_realloc(self->buffer,
+		self->alloc_size * sizeof(eword_t));
+
+	if (!self->buffer)
+		return -1;
+
+	/** 64 bit x N -- compressed words */
+	buffer = self->buffer;
+	words_left = self->buffer_size;
+
+	while (words_left >= words_per_dump) {
+		if (read(fd, dump, sizeof(dump)) != sizeof(dump))
+			return -1;
+
+		for (i = 0; i < words_per_dump; ++i, ++buffer)
+			*buffer = ntohll(dump[i]);
+
+		words_left -= words_per_dump;
+	}
+
+	if (words_left) {
+		if (read(fd, dump, words_left * 8) != words_left * 8)
+			return -1;
+
+		for (i = 0; i < words_left; ++i, ++buffer)
+			*buffer = ntohll(dump[i]);
+	}
+
+	/** 32 bit -- position for the RLW */
+	if (read(fd, &rlw_pos, 4) != 4)
+		return -1;
+
+	self->rlw = self->buffer + ntohl(rlw_pos);
+	return 0;
+}
--- a/ewah/ewah_rlw.c
+++ b/ewah/ewah_rlw.c
@ -0,0 +1,115 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#include "git-compat-util.h"
+#include "ewok.h"
+#include "ewok_rlw.h"
+
+static inline int next_word(struct rlw_iterator *it)
+{
+	if (it->pointer >= it->size)
+		return 0;
+
+	it->rlw.word = &it->buffer[it->pointer];
+	it->pointer += rlw_get_literal_words(it->rlw.word) + 1;
+
+	it->rlw.literal_words = rlw_get_literal_words(it->rlw.word);
+	it->rlw.running_len = rlw_get_running_len(it->rlw.word);
+	it->rlw.running_bit = rlw_get_run_bit(it->rlw.word);
+	it->rlw.literal_word_offset = 0;
+
+	return 1;
+}
+
+void rlwit_init(struct rlw_iterator *it, struct ewah_bitmap *from_ewah)
+{
+	it->buffer = from_ewah->buffer;
+	it->size = from_ewah->buffer_size;
+	it->pointer = 0;
+
+	next_word(it);
+
+	it->literal_word_start = rlwit_literal_words(it) +
+		it->rlw.literal_word_offset;
+}
+
+void rlwit_discard_first_words(struct rlw_iterator *it, size_t x)
+{
+	while (x > 0) {
+		size_t discard;
+
+		if (it->rlw.running_len > x) {
+			it->rlw.running_len -= x;
+			return;
+		}
+
+		x -= it->rlw.running_len;
+		it->rlw.running_len = 0;
+
+		discard = (x > it->rlw.literal_words) ? it->rlw.literal_words : x;
+
+		it->literal_word_start += discard;
+		it->rlw.literal_words -= discard;
+		x -= discard;
+
+		if (x > 0 || rlwit_word_size(it) == 0) {
+			if (!next_word(it))
+				break;
+
+			it->literal_word_start =
+				rlwit_literal_words(it) + it->rlw.literal_word_offset;
+		}
+	}
+}
+
+size_t rlwit_discharge(
+	struct rlw_iterator *it, struct ewah_bitmap *out, size_t max, int negate)
+{
+	size_t index = 0;
+
+	while (index < max && rlwit_word_size(it) > 0) {
+		size_t pd, pl = it->rlw.running_len;
+
+		if (index + pl > max)
+			pl = max - index;
+
+		ewah_add_empty_words(out, it->rlw.running_bit ^ negate, pl);
+		index += pl;
+
+		pd = it->rlw.literal_words;
+		if (pd + index > max)
+			pd = max - index;
+
+		ewah_add_dirty_words(out,
+			it->buffer + it->literal_word_start, pd, negate);
+
+		rlwit_discard_first_words(it, pd + pl);
+		index += pd;
+	}
+
+	return index;
+}
+
+void rlwit_discharge_empty(struct rlw_iterator *it, struct ewah_bitmap *out)
+{
+	while (rlwit_word_size(it) > 0) {
+		ewah_add_empty_words(out, 0, rlwit_word_size(it));
+		rlwit_discard_first_words(it, rlwit_word_size(it));
+	}
+}
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@ -0,0 +1,233 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#ifndef __EWOK_BITMAP_H__
+#define __EWOK_BITMAP_H__
+
+#ifndef ewah_malloc
+#	define ewah_malloc xmalloc
+#endif
+#ifndef ewah_realloc
+#	define ewah_realloc xrealloc
+#endif
+#ifndef ewah_calloc
+#	define ewah_calloc xcalloc
+#endif
+
+typedef uint64_t eword_t;
+#define BITS_IN_WORD (sizeof(eword_t) * 8)
+
+/**
+ * Do not use __builtin_popcountll. The GCC implementation
+ * is notoriously slow on all platforms.
+ *
+ * See: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36041
+ */
+static inline uint32_t ewah_bit_popcount64(uint64_t x)
+{
+	x = (x & 0x5555555555555555ULL) + ((x >>  1) & 0x5555555555555555ULL);
+	x = (x & 0x3333333333333333ULL) + ((x >>  2) & 0x3333333333333333ULL);
+	x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >>  4) & 0x0F0F0F0F0F0F0F0FULL);
+	return (x * 0x0101010101010101ULL) >> 56;
+}
+
+#ifdef __GNUC__
+#define ewah_bit_ctz64(x) __builtin_ctzll(x)
+#else
+static inline int ewah_bit_ctz64(uint64_t x)
+{
+	int n = 0;
+	if ((x & 0xffffffff) == 0) { x >>= 32; n += 32; }
+	if ((x &     0xffff) == 0) { x >>= 16; n += 16; }
+	if ((x &       0xff) == 0) { x >>=  8; n +=  8; }
+	if ((x &        0xf) == 0) { x >>=  4; n +=  4; }
+	if ((x &        0x3) == 0) { x >>=  2; n +=  2; }
+	if ((x &        0x1) == 0) { x >>=  1; n +=  1; }
+	return n + !x;
+}
+#endif
+
+struct ewah_bitmap {
+	eword_t *buffer;
+	size_t buffer_size;
+	size_t alloc_size;
+	size_t bit_size;
+	eword_t *rlw;
+};
+
+typedef void (*ewah_callback)(size_t pos, void *);
+
+struct ewah_bitmap *ewah_pool_new(void);
+void ewah_pool_free(struct ewah_bitmap *self);
+
+/**
+ * Allocate a new EWAH Compressed bitmap
+ */
+struct ewah_bitmap *ewah_new(void);
+
+/**
+ * Clear all the bits in the bitmap. Does not free or resize
+ * memory.
+ */
+void ewah_clear(struct ewah_bitmap *self);
+
+/**
+ * Free all the memory of the bitmap
+ */
+void ewah_free(struct ewah_bitmap *self);
+
+int ewah_serialize_to(struct ewah_bitmap *self,
+		      int (*write_fun)(void *out, const void *buf, size_t len),
+		      void *out);
+int ewah_serialize(struct ewah_bitmap *self, int fd);
+int ewah_serialize_native(struct ewah_bitmap *self, int fd);
+
+int ewah_deserialize(struct ewah_bitmap *self, int fd);
+int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len);
+int ewah_read_mmap_native(struct ewah_bitmap *self, void *map, size_t len);
+
+uint32_t ewah_checksum(struct ewah_bitmap *self);
+
+/**
+ * Logical not (bitwise negation) in-place on the bitmap
+ *
+ * This operation is linear time based on the size of the bitmap.
+ */
+void ewah_not(struct ewah_bitmap *self);
+
+/**
+ * Call the given callback with the position of every single bit
+ * that has been set on the bitmap.
+ *
+ * This is an efficient operation that does not fully decompress
+ * the bitmap.
+ */
+void ewah_each_bit(struct ewah_bitmap *self, ewah_callback callback, void *payload);
+
+/**
+ * Set a given bit on the bitmap.
+ *
+ * The bit at position `pos` will be set to true. Because of the
+ * way that the bitmap is compressed, a set bit cannot be unset
+ * later on.
+ *
+ * Furthermore, since the bitmap uses streaming compression, bits
+ * can only set incrementally.
+ *
+ * E.g.
+ *		ewah_set(bitmap, 1); // ok
+ *		ewah_set(bitmap, 76); // ok
+ *		ewah_set(bitmap, 77); // ok
+ *		ewah_set(bitmap, 8712800127); // ok
+ *		ewah_set(bitmap, 25); // failed, assert raised
+ */
+void ewah_set(struct ewah_bitmap *self, size_t i);
+
+struct ewah_iterator {
+	const eword_t *buffer;
+	size_t buffer_size;
+
+	size_t pointer;
+	eword_t compressed, literals;
+	eword_t rl, lw;
+	int b;
+};
+
+/**
+ * Initialize a new iterator to run through the bitmap in uncompressed form.
+ *
+ * The iterator can be stack allocated. The underlying bitmap must not be freed
+ * before the iteration is over.
+ *
+ * E.g.
+ *
+ *		struct ewah_bitmap *bitmap = ewah_new();
+ *		struct ewah_iterator it;
+ *
+ *		ewah_iterator_init(&it, bitmap);
+ */
+void ewah_iterator_init(struct ewah_iterator *it, struct ewah_bitmap *parent);
+
+/**
+ * Yield every single word in the bitmap in uncompressed form. This is:
+ * yield single words (32-64 bits) where each bit represents an actual
+ * bit from the bitmap.
+ *
+ * Return: true if a word was yield, false if there are no words left
+ */
+int ewah_iterator_next(eword_t *next, struct ewah_iterator *it);
+
+void ewah_or(
+	struct ewah_bitmap *ewah_i,
+	struct ewah_bitmap *ewah_j,
+	struct ewah_bitmap *out);
+
+void ewah_and_not(
+	struct ewah_bitmap *ewah_i,
+	struct ewah_bitmap *ewah_j,
+	struct ewah_bitmap *out);
+
+void ewah_xor(
+	struct ewah_bitmap *ewah_i,
+	struct ewah_bitmap *ewah_j,
+	struct ewah_bitmap *out);
+
+void ewah_and(
+	struct ewah_bitmap *ewah_i,
+	struct ewah_bitmap *ewah_j,
+	struct ewah_bitmap *out);
+
+/**
+ * Direct word access
+ */
+size_t ewah_add_empty_words(struct ewah_bitmap *self, int v, size_t number);
+void ewah_add_dirty_words(
+	struct ewah_bitmap *self, const eword_t *buffer, size_t number, int negate);
+size_t ewah_add(struct ewah_bitmap *self, eword_t word);
+
+
+/**
+ * Uncompressed, old-school bitmap that can be efficiently compressed
+ * into an `ewah_bitmap`.
+ */
+struct bitmap {
+	eword_t *words;
+	size_t word_alloc;
+};
+
+struct bitmap *bitmap_new(void);
+void bitmap_set(struct bitmap *self, size_t pos);
+void bitmap_clear(struct bitmap *self, size_t pos);
+int bitmap_get(struct bitmap *self, size_t pos);
+void bitmap_reset(struct bitmap *self);
+void bitmap_free(struct bitmap *self);
+int bitmap_equals(struct bitmap *self, struct bitmap *other);
+int bitmap_is_subset(struct bitmap *self, struct bitmap *super);
+
+struct ewah_bitmap * bitmap_to_ewah(struct bitmap *bitmap);
+struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah);
+
+void bitmap_and_not(struct bitmap *self, struct bitmap *other);
+void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other);
+void bitmap_or(struct bitmap *self, const struct bitmap *other);
+
+void bitmap_each_bit(struct bitmap *self, ewah_callback callback, void *data);
+size_t bitmap_popcount(struct bitmap *self);
+
+#endif
--- a/ewah/ewok_rlw.h
+++ b/ewah/ewok_rlw.h
@ -0,0 +1,114 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#ifndef __EWOK_RLW_H__
+#define __EWOK_RLW_H__
+
+#define RLW_RUNNING_BITS (sizeof(eword_t) * 4)
+#define RLW_LITERAL_BITS (sizeof(eword_t) * 8 - 1 - RLW_RUNNING_BITS)
+
+#define RLW_LARGEST_RUNNING_COUNT (((eword_t)1 << RLW_RUNNING_BITS) - 1)
+#define RLW_LARGEST_LITERAL_COUNT (((eword_t)1 << RLW_LITERAL_BITS) - 1)
+
+#define RLW_LARGEST_RUNNING_COUNT_SHIFT (RLW_LARGEST_RUNNING_COUNT << 1)
+
+#define RLW_RUNNING_LEN_PLUS_BIT (((eword_t)1 << (RLW_RUNNING_BITS + 1)) - 1)
+
+static int rlw_get_run_bit(const eword_t *word)
+{
+	return *word & (eword_t)1;
+}
+
+static inline void rlw_set_run_bit(eword_t *word, int b)
+{
+	if (b) {
+		*word |= (eword_t)1;
+	} else {
+		*word &= (eword_t)(~1);
+	}
+}
+
+static inline void rlw_xor_run_bit(eword_t *word)
+{
+	if (*word & 1) {
+		*word &= (eword_t)(~1);
+	} else {
+		*word |= (eword_t)1;
+	}
+}
+
+static inline void rlw_set_running_len(eword_t *word, eword_t l)
+{
+	*word |= RLW_LARGEST_RUNNING_COUNT_SHIFT;
+	*word &= (l << 1) | (~RLW_LARGEST_RUNNING_COUNT_SHIFT);
+}
+
+static inline eword_t rlw_get_running_len(const eword_t *word)
+{
+	return (*word >> 1) & RLW_LARGEST_RUNNING_COUNT;
+}
+
+static inline eword_t rlw_get_literal_words(const eword_t *word)
+{
+	return *word >> (1 + RLW_RUNNING_BITS);
+}
+
+static inline void rlw_set_literal_words(eword_t *word, eword_t l)
+{
+	*word |= ~RLW_RUNNING_LEN_PLUS_BIT;
+	*word &= (l << (RLW_RUNNING_BITS + 1)) | RLW_RUNNING_LEN_PLUS_BIT;
+}
+
+static inline eword_t rlw_size(const eword_t *self)
+{
+	return rlw_get_running_len(self) + rlw_get_literal_words(self);
+}
+
+struct rlw_iterator {
+	const eword_t *buffer;
+	size_t size;
+	size_t pointer;
+	size_t literal_word_start;
+
+	struct {
+		const eword_t *word;
+		int literal_words;
+		int running_len;
+		int literal_word_offset;
+		int running_bit;
+	} rlw;
+};
+
+void rlwit_init(struct rlw_iterator *it, struct ewah_bitmap *bitmap);
+void rlwit_discard_first_words(struct rlw_iterator *it, size_t x);
+size_t rlwit_discharge(
+	struct rlw_iterator *it, struct ewah_bitmap *out, size_t max, int negate);
+void rlwit_discharge_empty(struct rlw_iterator *it, struct ewah_bitmap *out);
+
+static inline size_t rlwit_word_size(struct rlw_iterator *it)
+{
+	return it->rlw.running_len + it->rlw.literal_words;
+}
+
+static inline size_t rlwit_literal_words(struct rlw_iterator *it)
+{
+	return it->pointer - it->rlw.literal_words;
+}
+
+#endif
--- a/khash.h
+++ b/khash.h
@ -0,0 +1,338 @@
+/* The MIT License
+
+   Copyright (c) 2008, 2009, 2011 by Attractive Chaos <attractor@live.co.uk>
+
+   Permission is hereby granted, free of charge, to any person obtaining
+   a copy of this software and associated documentation files (the
+   "Software"), to deal in the Software without restriction, including
+   without limitation the rights to use, copy, modify, merge, publish,
+   distribute, sublicense, and/or sell copies of the Software, and to
+   permit persons to whom the Software is furnished to do so, subject to
+   the following conditions:
+
+   The above copyright notice and this permission notice shall be
+   included in all copies or substantial portions of the Software.
+
+   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+   BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+   ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+   CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+   SOFTWARE.
+*/
+
+#ifndef __AC_KHASH_H
+#define __AC_KHASH_H
+
+#define AC_VERSION_KHASH_H "0.2.8"
+
+typedef uint32_t khint32_t;
+typedef uint64_t khint64_t;
+
+typedef khint32_t khint_t;
+typedef khint_t khiter_t;
+
+#define __ac_isempty(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&2)
+#define __ac_isdel(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&1)
+#define __ac_iseither(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&3)
+#define __ac_set_isdel_false(flag, i) (flag[i>>4]&=~(1ul<<((i&0xfU)<<1)))
+#define __ac_set_isempty_false(flag, i) (flag[i>>4]&=~(2ul<<((i&0xfU)<<1)))
+#define __ac_set_isboth_false(flag, i) (flag[i>>4]&=~(3ul<<((i&0xfU)<<1)))
+#define __ac_set_isdel_true(flag, i) (flag[i>>4]|=1ul<<((i&0xfU)<<1))
+
+#define __ac_fsize(m) ((m) < 16? 1 : (m)>>4)
+
+#define kroundup32(x) (--(x), (x)|=(x)>>1, (x)|=(x)>>2, (x)|=(x)>>4, (x)|=(x)>>8, (x)|=(x)>>16, ++(x))
+
+static inline khint_t __ac_X31_hash_string(const char *s)
+{
+	khint_t h = (khint_t)*s;
+	if (h) for (++s ; *s; ++s) h = (h << 5) - h + (khint_t)*s;
+	return h;
+}
+
+#define kh_str_hash_func(key) __ac_X31_hash_string(key)
+#define kh_str_hash_equal(a, b) (strcmp(a, b) == 0)
+
+static const double __ac_HASH_UPPER = 0.77;
+
+#define __KHASH_TYPE(name, khkey_t, khval_t) \
+	typedef struct { \
+		khint_t n_buckets, size, n_occupied, upper_bound; \
+		khint32_t *flags; \
+		khkey_t *keys; \
+		khval_t *vals; \
+	} kh_##name##_t;
+
+#define __KHASH_PROTOTYPES(name, khkey_t, khval_t)	 					\
+	extern kh_##name##_t *kh_init_##name(void);							\
+	extern void kh_destroy_##name(kh_##name##_t *h);					\
+	extern void kh_clear_##name(kh_##name##_t *h);						\
+	extern khint_t kh_get_##name(const kh_##name##_t *h, khkey_t key); 	\
+	extern int kh_resize_##name(kh_##name##_t *h, khint_t new_n_buckets); \
+	extern khint_t kh_put_##name(kh_##name##_t *h, khkey_t key, int *ret); \
+	extern void kh_del_##name(kh_##name##_t *h, khint_t x);
+
+#define __KHASH_IMPL(name, SCOPE, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal) \
+	SCOPE kh_##name##_t *kh_init_##name(void) {							\
+		return (kh_##name##_t*)xcalloc(1, sizeof(kh_##name##_t));		\
+	}																	\
+	SCOPE void kh_destroy_##name(kh_##name##_t *h)						\
+	{																	\
+		if (h) {														\
+			free((void *)h->keys); free(h->flags);					\
+			free((void *)h->vals);										\
+			free(h);													\
+		}																\
+	}																	\
+	SCOPE void kh_clear_##name(kh_##name##_t *h)						\
+	{																	\
+		if (h && h->flags) {											\
+			memset(h->flags, 0xaa, __ac_fsize(h->n_buckets) * sizeof(khint32_t)); \
+			h->size = h->n_occupied = 0;								\
+		}																\
+	}																	\
+	SCOPE khint_t kh_get_##name(const kh_##name##_t *h, khkey_t key) 	\
+	{																	\
+		if (h->n_buckets) {												\
+			khint_t k, i, last, mask, step = 0; \
+			mask = h->n_buckets - 1;									\
+			k = __hash_func(key); i = k & mask;							\
+			last = i; \
+			while (!__ac_isempty(h->flags, i) && (__ac_isdel(h->flags, i) || !__hash_equal(h->keys[i], key))) { \
+				i = (i + (++step)) & mask; \
+				if (i == last) return h->n_buckets;						\
+			}															\
+			return __ac_iseither(h->flags, i)? h->n_buckets : i;		\
+		} else return 0;												\
+	}																	\
+	SCOPE int kh_resize_##name(kh_##name##_t *h, khint_t new_n_buckets) \
+	{ /* This function uses 0.25*n_buckets bytes of working space instead of [sizeof(key_t+val_t)+.25]*n_buckets. */ \
+		khint32_t *new_flags = NULL;										\
+		khint_t j = 1;													\
+		{																\
+			kroundup32(new_n_buckets); 									\
+			if (new_n_buckets < 4) new_n_buckets = 4;					\
+			if (h->size >= (khint_t)(new_n_buckets * __ac_HASH_UPPER + 0.5)) j = 0;	/* requested size is too small */ \
+			else { /* hash table size to be changed (shrink or expand); rehash */ \
+				new_flags = (khint32_t*)xmalloc(__ac_fsize(new_n_buckets) * sizeof(khint32_t));	\
+				if (!new_flags) return -1;								\
+				memset(new_flags, 0xaa, __ac_fsize(new_n_buckets) * sizeof(khint32_t)); \
+				if (h->n_buckets < new_n_buckets) {	/* expand */		\
+					khkey_t *new_keys = (khkey_t*)xrealloc((void *)h->keys, new_n_buckets * sizeof(khkey_t)); \
+					if (!new_keys) return -1;							\
+					h->keys = new_keys;									\
+					if (kh_is_map) {									\
+						khval_t *new_vals = (khval_t*)xrealloc((void *)h->vals, new_n_buckets * sizeof(khval_t)); \
+						if (!new_vals) return -1;						\
+						h->vals = new_vals;								\
+					}													\
+				} /* otherwise shrink */								\
+			}															\
+		}																\
+		if (j) { /* rehashing is needed */								\
+			for (j = 0; j != h->n_buckets; ++j) {						\
+				if (__ac_iseither(h->flags, j) == 0) {					\
+					khkey_t key = h->keys[j];							\
+					khval_t val;										\
+					khint_t new_mask;									\
+					new_mask = new_n_buckets - 1; 						\
+					if (kh_is_map) val = h->vals[j];					\
+					__ac_set_isdel_true(h->flags, j);					\
+					while (1) { /* kick-out process; sort of like in Cuckoo hashing */ \
+						khint_t k, i, step = 0; \
+						k = __hash_func(key);							\
+						i = k & new_mask;								\
+						while (!__ac_isempty(new_flags, i)) i = (i + (++step)) & new_mask; \
+						__ac_set_isempty_false(new_flags, i);			\
+						if (i < h->n_buckets && __ac_iseither(h->flags, i) == 0) { /* kick out the existing element */ \
+							{ khkey_t tmp = h->keys[i]; h->keys[i] = key; key = tmp; } \
+							if (kh_is_map) { khval_t tmp = h->vals[i]; h->vals[i] = val; val = tmp; } \
+							__ac_set_isdel_true(h->flags, i); /* mark it as deleted in the old hash table */ \
+						} else { /* write the element and jump out of the loop */ \
+							h->keys[i] = key;							\
+							if (kh_is_map) h->vals[i] = val;			\
+							break;										\
+						}												\
+					}													\
+				}														\
+			}															\
+			if (h->n_buckets > new_n_buckets) { /* shrink the hash table */ \
+				h->keys = (khkey_t*)xrealloc((void *)h->keys, new_n_buckets * sizeof(khkey_t)); \
+				if (kh_is_map) h->vals = (khval_t*)xrealloc((void *)h->vals, new_n_buckets * sizeof(khval_t)); \
+			}															\
+			free(h->flags); /* free the working space */				\
+			h->flags = new_flags;										\
+			h->n_buckets = new_n_buckets;								\
+			h->n_occupied = h->size;									\
+			h->upper_bound = (khint_t)(h->n_buckets * __ac_HASH_UPPER + 0.5); \
+		}																\
+		return 0;														\
+	}																	\
+	SCOPE khint_t kh_put_##name(kh_##name##_t *h, khkey_t key, int *ret) \
+	{																	\
+		khint_t x;														\
+		if (h->n_occupied >= h->upper_bound) { /* update the hash table */ \
+			if (h->n_buckets > (h->size<<1)) {							\
+				if (kh_resize_##name(h, h->n_buckets - 1) < 0) { /* clear "deleted" elements */ \
+					*ret = -1; return h->n_buckets;						\
+				}														\
+			} else if (kh_resize_##name(h, h->n_buckets + 1) < 0) { /* expand the hash table */ \
+				*ret = -1; return h->n_buckets;							\
+			}															\
+		} /* TODO: to implement automatically shrinking; resize() already support shrinking */ \
+		{																\
+			khint_t k, i, site, last, mask = h->n_buckets - 1, step = 0; \
+			x = site = h->n_buckets; k = __hash_func(key); i = k & mask; \
+			if (__ac_isempty(h->flags, i)) x = i; /* for speed up */	\
+			else {														\
+				last = i; \
+				while (!__ac_isempty(h->flags, i) && (__ac_isdel(h->flags, i) || !__hash_equal(h->keys[i], key))) { \
+					if (__ac_isdel(h->flags, i)) site = i;				\
+					i = (i + (++step)) & mask; \
+					if (i == last) { x = site; break; }					\
+				}														\
+				if (x == h->n_buckets) {								\
+					if (__ac_isempty(h->flags, i) && site != h->n_buckets) x = site; \
+					else x = i;											\
+				}														\
+			}															\
+		}																\
+		if (__ac_isempty(h->flags, x)) { /* not present at all */		\
+			h->keys[x] = key;											\
+			__ac_set_isboth_false(h->flags, x);							\
+			++h->size; ++h->n_occupied;									\
+			*ret = 1;													\
+		} else if (__ac_isdel(h->flags, x)) { /* deleted */				\
+			h->keys[x] = key;											\
+			__ac_set_isboth_false(h->flags, x);							\
+			++h->size;													\
+			*ret = 2;													\
+		} else *ret = 0; /* Don't touch h->keys[x] if present and not deleted */ \
+		return x;														\
+	}																	\
+	SCOPE void kh_del_##name(kh_##name##_t *h, khint_t x)				\
+	{																	\
+		if (x != h->n_buckets && !__ac_iseither(h->flags, x)) {			\
+			__ac_set_isdel_true(h->flags, x);							\
+			--h->size;													\
+		}																\
+	}
+
+#define KHASH_DECLARE(name, khkey_t, khval_t)		 					\
+	__KHASH_TYPE(name, khkey_t, khval_t) 								\
+	__KHASH_PROTOTYPES(name, khkey_t, khval_t)
+
+#define KHASH_INIT2(name, SCOPE, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal) \
+	__KHASH_TYPE(name, khkey_t, khval_t) 								\
+	__KHASH_IMPL(name, SCOPE, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal)
+
+#define KHASH_INIT(name, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal) \
+	KHASH_INIT2(name, static inline, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal)
+
+/* Other convenient macros... */
+
+/*! @function
+  @abstract     Test whether a bucket contains data.
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @param  x     Iterator to the bucket [khint_t]
+  @return       1 if containing data; 0 otherwise [int]
+ */
+#define kh_exist(h, x) (!__ac_iseither((h)->flags, (x)))
+
+/*! @function
+  @abstract     Get key given an iterator
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @param  x     Iterator to the bucket [khint_t]
+  @return       Key [type of keys]
+ */
+#define kh_key(h, x) ((h)->keys[x])
+
+/*! @function
+  @abstract     Get value given an iterator
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @param  x     Iterator to the bucket [khint_t]
+  @return       Value [type of values]
+  @discussion   For hash sets, calling this results in segfault.
+ */
+#define kh_val(h, x) ((h)->vals[x])
+
+/*! @function
+  @abstract     Alias of kh_val()
+ */
+#define kh_value(h, x) ((h)->vals[x])
+
+/*! @function
+  @abstract     Get the start iterator
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @return       The start iterator [khint_t]
+ */
+#define kh_begin(h) (khint_t)(0)
+
+/*! @function
+  @abstract     Get the end iterator
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @return       The end iterator [khint_t]
+ */
+#define kh_end(h) ((h)->n_buckets)
+
+/*! @function
+  @abstract     Get the number of elements in the hash table
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @return       Number of elements in the hash table [khint_t]
+ */
+#define kh_size(h) ((h)->size)
+
+/*! @function
+  @abstract     Get the number of buckets in the hash table
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @return       Number of buckets in the hash table [khint_t]
+ */
+#define kh_n_buckets(h) ((h)->n_buckets)
+
+/*! @function
+  @abstract     Iterate over the entries in the hash table
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @param  kvar  Variable to which key will be assigned
+  @param  vvar  Variable to which value will be assigned
+  @param  code  Block of code to execute
+ */
+#define kh_foreach(h, kvar, vvar, code) { khint_t __i;		\
+	for (__i = kh_begin(h); __i != kh_end(h); ++__i) {		\
+		if (!kh_exist(h,__i)) continue;						\
+		(kvar) = kh_key(h,__i);								\
+		(vvar) = kh_val(h,__i);								\
+		code;												\
+	} }
+
+/*! @function
+  @abstract     Iterate over the values in the hash table
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @param  vvar  Variable to which value will be assigned
+  @param  code  Block of code to execute
+ */
+#define kh_foreach_value(h, vvar, code) { khint_t __i;		\
+	for (__i = kh_begin(h); __i != kh_end(h); ++__i) {		\
+		if (!kh_exist(h,__i)) continue;						\
+		(vvar) = kh_val(h,__i);								\
+		code;												\
+	} }
+
+static inline khint_t __kh_oid_hash(const unsigned char *oid)
+{
+	khint_t hash;
+	memcpy(&hash, oid, sizeof(hash));
+	return hash;
+}
+
+#define __kh_oid_cmp(a, b) (hashcmp(a, b) == 0)
+
+KHASH_INIT(sha1, const unsigned char *, void *, 1, __kh_oid_hash, __kh_oid_cmp)
+typedef kh_sha1_t khash_sha1;
+
+KHASH_INIT(sha1_pos, const unsigned char *, int, 1, __kh_oid_hash, __kh_oid_cmp)
+typedef kh_sha1_pos_t khash_sha1_pos;
+
+#endif /* __AC_KHASH_H */
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@ -0,0 +1,552 @@
+#include "cache.h"
+#include "commit.h"
+#include "tag.h"
+#include "diff.h"
+#include "revision.h"
+#include "list-objects.h"
+#include "progress.h"
+#include "pack-revindex.h"
+#include "pack.h"
+#include "pack-bitmap.h"
+#include "sha1-lookup.h"
+#include "pack-objects.h"
+
+struct bitmapped_commit {
+	struct commit *commit;
+	struct ewah_bitmap *bitmap;
+	struct ewah_bitmap *write_as;
+	int flags;
+	int xor_offset;
+	uint32_t commit_pos;
+};
+
+struct bitmap_writer {
+	struct ewah_bitmap *commits;
+	struct ewah_bitmap *trees;
+	struct ewah_bitmap *blobs;
+	struct ewah_bitmap *tags;
+
+	khash_sha1 *bitmaps;
+	khash_sha1 *reused;
+	struct packing_data *to_pack;
+
+	struct bitmapped_commit *selected;
+	unsigned int selected_nr, selected_alloc;
+
+	struct progress *progress;
+	int show_progress;
+	unsigned char pack_checksum[20];
+};
+
+static struct bitmap_writer writer;
+
+void bitmap_writer_show_progress(int show)
+{
+	writer.show_progress = show;
+}
+
+/**
+ * Build the initial type index for the packfile
+ */
+void bitmap_writer_build_type_index(struct pack_idx_entry **index,
+				    uint32_t index_nr)
+{
+	uint32_t i;
+
+	writer.commits = ewah_new();
+	writer.trees = ewah_new();
+	writer.blobs = ewah_new();
+	writer.tags = ewah_new();
+
+	for (i = 0; i < index_nr; ++i) {
+		struct object_entry *entry = (struct object_entry *)index[i];
+		enum object_type real_type;
+
+		entry->in_pack_pos = i;
+
+		switch (entry->type) {
+		case OBJ_COMMIT:
+		case OBJ_TREE:
+		case OBJ_BLOB:
+		case OBJ_TAG:
+			real_type = entry->type;
+			break;
+
+		default:
+			real_type = sha1_object_info(entry->idx.sha1, NULL);
+			break;
+		}
+
+		switch (real_type) {
+		case OBJ_COMMIT:
+			ewah_set(writer.commits, i);
+			break;
+
+		case OBJ_TREE:
+			ewah_set(writer.trees, i);
+			break;
+
+		case OBJ_BLOB:
+			ewah_set(writer.blobs, i);
+			break;
+
+		case OBJ_TAG:
+			ewah_set(writer.tags, i);
+			break;
+
+		default:
+			die("Missing type information for %s (%d/%d)",
+			    sha1_to_hex(entry->idx.sha1), real_type, entry->type);
+		}
+	}
+}
+
+/**
+ * Compute the actual bitmaps
+ */
+static struct object **seen_objects;
+static unsigned int seen_objects_nr, seen_objects_alloc;
+
+static inline void push_bitmapped_commit(struct commit *commit, struct ewah_bitmap *reused)
+{
+	if (writer.selected_nr >= writer.selected_alloc) {
+		writer.selected_alloc = (writer.selected_alloc + 32) * 2;
+		writer.selected = xrealloc(writer.selected,
+					   writer.selected_alloc * sizeof(struct bitmapped_commit));
+	}
+
+	writer.selected[writer.selected_nr].commit = commit;
+	writer.selected[writer.selected_nr].bitmap = reused;
+	writer.selected[writer.selected_nr].flags = 0;
+
+	writer.selected_nr++;
+}
+
+static inline void mark_as_seen(struct object *object)
+{
+	ALLOC_GROW(seen_objects, seen_objects_nr + 1, seen_objects_alloc);
+	seen_objects[seen_objects_nr++] = object;
+}
+
+static inline void reset_all_seen(void)
+{
+	unsigned int i;
+	for (i = 0; i < seen_objects_nr; ++i) {
+		seen_objects[i]->flags &= ~(SEEN | ADDED | SHOWN);
+	}
+	seen_objects_nr = 0;
+}
+
+static uint32_t find_object_pos(const unsigned char *sha1)
+{
+	struct object_entry *entry = packlist_find(writer.to_pack, sha1, NULL);
+
+	if (!entry) {
+		die("Failed to write bitmap index. Packfile doesn't have full closure "
+			"(object %s is missing)", sha1_to_hex(sha1));
+	}
+
+	return entry->in_pack_pos;
+}
+
+static void show_object(struct object *object, const struct name_path *path,
+			const char *last, void *data)
+{
+	struct bitmap *base = data;
+	bitmap_set(base, find_object_pos(object->sha1));
+	mark_as_seen(object);
+}
+
+static void show_commit(struct commit *commit, void *data)
+{
+	mark_as_seen((struct object *)commit);
+}
+
+static int
+add_to_include_set(struct bitmap *base, struct commit *commit)
+{
+	khiter_t hash_pos;
+	uint32_t bitmap_pos = find_object_pos(commit->object.sha1);
+
+	if (bitmap_get(base, bitmap_pos))
+		return 0;
+
+	hash_pos = kh_get_sha1(writer.bitmaps, commit->object.sha1);
+	if (hash_pos < kh_end(writer.bitmaps)) {
+		struct bitmapped_commit *bc = kh_value(writer.bitmaps, hash_pos);
+		bitmap_or_ewah(base, bc->bitmap);
+		return 0;
+	}
+
+	bitmap_set(base, bitmap_pos);
+	return 1;
+}
+
+static int
+should_include(struct commit *commit, void *_data)
+{
+	struct bitmap *base = _data;
+
+	if (!add_to_include_set(base, commit)) {
+		struct commit_list *parent = commit->parents;
+
+		mark_as_seen((struct object *)commit);
+
+		while (parent) {
+			parent->item->object.flags |= SEEN;
+			mark_as_seen((struct object *)parent->item);
+			parent = parent->next;
+		}
+
+		return 0;
+	}
+
+	return 1;
+}
+
+static void compute_xor_offsets(void)
+{
+	static const int MAX_XOR_OFFSET_SEARCH = 10;
+
+	int i, next = 0;
+
+	while (next < writer.selected_nr) {
+		struct bitmapped_commit *stored = &writer.selected[next];
+
+		int best_offset = 0;
+		struct ewah_bitmap *best_bitmap = stored->bitmap;
+		struct ewah_bitmap *test_xor;
+
+		for (i = 1; i <= MAX_XOR_OFFSET_SEARCH; ++i) {
+			int curr = next - i;
+
+			if (curr < 0)
+				break;
+
+			test_xor = ewah_pool_new();
+			ewah_xor(writer.selected[curr].bitmap, stored->bitmap, test_xor);
+
+			if (test_xor->buffer_size < best_bitmap->buffer_size) {
+				if (best_bitmap != stored->bitmap)
+					ewah_pool_free(best_bitmap);
+
+				best_bitmap = test_xor;
+				best_offset = i;
+			} else {
+				ewah_pool_free(test_xor);
+			}
+		}
+
+		stored->xor_offset = best_offset;
+		stored->write_as = best_bitmap;
+
+		next++;
+	}
+}
+
+void bitmap_writer_build(struct packing_data *to_pack)
+{
+	static const double REUSE_BITMAP_THRESHOLD = 0.2;
+
+	int i, reuse_after, need_reset;
+	struct bitmap *base = bitmap_new();
+	struct rev_info revs;
+
+	writer.bitmaps = kh_init_sha1();
+	writer.to_pack = to_pack;
+
+	if (writer.show_progress)
+		writer.progress = start_progress("Building bitmaps", writer.selected_nr);
+
+	init_revisions(&revs, NULL);
+	revs.tag_objects = 1;
+	revs.tree_objects = 1;
+	revs.blob_objects = 1;
+	revs.no_walk = 0;
+
+	revs.include_check = should_include;
+	reset_revision_walk();
+
+	reuse_after = writer.selected_nr * REUSE_BITMAP_THRESHOLD;
+	need_reset = 0;
+
+	for (i = writer.selected_nr - 1; i >= 0; --i) {
+		struct bitmapped_commit *stored;
+		struct object *object;
+
+		khiter_t hash_pos;
+		int hash_ret;
+
+		stored = &writer.selected[i];
+		object = (struct object *)stored->commit;
+
+		if (stored->bitmap == NULL) {
+			if (i < writer.selected_nr - 1 &&
+			    (need_reset ||
+			     !in_merge_bases(writer.selected[i + 1].commit,
+					     stored->commit))) {
+			    bitmap_reset(base);
+			    reset_all_seen();
+			}
+
+			add_pending_object(&revs, object, "");
+			revs.include_check_data = base;
+
+			if (prepare_revision_walk(&revs))
+				die("revision walk setup failed");
+
+			traverse_commit_list(&revs, show_commit, show_object, base);
+
+			revs.pending.nr = 0;
+			revs.pending.alloc = 0;
+			revs.pending.objects = NULL;
+
+			stored->bitmap = bitmap_to_ewah(base);
+			need_reset = 0;
+		} else
+			need_reset = 1;
+
+		if (i >= reuse_after)
+			stored->flags |= BITMAP_FLAG_REUSE;
+
+		hash_pos = kh_put_sha1(writer.bitmaps, object->sha1, &hash_ret);
+		if (hash_ret == 0)
+			die("Duplicate entry when writing index: %s",
+			    sha1_to_hex(object->sha1));
+
+		kh_value(writer.bitmaps, hash_pos) = stored;
+		display_progress(writer.progress, writer.selected_nr - i);
+	}
+
+	bitmap_free(base);
+	stop_progress(&writer.progress);
+
+	compute_xor_offsets();
+}
+
+/**
+ * Select the commits that will be bitmapped
+ */
+static inline unsigned int next_commit_index(unsigned int idx)
+{
+	static const unsigned int MIN_COMMITS = 100;
+	static const unsigned int MAX_COMMITS = 5000;
+
+	static const unsigned int MUST_REGION = 100;
+	static const unsigned int MIN_REGION = 20000;
+
+	unsigned int offset, next;
+
+	if (idx <= MUST_REGION)
+		return 0;
+
+	if (idx <= MIN_REGION) {
+		offset = idx - MUST_REGION;
+		return (offset < MIN_COMMITS) ? offset : MIN_COMMITS;
+	}
+
+	offset = idx - MIN_REGION;
+	next = (offset < MAX_COMMITS) ? offset : MAX_COMMITS;
+
+	return (next > MIN_COMMITS) ? next : MIN_COMMITS;
+}
+
+static int date_compare(const void *_a, const void *_b)
+{
+	struct commit *a = *(struct commit **)_a;
+	struct commit *b = *(struct commit **)_b;
+	return (long)b->date - (long)a->date;
+}
+
+void bitmap_writer_reuse_bitmaps(struct packing_data *to_pack)
+{
+	if (prepare_bitmap_git() < 0)
+		return;
+
+	writer.reused = kh_init_sha1();
+	rebuild_existing_bitmaps(to_pack, writer.reused, writer.show_progress);
+}
+
+static struct ewah_bitmap *find_reused_bitmap(const unsigned char *sha1)
+{
+	khiter_t hash_pos;
+
+	if (!writer.reused)
+		return NULL;
+
+	hash_pos = kh_get_sha1(writer.reused, sha1);
+	if (hash_pos >= kh_end(writer.reused))
+		return NULL;
+
+	return kh_value(writer.reused, hash_pos);
+}
+
+void bitmap_writer_select_commits(struct commit **indexed_commits,
+				  unsigned int indexed_commits_nr,
+				  int max_bitmaps)
+{
+	unsigned int i = 0, j, next;
+
+	qsort(indexed_commits, indexed_commits_nr, sizeof(indexed_commits[0]),
+	      date_compare);
+
+	if (writer.show_progress)
+		writer.progress = start_progress("Selecting bitmap commits", 0);
+
+	if (indexed_commits_nr < 100) {
+		for (i = 0; i < indexed_commits_nr; ++i)
+			push_bitmapped_commit(indexed_commits[i], NULL);
+		return;
+	}
+
+	for (;;) {
+		struct ewah_bitmap *reused_bitmap = NULL;
+		struct commit *chosen = NULL;
+
+		next = next_commit_index(i);
+
+		if (i + next >= indexed_commits_nr)
+			break;
+
+		if (max_bitmaps > 0 && writer.selected_nr >= max_bitmaps) {
+			writer.selected_nr = max_bitmaps;
+			break;
+		}
+
+		if (next == 0) {
+			chosen = indexed_commits[i];
+			reused_bitmap = find_reused_bitmap(chosen->object.sha1);
+		} else {
+			chosen = indexed_commits[i + next];
+
+			for (j = 0; j <= next; ++j) {
+				struct commit *cm = indexed_commits[i + j];
+
+				reused_bitmap = find_reused_bitmap(cm->object.sha1);
+				if (reused_bitmap || (cm->object.flags & NEEDS_BITMAP) != 0) {
+					chosen = cm;
+					break;
+				}
+
+				if (cm->parents && cm->parents->next)
+					chosen = cm;
+			}
+		}
+
+		push_bitmapped_commit(chosen, reused_bitmap);
+
+		i += next + 1;
+		display_progress(writer.progress, i);
+	}
+
+	stop_progress(&writer.progress);
+}
+
+
+static int sha1write_ewah_helper(void *f, const void *buf, size_t len)
+{
+	/* sha1write will die on error */
+	sha1write(f, buf, len);
+	return len;
+}
+
+/**
+ * Write the bitmap index to disk
+ */
+static inline void dump_bitmap(struct sha1file *f, struct ewah_bitmap *bitmap)
+{
+	if (ewah_serialize_to(bitmap, sha1write_ewah_helper, f) < 0)
+		die("Failed to write bitmap index");
+}
+
+static const unsigned char *sha1_access(size_t pos, void *table)
+{
+	struct pack_idx_entry **index = table;
+	return index[pos]->sha1;
+}
+
+static void write_selected_commits_v1(struct sha1file *f,
+				      struct pack_idx_entry **index,
+				      uint32_t index_nr)
+{
+	int i;
+
+	for (i = 0; i < writer.selected_nr; ++i) {
+		struct bitmapped_commit *stored = &writer.selected[i];
+		struct bitmap_disk_entry on_disk;
+
+		int commit_pos =
+			sha1_pos(stored->commit->object.sha1, index, index_nr, sha1_access);
+
+		if (commit_pos < 0)
+			die("BUG: trying to write commit not in index");
+
+		on_disk.object_pos = htonl(commit_pos);
+		on_disk.xor_offset = stored->xor_offset;
+		on_disk.flags = stored->flags;
+
+		sha1write(f, &on_disk, sizeof(on_disk));
+		dump_bitmap(f, stored->write_as);
+	}
+}
+
+static void write_hash_cache(struct sha1file *f,
+			     struct pack_idx_entry **index,
+			     uint32_t index_nr)
+{
+	uint32_t i;
+
+	for (i = 0; i < index_nr; ++i) {
+		struct object_entry *entry = (struct object_entry *)index[i];
+		uint32_t hash_value = htonl(entry->hash);
+		sha1write(f, &hash_value, sizeof(hash_value));
+	}
+}
+
+void bitmap_writer_set_checksum(unsigned char *sha1)
+{
+	hashcpy(writer.pack_checksum, sha1);
+}
+
+void bitmap_writer_finish(struct pack_idx_entry **index,
+			  uint32_t index_nr,
+			  const char *filename,
+			  uint16_t options)
+{
+	static char tmp_file[PATH_MAX];
+	static uint16_t default_version = 1;
+	static uint16_t flags = BITMAP_OPT_FULL_DAG;
+	struct sha1file *f;
+
+	struct bitmap_disk_header header;
+
+	int fd = odb_mkstemp(tmp_file, sizeof(tmp_file), "pack/tmp_bitmap_XXXXXX");
+
+	if (fd < 0)
+		die_errno("unable to create '%s'", tmp_file);
+	f = sha1fd(fd, tmp_file);
+
+	memcpy(header.magic, BITMAP_IDX_SIGNATURE, sizeof(BITMAP_IDX_SIGNATURE));
+	header.version = htons(default_version);
+	header.options = htons(flags | options);
+	header.entry_count = htonl(writer.selected_nr);
+	memcpy(header.checksum, writer.pack_checksum, 20);
+
+	sha1write(f, &header, sizeof(header));
+	dump_bitmap(f, writer.commits);
+	dump_bitmap(f, writer.trees);
+	dump_bitmap(f, writer.blobs);
+	dump_bitmap(f, writer.tags);
+	write_selected_commits_v1(f, index, index_nr);
+
+	if (options & BITMAP_OPT_HASH_CACHE)
+		write_hash_cache(f, index, index_nr);
+
+	sha1close(f, NULL, CSUM_FSYNC);
+
+	if (adjust_shared_perm(tmp_file))
+		die_errno("unable to make temporary bitmap file readable");
+
+	if (rename(tmp_file, filename))
+		die_errno("unable to rename temporary bitmap file to '%s'", filename);
+}
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@ -0,0 +1,64 @@
+#ifndef PACK_BITMAP_H
+#define PACK_BITMAP_H
+
+#include "ewah/ewok.h"
+#include "khash.h"
+#include "pack-objects.h"
+
+struct bitmap_disk_entry {
+	uint32_t object_pos;
+	uint8_t xor_offset;
+	uint8_t flags;
+} __attribute__((packed));
+
+struct bitmap_disk_header {
+	char magic[4];
+	uint16_t version;
+	uint16_t options;
+	uint32_t entry_count;
+	unsigned char checksum[20];
+};
+
+static const char BITMAP_IDX_SIGNATURE[] = {'B', 'I', 'T', 'M'};
+
+#define NEEDS_BITMAP (1u<<22)
+
+enum pack_bitmap_opts {
+	BITMAP_OPT_FULL_DAG = 1,
+	BITMAP_OPT_HASH_CACHE = 4,
+};
+
+enum pack_bitmap_flags {
+	BITMAP_FLAG_REUSE = 0x1
+};
+
+typedef int (*show_reachable_fn)(
+	const unsigned char *sha1,
+	enum object_type type,
+	int flags,
+	uint32_t hash,
+	struct packed_git *found_pack,
+	off_t found_offset);
+
+int prepare_bitmap_git(void);
+void count_bitmap_commit_list(uint32_t *commits, uint32_t *trees, uint32_t *blobs, uint32_t *tags);
+void traverse_bitmap_commit_list(show_reachable_fn show_reachable);
+void test_bitmap_walk(struct rev_info *revs);
+char *pack_bitmap_filename(struct packed_git *p);
+int prepare_bitmap_walk(struct rev_info *revs);
+int reuse_partial_packfile_from_bitmap(struct packed_git **packfile, uint32_t *entries, off_t *up_to);
+int rebuild_existing_bitmaps(struct packing_data *mapping, khash_sha1 *reused_bitmaps, int show_progress);
+
+void bitmap_writer_show_progress(int show);
+void bitmap_writer_set_checksum(unsigned char *sha1);
+void bitmap_writer_build_type_index(struct pack_idx_entry **index, uint32_t index_nr);
+void bitmap_writer_reuse_bitmaps(struct packing_data *to_pack);
+void bitmap_writer_select_commits(struct commit **indexed_commits,
+		unsigned int indexed_commits_nr, int max_bitmaps);
+void bitmap_writer_build(struct packing_data *to_pack);
+void bitmap_writer_finish(struct pack_idx_entry **index,
+			  uint32_t index_nr,
+			  const char *filename,
+			  uint16_t options);
+
+#endif
--- a/pack-objects.c
+++ b/pack-objects.c
@ -0,0 +1,111 @@
+#include "cache.h"
+#include "object.h"
+#include "pack.h"
+#include "pack-objects.h"
+
+static uint32_t locate_object_entry_hash(struct packing_data *pdata,
+					 const unsigned char *sha1,
+					 int *found)
+{
+	uint32_t i, hash, mask = (pdata->index_size - 1);
+
+	memcpy(&hash, sha1, sizeof(uint32_t));
+	i = hash & mask;
+
+	while (pdata->index[i] > 0) {
+		uint32_t pos = pdata->index[i] - 1;
+
+		if (!hashcmp(sha1, pdata->objects[pos].idx.sha1)) {
+			*found = 1;
+			return i;
+		}
+
+		i = (i + 1) & mask;
+	}
+
+	*found = 0;
+	return i;
+}
+
+static inline uint32_t closest_pow2(uint32_t v)
+{
+	v = v - 1;
+	v |= v >> 1;
+	v |= v >> 2;
+	v |= v >> 4;
+	v |= v >> 8;
+	v |= v >> 16;
+	return v + 1;
+}
+
+static void rehash_objects(struct packing_data *pdata)
+{
+	uint32_t i;
+	struct object_entry *entry;
+
+	pdata->index_size = closest_pow2(pdata->nr_objects * 3);
+	if (pdata->index_size < 1024)
+		pdata->index_size = 1024;
+
+	pdata->index = xrealloc(pdata->index, sizeof(uint32_t) * pdata->index_size);
+	memset(pdata->index, 0, sizeof(int) * pdata->index_size);
+
+	entry = pdata->objects;
+
+	for (i = 0; i < pdata->nr_objects; i++) {
+		int found;
+		uint32_t ix = locate_object_entry_hash(pdata, entry->idx.sha1, &found);
+
+		if (found)
+			die("BUG: Duplicate object in hash");
+
+		pdata->index[ix] = i + 1;
+		entry++;
+	}
+}
+
+struct object_entry *packlist_find(struct packing_data *pdata,
+				   const unsigned char *sha1,
+				   uint32_t *index_pos)
+{
+	uint32_t i;
+	int found;
+
+	if (!pdata->index_size)
+		return NULL;
+
+	i = locate_object_entry_hash(pdata, sha1, &found);
+
+	if (index_pos)
+		*index_pos = i;
+
+	if (!found)
+		return NULL;
+
+	return &pdata->objects[pdata->index[i] - 1];
+}
+
+struct object_entry *packlist_alloc(struct packing_data *pdata,
+				    const unsigned char *sha1,
+				    uint32_t index_pos)
+{
+	struct object_entry *new_entry;
+
+	if (pdata->nr_objects >= pdata->nr_alloc) {
+		pdata->nr_alloc = (pdata->nr_alloc  + 1024) * 3 / 2;
+		pdata->objects = xrealloc(pdata->objects,
+					  pdata->nr_alloc * sizeof(*new_entry));
+	}
+
+	new_entry = pdata->objects + pdata->nr_objects++;
+
+	memset(new_entry, 0, sizeof(*new_entry));
+	hashcpy(new_entry->idx.sha1, sha1);
+
+	if (pdata->index_size * 3 <= pdata->nr_objects * 4)
+		rehash_objects(pdata);
+	else
+		pdata->index[index_pos] = pdata->nr_objects;
+
+	return new_entry;
+}
--- a/pack-objects.h
+++ b/pack-objects.h
@ -0,0 +1,68 @@
+#ifndef PACK_OBJECTS_H
+#define PACK_OBJECTS_H
+
+struct object_entry {
+	struct pack_idx_entry idx;
+	unsigned long size;	/* uncompressed size */
+	struct packed_git *in_pack;	/* already in pack */
+	off_t in_pack_offset;
+	struct object_entry *delta;	/* delta base object */
+	struct object_entry *delta_child; /* deltified objects who bases me */
+	struct object_entry *delta_sibling; /* other deltified objects who
+					     * uses the same base as me
+					     */
+	void *delta_data;	/* cached delta (uncompressed) */
+	unsigned long delta_size;	/* delta data size (uncompressed) */
+	unsigned long z_delta_size;	/* delta data size (compressed) */
+	enum object_type type;
+	enum object_type in_pack_type;	/* could be delta */
+	uint32_t hash;			/* name hint hash */
+	unsigned int in_pack_pos;
+	unsigned char in_pack_header_size;
+	unsigned preferred_base:1; /*
+				    * we do not pack this, but is available
+				    * to be used as the base object to delta
+				    * objects against.
+				    */
+	unsigned no_try_delta:1;
+	unsigned tagged:1; /* near the very tip of refs */
+	unsigned filled:1; /* assigned write-order */
+};
+
+struct packing_data {
+	struct object_entry *objects;
+	uint32_t nr_objects, nr_alloc;
+
+	int32_t *index;
+	uint32_t index_size;
+};
+
+struct object_entry *packlist_alloc(struct packing_data *pdata,
+				    const unsigned char *sha1,
+				    uint32_t index_pos);
+
+struct object_entry *packlist_find(struct packing_data *pdata,
+				   const unsigned char *sha1,
+				   uint32_t *index_pos);
+
+static inline uint32_t pack_name_hash(const char *name)
+{
+	uint32_t c, hash = 0;
+
+	if (!name)
+		return 0;
+
+	/*
+	 * This effectively just creates a sortable number from the
+	 * last sixteen non-whitespace characters. Last characters
+	 * count "most", so things that end in ".c" sort together.
+	 */
+	while ((c = *name++) != 0) {
+		if (isspace(c))
+			continue;
+		hash = (hash >> 2) + (c << 24);
+	}
+	return hash;
+}
+
+#endif
--- a/pack-revindex.c
+++ b/pack-revindex.c
@ -16,11 +16,6 @@
 * get the object sha1 from the main index.
 */

-struct pack_revindex {
-	struct packed_git *p;
-	struct revindex_entry *revindex;
-};
-
 static struct pack_revindex *pack_revindex;
 static int pack_revindex_hashsz;

@ -201,15 +196,14 @@ static void create_pack_revindex(struct pack_revindex *rix)
 	sort_revindex(rix->revindex, num_ent, p->pack_size);
 }

-struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs)
+struct pack_revindex *revindex_for_pack(struct packed_git *p)
 {
 	int num;
-	unsigned lo, hi;
 	struct pack_revindex *rix;
-	struct revindex_entry *revindex;

 	if (!pack_revindex_hashsz)
 		init_pack_revindex();
+
 	num = pack_revindex_ix(p);
 	if (num < 0)
 		die("internal error: pack revindex fubar");
@ -217,30 +211,37 @@ struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs)
 	rix = &pack_revindex[num];
 	if (!rix->revindex)
 		create_pack_revindex(rix);
-	revindex = rix->revindex;

-	lo = 0;
-	hi = p->num_objects + 1;
+	return rix;
+}
+
+int find_revindex_position(struct pack_revindex *pridx, off_t ofs)
+{
+	int lo = 0;
+	int hi = pridx->p->num_objects + 1;
+	struct revindex_entry *revindex = pridx->revindex;
+
 	do {
 		unsigned mi = lo + (hi - lo) / 2;
 		if (revindex[mi].offset == ofs) {
-			return revindex + mi;
+			return mi;
 		} else if (ofs < revindex[mi].offset)
 			hi = mi;
 		else
 			lo = mi + 1;
 	} while (lo < hi);
+
 	error("bad offset for revindex");
-	return NULL;
+	return -1;
 }

-void discard_revindex(void)
+struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs)
 {
-	if (pack_revindex_hashsz) {
-		int i;
-		for (i = 0; i < pack_revindex_hashsz; i++)
-			free(pack_revindex[i].revindex);
-		free(pack_revindex);
-		pack_revindex_hashsz = 0;
-	}
+	struct pack_revindex *pridx = revindex_for_pack(p);
+	int pos = find_revindex_position(pridx, ofs);
+
+	if (pos < 0)
+		return NULL;
+
+	return pridx->revindex + pos;
 }
--- a/pack-revindex.h
+++ b/pack-revindex.h
@ -6,7 +6,14 @@ struct revindex_entry {
 	unsigned int nr;
 };

+struct pack_revindex {
+	struct packed_git *p;
+	struct revindex_entry *revindex;
+};
+
+struct pack_revindex *revindex_for_pack(struct packed_git *p);
+int find_revindex_position(struct pack_revindex *pridx, off_t ofs);
+
 struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs);
-void discard_revindex(void);

 #endif
--- a/pack-write.c
+++ b/pack-write.c
@ -364,5 +364,7 @@ void finish_tmp_packfile(char *name_buffer,
 	if (rename(idx_tmp_name, name_buffer))
 		die_errno("unable to rename temporary index file");

+	*end_of_name_prefix = '\0';
+
 	free((void *)idx_tmp_name);
 }
--- a/read-cache.c
+++ b/read-cache.c
@ -1314,26 +1314,6 @@ int read_index(struct index_state *istate)
 	return read_index_from(istate, get_index_file());
 }

-#ifndef NEEDS_ALIGNED_ACCESS
-#define ntoh_s(var) ntohs(var)
-#define ntoh_l(var) ntohl(var)
-#else
-static inline uint16_t ntoh_s_force_align(void *p)
-{
-	uint16_t x;
-	memcpy(&x, p, sizeof(x));
-	return ntohs(x);
-}
-static inline uint32_t ntoh_l_force_align(void *p)
-{
-	uint32_t x;
-	memcpy(&x, p, sizeof(x));
-	return ntohl(x);
-}
-#define ntoh_s(var) ntoh_s_force_align(&(var))
-#define ntoh_l(var) ntoh_l_force_align(&(var))
-#endif
-
 static struct cache_entry *cache_entry_from_ondisk(struct ondisk_cache_entry *ondisk,
 						   unsigned int flags,
 						   const char *name,
@ -1341,16 +1321,16 @@ static struct cache_entry *cache_entry_from_ondisk(struct ondisk_cache_entry *on
 {
 	struct cache_entry *ce = xmalloc(cache_entry_size(len));

-	ce->ce_stat_data.sd_ctime.sec = ntoh_l(ondisk->ctime.sec);
-	ce->ce_stat_data.sd_mtime.sec = ntoh_l(ondisk->mtime.sec);
-	ce->ce_stat_data.sd_ctime.nsec = ntoh_l(ondisk->ctime.nsec);
-	ce->ce_stat_data.sd_mtime.nsec = ntoh_l(ondisk->mtime.nsec);
-	ce->ce_stat_data.sd_dev   = ntoh_l(ondisk->dev);
-	ce->ce_stat_data.sd_ino   = ntoh_l(ondisk->ino);
-	ce->ce_mode  = ntoh_l(ondisk->mode);
-	ce->ce_stat_data.sd_uid   = ntoh_l(ondisk->uid);
-	ce->ce_stat_data.sd_gid   = ntoh_l(ondisk->gid);
-	ce->ce_stat_data.sd_size  = ntoh_l(ondisk->size);
+	ce->ce_stat_data.sd_ctime.sec = get_be32(&ondisk->ctime.sec);
+	ce->ce_stat_data.sd_mtime.sec = get_be32(&ondisk->mtime.sec);
+	ce->ce_stat_data.sd_ctime.nsec = get_be32(&ondisk->ctime.nsec);
+	ce->ce_stat_data.sd_mtime.nsec = get_be32(&ondisk->mtime.nsec);
+	ce->ce_stat_data.sd_dev   = get_be32(&ondisk->dev);
+	ce->ce_stat_data.sd_ino   = get_be32(&ondisk->ino);
+	ce->ce_mode  = get_be32(&ondisk->mode);
+	ce->ce_stat_data.sd_uid   = get_be32(&ondisk->uid);
+	ce->ce_stat_data.sd_gid   = get_be32(&ondisk->gid);
+	ce->ce_stat_data.sd_size  = get_be32(&ondisk->size);
 	ce->ce_flags = flags & ~CE_NAMEMASK;
 	ce->ce_namelen = len;
 	hashcpy(ce->sha1, ondisk->sha1);
@ -1390,14 +1370,14 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
 	unsigned int flags;

 	/* On-disk flags are just 16 bits */
-	flags = ntoh_s(ondisk->flags);
+	flags = get_be16(&ondisk->flags);
 	len = flags & CE_NAMEMASK;

 	if (flags & CE_EXTENDED) {
 		struct ondisk_cache_entry_extended *ondisk2;
 		int extended_flags;
 		ondisk2 = (struct ondisk_cache_entry_extended *)ondisk;
-		extended_flags = ntoh_s(ondisk2->flags2) << 16;
+		extended_flags = get_be16(&ondisk2->flags2) << 16;
 		/* We do not yet understand any bit out of CE_EXTENDED_FLAGS */
 		if (extended_flags & ~CE_EXTENDED_FLAGS)
 			die("Unknown index entry format %08x", extended_flags);
--- a/revision.c
+++ b/revision.c
@ -774,6 +774,10 @@ static int add_parents_to_list(struct rev_info *revs, struct commit *commit,
 		return 0;
 	commit->object.flags |= ADDED;

+	if (revs->include_check &&
+	    !revs->include_check(commit, revs->include_check_data))
+		return 0;
+
 	/*
 	 * If the commit is uninteresting, don't try to
 	 * prune parents - we want the maximal uninteresting
--- a/revision.h
+++ b/revision.h
@ -172,6 +172,8 @@ struct rev_info {
 	unsigned long min_age;
 	int min_parents;
 	int max_parents;
+	int (*include_check)(struct commit *, void *);
+	void *include_check_data;

 	/* diff info for patches and for paths limiting */
 	struct diff_options diffopt;
--- a/sha1_file.c
+++ b/sha1_file.c
@ -252,8 +252,6 @@ char *sha1_pack_index_name(const unsigned char *sha1)
 struct alternate_object_database *alt_odb_list;
 static struct alternate_object_database **alt_odb_tail;

-static int git_open_noatime(const char *name);
-
 /*
 * Prepare alternate object database registry.
 *
@ -1232,6 +1230,7 @@ static void prepare_packed_git_one(char *objdir, int local)

 		if (has_extension(de->d_name, ".idx") ||
 		    has_extension(de->d_name, ".pack") ||
+		    has_extension(de->d_name, ".bitmap") ||
 		    has_extension(de->d_name, ".keep"))
 			string_list_append(&garbage, path);
 		else
@ -1316,7 +1315,6 @@ void prepare_packed_git(void)

 void reprepare_packed_git(void)
 {
-	discard_revindex();
 	prepare_packed_git_run_once = 0;
 	prepare_packed_git();
 }
@ -1393,7 +1391,7 @@ int check_sha1_signature(const unsigned char *sha1, void *map,
 	return hashcmp(sha1, real_sha1) ? -1 : 0;
 }

-static int git_open_noatime(const char *name)
+int git_open_noatime(const char *name)
 {
 	static int sha1_file_open_flag = O_NOATIME;

--- a/t/perf/p5310-pack-bitmaps.sh
+++ b/t/perf/p5310-pack-bitmaps.sh
@ -0,0 +1,57 @@
+#!/bin/sh
+
+test_description='Tests pack performance using bitmaps'
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+# note that we do everything through config,
+# since we want to be able to compare bitmap-aware
+# git versus non-bitmap git
+test_expect_success 'setup bitmap config' '
+	git config pack.writebitmaps true &&
+	git config pack.writebitmaphashcache true
+'
+
+test_perf 'repack to disk' '
+	git repack -ad
+'
+
+test_perf 'simulated clone' '
+	git pack-objects --stdout --all </dev/null >/dev/null
+'
+
+test_perf 'simulated fetch' '
+	have=$(git rev-list HEAD~100 -1) &&
+	{
+		echo HEAD &&
+		echo ^$have
+	} | git pack-objects --revs --stdout >/dev/null
+'
+
+test_expect_success 'create partial bitmap state' '
+	# pick a commit to represent the repo tip in the past
+	cutoff=$(git rev-list HEAD~100 -1) &&
+	orig_tip=$(git rev-parse HEAD) &&
+
+	# now kill off all of the refs and pretend we had
+	# just the one tip
+	rm -rf .git/logs .git/refs/* .git/packed-refs
+	git update-ref HEAD $cutoff
+
+	# and then repack, which will leave us with a nice
+	# big bitmap pack of the "old" history, and all of
+	# the new history will be loose, as if it had been pushed
+	# up incrementally and exploded via unpack-objects
+	git repack -Ad
+
+	# and now restore our original tip, as if the pushes
+	# had happened
+	git update-ref HEAD $orig_tip
+'
+
+test_perf 'partial bitmap' '
+	git pack-objects --stdout --all </dev/null >/dev/null
+'
+
+test_done
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@ -0,0 +1,139 @@
+#!/bin/sh
+
+test_description='exercise basic bitmap functionality'
+. ./test-lib.sh
+
+test_expect_success 'setup repo with moderate-sized history' '
+	for i in $(test_seq 1 10); do
+		test_commit $i
+	done &&
+	git checkout -b other HEAD~5 &&
+	for i in $(test_seq 1 10); do
+		test_commit side-$i
+	done &&
+	git checkout master &&
+	blob=$(echo tagged-blob | git hash-object -w --stdin) &&
+	git tag tagged-blob $blob &&
+	git config pack.writebitmaps true &&
+	git config pack.writebitmaphashcache true
+'
+
+test_expect_success 'full repack creates bitmaps' '
+	git repack -ad &&
+	ls .git/objects/pack/ | grep bitmap >output &&
+	test_line_count = 1 output
+'
+
+test_expect_success 'rev-list --test-bitmap verifies bitmaps' '
+	git rev-list --test-bitmap HEAD
+'
+
+rev_list_tests() {
+	state=$1
+
+	test_expect_success "counting commits via bitmap ($state)" '
+		git rev-list --count HEAD >expect &&
+		git rev-list --use-bitmap-index --count HEAD >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "counting partial commits via bitmap ($state)" '
+		git rev-list --count HEAD~5..HEAD >expect &&
+		git rev-list --use-bitmap-index --count HEAD~5..HEAD >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "counting non-linear history ($state)" '
+		git rev-list --count other...master >expect &&
+		git rev-list --use-bitmap-index --count other...master >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "enumerate --objects ($state)" '
+		git rev-list --objects --use-bitmap-index HEAD >tmp &&
+		cut -d" " -f1 <tmp >tmp2 &&
+		sort <tmp2 >actual &&
+		git rev-list --objects HEAD >tmp &&
+		cut -d" " -f1 <tmp >tmp2 &&
+		sort <tmp2 >expect &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "bitmap --objects handles non-commit objects ($state)" '
+		git rev-list --objects --use-bitmap-index HEAD tagged-blob >actual &&
+		grep $blob actual
+	'
+}
+
+rev_list_tests 'full bitmap'
+
+test_expect_success 'clone from bitmapped repository' '
+	git clone --no-local --bare . clone.git &&
+	git rev-parse HEAD >expect &&
+	git --git-dir=clone.git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'setup further non-bitmapped commits' '
+	for i in $(test_seq 1 10); do
+		test_commit further-$i
+	done
+'
+
+rev_list_tests 'partial bitmap'
+
+test_expect_success 'fetch (partial bitmap)' '
+	git --git-dir=clone.git fetch origin master:master &&
+	git rev-parse HEAD >expect &&
+	git --git-dir=clone.git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'incremental repack cannot create bitmaps' '
+	test_commit more-1 &&
+	test_must_fail git repack -d
+'
+
+test_expect_success 'incremental repack can disable bitmaps' '
+	test_commit more-2 &&
+	git repack -d --no-write-bitmap-index
+'
+
+test_expect_success 'full repack, reusing previous bitmaps' '
+	git repack -ad &&
+	ls .git/objects/pack/ | grep bitmap >output &&
+	test_line_count = 1 output
+'
+
+test_expect_success 'fetch (full bitmap)' '
+	git --git-dir=clone.git fetch origin master:master &&
+	git rev-parse HEAD >expect &&
+	git --git-dir=clone.git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_lazy_prereq JGIT '
+	type jgit
+'
+
+test_expect_success JGIT 'we can read jgit bitmaps' '
+	git clone . compat-jgit &&
+	(
+		cd compat-jgit &&
+		rm -f .git/objects/pack/*.bitmap &&
+		jgit gc &&
+		git rev-list --test-bitmap HEAD
+	)
+'
+
+test_expect_success JGIT 'jgit can read our bitmaps' '
+	git clone . compat-us &&
+	(
+		cd compat-us &&
+		git repack -adb &&
+		# jgit gc will barf if it does not like our bitmaps
+		jgit gc
+	)
+'
+
+test_done