"git repack" has been taught to generate multi-pack reachability
bitmaps.
* tb/repack-write-midx:
test-read-midx: fix leak of bitmap_index struct
builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
builtin/repack.c: make largest pack preferred
builtin/repack.c: support writing a MIDX while repacking
builtin/repack.c: extract showing progress to a variable
builtin/repack.c: rename variables that deal with non-kept packs
builtin/repack.c: keep track of existing packs unconditionally
midx: preliminary support for `--refs-snapshot`
builtin/multi-pack-index.c: support `--stdin-packs` mode
midx: expose `write_midx_file_only()` publicly
To figure out which commits we can write a bitmap for, the multi-pack
index/bitmap code does a reachability traversal, marking any commit
which can be found in the MIDX as eligible to receive a bitmap.
This approach will cause a problem when multi-pack bitmaps are able to
be generated from `git repack`, since the reference tips can change
during the repack. Even though we ignore commits that don't exist in
the MIDX (when doing a scan of the ref tips), it's possible that a
commit in the MIDX reaches something that isn't.
This can happen when a multi-pack index contains some pack which refers
to loose objects (e.g., if a pack was pushed after starting the repack
but before generating the MIDX which depends on an object which is
stored as loose in the repository, and by definition isn't included in
the multi-pack index).
By taking a snapshot of the references before we start repacking, we can
close that race window. In the above scenario (where we have a packed
object pointing at a loose one), we'll either (a) take a snapshot of the
references before seeing the packed one, or (b) take it after, at which
point we can guarantee that the loose object will be packed and included
in the MIDX.
This patch does just that. It writes a temporary "reference snapshot",
which is a list of OIDs that are at the ref tips before writing a
multi-pack bitmap. References that are "preferred" (i.e,. are a suffix
of at least one value of the 'pack.preferBitmapTips' configuration) are
marked with a special '+'.
The format is simple: one line per commit at each tip, with an optional
'+' at the beginning (for preferred references, as described above).
When provided, the reference snapshot is used to drive bitmap selection
instead of the MIDX code doing its own traversal. When it isn't
provided, the usual traversal takes place instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we both can propagate values from the hashcache, and respect
the configuration to enable the hashcache at all, test that both of
these function correctly by hardening their behavior with a test.
Like the hash-cache in classic single-pack bitmaps, this helps more
proportionally the more up-to-date your bitmap coverage is. When our
bitmap coverage is out-of-date with the ref tips, we spend more time
proportionally traversing, and all of that traversal gets the name-hash
filled in.
But for the up-to-date bitmaps, this helps quite a bit. These numbers
are on git.git, with `pack.threads=1` to help see the difference
reflected in the overall runtime.
Test origin/tb/multi-pack-bitmaps HEAD
-------------------------------------------------------------------------------------
5326.4: simulated clone 1.87(1.80+0.07) 1.46(1.42+0.03) -21.9%
5326.5: simulated fetch 2.66(2.61+0.04) 1.47(1.43+0.04) -44.7%
5326.6: pack to file (bitmap) 2.74(2.62+0.12) 1.89(1.82+0.07) -31.0%
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch introduces a new test, t5326, which tests the basic
functionality of multi-pack bitmaps.
Some trivial behavior is tested, such as:
- Whether bitmaps can be generated with more than one pack.
- Whether clones can be served with all objects in the bitmap.
- Whether follow-up fetches can be served with some objects outside of
the server's bitmap
These use lib-bitmap's tests (which in turn were pulled from t5310), and
we cover cases where the MIDX represents both a single pack and multiple
packs.
In addition, some non-trivial and MIDX-specific behavior is tested, too,
including:
- Whether multi-pack bitmaps behave correctly with respect to the
pack-reuse machinery when the base for some object is selected from
a different pack than the delta.
- Whether multi-pack bitmaps correctly respect the
pack.preferBitmapTips configuration.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>