When we serve a fetch, we pass the "wants" and "haves" from
the fetch negotiation to pack-objects. That tells us not
only which objects we need to send, but we also use the
boundary commits as "preferred bases": their trees and blobs
are candidates for delta bases, both for reusing on-disk
deltas and for finding new ones.
However, this misses some opportunities. Modulo some special
cases like shallow or partial clones, we know that every
object reachable from the "haves" could be a preferred base.
We don't use all of them for two reasons:
1. It's expensive to traverse the whole history and
enumerate all of the objects the other side has.
2. The delta search is expensive, so we want to keep the
number of candidate bases sane. The boundary commits
are the most likely to work.
When we have reachability bitmaps, though, reason 1 no
longer applies. We can efficiently compute the set of
reachable objects on the other side (and in fact already did
so as part of the bitmap set-difference to get the list of
interesting objects). And using this set conveniently
covers the shallow and partial cases, since we have to
disable the use of bitmaps for those anyway.
The second reason argues against using these bases in the
search for new deltas. But there's one case where we can use
this information for free: when we have an existing on-disk
delta that we're considering reusing, we can do so if we
know the other side has the base object. This in fact saves
time during the delta search, because it's one less delta we
have to compute.
And that's exactly what this patch does: when we're
considering whether to reuse an on-disk delta, if bitmaps
tell us the other side has the object (and we're making a
thin-pack), then we reuse it.
Here are the results on p5311 using linux.git, which
simulates a client fetching after `N` days since their last
fetch:
Test origin HEAD
--------------------------------------------------------------------------
5311.3: server (1 days) 0.27(0.27+0.04) 0.12(0.09+0.03) -55.6%
5311.4: size (1 days) 0.9M 237.0K -73.7%
5311.5: client (1 days) 0.04(0.05+0.00) 0.10(0.10+0.00) +150.0%
5311.7: server (2 days) 0.34(0.42+0.04) 0.13(0.10+0.03) -61.8%
5311.8: size (2 days) 1.5M 347.7K -76.5%
5311.9: client (2 days) 0.07(0.08+0.00) 0.16(0.15+0.01) +128.6%
5311.11: server (4 days) 0.56(0.77+0.08) 0.13(0.10+0.02) -76.8%
5311.12: size (4 days) 2.8M 566.6K -79.8%
5311.13: client (4 days) 0.13(0.15+0.00) 0.34(0.31+0.02) +161.5%
5311.15: server (8 days) 0.97(1.39+0.11) 0.30(0.25+0.05) -69.1%
5311.16: size (8 days) 4.3M 1.0M -76.0%
5311.17: client (8 days) 0.20(0.22+0.01) 0.53(0.52+0.01) +165.0%
5311.19: server (16 days) 1.52(2.51+0.12) 0.30(0.26+0.03) -80.3%
5311.20: size (16 days) 8.0M 2.0M -74.5%
5311.21: client (16 days) 0.40(0.47+0.03) 1.01(0.98+0.04) +152.5%
5311.23: server (32 days) 2.40(4.44+0.20) 0.31(0.26+0.04) -87.1%
5311.24: size (32 days) 14.1M 4.1M -70.9%
5311.25: client (32 days) 0.70(0.90+0.03) 1.81(1.75+0.06) +158.6%
5311.27: server (64 days) 11.76(26.57+0.29) 0.55(0.50+0.08) -95.3%
5311.28: size (64 days) 89.4M 47.4M -47.0%
5311.29: client (64 days) 5.71(9.31+0.27) 15.20(15.20+0.32) +166.2%
5311.31: server (128 days) 16.15(36.87+0.40) 0.91(0.82+0.14) -94.4%
5311.32: size (128 days) 134.8M 100.4M -25.5%
5311.33: client (128 days) 9.42(16.86+0.49) 25.34(25.80+0.46) +169.0%
In all cases we save CPU time on the server (sometimes
significant) and the resulting pack is smaller. We do spend
more CPU time on the client side, because it has to
reconstruct more deltas. But that's the right tradeoff to
make, since clients tend to outnumber servers. It just means
the thin pack mechanism is doing its job.
From the user's perspective, the end-to-end time of the
operation will generally be faster. E.g., in the 128-day
case, we saved 15s on the server at a cost of 16s on the
client. Since the resulting pack is 34MB smaller, this is a
net win if the network speed is less than 270Mbit/s. And
that's actually the worst case. The 64-day case saves just
over 11s at a cost of just under 11s. So it's a slight win
at any network speed, and the 40MB saved is pure bonus. That
trend continues for the smaller fetches.
The implementation itself is mostly straightforward, with
the new logic going into check_object(). But there are two
tricky bits.
The first is that check_object() needs access to the
relevant information (the thin flag and bitmap result). We
can do this by pushing these into program-lifetime globals.
The second is that the rest of the code assumes that any
reused delta will point to another "struct object_entry" as
its base. But of course the case we are interested in here
is the one where don't have such an entry!
I looked at a number of options that didn't quite work:
- we could use a flag to signal a reused delta, but it's
not a single bit. We have to actually store the oid of
the base, which is normally done by pointing to the
existing object_entry. And we'd have to modify all the
code which looks at deltas.
- we could add the reused bases to the end of the existing
object_entry array. While this does create some extra
work as later stages consider the extra entries, it's
actually not too bad (we're not sending them, so they
don't cost much in the delta search, and at most we'd
have 2*N of them).
But there's a more subtle problem. Adding to the existing
array means we might need to grow it with realloc, which
could move the earlier entries around. While many of the
references to other entries are done by integer index,
some (including ones on the stack) use pointers, which
would become invalidated.
This isn't insurmountable, but it would require quite a
bit of refactoring (and it's hard to know that you've got
it all, since it may work _most_ of the time and then
fail subtly based on memory allocation patterns).
- we could allocate a new one-off entry for the base. In
fact, this is what an earlier version of this patch did.
However, since the refactoring brought in by ad635e82d6
(Merge branch 'nd/pack-objects-pack-struct', 2018-05-23),
the delta_idx code requires that both entries be in the
main packing list.
So taking all of those options into account, what I ended up
with is a separate list of "external bases" that are not
part of the main packing list. Each delta entry that points
to an external base has a single-bit flag to do so; we have a
little breathing room in the bitfield section of
object_entry.
This lets us limit the change primarily to the oe_delta()
and oe_set_delta_ext() functions. And as a bonus, most of
the rest of the code does not consider these dummy entries
at all, saving both runtime CPU and code complexity.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we do a bitmap walk, we save the result, which
represents (WANTs & ~HAVEs); i.e., every object we care
about visiting in our walk. However, we throw away the
haves bitmap, which can sometimes be useful, too. Save it
and provide an access function so code which has performed a
walk can query it.
A few notes on the accessor interface:
- the bitmap code calls these "haves" because it grew out
of the want/have negotiation for fetches. But really,
these are simply the objects that would be flagged
UNINTERESTING in a regular traversal. Let's use that
more universal nomenclature for the external module
interface. We may want to change the internal naming
inside the bitmap code, but that's outside the scope of
this patch.
- it still uses a bare "sha1" rather than "oid". That's
true of all of the bitmap code. And in this particular
instance, our caller in pack-objects is dealing with the
bare sha1 that comes from a packed REF_DELTA (we're
pointing directly to the mmap'd pack on disk). That's
something we'll have to deal with as we transition to a
new hash, but we can wait and see how the caller ends up
being fixed and adjust this interface accordingly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A server with bitmapped packs can serve a clone very
quickly. However, fetches are not necessarily made any
faster, because we spend a lot less time in object traversal
(which is what bitmaps help with) and more time finding
deltas (because we may have to throw out on-disk deltas if
the client does not have the base).
As a first step to making this faster, this patch introduces
a new perf script to measure fetches into a repo of various
ages from a fully-bitmapped server.
We separately measure the work done by the server (in
pack-objects) and that done by the client (in index-pack).
Furthermore, we measure the size of the resulting pack.
Breaking it down like this (instead of just doing a regular
"git fetch") lets us see how much each side benefits from
any changes. And since we know the pack size, if we estimate
the network speed, then one could calculate a complete
wall-clock time for the operation (though the script does
not do this automatically).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The main objective of scripts in the perf framework is to
run "test_perf", which measures the time it takes to run
some operation. However, it can also be interesting to see
the change in the output size of certain operations.
This patch introduces test_size, which records a single
numeric output from the test and shows it in the aggregated
output (with pretty printing and relative size comparison).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will let us reuse the code when we add new values to
aggregate besides times.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
About half of test_perf() is boilerplate preparing to run
_any_ test, and the other half is specifically running a
timing test. Let's split it into two functions, so that we
can reuse the boilerplate in future commits.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sideband code learned to optionally paint selected keywords at
the beginning of incoming lines on the receiving end.
* hn/highlight-sideband-keywords:
sideband: do not read beyond the end of input
sideband: highlight keywords in remote sideband output
"git cherry-pick --quit" failed to remove CHERRY_PICK_HEAD even
though we won't be in a cherry-pick session after it returns, which
has been corrected.
* nd/cherry-pick-quit-fix:
cherry-pick: fix --quit not deleting CHERRY_PICK_HEAD
A few preliminary minor clean-ups in the area around submodules.
* sb/submodule-cleanup:
builtin/submodule--helper: remove stray new line
t7410: update to new style
"git rebase -i", when a 'merge <branch>' insn in its todo list
fails, segfaulted, which has been (minimally) corrected.
* pw/rebase-i-merge-segv-fix:
rebase -i: fix SIGSEGV when 'merge <branch>' fails
t3430: add conflicting commit
When "git rebase -i" is told to squash two or more commits into
one, it labeled the log message for each commit with its number.
It correctly called the first one "1st commit", but the next one
was "commit #1", which was off-by-one. This has been corrected.
* pw/rebase-i-squash-number-fix:
rebase -i: fix numbering in squash message
Recent update to "git config" broke updating variable in a
subsection, which has been corrected.
* sb/config-write-fix:
git-config: document accidental multi-line setting in deprecated syntax
config: fix case sensitive subsection names on writing
t1300: document current behavior of setting options
Code hygiene improvement for the header files.
* en/incl-forward-decl:
Remove forward declaration of an enum
compat/precompose_utf8.h: use more common include guard style
urlmatch.h: fix include guard
Move definition of enum branch_track from cache.h to branch.h
alloc: make allocate_alloc_state and clear_alloc_state more consistent
Add missing includes and forward declarations
After a partial clone, repeated fetches from promisor remote would
have accumulated many packfiles marked with .promisor bit without
getting them coalesced into fewer packfiles, hurting performance.
"git repack" now learned to repack them.
* jt/repack-promisor-packs:
repack: repack promisor objects if -a or -A is set
repack: refactor setup of pack-objects cmd
A test prerequisite defined by various test scripts with slightly
different semantics has been consolidated into a single copy and
made into a lazily defined one.
* wc/make-funnynames-shared-lazy-prereq:
t: factor out FUNNYNAMES as shared lazy prereq
"git pull --rebase -v" in a repository with a submodule barfed as
an intermediate process did not understand what "-v(erbose)" flag
meant, which has been fixed.
* sb/pull-rebase-submodule:
git-submodule.sh: accept verbose flag in cmd_update to be non-quiet
"git tbdiff" that lets us compare individual patches in two
iterations of a topic has been rewritten and made into a built-in
command.
* js/range-diff: (21 commits)
range-diff: use dim/bold cues to improve dual color mode
range-diff: make --dual-color the default mode
range-diff: left-pad patch numbers
completion: support `git range-diff`
range-diff: populate the man page
range-diff --dual-color: skip white-space warnings
range-diff: offer to dual-color the diffs
diff: add an internal option to dual-color diffs of diffs
color: add the meta color GIT_COLOR_REVERSE
range-diff: use color for the commit pairs
range-diff: add tests
range-diff: do not show "function names" in hunk headers
range-diff: adjust the output of the commit pairs
range-diff: suppress the diff headers
range-diff: indent the diffs just like tbdiff
range-diff: right-trim commit messages
range-diff: also show the diff between patches
range-diff: improve the order of the shown commits
range-diff: first rudimentary implementation
Introduce `range-diff` to compare iterations of a topic branch
...
The more library-ish parts of the codebase learned to work on the
in-core index-state instance that is passed in by their callers,
instead of always working on the singleton "the_index" instance.
* nd/no-the-index: (24 commits)
blame.c: remove implicit dependency on the_index
apply.c: remove implicit dependency on the_index
apply.c: make init_apply_state() take a struct repository
apply.c: pass struct apply_state to more functions
resolve-undo.c: use the right index instead of the_index
archive-*.c: use the right repository
archive.c: avoid access to the_index
grep: use the right index instead of the_index
attr: remove index from git_attr_set_direction()
entry.c: use the right index instead of the_index
submodule.c: use the right index instead of the_index
pathspec.c: use the right index instead of the_index
unpack-trees: avoid the_index in verify_absent()
unpack-trees: convert clear_ce_flags* to avoid the_index
unpack-trees: don't shadow global var the_index
unpack-trees: add a note about path invalidation
unpack-trees: remove 'extern' on function declaration
ls-files: correct index argument to get_convert_attr_ascii()
preload-index.c: use the right index instead of the_index
dir.c: remove an implicit dependency on the_index in pathspec code
...
Improve built-in facility to catch broken &&-chain in the tests.
* es/chain-lint-more:
chainlint: add test of pathological case which triggered false positive
chainlint: recognize multi-line quoted strings more robustly
chainlint: let here-doc and multi-line string commence on same line
chainlint: recognize multi-line $(...) when command cuddled with "$("
chainlint: match 'quoted' here-doc tags
chainlint: match arbitrary here-docs tags rather than hard-coded names
Among the three codepaths we use O_APPEND to open a file for
appending, one used for writing GIT_TRACE output requires O_APPEND
implementation that behaves sensibly when multiple processes are
writing to the same file. POSIX emulation used in the Windows port
has been updated to improve in this area.
* js/mingw-o-append:
mingw: enable atomic O_APPEND
The API to iterate over all objects learned to optionally list
objects in the order they appear in packfiles, which helps locality
of access if the caller accesses these objects while as objects are
enumerated.
* jk/for-each-object-iteration:
for_each_*_object: move declarations to object-store.h
cat-file: use a single strbuf for all output
cat-file: split batch "buf" into two variables
cat-file: use oidset check-and-insert
cat-file: support "unordered" output for --batch-all-objects
cat-file: rename batch_{loose,packed}_object callbacks
t1006: test cat-file --batch-all-objects with duplicates
for_each_packed_object: support iterating in pack-order
for_each_*_object: give more comprehensive docstrings
for_each_*_object: take flag arguments as enum
for_each_*_object: store flag definitions in a single location
"git mergetool" stopped and gave an extra prompt to continue after
the last path has been handled, which did not make much sense.
* ng/mergetool-lose-final-prompt:
mergetool: don't suggest to continue after last file
"git verify-tag" and "git verify-commit" have been taught to use
the exit status of underlying "gpg --verify" to signal bad or
untrusted signature they found.
* jc/gpg-status:
gpg-interface: propagate exit status from gpg back to the callers
"git instaweb" has been adjusted to run better with newer Apache on
RedHat based distros.
* sk/instaweb-rh-update:
git-instaweb: fix apache2 config with apache >= 2.4
git-instaweb: support Fedora/Red Hat apache module path
Test fixes.
* en/t7406-fixes:
t7406: avoid using test_must_fail for commands other than git
t7406: prefer test_* helper functions to test -[feds]
t7406: avoid having git commands upstream of a pipe
t7406: simplify by using diff --name-only instead of diff --raw
t7406: fix call that was failing for the wrong reason
The "--exec" option to "git rebase --rebase-merges" placed the exec
commands at wrong places, which has been corrected.
* js/rebase-merges-exec-fix:
rebase --exec: make it work with --rebase-merges
t3430: demonstrate what -r, --autosquash & --exec should do
Documentation update.
* ab/newhash-is-sha256:
doc hash-function-transition: pick SHA-256 as NewHash
doc hash-function-transition: note the lack of a changelog
Checkout with the -p switch uses the "add interactive" framework which
is written in Perl.
One test added in 8d7b558bae ("checkout & worktree: introduce
checkout.defaultRemote", 2018-06-05) didn't declare the PERL
prerequisite, breaking the test when built with NO_PERL.
Reported-by: CB Bailey <cb@hashpling.org>
Signed-off-by: CB Bailey <cb@hashpling.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The caller of maybe_colorize_sideband() gives a counted buffer
<src, n>, but the callee checked src[] as if it were a NUL terminated
buffer. If src[] had all isspace() bytes in it, we would have made
n negative, and then
(1) made number of strncasecmp() calls to see if the remaining
bytes in src[] matched keywords, reading beyond the end of the
array (this actually happens even if n does not go negative),
and/or
(2) called strbuf_add() with negative count, most likely triggering
the "you want to use way too much memory" error due to unsigned
integer overflow.
Fix both issues by making sure we do not go beyond &src[n].
In the longer term we may want to accept size_t as parameter for
clarity (even though we know that a sideband message we are painting
typically would fit on a line on a terminal and int is sufficient).
Write it down as a NEEDSWORK comment.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git pull --rebase=interactive" learned "i" as a short-hand for
"interactive".
* js/pull-rebase-type-shorthand:
pull --rebase=<type>: allow single-letter abbreviations for the type
The end result of documentation update has been made to be
inspected more easily to help developers.
* jk/diff-rendered-docs:
add a script to diff rendered documentation