When there is no sufficient overlap between old and new history
during a fetch into a shallow repository, we unnecessarily sent
objects the sending side knows the receiving end has.
* nd/fetch-into-shallow:
Add testcase for needless objects during a shallow fetch
list-objects: mark more commits as edges in mark_edges_uninteresting
list-objects: reduce one argument in mark_edges_uninteresting
upload-pack: delegate rev walking in shallow fetch to pack-objects
shallow: add setup_temporary_shallow()
shallow: only add shallow graft points to new shallow file
move setup_alternate_shallow and write_shallow_commits to shallow.c
This is a testcase that checks for a problem where, during a specific
shallow fetch where the client does not have any commits that are a
successor of the new shallow root (i.e., the fetch creates a new
detached piece of history), the server would simply send over _all_
objects, instead of taking into account the objects already present in
the client.
The actual problem was fixed by a recent patch series by Nguyễn Thái
Ngọc Duy already.
Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch_pack() can remove .git/shallow file when a shallow repository
becomes a full one again. This behavior is triggered incorrectly when
tags are also fetched because fetch_pack() will be called twice. At
the first fetch_pack() call:
- shallow_lock is set up
- alternate_shallow_file points to shallow_lock.filename, which is
"shallow.lock"
- commit_lock_file is called, which sets shallow_lock.filename to "".
alternate_shallow_file also becomes "" because it points to the
same memory.
At the second call, setup_alternate_shallow() is not called and
alternate_shallow_file remains "". It's mistaken as unshallow case and
.git/shallow is removed. The end result is a broken repository.
Fix this by always initializing alternate_shallow_file when
fetch_pack() is called. As an extra measure, check if args->depth > 0
before commit/rollback shallow file.
Reported-by: Kacper Kornet <kornet@camk.edu.pl>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cloning with "git clone --depth N" while fetch.fsckobjects (or
transfer.fsckobjects) is set to true did not tell the cut-off points
of the shallow history to the process that validates the objects and
the history received, causing the validation to fail.
* 'nd/clone-connectivity-shortcut' (early part):
fetch-pack: prepare updated shallow file before fetching the pack
clone: let the user know when check_everything_connected is run
Special case "git clone" and use lighter-weight implementation to
check the completeness of the history behind refs.
* nd/clone-connectivity-shortcut:
clone: open a shortcut for connectivity check
index-pack: remove dead code (it should never happen)
fetch-pack: prepare updated shallow file before fetching the pack
clone: let the user know when check_everything_connected is run
index-pack --strict looks up and follows parent commits. If shallow
information is not ready by the time index-pack is run, index-pack may
be led to non-existent objects. Make fetch-pack save shallow file to
disk before invoking index-pack.
git learns new global option --shallow-file to pass on the alternate
shallow file path. Undocumented (and not even support --shallow-file=
syntax) because it's unlikely to be used again elsewhere.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the client sends a 'shallow' line for an object that the server does
not have, the server should just ignore it and let the client keep that
unknown shallow boundary.
Signed-off-by: Michael Heemskerk <mheemskerk@atlassian.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent optimization broke shallow clones.
* jk/peel-ref:
upload-pack: load non-tip "want" objects from disk
upload-pack: make sure "want" objects are parsed
upload-pack: drop lookup-before-parse optimization
When upload-pack receives a "want" line from the client, it
adds it to an object array. We call lookup_object to find
the actual object, which will only check for objects already
in memory. This works because we are expecting to find
objects that we already loaded during the ref advertisement.
We use the resulting object structs for a variety of
purposes. Some of them care only about the object flags, but
others care about the type of the object (e.g.,
ok_to_give_up), or even feed them to the revision parser
(when --depth is used), which assumes that objects it
receives are fully parsed.
Once upon a time, this was OK; any object we loaded into
memory would also have been parsed. But since 435c833
(upload-pack: use peel_ref for ref advertisements,
2012-10-04), we try to avoid parsing objects during the ref
advertisement. This means that lookup_object may return an
object with a type of OBJ_NONE. The resulting mess depends
on the exact set of objects, but can include the revision
parser barfing, or the shallow code sending the wrong set of
objects.
This patch teaches upload-pack to parse each "want" object
as we receive it. We do not replace the lookup_object call
with parse_object, as the current code is careful not to let
just any object appear on a "want" line, but rather only one
we have previously advertised (whereas parse_object would
actually load any arbitrary object from disk).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_shallow_commits() is used to determine the cut points at a given
depth (i.e. the number of commits in a chain that the user likes to
get). However we count current depth up to the commit "commit" but we
do the cutting at its parents (i.e. current depth + 1). This makes
upload-pack always return one commit more than requested. This patch
fixes it.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The user can do --depth=2147483647 (*) for restoring complete repo
now. But it's hard to remember. Any other numbers larger than the
longest commit chain in the repository would also do, but some
guessing may be involved. Make easy-to-remember --unshallow an alias
for --depth=2147483647.
Make upload-pack recognize this special number as infinite depth. The
effect is essentially the same as before, except that upload-pack is
more efficient because it does not have to traverse to the bottom
anymore.
The chance of a user actually wanting exactly 2147483647 commits
depth, not infinite, on a repository with a history that long, is
probably too small to consider. The client can learn to add or
subtract one commit to avoid the special treatment when that actually
happens.
(*) This is the largest positive number a 32-bit signed integer can
contain. JGit and older C Git store depth as "int" so both are OK
with this number. Dulwich does not support shallow clone.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It used to be that if "--all", "--depth", and also explicit references
were sought, then the explicit references were not handled correctly
in filter_refs() because the "--all --depth" code took precedence over
the explicit reference handling, and the explicit references were
never noted as having been found. So check for explicitly sought
references before proceeding to the "--all --depth" logic.
This fixes two test cases in t5500.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch_pack() removes duplicates from the "sought" list, thereby
shrinking the list. But previously, the caller was not informed about
the shrinkage. This would cause a spurious error message to be
emitted by cmd_fetch_pack() if "git fetch-pack" is called with
duplicate refnames.
Instead, remove duplicates using string_list_remove_duplicates(),
which adjusts sought->nr to reflect the new length of the list.
The last test of t5500 inexplicably *required* "git fetch-pack" to
fail when fetching a list of references that contains duplicates;
i.e., it insisted on the buggy behavior. So change the test to expect
the correct behavior.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document some bugs in "git fetch-pack":
1. If "git fetch-pack" is called with "--all", "--depth", and an
explicit existing non-tag reference to fetch, then it falsely reports
that the reference was not found, even though it was fetched
correctly.
2. If "git fetch-pack" is called with "--all", "--depth", and an
explicit existing tag reference to fetch, then it segfaults in
filter_refs() because return_refs is used without having been
initialized.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If "git fetch-pack" is called with reference names that do not exist
on the remote, then it should emit an error message
error: no such remote ref refs/heads/xyzzy
This is currently broken if *only* missing references are passed to
"git fetch-pack".
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- do not fetch HEAD
- do not also fetch refs following "xxx"
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git fetch" encounters repositories with too many references, the
command line of "fetch-pack" that is run by a helper e.g. remote-curl, may
fail to hold all of them. Now such an internal invocation can feed the
references through the standard input of "fetch-pack".
By Ivan Todoroski
* it/fetch-pack-many-refs:
remote-curl: main test case for the OS command line overflow
fetch-pack: test cases for the new --stdin option
remote-curl: send the refs to fetch-pack on stdin
fetch-pack: new --stdin option to read refs from stdin
Conflicts:
t/t5500-fetch-pack.sh
These test cases focus only on testing the parsing of refs on stdin,
without bothering with the rest of the fetch-pack machinery. We pass in
the refs using different combinations of command line and stdin and then
we watch fetch-pack's stdout to see whether it prints all the refs we
specified (but we ignore their order).
Signed-off-by: Ivan Todoroski <grnch@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because a tag ref cannot be put to HEAD, HEAD will become detached.
This is consistent with "git checkout <tag>".
This is mostly useful in shallow clone, where it allows you to clone a
tag in addtion to branches.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's possible that users make a typo in the branch name. Stop and let
users recheck. Falling back to remote's HEAD is not documented any
way.
Except when using remote helper, the pack has not been transferred at
this stage yet so we don't waste much bandwidth.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When --single-branch is given, only one branch, either HEAD or one
specified by --branch, will be fetched. Also only tags that point to
the downloaded history are fetched.
This helps most in shallow clones, where it can reduce the download to
minimum and that is why it is enabled by default when --depth is given.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fetch-pack documentation is very clear that refs given
on the command line are to be full refs:
<refs>...::
The remote heads to update from. This is relative to
$GIT_DIR (e.g. "HEAD", "refs/heads/master"). When
unspecified, update from all heads the remote side has.
and this has been the case since fetch-pack was originally documented in
8b3d9dc ([PATCH] Documentation: clone/fetch/upload., 2005-07-14).
Let's follow our own documentation to set a good example,
and to avoid breaking when this restriction is enforced in
the next patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Breaks in a test assertion's && chain can potentially hide
failures from earlier commands in the chain.
Commands intended to fail should be marked with !, test_must_fail, or
test_might_fail. The examples in this patch do not require that.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If all refs sent by the remote repo during a fetch are reachable
locally, then no further conversation is performed with the remote. This
check is skipped when the --depth argument is provided to allow the
deepening of a shallow clone which corresponding remote repo has no
changed.
However, some additional filtering was added in commit c29727d5 to
remove those refs which are equal on both sides. If the remote repo has
not changed, then the list of refs to give the remote process becomes
empty and simply attempting to deepen a shallow repo always fails.
Let's stop being smart in that case and simply send the whole list over
when that condition is met. The remote will do the right thing anyways.
Test cases for this issue are also provided.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code outside of the test harness was emitting "Initializing..." from
git-init. Fixup this test to be more modern:
- test_expect_object_count() and count_objects() are unused
- use grep directly instead of test "..." = $(grep ...)
- end the test_expect_success line with a single-quote and put the
test on a new line
- put as much code inside the test harness as possible
- no_strict_count_check is unused and duplicates the test
"new object count"
- use && whenever possible to catch errors early
- use test_tick instead of GIT_AUTHOR_DATE=$sec
- remove debugging aid log.txt
- use subshells instead of cd-ing around
Also merge the pull test into one large test.
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adds the total pack size (including indexes) the verbose count-objects
output, floored to the nearest kilobyte.
Updates documentation to match this addition.
Signed-off-by: Marcus Griep <marcus@griep.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch changes every occurrence of "! git" -- with the meaning
that a git call has to gracefully fail -- into "test_must_fail git".
This is useful to
- make sure the test does not fail because of a signal,
e.g. SIGSEGV, and
- advertise the use of "test_must_fail" for new tests.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes the remainder of the issues where the test script itself is at
fault for failing when the git checkout path contains whitespace or other
shell metacharacters.
The majority of git svn tests used the idiom
test_expect_success "title" "test script using $svnrepo"
These were changed to have the test script in single-quotes:
test_expect_success "title" 'test script using "$svnrepo"'
which unfortunately makes the patch appear larger than it really is.
One consequence of this change is that in the verbose test output the
value of $svnrepo (and in some cases other variables, too) is no
longer expanded, i.e. previously we saw
* expecting success:
test script using /path/to/git/t/trash/svnrepo
but now it is:
* expecting success:
test script using "$svnrepo"
Signed-off-by: Bryan Donlan <bdonlan@fushizen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This form is not portable across all shells, so replace instances of:
export FOO=bar
with:
FOO=bar
export FOO
Signed-off-by: Bryan Donlan <bdonlan@fushizen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, test_expect_failure was designed to be the opposite
of test_expect_success, but this was a bad decision. Most tests
run a series of commands that leads to the single command that
needs to be tested, like this:
test_expect_{success,failure} 'test title' '
setup1 &&
setup2 &&
setup3 &&
what is to be tested
'
And expecting a failure exit from the whole sequence misses the
point of writing tests. Your setup$N that are supposed to
succeed may have failed without even reaching what you are
trying to test. The only valid use of test_expect_failure is to
check a trivial single command that is expected to fail, which
is a minority in tests of Porcelain-ish commands.
This large-ish patch rewrites all uses of test_expect_failure to
use test_expect_success and rewrites the condition of what is
tested, like this:
test_expect_success 'test title' '
setup1 &&
setup2 &&
setup3 &&
! this command should fail
'
test_expect_failure is redefined to serve as a reminder that
that test *should* succeed but due to a known breakage in git it
currently does not pass. So if git-foo command should create a
file 'bar' but you discovered a bug that it doesn't, you can
write a test like this:
test_expect_failure 'git-foo should create bar' '
rm -f bar &&
git foo &&
test -f bar
'
This construct acts similar to test_expect_success, but instead
of reporting "ok/FAIL" like test_expect_success does, the
outcome is reported as "FIXED/still broken".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This changes the behaviour of cloning from a repository on the
local machine, by defaulting to "-l" (use hardlinks to share
files under .git/objects) and making "-l" a no-op. A new
option, --no-hardlinks, is also added to cause file-level copy
of files under .git/objects while still avoiding the normal
"pack to pipe, then receive and index pack" network transfer
overhead. The old behaviour of local cloning without -l nor -s
is availble by specifying the source repository with the newly
introduced file:///path/to/repo.git/ syntax (i.e. "same as
network" cloning).
* With --no-hardlinks (i.e. have all .git/objects/ copied via
cpio) would not catch the source repository corruption, and
also risks corrupted recipient repository if an
alpha-particle hits memory cell while indexing and resolving
deltas. As long as the recipient is created uncorrupted, you
have a good back-up.
* same-as-network is expensive, but it would catch the breakage
of the source repository. It still risks corrupted recipient
repository due to hardware failure. As long as the recipient
is created uncorrupted, you have a good back-up.
* The new default on the same filesystem, as long as the source
repository is healthy, it is very likely that the recipient
would be, too. Also it is very cheap. You do not get any
back-up benefit, though.
None of the method is resilient against the source repository
corruption, so let's discount that from the comparison. Then
the difference with and without --no-hardlinks matters primarily
if you value the back-up benefit or not. If you want to use the
cloned repository as a back-up, then it is cheaper to do a clone
with --no-hardlinks and two git-fsck (source before clone,
recipient after clone) than same-as-network clone, especially as
you are likely to do a git-fsck on the recipient if you are so
paranoid anyway.
Which leads me to believe that being able to use file:/// is
probably a good idea, if only for testability, but probably of
little practical value. We default to hardlinked clone for
everyday use, and paranoids can use --no-hardlinks as a way to
make a back-up.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows transfer.unpackLimit to specify what these two
configuration variables want to set.
We would probably want to deprecate the two separate variables,
as I do not see much point in specifying them independently.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes git-fetch over git native protocol to automatically
decide to keep the downloaded pack if the fetch results in more
than 100 objects, just like receive-pack invoked by git-push
does. This logic is disabled when --keep is explicitly given
from the command line, so that a very small clone still keeps
the downloaded pack as before.
The 100 threshold can be adjusted with fetch.unpacklimit
configuration. We might want to introduce transfer.unpacklimit
to consolidate the two unpacklimit variables, which will be a
topic for the next patch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
While 'init-db' still is and probably will always remain a valid git
command for obvious backward compatibility reasons, it would be a good
idea to move shipped tools and docs to using 'init' instead.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is to adjust to:
count-objects -v: show number of packs as well.
which will break a test in this series.
Signed-off-by: Junio C Hamano <junkio@cox.net>
None of the variables seem to conflict, so local was unnecessary.
Also replaced ${var:pos:len} with the sed equivalent.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Relying on eye-candy progress bar was fragile to begin with.
Run fetch-pack with -k option, and count the objects that are in
the pack that were transferred from the other end.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Now pack-object is not as chatty when its stderr is not connected
to a terminal, so the test needs to be adjusted for that.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This test provides a minimal example of what went wrong with the old
git-fetch-pack (and now works beautifully).
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>