Граф коммитов

215 Коммитов

Автор SHA1 Сообщение Дата
Jeff King 819b929d33 pkt-line: teach packet_read_line to chomp newlines
The packets sent during ref negotiation are all terminated
by newline; even though the code to chomp these newlines is
short, we end up doing it in a lot of places.

This patch teaches packet_read_line to auto-chomp the
trailing newline; this lets us get rid of a lot of inline
chomping code.

As a result, some call-sites which are not reading
line-oriented data (e.g., when reading chunks of packfiles
alongside sideband) transition away from packet_read_line to
the generic packet_read interface. This patch converts all
of the existing callsites.

Since the function signature of packet_read_line does not
change (but its behavior does), there is a possibility of
new callsites being introduced in later commits, silently
introducing an incompatibility.  However, since a later
patch in this series will change the signature, such a
commit would have to be merged directly into this commit,
not to the tip of the series; we can therefore ignore the
issue.

This is an internal cleanup and should produce no change of
behavior in the normal case. However, there is one corner
case to note. Callers of packet_read_line have never been
able to tell the difference between a flush packet ("0000")
and an empty packet ("0004"), as both cause packet_read_line
to return a length of 0. Readers treat them identically,
even though Documentation/technical/protocol-common.txt says
we must not; it also says that implementations should not
send an empty pkt-line.

By stripping out the newline before the result gets to the
caller, we will now treat the newline-only packet ("0005\n")
the same as an empty packet, which in turn gets treated like
a flush packet. In practice this doesn't matter, as neither
empty nor newline-only packets are part of git's protocols
(at least not for the line-oriented bits, and readers who
are not expecting line-oriented packets will be calling
packet_read directly, anyway). But even if we do decide to
care about the distinction later, it is orthogonal to this
patch.  The right place to tighten would be to stop treating
empty packets as flush packets, and this change does not
make doing so any harder.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:21 -08:00
Jeff King cdf4fb8e33 pkt-line: drop safe_write function
This is just write_or_die by another name. The one
distinction is that write_or_die will treat EPIPE specially
by suppressing error messages. That's fine, as we die by
SIGPIPE anyway (and in the off chance that it is disabled,
write_or_die will simulate it).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:21 -08:00
Jeff King 97a83fa839 upload-pack: remove packet debugging harness
If you set the GIT_DEBUG_SEND_PACK environment variable,
upload-pack will dump lines it receives in the receive_needs
phase to a descriptor. This debugging harness is a strict
subset of what GIT_TRACE_PACKET can do. Let's just drop it
in favor of that.

A few tests used GIT_DEBUG_SEND_PACK to confirm which
objects get sent; we have to adapt them to the new output
format.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:21 -08:00
Jeff King e58e57e49e upload-pack: do not add duplicate objects to shallow list
When the client tells us it has a shallow object via
"shallow <sha1>", we make sure we have the object, mark it
with a flag, then add it to a dynamic array of shallow
objects. This means that a client can get us to allocate
arbitrary amounts of memory just by flooding us with shallow
lines (whether they have the objects or not). You can
demonstrate it easily with:

  yes '0035shallow e83c5163316f89bfbde7d9ab23ca2e25604af290' |
  git-upload-pack git.git

We already protect against duplicates in want lines by
checking if our flag is already set; let's do the same thing
here. Note that a client can still get us to allocate some
amount of memory by marking every object in the repo as
"shallow" (or "want"). But this at least bounds it with the
number of objects in the repository, which is not under the
control of an upload-pack client.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:21 -08:00
Jeff King b7b021701c upload-pack: use get_sha1_hex to parse "shallow" lines
When we receive a line like "shallow <sha1>" from the
client, we feed the <sha1> part to get_sha1. This is a
mistake, as the argument on a shallow line is defined by
Documentation/technical/pack-protocol.txt to contain an
"obj-id".  This is never defined in the BNF, but it is clear
from the text and from the other uses that it is meant to be
a hex sha1, not an arbitrary identifier (and that is what
fetch-pack has always sent).

We should be using get_sha1_hex instead, which doesn't allow
the client to request arbitrary junk like "HEAD@{yesterday}".
Because this is just marking shallow objects, the client
couldn't actually do anything interesting (like fetching
objects from unreachable reflog entries), but we should keep
our parsing tight to be on the safe side.

Because get_sha1 is for the most part a superset of
get_sha1_hex, in theory the only behavior change should be
disallowing non-hex object references. However, there is
one interesting exception: get_sha1 will only parse
a 40-character hex sha1 if the string has exactly 40
characters, whereas get_sha1_hex will just eat the first 40
characters, leaving the rest. That means that current
versions of git-upload-pack will not accept a "shallow"
packet that has a trailing newline, even though the protocol
documentation is clear that newlines are allowed (even
encouraged) in non-binary parts of the protocol.

This never mattered in practice, though, because fetch-pack,
contrary to the protocol documentation, does not include a
newline in its shallow lines. JGit follows its lead (though
it correctly is strict on the parsing end about wanting a
hex object id).

We do not adjust fetch-pack to send newlines here, as it
would break communication with older versions of git (and
there is no actual benefit to doing so, except for
consistency with other parts of the protocol).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:20 -08:00
Junio C Hamano ce735bf7fd Merge branch 'jc/hidden-refs'
Allow the server side to redact the refs/ namespace it shows to the
client.

Will merge to 'master'.

* jc/hidden-refs:
  upload/receive-pack: allow hiding ref hierarchies
  upload-pack: simplify request validation
  upload-pack: share more code
2013-02-17 15:25:57 -08:00
Junio C Hamano 390eb36b0a upload-pack: optionally allow fetching from the tips of hidden refs
With uploadpack.allowtipsha1inwant configuration option set, future
versions of "git fetch" that allow an exact object name (likely to
have been obtained out of band) on the LHS of the fetch refspec can
make a request with a "want" line that names an object that may not
have been advertised due to transfer.hiderefs configuration.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-07 13:56:52 -08:00
Junio C Hamano daebaa7813 upload/receive-pack: allow hiding ref hierarchies
A repository may have refs that are only used for its internal
bookkeeping purposes that should not be exposed to the others that
come over the network.

Teach upload-pack to omit some refs from its initial advertisement
by paying attention to the uploadpack.hiderefs multi-valued
configuration variable.  Do the same to receive-pack via the
receive.hiderefs variable.  As a convenient short-hand, allow using
transfer.hiderefs to set the value to both of these variables.

Any ref that is under the hierarchies listed on the value of these
variable is excluded from responses to requests made by "ls-remote",
"fetch", etc. (for upload-pack) and "push" (for receive-pack).

Because these hidden refs do not count as OUR_REF, an attempt to
fetch objects at the tip of them will be rejected, and because these
refs do not get advertised, "git push :" will not see local branches
that have the same name as them as "matching" ones to be sent.

An attempt to update/delete these hidden refs with an explicit
refspec, e.g. "git push origin :refs/hidden/22", is rejected.  This
is not a new restriction.  To the pusher, it would appear that there
is no such ref, so its push request will conclude with "Now that I
sent you all the data, it is time for you to update the refs.  I saw
that the ref did not exist when I started pushing, and I want the
result to point at this commit".  The receiving end will apply the
compare-and-swap rule to this request and rejects the push with
"Well, your update request conflicts with somebody else; I see there
is such a ref.", which is the right thing to do. Otherwise a push to
a hidden ref will always be "the last one wins", which is not a good
default.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-07 13:48:47 -08:00
Junio C Hamano 2532d891a4 Merge branch 'nd/fetch-depth-is-broken'
"git fetch --depth" was broken in at least three ways.  The
resulting history was deeper than specified by one commit, it was
unclear how to wipe the shallowness of the repository with the
command, and documentation was misleading.

* nd/fetch-depth-is-broken:
  fetch: elaborate --depth action
  upload-pack: fix off-by-one depth calculation in shallow clone
  fetch: add --unshallow for turning shallow repo into complete one
2013-02-01 12:39:24 -08:00
Junio C Hamano 3f1da57fff upload-pack: simplify request validation
Long time ago, we used to punt on a large (read: asking for more
than 256 refs) fetch request and instead sent a full pack, because
we couldn't fit many refs on the command line of rev-list we run
internally to enumerate the objects to be sent.  To fix this,
565ebbf (upload-pack: tighten request validation., 2005-10-24),
added a check to count the number of refs in the request and matched
with the number of refs we advertised, and changed the invocation of
rev-list to pass "--all" to it, still keeping us under the command
line argument limit.

However, these days we feed the list of objects requested and the
list of objects the other end is known to have via standard input,
so there is no longer a valid reason to special case a full clone
request.  Remove the code associated with "create_full_pack" to
simplify the logic.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-28 21:05:51 -08:00
Junio C Hamano cbbe50db76 upload-pack: share more code
We mark the objects pointed at our refs with "OUR_REF" flag in two
functions (mark_our_ref() and send_ref()), but we can just use the
former as a helper for the latter.

Update the way mark_our_ref() prepares in-core object to use
lookup_unknown_object() to delay reading the actual object data,
just like we did in 435c833 (upload-pack: use peel_ref for ref
advertisements, 2012-10-04).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-18 15:48:49 -08:00
Junio C Hamano e43171a4a7 Merge branch 'nd/upload-pack-shallow-must-be-commit'
A minor consistency check patch that does not have much relevance
to the real world.

* nd/upload-pack-shallow-must-be-commit:
  upload-pack: only accept commits from "shallow" line
2013-01-14 08:15:44 -08:00
Nguyễn Thái Ngọc Duy 4dcb167fc3 fetch: add --unshallow for turning shallow repo into complete one
The user can do --depth=2147483647 (*) for restoring complete repo
now. But it's hard to remember. Any other numbers larger than the
longest commit chain in the repository would also do, but some
guessing may be involved. Make easy-to-remember --unshallow an alias
for --depth=2147483647.

Make upload-pack recognize this special number as infinite depth. The
effect is essentially the same as before, except that upload-pack is
more efficient because it does not have to traverse to the bottom
anymore.

The chance of a user actually wanting exactly 2147483647 commits
depth, not infinite, on a repository with a history that long, is
probably too small to consider. The client can learn to add or
subtract one commit to avoid the special treatment when that actually
happens.

(*) This is the largest positive number a 32-bit signed integer can
    contain. JGit and older C Git store depth as "int" so both are OK
    with this number. Dulwich does not support shallow clone.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-11 09:09:30 -08:00
Nguyễn Thái Ngọc Duy 6293ded348 upload-pack: only accept commits from "shallow" line
We only allow cuts at commits, not arbitrary objects. upload-pack will
fail eventually in register_shallow if a non-commit is given with a
generic error "Object %s is a %s, not a commit". Check it early and
give a more accurate error.

This should never show up in an ordinary session. It's for buggy
clients, or when the user manually edits .git/shallow.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-08 09:28:00 -08:00
Jeff King 435c833237 upload-pack: use peel_ref for ref advertisements
When upload-pack advertises refs, we attempt to peel tags
and advertise the peeled version. We currently hand-roll the
tag dereferencing, and use as many optimizations as we can
to avoid loading non-tag objects into memory.

Not only has peel_ref recently learned these optimizations,
too, but it also contains an even more important one: it
has access to the "peeled" data from the pack-refs file.
That means we can avoid not only loading annotated tags
entirely, but also avoid doing any kind of object lookup at
all.

This cut the CPU time to advertise refs by 50% in the
linux-2.6 repo, as measured by:

  echo 0000 | git-upload-pack . >/dev/null

best-of-five, warm cache, objects and refs fully packed:

  [before]             [after]
  real    0m0.026s     real    0m0.013s
  user    0m0.024s     user    0m0.008s
  sys     0m0.000s     sys     0m0.000s

Those numbers are irrelevantly small compared to an actual
fetch. Here's a larger repo (400K refs, of which 12K are
unique, and of which only 107 are unique annotated tags):

  [before]             [after]
  real    0m0.704s     real    0m0.596s
  user    0m0.600s     user    0m0.496s
  sys     0m0.096s     sys     0m0.092s

This shows only a 15% speedup (mostly because it has fewer
actual tags to parse), but a larger absolute value (100ms,
which isn't a lot compared to a real fetch, but this
advertisement happens on every fetch, even if the client is
just finding out they are completely up to date).

In truly pathological cases, where you have a large number
of unique annotated tags, it can make an even bigger
difference. Here are the numbers for a linux-2.6 repository
that has had every seventh commit tagged (so about 50K
tags):

  [before]             [after]
  real    0m0.443s     real    0m0.097s
  user    0m0.416s     user    0m0.080s
  sys     0m0.024s     sys     0m0.012s

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-04 20:34:29 -07:00
Jeff King ff5effdf45 include agent identifier in capability string
Instead of having the client advertise a particular version
number in the git protocol, we have managed extensions and
backwards compatibility by having clients and servers
advertise capabilities that they support. This is far more
robust than having each side consult a table of
known versions, and provides sufficient information for the
protocol interaction to complete.

However, it does not allow servers to keep statistics on
which client versions are being used. This information is
not necessary to complete the network request (the
capabilities provide enough information for that), but it
may be helpful to conduct a general survey of client
versions in use.

We already send the client version in the user-agent header
for http requests; adding it here allows us to gather
similar statistics for non-http requests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-03 13:03:34 -07:00
Junio C Hamano d1afa8baa2 Merge branch 'jk/parse-object-cached'
* jk/parse-object-cached:
  upload-pack: avoid parsing tag destinations
  upload-pack: avoid parsing objects during ref advertisement
  parse_object: try internal cache before reading object db
2012-01-29 13:18:55 -08:00
Junio C Hamano f47182c852 server_supports(): parse feature list more carefully
We have been carefully choosing feature names used in the protocol
extensions so that the vocabulary does not contain a word that is a
substring of another word, so it is not a real problem, but we have
recently added "quiet" feature word, which would mean we cannot later
add some other word with "quiet" (e.g. "quiet-push"), which is awkward.

Let's make sure that we can eventually be able to do so by teaching the
clients and servers that feature words consist of non whitespace
letters. This parser also allows us to later add features with parameters
e.g. "feature=1.5" (parameter values need to be quoted for whitespaces,
but we will worry about the detauls when we do introduce them).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-08 14:26:28 -08:00
Jeff King 90108a2441 upload-pack: avoid parsing tag destinations
When upload-pack advertises refs, it dereferences any tags
it sees, and shows the resulting sha1 to the client. It does
this by calling deref_tag. That function must load and parse
each tag object to find the sha1 of the tagged object.
However, it also ends up parsing the tagged object itself,
which is not strictly necessary for upload-pack's use.

Each tag produces two object loads (assuming it is not a
recursive tag), when it could get away with only a single
one. Dropping the second load halves the effort we spend.

The downside is that we are no longer verifying the
resulting object by loading it. In particular:

  1. We never cross-check the "type" field given in the tag
     object with the type of the pointed-to object.  If the
     tag says it points to a tag but doesn't, then we will
     keep peeling and realize the error.  If the tag says it
     points to a non-tag but actually points to a tag, we
     will stop peeling and just advertise the pointed-to
     tag.

  2. If we are missing the pointed-to object, we will not
     realize (because we never even look it up in the object
     db).

However, both of these are errors in the object database,
and both will be detected if a client actually requests the
broken objects in question. So we are simply pushing the
verification away from the advertising stage, and down to
the actual fetching stage.

On my test repo with 120K refs, this drops the time to
advertise the refs from ~3.2s to ~2.0s.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-06 13:28:57 -08:00
Jeff King 926f1dd954 upload-pack: avoid parsing objects during ref advertisement
When we advertise a ref, the first thing we do is parse the
pointed-to object. This gives us two things:

  1. a "struct object" we can use to store flags

  2. the type of the object, so we know whether we need to
     dereference it as a tag

Instead, we can just use lookup_unknown_object to get an
object struct, and then fill in just the type field using
sha1_object_info (which, in the case of packed files, can
find the information without actually inflating the object
data).

This can save time if you have a large number of refs, and
the client isn't actually going to request those refs (e.g.,
because most of them are already up-to-date).

The downside is that we are no longer verifying objects that
we advertise by fully parsing them (however, we do still
know we actually have them, because sha1_object_info must
find them to get the type). While we might fail to detect a
corrupt object here, if the client actually fetches the
object, we will parse (and verify) it then.

On a repository with 120K refs, the advertisement portion of
upload-pack goes from ~3.4s to 3.2s (the failure to speed up
more is largely due to the fact that most of these refs are
tags, which need dereferenced to find the tag destination
anyway).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-06 13:28:55 -08:00
Ævar Arnfjörð Bjarmason 5e9637c629 i18n: add infrastructure for translating Git with gettext
Change the skeleton implementation of i18n in Git to one that can show
localized strings to users for our C, Shell and Perl programs using
either GNU libintl or the Solaris gettext implementation.

This new internationalization support is enabled by default. If
gettext isn't available, or if Git is compiled with
NO_GETTEXT=YesPlease, Git falls back on its current behavior of
showing interface messages in English. When using the autoconf script
we'll auto-detect if the gettext libraries are installed and act
appropriately.

This change is somewhat large because as well as adding a C, Shell and
Perl i18n interface we're adding a lot of tests for them, and for
those tests to work we need a skeleton PO file to actually test
translations. A minimal Icelandic translation is included for this
purpose. Icelandic includes multi-byte characters which makes it easy
to test various edge cases, and it's a language I happen to
understand.

The rest of the commit message goes into detail about various
sub-parts of this commit.

= Installation

Gettext .mo files will be installed and looked for in the standard
$(prefix)/share/locale path. GIT_TEXTDOMAINDIR can also be set to
override that, but that's only intended to be used to test Git itself.

= Perl

Perl code that's to be localized should use the new Git::I18n
module. It imports a __ function into the caller's package by default.

Instead of using the high level Locale::TextDomain interface I've
opted to use the low-level (equivalent to the C interface)
Locale::Messages module, which Locale::TextDomain itself uses.

Locale::TextDomain does a lot of redundant work we don't need, and
some of it would potentially introduce bugs. It tries to set the
$TEXTDOMAIN based on package of the caller, and has its own
hardcoded paths where it'll search for messages.

I found it easier just to completely avoid it rather than try to
circumvent its behavior. In any case, this is an issue wholly
internal Git::I18N. Its guts can be changed later if that's deemed
necessary.

See <AANLkTilYD_NyIZMyj9dHtVk-ylVBfvyxpCC7982LWnVd@mail.gmail.com> for
a further elaboration on this topic.

= Shell

Shell code that's to be localized should use the git-sh-i18n
library. It's basically just a wrapper for the system's gettext.sh.

If gettext.sh isn't available we'll fall back on gettext(1) if it's
available. The latter is available without the former on Solaris,
which has its own non-GNU gettext implementation. We also need to
emulate eval_gettext() there.

If neither are present we'll use a dumb printf(1) fall-through
wrapper.

= About libcharset.h and langinfo.h

We use libcharset to query the character set of the current locale if
it's available. I.e. we'll use it instead of nl_langinfo if
HAVE_LIBCHARSET_H is set.

The GNU gettext manual recommends using langinfo.h's
nl_langinfo(CODESET) to acquire the current character set, but on
systems that have libcharset.h's locale_charset() using the latter is
either saner, or the only option on those systems.

GNU and Solaris have a nl_langinfo(CODESET), FreeBSD can use either,
but MinGW and some others need to use libcharset.h's locale_charset()
instead.

=Credits

This patch is based on work by Jeff Epler <jepler@unpythonic.net> who
did the initial Makefile / C work, and a lot of comments from the Git
mailing list, including Jonathan Nieder, Jakub Narebski, Johannes
Sixt, Erik Faye-Lund, Peter Krefting, Junio C Hamano, Thomas Rast and
others.

[jc: squashed a small Makefile fix from Ramsay]

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-05 20:46:55 -08:00
Junio C Hamano 2e2e7e9dd0 Merge branch 'jc/fetch-verify'
* jc/fetch-verify:
  fetch: verify we have everything we need before updating our ref
  rev-list --verify-object
  list-objects: pass callback data to show_objects()
2011-10-05 12:36:20 -07:00
Junio C Hamano f817f2fbb5 Merge branch 'jc/traverse-commit-list'
* jc/traverse-commit-list:
  revision.c: update show_object_with_name() without using malloc()
  revision.c: add show_object_with_name() helper function
  rev-list: fix finish_object() call
2011-10-05 12:36:19 -07:00
Junio C Hamano 4947367267 list-objects: pass callback data to show_objects()
The traverse_commit_list() API takes two callback functions, one to show
commit objects, and the other to show other kinds of objects. Even though
the former has a callback data parameter, so that the callback does not
have to rely on global state, the latter does not.

Give the show_objects() callback the same callback data parameter.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-01 15:46:12 -07:00
Junio C Hamano b7fcd00715 Sync with 1.7.6.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-24 12:18:02 -07:00
Brian Harring 2a74532412 get_indexed_object can return NULL if nothing is in that slot; check for it
This fixes a segfault introduced by 051e400; via it, no longer able to
trigger the http/smartserv race.

Signed-off-by: Brian Harring <ferringb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-24 10:50:33 -07:00
Junio C Hamano 91f175165a revision.c: add show_object_with_name() helper function
There are two copies of traverse_commit_list callback that show the object
name followed by pathname the object was found, to produce output similar
to "rev-list --objects".

Unify them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-22 11:34:55 -07:00
Junio C Hamano 2f5cb6aa1e Merge branch 'jc/maint-smart-http-race-upload-pack'
* jc/maint-smart-http-race-upload-pack:
  helping smart-http/stateless-rpc fetch race
2011-08-17 17:35:58 -07:00
Junio C Hamano 051e4005a3 helping smart-http/stateless-rpc fetch race
A request to fetch from a client over smart HTTP protocol is served in
multiple steps. In the first round, the server side shows the set of refs
it has and their values, and the client picks from them and sends "I want
to fetch the history leading to these commits".

When the server tries to respond to this second request, its refs may have
progressed by a push from elsewhere. By design, we do not allow fetching
objects that are not at the tip of an advertised ref, and the server
rejects such a request. The client needs to try again, which is not ideal
especially for a busy server.

Teach upload-pack (which is the workhorse driven by git-daemon and smart
http server interface) that it is OK for a smart-http client to ask for
commits that are not at the tip of any advertised ref, as long as they are
reachable from advertised refs.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-08 15:33:28 -07:00
Josh Triplett 6b01ecfe22 ref namespaces: Support remote repositories via upload-pack and receive-pack
Change upload-pack and receive-pack to use the namespace-prefixed refs
when working with the repository, and use the unprefixed refs when
talking to the client, maintaining the masquerade.  This allows
clone, pull, fetch, and push to work with a suitably configured
GIT_NAMESPACE.

receive-pack advertises refs outside the current namespace as .have refs
(as it currently does for refs in alternates), so that the client can
use them to minimize data transfer but will otherwise ignore them.

With appropriate configuration, this also allows http-backend to expose
namespaces as multiple repositories with different paths.  This only
requires setting GIT_NAMESPACE, which http-backend passes through to
upload-pack and receive-pack.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Jamey Sharp <jamey@minilop.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-11 09:35:38 -07:00
Junio C Hamano 982f6c90ee Merge branch 'jk/maint-upload-pack-shallow'
* jk/maint-upload-pack-shallow:
  upload-pack: start pack-objects before async rev-list
2011-04-27 11:36:42 -07:00
Jeff King b961219779 upload-pack: start pack-objects before async rev-list
In a pthread-enabled version of upload-pack, there's a race condition
that can cause a deadlock on the fflush(NULL) we call from run-command.

What happens is this:

  1. Upload-pack is informed we are doing a shallow clone.

  2. We call start_async() to spawn a thread that will generate rev-list
     results to feed to pack-objects. It gets a file descriptor to a
     pipe which will eventually hook to pack-objects.

  3. The rev-list thread uses fdopen to create a new output stream
     around the fd we gave it, called pack_pipe.

  4. The thread writes results to pack_pipe. Outside of our control,
     libc is doing locking on the stream. We keep writing until the OS
     pipe buffer is full, and then we block in write(), still holding
     the lock.

  5. The main thread now uses start_command to spawn pack-objects.
     Before forking, it calls fflush(NULL) to flush every stdio output
     buffer. It blocks trying to get the lock on pack_pipe.

And we have a deadlock. The thread will block until somebody starts
reading from the pipe. But nobody will read from the pipe until we
finish flushing to the pipe.

To fix this, we swap the start order: we start the
pack-objects reader first, and then the rev-list writer
after. Thus the problematic fflush(NULL) happens before we
even open the new file descriptor (and even if it didn't,
flushing should no longer block, as the reader at the end of
the pipe is now active).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-06 14:38:47 -07:00
Junio C Hamano 2eee1393f3 Merge branches 'sp/maint-fetch-pack-stop-early' and 'sp/maint-upload-pack-stop-early'
* sp/maint-fetch-pack-stop-early:
  enable "no-done" extension only when fetching over smart-http

* sp/maint-upload-pack-stop-early:
  enable "no-done" extension only when serving over smart-http
2011-03-29 14:09:02 -07:00
Junio C Hamano 4e10cf9a17 Revert two "no-done" reverts
Last night I had to make these two emergency reverts, but now we have a
better understanding of which part of the topic was broken, let's get rid
of the revert to fix it correctly.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-29 12:29:10 -07:00
Junio C Hamano cf2ad8e641 enable "no-done" extension only when serving over smart-http
Do not advertise no-done capability when upload-pack is not serving over
smart-http, as there is no way for this server to know when it should stop
reading in-flight data from the client, even though it is necessary to
drain all the in-flight data in order to unblock the client.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
2011-03-29 12:21:28 -07:00
Junio C Hamano 4793b7e86d Revert "upload-pack: Implement no-done capability"
This reverts 3e63b21 (upload-pack: Implement no-done capability,
2011-03-14).  Together with 761ecf0 (fetch-pack: Implement no-done
capability, 2011-03-14) it seems to make the fetch-pack process out of
sync and makes it keep talking long after upload-pack stopped listening to
it, terminating the process with SIGPIPE.
2011-03-28 23:33:51 -07:00
Junio C Hamano f59bf09678 Merge branch 'sp/maint-upload-pack-stop-early'
* sp/maint-upload-pack-stop-early:
  upload-pack: Implement no-done capability
  upload-pack: More aggressively send 'ACK %s ready'
2011-03-22 21:38:06 -07:00
Shawn O. Pearce 3e63b21ace upload-pack: Implement no-done capability
If the client requests both multi_ack_detailed and no-done then
upload-pack is free to immediately send a PACK following its first
'ACK %s ready' message.  The upload-pack response actually winds
up being:

  ACK %s common
  ... (maybe more) ...
  ACK %s ready
  NAK
  ACK %s
  PACK.... the pack stream ....

For smart HTTP connections this saves one HTTP RPC, reducing
the overall latency for a trivial fetch.  For git:// and ssh://
a no-done option slightly reduces latency by removing one
server->client->server round-trip at the end of the common
ancestor negotiation.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-15 12:14:35 -07:00
Shawn O. Pearce 49bee717f7 upload-pack: More aggressively send 'ACK %s ready'
If a client is merely following the remote (and has not made any
new commits itself), all "have %s" lines sent by the client will be
common to the server.  As all lines are common upload-pack never
calls ok_to_give_up() and does not compute if it has a good cut
point in the commit graph.

Without this computation the following client is going to send all
tagged commits, as these were determined to be COMMON_REF during the
initial advertisement, but the client does not parse their history
to transitively pass the COMMON flag and empty its queue of commits.

For git.git with 339 commit tags, it takes clients 11 rounds of
negotation to fully send all tagged commits and exhaust its queue
of things to send as common.  This is pretty slow for a client that
has not done any local development activity.

Force computing ok_to_give_up() and send "ACK %s ready" at the end
of the current round if this round only contained common objects
and ok_to_give_up() was therefore not called.  This may allow the
client to break early, avoiding transmission of the COMMON_REFs.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-14 17:27:25 -07:00
Jeff King bbc30f9963 add packet tracing debug code
This shows a trace of all packets coming in or out of a given
program. This can help with debugging object negotiation or
other protocol issues.

To keep the code changes simple, we operate at the lowest
level, meaning we don't necessarily understand what's in the
packets. The one exception is a packet starting with "PACK",
which causes us to skip that packet and turn off tracing
(since the gigantic pack data will not be interesting to
read, at least not in the trace format).

We show both written and read packets. In the local case,
this may mean you will see packets twice (written by the
sender and read by the receiver). However, for cases where
the other end is remote, this allows you to see the full
conversation.

Packet tracing can be enabled with GIT_TRACE_PACKET=<foo>,
where <foo> takes the same arguments as GIT_TRACE.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-08 12:12:04 -08:00
Thiago Farina 47e44ed1dc commit: Add commit_list prefix in two function names.
Add commit_list prefix to insert_by_date function and to sort_by_date,
so it's clear that these functions refer to commit_list structure.

Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29 14:01:52 -08:00
Štěpán Němec 62b4698e55 Use angles for placeholders consistently
Signed-off-by: Štěpán Němec <stepnem@gmail.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-08 12:29:52 -07:00
Thiago Farina 3cd474599f object.h: Add OBJECT_ARRAY_INIT macro and make use of it.
Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-29 22:42:49 -07:00
Elijah Newren 9f9aa76130 upload-pack: Improve error message when bad ref requested
When printing an error message saying a ref was requested that we do not
have, only print that ref, rather than the ref and everything sent to us
on the same packet line (e.g. protocol support specifications).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-02 15:31:59 -07:00
Nguyễn Thái Ngọc Duy 1e39d7deea upload-pack: remove unused "create_full_pack" code in do_rev_list
A bit of history in chronological order, the newest at bottom:

- 80ccaa7 (upload-pack: Move the revision walker into a separate function.)
   do_rev_list was introduced with create_full_pack argument

- 21edd3f (upload-pack: Run rev-list in an asynchronous function.)
   do_rev_list was now called by start_async, create_full_pack was
   passed by rev_list.data

- f0cea83 (Shift object enumeration out of upload-pack)
   rev_list.data was now zero permanently. Creating full pack was
   done by passing --all to pack-objects

- ae6a560 (run-command: support custom fd-set in async)
   rev_list.data = 0 was found out redudant and got rid of.

Get rid of the code as well, for less headache while reading do_rev_list.

[jc: noticed by Elijah Newren]

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-28 13:50:11 -07:00
Erik Faye-Lund ae6a5609c0 run-command: support custom fd-set in async
This patch adds the possibility to supply a set of non-0 file
descriptors for async process communication instead of the
default-created pipe.

Additionally, we now support bi-directional communiction with the
async procedure, by giving the async function both read and write
file descriptors.

To retain compatiblity and similar "API feel" with start_command,
we require start_async callers to set .out = -1 to get a readable
file descriptor.  If either of .in or .out is 0, we supply no file
descriptor to the async process.

[sp: Note: Erik started this patch, and a huge bulk of it is
     his work.  All bugs were introduced later by Shawn.]

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-05 20:57:22 -08:00
Junio C Hamano 4cb51a65a4 Sync with 1.6.5.6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-10 16:20:59 -08:00
Junio C Hamano 1456b043fc Remove post-upload-hook
This hook runs after "git fetch" in the repository the objects are
fetched from as the user who fetched, and has security implications.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-10 12:21:40 -08:00
Junio C Hamano ef3a4fd670 Merge branch 'np/maint-sideband-favor-status' into maint
* np/maint-sideband-favor-status:
  give priority to progress messages
2009-12-03 13:50:24 -08:00
Junio C Hamano 905bf7742c Merge branch 'sp/smart-http'
* sp/smart-http: (37 commits)
  http-backend: Let gcc check the format of more printf-type functions.
  http-backend: Fix access beyond end of string.
  http-backend: Fix bad treatment of uintmax_t in Content-Length
  t5551-http-fetch: Work around broken Accept header in libcurl
  t5551-http-fetch: Work around some libcurl versions
  http-backend: Protect GIT_PROJECT_ROOT from /../ requests
  Git-aware CGI to provide dumb HTTP transport
  http-backend: Test configuration options
  http-backend: Use http.getanyfile to disable dumb HTTP serving
  test smart http fetch and push
  http tests: use /dumb/ URL prefix
  set httpd port before sourcing lib-httpd
  t5540-http-push: remove redundant fetches
  Smart HTTP fetch: gzip requests
  Smart fetch over HTTP: client side
  Smart push over HTTP: client side
  Discover refs via smart HTTP server when available
  http-backend: more explict LocationMatch
  http-backend: add example for gitweb on same URL
  http-backend: use mod_alias instead of mod_rewrite
  ...

Conflicts:
	.gitignore
	remote-curl.c
2009-11-20 23:51:23 -08:00
Junio C Hamano e36e6c00cd Merge branch 'np/maint-sideband-favor-status'
* np/maint-sideband-favor-status:
  give priority to progress messages
2009-11-17 22:03:20 -08:00
Nicolas Pitre 6b59f51b31 give priority to progress messages
In theory it is possible for sideband channel #2 to be delayed if
pack data is quick to come up for sideband channel #1.  And because
data for channel #2 is read only 128 bytes at a time while pack data
is read 8192 bytes at a time, it is possible for many pack blocks to
be sent to the client before the progress message fifo is emptied,
making the situation even worse.  This would result in totally garbled
progress display on the client's console as local progress gets mixed
with partial remote progress lines.

Let's prevent such situations by giving transmission priority to
progress messages over pack data at all times.

Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-13 14:39:25 -08:00
Shawn O. Pearce 42526b478e Add stateless RPC options to upload-pack, receive-pack
When --stateless-rpc is passed as a command line parameter to
upload-pack or receive-pack the programs now assume they may
perform only a single read-write cycle with stdin and stdout.
This fits with the HTTP POST request processing model where a
program may read the request, write a response, and must exit.

When --advertise-refs is passed as a command line parameter only
the initial ref advertisement is output, and the program exits
immediately.  This fits with the HTTP GET request model, where
no request content is received but a response must be produced.

HTTP headers and/or environment are not processed here, but
instead are assumed to be handled by the program invoking
either service backend.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-04 17:58:14 -08:00
Shawn O. Pearce 78affc49de Add multi_ack_detailed capability to fetch-pack/upload-pack
When multi_ack_detailed is enabled the ACK continue messages returned
by the remote upload-pack are broken out to describe the different
states within the peer.  This permits the client to better understand
the server's in-memory state.

The fetch-pack/upload-pack protocol now looks like:

NAK
---------------------------------
  Always sent in response to "done" if there was no common base
  selected from the "have" lines (or no have lines were sent).

  * no multi_ack or multi_ack_detailed:

    Sent when the client has sent a pkt-line flush ("0000") and
    the server has not yet found a common base object.

  * either multi_ack or multi_ack_detailed:

    Always sent in response to a pkt-line flush.

ACK %s
-----------------------------------
  * no multi_ack or multi_ack_detailed:

    Sent in response to "have" when the object exists on the remote
    side and is therefore an object in common between the peers.
    The argument is the SHA-1 of the common object.

  * either multi_ack or multi_ack_detailed:

    Sent in response to "done" if there are common objects.
    The argument is the last SHA-1 determined to be common.

ACK %s continue
-----------------------------------
  * multi_ack only:

    Sent in response to "have".

    The remote side wants the client to consider this object as
    common, and immediately stop transmitting additional "have"
    lines for objects that are reachable from it.  The reason
    the client should stop is not given, but is one of the two
    cases below available under multi_ack_detailed.

ACK %s common
-----------------------------------
  * multi_ack_detailed only:

    Sent in response to "have".  Both sides have this object.
    Like with "ACK %s continue" above the client should stop
    sending have lines reachable for objects from the argument.

ACK %s ready
-----------------------------------
  * multi_ack_detailed only:

    Sent in response to "have".

    The client should stop transmitting objects which are reachable
    from the argument, and send "done" soon to get the objects.

    If the remote side has the specified object, it should
    first send an "ACK %s common" message prior to sending
    "ACK %s ready".

    Clients may still submit additional "have" lines if there are
    more side branches for the client to explore that might be added
    to the common set and reduce the number of objects to transfer.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-30 19:20:54 -07:00
Jim Meyering 41698375ad don't dereference NULL upon fdopen failure
There were several unchecked use of fdopen(); replace them with xfdopen()
that checks and dies.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-13 01:32:20 -07:00
Jim Meyering 2b7ca830c6 use write_str_in_full helper to avoid literal string lengths
In 2d14d65 (Use a clearer style to issue commands to remote helpers,
2009-09-03) I happened to notice two changes like this:

-	write_in_full(helper->in, "list\n", 5);
+
+	strbuf_addstr(&buf, "list\n");
+	write_in_full(helper->in, buf.buf, buf.len);
+	strbuf_reset(&buf);

IMHO, it would be better to define a new function,

    static inline ssize_t write_str_in_full(int fd, const char *str)
    {
           return write_in_full(fd, str, strlen(str));
    }

and then use it like this:

-       strbuf_addstr(&buf, "list\n");
-       write_in_full(helper->in, buf.buf, buf.len);
-       strbuf_reset(&buf);
+       write_str_in_full(helper->in, "list\n");

Thus not requiring the added allocation, and still avoiding
the maintenance risk of literal string lengths.
These days, compilers are good enough that strlen("literal")
imposes no run-time cost.

Transformed via this:

    perl -pi -e \
        's/write_in_full\((.*?), (".*?"), \d+\)/write_str_in_full($1, $2)/'\
      $(git grep -l 'write_in_full.*"')

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-13 01:31:10 -07:00
Junio C Hamano e4d1afbcf2 Merge branch 'jc/upload-pack-hook'
* jc/upload-pack-hook:
  upload-pack: feed "kind [clone|fetch]" to post-upload-pack hook
  upload-pack: add a trigger for post-upload-pack hook
2009-09-07 15:24:47 -07:00
Junio C Hamano 8e4384fd44 Merge branch 'np/maint-1.6.3-deepen'
* np/maint-1.6.3-deepen:
  pack-objects: free preferred base memory after usage
  make shallow repository deepening more network efficient
2009-09-07 15:23:50 -07:00
Nicolas Pitre 6523078b96 make shallow repository deepening more network efficient
First of all, I can't find any reason why thin pack generation is
explicitly disabled when dealing with a shallow repository.  The
possible delta base objects are collected from the edge commits which
are always obtained through history walking with the same shallow refs
as the client, Therefore the client is always going to have those base
objects available. So let's remove that restriction.

Then we can make shallow repository deepening much more efficient by
using the remote's unshallowed commits as edge commits to get preferred
base objects for thin pack generation.  On git.git, this makes the data
transfer for the deepening of a shallow repository from depth 1 to depth 2
around 134 KB instead of 3.68 MB.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-05 22:25:26 -07:00
Brian Gianforcaro eeefa7c90e Style fixes, add a space after if/for/while.
The majority of code in core git appears to use a single
space after if/for/while. This is an attempt to bring more
code to this standard. These are entirely cosmetic changes.

Signed-off-by: Brian Gianforcaro <b.gianfo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-31 23:26:28 -07:00
Junio C Hamano 11cae066b2 upload-pack: feed "kind [clone|fetch]" to post-upload-pack hook
A request to clone the repository does not give any "have" but asks for
all the refs we offer with "want".  When a request does not ask to clone
the repository fully, but asks to fetch some refs into an empty
repository, it will not give any "have" but its "want" won't ask for all
the refs we offer.

If we suppose (and I would say this is a rather big if) that it makes
sense to distinguish these two cases, a hook cannot reliably do this
alone.  The hook can detect lack of "have" and bunch of "want", but there
is no direct way to tell if the other end asked for all refs we offered,
or merely most of them.

Between the time we talked with the other end and the time the hook got
called, we may have acquired more refs or lost some refs in the repository
by concurrent operations.  Given that we plan to introduce selective
advertisement of refs with a protocol extension, it would become even more
difficult for hooks to guess between these two cases.

This adds "kind [clone|fetch]" to hook's input, as a stable interface to
allow the hooks to tell these cases apart.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-28 22:39:24 -07:00
Junio C Hamano a8563ec851 upload-pack: add a trigger for post-upload-pack hook
After upload-pack successfully finishes its operation, post-upload-pack
hook can be called for logging purposes.

The hook is passed various pieces of information, one per line, from its
standard input.  Currently the following items can be fed to the hook, but
more types of information may be added in the future:

    want SHA-1::
        40-byte hexadecimal object name the client asked to include in the
        resulting pack.  Can occur one or more times in the input.

    have SHA-1::
        40-byte hexadecimal object name the client asked to exclude from
        the resulting pack, claiming to have them already.  Can occur zero
        or more times in the input.

    time float::
        Number of seconds spent for creating the packfile.

    size decimal::
        Size of the resulting packfile in bytes.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-28 22:39:17 -07:00
Junio C Hamano f00ecbe42b Merge branch 'cc/replace'
* cc/replace:
  t6050: check pushing something based on a replaced commit
  Documentation: add documentation for "git replace"
  Add git-replace to .gitignore
  builtin-replace: use "usage_msg_opt" to give better error messages
  parse-options: add new function "usage_msg_opt"
  builtin-replace: teach "git replace" to actually replace
  Add new "git replace" command
  environment: add global variable to disable replacement
  mktag: call "check_sha1_signature" with the replacement sha1
  replace_object: add a test case
  object: call "check_sha1_signature" with the replacement sha1
  sha1_file: add a "read_sha1_file_repl" function
  replace_object: add mechanism to replace objects found in "refs/replace/"
  refs: add a "for_each_replace_ref" function
2009-08-21 18:47:53 -07:00
Junio C Hamano 7d1b509812 Merge branch 'ne/futz-upload-pack'
* ne/futz-upload-pack:
  Shift object enumeration out of upload-pack

Conflicts:
	upload-pack.c
2009-08-05 12:38:29 -07:00
Johannes Sixt 9462e3f59c upload-pack: squelch progress indicator if client cannot see it
upload-pack runs pack-objects, which generates progress indicator output
on its stderr. If the client requests a sideband, this indicator is sent
to the client; but if it did not, then the progress is written to
upload-pack's own stderr.

If upload-pack is itself run from git-daemon (and if the client did not
request a sideband) the progress indicator never reaches the client and it
need not be generated in the first place. With this patch the progress
indicator is suppressed in this situation.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-18 11:38:43 -07:00
Nick Edelen f0cea83f63 Shift object enumeration out of upload-pack
Offload object enumeration in upload-pack to pack-objects, but fall
back on internal revision walker for shallow interaction.   Aside from
architecturally making more sense, this also leaves the door open for
pack-objects to employ a revision cache mechanism.  Test t5530 updated
in order to explicitly check both enumeration methods.

Signed-off-by: Nick Edelen <sirnot@gmail.com>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-09 23:49:31 -07:00
Christian Couder dae556bdb1 environment: add global variable to disable replacement
This new "read_replace_refs" global variable is set to 1 by
default, so that replace refs are used by default. But
reachability traversal and packing commands ("cmd_fsck",
"cmd_prune", "cmd_pack_objects", "upload_pack",
"cmd_unpack_objects") set it to 0, as they must work with the
original DAG.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-31 17:02:59 -07:00
Junio C Hamano 7d71be242d Merge branch 'lt/pack-object-memuse' into maint
* lt/pack-object-memuse:
  show_object(): push path_name() call further down
  process_{tree,blob}: show objects without buffering
2009-05-03 15:02:40 -07:00
Junio C Hamano 9824a388e5 Merge branch 'lt/pack-object-memuse'
* lt/pack-object-memuse:
  show_object(): push path_name() call further down
  process_{tree,blob}: show objects without buffering

Conflicts:
	builtin-pack-objects.c
	builtin-rev-list.c
	list-objects.c
	list-objects.h
	upload-pack.c
2009-04-18 14:46:17 -07:00
Linus Torvalds cf2ab916af show_object(): push path_name() call further down
In particular, pushing the "path_name()" call _into_ the show() function
would seem to allow

 - more clarity into who "owns" the name (ie now when we free the name in
   the show_object callback, it's because we generated it ourselves by
   calling path_name())

 - not calling path_name() at all, either because we don't care about the
   name in the first place, or because we are actually happy walking the
   linked list of "struct name_path *" and the last component.

Now, I didn't do that latter optimization, because it would require some
more coding, but especially looking at "builtin-pack-objects.c", we really
don't even want the whole pathname, we really would be better off with the
list of path components.

Why? We use that name for two things:
 - add_preferred_base_object(), which actually _wants_ to traverse the
   path, and now does it by looking for '/' characters!
 - for 'name_hash()', which only cares about the last 16 characters of a
   name, so again, generating the full name seems to be just unnecessary
   work.

Anyway, so I didn't look any closer at those things, but it did convince
me that the "show_object()" calling convention was crazy, and we're
actually better off doing _less_ in list-objects.c, and giving people
access to the internal data structures so that they can decide whether
they want to generate a path-name or not.

This patch does that, and then for people who did use the name (even if
they might do something more clever in the future), it just does the
straightforward "name = path_name(path, component); .. free(name);" thing.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-12 17:28:31 -07:00
Linus Torvalds 8d2dfc49b1 process_{tree,blob}: show objects without buffering
Here's a less trivial thing, and slightly more dubious one.

I was looking at that "struct object_array objects", and wondering why we
do that. I have honestly totally forgotten. Why not just call the "show()"
function as we encounter the objects? Rather than add the objects to the
object_array, and then at the very end going through the array and doing a
'show' on all, just do things more incrementally.

Now, there are possible downsides to this:

 - the "buffer using object_array" _can_ in theory result in at least
   better I-cache usage (two tight loops rather than one more spread out
   one). I don't think this is a real issue, but in theory..

 - this _does_ change the order of the objects printed. Instead of doing a
   "process_tree(revs, commit->tree, &objects, NULL, "");" in the loop
   over the commits (which puts all the root trees _first_ in the object
   list, this patch just adds them to the list of pending objects, and
   then we'll traverse them in that order (and thus show each root tree
   object together with the objects we discover under it)

   I _think_ the new ordering actually makes more sense, but the object
   ordering is actually a subtle thing when it comes to packing
   efficiency, so any change in order is going to have implications for
   packing. Good or bad, I dunno.

 - There may be some reason why we did it that odd way with the object
   array, that I have simply forgotten.

Anyway, now that we don't buffer up the objects before showing them
that may actually result in lower memory usage during that whole
traverse_commit_list() phase.

This is seriously not very deeply tested. It makes sense to me, it seems
to pass all the tests, it looks ok, but...

Does anybody remember why we did that "object_array" thing? It used to be
an "object_list" a long long time ago, but got changed into the array due
to better memory usage patterns (those linked lists of obejcts are
horrible from a memory allocation standpoint). But I wonder why we didn't
do this back then. Maybe there's a reason for it.

Or maybe there _used_ to be a reason, and no longer is.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-12 17:28:31 -07:00
Christian Couder 11c211fa06 list-objects: add "void *data" parameter to show functions
The goal of this patch is to get rid of the "static struct rev_info
revs" static variable in "builtin-rev-list.c".

To do that, we need to pass the revs to the "show_commit" function
in "builtin-rev-list.c" and this in turn means that the
"traverse_commit_list" function in "list-objects.c" must be passed
functions pointers to functions with 2 parameters instead of one.

So we have to change all the callers and all the functions passed
to "traverse_commit_list".

Anyway this makes the code more clean and more generic, so it
should be a good thing in the long run.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-07 22:12:38 -07:00
Benjamin Kramer fd13b21f52 Move local variables to narrower scopes
These weren't used outside and can be safely moved

Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 20:52:23 -08:00
Jeff King 05ac6b34e2 improve missing repository error message
Certain remote commands, when asked to do something in a
particular directory that was not actually a git repository,
would say "unable to chdir or not a git archive". The
"chdir" bit is an unnecessary detail, and the term "git
archive" is much less common these days than "git repository".

So let's switch them all to:

  fatal: '%s' does not appear to be a git repository

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-04 20:37:21 -08:00
Alexander Potashev 34263de026 Replace deprecated dashed git commands in usage
Signed-off-by: Alexander Potashev <aspotashev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-04 15:08:49 -08:00
Steffen Prohaska 2fb3f6db96 Add calls to git_extract_argv0_path() in programs that call git_config_*
Programs that use git_config need to find the global configuration.
When runtime prefix computation is enabled, this requires that
git_extract_argv0_path() is called early in the program's main().

This commit adds the necessary calls.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-26 00:26:05 -08:00
Junio C Hamano 7e44c93558 'git foo' program identifies itself without dash in die() messages
This is a mechanical conversion of all '*.c' files with:

	s/((?:die|error|warning)\("git)-(\S+:)/$1 $2/;

The result was manually inspected and no false positive was found.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-31 09:39:19 -07:00
Johannes Sixt e1464ca7bb Record the command invocation path early
We will need the command invocation path in system_path(). This path was
passed to setup_path(), but  system_path() can be called earlier, for
example via:

    main
      commit_pager_choice
        setup_pager
          git_config
            git_etc_gitconfig
              system_path

Therefore, we introduce git_set_argv0_path() and call it as soon as
possible.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-25 17:41:13 -07:00
Johannes Sixt 618ebe9ff9 Windows: Implement asynchronous functions as threads.
In upload-pack we must explicitly close the output channel of rev-list.
(On Unix, the channel is closed automatically because process that runs
rev-list terminates.)

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
2008-06-26 08:45:08 +02:00
Shawn O. Pearce 348e390b17 Teach fetch-pack/upload-pack about --include-tag
The new protocol extension "include-tag" allows the client side
of the connection (fetch-pack) to request that the server side of the
native git protocol (upload-pack / pack-objects) use --include-tag
as it prepares the packfile, thus ensuring that an annotated tag object
will be included in the resulting packfile if the object it refers to
was also included into the packfile.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-04 23:28:14 -08:00
Shawn O. Pearce 49aaddd102 Teach upload-pack to log the received need lines to an fd
To facilitate testing and verification of the requests sent by
git-fetch to the remote side we permit logging the received packet
lines to the file descriptor specified in GIT_DEBUG_SEND_PACK has
been set.  Special start and end lines are included to indicate
the start and end of each connection.

  $ GIT_DEBUG_SEND_PACK=3 git fetch 3>UPLOAD_LOG
  $ cat UPLOAD_LOG
  #S
  want 8e10cf4e007ad7e003463c30c34b1050b039db78 multi_ack side-band-64k thin-pack ofs-delta
  want ddfa4a33562179aca1ace2bcc662244a17d0b503
  #E
  #S
  want 3253df4d1cf6fb138b52b1938473bcfec1483223 multi_ack side-band-64k thin-pack ofs-delta
  #E

>From the above trace the first connection opened by git-fetch was to
download two refs (with values 8e and dd) and the second connection
was opened to automatically follow an annotated tag (32).

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-03 00:05:45 -08:00
Junio C Hamano eadbcd498a Merge branch 'mk/maint-parse-careful'
* mk/maint-parse-careful:
  receive-pack: use strict mode for unpacking objects
  index-pack: introduce checking mode
  unpack-objects: prevent writing of inconsistent objects
  unpack-object: cache for non written objects
  add common fsck error printing function
  builtin-fsck: move common object checking code to fsck.c
  builtin-fsck: reports missing parent commits
  Remove unused object-ref code
  builtin-fsck: move away from object-refs to fsck_walk
  add generic, type aware object chain walker

Conflicts:

	Makefile
	builtin-fsck.c
2008-03-02 15:11:07 -08:00
Martin Koegler 7914053ba9 Remove unused object-ref code
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-25 23:57:35 -08:00
Junio C Hamano ee4f06c0a6 Merge branch 'mk/maint-parse-careful'
* mk/maint-parse-careful:
  peel_onion: handle NULL
  check return value from parse_commit() in various functions
  parse_commit: don't fail, if object is NULL
  revision.c: handle tag->tagged == NULL
  reachable.c::process_tree/blob: check for NULL
  process_tag: handle tag->tagged == NULL
  check results of parse_commit in merge_bases
  list-objects.c::process_tree/blob: check for NULL
  reachable.c::add_one_tree: handle NULL from lookup_tree
  mark_blob/tree_uninteresting: check for NULL
  get_sha1_oneline: check return value of parse_object
  read_object_with_reference: don't read beyond the buffer
2008-02-18 20:56:01 -08:00
Martin Koegler dec38c8165 check return value from parse_commit() in various functions
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-18 20:49:13 -08:00
Martin Koegler 3d51e1b5b8 check return code of prepare_revision_walk
A failure in prepare_revision_walk can be caused by
a not parseable object.

Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-17 23:51:12 -08:00
Martin Koegler affeef12fb deref_tag: handle return value NULL
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-17 23:46:55 -08:00
Johannes Sixt 04b330551e upload-pack: Initialize the exec-path.
Since git-upload-pack has to spawn git-pack-objects, it has to make sure
that the latter can be found in the PATH. Without this patch an attempt
to clone or pull via ssh from a server fails if the git tools are not in
the standard PATH on the server even though git clone or git pull were
invoked with --upload-pack=/path/to/git-upload-pack.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-13 12:04:01 -08:00
Johannes Sixt 4c324c0050 upload-pack: Use finish_{command,async}() instead of waitpid().
upload-pack spawns two processes, rev-list and pack-objects, and carefully
monitors their status so that it can report failure to the remote end.
This change removes the complicated procedures on the grounds of the
following observations:

- If everything is OK, rev-list closes its output pipe end, upon which
  pack-objects (which reads from the pipe) sees EOF and terminates itself,
  closing its output (and error) pipes. upload-pack reads from both until
  it sees EOF in both. It collects the exit codes of the child processes
  (which indicate success) and terminates successfully.

- If rev-list sees an error, it closes its output and terminates with
  failure. pack-objects sees EOF in its input and terminates successfully.
  Again upload-pack reads its inputs until EOF. When it now collects
  the exit codes of its child processes, it notices the failure of rev-list
  and signals failure to the remote end.

- If pack-objects sees an error, it terminates with failure. Since this
  breaks the pipe to rev-list, rev-list is killed with SIGPIPE.
  upload-pack reads its input until EOF, then collects the exit codes of
  the child processes, notices their failures, and signals failure to the
  remote end.

- If upload-pack itself dies unexpectedly, pack-objects is killed with
  SIGPIPE, and subsequently also rev-list.

The upshot of this is that precise monitoring of child processes is not
required because both terminate if either one of them dies unexpectedly.
This allows us to use finish_command() and finish_async() instead of
an explicit waitpid(2) call.

The change is smaller than it looks because most of it only reduces the
indentation of a large part of the inner loop.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-05 22:47:28 -08:00
Johannes Sixt 21edd3f197 upload-pack: Run rev-list in an asynchronous function.
This gets rid of an explicit fork().

Since upload-pack has to coordinate two processes (rev-list and
pack-objects), we cannot use the normal finish_async(), but have to monitor
the process explicitly. Hence, there are no changes at this front.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-21 01:30:42 -04:00
Johannes Sixt 80ccaa78a8 upload-pack: Move the revision walker into a separate function.
This allows us later to use start_async() with this function, and at
the same time is a nice cleanup that makes a long function
(create_pack_file()) shorter.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-21 01:30:41 -04:00
Johannes Sixt cc41fa8da9 upload-pack: Use start_command() to run pack-objects in create_pack_file().
This gets rid of an explicit fork/exec.

Since upload-pack has to coordinate two processes (rev-list and
pack-objects), we cannot use the normal finish_command(), but have to
monitor the processes explicitly. Hence, the waitpid() call remains.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-21 01:30:40 -04:00
Junio C Hamano 16befb8b7f Even more missing static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-08 02:54:57 -07:00
Junio C Hamano a6080a0a44 War on whitespace
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time.  There are a few files that need
to have trailing whitespaces (most notably, test vectors).  The results
still passes the test, and build result in Documentation/ area is unchanged.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07 00:04:01 -07:00
H. Peter Anvin 465b3518a9 git-upload-pack: make sure we close unused pipe ends
Right now, we don't close the read end of the pipe when git-upload-pack
runs git-pack-object, so we hang forever (why don't we get SIGALRM?)
instead of dying with SIGPIPE if the latter dies, which seems to be the
norm if the client disconnects.

Thanks to Johannes Schindelin <Johannes.Schindelin@gmx.de> for
pointing out where this close() needed to go.

This patch has been tested on kernel.org for several weeks and appear
to resolve the problem of git-upload-pack processes hanging around
forever.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-27 17:05:12 -07:00
Junio C Hamano 3ddad98b74 Merge branch 'js/fetch-progress' (early part)
* 'js/fetch-progress' (early part):
  Fixup no-progress for fetch & clone
  fetch & clone: do not output progress when not on a tty

Conflicts:

	git-fetch.sh
2007-03-04 17:31:21 -08:00
Johannes Schindelin b0e908977e Fixup no-progress for fetch & clone
The intent of the commit 'fetch & clone: do not output progress when
not on a tty' was to make fetching and cloning less chatty when
output was not redirected (such as in a cron job).

However, there was a serious thinko in that commit. It assumed that
the client _and_ the server got this update at the same time. But
this is obviously not the case, and therefore upload-pack died on
seeing the option "--no-progress".

This patch fixes that issue by making it a protocol option. So, until
your server is updated, you still see the progress, but once the
server has this patch, it will be quiet.

A minor issue was also fixed: when cloning, the checkout did not
heed no_progress.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-24 00:26:18 -08:00
Junio C Hamano 599065a3bb prefixcmp(): fix-up mechanical conversion.
Previous step converted use of strncmp() with literal string
mechanically even when the result is only used as a boolean:

    if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo")))

This step manually cleans them up to read:

    if (!prefixcmp(arg, "foo"))

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-20 22:03:15 -08:00
Junio C Hamano cc44c7655f Mechanical conversion to use prefixcmp()
This mechanically converts strncmp() to use prefixcmp(), but only when
the parameters match specific patterns, so that they can be verified
easily.  Leftover from this will be fixed in a separate step, including
idiotic conversions like

    if (!strncmp("foo", arg, 3))

  =>

    if (!(-prefixcmp(arg, "foo")))

This was done by using this script in px.perl

   #!/usr/bin/perl -i.bak -p
   if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) {
           s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|;
   }
   if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) {
           s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|;
   }

and running:

   $ git grep -l strncmp -- '*.c' | xargs perl px.perl

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-20 22:03:15 -08:00
Johannes Schindelin 83a5ad6126 fetch & clone: do not output progress when not on a tty
This adds the option "--no-progress" to fetch-pack and upload-pack,
and makes fetch and clone pass this option when stdout is not a tty.

While at documenting that option, also document --strict and --timeout
options for upload-pack.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-19 19:20:05 -08:00