This puts the determination of options to pass to each ssh variant
(see ssh.variant in git-config(1)) in one place.
A follow-up patch will use this in an initial dry run to detect which
variant to use when the ssh command is ambiguous.
No functional change intended yet.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git_connect function is growing long. Split the portion that
discovers an ssh command and options it accepts before the service
name and path to a separate function to make it easier to read.
No functional change intended.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git_connect function is growing long. Split the
PROTO_GIT-specific portion to a separate function to make it easier to
read.
No functional change intended.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_connect has the structure
struct child_process *conn = &no_fork;
...
switch (protocol) {
case PROTO_GIT:
if (git_use_proxy(hostandport))
conn = git_proxy_connect(fd, hostandport);
else
git_tcp_connect(fd, hostandport, flags);
...
break;
case PROTO_SSH:
conn = xmalloc(sizeof(*conn));
child_process_init(conn);
argv_array_push(&conn->args, ssh);
...
break;
...
return conn;
In all cases except the git_tcp_connect case, conn is explicitly
assigned a value. Make the code clearer by explicitly assigning
'conn = &no_fork' in the tcp case and eliminating the default so the
compiler can ensure conn is always correctly assigned.
Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the 'ssh' transport, the '-o' option is used to specify an
environment variable which should be set on the remote end. This allows
git to send additional information when contacting the server,
requesting the use of a different protocol version via the
'GIT_PROTOCOL' environment variable like so: "-o SendEnv=GIT_PROTOCOL".
Unfortunately not all ssh variants support the sending of environment
variables to the remote end. To account for this, only use the '-o'
option for ssh variants which are OpenSSH compliant. This is done by
checking that the basename of the ssh command is 'ssh' or the ssh
variant is overridden to be 'ssh' (via the ssh.variant config).
Other options like '-p' and '-P', which are used to specify a specific
port to use, or '-4' and '-6', which are used to indicate that IPV4 or
IPV6 addresses should be used, may also not be supported by all ssh
variants.
Currently if an ssh command's basename wasn't 'plink' or
'tortoiseplink' git assumes that the command is an OpenSSH variant.
Since user configured ssh commands may not be OpenSSH compliant, tighten
this constraint and assume a variant of 'simple' if the basename of the
command doesn't match the variants known to git. The new ssh variant
'simple' will only have the host and command to execute ([username@]host
command) passed as parameters to the ssh command.
Update the Documentation to better reflect the command-line options sent
to ssh commands based on their variant.
Reported-by: Jeffrey Yasskin <jyasskin@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the connection logic to tell a serve that it understands protocol
v1. This is done in 2 different ways for the builtin transports, both
of which ultimately set 'GIT_PROTOCOL' to 'version=1' on the server.
1. git://
A normal request to git-daemon is structured as
"command path/to/repo\0host=..\0" and due to a bug introduced in
49ba83fb6 (Add virtualization support to git-daemon, 2006-09-19) we
aren't able to place any extra arguments (separated by NULs) besides
the host otherwise the parsing of those arguments would enter an
infinite loop. This bug was fixed in 73bb33a94 (daemon: Strictly
parse the "extra arg" part of the command, 2009-06-04) but a check
was put in place to disallow extra arguments so that new clients
wouldn't trigger this bug in older servers.
In order to get around this limitation git-daemon was taught to
recognize additional request arguments hidden behind a second
NUL byte. Requests can then be structured like:
"command path/to/repo\0host=..\0\0version=1\0key=value\0".
git-daemon can then parse out the extra arguments and set
'GIT_PROTOCOL' accordingly.
By placing these extra arguments behind a second NUL byte we can
skirt around both the infinite loop bug in 49ba83fb6 (Add
virtualization support to git-daemon, 2006-09-19) as well as the
explicit disallowing of extra arguments introduced in 73bb33a94
(daemon: Strictly parse the "extra arg" part of the command,
2009-06-04) because both of these versions of git-daemon check for a
single NUL byte after the host argument before terminating the
argument parsing.
2. ssh://, file://
Set 'GIT_PROTOCOL' environment variable with the desired protocol
version. With the file:// transport, 'GIT_PROTOCOL' can be set
explicitly in the locally running git-upload-pack or git-receive-pack
processes. With the ssh:// transport and OpenSSH compliant ssh
programs, 'GIT_PROTOCOL' can be sent across ssh by using '-o
SendEnv=GIT_PROTOCOL' and having the server whitelist this
environment variable.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach a client to recognize that a server understands protocol v1 by
looking at the first pkt-line the server sends in response. This is
done by looking for the response "version 1" send by upload-pack or
receive-pack.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, get_remote_heads() parses the ref advertisement in one loop,
allowing refs and shallow lines to intersperse, despite this not being
allowed by the specification. Refactor get_remote_heads() to use two
loops instead, enforcing that refs come first, and then shallows.
This also makes it easier to teach get_remote_heads() to interpret other
lines in the ref advertisement, which will be done in a subsequent
patch.
As part of this change, this patch interprets capabilities only on the
first line in the ref advertisement, printing a warning message when
encountering capabilities on other lines.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reduce the scope of the variable cmd and release it before returning
early.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we get a repo path like "-repo.git", we may try to invoke
"git-upload-pack -repo.git". This is going to fail, since
upload-pack will interpret it as a set of bogus options. But
let's reject this before we even run the sub-program, since
we would not want to allow any mischief with repo names that
actually are real command-line options.
You can still ask for such a path via git-daemon, but there's no
security problem there, because git-daemon enters the repo itself
and then passes "." on the command line.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you have a GIT_PROXY_COMMAND configured, we will run it
with the host/port on the command-line. If a URL contains a
mischievous host like "--foo", we don't know how the proxy
command may handle it. It's likely to break, but it may also
do something dangerous and unwanted (technically it could
even do something useful, but that seems unlikely).
We should err on the side of caution and reject this before
we even run the command.
The hostname check matches the one we do in a similar
circumstance for ssh. The port check is not present for ssh,
but there it's not necessary because the syntax is "-p
<port>", and there's no ambiguity on the parsing side.
It's not clear whether you can actually get a negative port
to the proxy here or not. Doing:
git fetch git://remote:-1234/repo.git
keeps the "-1234" as part of the hostname, with the default
port of 9418. But it's a good idea to keep this check close
to the point of running the command to make it clear that
there's no way to circumvent it (and at worst it serves as a
belt-and-suspenders check).
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We reject hostnames that start with a dash because they may
be confused for command-line options. Let's factor out that
notion into a helper function, as we'll use it in more
places. And while it's simple now, it's not clear if some
systems might need more complex logic to handle all cases.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commands like "git fetch" talk with ssh://$rest_of_URL/, the
code splits $rest_of_URL into components like host, port, etc., and
then spawns the underlying "ssh" program by formulating argv[] array
that has:
- the path to ssh command taken from GIT_SSH_COMMAND, etc.
- dashed options like '-batch' (for Tortoise), '-p <port>' as
needed.
- ssh_host, which is supposed to be the hostname parsed out of
$rest_of_URL.
- then the command to be run on the other side, e.g. git
upload-pack.
If the ssh_host ends up getting '-<anything>', the argv[] that is
used to spawn the command becomes something like:
{ "ssh", "-p", "22", "-<anything>", "command", "to", "run", NULL }
which obviously is bogus, but depending on the actual value of
"<anything>", will make "ssh" parse and use it as an option.
Prevent this by forbidding ssh_host that begins with a "-".
Noticed-by: Joern Schneeweisz of Recurity Labs
Reported-by: Brian at GitLab
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.
* bw/config-h:
config: don't implicitly use gitdir or commondir
config: respect commondir
setup: teach discover_git_directory to respect the commondir
config: don't include config.h by default
config: remove git_config_iter
config: create config.h
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we successfully parse a symref value like
"HEAD:refs/heads/master", we add the result to a string
list. But because the string list is marked
STRING_LIST_INIT_DUP, the string list code will make a copy
of the string and add the copy.
This patch fixes it by adding the entry with
string_list_append_nodup(), which lets the string list take
ownership of our newly allocated string. There are two
alternatives that seem like they would work, but aren't the
right solution.
The first is to initialize the list with the "NODUP"
initializer. That would avoid the copy, but then the string
list would not realize that it owns the strings. When we
eventually call string_list_clear(), it would not free the
strings, causing a leak.
The second option would be to use the normal
string_list_append(), but free the local copy in our
function. We can't do this because the local copy actually
contains _two_ strings; the symref name and its target. We
point to the target pointer via the "util" field, and its
memory must last as long as the string list does.
You may also wonder whether it's safe to ever free the local
copy, since the target points into it. The answer is yes,
because we duplicate it in annotaate_refs_with_symref_info
before clearing the string list.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we see an error from split_cmdline(), we exit the
function without freeing the copy of the command string we
made.
This was sort-of introduced by 22e5ae5c8 (connect.c: handle
errors from split_cmdline, 2017-04-10). The leak existed
before that, but before that commit fixed the bug, we could
never trigger this else clause in the first place.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit e9d9a8a4d (connect: handle putty/plink also in
GIT_SSH_COMMAND, 2017-01-02) added a call to
split_cmdline(), but checks only for a non-zero return to
see if we got any output. Since the function returns
negative values (and a NULL argv) on error, we end up
dereferencing NULL and segfaulting.
Arguably we could report on the parsing error here, but it's
probably not worth it. This is a best-effort attempt to see
if we are using plink. So we can simply return here with
"no, it wasn't plink" and let the shell actually complain
about the bogus quoting.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this structure handles an array of object IDs, rename it to struct
oid_array. Also rename the accessor functions and the initialization
constant.
This commit was produced mechanically by providing non-Documentation
files to the following Perl one-liners:
perl -pi -E 's/struct sha1_array/struct oid_array/g'
perl -pi -E 's/\bsha1_array_/oid_array_/g'
perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g'
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the callers to pass struct object_id by changing the function
declaration and definition and applying the following semantic patch:
@@
expression E1, E2;
@@
- sha1_array_append(E1, E2.hash)
+ sha1_array_append(E1, &E2)
@@
expression E1, E2;
@@
- sha1_array_append(E1, E2->hash)
+ sha1_array_append(E1, E2)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dd33e07766 ("connect: Add the envvar GIT_SSH_VARIANT and ssh.variant
config", 2017-02-01) attempted to add support for configuration and
environment variable to override the different handling of
port_option and needs_batch settings suitable for variants of the
ssh implementation that was autodetected by looking at the ssh
command name. Because it piggybacked on the code that turns command
name to specific override (e.g. "plink.exe" and "plink" means
port_option needs to be set to 'P' instead of the default 'p'), yet
it defined a separate namespace for these overrides (e.g. "putty"
can be usable to signal that port_option needs to be 'P'), however,
it made the auto-detection based on the command name less robust
(e.g. the code now accepts "putty" as a SSH command name and applies
the same override).
Separate the code that interprets the override that was read from
the configuration & environment from the original code that handles
the command names, as they are in separate namespaces, to fix this
confusion.
This incidentally also makes it easier for future enhancement of the
override syntax (e.g. "port_option=p,needs_batch=1" may want to be
accepted as a more explicit syntax) without affecting the code for
auto-detection based on the command name.
While at it, update the return type of the handle_ssh_variant()
helper function to void; the caller does not use it, and the
function does not return any meaningful value.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This environment variable and configuration value allow to
override the autodetection of plink/tortoiseplink in case that
Git gets it wrong.
[jes: wrapped overly-long lines, factored out and changed
get_ssh_variant() to handle_ssh_variant() to accomodate the
change from the putty/tortoiseplink variables to
port_option/needs_batch, adjusted the documentation, free()d
value obtained from the config.]
Signed-off-by: Segev Finer <segev208@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We handle plink and tortoiseplink as OpenSSH replacements, by passing
the correct command-line options when detecting that they are used.
To let users override that auto-detection (in case Git gets it wrong),
we need to introduce new code to that end.
In preparation for this code, let's factor out the SSH variant handling
into its own function, handle_ssh_variant().
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of these two may have originally been named after "what exact
SSH implementation do we have?" so that we can tweak the command
line options for that exact implementation. But "putty=1" no longer
means "We are using the plink SSH implementation that comes with
PuTTY" these days. It is set when we guess that either PuTTY plink
or Tortoiseplink is in use.
Rename them after what effect is desired. The current 'putty'
option is about using "-P <port>" when OpenSSH would use "-p <port>",
so rename it to 'port_option' whose value is either 'p' or 'P". The
other one is about passing an extra command line option "-batch",
so rename it to 'needs_batch'.
[jes: wrapped overly-long line]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows has special support for the popular SSH client PuTTY:
when using PuTTY's non-interactive version ("plink.exe"), we use the -P
option to specify the port rather than OpenSSH's -p option. TortoiseGit
ships with its own, forked version of plink.exe, that adds support for
the -batch option, and for good measure we special-case that, too.
However, this special-casing of PuTTY only covers the case where the
user overrides the SSH command via the environment variable GIT_SSH
(which allows specifying the name of the executable), not
GIT_SSH_COMMAND (which allows specifying a full command, including
additional command-line options).
When users want to pass any additional arguments to (Tortoise-)Plink,
such as setting a private key, they are required to either use a shell
script named plink or tortoiseplink or duplicate the logic that is
already in Git for passing the correct style of command line arguments,
which can be difficult, error prone and annoying to get right.
This patch simply reuses the existing logic and expands it to cover
GIT_SSH_COMMAND, too.
Note: it may look a little heavy-handed to duplicate the entire
command-line and then split it, only to extract the name of the
executable. However, this is not a performance-critical code path, and
the code is much more readable this way.
Signed-off-by: Segev Finer <segev208@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The smudge/clean filter API expect an external process is spawned
to filter the contents for each path that has a filter defined. A
new type of "process" filter API has been added to allow the first
request to run the filter for a path to spawn a single process, and
all filtering need is served by this single process for multiple
paths, reducing the process creation overhead.
* ls/filter-process:
contrib/long-running-filter: add long running filter example
convert: add filter.<driver>.process option
convert: prepare filter.<driver>.process option
convert: make apply_filter() adhere to standard Git error handling
pkt-line: add functions to read/write flush terminated packet streams
pkt-line: add packet_write_gently()
pkt-line: add packet_flush_gently()
pkt-line: add packet_write_fmt_gently()
pkt-line: extract set_packet_header()
pkt-line: rename packet_write() to packet_write_fmt()
run-command: add clean_on_exit_handler
run-command: move check_pipe() from write_or_die to run_command
convert: modernize tests
convert: quote filter names in error messages
packet_write() should be called packet_write_fmt() because it is a
printf-like function that takes a format string as first parameter.
packet_write_fmt() should be used for text strings only. Arbitrary
binary data should use a new packet_write() function that is introduced
in a subsequent patch.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even more i18n.
* va/i18n-more:
i18n: stash: mark messages for translation
i18n: notes-merge: mark die messages for translation
i18n: ident: mark hint for translation
i18n: i18n: diff: mark die messages for translation
i18n: connect: mark die messages for translation
i18n: commit: mark message for translation
JGit can show a fake ref "capabilities^{}" to "git fetch" when it
does not advertise any refs, but "git fetch" was not prepared to
see such an advertisement. When the other side disconnects without
giving any ref advertisement, we used to say "there may not be a
repository at that URL", but we may have seen other advertisement
like "shallow" and ".have" in which case we definitely know that a
repository is there. The code to detect this case has also been
updated.
* jt/accept-capability-advertisement-when-fetching-from-void:
connect: advertized capability is not a ref
connect: tighten check for unexpected early hang up
tests: move test_lazy_prereq JGIT to test-lib.sh
Mark messages passed to die() in die_initial_contact().
Update test to reflect changes.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning an empty repository served by standard git, "git clone" produces
the following reassuring message:
$ git clone git://localhost/tmp/empty
Cloning into 'empty'...
warning: You appear to have cloned an empty repository.
Checking connectivity... done.
Meanwhile when cloning an empty repository served by JGit, the output is more
haphazard:
$ git clone git://localhost/tmp/empty
Cloning into 'empty'...
Checking connectivity... done.
warning: remote HEAD refers to nonexistent ref, unable to checkout.
This is a common command to run immediately after creating a remote repository
as preparation for adding content to populate it and pushing. The warning is
confusing and needlessly worrying.
The cause is that, since v3.1.0.201309270735-rc1~22 (Advertise capabilities
with no refs in upload service., 2013-08-08), JGit's ref advertisement includes
a ref named capabilities^{} to advertise its capabilities on, while git's ref
advertisement is empty in this case. This allows the client to learn about the
server's capabilities and is needed, for example, for fetch-by-sha1 to work
when no refs are advertised.
This also affects "ls-remote". For example, against an empty repository served
by JGit:
$ git ls-remote git://localhost/tmp/empty
0000000000000000000000000000000000000000 capabilities^{}
Git advertises the same capabilities^{} ref in its ref advertisement for push
but since it never did so for fetch, the client didn't need to handle this
case. Handle it.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A server hanging up immediately to mark access being denied does not
send any .have refs, shallow lines, or anything else before hanging
up. If the server has sent anything, then the hangup is unexpected.
That is, if the server hangs up after a shallow line but before sending
any refs, then git should tell me so:
fatal: The remote end hung up upon initial contact
instead of suggesting an access control problem:
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Noticed while examining this code. This case isn't likely to come up
in practice but tightening the check makes the code easier to read and
manipulate.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to $GIT_ASKPASS or $GIT_PROXY_COMMAND, we also read from
config file first then fall back to $GIT_SSH_COMMAND.
This is useful for selecting different private keys targetting the
same host (e.g. github)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The two alternative ways to spell "ssh://" transport have been
deprecated for a long time. The last mention of them has finally
removed from the documentation.
* cn/deprecate-ssh-git-url:
Disown ssh+git and git+ssh
Some people argue that these were silly from the beginning (see
http://thread.gmane.org/gmane.comp.version-control.git/285590/focus=285601
for example), but we have to support them for compatibility.
That doesn't mean we have to show them in the documentation. These
were already left out of the main list, but a reference in the main
manpage was left, so remove that.
Also add a note to discourage their use if anybody goes looking for them
in the source code.
Signed-off-by: Carlos Martín Nieto <cmn@dwim.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes it is necessary to force IPv4-only or IPv6-only operation
on networks where name lookups may return a non-routable address and
stall remote operations.
The ssh(1) command has an equivalent switches which we may pass when
we run them. There may be old ssh(1) implementations out there
which do not support these switches; they should report the
appropriate error in that case.
rsync support is untouched for now since it is deprecated and
scheduled to be removed.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace an unsigned char array with struct object_id and express several
hard-coded constants in terms of GIT_SHA1_HEXSZ.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Use struct object_id in three fields in struct ref and convert all the
necessary places that use it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Many allocations that is manually counted (correctly) that are
followed by strcpy/sprintf have been replaced with a less error
prone constructs such as xstrfmt.
Macintosh-specific breakage was noticed and corrected in this
reroll.
* jk/war-on-sprintf: (70 commits)
name-rev: use strip_suffix to avoid magic numbers
use strbuf_complete to conditionally append slash
fsck: use for_each_loose_file_in_objdir
Makefile: drop D_INO_IN_DIRENT build knob
fsck: drop inode-sorting code
convert strncpy to memcpy
notes: document length of fanout path with a constant
color: add color_set helper for copying raw colors
prefer memcpy to strcpy
help: clean up kfmclient munging
receive-pack: simplify keep_arg computation
avoid sprintf and strcpy with flex arrays
use alloc_ref rather than hand-allocating "struct ref"
color: add overflow checks for parsing colors
drop strcpy in favor of raw sha1_to_hex
use sha1_to_hex_r() instead of strcpy
daemon: use cld->env_array when re-spawning
stat_tracking_info: convert to argv_array
http-push: use an argv_array for setup_revisions
fetch-pack: use argv_array for index-pack / unpack-objects
...
We sometimes sprintf into fixed-size buffers when we know
that the buffer is large enough to fit the input (either
because it's a constant, or because it's numeric input that
is bounded in size). Likewise with strcpy of constant
strings.
However, these sites make it hard to audit sprintf and
strcpy calls for buffer overflows, as a reader has to
cross-reference the size of the array with the input. Let's
use xsnprintf instead, which communicates to a reader that
we don't expect this to overflow (and catches the mistake in
case we do).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are cloning an untrusted remote repository into a
sandbox, we may also want to fetch remote submodules in
order to get the complete view as intended by the other
side. However, that opens us up to attacks where a malicious
user gets us to clone something they would not otherwise
have access to (this is not necessarily a problem by itself,
but we may then act on the cloned contents in a way that
exposes them to the attacker).
Ideally such a setup would sandbox git entirely away from
high-value items, but this is not always practical or easy
to set up (e.g., OS network controls may block multiple
protocols, and we would want to enable some but not others).
We can help this case by providing a way to restrict
particular protocols. We use a whitelist in the environment.
This is more annoying to set up than a blacklist, but
defaults to safety if the set of protocols git supports
grows). If no whitelist is specified, we continue to default
to allowing all protocols (this is an "unsafe" default, but
since the minority of users will want this sandboxing
effect, it is the only sensible one).
A note on the tests: ideally these would all be in a single
test file, but the git-daemon and httpd test infrastructure
is an all-or-nothing proposition rather than a test-by-test
prerequisite. By putting them all together, we would be
unable to test the file-local code on machines without
apache.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When executing user-specified programs, we generally always
want to use a shell, for flexibility and consistency. One
big exception is executing $GIT_SSH, which for historical
reasons must not use a shell.
Once upon a time the logic in git_connect looked like:
if (protocol == PROTO_SSH) {
... setup ssh ...
} else {
... setup local connection ...
conn->use_shell = 1;
}
But over time the PROTO_SSH block has grown, and the "local"
block has shrunk so that it contains only conn->use_shell;
it's easy to miss at the end of the large block. Moreover,
PROTO_SSH now also sometimes sets use_shell, when the new
GIT_SSH_COMMAND is used.
Let's just set conn->use_shell when we're setting up the "conn"
struct, and unset it (with a comment) in the historical GIT_SSH
case. This will make the flow easier to follow.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we "switch" to another local repository to run the server
side of a fetch or push, we must clear the variables in
local_repo_env so that our local $GIT_DIR, etc, do not
pollute the upload-pack or receive-pack that is executing in
the "remote" repository.
We have never done so for ssh connections. For the most
part, nobody has noticed because ssh will not pass unknown
environment variables by default. However, it is not out of
the question for a user to configure ssh to pass along GIT_*
variables using SendEnv/AcceptEnv.
We can demonstrate the problem by using "git -c" on a local
command and seeing its impact on a remote repository. This
config ends up in $GIT_CONFIG_PARAMETERS. In the local case,
the config has no impact, but in the ssh transport, it does
(our test script has a fake ssh that passes through all
environment variables; this isn't normal, but does simulate
one possible setup).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The connection initiation code for "ssh" transport tried to absorb
differences between the stock "ssh" and Putty-supplied "plink" and
its derivatives, but the logic to tell that we are using "plink"
variants were too loose and falsely triggered when "plink" appeared
anywhere in the path (e.g. "/home/me/bin/uplink/ssh").
* bc/connect-plink:
connect: improve check for plink to reduce false positives
t5601: fix quotation error leading to skipped tests
connect: simplify SSH connection code path
The connection initiation code for "ssh" transport tried to absorb
differences between the stock "ssh" and Putty-supplied "plink" and
its derivatives, but the logic to tell that we are using "plink"
variants were too loose and falsely triggered when "plink" appeared
anywhere in the path (e.g. "/home/me/bin/uplink/ssh").
* bc/connect-plink:
connect: improve check for plink to reduce false positives
t5601: fix quotation error leading to skipped tests
connect: simplify SSH connection code path
The git_connect function has code to handle plink and tortoiseplink
specially, as they require different command line arguments from
OpenSSH (-P instead of -p for ports; tortoiseplink additionally requires
-batch). However, the match was done by checking for "plink" anywhere
in the string, which led to a GIT_SSH value containing "uplink" being
treated as an invocation of putty's plink.
Improve the check by looking for "plink" or "tortoiseplink" (or those
names suffixed with ".exe") only in the final component of the path.
This has the downside that a program such as "plink-0.63" would no
longer be recognized, but the increased robustness is likely worth it.
Add tests to cover these cases to avoid regressions.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code path used in git_connect pushed the majority of the SSH
connection code into an else block, even though the if block returns.
Simplify the code by eliminating the else block, as it is unneeded.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An earlier update to the parser that disects a URL broke an
address, followed by a colon, followed by an empty string (instead
of the port number), e.g. ssh://example.com:/path/to/repo.
* tb/connect-ipv6-parse-fix:
connect.c: ignore extra colon after hostname
An earlier update to the parser that disects an address broke an
address, followed by a colon, followed by an empty string (instead
of the port number).
* tb/connect-ipv6-parse-fix:
connect.c: ignore extra colon after hostname
Ignore an extra ':' at the end of the hostname in URL's like
"ssh://example.com:/path/to/repo"
The colon is meant to separate a port number from the hostname.
If the port is empty, the colon should be ignored, see RFC 3986.
It had been working for URLs with ssh:// scheme, but was unintentionally
broken in 86ceb3, "allow ssh://user@[2001:db8::1]/repo.git"
Reported-by: Reid Woodbury Jr. <reidw@rawsound.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We did not parse username followed by literal IPv6 address in SSH
transport URLs, e.g. ssh://user@[2001:db8::1]:22/repo.git
correctly.
* tb/connect-ipv6-parse-fix:
t5500: show user name and host in diag-url
t5601: add more test cases for IPV6
connect.c: allow ssh://user@[2001:db8::1]/repo.git
The "interpolated-path" option of "git daemon" inserted any string
client declared on the "host=" capability request without checking.
Sanitize and limit %H and %CH to a saner and a valid DNS name.
* jk/daemon-interpolate:
daemon: sanitize incoming virtual hostname
t5570: test git-daemon's --interpolated-path option
git_connect: let user override virtual-host we send to daemon
When git_connect() is called to see how the URL is parsed for
debugging purposes with CONNECT_DIAG_URL set, the variable conn is
leaked. At this point in the codeflow, it only has its memory and
no other resource is associated with it, so it is sufficient to
clean it up by just freeing it.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code cleanups.
* rs/simple-cleanups:
sha1_name: use strlcpy() to copy strings
pretty: use starts_with() to check for a prefix
for-each-ref: use skip_prefix() to avoid duplicate string comparison
connect: use strcmp() for string comparison
We did not parse username followed by literal IPv6 address in SSH
transport URLs, e.g. ssh://user@[2001:db8::1]:22/repo.git
correctly.
* tb/connect-ipv6-parse-fix:
t5500: show user name and host in diag-url
t5601: add more test cases for IPV6
connect.c: allow ssh://user@[2001:db8::1]/repo.git
Code cleanups.
* rs/simple-cleanups:
sha1_name: use strlcpy() to copy strings
pretty: use starts_with() to check for a prefix
for-each-ref: use skip_prefix() to avoid duplicate string comparison
connect: use strcmp() for string comparison
The "interpolated-path" option of "git daemon" inserted any string
client declared on the "host=" capability request without checking.
Sanitize and limit %H and %CH to a saner and a valid DNS name.
* jk/daemon-interpolate:
daemon: sanitize incoming virtual hostname
t5570: test git-daemon's --interpolated-path option
git_connect: let user override virtual-host we send to daemon
The URL for ssh may have include a username before the hostname,
like ssh://user@host/repo.
When literal IPV6 addresses are used together with a username,
the substring "user@[::1]" must be converted into "user@::1".
Make that conversion visible for the user, and write userandhost
in the diagnostics
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ssh:// syntax was added in 2386d658 (Add first cut at "git
protocol" connect logic., 2005-07-13), it accepted
ssh://user@2001:db8::1/repo.git, which is now legacy.
Over the years the parser was improved to support [] and port numbers,
but the combination of ssh://user@[2001:db8::1]:222/repo.git did
never work.
The only only way to use a user name, a literall IPV6 address and a port
number was ssh://[user@2001:db8::1]:222/repo.git
(Thanks to Christian Taube <lists@hcf.yourweb.de> for reporting this long
standing issue)
New users would use ssh://user@[2001:db8::1]:222/repo.git,
so change the parser to handle it correctly.
Support the old legacy URLs as well, to be backwards compatible,
and avoid regressions for users which upgrade an existing installation
to a later Git version.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of magic string length constants and simply compare the strings
using strcmp(). This makes the intent of the code a bit clearer.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we connect to a git-daemon at a given host and port, we
actually send the string "localhost:9418" to the other side,
which allows it to do virtual-hosting lookups. For testing
and debugging, we'd like to be able to send arbitrary
strings, rather than the hostname we actually connected to.
Using "insteadOf" config does not work for this purpose, as
the hostname determination happens at a very low level,
right before we feed the hostname to our lookup routines.
You could use /etc/hosts or similar to get around this, but
we cannot do that portably from our test suite.
Instead, this patch provides an environment variable that
can be used to send an arbitrary string.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git remote update --prune" to drop many refs has been optimized.
* mh/simplify-repack-without-refs:
sort_string_list(): rename to string_list_sort()
prune_remote(): iterate using for_each_string_list_item()
prune_remote(): rename local variable
repack_without_refs(): make the refnames argument a string_list
prune_remote(): sort delete_refs_list references en masse
prune_remote(): initialize both delete_refs lists in a single loop
prune_remote(): exit early if there are no stale references
The new name is more consistent with the names of other
string_list-related functions.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It may be impractical to install a wrapper script for GIT_SSH
when additional parameters need to be passed. Provide an alternative
way of specifying a shell command to be run, including command line
arguments, by means of the GIT_SSH_COMMAND environment variable,
which behaves like GIT_SSH but is passed to the shell.
The special circuitry to modify parameters in the case of using
PuTTY's plink/tortoiseplink is activated only when using GIT_SSH;
in the case of using GIT_SSH_COMMAND, it is deliberately left up to
the user to make any required parameters adaptation before calling
the underlying ssh implementation.
Signed-off-by: Thomas Quinot <thomas@quinot.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* rs/more-uses-of-skip-prefix:
pack-write: simplify index_pack_lockfile using skip_prefix() and xstrfmt()
connect: simplify check_ref() using skip_prefix() and starts_with()
Both callers of check_ref() pass in NUL-terminated strings for name.
Remove the len parameter and then use skip_prefix() and starts_with()
instead of memcmp() to check if it starts with certain strings. This
gets rid of several magic string length constants and a strlen() call.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a helper function for initializing those struct child_process
variables for which the macro CHILD_PROCESS_INIT can't be used.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most struct child_process variables are cleared using memset first after
declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead. That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use xmemdupz() to allocate the memory, copy the data and make sure to
NUL-terminate the result, all in one step. The resulting code is
shorter, doesn't contain the constants 1 and '\0', and avoids
duplicating function parameters.
For blame, the last copied byte (o->file.ptr[o->file.size]) is always
set to NUL by fake_working_tree_commit() or read_sha1_file(), so no
information is lost by the conversion to using xmemdupz().
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's a common idiom to match a prefix and then skip past it
with a magic number, like:
if (starts_with(foo, "bar"))
foo += 3;
This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes. We can use skip_prefix to avoid the magic
numbers here.
Note that some of these conversions could be much shorter.
For example:
if (starts_with(arg, "--foo=")) {
bar = arg + 6;
continue;
}
could become:
if (skip_prefix(arg, "--foo=", &bar))
continue;
However, I have left it as:
if (skip_prefix(arg, "--foo=", &v)) {
bar = v;
continue;
}
to visually match nearby cases which need to actually
process the string. Like:
if (skip_prefix(arg, "--foo=", &v)) {
bar = atoi(v);
continue;
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This avoids magic numbers when we allocate fixed-size argv
arrays, and makes it more obvious that we are not
overflowing.
It is also the first step to fixing a memory leak. When
git_connect returns a child_process struct, the argv array
in the struct is dynamically allocated, but the individual
strings are not (they are either owned elsewhere, or are
freed). Later, in finish_connect, we free the array but
leave the strings alone.
This works for the child_process created by git_connect, but
if we use transport_take_over, we may also end up with a
child_process created by transport-helper's get_helper.
In that case, the strings are freshly allocated, and we
would want to free them. However, we have no idea in
finish_connect which type we have.
By consistently using run-command's internal argv-array, we
do not have to worry about this issue at all; finish_command
takes care of it for us, and we can drop our manual free
entirely.
Note that this actually makes the get_helper leak slightly
worse; now we are leaking both the strings and the array.
But when we adjust it in a future patch, that leak will go
away entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fetching from a shallow-cloned repository used to be forbidden,
primarily because the codepaths involved were not carefully vetted
and we did not bother supporting such usage. This attempts to allow
object transfer out of a shallow-cloned repository in a controlled
way (i.e. the receiver become a shallow repository with truncated
history).
* nd/shallow-clone: (31 commits)
t5537: fix incorrect expectation in test case 10
shallow: remove unused code
send-pack.c: mark a file-local function static
git-clone.txt: remove shallow clone limitations
prune: clean .git/shallow after pruning objects
clone: use git protocol for cloning shallow repo locally
send-pack: support pushing from a shallow clone via http
receive-pack: support pushing to a shallow clone via http
smart-http: support shallow fetch/clone
remote-curl: pass ref SHA-1 to fetch-pack as well
send-pack: support pushing to a shallow clone
receive-pack: allow pushes that update .git/shallow
connected.c: add new variant that runs with --shallow-file
add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses
receive/send-pack: support pushing from a shallow clone
receive-pack: reorder some code in unpack()
fetch: add --update-shallow to accept refs that update .git/shallow
upload-pack: make sure deepening preserves shallow roots
fetch: support fetching from a shallow repository
clone: support remote shallow repository
...
Be more careful when parsing remote repository URL given in the
scp-style host:path notation.
* tb/clone-ssh-with-colon-for-port:
git_connect(): use common return point
connect.c: refactor url parsing
git_connect(): refactor the port handling for ssh
git fetch: support host:/~repo
t5500: add test cases for diag-url
git fetch-pack: add --diag-url
git_connect: factor out discovery of the protocol and its parts
git_connect: remove artificial limit of a remote command
t5601: add tests for ssh
t5601: remove clear_ssh, refactor setup_ssh_wrapper
No callers pass a non-empty pointer as shallow_points at this
stage. As a result, all clients still refuse to talk to shallow
repository on the other end.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The latter can do everything the former can and is used in many more
places.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use only one return point from git_connect(), doing the
free();
return conn;
only at one place in the code.
There may be a little confusion what the variable "host" is for. At
some places it is only the host part, at other places it may include
the port number, so change host into hostandport here.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the function is_local() in transport.c public, rename it into
url_is_local_not_ssh() and use it in both transport.c and connect.c
Use a protocol "local" for URLs for the local file system.
One note about using file:// under Windows:
The (absolute) path on Unix like system typically starts with "/".
When the host is empty, it can be omitted, so that a shell scriptlet
url=file://$pwd
will give a URL like "file:///home/user/repo".
Windows does not have the same concept of a root directory located in "/".
When parsing the URL allow "file://C:/user/repo"
(even if RFC1738 indicates that "file:///C:/user/repo" should be used).
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use get_host_and_port() even for ssh.
Remove the variable port git_connect(), and simplify parse_connect_url()
Use only one return point in git_connect(), doing the free() and return conn.
t5601 had 2 corner test cases which now pass.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation (in urls.txt) says that
"ssh://host:/~repo",
"host:/~repo" or
"host:~repo"
specify the repository "repo" in the home directory at "host".
This has not been working for "host:/~repo".
Before commit 356bec "Support [address] in URLs", the comparison
"url != hostname" could be used to determine if the URL had a scheme
or not: "ssh://host/host" != "host".
However, after 356bec "[::1]" was converted into "::1", yielding
url != hostname as well. To fix this regression, don't use
"if (url != hostname)", but look at the separator instead.
Rename the variable "c" into "separator" to make it easier to read.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The main purpose is to trace the URL parser called by git_connect() in
connect.c
The main features of the parser can be listed as this:
- parse out host and path for URLs with a scheme (git:// file:// ssh://)
- parse host names embedded by [] correctly
- extract the port number, if present
- separate URLs like "file" (which are local)
from URLs like "host:repo" which should use ssh
Add the new parameter "--diag-url" to "git fetch-pack", which prints
the value for protocol, host and path to stderr and exits.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_connect has grown large due to the many different protocols syntaxes
that are supported. Move the part of the function that parses the URL to
connect to into a separate function for readability.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since day one, function git_connect() had a limit on the command line of
the command that is invoked to make a connection. 7a33bcbe converted the
code that constructs the command to strbuf. This would have been the
right time to remove the limit, but it did not happen. Remove it now.
git_connect() uses start_command() to invoke the command; consequently,
the limits of the system still apply, but are diagnosed only at execve()
time. But these limits are more lenient than the 1K that git_connect()
imposed.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One long-standing flaw in the pack transfer protocol used by "git
clone" was that there was no way to tell the other end which branch
"HEAD" points at, and the receiving end needed to guess. A new
capability has been defined in the pack protocol to convey this
information so that cloning from a repository with more than one
branches pointing at the same commit where the HEAD is at now
reliably sets the initial branch in the resulting repository.
* jc/upload-pack-send-symref:
t5570: Update for clone-progress-to-stderr branch
t5570: Update for symref capability
clone: test the new HEAD detection logic
connect: annotate refs with their symref information in get_remote_head()
connect.c: make parse_feature_value() static
upload-pack: send non-HEAD symbolic refs
upload-pack: send symbolic ref information as capability
upload-pack.c: do not pass confusing cb_data to mark_our_ref()
t5505: fix "set-head --auto with ambiguous HEAD" test