We will be adding new mark types in the future, so separate
the suffix data from the logic.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the previous commit introduced the branch_get_upstream
helper, there was one call-site that could not be converted:
the one in sha1_name.c, which gives detailed error messages
for each possible failure.
Let's teach the helper to optionally report these specific
errors. This lets us convert another callsite, and means we
can use the helper in other locations that want to give the
same error messages.
The logic and error messages come straight from sha1_name.c,
with the exception that we start each error with a lowercase
letter, as is our usual style (note that a few tests need
updated as a result).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the standard function isxdigit() to make the intent clearer and
avoid using magic constants.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code cleanups.
* rs/simple-cleanups:
sha1_name: use strlcpy() to copy strings
pretty: use starts_with() to check for a prefix
for-each-ref: use skip_prefix() to avoid duplicate string comparison
connect: use strcmp() for string comparison
Use strlcpy() instead of calling strncpy() and then setting the last
byte of the target buffer to NUL explicitly. This shortens and
simplifies the code a bit.
Signed-of-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_merge_bases*() API was easy to misuse by careless
copy&paste coders, leaving object flags tainted in the commits that
needed to be traversed.
* jc/merge-bases:
get_merge_bases(): always clean-up object flags
bisect: clean flags after checking merge bases
The code to abbreviate an object name to its short unique prefix
has been optimized when no abbreviation was requested.
* mh/find-uniq-abbrev:
sha1_name: avoid unnecessary sha1 lookup in find_unique_abbrev
An example where this happens is when doing an ls-tree on a tree that
contains a commit link. In that case, find_unique_abbrev is called
to get a non-abbreviated hex sha1, but still, a lookup is done as
to whether the sha1 is in the repository (which ends up looking for
a loose object in .git/objects), while the result of that lookup is
not used when returning a non-abbreviated hex sha1.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callers of get_merge_bases() can choose to leave object flags
used during the merge-base traversal by passing cleanup=0 as a
parameter, but in practice a very few callers can afford to do so
(namely, "git merge-base"), as they need to compute merge base in
preparation for other processing of their own and they need to see
the object without contaminate flags.
Change the function signature of get_merge_bases_many() and
get_merge_bases() to drop the cleanup parameter, so that the
majority of the callers do not have to say ", 1" at the end.
Give a new get_merge_bases_many_dirty() API to support only a few
callers that know they do not need to spend cycles cleaning up the
object flags.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a reflog is deleted, e.g. when "git stash" clears its stashes,
"git rev-parse --verify --quiet" dies:
fatal: Log for refs/stash is empty.
The reason is that the get_sha1() code path does not allow us
to suppress this message.
Pass the flags bitfield through get_sha1_with_context() so that
read_ref_at() can suppress the message.
Use get_sha1_with_context1() instead of get_sha1() in rev-parse
so that the --quiet flag is honored.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a couple of "accumulate into a sorted list" to "accumulate and
then sort the list".
* rs/list-optim:
walker: avoid quadratic list insertion in mark_complete
sha1_name: avoid quadratic list insertion in handle_one_ref
Similar to 16445242 (fetch-pack: avoid quadratic list insertion in
mark_complete), sort only after all refs are collected instead of while
inserting. The result is the same, but it's more efficient that way.
The difference will only be measurable in repositories with a large
number of refs.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A call to "dwim_ref(name, len, flags, &ref)" will allocate a
new string in "ref" to return the exact ref we found. We do
not consistently free it in all code paths, leading to small
leaks. The worst is in get_sha1_basic, which may be called
many times (e.g., by "cat-file --batch"), though it is
relatively unlikely, as it only triggers on a bogus reflog
specification.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/xstrfmt:
setup_git_env(): introduce git_path_from_env() helper
unique_path: fix unlikely heap overflow
walker_fetch: fix minor memory leak
merge: use argv_array when spawning merge strategy
sequencer: use argv_array_pushf
setup_git_env: use git_pathdup instead of xmalloc + sprintf
use xstrfmt to replace xmalloc + strcpy/strcat
use xstrfmt to replace xmalloc + sprintf
use xstrdup instead of xmalloc + strcpy
use xstrfmt in favor of manual size calculations
strbuf: add xstrfmt helper
* jk/skip-prefix:
http-push: refactor parsing of remote object names
imap-send: use skip_prefix instead of using magic numbers
use skip_prefix to avoid repeated calculations
git: avoid magic number with skip_prefix
fetch-pack: refactor parsing in get_ack
fast-import: refactor parsing of spaces
stat_opt: check extra strlen call
daemon: use skip_prefix to avoid magic numbers
fast-import: use skip_prefix for parsing input
use skip_prefix to avoid repeating strings
use skip_prefix to avoid magic numbers
transport-helper: avoid reading past end-of-string
fast-import: fix read of uninitialized argv memory
apply: use skip_prefix instead of raw addition
refactor skip_prefix to return a boolean
avoid using skip_prefix as a boolean
daemon: mark some strings as const
parse_diff_color_slot: drop ofs parameter
It's a common idiom to match a prefix and then skip past it
with strlen, like:
if (starts_with(foo, "bar"))
foo += strlen("bar");
This avoids magic numbers, but means we have to repeat the
string (and there is no compiler check that we didn't make a
typo in one of the strings).
We can use skip_prefix to handle this case without repeating
ourselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's easy to get manual allocation calculations wrong, and
the use of strcpy/strcat raise red flags for people looking
for buffer overflows (though in this case each site was
fine).
It's also shorter to use xstrfmt, and the printf-format
tends to be easier for a reader to see what the final string
will look like.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most callsites which use the commit buffer try to use the
cached version attached to the commit, rather than
re-reading from disk. Unfortunately, that interface provides
only a pointer to the NUL-terminated buffer, with no
indication of the original length.
For the most part, this doesn't matter. People do not put
NULs in their commit messages, and the log code is happy to
treat it all as a NUL-terminated string. However, some code
paths do care. For example, when checking signatures, we
want to be very careful that we verify all the bytes to
avoid malicious trickery.
This patch just adds an optional "size" out-pointer to
get_commit_buffer and friends. The existing callers all pass
NULL (there did not seem to be any obvious sites where we
could avoid an immediate strlen() call, though perhaps with
some further refactoring we could).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For both of these sites, we already do the "fallback to
read_sha1_file" trick. But we can shorten the code by just
using get_commit_buffer.
Note that the error cases are slightly different when
read_sha1_file fails. get_commit_buffer will die() if the
object cannot be loaded, or is a non-commit.
For get_sha1_oneline, this will almost certainly never
happen, as we will have just called parse_object (and if it
does, it's probably worth complaining about).
For record_author_date, the new behavior is probably better;
we notify the user of the error instead of silently ignoring
it. And because it's used only for sorting by author-date,
somebody examining a corrupt repo can fallback to the
regular traversal order.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Attempts to show where a single-strand-of-pearls break in "git log"
output.
* nd/log-show-linear-break:
log: add --show-linear-break to help see non-linear history
object.h: centralize object flag allocation
While the field "flags" is mainly used by the revision walker, it is
also used in many other places. Centralize the whole flag allocation to
one place for a better overview (and easier to move flags if we have
too).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a handful of bugs around interpreting $branch@{upstream}
notation and its lookalike, when $branch part has interesting
characters, e.g. "@", and ":".
* jk/interpret-branch-name-fix:
interpret_branch_name: find all possible @-marks
interpret_branch_name: avoid @{upstream} past colon
interpret_branch_name: always respect "namelen" parameter
interpret_branch_name: rename "cp" variable to "at"
interpret_branch_name: factor out upstream handling
When we parse a string like "foo@{upstream}", we look for
the first "@"-sign, and check to see if it is an upstream
mark. However, since branch names can contain an @, we may
also see "@foo@{upstream}". In this case, we check only the
first @, and ignore the second. As a result, we do not find
the upstream.
We can solve this by iterating through all @-marks in the
string, and seeing if any is a legitimate upstream or
empty-at mark.
Another strategy would be to parse from the right-hand side
of the string. However, that does not work for the
"empty_at" case, which allows "@@{upstream}". We need to
find the left-most one in this case (and we then recurse as
"HEAD@{upstream}").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_sha1() cannot currently parse a valid object name like
"HEAD:@{upstream}" (assuming that such an oddly named file
exists in the HEAD commit). It takes two passes to parse the
string:
1. It first considers the whole thing as a ref, which
results in looking for the upstream of "HEAD:".
2. It finds the colon, parses "HEAD" as a tree-ish, and then
finds the path "@{upstream}" in the tree.
For a path that looks like a normal reflog (e.g.,
"HEAD:@{yesterday}"), the first pass is a no-op. We try to
dwim_ref("HEAD:"), that returns zero refs, and we proceed
with colon-parsing.
For "HEAD:@{upstream}", though, the first pass ends up in
interpret_upstream_mark, which tries to find the branch
"HEAD:". When it sees that the branch does not exist, it
actually dies rather than returning an error to the caller.
As a result, we never make it to the second pass.
One obvious way of fixing this would be to teach
interpret_upstream_mark to simply report "no, this isn't an
upstream" in such a case. However, that would make the
error-reporting for legitimate upstream cases significantly
worse. Something like "bogus@{upstream}" would simply report
"unknown revision: bogus@{upstream}", while the current code
diagnoses a wide variety of possible misconfigurations (no
such branch, branch exists but does not have upstream, etc).
However, we can take advantage of the fact that a branch
name cannot contain a colon. Therefore even if we find an
upstream mark, any prefix with a colon must mean that
the upstream mark we found is actually a pathname, and
should be disregarded completely. This patch implements that
logic.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
interpret_branch_name gets passed a "name" buffer to parse,
along with a "namelen" parameter representing its length. If
"namelen" is zero, we fallback to the NUL-terminated
string-length of "name".
However, it does not necessarily follow that if we have
gotten a non-zero "namelen", it is the NUL-terminated
string-length of "name". E.g., when get_sha1() is parsing
"foo:bar", we will be asked to operate only on the first
three characters.
Yet in interpret_branch_name and its helpers, we use string
functions like strchr() to operate on "name", looking past
the length we were given. This can result in us mis-parsing
object names. We should instead be limiting our search to
"namelen" bytes.
There are three distinct types of object names this patch
addresses:
- The intrepret_empty_at helper uses strchr to find the
next @-expression after our potential empty-at. In an
expression like "@:foo@bar", it erroneously thinks that
the second "@" is relevant, even if we were asked only
to look at the first character. This case is easy to
trigger (and we test it in this patch).
- When finding the initial @-mark for @{upstream}, we use
strchr. This means we might treat "foo:@{upstream}" as
the upstream for "foo:", even though we were asked only
to look at "foo". We cannot test this one in practice,
because it is masked by another bug (which is fixed in
the next patch).
- The interpret_nth_prior_checkout helper did not receive
the name length at all. This turns out not to be a
problem in practice, though, because its parsing is so
limited: it always starts from the far-left of the
string, and will not tolerate a colon (which is
currently the only way to get a smaller-than-strlen
"namelen"). However, it's still worth fixing to make the
code more obviously correct, and to future-proof us
against callers with more exotic buffers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the original version of this function, "cp" acted as a
pointer to many different things. Since the refactoring in
the last patch, it only marks the at-sign in the string.
Let's use a more descriptive variable name.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function checks a few different @{}-constructs. The
early part checks for and dispatches us to helpers for each
construct, but the code for handling @{upstream} is inline.
Let's factor this out into its own function. This makes
interpret_branch_name more readable, and will make it much
simpler to further refactor the function in future patches.
While we're at it, let's also break apart the refactored
code into a few helper functions. These will be useful if we
eventually implement similar @{upstream}-like constructs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing a 40-hex string into the object name, the string is
checked to see if it can be interpreted as a ref so that a warning
can be given for ambiguity. The code kicked in even when the
core.warnambiguousrefs is set to false to squelch this warning, in
which case the cycles spent to look at the ref namespace were an
expensive no-op, as the result was discarded without being used.
* br/sha1-name-40-hex-no-disambiguation:
sha1_name: don't resolve refs when core.warnambiguousrefs is false
When seeing a full 40-hex object name, get_sha1_basic()
unconditionally checks if the string can also be interpreted as a
refname, but the result will not be used unless warn_ambiguous_refs
is in effect.
Omitting this unnecessary ref resolution provides a substantial
performance improvement, especially when passing many hashes to a
command (like "git rev-list --stdin") and core.warnambiguousrefs is
set to false. The check incurs 6 stat()s for every hash supplied,
which can be costly over NFS.
Signed-off-by: Brodie Rao <brodie@sf.io>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove a few duplicate implementations of prefix/suffix comparison
functions, and rename them to starts_with and ends_with.
* cc/starts-n-ends-with:
replace {pre,suf}fixcmp() with {starts,ends}_with()
strbuf: introduce starts_with() and ends_with()
builtin/remote: remove postfixcmp() and use suffixcmp() instead
environment: normalize use of prefixcmp() by removing " != 0"
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/robustify-parse-commit:
checkout: do not die when leaving broken detached HEAD
use parse_commit_or_die instead of custom message
use parse_commit_or_die instead of segfaulting
assume parse_commit checks for NULL commit
assume parse_commit checks commit->object.parsed
log_tree_diff: die when we fail to parse a commit
The parse_commit function will check whether it was passed a
NULL commit pointer, and if so, return an error. There is no
need for callers to check this separately.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of typing four capital letters "HEAD", you can say "@" now,
e.g. "git log @".
* fc/at-head:
Add new @ shortcut for HEAD
sha1-name: pass len argument to interpret_branch_name()
Make "foo^{tag}" to peel a tag to itself, i.e. no-op., and fail if
"foo" is not a tag. "git rev-parse --verify v1.0^{tag}" would be a
more convenient way to say "test $(git cat-file -t v1.0) = tag".
* rh/peeling-tag-to-tag:
peel_onion: do not assume length of x_type globals
peel_onion(): add support for <rev>^{tag}
Typing 'HEAD' is tedious, especially when we can use '@' instead.
The reason for choosing '@' is that it follows naturally from the
ref@op syntax (e.g. HEAD@{u}), except we have no ref, and no
operation, and when we don't have those, it makes sens to assume
'HEAD'.
So now we can use 'git show @~1', and all that goody goodness.
Until now '@' was a valid name, but it conflicts with this idea, so
let's make it invalid. Probably very few people, if any, used this name.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace 'committish' in documentation and comments with 'commit-ish'
to match gitglossary(7) and to be consistent with 'tree-ish'.
The only remaining instances of 'committish' are:
* variable, function, and macro names
* "(also committish)" in the definition of commit-ish in
gitglossary[7]
Signed-off-by: Richard Hansen <rhansen@bbn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are parsing "rev^{foo}", we check "foo" against the
various global type strings, like "commit_type",
"tree_type", etc. This is nicely abstracted, but then we
destroy the abstraction completely by using magic numbers
that must match the length of the type strings.
We could avoid these magic numbers by using skip_prefix. But
taking a step back, we can realize that using the
"commit_type" global is not really buying us anything. It is
not ever going to change from being "commit" without causing
severe breakage to existing uses. And even if it did change
for some crazy reason, we would want to evaluate its effects
on the "rev^{}" syntax, anyway.
Let's just switch these to using a custom string literal, as
we do for "rev^{object}". The resulting code is more robust
to changes in the type strings, and is more readable.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Complete the <rev>^{<type>} family of object descriptors by having
<rev>^{tag} dereference <rev> until a tag object is found (or fail if
unable).
At first glance this may not seem very useful, as commits, trees, and
blobs cannot be peeled to a tag, and a tag would just peel to itself.
However, this can be used to ensure that <rev> names a tag object:
$ git rev-parse --verify v1.8.4^{tag}
04f013dc38
$ git rev-parse --verify master^{tag}
error: master^{tag}: expected tag type, but the object dereferences to tree type
fatal: Needed a single revision
Users can already ensure that <rev> is a tag object by checking the
output of 'git cat-file -t <rev>', but:
* users may expect <rev>^{tag} to exist given that <rev>^{commit},
<rev>^{tree}, and <rev>^{blob} all exist
* this syntax is more convenient/natural in some circumstances
Signed-off-by: Richard Hansen <rhansen@bbn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is useful to make sure we don't step outside the boundaries of what
we are interpreting at the moment. For example while interpreting
foobar@{u}~1, the job of interpret_branch_name() ends right before ~1,
but there's no way to figure that out inside the function, unless the
len argument is passed.
So let's do that.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit cdfd94837b, as it
does not just apply to "@" (and forms with modifiers like @{u}
applied to it), but also affects e.g. "refs/heads/@/foo", which it
shouldn't.
The basic idea of giving a short-hand might be good, and the topic
can be retried later, but let's revert to avoid affecting existing
use cases for now for the upcoming release.
We spell config variables in camelCase instead of with_underscores.
Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If somebody wants to only know on-disk footprint of an object
without having to know its type or payload size, we can bypass a
lot of code to cheaply learn it.
* jk/cat-file-batch-optim:
Fix some sparse warnings
sha1_object_info_extended: pass object_info to helpers
sha1_object_info_extended: make type calculation optional
packed_object_info: make type lookup optional
packed_object_info: hoist delta type resolution to helper
sha1_loose_object_info: make type lookup optional
sha1_object_info_extended: rename "status" to "type"
cat-file: disable object/refname ambiguity check for batch mode