The "remainder" of hn/refs-errno-cleanup topic.
* ab/refs-errno-cleanup: (21 commits)
refs API: post-migration API renaming [2/2]
refs API: post-migration API renaming [1/2]
refs API: don't expose "errno" in run_transaction_hook()
refs API: make expand_ref() & repo_dwim_log() not set errno
refs API: make resolve_ref_unsafe() not set errno
refs API: make refs_ref_exists() not set errno
refs API: make refs_resolve_refdup() not set errno
refs tests: ignore ignore errno in test-ref-store helper
refs API: ignore errno in worktree.c's find_shared_symref()
refs API: ignore errno in worktree.c's add_head_info()
refs API: make files_copy_or_rename_ref() et al not set errno
refs API: make loose_fill_ref_dir() not set errno
refs API: make resolve_gitlink_ref() not set errno
refs API: remove refs_read_ref_full() wrapper
refs/files: remove "name exist?" check in lock_ref_oid_basic()
reflog tests: add --updateref tests
refs API: make refs_rename_ref_available() static
refs API: make parse_loose_ref_contents() not set errno
refs API: make refs_read_raw_ref() not set errno
refs API: add a version of refs_resolve_ref_unsafe() with "errno"
...
There is only one caller, builtin/checkout.c, and it hardcodes
force_create=1.
This argument was introduced in abd0cd3a30 (refs: new public ref function:
safe_create_reflog, 2015-07-21), which promised to immediately use it in a
follow-on commit, but that never happened.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using gcc-11 (or 12) to compile refs.o with -O3 results in:
In file included from hashmap.h:4,
from cache.h:6,
from refs.c:5:
In function ‘oidcpy’,
inlined from ‘ref_transaction_add_update’ at refs.c:1065:3,
inlined from ‘ref_transaction_update’ at refs.c:1094:2,
inlined from ‘ref_transaction_verify’ at refs.c:1132:9:
hash.h:262:9: warning: argument 2 null where non-null expected [-Wnonnull]
262 | memcpy(dst->hash, src->hash, GIT_MAX_RAWSZ);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from git-compat-util.h:177,
from cache.h:4,
from refs.c:5:
refs.c: In function ‘ref_transaction_verify’:
/usr/include/string.h:43:14: note: in a call to function ‘memcpy’ declared ‘nonnull’
43 | extern void *memcpy (void *__restrict __dest, const void *__restrict __src,
| ^~~~~~
That call to memcpy() is in a conditional block that requires
REF_HAVE_NEW to be set. But in ref_transaction_update(), we make sure it
isn't set coming in:
if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS)
BUG("illegal flags 0x%x passed to ref_transaction_update()", flags);
and then only set it if the variable isn't NULL:
flags |= (new_oid ? REF_HAVE_NEW : 0) | (old_oid ? REF_HAVE_OLD : 0);
So it should be impossible to reach that memcpy() with a NULL oid. But
for whatever reason, gcc doesn't accept that hitting the BUG() means we
won't go any further, even though it's marked with the noreturn
attribute. And the conditional is correct; ALLOWED_FLAGS doesn't contain
HAVE_NEW or HAVE_OLD, and you can even simplify it to check for those
flags explicitly and the compiler still complains.
We can work around this by just clearing the disallowed flags
explicitly. This should be a noop because of the BUG() check, but it
makes the compiler happy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow through the work to use the repo interface to access
submodule objects in-process, instead of abusing the alternate
object database interface.
* jt/no-abuse-alternate-odb-for-submodules:
submodule: trace adding submodule ODB as alternate
submodule: pass repo to check_has_commit()
object-file: only register submodule ODB if needed
merge-{ort,recursive}: remove add_submodule_odb()
refs: peeling non-the_repository iterators is BUG
refs: teach arbitrary repo support to iterators
refs: plumb repo into ref stores
Rename the transitory refs_werrres_ref_unsafe() function to
refs_resolve_ref_unsafe(), now that all callers of the old function
have learned to pass in a "failure_errno" parameter.
The coccinelle semantic patch added in the preceding commit works, but
I couldn't figure out how to get spatch(1) to re-flow these argument
lists (and sometimes make lines way too long), so this rename was done
with:
perl -pi -e 's/refs_werrres_ref_unsafe/refs_resolve_ref_unsafe/g' \
$(git grep -l refs_werrres_ref_unsafe -- '*.c')
But after that "make contrib/coccinelle/refs.cocci.patch" comes up
empty, so the result would have been the same. Let's remove that
transitory semantic patch file, we won't need to retain it for any
other in-flight changes, refs_werrres_ref_unsafe() only existed within
this patch series.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preceding commits all callers of refs_resolve_ref_unsafe() were
migrated to the transitory refs_werrres_ref_unsafe() function.
As a first step in getting rid of it let's remove the old function
from the public API (it went unused in a preceding commit).
We then provide both a coccinelle rule to do the rename, and a macro
to avoid breaking the existing callers.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In run_transaction_hook() we've checked errno since 6754159767 (refs:
implement reference transaction hook, 2020-06-19), let's reset errno
afterwards to make sure nobody using refs.c directly or indirectly
relies on it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The use of these two is rather trivial, and it's easy to see none of
their callers care about errno. So let's move them from
refs_resolve_ref_unsafe() to refs_resolve_ref_unsafe_with_errno(),
these were the last two callers, so we can get rid of that wrapper
function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the resolve_ref_unsafe() wrapper function to use the underlying
refs_werrres_ref_unsafe() directly.
From a reading of the callers I determined that the only one who cared
about errno was a sequencer.c caller added in e47c6cafcb (commit:
move print_commit_summary() to libgit, 2017-11-24), I'm migrating it
to using refs_werrres_ref_unsafe() directly.
This adds another "set errno" instance, but in this case it's OK and
idiomatic. We are setting it just before calling die_errno(). We could
have some hypothetical die_errno_var(&saved_errno, ...) here, but I
don't think it's worth it. The problem with errno is subtle action at
distance, not this sort of thing. We already use this pattern in a
couple of places in wrapper.c
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move refs_ref_exists from the legacy refs_resolve_ref_unsafe() to the
new refs_werrres_ref_unsafe(). I have read its callers and determined
that they don't care about errno being set, in particular:
git grep -W -w -e refs_ref_exists -e ref_exists
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move refs_resolve_refdup() from the legacy refs_resolve_ref_unsafe()
to the new refs_werrres_ref_unsafe(). I have read its callers and
determined that they don't care about errno being set.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I have carefully read the upstream callers of resolve_gitlink_ref()
and determined that they don't care about errno.
So let's move away from the errno-setting refs_resolve_ref_unsafe()
wrapper to refs_werrres_ref_unsafe(), and explicitly ignore the errno
it sets for us.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the refs_read_ref_full() wrapper in favor of migrating various
refs.c API users to the underlying refs_werrres_ref_unsafe() function.
A careful reading of these callers shows that the callers of this
function did not care about "errno", by moving away from the
refs_resolve_ref_unsafe() wrapper we can be sure that nothing relies
on it anymore.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the refs_rename_ref_available() function into
"refs/files-backend.c". It is file-backend specific.
This function was added in 5fe7d825da (refs.c: pass a list of names
to skip to is_refname_available, 2014-05-01) as rename_ref_available()
and was only ever used in this one file-backend specific codepath. So
let's move it there.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the parse_loose_ref_contents() function to stop setting "errno"
and failure, and to instead pass up a "failure_errno" via a
parameter. This requires changing its callers to do the same.
The EINVAL error from parse_loose_ref_contents is used in files-backend
to create a custom error message.
In untangling this we discovered a tricky edge case. The
refs_read_special_head() function was relying on
parse_loose_ref_contents() setting EINVAL.
By converting it to use "saved_errno" we can migrate away from "errno"
in this part of the code entirely, and do away with an existing
"save_errno" pattern, its only purpose was to not clobber the "errno"
we previously needed at the end of files_read_raw_ref().
Let's assert that we can do that by not having files_read_raw_ref()
itself operate on *failure_errno in addition to passing it on. Instead
we'll assert that if we return non-zero we actually do set errno, thus
assuring ourselves and callers that they can trust the resulting
"failure_errno".
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a "failure_errno" to refs_read_raw_ref(), his allows
refs_werrres_ref_unsafe() to pass along its "failure_errno", as a
first step before its own callers are migrated to pass it further up
the chain.
We are leaving out out the refs_read_special_head() in
refs_read_raw_ref() for now, as noted in a subsequent commit moving it
to "failure_errno" will require some special consideration.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new refs_werrres_ref_unsafe() function, which is like
refs_resolve_ref_unsafe() except that it explicitly saves away the
"errno" to a passed-in parameter, the refs_resolve_ref_unsafe() then
becomes a wrapper for it.
In subsequent commits we'll migrate code over to it, before finally
making "refs_resolve_ref_unsafe()" with an "errno" parameter the
canonical version, so this this function exists only so that we can
incrementally migrate callers, it will be going away in a subsequent
commit.
As the added comment notes has a rather tortured name to be the same
length as "refs_resolve_ref_unsafe", to avoid churn as we won't need
to re-indent the argument lists, similarly the documentation and
structure of it in refs.h is designed to minimize a diff in a
subsequent commit, where that documentation will be added to the new
refs_resolve_ref_unsafe().
At the end of this migration the "meaningful errno" TODO item left in
76d70dc0c6 (refs.c: make resolve_ref_unsafe set errno to something
meaningful on error, 2014-06-20) will be resolved.
As can be seen from the use of refs_read_raw_ref() we'll also need to
convert some functions that the new refs_werrres_ref_unsafe() itself
calls to take this "failure_errno". That will be done in subsequent
commits.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mostly preliminary clean-up in the hook API.
* ab/config-based-hooks-1:
hook-list.h: add a generated list of hooks, like config-list.h
hook.c users: use "hook_exists()" instead of "find_hook()"
hook.c: add a hook_exists() wrapper and use it in bugreport.c
hook.[ch]: move find_hook() from run-command.c to hook.c
Makefile: remove an out-of-date comment
Makefile: don't perform "mv $@+ $@" dance for $(GENERATED_H)
Makefile: stop hardcoding {command,config}-list.h
Makefile: mark "check" target as .PHONY
The ref iteration code used to optionally allow dangling refs to be
shown, which has been tightened up.
* jk/ref-paranoia:
refs: drop "broken" flag from for_each_fullref_in()
ref-filter: drop broken-ref code entirely
ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN
repack, prune: drop GIT_REF_PARANOIA settings
refs: turn on GIT_REF_PARANOIA by default
refs: omit dangling symrefs when using GIT_REF_PARANOIA
refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag
refs-internal.h: reorganize DO_FOR_EACH_* flag documentation
refs-internal.h: move DO_FOR_EACH_* flags next to each other
t5312: be more assertive about command failure
t5312: test non-destructive repack
t5312: create bogus ref as necessary
t5312: drop "verbose" helper
t5600: provide detached HEAD for corruption failures
t5516: don't use HEAD ref for invalid ref-deletion tests
t7900: clean up some more broken refs
Note that should_pack_ref() is called when writing refs, which is only
supported for the_repository, hence the_repository is hardcoded there.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for the next 2 patches that adds (partial) support for
arbitrary repositories to ref iterators, plumb a repository into all ref
stores. There are no changes to program logic.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function was added in 3dce444f17 (refs: add a backend method
structure, 2016-09-04), but has never been used by anything. The only
caller that might care uses find_ref_storage_backend() directly.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No callers pass in anything but "0" here. Likewise to our sibling
functions. Note that some of them ferry along the flag, but none of
their callers pass anything but "0" either.
Nor is anybody likely to change that. Callers which really want to see
all of the raw refs use for_each_rawref(). And anybody interested in
iterating a subset of the refs will likely be happy to use the
now-default behavior of showing broken refs, but omitting dangling
symlinks.
So we can get rid of this whole feature.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that GIT_REF_PARANOIA is the default, we don't need to selectively
enable it for destructive operations. In fact, it's harmful to do so,
because it overrides any GIT_REF_PARANOIA=0 setting that the user may
have provided (because they're trying to work around some corruption).
With these uses gone, we can further clean up the ref_paranoia global,
and make it a static variable inside the refs code.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original point of the GIT_REF_PARANOIA flag was to include broken
refs in iterations, so that possibly-destructive operations would not
silently ignore them (and would generally instead try to operate on the
oids and fail when the objects could not be accessed).
We already turned this on by default for some dangerous operations, like
"repack -ad" (where missing a reachability tip would mean dropping the
associated history). But it was not on for general use, even though it
could easily result in the spreading of corruption (e.g., imagine
cloning a repository which simply omits some of its refs because
their objects are missing; the result quietly succeeds even though you
did not clone everything!).
This patch turns on GIT_REF_PARANOIA by default. So a clone as mentioned
above would actually fail (upload-pack tells us about the broken ref,
and when we ask for the objects, pack-objects fails to deliver them).
This may be inconvenient when working with a corrupted repository, but:
- we are better off to err on the side of complaining about
corruption, and then provide mechanisms for explicitly loosening
safety.
- this is only one type of corruption anyway. If we are missing any
other objects in the history that _aren't_ ref tips, then we'd
behave similarly (happily show the ref, but then barf when we
started traversing).
We retain the GIT_REF_PARANOIA variable, but simply default it to "1"
instead of "0". That gives the user an escape hatch for loosening this
when working with a corrupt repository. It won't work across a remote
connection to upload-pack (because we can't necessarily set environment
variables on the remote), but there the client has other options (e.g.,
choosing which refs to fetch).
As a bonus, this also makes ref iteration faster in general (because we
don't have to call has_object_file() for each ref), though probably not
noticeably so in the general case. In a repo with a million refs, it
shaved a few hundred milliseconds off of upload-pack's advertisement;
that's noticeable, but most repos are not nearly that large.
The possible downside here is that any operation which iterates refs but
doesn't ever open their objects may now quietly claim to have X when the
object is corrupted (e.g., "git rev-list new-branch --not --all" will
treat a broken ref as uninteresting). But again, that's not really any
different than corruption below the ref level. We might have
refs/heads/old-branch as non-corrupt, but we are not actively checking
that we have the entire reachable history. Or the pointed-to object
could even be corrupted on-disk (but our "do we have it" check would
still succeed). In that sense, this is merely bringing ref-corruption in
line with general object corruption.
One alternative implementation would be to actually check for broken
refs, and then _immediately die_ if we see any. That would cause the
"rev-list --not --all" case above to abort immediately. But in many ways
that's the worst of all worlds:
- it still spends time looking up the objects an extra time
- it still doesn't catch corruption below the ref level
- it's even more inconvenient; with the current implementation of
GIT_REF_PARANOIA for something like upload-pack, we can make
the advertisement and let the client choose a non-broken piece of
history. If we bail as soon as we see a broken ref, they cannot even
see the advertisement.
The test changes here show some of the fallout. A non-destructive "git
repack -adk" now fails by default (but we can override it). Deleting a
broken ref now actually tells the hooks the correct "before" state,
rather than a confusing null oid.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dangling symrefs aren't actually a corruption problem. It's perfectly
fine for refs/remotes/origin/HEAD to point to an unborn branch. And in
particular, if you are trying to establish reachability, a symref that
points nowhere doesn't matter either way. Any ref it could point to will
be examined during the rest of the traversal.
It's possible that a symref pointing nowhere _could_ be a sign that the
ref it was meant to point to was deleted accidentally (e.g., via
corruption). But there is no particular reason to think that is true for
any given case, and in the meantime, GIT_REF_PARANOIA kicking in
automatically for some operations means they'll fail unnecessarily.
So let's loosen it just a bit. The new test in t5312 shows off an
example that is safe, but currently fails (and no longer does after this
patch).
Note that we don't do anything if the caller explicitly asked for
DO_FOR_EACH_INCLUDE_BROKEN. In that case they may be looking for
dangling symrefs themselves, and setting GIT_REF_PARANOIA should not
_loosen_ things from what the caller asked for.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for the DO_FOR_EACH_* flags is sprinkled over the
refs-internal.h file. We define the two flags in one spot, and then
describe them in more detail far away from there, in the definitions of
refs_ref_iterator_begin() and ref_iterator_advance_fn().
Let's try to organize this a bit better:
- convert the #defines to an enum. This makes it clear that they are
related, and that the enum shows the complete set of flags.
- combine all descriptions for each flag in a single spot, next to the
flag's definition
- use the enum rather than a bare int for functions which take the
flags. This helps readers realize which flags can be used.
- clarify the mention of flags for ref_iterator_advance_fn(). It does
not take flags itself, but is meant to depend on ones set up
earlier.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the find_hook() function from run-command.c to a new hook.c
library. This change establishes a stub library that's pretty
pointless right now, but will see much wider use with Emily Shaffer's
upcoming "configuration-based hooks" series.
Eventually all the hook related code will live in hook.[ch]. Let's
start that process by moving the simple find_hook() function over
as-is.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes it explicit how alternative ref backends should report errors in
read_raw_ref_fn.
read_raw_ref_fn needs to supply a credible errno for a number of cases. These
are primarily:
1) The files backend calls read_raw_ref from lock_raw_ref, and uses the
resulting error codes to create/remove directories as needed.
2) ENOENT should be translated in a zero OID, optionally with REF_ISBROKEN set,
returning the last successfully resolved symref. This is necessary so
read_raw_ref("HEAD") on an empty repo returns refs/heads/main (or the default branch
du-jour), and we know on which branch to create the first commit.
Make this information flow explicit by adding a failure_errno to the signature
of read_raw_ref. All errnos from the files backend are still propagated
unchanged, even though inspection suggests only ENOTDIR, EISDIR and ENOENT are
relevant.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the the preceding commit the "oid" parameter to reflog_expire()
is always NULL, but it was not cleaned up to reduce the size of the
diff. Let's do that subsequent API and documentation cleanup now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the repo_dwim_log() function initially added as dwim_log() in
eb3a48221f (log --reflog: use dwim_log, 2007-02-09) to accept a NULL
oid parameter. The refs_resolve_ref_unsafe() function it invokes
already deals with it, but it didn't.
This allows for a bit more clarity in a reflog-walk.c codepath added
in f2eba66d4d (Enable HEAD@{...} and make it independent from the
current branch, 2007-02-03). We'll shortly use this in
builtin/reflog.c as well.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use -1 as error return value throughout.
This removes spurious differences in the GIT_TRACE_REFS output, depending on the
ref storage backend active.
Before, the cached ref_iterator (but only that iterator!) would return
peel_object() output directly. No callers relied on the peel_status values
beyond success/failure. All calls to these functions go through
peel_iterated_oid(), which returns peel_object() as a fallback, but also
squashing the error values.
The iteration interface already passes REF_ISSYMREF and REF_ISBROKEN through the
flags argument, so the additional error values in enum peel_status provide no
value.
The ref iteration interface provides a separate peel() function because certain
formats (eg. packed-refs and reftable) can store the peeled object next to the
tag SHA1. Passing the peeled SHA1 as an optional argument to each_ref_fn maps
more naturally to the implementation of ref databases. Changing the code in this
way is left for a future refactoring.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
SHA-256 transition.
* bc/hash-transition-interop-part-1:
hex: print objects using the hash algorithm member
hex: default to the_hash_algo on zero algorithm value
builtin/pack-objects: avoid using struct object_id for pack hash
commit-graph: don't store file hashes as struct object_id
builtin/show-index: set the algorithm for object IDs
hash: provide per-algorithm null OIDs
hash: set, copy, and use algo field in struct object_id
builtin/pack-redundant: avoid casting buffers to struct object_id
Use the final_oid_fn to finalize hashing of object IDs
hash: add a function to finalize object IDs
http-push: set algorithm when reading object ID
Always use oidread to read into struct object_id
hash: add an algo member to struct object_id
Up until recently, object IDs did not have an algorithm member, only a
hash. Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms. Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.
Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the other lookup_foo() functions take a repository argument, but
lookup_unknown_object() was never converted, and it uses the_repository
internally. Let's fix that.
We could leave a wrapper that uses the_repository, but there aren't that
many calls, so we'll just convert them all. I looked briefly at each
site to see if we had a repository struct (besides the_repository) we
could pass, but none of them do (so this conversion to pass
the_repository is a pure noop in each case, though it does take us one
step closer to eventually getting rid of the_repository).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead. It shortens the code and infers the
element size automatically.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ls-refs protocol operation has been optimized to narrow the
sub-hierarchy of refs/ it walks to produce response.
* tb/ls-refs-optim:
ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets
ls-refs.c: initialize 'prefixes' before using it
refs: expose 'for_each_fullref_in_prefixes'
This function was used in the ref-filter.c code to find the longest
common prefix of among a set of refspecs, and then to iterate all of the
references that descend from that prefix.
A future patch will want to use that same code from ls-refs.c, so
prepare by exposing and moving it to refs.c. Since there is nothing
specific to the ref-filter code here (other than that it was previously
the only caller of this function), this really belongs in the more
generic refs.h header.
The code moved in this patch is identical before and after, with the one
exception of renaming some arguments to be consistent with other
functions exposed in refs.h.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The peel_ref() interface is confusing and error-prone:
- it's typically used by ref iteration callbacks that have both a
refname and oid. But since they pass only the refname, we may load
the ref value from the filesystem again. This is inefficient, but
also means we are open to a race if somebody simultaneously updates
the ref. E.g., this:
int some_ref_cb(const char *refname, const struct object_id *oid, ...)
{
if (!peel_ref(refname, &peeled))
printf("%s peels to %s",
oid_to_hex(oid), oid_to_hex(&peeled);
}
could print nonsense. It is correct to say "refname peels to..."
(you may see the "before" value or the "after" value, either of
which is consistent), but mentioning both oids may be mixing
before/after values.
Worse, whether this is possible depends on whether the optimization
to read from the current iterator value kicks in. So it is actually
not possible with:
for_each_ref(some_ref_cb);
but it _is_ possible with:
head_ref(some_ref_cb);
which does not use the iterator mechanism (though in practice, HEAD
should never peel to anything, so this may not be triggerable).
- it must take a fully-qualified refname for the read_ref_full() code
path to work. Yet we routinely pass it partial refnames from
callbacks to for_each_tag_ref(), etc. This happens to work when
iterating because there we do not call read_ref_full() at all, and
only use the passed refname to check if it is the same as the
iterator. But the requirements for the function parameters are quite
unclear.
Instead of taking a refname, let's instead take an oid. That fixes both
problems. It's a little funny for a "ref" function not to involve refs
at all. The key thing is that it's optimizing under the hood based on
having access to the ref iterator. So let's change the name to make it
clear why you'd want this function versus just peel_object().
There are two other directions I considered but rejected:
- we could pass the peel information into the each_ref_fn callback.
However, we don't know if the caller actually wants it or not. For
packed-refs, providing it is essentially free. But for loose refs,
we actually have to peel the object, which would be wasteful in most
cases. We could likewise pass in a flag to the callback indicating
whether the peeled information is known, but that complicates those
callbacks, as they then have to decide whether to manually peel
themselves. Plus it requires changing the interface of every
callback, whether they care about peeling or not, and there are many
of them.
- we could make a function to return the peeled value of the current
iterated ref (computing it if necessary), and BUG() otherwise. I.e.:
int peel_current_iterated_ref(struct object_id *out);
Each of the current callers is an each_ref_fn callback, so they'd
mostly be happy. But:
- we use those callbacks with functions like head_ref(), which do
not use the iteration code. So we'd need to handle the fallback
case there, anyway.
- it's possible that a caller would want to call into generic code
that sometimes is used during iteration and sometimes not. This
encapsulates the logic to do the fast thing when possible, and
fallback when necessary.
The implementation is mostly obvious, but I want to call out a few
things in the patch:
- the test-tool coverage for peel_ref() is now meaningless, as it all
collapses to a single peel_object() call (arguably they were pretty
uninteresting before; the tricky part of that function is the
fast-path we see during iteration, but these calls didn't trigger
that). I've just dropped it entirely, though note that some other
tests relied on the tags we created; I've moved that creation to the
tests where it matters.
- we no longer need to take a ref_store parameter, since we'd never
look up a ref now. We do still rely on a global "current iterator"
variable which _could_ be kept per-ref-store. But in practice this
is only useful if there are multiple recursive iterations, at which
point the more appropriate solution is probably a stack of
iterators. No caller used the actual ref-store parameter anyway
(they all call the wrapper that passes the_repository).
- the original only kicked in the optimization when the "refname"
pointer matched (i.e., not string comparison). We do likewise with
the "oid" parameter here, but fall back to doing an actual oideq()
call. This in theory lets us kick in the optimization more often,
though in practice no current caller cares. It should never be
wrong, though (peeling is a property of an object, so two refs
pointing to the same object would peel identically).
- the original took care not to touch the peeled out-parameter unless
we found something to put in it. But no caller cares about this, and
anyway, it is enforced by peel_object() itself (and even in the
optimized iterator case, that's where we eventually end up). We can
shorten the code and avoid an extra copy by just passing the
out-parameter through the stack.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This sequence works
$ git checkout -b newbranch
$ git commit --allow-empty -m one
$ git show -s newbranch@{1}
and shows the state that was immediately after the newbranch was
created.
But then if you do
$ git reflog expire --expire=now refs/heads/newbranch
$ git commit --allow=empty -m two
$ git show -s newbranch@{1}
you'd be scolded with
fatal: log for 'newbranch' only has 1 entries
While it is true that it has only 1 entry, we have enough
information in that single entry that records the transition between
the state in which the tip of the branch was pointing at commit
'one' to the new commit 'two' built on it, so we should be able to
answer "what object newbranch was pointing at?". But we refuse to
do so.
Make @{0} the special case where we use the new side to look up that
entry. Otherwise, look up @{n} using the old side of the (n-1)th entry
of the reflog.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This block of code is duplicated twice. In a future commit, it will be
duplicated for a third time. Factor out the common functionality into
set_read_ref_cutoffs().
In the case of read_ref_at_ent(), we are incrementing `cb->reccnt` at the
beginning of the function. Move these to right before the return so that
the `cb->reccnt - 1` is changed to `cb->reccnt` and it can be cleanly
factored out into set_read_ref_cutoffs(). The duplication of the
increment statements will be removed in a future patch.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To give ample warning for users wishing to override Git's the fall-back
for an unconfigured `init.defaultBranch` (in case we decide to change it
in a future Git version), let's introduce some advice that is shown upon
`git init` when that value is not set.
Note: two test cases in Git's test suite want to verify that the
`stderr` output of `git init` is empty. It is now necessary to suppress
the advice, we now do that via the `init.defaultBranch` setting. While
not strictly necessary, we also set this to `false` in
`test_create_repo()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are about to introduce a message giving users running `git init` some
advice about `init.defaultBranch`. This will necessarily be done in
`repo_default_branch_name()`.
Not all code paths want to show that advice, though. In particular, the
`git clone` codepath _specifically_ asks for `init_db()` to be quiet,
via the `INIT_DB_QUIET` flag.
In preparation for showing users above-mentioned advice, let's change
the function signature of `get_default_branch_name()` to accept the
parameter `quiet`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To allow for an incremental conversion to a new default main branch
name, let's introduce `GIT_TEST_DEFAULT_MAIN_BRANCH_NAME`. This
environment variable can be set at the top of each converted test
script, overriding the default main branch name to use when initializing
new repositories (or cloning empty repositories).
Note: the `GIT_TEST_DEFAULT_MAIN_BRANCH_NAME` is _not_ intended to be
used manually; many tests require a specific main branch name and cannot
simply work with another one. This `GIT_TEST_*` variable is meant purely
for the transitional period while the entire test suite is converted to
use `main` as the initial branch name by default.
We also introduce the `PREPARE_FOR_MAIN_BRANCH` prereq that determines
whether the default main branch name is `main`, and adjust a couple of
test functions to use it. This prereq will be used to temporarily
disable a couple test cases to allow for adjusting the test script
incrementally. Once an entire test is adjusted, we will adjust the test
so that it is run with `GIT_TEST_DEFAULT_MAIN_BRANCH_NAME=main`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>