"maybe" pointer in 'struct commit' is tricky because it can be lazily
initialized to take advantage of commit-graph if available. This makes
it not safe to access directly.
This leads to a rule in commit.cocci to rewrite 'x->maybe_tree' to
'get_commit_tree(x)'. But that rule alone could lead to incorrectly
rewrite assignments, e.g. from
x->maybe_tree = yes
to
get_commit_tree(x) = yes
Because of this we have a second rule to revert this effect. Szeder
found out that we could do better by performing the assignment rewrite
rule first, then the remaining is read-only access and handled by the
current first rule.
For this to work, we need to transform "x->maybe_tree = y" to something
that does NOT contain "x->maybe_tree" to avoid the original first
rule. This is where set_commit_tree() comes in.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command line completion (in contrib/) has been taught to
complete more subcommand parameters.
* nd/completion-more-parameters:
completion: add more parameter value completion
This adds value completion for a couple more paramters. To make it
easier to maintain these hard coded lists, add a comment at the original
list/code to remind people to update git-completion.bash too.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few functions related to directory renames that have unused
parameters. After consulting with the author in [1], these seem to be
leftover cruft from the development process, and not signs of any bug.
Let's drop them.
[1] https://public-inbox.org/git/CABPp-BHobf8wbBsXF97scNQCzkxQukziODRXq6JOOWq61cAd9g@mail.gmail.com/
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The assumption to work on the single "in-core index" instance has
been reduced from the library-ish part of the codebase.
* nd/the-index-final:
cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch
read-cache.c: remove the_* from index_has_changes()
merge-recursive.c: remove implicit dependency on the_repository
merge-recursive.c: remove implicit dependency on the_index
sha1-name.c: remove implicit dependency on the_index
read-cache.c: replace update_index_if_able with repo_&
read-cache.c: kill read_index()
checkout: avoid the_index when possible
repository.c: replace hold_locked_index() with repo_hold_locked_index()
notes-utils.c: remove the_repository references
grep: use grep_opt->repo instead of explict repo argument
The traversal over tree objects has learned to honor
":(attr:label)" pathspec match, which has been implemented only for
enumerating paths on the filesystem.
* nd/attr-pathspec-in-tree-walk:
tree-walk: support :(attr) matching
dir.c: move, rename and export match_attrs()
pathspec.h: clean up "extern" in function declarations
tree-walk.c: make tree_entry_interesting() take an index
tree.c: make read_tree*() take 'struct repository *'
read_index() shares the same problem as hold_locked_index(): it
assumes $GIT_DIR/index. Move all call sites to repo_read_index()
instead. read_index_preload() and read_index_unmerged() are also
killed as a consequence.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
hold_locked_index() assumes the index path at $GIT_DIR/index. This is
not good for places that take an arbitrary index_state instead of
the_index, which is basically everywhere except builtin/.
Replace it with repo_hold_locked_index(). hold_locked_index() remains
as a wrapper around repo_hold_locked_index() to reduce changes in builtin/
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions call tree_entry_interesting() which will soon require
a 'struct index_state *' to be passed in. Instead of just changing the
function signature to take an index, update to take a repo instead
because these functions do need object database access.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In handle_rename_rename_1to2(), we have duplicated error handling
around colliding paths. Specifically, when we want to write out
the file and there is a directory or untracked file in the way,
we need to create a temporary file to hold the contents. This has
some special output to alert the user, and this output is
duplicated for each side of the conflict.
Simplify the call by generating this new path in a helper
function.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we have a rename/rename(1to2) conflict, each of the renames can
collide with a file addition. Each of these rename/add conflicts suffered
from the same kinds of problems that normal rename/add suffered from.
Make the code use handle_file_conflicts() as well so that we get all the
same fixes and consistent behavior between the different conflict types.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This results in no-net change of behavior, it simply ensures that all
file-collision conflict handling types are being handled the same by
calling the same function.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes the rename/rename(2to1) conflicts use the new
handle_file_collision() function. Since that function was based
originally on the rename/rename(2to1) handling code, the main
differences here are in what was added. In particular:
* Instead of storing files at collide_path~HEAD and collide_path~MERGE,
the files are two-way merged and recorded at collide_path.
* Instead of recording the version of the renamed file that existed
on the renamed side in the index (thus ignoring any changes that
were made to the file on the side of history without the rename),
we do a three-way content merge on the renamed path, then store
that at either stage 2 or stage 3.
* Note that since the content merge for each rename may have conflicts,
and then we have to merge the two renamed files, we can end up with
nested conflict markers.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes the rename/add conflict handling make use of the new
handle_file_collision() function, which fixes several bugs and improves
things for the rename/add case significantly. Previously, rename/add
would:
* Not leave any higher order stage entries in the index, making it
appear as if there were no conflict.
* Would place the rename file at the colliding path, and move the
added file elsewhere, which combined with the lack of higher order
stage entries felt really odd. It's not clear to me why the
rename should take precedence over the add; if one should be moved
out of the way, they both probably should.
* In the recursive case, it would do a two way merge of the added
file and the version of the renamed file on the renamed side,
completely excluding modifications to the renamed file on the
unrenamed side of history.
Use the new handle_file_collision() to fix all of these issues.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are three conflict types that represent two (possibly entirely
unrelated) files colliding at the same location:
* add/add
* rename/add
* rename/rename(2to1)
These three conflict types already share more similarity than might be
immediately apparent from their description: (1) the handling of the
rename variants already involves removing any entries from the index
corresponding to the original file names[*], thus only leaving entries
in the index for the colliding path; (2) likewise, any trace of the
original file name in the working tree is also removed. So, in all
three cases we're left with how to represent two colliding files in both
the index and the working copy.
[*] Technically, this isn't quite true because rename/rename(2to1)
conflicts in the recursive (o->call_depth > 0) case do an "unrename"
since about seven years ago. But even in that case, Junio felt
compelled to explain that my decision to "unrename" wasn't necessarily
the only or right answer -- search for "Comment from Junio" in t6036 for
details.
My initial motivation for looking at these three conflict types was that
if the handling of these three conflict types is the same, at least in
the limited set of cases where a renamed file is unmodified on the side
of history where the file is not renamed, then a significant performance
improvement for rename detection during merges is possible. However,
while that served as motivation to look at these three types of
conflicts, the actual goal of this new function is to try to improve the
handling for all three cases, not to merely make them the same as each
other in that special circumstance.
=== Handling the working tree ===
The previous behavior for these conflict types in regards to the
working tree (assuming the file collision occurs at 'foo') was:
* add/add does a two-way merge of the two files and records it as 'foo'.
* rename/rename(2to1) records the two different files into two new
uniquely named files (foo~HEAD and foo~$MERGE), while removing 'foo'
from the working tree.
* rename/add records the two different files into two different
locations, recording the add at foo~$SIDE and, oddly, recording
the rename at foo (why is the rename more important than the add?)
So, the question for what to write to the working tree boils down to
whether the two colliding files should be two-way merged and recorded in
place, or recorded into separate files. As per discussion on the git
mailing lit, two-way merging was deemed to always be preferred, as that
makes these cases all more like content conflicts that users can handle
from within their favorite editor, IDE, or merge tool. Note that since
renames already involve a content merge, rename/add and
rename/rename(2to1) conflicts could result in nested conflict markers.
=== Handling of the index ===
For a typical rename, unpack_trees() would set up the index in the
following fashion:
old_path new_path
stage1: 5ca1ab1e 00000000
stage2: f005ba11 00000000
stage3: 00000000 b0a710ad
And merge-recursive would rewrite this to
new_path
stage1: 5ca1ab1e
stage2: f005ba11
stage3: b0a710ad
Removing old_path from the index means the user won't have to `git rm
old_path` manually every time a renamed path has a content conflict.
It also means they can use `git checkout [--ours|--theirs|--conflict|-m]
new_path`, `git diff [--ours|--theirs]` and various other commands that
would be difficult otherwise.
This strategy becomes a problem when we have a rename/add or
rename/rename(2to1) conflict, however, because then we have only three
slots to store blob sha1s and we need either four or six. Previously,
this was handled by continuing to delete old_path from the index, and
just outright ignoring any blob shas from old_path. That had the
downside of deleting any trace of changes made to old_path on the other
side of history. This function instead does a three-way content merge of
the renamed file, and stores the blob sha1 for that at either stage2 or
stage3 for new_path (depending on which side the rename came from). That
has the advantage of bringing information about changes on both sides and
still allows for easy resolution (no need to git rm old_path, etc.), but
does have the downside that if the content merge had conflict markers,
then what we store in the index is the sha1 of a blob with conflict
markers. While that is a downside, it seems less problematic than the
downsides of any obvious alternatives, and certainly makes more sense
than the previous handling. Further, it has a precedent in that when we
do recursive merges, we may accept a file with conflict markers as the
resolution for the merge of the merge-bases, which will then show up in
the index of the outer merge at stage 1 if a conflict exists at the outer
level.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Later patches in this series will modify file collision conflict
handling (e.g. from rename/add and rename/rename(2to1) conflicts) so
that multiply nested conflict markers can arise even before considering
conflicts in the virtual merge base. Including the virtual merge base
will provide a way to get triply (or higher) nested conflict markers.
This new way to get nested conflict markers will force the need for a
more general mechanism to extend the length of conflict markers in order
to differentiate between different nestings.
Along with this change to conflict marker length handling, we want to
make sure that we don't regress handling for other types of conflicts
with nested conflict markers. Add a more involved testcase using
merge.conflictstyle=diff3, where not only does the virtual merge base
contain conflicts, but its virtual merge base does as well (i.e. a case
with triply nested conflict markers). While there are multiple
reasonable ways to handle nested conflict markers in the virtual merge
base for this type of situation, the easiest approach that dovetails
well with the new needs for the file collision conflict handling is to
require that the length of the conflict markers increase with each
subsequent nesting.
Subsequent patches which change the rename/add and rename/rename(2to1)
conflict handling will modify the extra_marker_size flag appropriately
for their new needs.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Further clean-up of merge-recursive machinery.
* en/merge-cleanup-more:
merge-recursive: avoid showing conflicts with merge branch before HEAD
merge-recursive: improve auto-merging messages with path collisions
Various codepaths in the core-ish part learn to work on an
arbitrary in-core index structure, not necessarily the default
instance "the_index".
* nd/the-index: (23 commits)
revision.c: reduce implicit dependency the_repository
revision.c: remove implicit dependency on the_index
ws.c: remove implicit dependency on the_index
tree-diff.c: remove implicit dependency on the_index
submodule.c: remove implicit dependency on the_index
line-range.c: remove implicit dependency on the_index
userdiff.c: remove implicit dependency on the_index
rerere.c: remove implicit dependency on the_index
sha1-file.c: remove implicit dependency on the_index
patch-ids.c: remove implicit dependency on the_index
merge.c: remove implicit dependency on the_index
merge-blobs.c: remove implicit dependency on the_index
ll-merge.c: remove implicit dependency on the_index
diff-lib.c: remove implicit dependency on the_index
read-cache.c: remove implicit dependency on the_index
diff.c: remove implicit dependency on the_index
grep.c: remove implicit dependency on the_index
diff.c: remove the_index dependency in textconv() functions
blame.c: rename "repo" argument to "r"
combine-diff.c: remove implicit dependency on the_index
...
We want to load unmerged entries from HEAD into the index at stage 2 and
from MERGE_HEAD into stage 3. Similarly, folks expect merge conflicts
to look like
<<<<<<<< HEAD
content from our side
========
content from their side
>>>>>>>> MERGE_HEAD
not
<<<<<<<< MERGE_HEAD
content from their side
========
content from our side
>>>>>>>> HEAD
The correct order usually comes naturally and for free, but with renames
we often have data in the form {rename_branch, other_branch}, and
working relative to the rename first (e.g. for rename/add) is more
convenient elsewhere in the code. Address the slight impedance
mismatch by having some functions re-call themselves with flipped
arguments when the branch order is reversed.
Note that setup_rename_conflict_info() has one asymmetry in it, in
setting dst_entry1->processed=0 but not doing similarly for
dst_entry2->processed. When dealing with rename/rename and similar
conflicts, we do not want the processing to happen twice, so the
desire to only set one of the entries to unprocessed is intentional.
So, while this change modifies which branch's entry will be marked as
unprocessed, that dovetails nicely with putting HEAD first so that we
get the index stage entries and conflict markers in the right order.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each individual file involved in a rename could have also been modified
on both sides of history, meaning it may need to have content merges.
If two such files are renamed into the same location, then on top of the
two natural auto-merging messages we also have to two-way merge the
result, giving us messages that look like
Auto-merging somefile.c (was somecase.c)
Auto-merging somefile.c (was somefolder.c)
Auto-merging somefile.c
However, despite the fact that I was the one who put the "(was %s)"
portions into the messages (and just a few months ago), I was still
initially confused when running into a rename/rename(2to1) case and
wondered if somefile.c had been merged three times. Update this to
instead be:
Auto-merging version of somefile.c from somecase.c
Auto-merging version of somefile.c from someportfolio.c
Auto-merging somefile.c
This is an admittedly long set of messages for a single path, but you
only get all three messages when dealing with the rare case of a
rename/rename(2to1) conflict where both sides of both original files
were also modified, in conflicting ways.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* en/merge-cleanup:
merge-recursive: rename merge_file_1() and merge_content()
merge-recursive: remove final remaining caller of merge_file_one()
merge-recursive: avoid wrapper function when unnecessary and wasteful
merge-recursive: set paths correctly when three-way merging content
A new variant repo_diff_setup() is added that takes 'struct repository *'
and diff_setup() becomes a thin macro around it that is protected by
NO_THE_REPOSITORY_COMPATIBILITY_MACROS, similar to NO_THE_INDEX_....
The plan is these macros will always be defined for all library files
and the macros are only accessible in builtin/
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Summary:
merge_file_1() -> merge_mode_and_contents()
merge_content() -> handle_content_merge()
merge_file_1() is a very unhelpful name. Rename it to
merge_mode_and_contents() to reflect what it does.
merge_content() calls merge_mode_and_contents() to do the main part of
its work, but most of this function was about higher level stuff, e.g.
printing out conflict messages, updating skip_worktree bits, checking
for ability to avoid updating the working tree or for D/F conflicts
being in the way, etc. Since there are several handle_*() functions for
similar levels of checking and handling in merge-recursive.c (e.g.
handle_change_delete(), handle_rename_rename_2to1()), let's rename this
function to handle_content_merge().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function names merge_file_one() and merge_file_1() aren't
particularly intuitive function names, especially since there is no
associated merge_file() function that these are related to. The
previous commit showed that merge_file_one() was prone to be called when
merge_file_1() should be, and since it is just a thin wrapper around
merge_file_1() anyway and only has one caller left, let's just remove
merge_file_one() entirely.
(It also turns out that the one remaining caller of merge_file_one()
has very broken code that needs to be completely rewritten, but that's
the subject of a future patch series; for now, we're just translating
it into a merge_file_1() call.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge_file_one() is a convenience function taking a bunch of oids and
modes, combining each pair into a diff_filespec, and then calling
merge_file_1(). When we already start with diff_filespec's, we can
just call merge_file_1() directly instead of splitting out the oids
and modes for the wrapper to recombine into what we already had.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge_3way() has code to mark different sides of the conflict with info
about where the content comes from. If the names of the files involved
match, it simply uses the branch name. If the names of the files do not
match, it uses branchname:filename. Unfortunately, merge_content()
previously always called it with one.path = a.path = b.path. Granted,
it didn't have other path information available to it for years, but
that was corrected by passing rename_conflict_info in commit
3c217c077a ("merge-recursive: Provide more info in conflict markers
with file renames", 2011-08-11). In that commit, instead of just fixing
the bug with the pathnames, it created fake branch names incorporating
both the branch name and file name.
This "fake branch" workaround was extended further when I pulled that
logic out into a special function in commit dac4741554
("merge-recursive: Create function for merging with branchname:file
markers", 2011-08-11), and a number of other sites outside of
merge_content() have been added which call into that. However, this
Rube-Goldberg-esque setup is not merely duplicate code and unnecessary
work, it also risked having other callsites invoke it in a way that
would result in markers of the form branchname:filename:filename (i.e.
with the filename repeated).
Fix this whole mess by:
- setting one.path, a.path, and b.path appropriately
- calling merge_file_1() directly
- deleting the merge_file_special_markers() workaround wrapper
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
spatch transformation to replace boolean uses of !hashcmp() to
newly introduced oideq() is added, and applied, to regain
performance lost due to support of multiple hash algorithms.
* jk/cocci:
show_dirstat: simplify same-content check
read-cache: use oideq() in ce_compare functions
convert hashmap comparison functions to oideq()
convert "hashcmp() != 0" to "!hasheq()"
convert "oidcmp() != 0" to "!oideq()"
convert "hashcmp() == 0" to hasheq()
convert "oidcmp() == 0" to oideq()
introduce hasheq() and oideq()
coccinelle: use <...> for function exclusion
The code for computing history reachability has been shuffled,
obtained a bunch of new tests to cover them, and then being
improved.
* ds/reachable:
commit-reach: correct accidental #include of C file
commit-reach: use can_all_from_reach
commit-reach: make can_all_from_reach... linear
commit-reach: replace ref_newer logic
test-reach: test commit_contains
test-reach: test can_all_from_reach_with_flags
test-reach: test reduce_heads
test-reach: test get_merge_bases_many
test-reach: test is_descendant_of
test-reach: test in_merge_bases
test-reach: create new test tool for ref_newer
commit-reach: move can_all_from_reach_with_flags
upload-pack: generalize commit date cutoff
upload-pack: refactor ok_to_give_up()
upload-pack: make reachable() more generic
commit-reach: move commit_contains from ref-filter
commit-reach: move ref_newer from remote.c
commit.h: remove method declarations
commit-reach: move walk methods from commit.c
Recent addition of "directory rename" heuristics to the
merge-recursive backend makes the command susceptible to false
positives and false negatives. In the context of "git am -3",
which does not know about surrounding unmodified paths and thus
cannot inform the merge machinery about the full trees involved,
this risk is particularly severe. As such, the heuristic is
disabled for "git am -3" to keep the machinery "more stupid but
predictable".
* en/directory-renames-nothanks:
am: avoid directory rename detection when calling recursive merge machinery
merge-recursive: add ability to turn off directory rename detection
t3401: add another directory rename testcase for rebase and am
Using the more restrictive oideq() should, in the long run,
give the compiler more opportunities to optimize these
callsites. For now, this conversion should be a complete
noop with respect to the generated code.
The result is also perhaps a little more readable, as it
avoids the "zero is equal" idiom. Since it's so prevalent in
C, I think seasoned programmers tend not to even notice it
anymore, but it can sometimes make for awkward double
negations (e.g., we can drop a few !!oidcmp() instances
here).
This patch was generated almost entirely by the included
coccinelle patch. This mechanical conversion should be
completely safe, because we check explicitly for cases where
oidcmp() is compared to 0, which is what oideq() is doing
under the hood. Note that we don't have to catch "!oidcmp()"
separately; coccinelle's standard isomorphisms make sure the
two are treated equivalently.
I say "almost" because I did hand-edit the coccinelle output
to fix up a few style violations (it mostly keeps the
original formatting, but sometimes unwraps long lines).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The more library-ish parts of the codebase learned to work on the
in-core index-state instance that is passed in by their callers,
instead of always working on the singleton "the_index" instance.
* nd/no-the-index: (24 commits)
blame.c: remove implicit dependency on the_index
apply.c: remove implicit dependency on the_index
apply.c: make init_apply_state() take a struct repository
apply.c: pass struct apply_state to more functions
resolve-undo.c: use the right index instead of the_index
archive-*.c: use the right repository
archive.c: avoid access to the_index
grep: use the right index instead of the_index
attr: remove index from git_attr_set_direction()
entry.c: use the right index instead of the_index
submodule.c: use the right index instead of the_index
pathspec.c: use the right index instead of the_index
unpack-trees: avoid the_index in verify_absent()
unpack-trees: convert clear_ce_flags* to avoid the_index
unpack-trees: don't shadow global var the_index
unpack-trees: add a note about path invalidation
unpack-trees: remove 'extern' on function declaration
ls-files: correct index argument to get_convert_attr_ascii()
preload-index.c: use the right index instead of the_index
dir.c: remove an implicit dependency on the_index in pathspec code
...
When the sparse checkout feature is in use, "git cherry-pick" and
other mergy operations lost the skip_worktree bit when a path that
is excluded from checkout requires content level merge, which is
resolved as the same as the HEAD version, without materializing the
merge result in the working tree, which made the path appear as
deleted. This has been corrected by preserving the skip_worktree
bit (and not materializing the file in the working tree).
* en/merge-recursive-skip-fix:
merge-recursive: preserve skip_worktree bit when necessary
t3507: add a testcase showing failure with sparse checkout
Make the convert API take an index_state instead of assuming the_index
in convert.c. All external call sites are converted blindly to keep
the patch simple and retain current behavior. Individual call sites
may receive further updates to use the right index instead of
the_index.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The recursive merge strategy did not properly ensure there was no
change between HEAD and the index before performing its operation,
which has been corrected.
* en/dirty-merge-fixes:
merge: fix misleading pre-merge check documentation
merge-recursive: enforce rule that index matches head before merging
t6044: add more testcases with staged changes before a merge is invoked
merge-recursive: fix assumption that head tree being merged is HEAD
merge-recursive: make sure when we say we abort that we actually abort
t6044: add a testcase for index matching head, when head doesn't match HEAD
t6044: verify that merges expected to abort actually abort
index_has_changes(): avoid assuming operating on the_index
read-cache.c: move index_has_changes() from merge.c
For a large tree, the index needs to hold many cache entries
allocated on heap. These cache entries are now allocated out of a
dedicated memory pool to amortize malloc(3) overhead.
* jm/cache-entry-from-mem-pool:
block alloc: add validations around cache_entry lifecyle
block alloc: allocate cache entries from mem_pool
mem-pool: fill out functionality
mem-pool: add life cycle management functions
mem-pool: only search head block for available space
block alloc: add lifecycle APIs for cache_entry structs
read-cache: teach make_cache_entry to take object_id
read-cache: teach refresh_cache_entry to take istate
merge-recursive takes any files marked as unmerged by unpack_trees,
tries to figure out whether they can be resolved (e.g. using renames
or a file-level merge), and then if they can be it will delete the old
cache entries and writes new ones. This means that any ce_flags for
those cache entries are essentially cleared when merging.
Unfortunately, if a file was marked as skip_worktree and it needs a
file-level merge but the merge results in the same version of the file
that was found in HEAD, we skip updating the worktree (because the
file was unchanged) but clear the skip_worktree bit (because of the
delete-cache-entry-and-write-new-one). This makes git treat the file
as having a local change in the working copy, namely a delete, when it
should appear as unchanged despite not being present. Avoid this
problem by copying the skip_worktree flag in this case.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These methods are now declared in commit-reach.h. Remove them from
commit.h and add new include statements in all files that require these
declarations.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* sb/object-store-grafts:
commit: allow lookup_commit_graft to handle arbitrary repositories
commit: allow prepare_commit_graft to handle arbitrary repositories
shallow: migrate shallow information into the object parser
path.c: migrate global git_path_* to take a repository argument
cache: convert get_graft_file to handle arbitrary repositories
commit: convert read_graft_file to handle arbitrary repositories
commit: convert register_commit_graft to handle arbitrary repositories
commit: convert commit_graft_pos() to handle arbitrary repositories
shallow: add repository argument to is_repository_shallow
shallow: add repository argument to check_shallow_file_for_update
shallow: add repository argument to register_shallow
shallow: add repository argument to set_alternate_shallow_file
commit: add repository argument to lookup_commit_graft
commit: add repository argument to prepare_commit_graft
commit: add repository argument to read_graft_file
commit: add repository argument to register_commit_graft
commit: add repository argument to commit_graft_pos
object: move grafts to object parser
object-store: move object access functions to object-store.h
builtin/merge.c says that when we are about to perform a merge:
...the index must be in sync with the head commit. The strategies are
responsible to ensure this.
merge-recursive has always relied on unpack_trees() to enforce this
requirement, except in the case of an "Already up to date!" merge.
unpack-trees.c does not actually enforce this requirement, though. It
allows for a pair of exceptions, in cases which it refers to as #14(ALT)
and #2ALT. Documentation/technical/trivial-merge.txt can be consulted for
the precise meanings of the various case numbers and their meanings for
unpack-trees.c, but we have a high-level description of the intent behind
these two exceptions in a combined and summarized form in
Documentation/git-merge.txt:
...[merge will] abort if there are any changes registered in the index
relative to the `HEAD` commit. (One exception is when the changed index
entries are in the state that would result from the merge already.)
While this high-level description does describe conditions under which it
would be safe to allow the index to diverge from HEAD, it does not match
what is actually implemented. In particular, unpack-trees.c has no
knowledge of renames, and these two exceptions were written assuming that
no renames take place. Once renames get into the mix, it is no longer
safe to allow the index to not match for #2ALT. We could modify
unpack-trees to only allow #14(ALT) as an exception, but that would be
more strict than required for the resolve strategy (since the resolve
strategy doesn't handle renames at all). Therefore, unpack_trees.c seems
like the wrong place to fix this.
Further, if someone fixes the combination of break and rename detection
and modifies merge-recursive to take advantage of the combination, then it
will also no longer be safe to allow the index to not match for #14(ALT)
when the recursive strategy is in use. Therefore, leaving one of the
exceptions in place with the recursive merge strategy feels like we are
just leaving a latent bug in the code for folks in the future to stumble
across.
It may be possible to fix both unpack-trees and merge-recursive in a way
that implements the exception as stated in Documentation/git-merge.txt,
but it would be somewhat complex, possibly also buggy at first, and
ultimately, not all that valuable. Instead, just enforce the requirement
stated in builtin/merge.c; error out if the index does not match the HEAD
commit, just like the 'ours' and 'octopus' strategies do.
Some testcase fixups were in order:
t7611: had many tests designed to show that `git merge --abort` could
not always restore the index and working tree to the state they
were in before the merge started. The tests that were associated
with having changes in the index before the merge started are no
longer applicable, so they have been removed.
t7504: had a few tests that had stray staged changes that were not
actually part of the test under consideration
t6044: We no longer expect stray staged changes to sometimes result
in the merge continuing. Also, fix a case where a merge
didn't abort but should have.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>