2006-02-26 03:19:46 +03:00
|
|
|
#ifndef REVISION_H
|
|
|
|
#define REVISION_H
|
|
|
|
|
2008-07-10 01:38:34 +04:00
|
|
|
#include "parse-options.h"
|
2008-08-25 10:15:05 +04:00
|
|
|
#include "grep.h"
|
2010-03-12 20:04:26 +03:00
|
|
|
#include "notes.h"
|
toposort: rename "lifo" field
The primary invariant of sort_in_topological_order() is that a
parent commit is not emitted until all children of it are. When
traversing a forked history like this with "git log C E":
A----B----C
\
D----E
we ensure that A is emitted after all of B, C, D, and E are done, B
has to wait until C is done, and D has to wait until E is done.
In some applications, however, we would further want to control how
these child commits B, C, D and E on two parallel ancestry chains
are shown.
Most of the time, we would want to see C and B emitted together, and
then E and D, and finally A (i.e. the --topo-order output). The
"lifo" parameter of the sort_in_topological_order() function is used
to control this behaviour. We start the traversal by knowing two
commits, C and E. While keeping in mind that we also need to
inspect E later, we pick C first to inspect, and we notice and
record that B needs to be inspected. By structuring the "work to be
done" set as a LIFO stack, we ensure that B is inspected next,
before other in-flight commits we had known that we will need to
inspect, e.g. E.
When showing in --date-order, we would want to see commits ordered
by timestamps, i.e. show C, E, B and D in this order before showing
A, possibly mixing commits from two parallel histories together.
When "lifo" parameter is set to false, the function keeps the "work
to be done" set sorted in the date order to realize this semantics.
After inspecting C, we add B to the "work to be done" set, but the
next commit we inspect from the set is E which is newer than B.
The name "lifo", however, is too strongly tied to the way how the
function implements its behaviour, and does not describe what the
behaviour _means_.
Replace this field with an enum rev_sort_order, with two possible
values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE, and
update the existing code. The mechanical replacement rule is:
"lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
"lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-07 03:07:14 +04:00
|
|
|
#include "commit.h"
|
2008-07-10 01:38:34 +04:00
|
|
|
|
2006-02-26 03:19:46 +03:00
|
|
|
#define SEEN (1u<<0)
|
|
|
|
#define UNINTERESTING (1u<<1)
|
2007-11-13 10:16:08 +03:00
|
|
|
#define TREESAME (1u<<2)
|
2006-03-01 02:07:20 +03:00
|
|
|
#define SHOWN (1u<<3)
|
2006-03-01 11:58:56 +03:00
|
|
|
#define TMP_MARK (1u<<4) /* for isolated cases; clean after use */
|
2006-03-28 11:58:34 +04:00
|
|
|
#define BOUNDARY (1u<<5)
|
2007-03-06 03:10:28 +03:00
|
|
|
#define CHILD_SHOWN (1u<<6)
|
2006-04-17 05:12:49 +04:00
|
|
|
#define ADDED (1u<<7) /* Parents already parsed and added? */
|
2006-10-23 04:32:47 +04:00
|
|
|
#define SYMMETRIC_LEFT (1u<<8)
|
2011-03-07 15:31:40 +03:00
|
|
|
#define PATCHSAME (1u<<9)
|
2013-05-16 19:32:38 +04:00
|
|
|
#define BOTTOM (1u<<10)
|
|
|
|
#define ALL_REV_FLAGS ((1u<<11)-1)
|
2006-02-26 03:19:46 +03:00
|
|
|
|
2009-08-15 18:23:12 +04:00
|
|
|
#define DECORATE_SHORT_REFS 1
|
|
|
|
#define DECORATE_FULL_REFS 2
|
|
|
|
|
2006-03-10 12:21:39 +03:00
|
|
|
struct rev_info;
|
Log message printout cleanups
On Sun, 16 Apr 2006, Junio C Hamano wrote:
>
> In the mid-term, I am hoping we can drop the generate_header()
> callchain _and_ the custom code that formats commit log in-core,
> found in cmd_log_wc().
Ok, this was nastier than expected, just because the dependencies between
the different log-printing stuff were absolutely _everywhere_, but here's
a patch that does exactly that.
The patch is not very easy to read, and the "--patch-with-stat" thing is
still broken (it does not call the "show_log()" thing properly for
merges). That's not a new bug. In the new world order it _should_ do
something like
if (rev->logopt)
show_log(rev, rev->logopt, "---\n");
but it doesn't. I haven't looked at the --with-stat logic, so I left it
alone.
That said, this patch removes more lines than it adds, and in particular,
the "cmd_log_wc()" loop is now a very clean:
while ((commit = get_revision(rev)) != NULL) {
log_tree_commit(rev, commit);
free(commit->buffer);
commit->buffer = NULL;
}
so it doesn't get much prettier than this. All the complexity is entirely
hidden in log-tree.c, and any code that needs to flush the log literally
just needs to do the "if (rev->logopt) show_log(...)" incantation.
I had to make the combined_diff() logic take a "struct rev_info" instead
of just a "struct diff_options", but that part is pretty clean.
This does change "git whatchanged" from using "diff-tree" as the commit
descriptor to "commit", and I changed one of the tests to reflect that new
reality. Otherwise everything still passes, and my other tests look fine
too.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-17 22:59:32 +04:00
|
|
|
struct log_info;
|
2010-03-12 20:04:26 +03:00
|
|
|
struct string_list;
|
log: use true parents for diff even when rewriting
When using pathspec filtering in combination with diff-based log
output, parent simplification happens before the diff is computed.
The diff is therefore against the *simplified* parents.
This works okay, arguably by accident, in the normal case:
simplification reduces to one parent as long as the commit is TREESAME
to it. So the simplified parent of any given commit must have the
same tree contents on the filtered paths as its true (unfiltered)
parent.
However, --full-diff breaks this guarantee, and indeed gives pretty
spectacular results when comparing the output of
git log --graph --stat ...
git log --graph --full-diff --stat ...
(--graph internally kicks in parent simplification, much like
--parents).
To fix it, store a copy of the parent list before simplification (in a
slab) whenever --full-diff is in effect. Then use the stored parents
instead of the simplified ones in the commit display code paths. The
latter do not actually check for --full-diff to avoid duplicated code;
they just grab the original parents if save_parents() has not been
called for this revision walk.
For ordinary commits it should be obvious that this is the right thing
to do.
Merge commits are a bit subtle. Observe that with default
simplification, merge simplification is an all-or-nothing decision:
either the merge is TREESAME to one parent and disappears, or it is
different from all parents and the parent list remains intact.
Redundant parents are not pruned, so the existing code also shows them
as a merge.
So if we do show a merge commit, the parent list just consists of the
rewrite result on each parent. Running, e.g., --cc on this in
--full-diff mode is not very useful: if any commits were skipped, some
hunks will disagree with all sides of the merge (with one side,
because commits were skipped; with the others, because they didn't
have those changes in the first place). This triggers --cc showing
these hunks spuriously.
Therefore I believe that even for merge commits it is better to show
the diffs wrt. the original parents.
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-01 00:13:20 +04:00
|
|
|
struct saved_parents;
|
2006-03-10 12:21:39 +03:00
|
|
|
|
2011-08-26 04:35:39 +04:00
|
|
|
struct rev_cmdline_info {
|
|
|
|
unsigned int nr;
|
|
|
|
unsigned int alloc;
|
|
|
|
struct rev_cmdline_entry {
|
|
|
|
struct object *item;
|
|
|
|
const char *name;
|
|
|
|
enum {
|
|
|
|
REV_CMD_REF,
|
|
|
|
REV_CMD_PARENTS_ONLY,
|
|
|
|
REV_CMD_LEFT,
|
|
|
|
REV_CMD_RIGHT,
|
2013-05-13 19:00:47 +04:00
|
|
|
REV_CMD_MERGE_BASE,
|
2011-08-26 04:35:39 +04:00
|
|
|
REV_CMD_REV
|
|
|
|
} whence;
|
|
|
|
unsigned flags;
|
|
|
|
} *rev;
|
|
|
|
};
|
|
|
|
|
teach log --no-walk=unsorted, which avoids sorting
When 'git log' is passed the --no-walk option, no revision walk takes
place, naturally. Perhaps somewhat surprisingly, however, the provided
revisions still get sorted by commit date. So e.g 'git log --no-walk
HEAD HEAD~1' and 'git log --no-walk HEAD~1 HEAD' give the same result
(unless the two revisions share the commit date, in which case they
will retain the order given on the command line). As the commit that
introduced --no-walk (8e64006 (Teach revision machinery about
--no-walk, 2007-07-24)) points out, the sorting is intentional, to
allow things like
git log --abbrev-commit --pretty=oneline --decorate --all --no-walk
to show all refs in order by commit date.
But there are also other cases where the sorting is not wanted, such
as
<command producing revisions in order> |
git log --oneline --no-walk --stdin
To accomodate both cases, leave the decision of whether or not to sort
up to the caller, by allowing --no-walk={sorted,unsorted}, defaulting
to 'sorted' for backward-compatibility reasons.
Signed-off-by: Martin von Zweigbergk <martinvonz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-29 10:15:54 +04:00
|
|
|
#define REVISION_WALK_WALK 0
|
|
|
|
#define REVISION_WALK_NO_WALK_SORTED 1
|
|
|
|
#define REVISION_WALK_NO_WALK_UNSORTED 2
|
|
|
|
|
2006-02-26 03:19:46 +03:00
|
|
|
struct rev_info {
|
|
|
|
/* Starting list */
|
|
|
|
struct commit_list *commits;
|
Add "named object array" concept
We've had this notion of a "object_list" for a long time, which eventually
grew a "name" member because some users (notably git-rev-list) wanted to
name each object as it is generated.
That object_list is great for some things, but it isn't all that wonderful
for others, and the "name" member is generally not used by everybody.
This patch splits the users of the object_list array up into two: the
traditional list users, who want the list-like format, and who don't
actually use or want the name. And another class of users that really used
the list as an extensible array, and generally wanted to name the objects.
The patch is fairly straightforward, but it's also biggish. Most of it
really just cleans things up: switching the revision parsing and listing
over to the array makes things like the builtin-diff usage much simpler
(we now see exactly how many members the array has, and we don't get the
objects reversed from the order they were on the command line).
One of the main reasons for doing this at all is that the malloc overhead
of the simple object list was actually pretty high, and the array is just
a lot denser. So this patch brings down memory usage by git-rev-list by
just under 3% (on top of all the other memory use optimizations) on the
mozilla archive.
It does add more lines than it removes, and more importantly, it adds a
whole new infrastructure for maintaining lists of objects, but on the
other hand, the new dynamic array code is pretty obvious. The change to
builtin-diff-tree.c shows a fairly good example of why an array interface
is sometimes more natural, and just much simpler for everybody.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-20 04:42:35 +04:00
|
|
|
struct object_array pending;
|
2006-02-26 03:19:46 +03:00
|
|
|
|
revision walker: Fix --boundary when limited
This cleans up the boundary processing in the commit walker. It
- rips out the boundary logic from the commit walker. Placing
"negative" commits in the revs->commits list was Ok if all we
cared about "boundary" was the UNINTERESTING limiting case,
but conceptually it was wrong.
- makes get_revision_1() function to walk the commits and return
the results as if there is no funny postprocessing flags such
as --reverse, --skip nor --max-count.
- makes get_revision() function the postprocessing phase:
If reverse is given, wait for get_revision_1() to give
everything that it would normally give, and then reverse it
before consuming.
If skip is given, skip that many before going further.
If max is given, stop when we gave out that many.
Now that we are about to return one positive commit, mark
the parents of that commit to be potential boundaries
before returning, iff we are doing the boundary processing.
Return the commit.
- After get_revision() finishes giving out all the positive
commits, if we are doing the boundary processing, we look at
the parents that we marked as potential boundaries earlier,
see if they are really boundaries, and give them out.
It loses more code than it adds, even when the new gc_boundary()
function, which is purely for early optimization, is counted.
Note that this patch is purely for eyeballing and discussion
only. It breaks git-bundle's verify logic because the logic
does not use BOUNDARY_SHOW flag for its internal computation
anymore. After we correct it not to attempt to affect the
boundary processing by setting the BOUNDARY_SHOW flag, we can
remove BOUNDARY_SHOW from revision.h and use that bit assignment
for the new CHILD_SHOWN flag.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-06 00:10:06 +03:00
|
|
|
/* Parents of shown commits */
|
|
|
|
struct object_array boundary_commits;
|
|
|
|
|
2011-08-26 04:35:39 +04:00
|
|
|
/* The end-points specified by the end user */
|
|
|
|
struct rev_cmdline_info cmdline;
|
|
|
|
|
2006-02-26 03:19:46 +03:00
|
|
|
/* Basic information */
|
|
|
|
const char *prefix;
|
2008-07-08 17:19:33 +04:00
|
|
|
const char *def;
|
2010-12-17 15:43:06 +03:00
|
|
|
struct pathspec prune_data;
|
toposort: rename "lifo" field
The primary invariant of sort_in_topological_order() is that a
parent commit is not emitted until all children of it are. When
traversing a forked history like this with "git log C E":
A----B----C
\
D----E
we ensure that A is emitted after all of B, C, D, and E are done, B
has to wait until C is done, and D has to wait until E is done.
In some applications, however, we would further want to control how
these child commits B, C, D and E on two parallel ancestry chains
are shown.
Most of the time, we would want to see C and B emitted together, and
then E and D, and finally A (i.e. the --topo-order output). The
"lifo" parameter of the sort_in_topological_order() function is used
to control this behaviour. We start the traversal by knowing two
commits, C and E. While keeping in mind that we also need to
inspect E later, we pick C first to inspect, and we notice and
record that B needs to be inspected. By structuring the "work to be
done" set as a LIFO stack, we ensure that B is inspected next,
before other in-flight commits we had known that we will need to
inspect, e.g. E.
When showing in --date-order, we would want to see commits ordered
by timestamps, i.e. show C, E, B and D in this order before showing
A, possibly mixing commits from two parallel histories together.
When "lifo" parameter is set to false, the function keeps the "work
to be done" set sorted in the date order to realize this semantics.
After inspecting C, we add B to the "work to be done" set, but the
next commit we inspect from the set is E which is newer than B.
The name "lifo", however, is too strongly tied to the way how the
function implements its behaviour, and does not describe what the
behaviour _means_.
Replace this field with an enum rev_sort_order, with two possible
values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE, and
update the existing code. The mechanical replacement rule is:
"lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
"lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-07 03:07:14 +04:00
|
|
|
|
|
|
|
/* topo-sort */
|
|
|
|
enum rev_sort_order sort_order;
|
|
|
|
|
2011-05-19 05:08:09 +04:00
|
|
|
unsigned int early_output:1,
|
|
|
|
ignore_missing:1;
|
2007-11-03 21:11:10 +03:00
|
|
|
|
2006-02-26 03:19:46 +03:00
|
|
|
/* Traversal flags */
|
|
|
|
unsigned int dense:1,
|
2007-11-06 00:22:34 +03:00
|
|
|
prune:1,
|
teach log --no-walk=unsorted, which avoids sorting
When 'git log' is passed the --no-walk option, no revision walk takes
place, naturally. Perhaps somewhat surprisingly, however, the provided
revisions still get sorted by commit date. So e.g 'git log --no-walk
HEAD HEAD~1' and 'git log --no-walk HEAD~1 HEAD' give the same result
(unless the two revisions share the commit date, in which case they
will retain the order given on the command line). As the commit that
introduced --no-walk (8e64006 (Teach revision machinery about
--no-walk, 2007-07-24)) points out, the sorting is intentional, to
allow things like
git log --abbrev-commit --pretty=oneline --decorate --all --no-walk
to show all refs in order by commit date.
But there are also other cases where the sorting is not wanted, such
as
<command producing revisions in order> |
git log --oneline --no-walk --stdin
To accomodate both cases, leave the decision of whether or not to sort
up to the caller, by allowing --no-walk={sorted,unsorted}, defaulting
to 'sorted' for backward-compatibility reasons.
Signed-off-by: Martin von Zweigbergk <martinvonz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-29 10:15:54 +04:00
|
|
|
no_walk:2,
|
Add "--show-all" revision walker flag for debugging
It's really not very easy to visualize the commit walker, because - on
purpose - it obvously doesn't show the uninteresting commits!
This adds a "--show-all" flag to the revision walker, which will make
it show uninteresting commits too, and they'll have a '^' in front of
them (it also fixes a logic error for !verbose_header for boundary
commits - we should show the '-' even if left_right isn't shown).
A separate patch to gitk to teach it the new '^' was sent
to paulus. With the change in place, it actually is interesting
even for the cases that git doesn't have any problems with, ie
for the kernel you can do:
gitk -d --show-all v2.6.24..
and you see just how far down it has to parse things to see it all. The
use of "-d" is a good idea, since the date-ordered toposort is much better
at showing why it goes deep down (ie the date of some of those commits
after 2.6.24 is much older, because they were merged from trees that
weren't rebased).
So I think this is a useful feature even for non-debugging - just to
visualize what git does internally more.
When it actually breaks out due to the "everybody_uninteresting()"
case, it adds the uninteresting commits (both the one it's looking at
now, and the list of pending ones) to the list
This way, we really list *all* the commits we've looked at.
Because we now end up listing commits we may not even have been parsed
at all "show_log" and "show_commit" need to protect against commits
that don't have a commit buffer entry.
That second part is debatable just how it should work. Maybe we shouldn't
show such entries at all (with this patch those entries do get shown, they
just don't get any message shown with them). But I think this is a useful
case.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-10 01:02:07 +03:00
|
|
|
show_all:1,
|
2006-02-26 03:19:46 +03:00
|
|
|
remove_empty_trees:1,
|
2006-06-11 21:57:35 +04:00
|
|
|
simplify_history:1,
|
2006-02-26 03:19:46 +03:00
|
|
|
topo_order:1,
|
revision traversal: show full history with merge simplification
The --full-history traversal keeps all merges in addition to non-merge
commits that touch paths in the given pathspec. This is useful to view
both sides of a merge in a topology like this:
A---M---o
/ /
---O---B
even when A and B makes identical change to the given paths. The revision
traversal without --full-history aims to come up with the simplest history
to explain the final state of the tree, and one of the side branches can
be pruned away.
The behaviour to keep all merges however is inconvenient if neither A nor
B touches the paths we are interested in. --full-history reduces the
topology to:
---O---M---o
in such a case, without removing M.
This adds a post processing phase on top of --full-history traversal to
remove needless merges from the resulting history.
The idea is to compute, for each commit in the "full history" result set,
the commit that should replace it in the simplified history. The commit
to replace it in the final history is determined as follows:
* In any case, we first figure out the replacement commits of parents of
the commit we are looking at. The commit we are looking at is
rewritten as if the replacement commits of its original parents are its
parents. While doing so, we reduce the redundant parents from the
rewritten parent list by not just removing the identical ones, but also
removing a parent that is an ancestor of another parent.
* After the above parent simplification, if the commit is a root commit,
an UNINTERESTING commit, a merge commit, or modifies the paths we are
interested in, then the replacement commit of the commit is itself. In
other words, such a commit is not dropped from the final result.
The first point above essentially means that the history is rewritten in
the bottom up direction. We can rewrite the parent list of a commit only
after we know how all of its parents are rewritten. This means that the
processing needs to happen on the full history (i.e. after limit_list()).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-31 12:17:41 +04:00
|
|
|
simplify_merges:1,
|
2008-11-03 22:25:46 +03:00
|
|
|
simplify_by_decoration:1,
|
2006-02-26 03:19:46 +03:00
|
|
|
tag_objects:1,
|
|
|
|
tree_objects:1,
|
|
|
|
blob_objects:1,
|
2011-09-02 02:43:34 +04:00
|
|
|
verify_objects:1,
|
2006-02-27 19:54:36 +03:00
|
|
|
edge_hint:1,
|
|
|
|
limited:1,
|
2009-02-28 11:00:21 +03:00
|
|
|
unpacked:1,
|
revision walker: Fix --boundary when limited
This cleans up the boundary processing in the commit walker. It
- rips out the boundary logic from the commit walker. Placing
"negative" commits in the revs->commits list was Ok if all we
cared about "boundary" was the UNINTERESTING limiting case,
but conceptually it was wrong.
- makes get_revision_1() function to walk the commits and return
the results as if there is no funny postprocessing flags such
as --reverse, --skip nor --max-count.
- makes get_revision() function the postprocessing phase:
If reverse is given, wait for get_revision_1() to give
everything that it would normally give, and then reverse it
before consuming.
If skip is given, skip that many before going further.
If max is given, stop when we gave out that many.
Now that we are about to return one positive commit, mark
the parents of that commit to be potential boundaries
before returning, iff we are doing the boundary processing.
Return the commit.
- After get_revision() finishes giving out all the positive
commits, if we are doing the boundary processing, we look at
the parents that we marked as potential boundaries earlier,
see if they are really boundaries, and give them out.
It loses more code than it adds, even when the new gc_boundary()
function, which is purely for early optimization, is counted.
Note that this patch is purely for eyeballing and discussion
only. It breaks git-bundle's verify logic because the logic
does not use BOUNDARY_SHOW flag for its internal computation
anymore. After we correct it not to attempt to affect the
boundary processing by setting the BOUNDARY_SHOW flag, we can
remove BOUNDARY_SHOW from revision.h and use that bit assignment
for the new CHILD_SHOWN flag.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-06 00:10:06 +03:00
|
|
|
boundary:2,
|
2010-06-10 15:47:23 +04:00
|
|
|
count:1,
|
2006-12-17 02:31:25 +03:00
|
|
|
left_right:1,
|
2011-02-21 19:09:11 +03:00
|
|
|
left_only:1,
|
|
|
|
right_only:1,
|
2008-05-04 14:36:52 +04:00
|
|
|
rewrite_parents:1,
|
|
|
|
print_parents:1,
|
2008-10-27 22:51:59 +03:00
|
|
|
show_source:1,
|
2008-11-03 22:23:57 +03:00
|
|
|
show_decorations:1,
|
2007-03-13 11:57:22 +03:00
|
|
|
reverse:1,
|
2008-08-29 23:18:38 +04:00
|
|
|
reverse_output_stage:1,
|
2007-04-09 14:40:38 +04:00
|
|
|
cherry_pick:1,
|
2011-03-07 15:31:40 +03:00
|
|
|
cherry_mark:1,
|
2009-10-27 21:28:07 +03:00
|
|
|
bisect:1,
|
revision: --ancestry-path
"rev-list A..H" computes the set of commits that are ancestors of H, but
excludes the ones that are ancestors of A. This is useful to see what
happened to the history leading to H since A, in the sense that "what does
H have that did not exist in A" (e.g. when you have a choice to update to
H from A).
x---x---A---B---C <-- topic
/ \
x---x---x---o---o---o---o---M---D---E---F---G <-- dev
/ \
x---o---o---o---o---o---o---o---o---o---o---o---N---H <-- master
The result in the above example would be the commits marked with caps
letters (except for A itself, of course), and the ones marked with 'o'.
When you want to find out what commits in H are contaminated with the bug
introduced by A and need fixing, however, you might want to view only the
subset of "A..B" that are actually descendants of A, i.e. excluding the
ones marked with 'o'. Introduce a new option --ancestry-path to compute
this set with "rev-list --ancestry-path A..B".
Note that in practice, you would build a fix immediately on top of A and
"git branch --contains A" will give the names of branches that you would
need to merge the fix into (i.e. topic, dev and master), so this may not
be worth paying the extra cost of postprocessing.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-21 00:48:39 +04:00
|
|
|
ancestry_path:1,
|
Implement line-history search (git log -L)
This is a rewrite of much of Bo's work, mainly in an effort to split
it into smaller, easier to understand routines.
The algorithm is built around the struct range_set, which encodes a
series of line ranges as intervals [a,b). This is used in two
contexts:
* A set of lines we are tracking (which will change as we dig through
history).
* To encode diffs, as pairs of ranges.
The main routine is range_set_map_across_diff(). It processes the
diff between a commit C and some parent P. It determines which diff
hunks are relevant to the ranges tracked in C, and computes the new
ranges for P.
The algorithm is then simply to process history in topological order
from newest to oldest, computing ranges and (partial) diffs. At
branch points, we need to merge the ranges we are watching. We will
find that many commits do not affect the chosen ranges, and mark them
TREESAME (in addition to those already filtered by pathspec limiting).
Another pass of history simplification then gets rid of such commits.
This is wired as an extra filtering pass in the log machinery. This
currently only reduces code duplication, but should allow for other
simplifications and options to be used.
Finally, we hook a diff printer into the output chain. Ideally we
would wire directly into the diff logic, to optionally use features
like word diff. However, that will require some major reworking of
the diff chain, so we completely replace the output with our own diff
for now.
As this was a GSoC project, and has quite some history by now, many
people have helped. In no particular order, thanks go to
Jakub Narebski <jnareb@gmail.com>
Jens Lehmann <Jens.Lehmann@web.de>
Jonathan Nieder <jrnieder@gmail.com>
Junio C Hamano <gitster@pobox.com>
Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Will Palmer <wmpalmer@gmail.com>
Apologies to everyone I forgot.
Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-28 20:47:32 +04:00
|
|
|
first_parent_only:1,
|
|
|
|
line_level_traverse:1;
|
2006-02-26 03:19:46 +03:00
|
|
|
|
Common option parsing for "git log --diff" and friends
This basically does a few things that are sadly somewhat interdependent,
and nontrivial to split out
- get rid of "struct log_tree_opt"
The fields in "log_tree_opt" are moved into "struct rev_info", and all
users of log_tree_opt are changed to use the rev_info struct instead.
- add the parsing for the log_tree_opt arguments to "setup_revision()"
- make setup_revision set a flag (revs->diff) if the diff-related
arguments were used. This allows "git log" to decide whether it wants
to show diffs or not.
- make setup_revision() also initialize the diffopt part of rev_info
(which we had from before, but we just didn't initialize it)
- make setup_revision() do all the "finishing touches" on it all (it will
do the proper flag combination logic, and call "diff_setup_done()")
Now, that was the easy and straightforward part.
The slightly more involved part is that some of the programs that want to
use the new-and-improved rev_info parsing don't actually want _commits_,
they may want tree'ish arguments instead. That meant that I had to change
setup_revision() to parse the arguments not into the "revs->commits" list,
but into the "revs->pending_objects" list.
Then, when we do "prepare_revision_walk()", we walk that list, and create
the sorted commit list from there.
This actually cleaned some stuff up, but it's the less obvious part of the
patch, and re-organized the "revision.c" logic somewhat. It actually paves
the way for splitting argument parsing _entirely_ out of "revision.c",
since now the argument parsing really is totally independent of the commit
walking: that didn't use to be true, since there was lots of overlap with
get_commit_reference() handling etc, now the _only_ overlap is the shared
(and trivial) "add_pending_object()" thing.
However, I didn't do that file split, just because I wanted the diff
itself to be smaller, and show the actual changes more clearly. If this
gets accepted, I'll do further cleanups then - that includes the file
split, but also using the new infrastructure to do a nicer "git diff" etc.
Even in this form, it actually ends up removing more lines than it adds.
It's nice to note how simple and straightforward this makes the built-in
"git log" command, even though it continues to support all the diff flags
too. It doesn't get much simpler that this.
I think this is worth merging soonish, because it does allow for future
cleanup and even more sharing of code. However, it obviously touches
"revision.c", which is subtle. I've tested that it passes all the tests we
have, and it passes my "looks sane" detector, but somebody else should
also give it a good look-over.
[jc: squashed the original and three "oops this too" updates, with
another fix-up.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-15 03:52:13 +04:00
|
|
|
/* Diff flags */
|
|
|
|
unsigned int diff:1,
|
|
|
|
full_diff:1,
|
|
|
|
show_root_diff:1,
|
|
|
|
no_commit_id:1,
|
|
|
|
verbose_header:1,
|
|
|
|
ignore_merges:1,
|
|
|
|
combine_merges:1,
|
|
|
|
dense_combined_merges:1,
|
|
|
|
always_show_header:1;
|
|
|
|
|
|
|
|
/* Format info */
|
Log message printout cleanups
On Sun, 16 Apr 2006, Junio C Hamano wrote:
>
> In the mid-term, I am hoping we can drop the generate_header()
> callchain _and_ the custom code that formats commit log in-core,
> found in cmd_log_wc().
Ok, this was nastier than expected, just because the dependencies between
the different log-printing stuff were absolutely _everywhere_, but here's
a patch that does exactly that.
The patch is not very easy to read, and the "--patch-with-stat" thing is
still broken (it does not call the "show_log()" thing properly for
merges). That's not a new bug. In the new world order it _should_ do
something like
if (rev->logopt)
show_log(rev, rev->logopt, "---\n");
but it doesn't. I haven't looked at the --with-stat logic, so I left it
alone.
That said, this patch removes more lines than it adds, and in particular,
the "cmd_log_wc()" loop is now a very clean:
while ((commit = get_revision(rev)) != NULL) {
log_tree_commit(rev, commit);
free(commit->buffer);
commit->buffer = NULL;
}
so it doesn't get much prettier than this. All the complexity is entirely
hidden in log-tree.c, and any code that needs to flush the log literally
just needs to do the "if (rev->logopt) show_log(...)" incantation.
I had to make the combined_diff() logic take a "struct rev_info" instead
of just a "struct diff_options", but that part is pretty clean.
This does change "git whatchanged" from using "diff-tree" as the commit
descriptor to "commit", and I changed one of the tests to reflect that new
reality. Otherwise everything still passes, and my other tests look fine
too.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-17 22:59:32 +04:00
|
|
|
unsigned int shown_one:1,
|
2012-10-18 08:27:22 +04:00
|
|
|
shown_dashes:1,
|
2008-07-08 17:19:33 +04:00
|
|
|
show_merge:1,
|
2010-01-21 00:59:36 +03:00
|
|
|
show_notes:1,
|
|
|
|
show_notes_given:1,
|
2011-10-19 02:53:23 +04:00
|
|
|
show_signature:1,
|
2010-01-21 00:59:36 +03:00
|
|
|
pretty_given:1,
|
2008-04-08 04:11:34 +04:00
|
|
|
abbrev_commit:1,
|
2011-05-18 21:56:04 +04:00
|
|
|
abbrev_commit_given:1,
|
2008-05-04 14:36:54 +04:00
|
|
|
use_terminator:1,
|
2009-09-24 12:28:15 +04:00
|
|
|
missing_newline:1,
|
2011-05-27 02:28:17 +04:00
|
|
|
date_mode_explicit:1,
|
|
|
|
preserve_subject:1;
|
2009-11-03 17:59:18 +03:00
|
|
|
unsigned int disable_stdin:1;
|
2011-10-01 19:56:08 +04:00
|
|
|
unsigned int leak_pending:1;
|
2009-11-03 17:59:18 +03:00
|
|
|
|
2007-04-25 10:36:22 +04:00
|
|
|
enum date_mode date_mode;
|
2006-09-06 13:12:09 +04:00
|
|
|
|
Common option parsing for "git log --diff" and friends
This basically does a few things that are sadly somewhat interdependent,
and nontrivial to split out
- get rid of "struct log_tree_opt"
The fields in "log_tree_opt" are moved into "struct rev_info", and all
users of log_tree_opt are changed to use the rev_info struct instead.
- add the parsing for the log_tree_opt arguments to "setup_revision()"
- make setup_revision set a flag (revs->diff) if the diff-related
arguments were used. This allows "git log" to decide whether it wants
to show diffs or not.
- make setup_revision() also initialize the diffopt part of rev_info
(which we had from before, but we just didn't initialize it)
- make setup_revision() do all the "finishing touches" on it all (it will
do the proper flag combination logic, and call "diff_setup_done()")
Now, that was the easy and straightforward part.
The slightly more involved part is that some of the programs that want to
use the new-and-improved rev_info parsing don't actually want _commits_,
they may want tree'ish arguments instead. That meant that I had to change
setup_revision() to parse the arguments not into the "revs->commits" list,
but into the "revs->pending_objects" list.
Then, when we do "prepare_revision_walk()", we walk that list, and create
the sorted commit list from there.
This actually cleaned some stuff up, but it's the less obvious part of the
patch, and re-organized the "revision.c" logic somewhat. It actually paves
the way for splitting argument parsing _entirely_ out of "revision.c",
since now the argument parsing really is totally independent of the commit
walking: that didn't use to be true, since there was lots of overlap with
get_commit_reference() handling etc, now the _only_ overlap is the shared
(and trivial) "add_pending_object()" thing.
However, I didn't do that file split, just because I wanted the diff
itself to be smaller, and show the actual changes more clearly. If this
gets accepted, I'll do further cleanups then - that includes the file
split, but also using the new infrastructure to do a nicer "git diff" etc.
Even in this form, it actually ends up removing more lines than it adds.
It's nice to note how simple and straightforward this makes the built-in
"git log" command, even though it continues to support all the diff flags
too. It doesn't get much simpler that this.
I think this is worth merging soonish, because it does allow for future
cleanup and even more sharing of code. However, it obviously touches
"revision.c", which is subtle. I've tested that it passes all the tests we
have, and it passes my "looks sane" detector, but somebody else should
also give it a good look-over.
[jc: squashed the original and three "oops this too" updates, with
another fix-up.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-15 03:52:13 +04:00
|
|
|
unsigned int abbrev;
|
|
|
|
enum cmit_fmt commit_format;
|
Log message printout cleanups
On Sun, 16 Apr 2006, Junio C Hamano wrote:
>
> In the mid-term, I am hoping we can drop the generate_header()
> callchain _and_ the custom code that formats commit log in-core,
> found in cmd_log_wc().
Ok, this was nastier than expected, just because the dependencies between
the different log-printing stuff were absolutely _everywhere_, but here's
a patch that does exactly that.
The patch is not very easy to read, and the "--patch-with-stat" thing is
still broken (it does not call the "show_log()" thing properly for
merges). That's not a new bug. In the new world order it _should_ do
something like
if (rev->logopt)
show_log(rev, rev->logopt, "---\n");
but it doesn't. I haven't looked at the --with-stat logic, so I left it
alone.
That said, this patch removes more lines than it adds, and in particular,
the "cmd_log_wc()" loop is now a very clean:
while ((commit = get_revision(rev)) != NULL) {
log_tree_commit(rev, commit);
free(commit->buffer);
commit->buffer = NULL;
}
so it doesn't get much prettier than this. All the complexity is entirely
hidden in log-tree.c, and any code that needs to flush the log literally
just needs to do the "if (rev->logopt) show_log(...)" incantation.
I had to make the combined_diff() logic take a "struct rev_info" instead
of just a "struct diff_options", but that part is pretty clean.
This does change "git whatchanged" from using "diff-tree" as the commit
descriptor to "commit", and I changed one of the tests to reflect that new
reality. Otherwise everything still passes, and my other tests look fine
too.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-17 22:59:32 +04:00
|
|
|
struct log_info *loginfo;
|
2006-05-05 06:30:52 +04:00
|
|
|
int nr, total;
|
2006-05-20 17:40:29 +04:00
|
|
|
const char *mime_boundary;
|
2009-03-23 05:14:05 +03:00
|
|
|
const char *patch_suffix;
|
|
|
|
int numbered_files;
|
2012-12-22 12:21:23 +04:00
|
|
|
int reroll_count;
|
2008-02-19 06:56:06 +03:00
|
|
|
char *message_id;
|
teach format-patch to place other authors into in-body "From"
Format-patch generates emails with the "From" address set to the
author of each patch. If you are going to send the emails, however,
you would want to replace the author identity with yours (if they
are not the same), and bump the author identity to an in-body
header.
Normally this is handled by git-send-email, which does the
transformation before sending out the emails. However, some
workflows may not use send-email (e.g., imap-send, or a custom
script which feeds the mbox to a non-git MUA). They could each
implement this feature themselves, but getting it right is
non-trivial (one must canonicalize the identities by reversing any
RFC2047 encoding or RFC822 quoting of the headers, which has caused
many bugs in send-email over the years).
This patch takes a different approach: it teaches format-patch a
"--from" option which handles the ident check and in-body header
while it is writing out the email. It's much simpler to do at this
level (because we haven't done any quoting yet), and any workflow
based on format-patch can easily turn it on.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-03 11:08:22 +04:00
|
|
|
struct ident_split from_ident;
|
2009-02-20 00:26:31 +03:00
|
|
|
struct string_list *ref_message_ids;
|
2013-02-12 14:17:38 +04:00
|
|
|
int add_signoff;
|
2006-06-02 17:21:17 +04:00
|
|
|
const char *extra_headers;
|
2006-12-25 22:48:35 +03:00
|
|
|
const char *log_reencode;
|
2007-04-12 03:58:07 +04:00
|
|
|
const char *subject_prefix;
|
2007-03-04 02:12:06 +03:00
|
|
|
int no_inline;
|
2007-07-20 22:15:13 +04:00
|
|
|
int show_log_size;
|
2013-01-06 01:26:41 +04:00
|
|
|
struct string_list *mailmap;
|
Common option parsing for "git log --diff" and friends
This basically does a few things that are sadly somewhat interdependent,
and nontrivial to split out
- get rid of "struct log_tree_opt"
The fields in "log_tree_opt" are moved into "struct rev_info", and all
users of log_tree_opt are changed to use the rev_info struct instead.
- add the parsing for the log_tree_opt arguments to "setup_revision()"
- make setup_revision set a flag (revs->diff) if the diff-related
arguments were used. This allows "git log" to decide whether it wants
to show diffs or not.
- make setup_revision() also initialize the diffopt part of rev_info
(which we had from before, but we just didn't initialize it)
- make setup_revision() do all the "finishing touches" on it all (it will
do the proper flag combination logic, and call "diff_setup_done()")
Now, that was the easy and straightforward part.
The slightly more involved part is that some of the programs that want to
use the new-and-improved rev_info parsing don't actually want _commits_,
they may want tree'ish arguments instead. That meant that I had to change
setup_revision() to parse the arguments not into the "revs->commits" list,
but into the "revs->pending_objects" list.
Then, when we do "prepare_revision_walk()", we walk that list, and create
the sorted commit list from there.
This actually cleaned some stuff up, but it's the less obvious part of the
patch, and re-organized the "revision.c" logic somewhat. It actually paves
the way for splitting argument parsing _entirely_ out of "revision.c",
since now the argument parsing really is totally independent of the commit
walking: that didn't use to be true, since there was lots of overlap with
get_commit_reference() handling etc, now the _only_ overlap is the shared
(and trivial) "add_pending_object()" thing.
However, I didn't do that file split, just because I wanted the diff
itself to be smaller, and show the actual changes more clearly. If this
gets accepted, I'll do further cleanups then - that includes the file
split, but also using the new infrastructure to do a nicer "git diff" etc.
Even in this form, it actually ends up removing more lines than it adds.
It's nice to note how simple and straightforward this makes the built-in
"git log" command, even though it continues to support all the diff flags
too. It doesn't get much simpler that this.
I think this is worth merging soonish, because it does allow for future
cleanup and even more sharing of code. However, it obviously touches
"revision.c", which is subtle. I've tested that it passes all the tests we
have, and it passes my "looks sane" detector, but somebody else should
also give it a good look-over.
[jc: squashed the original and three "oops this too" updates, with
another fix-up.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-15 03:52:13 +04:00
|
|
|
|
2006-09-18 02:43:40 +04:00
|
|
|
/* Filter by commit log message */
|
2008-08-25 10:15:05 +04:00
|
|
|
struct grep_opt grep_filter;
|
2006-09-18 02:43:40 +04:00
|
|
|
|
2008-05-04 14:36:54 +04:00
|
|
|
/* Display history graph */
|
|
|
|
struct git_graph *graph;
|
|
|
|
|
2006-02-26 03:19:46 +03:00
|
|
|
/* special limits */
|
2006-12-20 05:25:32 +03:00
|
|
|
int skip_count;
|
2006-02-26 03:19:46 +03:00
|
|
|
int max_count;
|
|
|
|
unsigned long max_age;
|
|
|
|
unsigned long min_age;
|
2011-03-21 13:14:06 +03:00
|
|
|
int min_parents;
|
|
|
|
int max_parents;
|
2006-03-10 12:21:39 +03:00
|
|
|
|
Common option parsing for "git log --diff" and friends
This basically does a few things that are sadly somewhat interdependent,
and nontrivial to split out
- get rid of "struct log_tree_opt"
The fields in "log_tree_opt" are moved into "struct rev_info", and all
users of log_tree_opt are changed to use the rev_info struct instead.
- add the parsing for the log_tree_opt arguments to "setup_revision()"
- make setup_revision set a flag (revs->diff) if the diff-related
arguments were used. This allows "git log" to decide whether it wants
to show diffs or not.
- make setup_revision() also initialize the diffopt part of rev_info
(which we had from before, but we just didn't initialize it)
- make setup_revision() do all the "finishing touches" on it all (it will
do the proper flag combination logic, and call "diff_setup_done()")
Now, that was the easy and straightforward part.
The slightly more involved part is that some of the programs that want to
use the new-and-improved rev_info parsing don't actually want _commits_,
they may want tree'ish arguments instead. That meant that I had to change
setup_revision() to parse the arguments not into the "revs->commits" list,
but into the "revs->pending_objects" list.
Then, when we do "prepare_revision_walk()", we walk that list, and create
the sorted commit list from there.
This actually cleaned some stuff up, but it's the less obvious part of the
patch, and re-organized the "revision.c" logic somewhat. It actually paves
the way for splitting argument parsing _entirely_ out of "revision.c",
since now the argument parsing really is totally independent of the commit
walking: that didn't use to be true, since there was lots of overlap with
get_commit_reference() handling etc, now the _only_ overlap is the shared
(and trivial) "add_pending_object()" thing.
However, I didn't do that file split, just because I wanted the diff
itself to be smaller, and show the actual changes more clearly. If this
gets accepted, I'll do further cleanups then - that includes the file
split, but also using the new infrastructure to do a nicer "git diff" etc.
Even in this form, it actually ends up removing more lines than it adds.
It's nice to note how simple and straightforward this makes the built-in
"git log" command, even though it continues to support all the diff flags
too. It doesn't get much simpler that this.
I think this is worth merging soonish, because it does allow for future
cleanup and even more sharing of code. However, it obviously touches
"revision.c", which is subtle. I've tested that it passes all the tests we
have, and it passes my "looks sane" detector, but somebody else should
also give it a good look-over.
[jc: squashed the original and three "oops this too" updates, with
another fix-up.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-15 03:52:13 +04:00
|
|
|
/* diff info for patches and for paths limiting */
|
2006-04-11 05:14:54 +04:00
|
|
|
struct diff_options diffopt;
|
Common option parsing for "git log --diff" and friends
This basically does a few things that are sadly somewhat interdependent,
and nontrivial to split out
- get rid of "struct log_tree_opt"
The fields in "log_tree_opt" are moved into "struct rev_info", and all
users of log_tree_opt are changed to use the rev_info struct instead.
- add the parsing for the log_tree_opt arguments to "setup_revision()"
- make setup_revision set a flag (revs->diff) if the diff-related
arguments were used. This allows "git log" to decide whether it wants
to show diffs or not.
- make setup_revision() also initialize the diffopt part of rev_info
(which we had from before, but we just didn't initialize it)
- make setup_revision() do all the "finishing touches" on it all (it will
do the proper flag combination logic, and call "diff_setup_done()")
Now, that was the easy and straightforward part.
The slightly more involved part is that some of the programs that want to
use the new-and-improved rev_info parsing don't actually want _commits_,
they may want tree'ish arguments instead. That meant that I had to change
setup_revision() to parse the arguments not into the "revs->commits" list,
but into the "revs->pending_objects" list.
Then, when we do "prepare_revision_walk()", we walk that list, and create
the sorted commit list from there.
This actually cleaned some stuff up, but it's the less obvious part of the
patch, and re-organized the "revision.c" logic somewhat. It actually paves
the way for splitting argument parsing _entirely_ out of "revision.c",
since now the argument parsing really is totally independent of the commit
walking: that didn't use to be true, since there was lots of overlap with
get_commit_reference() handling etc, now the _only_ overlap is the shared
(and trivial) "add_pending_object()" thing.
However, I didn't do that file split, just because I wanted the diff
itself to be smaller, and show the actual changes more clearly. If this
gets accepted, I'll do further cleanups then - that includes the file
split, but also using the new infrastructure to do a nicer "git diff" etc.
Even in this form, it actually ends up removing more lines than it adds.
It's nice to note how simple and straightforward this makes the built-in
"git log" command, even though it continues to support all the diff flags
too. It doesn't get much simpler that this.
I think this is worth merging soonish, because it does allow for future
cleanup and even more sharing of code. However, it obviously touches
"revision.c", which is subtle. I've tested that it passes all the tests we
have, and it passes my "looks sane" detector, but somebody else should
also give it a good look-over.
[jc: squashed the original and three "oops this too" updates, with
another fix-up.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-15 03:52:13 +04:00
|
|
|
struct diff_options pruning;
|
2006-04-11 05:14:54 +04:00
|
|
|
|
2007-01-11 13:47:48 +03:00
|
|
|
struct reflog_walk_info *reflog_info;
|
2008-04-03 13:12:06 +04:00
|
|
|
struct decoration children;
|
2008-08-14 21:59:44 +04:00
|
|
|
struct decoration merge_simplification;
|
revision.c: Make --full-history consider more merges
History simplification previously always treated merges as TREESAME
if they were TREESAME to any parent.
While this was consistent with the default behaviour, this could be
extremely unhelpful when searching detailed history, and could not be
overridden. For example, if a merge had ignored a change, as if by "-s
ours", then:
git log -m -p --full-history -Schange file
would successfully locate "change"'s addition but would not locate the
merge that resolved against it.
Futher, simplify_merges could drop the actual parent that a commit
was TREESAME to, leaving it as a normal commit marked TREESAME that
isn't actually TREESAME to its remaining parent.
Now redefine a commit's TREESAME flag to be true only if a commit is
TREESAME to _all_ of its parents. This doesn't affect either the default
simplify_history behaviour (because partially TREESAME merges are turned
into normal commits), or full-history with parent rewriting (because all
merges are output). But it does affect other modes. The clearest
difference is that --full-history will show more merges - sufficient to
ensure that -m -p --full-history log searches can really explain every
change to the file, including those changes' ultimate fate in merges.
Also modify simplify_merges to recalculate TREESAME after removing
a parent. This is achieved by storing per-parent TREESAME flags on the
initial scan, so the combined flag can be easily recomputed.
This fixes some t6111 failures, but creates a couple of new ones -
we are now showing some merges that don't need to be shown.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 19:32:34 +04:00
|
|
|
struct decoration treesame;
|
2010-03-12 20:04:26 +03:00
|
|
|
|
|
|
|
/* notes-specific options: which refs to show */
|
|
|
|
struct display_notes_opt notes_opt;
|
2010-06-10 15:47:23 +04:00
|
|
|
|
|
|
|
/* commit counts */
|
|
|
|
int count_left;
|
|
|
|
int count_right;
|
2011-04-26 12:24:29 +04:00
|
|
|
int count_same;
|
Implement line-history search (git log -L)
This is a rewrite of much of Bo's work, mainly in an effort to split
it into smaller, easier to understand routines.
The algorithm is built around the struct range_set, which encodes a
series of line ranges as intervals [a,b). This is used in two
contexts:
* A set of lines we are tracking (which will change as we dig through
history).
* To encode diffs, as pairs of ranges.
The main routine is range_set_map_across_diff(). It processes the
diff between a commit C and some parent P. It determines which diff
hunks are relevant to the ranges tracked in C, and computes the new
ranges for P.
The algorithm is then simply to process history in topological order
from newest to oldest, computing ranges and (partial) diffs. At
branch points, we need to merge the ranges we are watching. We will
find that many commits do not affect the chosen ranges, and mark them
TREESAME (in addition to those already filtered by pathspec limiting).
Another pass of history simplification then gets rid of such commits.
This is wired as an extra filtering pass in the log machinery. This
currently only reduces code duplication, but should allow for other
simplifications and options to be used.
Finally, we hook a diff printer into the output chain. Ideally we
would wire directly into the diff logic, to optionally use features
like word diff. However, that will require some major reworking of
the diff chain, so we completely replace the output with our own diff
for now.
As this was a GSoC project, and has quite some history by now, many
people have helped. In no particular order, thanks go to
Jakub Narebski <jnareb@gmail.com>
Jens Lehmann <Jens.Lehmann@web.de>
Jonathan Nieder <jrnieder@gmail.com>
Junio C Hamano <gitster@pobox.com>
Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Will Palmer <wmpalmer@gmail.com>
Apologies to everyone I forgot.
Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-28 20:47:32 +04:00
|
|
|
|
|
|
|
/* line level range that we are chasing */
|
|
|
|
struct decoration line_log_data;
|
log: use true parents for diff even when rewriting
When using pathspec filtering in combination with diff-based log
output, parent simplification happens before the diff is computed.
The diff is therefore against the *simplified* parents.
This works okay, arguably by accident, in the normal case:
simplification reduces to one parent as long as the commit is TREESAME
to it. So the simplified parent of any given commit must have the
same tree contents on the filtered paths as its true (unfiltered)
parent.
However, --full-diff breaks this guarantee, and indeed gives pretty
spectacular results when comparing the output of
git log --graph --stat ...
git log --graph --full-diff --stat ...
(--graph internally kicks in parent simplification, much like
--parents).
To fix it, store a copy of the parent list before simplification (in a
slab) whenever --full-diff is in effect. Then use the stored parents
instead of the simplified ones in the commit display code paths. The
latter do not actually check for --full-diff to avoid duplicated code;
they just grab the original parents if save_parents() has not been
called for this revision walk.
For ordinary commits it should be obvious that this is the right thing
to do.
Merge commits are a bit subtle. Observe that with default
simplification, merge simplification is an all-or-nothing decision:
either the merge is TREESAME to one parent and disappears, or it is
different from all parents and the parent list remains intact.
Redundant parents are not pruned, so the existing code also shows them
as a merge.
So if we do show a merge commit, the parent list just consists of the
rewrite result on each parent. Running, e.g., --cc on this in
--full-diff mode is not very useful: if any commits were skipped, some
hunks will disagree with all sides of the merge (with one side,
because commits were skipped; with the others, because they didn't
have those changes in the first place). This triggers --cc showing
these hunks spuriously.
Therefore I believe that even for merge commits it is better to show
the diffs wrt. the original parents.
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-01 00:13:20 +04:00
|
|
|
|
|
|
|
/* copies of the parent lists, for --full-diff display */
|
|
|
|
struct saved_parents *saved_parents_slab;
|
2006-02-26 03:19:46 +03:00
|
|
|
};
|
|
|
|
|
2006-03-10 12:21:39 +03:00
|
|
|
#define REV_TREE_SAME 0
|
2009-06-03 05:34:01 +04:00
|
|
|
#define REV_TREE_NEW 1 /* Only new files */
|
|
|
|
#define REV_TREE_OLD 2 /* Only files removed */
|
|
|
|
#define REV_TREE_DIFFERENT 3 /* Mixed changes */
|
2006-03-10 12:21:39 +03:00
|
|
|
|
2006-02-26 03:19:46 +03:00
|
|
|
/* revision.c */
|
2007-11-03 21:11:10 +03:00
|
|
|
typedef void (*show_early_output_fn_t)(struct rev_info *, struct commit_list *);
|
2008-08-21 04:34:30 +04:00
|
|
|
extern volatile show_early_output_fn_t show_early_output;
|
2006-03-10 12:21:39 +03:00
|
|
|
|
2010-03-09 09:58:09 +03:00
|
|
|
struct setup_revision_opt {
|
|
|
|
const char *def;
|
2010-03-09 10:27:25 +03:00
|
|
|
void (*tweak)(struct rev_info *, struct setup_revision_opt *);
|
2010-07-07 17:39:12 +04:00
|
|
|
const char *submodule;
|
2012-04-14 23:04:48 +04:00
|
|
|
int assume_dashdash;
|
2012-07-02 23:43:05 +04:00
|
|
|
unsigned revarg_opt;
|
2010-03-09 09:58:09 +03:00
|
|
|
};
|
|
|
|
|
2006-07-29 08:21:48 +04:00
|
|
|
extern void init_revisions(struct rev_info *revs, const char *prefix);
|
2013-05-25 13:08:07 +04:00
|
|
|
extern int setup_revisions(int argc, const char **argv, struct rev_info *revs,
|
|
|
|
struct setup_revision_opt *);
|
2008-07-10 01:38:34 +04:00
|
|
|
extern void parse_revision_opt(struct rev_info *revs, struct parse_opt_ctx_t *ctx,
|
2013-05-25 13:08:07 +04:00
|
|
|
const struct option *options,
|
|
|
|
const char * const usagestr[]);
|
2012-07-02 23:33:52 +04:00
|
|
|
#define REVARG_CANNOT_BE_FILENAME 01
|
2012-07-02 23:43:05 +04:00
|
|
|
#define REVARG_COMMITTISH 02
|
2013-05-25 13:08:07 +04:00
|
|
|
extern int handle_revision_arg(const char *arg, struct rev_info *revs,
|
|
|
|
int flags, unsigned revarg_opt);
|
2006-09-06 08:28:36 +04:00
|
|
|
|
2012-03-29 11:21:21 +04:00
|
|
|
extern void reset_revision_walk(void);
|
2007-05-05 01:54:57 +04:00
|
|
|
extern int prepare_revision_walk(struct rev_info *revs);
|
2006-02-28 22:24:00 +03:00
|
|
|
extern struct commit *get_revision(struct rev_info *revs);
|
2013-05-25 13:08:07 +04:00
|
|
|
extern char *get_revision_mark(const struct rev_info *revs,
|
|
|
|
const struct commit *commit);
|
|
|
|
extern void put_revision_mark(const struct rev_info *revs,
|
|
|
|
const struct commit *commit);
|
2006-02-28 22:24:00 +03:00
|
|
|
|
2006-02-26 03:19:46 +03:00
|
|
|
extern void mark_parents_uninteresting(struct commit *commit);
|
|
|
|
extern void mark_tree_uninteresting(struct tree *tree);
|
|
|
|
|
|
|
|
struct name_path {
|
|
|
|
struct name_path *up;
|
|
|
|
int elem_len;
|
|
|
|
const char *elem;
|
|
|
|
};
|
|
|
|
|
show_object(): push path_name() call further down
In particular, pushing the "path_name()" call _into_ the show() function
would seem to allow
- more clarity into who "owns" the name (ie now when we free the name in
the show_object callback, it's because we generated it ourselves by
calling path_name())
- not calling path_name() at all, either because we don't care about the
name in the first place, or because we are actually happy walking the
linked list of "struct name_path *" and the last component.
Now, I didn't do that latter optimization, because it would require some
more coding, but especially looking at "builtin-pack-objects.c", we really
don't even want the whole pathname, we really would be better off with the
list of path components.
Why? We use that name for two things:
- add_preferred_base_object(), which actually _wants_ to traverse the
path, and now does it by looking for '/' characters!
- for 'name_hash()', which only cares about the last 16 characters of a
name, so again, generating the full name seems to be just unnecessary
work.
Anyway, so I didn't look any closer at those things, but it did convince
me that the "show_object()" calling convention was crazy, and we're
actually better off doing _less_ in list-objects.c, and giving people
access to the internal data structures so that they can decide whether
they want to generate a path-name or not.
This patch does that, and then for people who did use the name (even if
they might do something more clever in the future), it just does the
straightforward "name = path_name(path, component); .. free(name);" thing.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-11 05:15:26 +04:00
|
|
|
char *path_name(const struct name_path *path, const char *name);
|
process_{tree,blob}: show objects without buffering
Here's a less trivial thing, and slightly more dubious one.
I was looking at that "struct object_array objects", and wondering why we
do that. I have honestly totally forgotten. Why not just call the "show()"
function as we encounter the objects? Rather than add the objects to the
object_array, and then at the very end going through the array and doing a
'show' on all, just do things more incrementally.
Now, there are possible downsides to this:
- the "buffer using object_array" _can_ in theory result in at least
better I-cache usage (two tight loops rather than one more spread out
one). I don't think this is a real issue, but in theory..
- this _does_ change the order of the objects printed. Instead of doing a
"process_tree(revs, commit->tree, &objects, NULL, "");" in the loop
over the commits (which puts all the root trees _first_ in the object
list, this patch just adds them to the list of pending objects, and
then we'll traverse them in that order (and thus show each root tree
object together with the objects we discover under it)
I _think_ the new ordering actually makes more sense, but the object
ordering is actually a subtle thing when it comes to packing
efficiency, so any change in order is going to have implications for
packing. Good or bad, I dunno.
- There may be some reason why we did it that odd way with the object
array, that I have simply forgotten.
Anyway, now that we don't buffer up the objects before showing them
that may actually result in lower memory usage during that whole
traverse_commit_list() phase.
This is seriously not very deeply tested. It makes sense to me, it seems
to pass all the tests, it looks ok, but...
Does anybody remember why we did that "object_array" thing? It used to be
an "object_list" a long long time ago, but got changed into the array due
to better memory usage patterns (those linked lists of obejcts are
horrible from a memory allocation standpoint). But I wonder why we didn't
do this back then. Maybe there's a reason for it.
Or maybe there _used_ to be a reason, and no longer is.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-11 04:27:58 +04:00
|
|
|
|
2013-05-25 13:08:07 +04:00
|
|
|
extern void show_object_with_name(FILE *, struct object *,
|
|
|
|
const struct name_path *, const char *);
|
2011-08-18 01:30:34 +04:00
|
|
|
|
Add "named object array" concept
We've had this notion of a "object_list" for a long time, which eventually
grew a "name" member because some users (notably git-rev-list) wanted to
name each object as it is generated.
That object_list is great for some things, but it isn't all that wonderful
for others, and the "name" member is generally not used by everybody.
This patch splits the users of the object_list array up into two: the
traditional list users, who want the list-like format, and who don't
actually use or want the name. And another class of users that really used
the list as an extensible array, and generally wanted to name the objects.
The patch is fairly straightforward, but it's also biggish. Most of it
really just cleans things up: switching the revision parsing and listing
over to the array makes things like the builtin-diff usage much simpler
(we now see exactly how many members the array has, and we don't get the
objects reversed from the order they were on the command line).
One of the main reasons for doing this at all is that the malloc overhead
of the simple object list was actually pretty high, and the array is just
a lot denser. So this patch brings down memory usage by git-rev-list by
just under 3% (on top of all the other memory use optimizations) on the
mozilla archive.
It does add more lines than it removes, and more importantly, it adds a
whole new infrastructure for maintaining lists of objects, but on the
other hand, the new dynamic array code is pretty obvious. The change to
builtin-diff-tree.c shows a fairly good example of why an array interface
is sometimes more natural, and just much simpler for everybody.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-20 04:42:35 +04:00
|
|
|
extern void add_object(struct object *obj,
|
|
|
|
struct object_array *p,
|
|
|
|
struct name_path *path,
|
|
|
|
const char *name);
|
|
|
|
|
2013-05-25 13:08:07 +04:00
|
|
|
extern void add_pending_object(struct rev_info *revs,
|
|
|
|
struct object *obj, const char *name);
|
|
|
|
extern void add_pending_sha1(struct rev_info *revs,
|
|
|
|
const char *name, const unsigned char *sha1,
|
|
|
|
unsigned int flags);
|
2006-02-26 03:19:46 +03:00
|
|
|
|
2007-12-11 21:09:04 +03:00
|
|
|
extern void add_head_to_pending(struct rev_info *);
|
|
|
|
|
2007-11-04 23:12:05 +03:00
|
|
|
enum commit_action {
|
|
|
|
commit_ignore,
|
|
|
|
commit_show,
|
|
|
|
commit_error
|
|
|
|
};
|
|
|
|
|
2013-05-25 13:08:07 +04:00
|
|
|
extern enum commit_action get_commit_action(struct rev_info *revs,
|
|
|
|
struct commit *commit);
|
|
|
|
extern enum commit_action simplify_commit(struct rev_info *revs,
|
|
|
|
struct commit *commit);
|
2007-11-04 23:12:05 +03:00
|
|
|
|
2013-03-28 20:47:31 +04:00
|
|
|
enum rewrite_result {
|
|
|
|
rewrite_one_ok,
|
|
|
|
rewrite_one_noparents,
|
|
|
|
rewrite_one_error
|
|
|
|
};
|
|
|
|
|
|
|
|
typedef enum rewrite_result (*rewrite_parent_fn_t)(struct rev_info *revs, struct commit **pp);
|
|
|
|
|
|
|
|
extern int rewrite_parents(struct rev_info *revs, struct commit *commit,
|
|
|
|
rewrite_parent_fn_t rewrite_parent);
|
log: use true parents for diff even when rewriting
When using pathspec filtering in combination with diff-based log
output, parent simplification happens before the diff is computed.
The diff is therefore against the *simplified* parents.
This works okay, arguably by accident, in the normal case:
simplification reduces to one parent as long as the commit is TREESAME
to it. So the simplified parent of any given commit must have the
same tree contents on the filtered paths as its true (unfiltered)
parent.
However, --full-diff breaks this guarantee, and indeed gives pretty
spectacular results when comparing the output of
git log --graph --stat ...
git log --graph --full-diff --stat ...
(--graph internally kicks in parent simplification, much like
--parents).
To fix it, store a copy of the parent list before simplification (in a
slab) whenever --full-diff is in effect. Then use the stored parents
instead of the simplified ones in the commit display code paths. The
latter do not actually check for --full-diff to avoid duplicated code;
they just grab the original parents if save_parents() has not been
called for this revision walk.
For ordinary commits it should be obvious that this is the right thing
to do.
Merge commits are a bit subtle. Observe that with default
simplification, merge simplification is an all-or-nothing decision:
either the merge is TREESAME to one parent and disappears, or it is
different from all parents and the parent list remains intact.
Redundant parents are not pruned, so the existing code also shows them
as a merge.
So if we do show a merge commit, the parent list just consists of the
rewrite result on each parent. Running, e.g., --cc on this in
--full-diff mode is not very useful: if any commits were skipped, some
hunks will disagree with all sides of the merge (with one side,
because commits were skipped; with the others, because they didn't
have those changes in the first place). This triggers --cc showing
these hunks spuriously.
Therefore I believe that even for merge commits it is better to show
the diffs wrt. the original parents.
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-01 00:13:20 +04:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Save a copy of the parent list, and return the saved copy. This is
|
|
|
|
* used by the log machinery to retrieve the original parents when
|
|
|
|
* commit->parents has been modified by history simpification.
|
|
|
|
*
|
|
|
|
* You may only call save_parents() once per commit (this is checked
|
|
|
|
* for non-root commits).
|
|
|
|
*
|
|
|
|
* get_saved_parents() will transparently return commit->parents if
|
|
|
|
* history simplification is off.
|
|
|
|
*/
|
|
|
|
extern void save_parents(struct rev_info *revs, struct commit *commit);
|
|
|
|
extern struct commit_list *get_saved_parents(struct rev_info *revs, const struct commit *commit);
|
|
|
|
extern void free_saved_parents(struct rev_info *revs);
|
|
|
|
|
2006-02-26 03:19:46 +03:00
|
|
|
#endif
|