Граф коммитов

49 Коммитов

Автор SHA1 Сообщение Дата
Junio C Hamano 6a67695268 Merge branch 'js/regexec-buf'
Some codepaths in "git diff" used regexec(3) on a buffer that was
mmap(2)ed, which may not have a terminating NUL, leading to a read
beyond the end of the mapped region.  This was fixed by introducing
a regexec_buf() helper that takes a <ptr,len> pair with REG_STARTEND
extension.

* js/regexec-buf:
  regex: use regexec_buf()
  regex: add regexec_buf() that can work on a non NUL-terminated string
  regex: -G<pattern> feeds a non NUL-terminated string to regexec() and fails
2016-09-26 16:09:19 -07:00
Johannes Schindelin b7d36ffca0 regex: use regexec_buf()
The new regexec_buf() function operates on buffers with an explicitly
specified length, rather than NUL-terminated strings.

We need to use this function whenever the buffer we want to pass to
regexec(3) may have been mmap(2)ed (and is hence not NUL-terminated).

Note: the original motivation for this patch was to fix a bug where
`git diff -G <regex>` would crash. This patch converts more callers,
though, some of which allocated to construct NUL-terminated strings,
or worse, modified buffers to temporarily insert NULs while calling
regexec(3).  By converting them to use regexec_buf(), the code has
become much cleaner.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 13:56:15 -07:00
brian m. carlson d449347d08 Convert read_mmblob to take struct object_id.
Since all of its callers have been updated, convert read_mmblob to take
a pointer to struct object_id.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 12:59:42 -07:00
René Scharfe e0876bca4d xdiff: don't trim common tail with -W
The function trim_common_tail() exits early if context lines are
requested.  If -U0 and -W are specified together then it can still trim
context lines that might belong to a changed function.  As a result
that function is shown incompletely.

Fix that by calling trim_common_tail() only if no function context or
fixed context is requested.  The parameter ctx is no longer needed now;
remove it.

While at it fix an outdated comment as well.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-31 13:08:56 -07:00
Jeff King b32fa95fd8 convert trivial cases to ALLOC_ARRAY
Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:

  1. It automatically checks the array-size multiplication
     for overflow.

  2. It always uses sizeof(*array) for the element-size,
     so that it can never go out of sync with the declared
     type of the array.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King dcd1742e56 xdiff: reject files larger than ~1GB
The xdiff code is not prepared to handle extremely large
files. It uses "int" in many places, which can overflow if
we have a very large number of lines or even bytes in our
input files. This can cause us to produce incorrect diffs,
with no indication that the output is wrong. Or worse, we
may even underallocate a buffer whose size is the result of
an overflowing addition.

We're much better off to tell the user that we cannot diff
or merge such a large file. This patch covers both cases,
but in slightly different ways:

  1. For merging, we notice the large file and cleanly fall
     back to a binary merge (which is effectively "we cannot
     merge this").

  2. For diffing, we make the binary/text distinction much
     earlier, and in many different places. For this case,
     we'll use the xdi_diff as our choke point, and reject
     any diff there before it hits the xdiff code.

     This means in most cases we'll die() immediately after.
     That's not ideal, but in practice we shouldn't
     generally hit this code path unless the user is trying
     to do something tricky. We already consider files
     larger than core.bigfilethreshold to be binary, so this
     code would only kick in when that is circumvented
     (either by bumping that value, or by using a
     .gitattribute to mark a file as diffable).

     In other words, we can avoid being "nice" here, because
     there is already nice code that tries to do the right
     thing. We are adding the suspenders to the nice code's
     belt, so notice when it has been worked around (both to
     protect the user from malicious inputs, and because it
     is better to die() than generate bogus output).

The maximum size was chosen after experimenting with feeding
large files to the xdiff code. It's just under a gigabyte,
which leaves room for two obvious cases:

  - a diff3 merge conflict result on files of maximum size X
    could be 3*X plus the size of the markers, which would
    still be only about 3G, which fits in a 32-bit int.

  - some of the diff code allocates arrays of one int per
    record. Even if each file consists only of blank lines,
    then a file smaller than 1G will have fewer than 1G
    records, and therefore the int array will fit in 4G.

Since the limit is arbitrary anyway, I chose to go under a
gigabyte, to leave a safety margin (e.g., we would not want
to overflow by allocating "(records + 1) * sizeof(int)" or
similar.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-28 14:57:23 -07:00
René Scharfe 3319e60633 xdiff: remove emit_func() and xdi_diff_hunks()
The functions are unused now, remove them.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-09 14:08:42 -07:00
Jonathan Nieder 8c2be75fe1 add, merge, diff: do not use strcasecmp to compare config variable names
The config machinery already makes section and variable names
lowercase when parsing them, so using strcasecmp for comparison just
feels wasteful.  No noticeable change intended.

Noticed-by: Jay Soffian <jaysoffian@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-14 18:53:39 -07:00
Junio C Hamano 26517dea24 Merge branch 'rs/maint-diff-fd-leak' into maint
* rs/maint-diff-fd-leak:
  close file on error in read_mmfile()
2010-12-26 11:18:39 -08:00
René Scharfe 5fd898141c close file on error in read_mmfile()
Reported in http://qa.debian.org/daca/cppcheck/sid/git_1.7.2.3-2.2.html
and in http://thread.gmane.org/gmane.comp.version-control.git/123042.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-26 11:17:18 -08:00
Brandon Casey 1b6ecbad35 xdiff-interface.c: always trim trailing space from xfuncname matches
Generally, trailing space is removed from the string matched by the
xfuncname patterns.  The exception is when the matched string exceeds the
length of the fixed-size buffer that it will be copied in to.  But, a
string that exceeds the buffer can still contain trailing space in the
portion of the string that will be copied into the buffer.  So, simplify
this code slightly, and just perform the trailing space removal always.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-09 17:18:29 -07:00
Junio C Hamano 6b6f5d4664 Merge branch 'maint-1.7.0' into maint
* maint-1.7.0:
  remove ecb parameter from xdi_diff_outf()
2010-05-04 15:20:47 -07:00
René Scharfe dfea79004c remove ecb parameter from xdi_diff_outf()
xdi_diff_outf() overrides the structure members of its last parameter,
ignoring any value that callers pass in.  It's no surprise then that all
callers pass a pointer to an uninitialized structure.  They also don't
read it after the call, so the parameter is neither used for input nor
for output.   Turn it into a local variable of xdi_diff_outf().

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-04 15:19:14 -07:00
Michael Lukashov 06b65939b0 refactor duplicated fill_mm() in checkout and merge-recursive
The following function is duplicated:

  fill_mm

Move it to xdiff-interface.c and rename it 'read_mmblob', as suggested
by Junio C Hamano.

Also, change parameters order for consistency with read_mmfile().

Signed-off-by: Michael Lukashov <michael.lukashov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-17 15:11:33 -08:00
René Scharfe 8cfe5f1cd5 userdiff: add xdiff_clear_find_func()
xdiff_set_find_func() is used to set user defined regular expressions
for finding function signatures.  Add xdiff_clear_find_func(), which
frees the memory allocated by the former, making the API complete.

Also, use the new function in diff.c (the only call site of
xdiff_set_find_func()) to clean up after ourselves.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01 19:16:37 -07:00
Benjamin Kramer eb3a9dd327 Remove unused function scope local variables
These variables were unused and can be removed safely:

  builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote
  builtin-fetch-pack.c::find_common(): len
  builtin-remote.c::mv(): symref
  diff.c::show_stats():show_stats(): total
  diffcore-break.c::should_break(): base_size
  fast-import.c::validate_raw_date(): date, sign
  fsck.c::fsck_tree(): o_sha1, sha1
  xdiff-interface.c::parse_num(): read_some

Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 20:52:17 -08:00
Jim Meyering c095a1db30 xdiff-interface.c: remove 10 duplicated lines
Remove an accidentally duplicated sequence of 10 lines.
This happens to plug a leak, too.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-26 10:47:41 -08:00
Junio C Hamano 1e2bba92d2 Merge branch 'rs/blame'
* rs/blame:
  blame: use xdi_diff_hunks(), get rid of struct patch
  add xdi_diff_hunks() for callers that only need hunk lengths
  Allow alternate "low-level" emit function from xdl_diff
  Always initialize xpparam_t to 0
  blame: inline get_patch()
2008-11-08 16:05:39 -08:00
René Scharfe 86295bb6ba add xdi_diff_hunks() for callers that only need hunk lengths
Based on a patch by Brian Downing, this uses the xdiff emit_func feature
to implement xdi_diff_hunks().  It's a function that calls a callback for
each hunk of a diff, passing its lengths.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-25 12:09:31 -07:00
Junio C Hamano 46dc1b0e33 Merge branch 'maint'
* maint:
  t1301-shared-repo.sh: don't let a default ACL interfere with the test
  git-check-attr(1): add output and example sections
  xdiff-interface.c: strip newline (and cr) from line before pattern matching
  t4018-diff-funcname: demonstrate end of line funcname matching flaw
  t4018-diff-funcname: rework negated last expression test
  Typo "does not exists" when git remote update remote.
  remote.c: correct the check for a leading '/' in a remote name
  Add testcase to ensure merging an early part of a branch is done properly

Conflicts:
	t/t7600-merge.sh
2008-10-17 01:52:32 -07:00
Brandon Casey 563d5a2c84 xdiff-interface.c: strip newline (and cr) from line before pattern matching
POSIX doth sayeth:

   "In the regular expression processing described in IEEE Std 1003.1-2001,
    the <newline> is regarded as an ordinary character and both a period and
    a non-matching list can match one. ... Those utilities (like grep) that
    do not allow <newline>s to match are responsible for eliminating any
    <newline> from strings before matching against the RE."

Thus far git has not been removing the trailing newline from strings matched
against regular expression patterns. This has the effect that (quoting
Jonathan del Strother) "... a line containing just 'FUNCNAME' (terminated by
a newline) will be matched by the pattern '^(FUNCNAME.$)' but not
'^(FUNCNAME$)'", and more simply not '^FUNCNAME$'.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-16 08:31:56 -07:00
Brandon Casey a5a5a04863 xdiff-interface.c: strip newline (and cr) from line before pattern matching
POSIX doth sayeth:

   "In the regular expression processing described in IEEE Std 1003.1-2001,
    the <newline> is regarded as an ordinary character and both a period and
    a non-matching list can match one. ... Those utilities (like grep) that
    do not allow <newline>s to match are responsible for eliminating any
    <newline> from strings before matching against the RE."

Thus far git has not been removing the trailing newline from strings matched
against regular expression patterns. This has the effect that (quoting
Jonathan del Strother) "... a line containing just 'FUNCNAME' (terminated by
a newline) will be matched by the pattern '^(FUNCNAME.$)' but not
'^(FUNCNAME$)'", and more simply not '^FUNCNAME$'.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-02 18:45:53 -07:00
Shawn O. Pearce 9800c0df41 Merge branch 'bc/master-diff-hunk-header-fix'
* bc/master-diff-hunk-header-fix:
  Clarify commit error message for unmerged files
  Use strchrnul() instead of strchr() plus manual workaround
  Use remove_path from dir.c instead of own implementation
  Add remove_path: a function to remove as much as possible of a path
  git-submodule: Fix "Unable to checkout" for the initial 'update'
  Clarify how the user can satisfy stash's 'dirty state' check.
  t4018-diff-funcname: test syntax of builtin xfuncname patterns
  t4018-diff-funcname: test syntax of builtin xfuncname patterns
  make "git remote" report multiple URLs
  diff hunk pattern: fix misconverted "\{" tex macro introducers
  diff: fix "multiple regexp" semantics to find hunk header comment
  diff: use extended regexp to find hunk headers
  diff: use extended regexp to find hunk headers
  diff.*.xfuncname which uses "extended" regex's for hunk header selection
  diff.c: associate a flag with each pattern and use it for compiling regex
  diff.c: return pattern entry pointer rather than just the hunk header pattern

Conflicts:
	builtin-merge-recursive.c
	t/t7201-co.sh
	xdiff-interface.h
2008-09-29 11:04:20 -07:00
Shawn O. Pearce 9ba929ed65 Merge branch 'jc/better-conflict-resolution'
* jc/better-conflict-resolution:
  Fix AsciiDoc errors in merge documentation
  git-merge documentation: describe how conflict is presented
  checkout --conflict=<style>: recreate merge in a non-default style
  checkout -m: recreate merge when checking out of unmerged index
  git-merge-recursive: learn to honor merge.conflictstyle
  merge.conflictstyle: choose between "merge" and "diff3 -m" styles
  rerere: understand "diff3 -m" style conflicts with the original
  rerere.c: use symbolic constants to keep track of parsing states
  xmerge.c: "diff3 -m" style clips merge reduction level to EAGER or less
  xmerge.c: minimum readability fixups
  xdiff-merge: optionally show conflicts in "diff3 -m" style
  xdl_fill_merge_buffer(): separate out a too deeply nested function
  checkout --ours/--theirs: allow checking out one side of a conflicting merge
  checkout -f: allow ignoring unmerged paths when checking out of the index

Conflicts:
	Documentation/git-checkout.txt
	builtin-checkout.c
	builtin-merge-recursive.c
	t/t7201-co.sh
2008-09-29 10:15:07 -07:00
Junio C Hamano 3d8dccd74a diff: fix "multiple regexp" semantics to find hunk header comment
When multiple regular expressions are concatenated with "\n", they were
traditionally AND'ed together, and only a line that matches _all_ of them
is taken as a match.  This however is unwieldy when multiple regexp
feature is used to specify alternatives.

This fixes the semantics to take the first match.  A nagative pattern, if
matches, makes the line to fail as before.  A match with a positive
pattern will be the final match, and what it captures in $1 is used as the
hunk header comment.

We could write alternatives using "|" in ERE, but the machinery can only
use captured $1 as the hunk header comment (or $0 if there is no match in
$1), so you cannot write:

    "junk ( A | B ) | garbage ( C | D )"

and expect both "junk" and "garbage" to get stripped with the existing
code.  With this fix, you can write it as:

    "junk ( A | B ) \n garbage ( C | D )"

and the way capture works would match the user expectation more
naturally.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-20 00:52:11 -07:00
Junio C Hamano dde4af4313 Merge branch 'bc/maint-diff-hunk-header-fix' into bc/master-diff-hunk-header-fix
* bc/maint-diff-hunk-header-fix:
  diff.*.xfuncname which uses "extended" regex's for hunk header selection
  diff.c: associate a flag with each pattern and use it for compiling regex
  diff.c: return pattern entry pointer rather than just the hunk header pattern
  Cosmetical command name fix
  Start conforming code to "git subcmd" style part 3
  t9700/test.pl: remove File::Temp requirement
  t9700/test.pl: avoid bareword 'STDERR' in 3-argument open()
  GIT 1.6.0.2
  Fix some manual typos.
  Use compatibility regex library also on FreeBSD
  Use compatibility regex library also on AIX
  Update draft release notes for 1.6.0.2
  Use compatibility regex library for OSX/Darwin
  git-svn: Fixes my() parameter list syntax error in pre-5.8 Perl
  Git.pm: Use File::Temp->tempfile instead of ->new
  t7501: always use test_cmp instead of diff
  Start conforming code to "git subcmd" style part 2
  diff: Help "less" hide ^M from the output
  checkout: do not check out unmerged higher stages randomly

Conflicts:
	Documentation/git.txt
	Documentation/gitattributes.txt
	Makefile
	diff.c
	t/t7201-co.sh
2008-09-18 20:32:50 -07:00
Brandon Casey a013585b20 diff.c: associate a flag with each pattern and use it for compiling regex
This is in preparation for allowing extended regular expression patterns.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-18 20:06:23 -07:00
Junio C Hamano b541248467 merge.conflictstyle: choose between "merge" and "diff3 -m" styles
This teaches "git merge-file" to honor merge.conflictstyle configuration
variable, whose value can be "merge" (default) or "diff3".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-30 19:41:44 -07:00
Junio C Hamano 8a3f524bf2 xdiff-interface: hide the whole "xdiff_emit_state" business from the caller
This further enhances xdi_diff_outf() interface so that it takes two
common parameters: the callback function that processes one line at a
time, and a pointer to its application specific callback data structure.
xdi_diff_outf() creates its own "xdiff_emit_state" structure and stashes
these two away inside it, which is used by the lowest level output
function in the xdiff_outf() callchain, consume_one(), to call back to the
application layer.  With this restructuring, we lift the requirement that
the caller supplied callback data structure embeds xdiff_emit_state
structure as its first member.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-14 00:30:26 -07:00
Brian Downing b463776086 Use strbuf for struct xdiff_emit_state's remainder
Continually xreallocing and freeing the remainder member of struct
xdiff_emit_state was a noticeable performance hit.  Use a strbuf
instead.

This yields a decent performance improvement on "git blame" on certain
repositories.  For example, before this commit:

$ time git blame -M -C -C -p --incremental server.c >/dev/null
101.52user 0.17system 1:41.73elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+39561minor)pagefaults 0swaps

With this commit:

$ time git blame -M -C -C -p --incremental server.c >/dev/null
80.38user 0.30system 1:20.81elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+50979minor)pagefaults 0swaps

Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-13 23:10:23 -07:00
Brian Downing c99db9d292 Make xdi_diff_outf interface for running xdiff_outf diffs
To prepare for the need to initialize and release resources for an
xdi_diff with the xdiff_outf output function, make a new function to
wrap this usage.

Old:

	ecb.outf = xdiff_outf;
	ecb.priv = &state;
	...
	xdi_diff(file_p, file_o, &xpp, &xecfg, &ecb);

New:

	xdi_diff_outf(file_p, file_o, &state.xm, &xpp, &xecfg, &ecb);

Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-13 23:10:23 -07:00
Junio C Hamano 16007f3916 Merge branch 'maint'
* maint:
  merge-file: handle empty files gracefully
  merge-recursive: handle file mode changes
  Minor wording changes in the keyboard descriptions in git-add --interactive.
  git fetch: Take '-n' to mean '--no-tags'
  quiltimport: fix misquoting of parsed -p<num> parameter
  git-quiltimport: better parser to grok "enhanced" series files.
2008-03-14 00:16:42 -07:00
Johannes Schindelin 381b851c9b merge-file: handle empty files gracefully
Earlier, it would error out while trying to read and/or writing them.
Now, calling merge-file with empty files is neither interesting nor
useful, but it is a bug that needed fixing.

Noticed by Clemens Buchacher.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2008-03-13 23:43:56 -07:00
Jim Meyering 8e0f70033b Avoid unnecessary "if-before-free" tests.
This change removes all obvious useless if-before-free tests.
E.g., it replaces code like this:

        if (some_expression)
                free (some_expression);

with the now-equivalent:

        free (some_expression);

It is equivalent not just because POSIX has required free(NULL)
to work for a long time, but simply because it has worked for
so long that no reasonable porting target fails the test.
Here's some evidence from nearly 1.5 years ago:

    http://www.winehq.org/pipermail/wine-patches/2006-October/031544.html

FYI, the change below was prepared by running the following:

  git ls-files -z | xargs -0 \
  perl -0x3b -pi -e \
    's/\bif\s*\(\s*(\S+?)(?:\s*!=\s*NULL)?\s*\)\s+(free\s*\(\s*\1\s*\))/$2/s'

Note however, that it doesn't handle brace-enclosed blocks like
"if (x) { free (x); }".  But that's ok, since there were none like
that in git sources.

Beware: if you do use the above snippet, note that it can
produce syntactically invalid C code.  That happens when the
affected "if"-statement has a matching "else".
E.g., it would transform this

  if (x)
    free (x);
  else
    foo ();

into this:

  free (x);
  else
    foo ();

There were none of those here, either.

If you're interested in automating detection of the useless
tests, you might like the useless-if-before-free script in gnulib:
[it *does* detect brace-enclosed free statements, and has a --name=S
 option to make it detect free-like functions with different names]

  http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=build-aux/useless-if-before-free

Addendum:
  Remove one more (in imap-send.c), spotted by Jean-Luc Herren <jlh@gmx.ch>.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 14:14:40 -08:00
Linus Torvalds d2f82950a9 Re(-re)*fix trim_common_tail()
The tar-ball and the git archive itself is fine, but yes, the diff from
2.6.23 to 2.6.24-rc6 is bad. It's the "trim_common_tail()" optimization
that has caused way too much pain.

Very interesting breakage. The patch was actually "correct" in a (rather
limited) technical sense, but the context at the end was missing because
while the trim_common_tail() code made sure to keep enough common context
to allow a valid diff to be generated, the diff machinery itself could
decide that it could generate the diff differently than the "obvious"
solution.

Thee sad fact is that the git optimization (which is very important for
"git blame", which needs no context), is only really valid for that one
case where we really don't need any context.

[jc: since this is shared with "git diff -U0" codepath, context recovery
to the end of line needs to be done even for zero context case.]

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-20 20:54:23 -08:00
Junio C Hamano 079fe1dae8 Re-re-re-fix common tail optimization
We need to be extra careful recovering the removed common section, so
that we do not break context nor the changed incomplete line (i.e. the
last line that does not end with LF).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-16 14:00:30 -08:00
Jeff King 5249997735 trim_common_tail: brown paper bag fix.
The recovered context lines were not LF terminated due to off-by-one
error, which also caused the outer loop to count the number of recovered
lines to terminate after running only once.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-16 11:53:03 -08:00
Junio C Hamano 29ab27f4b5 xdiff tail trimming: use correct type.
Inside xdiff library, the number of context lines is represented in
long, not int.

Noticed by Peter Baumann.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-14 12:00:42 -08:00
Junio C Hamano 913b45f51b xdi_diff: trim common trailing lines
This implements earlier Linus's optimization to trim common lines at the
end before passing them down to low level xdiff interface for all of our
xdiff users.

We could later enhance this to also trim common leading lines, but that
would need tweaking the output function to add the number of lines
trimmed at the beginning to line numbers that appear in the hunk
headers.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-13 23:04:26 -08:00
Junio C Hamano c279d7e986 xdl_diff: identify call sites.
This inserts a new function xdi_diff() that currently does not
do anything other than calling the underlying xdl_diff() to the
callchain of current callers of xdl_diff() function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-13 23:04:26 -08:00
Junio C Hamano f258475a6e Per-path attribute based hunk header selection.
This makes"diff -p" hunk headers customizable via gitattributes mechanism.
It is based on Johannes's earlier patch that allowed to define a single
regexp to be used for everything.

The mechanism to arrive at the regexp that is used to define hunk header
is the same as other use of gitattributes.  You assign an attribute, funcname
(because "diff -p" typically uses the name of the function the patch is about
as the hunk header), a simple string value.  This can be one of the names of
built-in pattern (currently, "java" is defined) or a custom pattern name, to
be looked up from the configuration file.

  (in .gitattributes)
  *.java   funcname=java
  *.perl   funcname=perl

  (in .git/config)
  [funcname]
    java = ... # ugly and complicated regexp to override the built-in one.
    perl = ... # another ugly and complicated regexp to define a new one.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-06 01:20:47 -07:00
Junio C Hamano a6080a0a44 War on whitespace
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time.  There are a few files that need
to have trailing whitespaces (most notably, test vectors).  The results
still passes the test, and build result in Documentation/ area is unchanged.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07 00:04:01 -07:00
Johannes Schindelin 6bfce93e04 Move buffer_is_binary() to xdiff-interface.h
We already have two instances where we want to determine if a buffer
contains binary data as opposed to text.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-04 23:07:00 -07:00
Shawn O. Pearce dc49cd769b Cast 64 bit off_t to 32 bit size_t
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.

On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().

In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:15:26 -08:00
Johannes Schindelin 7cab5883ff move read_mmfile() into xdiff-interface
read_file() was a useful function if you want to work with the xdiff stuff,
so it was renamed and put into a more central place.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-21 23:10:14 -08:00
Jonas Fonseca 83572c1a91 Use xrealloc instead of realloc
Change places that use realloc, without a proper error path, to instead use
xrealloc. Drop an erroneous error path in the daemon code that used errno
in the die message in favour of the simpler xrealloc.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-26 17:54:06 -07:00
Junio C Hamano a0fd31463b Match ofs/cnt types in diff interface.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-06 22:29:55 -07:00
Junio C Hamano c1e335a43f combine-diff: move the code to parse hunk-header into common library.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-05 12:22:35 -07:00
Junio C Hamano d9ea73e056 combine-diff: refactor built-in xdiff interface.
This refactors the line-by-line callback mechanism used in
combine-diff so that other programs can reuse it more easily.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-05 02:09:58 -07:00