"git clean -d" used to clean directories that has ignored files,
even though the command should not lose ignored ones without "-x".
"git status --ignored" did not list ignored and untracked files
without "-uall". These have been corrected.
* sl/clean-d-ignored-fix:
clean: teach clean -d to preserve ignored paths
dir: expose cmp_name() and check_contains()
dir: hide untracked contents of untracked dirs
dir: recurse into untracked dirs for ignored files
t7061: status --ignored should search untracked dirs
t7300: clean -d should skip dirs with ignored files
There is an implicit assumption that a directory containing only
untracked and ignored paths should itself be considered untracked. This
makes sense in use cases where we're asking if a directory should be
added to the git database, but not when we're asking if a directory can
be safely removed from the working tree; as a result, clean -d would
assume that an "untracked" directory containing ignored paths could be
deleted, even though doing so would also remove the ignored paths.
To get around this, we teach clean -d to collect ignored paths and skip
an untracked directory if it contained an ignored path, instead just
removing the untracked contents thereof. To achieve this, cmd_clean()
has to collect all untracked contents of untracked directories, in
addition to all ignored paths, to determine which untracked dirs must be
skipped (because they contain ignored paths) and which ones should *not*
be skipped.
For this purpose, correct_untracked_entries() is introduced to prune a
given dir_struct of untracked entries containing ignored paths and those
untracked entries encompassed by the untracked entries which are not
pruned away.
A memory leak is also fixed in cmd_clean().
This also fixes the known breakage in t7300, since clean -d now skips
untracked directories containing ignored paths.
Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some warning() messages from "git clean" were updated to show the
errno from failed system calls.
* nd/clean-preserve-errno-in-warning:
clean: use warning_errno() when appropriate
All these warning() calls are preceded by a system call. Report the
actual error to help the user understand why we fail to remove
something.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change strings for help to match the ones in git-add--interactive.perl.
The strings now represent one entry to translate each rather then two
entries each different only by an ending newline character.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update various codepaths to avoid manually-counted malloc().
* jk/tighten-alloc: (22 commits)
ewah: convert to REALLOC_ARRAY, etc
convert ewah/bitmap code to use xmalloc
diff_populate_gitlink: use a strbuf
transport_anonymize_url: use xstrfmt
git-compat-util: drop mempcpy compat code
sequencer: simplify memory allocation of get_message
test-path-utils: fix normalize_path_copy output buffer size
fetch-pack: simplify add_sought_entry
fast-import: simplify allocation in start_packfile
write_untracked_extension: use FLEX_ALLOC helper
prepare_{git,shell}_cmd: use argv_array
use st_add and st_mult for allocation size computation
convert trivial cases to FLEX_ARRAY macros
use xmallocz to avoid size arithmetic
convert trivial cases to ALLOC_ARRAY
convert manual allocations to argv_array
argv-array: add detach function
add helpers for allocating flex-array structs
harden REALLOC_ARRAY and xcalloc against size_t overflow
tree-diff: catch integer overflow in combine_diff_path allocation
...
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:
1. It automatically checks the array-size multiplication
for overflow.
2. It always uses sizeof(*array) for the element-size,
so that it can never go out of sync with the declared
type of the array.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The underlying machinery used by "ls-files -o" and other commands
have been taught not to create empty submodule ref cache for a
directory that is not a submodule. This removes a ton of wasted
CPU cycles.
* jk/ref-cache-non-repository-optim:
resolve_gitlink_ref: ignore non-repository paths
clean: make is_git_repository a public function
We have always had is_git_directory(), for looking at a
specific directory to see if it contains a git repo. In
0179ca7 (clean: improve performance when removing lots of
directories, 2015-06-15), we added is_git_repository() which
checks for a non-bare repository by looking at its ".git"
entry.
However, the fix in 0179ca7 needs to be applied other
places, too. Let's make this new helper globally available.
We need to give it a better name, though, to avoid confusion
with is_git_directory(). This patch does that, documents
both functions with a comment to reduce confusion, and
removes the clean-specific references in the comments.
Based-on-a-patch-by: Andreas Krey <a.krey@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strbuf_getline() interface allows a byte other than LF or NUL as
the line terminator, but this is only because I wrote these
codepaths anticipating that there might be a value other than NUL
and LF that could be useful when I introduced line_termination long
time ago. No useful caller that uses other value has emerged.
By now, it is clear that the interface is overly broad without a
good reason. Many codepaths have hardcoded preference to read
either LF terminated or NUL terminated records from their input, and
then call strbuf_getline() with LF or NUL as the third parameter.
This step introduces two thin wrappers around strbuf_getline(),
namely, strbuf_getline_lf() and strbuf_getline_nul(), and
mechanically rewrites these call sites to call either one of
them. The changes contained in this patch are:
* introduction of these two functions in strbuf.[ch]
* mechanical conversion of all callers to strbuf_getline() with
either '\n' or '\0' as the third parameter to instead call the
respective thin wrapper.
After this step, output from "git grep 'strbuf_getline('" would
become a lot smaller. An interim goal of this series is to make
this an empty set, so that we can have strbuf_getline_crlf() take
over the shorter name strbuf_getline().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When working with paths in strbufs, we frequently want to
ensure that a directory contains a trailing slash before
appending to it. We can shorten this code (and make the
intent more obvious) by calling strbuf_complete.
Most of these cases are trivially identical conversions, but
there are two things to note:
- in a few cases we did not check that the strbuf is
non-empty (which would lead to an out-of-bounds memory
access). These were generally not triggerable in
practice, either from earlier assertions, or typically
because we would have just fed the strbuf to opendir(),
which would choke on an empty path.
- in a few cases we indexed the buffer with "original_len"
or similar, rather than the current sb->len, and it is
not immediately obvious from the diff that they are the
same. In all of these cases, I manually verified that
the strbuf does not change between the assignment and
the strbuf_complete call.
This does not convert cases which look like:
if (sb->len && !is_dir_sep(sb->buf[sb->len - 1]))
strbuf_addch(sb, '/');
as those are obviously semantically different. Some of these
cases arguably should be doing that, but that is out of
scope for this change, which aims purely for cleanup with no
behavior change (and at least it will make such sites easier
to find and examine in the future, as we can grep for
strbuf_complete).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clean" uses resolve_gitlink_ref() to check for the presence of
nested git repositories, but it has the drawback of creating a
ref_cache entry for every directory that should potentially be
cleaned. The linear search through the ref_cache list causes a massive
performance hit for large number of directories.
Modify clean.c:remove_dirs to use setup.c:is_git_directory and
setup.c:read_gitfile_gently instead.
Both these functions will open files and parse contents when they find
something that looks like a git repository. This is ok from a
performance standpoint since finding repository candidates should be
comparatively rare.
Using is_git_directory and read_gitfile_gently should give a more
standardized check for what is and what isn't a git repository but
also gives three behavioral changes.
The first change is that we will now detect and avoid cleaning empty
nested git repositories (only init run). This is desirable.
Second, we will no longer die when cleaning a file named ".git" with
garbage content (it will be cleaned instead). This is also desirable.
The last change is that we will detect and avoid cleaning empty bare
repositories that have been placed in a directory named ".git". This
is not desirable but should have no real user impact since we already
fail to clean non-empty bare repositories in the same scenario. This
is thus deemed acceptable.
On top of this we add some extra precautions. If read_gitfile_gently
fails to open the git file, read the git file or verify the path in
the git file we assume that the path with the git file is a valid
repository and avoid cleaning.
Update t7300 to reflect these changes in behavior.
The time to clean an untracked directory containing 100000 sub
directories went from 61s to 1.7s after this change.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Erik Elfström <erik.elfstrom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* rs/janitorial:
dir: remove unused variable sb
clean: remove unused variable buf
use file_exists() to check if a file exists in the worktree
"git clean pathspec..." tried to lstat(2) and complain even for
paths outside the given pathspec.
* dt/clean-pathspec-filter-then-lstat:
clean: only lstat files in pathspec
Even though "git clean" takes pathspec to limit the part of the
working tree to be cleaned, it checked the paths it encounters
during its directory traversal with lstat(2), before checking if
the path is within the pathspec.
Ignore paths outside pathspec and proceed without checking with
lstat(2). Even if such a path is unreadable due to e.g. EPERM,
"git clean" should not care.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The prompt string "remove?" used when "git clean -i" asks the user
if a path should be removed was localizable, but the code always
expects a substring of "yes" to tell it to go ahead. Always show
[y/N] as part of this prompt to hint that the answer is not (yet)
localized.
* ja/clean-confirm-i18n:
Add hint interactive cleaning
For translators, specify that a [y/N] reply is needed.
Also capitalize the first word in the prompt, as all the other
interactive prompts from this command are capitalized.
Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally the color-parsing function was used only for
config variables. It made sense to pass the variable name so
that the die() message could be something like:
$ git -c color.branch.plain=bogus branch
fatal: bad color value 'bogus' for variable 'color.branch.plain'
These days we call it in other contexts, and the resulting
error messages are a little confusing:
$ git log --pretty='%C(bogus)'
fatal: bad color value 'bogus' for variable '--pretty format'
$ git config --get-color foo.bar bogus
fatal: bad color value 'bogus' for variable 'command line'
This patch teaches color_parse to complain only about the
value, and then return an error code. Config callers can
then propagate that up to the config parser, which mentions
the variable name. Other callers can provide a custom
message. After this patch these three cases now look like:
$ git -c color.branch.plain=bogus branch
error: invalid color value: bogus
fatal: unable to parse 'color.branch.plain' from command-line config
$ git log --pretty='%C(bogus)'
error: invalid color value: bogus
fatal: unable to parse --pretty format
$ git config --get-color foo.bar bogus
error: invalid color value: bogus
fatal: unable to parse default color value
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue where ae021d87 (use skip_prefix to avoid magic numbers) left off
and use skip_prefix() in more places for determining the lengths of prefix
strings to avoid using dependent constants and other indirect methods.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Explicitly state that menu_item functions like clean_cmd don't take
any arguments by using void instead of an empty parameter list.
Found using gcc -Wstrict-prototypes.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use xcalloc() instead of xmalloc() followed by memset() to allocate
and zero out memory because it's shorter and avoids duplicating the
function parameters.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eradicate mistaken use of "nor" (that is, essentially "nor" used
not in "neither A nor B" ;-)) from in-code comments, command output
strings, and documentations.
* jl/nor-or-nand-and:
code and test: fix misuses of "nor"
comments: fix misuses of "nor"
contrib: fix misuses of "nor"
Documentation: fix misuses of "nor"
"git clean -d pathspec" did not use the given pathspec correctly
and ended up cleaning too much.
* jk/clean-d-pathspec:
clean: simplify dir/not-dir logic
clean: respect pathspecs with "-d"
"git clean -d pathspec" did not use the given pathspec correctly
and ended up cleaning too much.
* jk/clean-d-pathspec:
clean: simplify dir/not-dir logic
clean: respect pathspecs with "-d"
When we get a list of paths from read_directory, we further
prune it to create the final list of items to remove. The
code paths for directories and non-directories repeat the
same "add to list" code.
This patch restructures the code so that we don't repeat
ourselves. Also, by following a "if (condition) continue"
pattern like the pathspec check above, it makes it more
obvious that the conditional is about excluding directories
under certain circumstances.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-clean uses read_directory to fill in a `struct dir` with
potential hits. However, read_directory does not actually
check against our pathspec. It uses a simplified version
that may turn up false positives. As a result, we need to
check that any hits match our pathspec. We do so reliably
for non-directories. For directories, if "-d" is not given
we check that the pathspec matched exactly (i.e., we are
even stricter, and require an explicit "git clean foo" to
clean "foo/"). But if "-d" is given, rather than relaxing
the exact match to allow a recursive match, we do not check
the pathspec at all.
This regression was introduced in 113f10f (Make git-clean a
builtin, 2007-11-11).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shrink lifetime of variables by moving their definitions to an
inner scope where appropriate.
* ep/varscope:
builtin/gc.c: reduce scope of variables
builtin/fetch.c: reduce scope of variable
builtin/commit.c: reduce scope of variables
builtin/clean.c: reduce scope of variable
builtin/blame.c: reduce scope of variables
builtin/apply.c: reduce scope of variables
bisect.c: reduce scope of variable
cmd_clean() has the exact same code of index_name_is_other(). Reduce
code duplication.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This instance was left out when many match_pathspec() call sites that
take input from dir_entry were converted to dir_path_match() because
it passed a path with the trailing slash stripped out to match_pathspec()
while the others did not. Stripping for all call sites back then would
be a regression because match_pathspec() did not know how to match
pathspec foo/ against _directory_ foo (the stripped version of path
"foo/").
match_pathspec() knows how to do it now. And dir_path_match() strips
the trailing slash also. Use the new function, because the stripping
code is removed in the next patch.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch activates the DO_MATCH_DIRECTORY code in m_p_i(), which
makes "git diff HEAD submodule/" and "git diff HEAD submodule" produce
the same output. Previously only the version without trailing slash
returns the difference (if any).
That's the effect of new ce_path_match(). dir_path_match() is not
executed by the new tests. And it should not introduce regressions.
Previously if path "dir/" is passed in with pathspec "dir/", they
obviously match. With new dir_path_match(), the path becomes
_directory_ "dir" vs pathspec "dir/", which is not executed by the old
code path in m_p_i(). The new code path is executed and produces the
same result.
The other case is pathspec "dir" and path "dir/" is now turned to
"dir" (with DO_MATCH_DIRECTORY). Still the same result before or after
the patch.
So why change? Because of the next patch about clean.c.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A long time ago, for some reason I was not happy with
match_pathspec(). I created a better version, match_pathspec_depth()
that was suppose to replace match_pathspec()
eventually. match_pathspec() has finally been gone since 6 months
ago. Use the shorter name for match_pathspec_depth().
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git mv A B" when moving a submodule A does "the right thing",
inclusing relocating its working tree and adjusting the paths in
the .gitmodules file.
* jl/submodule-mv: (53 commits)
rm: delete .gitmodules entry of submodules removed from the work tree
mv: update the path entry in .gitmodules for moved submodules
submodule.c: add .gitmodules staging helper functions
mv: move submodules using a gitfile
mv: move submodules together with their work trees
rm: do not set a variable twice without intermediate reading.
t6131 - skip tests if on case-insensitive file system
parse_pathspec: accept :(icase)path syntax
pathspec: support :(glob) syntax
pathspec: make --literal-pathspecs disable pathspec magic
pathspec: support :(literal) syntax for noglob pathspec
kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
parse_pathspec: make sure the prefix part is wildcard-free
rename field "raw" to "_raw" in struct pathspec
tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
remove match_pathspec() in favor of match_pathspec_depth()
remove init_pathspec() in favor of parse_pathspec()
remove diff_tree_{setup,release}_paths
convert common_prefix() to use struct pathspec
...
Convert most uses of OPT_BOOLEAN/OPTION_BOOLEAN that can use
OPT_BOOL/OPTION_BOOLEAN which have much saner semantics, and turn
remaining ones into OPT_SET_INT, OPT_COUNTUP, etc. as necessary.
* sb/parseopt-boolean-removal:
revert: use the OPT_CMDMODE for parsing, reducing code
checkout-index: fix negations of even numbers of -n
config parsing options: allow one flag multiple times
hash-object: replace stdin parsing OPT_BOOLEAN by OPT_COUNTUP
branch, commit, name-rev: ease up boolean conditions
checkout: remove superfluous local variable
log, format-patch: parsing uses OPT__QUIET
Replace deprecated OPT_BOOLEAN by OPT_BOOL
Remove deprecated OPTION_BOOLEAN for parsing arguments
This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN,
2011-09-27). All occurrences of the respective variables have
been reviewed and none of them relied on the counting up mechanism,
but all of them were using the variable as a true boolean.
This patch does not change semantics of any command intentionally.
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document for interactive git-clean says: "You also could say `c` or
`clean` above as long as the choice is unique". But it's not true,
because only hotkey `c` and full match (`clean`) could work.
Implement partial matching via find_unique function to make the
document right.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add "interactive" mode to "git clean".
The early part to refactor relative path related helper functions
looked sensible.
* jx/clean-interactive:
test: run testcases with POSIX absolute paths on Windows
test: add t7301 for git-clean--interactive
git-clean: add documentation for interactive git-clean
git-clean: add ask each interactive action
git-clean: add select by numbers interactive action
git-clean: add filter by pattern interactive action
git-clean: use a git-add-interactive compatible UI
git-clean: add colors to interactive git-clean
git-clean: show items of del_list in columns
git-clean: add support for -i/--interactive
git-clean: refactor git-clean into two phases
write_name{_quoted_relative,}(): remove redundant parameters
quote_path_relative(): remove redundant parameter
quote.c: substitute path_relative with relative_path
path.c: refactor relative_path(), not only strip prefix
test: add test cases for relative_path