Граф коммитов

52397 Коммитов

Автор SHA1 Сообщение Дата
Eric Sunshine 5238710eb4 t/chainlint: add chainlint "basic" test cases
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection. The
heuristics handle a range of stylistic variations in existing tests
(evolved over the years), however, they are still best-guesses. As such,
it is possible for future changes to accidentally break assumptions upon
which the heuristics are based. Protect against this possibility by
adding tests which check the linter itself for correctness.

In addition to protecting against regressions, these tests help document
(for humans) expected behavior, which is important since the linter's
implementation language ('sed') does not necessarily lend itself to easy
comprehension.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-17 09:15:14 -07:00
Eric Sunshine 803394459d t/Makefile: add machinery to check correctness of chainlint.sed
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection.
Although the heuristics work well, they are still best-guesses and
future changes could accidentally break assumptions upon which they are
based. To protect against this possibility, tests checking correctness
of the linter itself will be added. As preparation, add a new makefile
"check-chainlint" target and associated machinery.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-17 09:15:14 -07:00
Eric Sunshine 878f988350 t/test-lib: teach --chain-lint to detect broken &&-chains in subshells
The --chain-lint option detects broken &&-chains by forcing the test to
exit early (as the very first step) with a sentinel value. If that
sentinel is the test's overall exit code, then the &&-chain is intact;
if not, then the chain is broken. Unfortunately, this detection does not
extend to &&-chains within subshells even when the subshell itself is
properly linked into the outer &&-chain.

Address this shortcoming by feeding the body of the test to a
lightweight "linter" which can peer inside subshells and identify broken
&&-chains by pure textual inspection. Although the linter does not
actually parse shell scripts, it has enough knowledge of shell syntax to
reliably deal with formatting style variations (as evolved over the
years) and to avoid being fooled by non-shell content (such as inside
here-docs and multi-line strings). It recognizes modern subshell
formatting:

    statement1 &&
    (
        statement2 &&
        statement3
    ) &&
    statement4

as well as old-style:

    statement1 &&
    (statement2 &&
     statement3) &&
    statement4

Heuristics are employed to properly identify the extent of a subshell
formatted in the old-style since a number of legitimate constructs may
superficially appear to close the subshell even though they don't. For
example, it understands that neither "x=$(command)" nor "case $x in *)"
end a subshell, despite the ")" at the end of line.

Due to limitations of the tool used ('sed') and its inherent
line-by-line processing, only subshells one level deep are handled, as
well as one-liner subshells one level below that. Subshells deeper than
that or multi-line subshells at level two are passed through as-is, thus
&&-chains in their bodies are not checked.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-17 09:15:14 -07:00
SZEDER Gábor 9500526284 t5608: fix broken &&-chain
This was missed by the previous clean-ups.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-17 09:12:59 -07:00
Jules Maselbas 636f3d7ac5 send-email: fix tls AUTH when sending batch
The variable smtp_encryption must keep it's value between two batches.
Otherwise the authentication is skipped after the first batch.

Signed-off-by: Jules Maselbas <jules.maselbas@grenoble-inp.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 15:02:36 -07:00
Eric Sunshine a0a630192d t/check-non-portable-shell: detect "FOO=bar shell_func"
One-shot environment variable assignments, such as 'FOO' in
"FOO=bar cmd", exist only during the invocation of 'cmd'. However, if
'cmd' happens to be a shell function, then 'FOO' is assigned in the
executing shell itself, and that assignment remains until the process
exits (unless explicitly unset). Since this side-effect of
"FOO=bar shell_func" is unlikely to be intentional, detect and report
such usage.

To distinguish shell functions from other commands, perform a pre-scan
of shell scripts named as input, gleaning a list of function names by
recognizing lines of the form (loosely matching whitespace):

    shell_func () {

and later report suspect lines of the form (loosely matching quoted
values):

    FOO=bar [BAR=foo ...] shell_func

Also take care to stitch together incomplete lines (those ending with
"\") since suspect invocations may be split over multiple lines:

    FOO=bar BAR=foo \
    shell_func

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:55:01 -07:00
Eric Sunshine c433600593 t/check-non-portable-shell: make error messages more compact
Error messages emitted by this linting script are long and noisy,
consisting of several sections:

    <test-script>:<line#>: error: <explanation>: <failed-shell-text>

The line of failed shell text, usually coming from within a test body,
is often indented by one or two TABs, with the result that the actual
(important) text is separated from <explanation> by a good deal of empty
space. This can make for a difficult read, especially on typical
80-column terminals.

Make the messages more compact and perhaps a bit easier to digest by
folding out the leading whitespace from <failed-shell-text>.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:55:01 -07:00
Eric Sunshine ef2d2accef t/check-non-portable-shell: stop being so polite
Error messages emitted by this linting script are long and noisy,
consisting of several sections:

    <test-script>:<line#>: error: <explanation>: <failed-shell-text>

Many problem explanations ask the reader to "please" use a suggested
alternative, however, such politeness is unnecessary and just adds to
the noise and length of the line, so drop "please" to make the message a
bit more concise.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:55:01 -07:00
Eric Sunshine 079b087c8e t6046/t9833: fix use of "VAR=VAL cmd" with a shell function
Unlike "FOO=bar cmd" one-shot environment variable assignments
which exist only for the invocation of 'cmd', those assigned by
"FOO=bar shell_func" exist within the running shell and continue to
do so until the process exits (or are explicitly unset). It is
unlikely that this behavior was intended by the test author.

In these particular tests, the "FOO=bar shell_func" invocations are
already in subshells, so the assignments don't last too long, don't
appear to harm subsequent commands in the same subshells, and don't
affect other tests in the same scripts, however, the usage is
nevertheless misleading and poor practice, so fix the tests to assign
and export the environment variables in the usual fashion.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:55:01 -07:00
Junio C Hamano f44a7442f6 Merge branch 'jc/t3404-one-shot-export-fix' into es/test-lint-one-shot-export
* jc/t3404-one-shot-export-fix:
  t3404: fix use of "VAR=VAL cmd" with a shell function
2018-07-16 14:54:55 -07:00
Jonathan Tan 42cc7485a2 negotiator/skipping: skip commits during fetch
Introduce a new negotiation algorithm used during fetch that skips
commits in an effort to find common ancestors faster. The skips grow
similarly to the Fibonacci sequence as the commit walk proceeds further
away from the tips. The skips may cause unnecessary commits to be
included in the packfile, but the negotiation step typically ends more
quickly.

Usage of this algorithm is guarded behind the configuration flag
fetch.negotiationAlgorithm.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:51:12 -07:00
Eric Sunshine f9f7c116a3 t9119: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine cff4243db9 t9000-t9999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine e974e06de0 t7000-t7999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine c8ce3763ff t6000-t6999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine 51b85471af t5000-t5999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine f957f03b60 t4000-t4999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine b6c32f63f3 t3030: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine 3ea6737993 t3000-t3999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine 2c2d0f9f47 t2000-t2999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine f2deabfcb6 t1000-t1999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine 75651fd783 t0000-t0999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine 794165cb17 t9814: simplify convoluted check that command correctly errors out
This test uses a convoluted method to verify that "p4 help" errors
out when asked for help about an unknown command. In doing so, it
intentionally breaks the &&-chain. Simplify by employing the typical
"! command" idiom and a normal &&-chain instead.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine be8c48d4c4 t9001: fix broken "invoke hook" test
This test has been dysfunctional since it was added by 6489660b4b
(send-email: support validate hook, 2017-05-12), however, the problem
went unnoticed due to a broken &&-chain late in the test.

The test wants to verify that a non-zero exit code from the
'sendemail-validate' hook causes git-send-email to abort with a
particular error message. A command which is expected to fail should be
run with 'test_must_fail', however, the test neglects to do so.

Fix this problem, as well as the broken &&-chain behind which the
problem hid.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine d964def526 t7810: use test_expect_code() instead of hand-rolled comparison
This test manually checks the exit code of git-grep for a particular
value. In doing so, it intentionally breaks the &&-chain. Modernize the
test by taking advantage of test_expect_code() and a normal &&-chain.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine adc73318fe t7400: fix broken "submodule add/reconfigure --force" test
This test has been dysfunctional since it was added by 619acfc78c
(submodule add: extend force flag to add existing repos, 2016-10-06),
however, two problems early in the test went unnoticed due to a broken
&&-chain later in the test.

First, it tries configuring the submodule with repository "bogus-url",
however, "git submodule add" insists that the repository be either an
absolute URL or a relative pathname requiring prefix "./" or "../" (this
is true even with --force), but "bogus-url" does not meet those
criteria, thus the command fails.

Second, it then tries configuring a submodule with a path which is
.gitignore'd, which is disallowed. This restriction can be overridden
with --force, but the test neglects to use that option.

Fix both problems, as well as the broken &&-chain behind which they hid.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Johannes Schindelin ede8d89bb1 vcbuild/README: update to accommodate for missing common-cmds.h
In 60f487ac0e (Remove common-cmds.h, 2018-05-10), we forgot to adjust
this README when removing the common-cmds.h file.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:30:34 -07:00
brian m. carlson 580f0980e1 pretty: switch hard-coded constants to the_hash_algo
Switch several hard-coded constants into expressions based either on
GIT_MAX_HEXSZ or the_hash_algo.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:40 -07:00
brian m. carlson 94b5e093f9 sha1-file: convert constants to uses of the_hash_algo
Convert one use of 20 and several uses of GIT_SHA1_HEXSZ into references
to the_hash_algo.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:40 -07:00
brian m. carlson 2ed1960a77 log-tree: switch GIT_SHA1_HEXSZ to the_hash_algo->hexsz
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson 02afca1ee4 diff: switch GIT_SHA1_HEXSZ to use the_hash_algo
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson b7f20f7204 builtin/merge-recursive: make hash independent
Use GIT_MAX_HEXSZ instead of GIT_SHA1_HEXSZ for an allocation so that it
is sufficiently large.  Switch a comparison to use the_hash_algo to
determine the length of a hex object ID.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson ab47df2d9a builtin/merge: switch to use the_hash_algo
Switch uses of GIT_SHA1_HEXSZ to use the_hash_algo instead.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson 5188eb5d8e builtin/fmt-merge-msg: make hash independent
Convert several uses of GIT_SHA1_HEXSZ into references to the_hash_algo.
Switch other uses into a use of parse_oid_hex and uses of its computed
pointer.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson fe04ccf7ca builtin/update-index: simplify parsing of cacheinfo
Switch from using get_oid_hex to parse_oid_hex to simplify pointer
operations and avoid the need for a hash-related constant.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson 1928c9449e builtin/update-index: convert to using the_hash_algo
Switch from using GIT_SHA1_HEXSZ to the_hash_algo to make the parsing of
the index information hash independent.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson 2ae2e2a1ca refs/files-backend: use the_hash_algo for writing refs
In order to ensure we write the correct amount, use the_hash_algo to
find the correct number of bytes for the current hash.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson 7b38efad5e sha1-name: use the_hash_algo when parsing object names
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson 4b048c917f strbuf: allocate space with GIT_MAX_HEXSZ
In order to be sure we have enough space to use with any hash algorithm,
use GIT_MAX_HEXSZ to allocate space.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson 2770ccbdb2 commit: express tree entry constants in terms of the_hash_algo
Specify these constants in terms of the size of the hash algorithm
currently in use.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson d9cd734990 hex: switch to using the_hash_algo
Instead of using the GIT_SHA1_* constants, switch to using the_hash_algo
to convert object IDs to and from hex format.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson 83e4b7571c tree-walk: replace hard-coded constants with the_hash_algo
Remove the hard-coded 20-based values and replace them with uses of
the_hash_algo.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson 509f6f62a4 cache: update object ID functions for the_hash_algo
Most of our code has been converted to use struct object_id for object
IDs.  However, there are some places that still have not, and there are
a variety of places that compare equivalently sized hashes that are not
object IDs.  All of these hashes are artifacts of the internal hash
algorithm in use, and when we switch to NewHash for object storage, all
of these uses will also switch.

Update the hashcpy, hashclr, and hashcmp functions to use the_hash_algo,
since they are used in a variety of places to copy and manipulate
buffers that need to move data into or out of struct object_id.  This
has the effect of making the corresponding oid* functions use
the_hash_algo as well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:38 -07:00
Jeff King 022d2ac1f3 blame: prefer xsnprintf to strcpy for colors
Our color buffers are all COLOR_MAXLEN, which fits the
largest possible color. So we can never overflow the buffer
by copying an existing color. However, using strcpy() makes
it harder to audit the code-base for calls that _are_
problems. We should use something like xsnprintf(), which
shows the reader that we expect this never to fail (and
provides a run-time assertion if it does, just in case).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 13:59:39 -07:00
Jonathan Tan 8c4cc32689 tag: don't warn if target is missing but promised
deref_tag() prints a warning if the object that a tag refers to does not
exist. However, when a partial clone has an annotated tag from its
promisor remote, but not the object that it refers to, printing a
warning on such a tag is incorrect.

This occurs, for example, when the checkout that happens after a partial
clone causes some objects to be fetched - and as part of the fetch, all
local refs are read. The test included in this patch demonstrates this
situation.

Therefore, do not print a warning in this case.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 12:56:14 -07:00
Jonathan Tan dc0a13f681 revision: tolerate promised targets of tags
In handle_commit(), it is fatal for an annotated tag to point to a
non-existent object. --exclude-promisor-objects should relax this rule
and allow non-existent objects that are promisor objects, but this is
not the case. Update handle_commit() to tolerate this situation.

This was observed when cloning from a repository with an annotated tag
pointing to a blob. The test included in this patch demonstrates this
case.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 12:56:14 -07:00
brian m. carlson ab5e67d751 sequencer: pass absolute GIT_WORK_TREE to exec commands
The sequencer currently passes GIT_DIR, but not GIT_WORK_TREE, to exec
commands.  In that configuration, we assume that whatever directory
we're in is the top level of the work tree, and git rev-parse
--show-toplevel responds accordingly.  However, when we're in a
subdirectory, that isn't correct: we respond with the subdirectory as
the top level, resulting in unexpected behavior.

Ensure that we pass GIT_WORK_TREE as well as GIT_DIR so that git
operations within subdirectories work correctly.

Note that we are guaranteed to have a work tree in this case: the
relevant sequencer functions are called only from revert, cherry-pick,
and rebase--helper; all of these commands require a working tree.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 11:16:45 -07:00
Aaron Schrab 02cfd14920 sequencer: use configured comment character
Use the configured comment character when generating comments about
branches in a todo list.  Failure to honor this configuration causes a
failure to parse the resulting todo list.

Setting core.commentChar to "auto" will not be honored here, and the
previously configured or default value will be used instead. But, since
the todo list will consist of only generated content, there should not
be any non-comment lines beginning with that character.

Signed-off-by: Aaron Schrab <aaron@schrab.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 11:04:56 -07:00
Jeff King 64eb14d310 fsck: downgrade gitmodulesParse default to "info"
We added an fsck check in ed8b10f631 (fsck: check
.gitmodules content, 2018-05-02) as a defense against the
vulnerability from 0383bbb901 (submodule-config: verify
submodule names as paths, 2018-04-30). With the idea that
up-to-date hosting sites could protect downstream unpatched
clients that fetch from them.

As part of that defense, we reject any ".gitmodules" entry
that is not syntactically valid. The theory is that if we
cannot even parse the file, we cannot accurately check it
for vulnerabilities. And anybody with a broken .gitmodules
file would eventually want to know anyway.

But there are a few reasons this is a bad tradeoff in
practice:

 - for this particular vulnerability, the client has to be
   able to parse the file. So you cannot sneak an attack
   through using a broken file, assuming the config parsers
   for the process running fsck and the eventual victim are
   functionally equivalent.

 - a broken .gitmodules file is not necessarily a problem.
   Our fsck check detects .gitmodules in _any_ tree, not
   just at the root. And the presence of a .gitmodules file
   does not necessarily mean it will be used; you'd have to
   also have gitlinks in the tree. The cgit repository, for
   example, has a file named .gitmodules from a
   pre-submodule attempt at sharing code, but does not
   actually have any gitlinks.

 - when the fsck check is used to reject a push, it's often
   hard to work around. The pusher may not have full control
   over the destination repository (e.g., if it's on a
   hosting server, they may need to contact the hosting
   site's support). And the broken .gitmodules may be too
   far back in history for rewriting to be feasible (again,
   this is an issue for cgit).

So we're being unnecessarily restrictive without actually
improving the security in a meaningful way. It would be more
convenient to downgrade this check to "info", which means
we'd still comment on it, but not reject a push. Site admins
can already do this via config, but we should ship sensible
defaults.

There are a few counterpoints to consider in favor of
keeping the check as an error:

 - the first point above assumes that the config parsers for
   the victim and the fsck process are equivalent. This is
   pretty true now, but as time goes on will become less so.
   Hosting sites are likely to upgrade their version of Git,
   whereas vulnerable clients will be stagnant (if they did
   upgrade, they'd cease to be vulnerable!). So in theory we
   may see drift over time between what two config parsers
   will accept.

   In practice, this is probably OK. The config format is
   pretty established at this point and shouldn't change a
   lot. And the farther we get from the announcement of the
   vulnerability, the less interesting this extra layer of
   protection becomes. I.e., it was _most_ valuable on day
   0, when everybody's client was still vulnerable and
   hosting sites could protect people. But as time goes on
   and people upgrade, the population of vulnerable clients
   becomes smaller and smaller.

 - In theory this could protect us from other
   vulnerabilities in the future. E.g., .gitmodules are the
   only way for a malicious repository to feed data to the
   config parser, so this check could similarly protect
   clients from a future (to-be-found) bug there.

   But that's trading a hypothetical case for real-world
   pain today. If we do find such a bug, the hosting site
   would need to be updated to fix it, too. At which point
   we could figure out whether it's possible to detect
   _just_ the malicious case without hurting existing
   broken-but-not-evil cases.

 - Until recently, we hadn't made any restrictions on
   .gitmodules content. So now in tightening that we're
   hitting cases where certain things used to work, but
   don't anymore. There's some moderate pain now. But as
   time goes on, we'll see more (and more varied) cases that
   will make tightening harder in the future. So there's
   some argument for putting rules in place _now_, before
   users grow more cases that violate them.

   Again, this is trading pain now for hypothetical benefit
   in the future. And if we try hard in the future to keep
   our tightening to a minimum (i.e., rejecting true
   maliciousness without hurting broken-but-not-evil repos),
   then that reduces even the hypothetical benefit.

Considering both sets of arguments, it makes sense to loosen
this check for now.

Note that we have to tweak the test in t7415 since fsck will
no longer consider this a fatal error. But we still check
that it reports the warning, and that we don't get the
spurious error from the config code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 10:57:23 -07:00
Jeff King 0d68764d94 fsck: split ".gitmodules too large" error from parse failure
Since ed8b10f631 (fsck: check .gitmodules content,
2018-05-02), we'll report a gitmodulesParse error for two
conditions:

  - a .gitmodules entry is not syntactically valid

  - a .gitmodules entry is larger than core.bigFileThreshold

with the intent that we can detect malicious files and
protect downstream clients. E.g., from the issue in
0383bbb901 (submodule-config: verify submodule names as
paths, 2018-04-30).

But these conditions are actually quite different with
respect to that bug:

 - a syntactically invalid file cannot trigger the problem,
   as the victim would barf before hitting the problematic
   code

 - a too-big .gitmodules _can_ trigger the problem. Even
   though it is obviously silly to have a 500MB .gitmodules
   file, the submodule code will happily parse it if you
   have enough memory.

So it may be reasonable to configure their severity
separately. Let's add a new class for the "too large" case
to allow that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 10:57:22 -07:00