mirror of https://github.com/microsoft/git.git
cat-file: add option '-Z' that delimits input and output with NUL
In db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with `-z`,
2022-07-22), we have introduced a new mode to read the input via
NUL-delimited records instead of newline-delimited records. This allows
the user to query for revisions that have newlines in their path
component. While unusual, such queries are perfectly valid and thus it
is clear that we should be able to support them properly.

Unfortunately, the commit only changed the input to be NUL-delimited,
but didn't change the output at the same time. While this is fine for
queries that are processed successfully, it is less so for queries that
aren't. In the case of missing commits for example the result can become
entirely unparsable:

```
$ printf "7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\n1234567890\n\n\commit000" | git cat-file --batch -z
7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10
1234567890

commit missing
```

This is of course a crafted query that is intentionally gaming the
deficiency, but more benign queries that contain newlines would have
similar problems.

Ideally, we should have also changed the output to be NUL-delimited when
`-z` is specified to avoid this problem. As the input is NUL-delimited,
it is clear that the output in this case cannot ever contain NUL
characters by itself. Furthermore, Git does not allow NUL characters in
revisions anyway, further stressing the point that using NUL-delimited
output is safe. The only exception is of course the object data itself,
but as git-cat-file(1) prints the size of the object data clients should
read until that specified size has been consumed.

But even though `-z` has only been introduced a few releases ago in Git
v2.38.0, changing the output format retroactively to also NUL-delimit
output would be a backwards incompatible change. And while one could
make the argument that the output is inherently broken already, we need
to assume that there are existing users out there that use it just fine
given that revisions containing newlines are quite exotic.

Instead, introduce a new option `-Z` that switches to NUL-delimited
input and output. While this new option could arguably only switch the
output format to be NUL-delimited, the consequence would be that users
have to always specify both `-z` and `-Z` when the input may contain
newlines. On the other hand, if the user knows that there never will be
newlines in the input, they don't have to use either of those options.
There is thus no usecase that would warrant treating input and output
format separately, which is why we instead opt to "do the right thing"
and have `-Z` mean to NUL-terminate both formats.

The old `-z` option is marked as deprecated with a hint that its output
may become unparsable. It is thus hidden both from the synopsis as well
as the command's help output.

Co-authored-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
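
To make the ambiguity described in the commit message concrete, here is a minimal shell sketch that is not part of the commit. It does not invoke Git: the two `printf` calls only simulate the reply to a single query for a missing revision named `HEAD:new<LF>line`, once with newline-terminated output (as under `-z`) and once with NUL-terminated output (as under `-Z`). The helper name `count_records` is hypothetical.

```shell
#!/bin/sh
# count_records <nl|nul>: count the record terminators read from stdin.
# Hypothetical helper, for illustration only.
count_records () {
	case "$1" in
	nl)  tr -cd '\n' ;;
	nul) tr -cd '\000' ;;
	esac | wc -c | tr -d ' '
}

# Simulated "-z" style reply: ONE query, but the echoed revision name
# contains a newline, so a line-based reader sees two records.
printf 'HEAD:new\nline missing\n' | count_records nl

# Simulated "-Z" style reply: the same answer terminated by NUL, so a
# NUL-based reader sees exactly one record.
printf 'HEAD:new\nline missing\000' | count_records nul
```

A reader that splits on newlines counts two records for one query, while a reader that splits on NUL counts one, which is the property the new option relies on.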
Parent: 3217f52a49
Commit: f79e18849b
@@ -14,7 +14,7 @@ SYNOPSIS
 'git cat-file' (-t | -s) [--allow-unknown-type] <object>
 'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects]
 	     [--buffer] [--follow-symlinks] [--unordered]
-	     [--textconv | --filters] [-z]
+	     [--textconv | --filters] [-Z]
 'git cat-file' (--textconv | --filters)
 	     [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]
 
@@ -243,10 +243,16 @@ respectively print:
 	/etc/passwd
 --
 
+-Z::
+	Only meaningful with `--batch`, `--batch-check`, or
+	`--batch-command`; input and output is NUL-delimited instead of
+	newline-delimited.
+
 -z::
 	Only meaningful with `--batch`, `--batch-check`, or
 	`--batch-command`; input is NUL-delimited instead of
-	newline-delimited.
+	newline-delimited. This option is deprecated in favor of
+	`-Z` as the output can otherwise be ambiguous.
 
 
 OUTPUT
@@ -384,6 +390,11 @@ notdir SP <size> LF
 is printed when, during symlink resolution, a file is used as a
 directory name.
 
+Alternatively, when `-Z` is passed, the line feeds in any of the above examples
+are replaced with NUL terminators. This ensures that output will be parsable if
+the output itself would contain a linefeed and is thus recommended for
+scripting purposes.
+
 CAVEATS
 -------
 

@@ -43,6 +43,7 @@ struct batch_options {
 	int unordered;
 	int transform_mode; /* may be 'w' or 'c' for --filters or --textconv */
 	char input_delim;
+	char output_delim;
 	const char *format;
 };
 
@@ -437,11 +438,12 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d
 	}
 }
 
-static void print_default_format(struct strbuf *scratch, struct expand_data *data)
+static void print_default_format(struct strbuf *scratch, struct expand_data *data,
+				 struct batch_options *opt)
 {
-	strbuf_addf(scratch, "%s %s %"PRIuMAX"\n", oid_to_hex(&data->oid),
+	strbuf_addf(scratch, "%s %s %"PRIuMAX"%c", oid_to_hex(&data->oid),
 		    type_name(data->type),
-		    (uintmax_t)data->size);
+		    (uintmax_t)data->size, opt->output_delim);
 }
 
 /*
@@ -470,8 +472,8 @@ static void batch_object_write(const char *obj_name,
 						       &data->oid, &data->info,
 						       OBJECT_INFO_LOOKUP_REPLACE);
 		if (ret < 0) {
-			printf("%s missing\n",
-			       obj_name ? obj_name : oid_to_hex(&data->oid));
+			printf("%s missing%c",
+			       obj_name ? obj_name : oid_to_hex(&data->oid), opt->output_delim);
 			fflush(stdout);
 			return;
 		}
@@ -492,17 +494,17 @@ static void batch_object_write(const char *obj_name,
 	strbuf_reset(scratch);
 
 	if (!opt->format) {
-		print_default_format(scratch, data);
+		print_default_format(scratch, data, opt);
 	} else {
 		strbuf_expand(scratch, opt->format, expand_format, data);
-		strbuf_addch(scratch, '\n');
+		strbuf_addch(scratch, opt->output_delim);
 	}
 
 	batch_write(opt, scratch->buf, scratch->len);
 
 	if (opt->batch_mode == BATCH_MODE_CONTENTS) {
 		print_object_or_die(opt, data);
-		batch_write(opt, "\n", 1);
+		batch_write(opt, &opt->output_delim, 1);
 	}
 }
 
@@ -520,22 +522,25 @@ static void batch_one_object(const char *obj_name,
 	if (result != FOUND) {
 		switch (result) {
 		case MISSING_OBJECT:
-			printf("%s missing\n", obj_name);
+			printf("%s missing%c", obj_name, opt->output_delim);
 			break;
 		case SHORT_NAME_AMBIGUOUS:
-			printf("%s ambiguous\n", obj_name);
+			printf("%s ambiguous%c", obj_name, opt->output_delim);
 			break;
 		case DANGLING_SYMLINK:
-			printf("dangling %"PRIuMAX"\n%s\n",
-			       (uintmax_t)strlen(obj_name), obj_name);
+			printf("dangling %"PRIuMAX"%c%s%c",
+			       (uintmax_t)strlen(obj_name),
+			       opt->output_delim, obj_name, opt->output_delim);
 			break;
 		case SYMLINK_LOOP:
-			printf("loop %"PRIuMAX"\n%s\n",
-			       (uintmax_t)strlen(obj_name), obj_name);
+			printf("loop %"PRIuMAX"%c%s%c",
+			       (uintmax_t)strlen(obj_name),
+			       opt->output_delim, obj_name, opt->output_delim);
 			break;
 		case NOT_DIR:
-			printf("notdir %"PRIuMAX"\n%s\n",
-			       (uintmax_t)strlen(obj_name), obj_name);
+			printf("notdir %"PRIuMAX"%c%s%c",
+			       (uintmax_t)strlen(obj_name),
+			       opt->output_delim, obj_name, opt->output_delim);
 			break;
 		default:
 			BUG("unknown get_sha1_with_context result %d\n",
@@ -547,9 +552,9 @@ static void batch_one_object(const char *obj_name,
 	}
 
 	if (ctx.mode == 0) {
-		printf("symlink %"PRIuMAX"\n%s\n",
+		printf("symlink %"PRIuMAX"%c%s%c",
 		       (uintmax_t)ctx.symlink_path.len,
-		       ctx.symlink_path.buf);
+		       opt->output_delim, ctx.symlink_path.buf, opt->output_delim);
 		fflush(stdout);
 		return;
 	}
@@ -913,6 +918,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 	struct batch_options batch = {0};
 	int unknown_type = 0;
 	int input_nul_terminated = 0;
+	int nul_terminated = 0;
 
 	const char * const usage[] = {
 		N_("git cat-file <type> <object>"),
@@ -920,7 +926,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 		N_("git cat-file (-t | -s) [--allow-unknown-type] <object>"),
 		N_("git cat-file (--batch | --batch-check | --batch-command) [--batch-all-objects]\n"
 		   "             [--buffer] [--follow-symlinks] [--unordered]\n"
-		   "             [--textconv | --filters] [-z]"),
+		   "             [--textconv | --filters] [-Z]"),
 		N_("git cat-file (--textconv | --filters)\n"
 		   "             [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]"),
 		NULL
@@ -949,7 +955,9 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 			N_("like --batch, but don't emit <contents>"),
 			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
 			batch_option_callback),
-		OPT_BOOL('z', NULL, &input_nul_terminated, N_("stdin is NUL-terminated")),
+		OPT_BOOL_F('z', NULL, &input_nul_terminated, N_("stdin is NUL-terminated"),
+			PARSE_OPT_HIDDEN),
+		OPT_BOOL('Z', NULL, &nul_terminated, N_("stdin and stdout is NUL-terminated")),
 		OPT_CALLBACK_F(0, "batch-command", &batch, N_("format"),
 			N_("read commands from stdin"),
 			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
@@ -1011,8 +1019,15 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 	else if (input_nul_terminated)
 		usage_msg_optf(_("'%s' requires a batch mode"), usage, options,
 			       "-z");
+	else if (nul_terminated)
+		usage_msg_optf(_("'%s' requires a batch mode"), usage, options,
+			       "-Z");
 
-	batch.input_delim = input_nul_terminated ? '\0' : '\n';
+	batch.input_delim = batch.output_delim = '\n';
+	if (input_nul_terminated)
+		batch.input_delim = '\0';
+	if (nul_terminated)
+		batch.input_delim = batch.output_delim = '\0';
 
 	/* Batch defaults */
 	if (batch.buffer_output < 0)

@@ -89,7 +89,8 @@ done
 for opt in --buffer \
 	--follow-symlinks \
 	--batch-all-objects \
-	-z
+	-z \
+	-Z
 do
 	test_expect_success "usage: bad option combination: $opt without batch mode" '
 		test_incompatible_usage git cat-file $opt &&
@@ -392,17 +393,18 @@ deadbeef
 
 "
 
-batch_output="$hello_sha1 blob $hello_size
-$hello_content
-$commit_sha1 commit $commit_size
-$commit_content
-$tag_sha1 tag $tag_size
-$tag_content
-deadbeef missing
- missing"
+printf "%s\0" \
+	"$hello_sha1 blob $hello_size" \
+	"$hello_content" \
+	"$commit_sha1 commit $commit_size" \
+	"$commit_content" \
+	"$tag_sha1 tag $tag_size" \
+	"$tag_content" \
+	"deadbeef missing" \
+	" missing" >batch_output
 
 test_expect_success '--batch with multiple sha1s gives correct format' '
-	echo "$batch_output" >expect &&
+	tr "\0" "\n" <batch_output >expect &&
 	echo_without_newline "$batch_input" >in &&
 	git cat-file --batch <in >actual &&
 	test_cmp expect actual
@@ -410,11 +412,17 @@ test_expect_success '--batch with multiple sha1s gives correct format' '
 
 test_expect_success '--batch, -z with multiple sha1s gives correct format' '
 	echo_without_newline_nul "$batch_input" >in &&
-	echo "$batch_output" >expect &&
+	tr "\0" "\n" <batch_output >expect &&
 	git cat-file --batch -z <in >actual &&
 	test_cmp expect actual
 '
 
+test_expect_success '--batch, -Z with multiple sha1s gives correct format' '
+	echo_without_newline_nul "$batch_input" >in &&
+	git cat-file --batch -Z <in >actual &&
+	test_cmp batch_output actual
+'
+
 batch_check_input="$hello_sha1
 $tree_sha1
 $commit_sha1
@@ -423,40 +431,55 @@ deadbeef
 
 "
 
-batch_check_output="$hello_sha1 blob $hello_size
-$tree_sha1 tree $tree_size
-$commit_sha1 commit $commit_size
-$tag_sha1 tag $tag_size
-deadbeef missing
- missing"
+printf "%s\0" \
+	"$hello_sha1 blob $hello_size" \
+	"$tree_sha1 tree $tree_size" \
+	"$commit_sha1 commit $commit_size" \
+	"$tag_sha1 tag $tag_size" \
+	"deadbeef missing" \
+	" missing" >batch_check_output
 
 test_expect_success "--batch-check with multiple sha1s gives correct format" '
-	echo "$batch_check_output" >expect &&
+	tr "\0" "\n" <batch_check_output >expect &&
 	echo_without_newline "$batch_check_input" >in &&
 	git cat-file --batch-check <in >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success "--batch-check, -z with multiple sha1s gives correct format" '
-	echo "$batch_check_output" >expect &&
+	tr "\0" "\n" <batch_check_output >expect &&
 	echo_without_newline_nul "$batch_check_input" >in &&
 	git cat-file --batch-check -z <in >actual &&
 	test_cmp expect actual
 '
 
-test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' '
+test_expect_success "--batch-check, -Z with multiple sha1s gives correct format" '
+	echo_without_newline_nul "$batch_check_input" >in &&
+	git cat-file --batch-check -Z <in >actual &&
+	test_cmp batch_check_output actual
+'
+
+test_expect_success FUNNYNAMES 'setup with newline in input' '
 	touch -- "newline${LF}embedded" &&
 	git add -- "newline${LF}embedded" &&
 	git commit -m "file with newline embedded" &&
 	test_tick &&
 
-	printf "HEAD:newline${LF}embedded" >in &&
-	git cat-file --batch-check -z <in >actual &&
+	printf "HEAD:newline${LF}embedded" >in
+'
 
+test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' '
+	git cat-file --batch-check -z <in >actual &&
 	echo "$(git rev-parse "HEAD:newline${LF}embedded") blob 0" >expect &&
 	test_cmp expect actual
 '
 
+test_expect_success FUNNYNAMES '--batch-check, -Z with newline in input' '
+	git cat-file --batch-check -Z <in >actual &&
+	printf "%s\0" "$(git rev-parse "HEAD:newline${LF}embedded") blob 0" >expect &&
+	test_cmp expect actual
+'
+
 batch_command_multiple_info="info $hello_sha1
 info $tree_sha1
 info $commit_sha1
@@ -480,7 +503,13 @@ test_expect_success '--batch-command with multiple info calls gives correct form
 	echo "$batch_command_multiple_info" | tr "\n" "\0" >in &&
 	git cat-file --batch-command --buffer -z <in >actual &&
 
-	test_cmp expect actual
+	test_cmp expect actual &&
+
+	echo "$batch_command_multiple_info" | tr "\n" "\0" >in &&
+	tr "\n" "\0" <expect >expect_nul &&
+	git cat-file --batch-command --buffer -Z <in >actual &&
+
+	test_cmp expect_nul actual
 '
 
 batch_command_multiple_contents="contents $hello_sha1
@@ -490,15 +519,15 @@ contents deadbeef
 flush"
 
 test_expect_success '--batch-command with multiple command calls gives correct format' '
-	cat >expect <<-EOF &&
-	$hello_sha1 blob $hello_size
-	$hello_content
-	$commit_sha1 commit $commit_size
-	$commit_content
-	$tag_sha1 tag $tag_size
-	$tag_content
-	deadbeef missing
-	EOF
+	printf "%s\0" \
+		"$hello_sha1 blob $hello_size" \
+		"$hello_content" \
+		"$commit_sha1 commit $commit_size" \
+		"$commit_content" \
+		"$tag_sha1 tag $tag_size" \
+		"$tag_content" \
+		"deadbeef missing" >expect_nul &&
+	tr "\0" "\n" <expect_nul >expect &&
 
 	echo "$batch_command_multiple_contents" >in &&
 	git cat-file --batch-command --buffer <in >actual &&
@@ -508,7 +537,12 @@ test_expect_success '--batch-command with multiple command calls gives correct f
 	echo "$batch_command_multiple_contents" | tr "\n" "\0" >in &&
 	git cat-file --batch-command --buffer -z <in >actual &&
 
-	test_cmp expect actual
+	test_cmp expect actual &&
+
+	echo "$batch_command_multiple_contents" | tr "\n" "\0" >in &&
+	git cat-file --batch-command --buffer -Z <in >actual &&
+
+	test_cmp expect_nul actual
 '
 
 test_expect_success 'setup blobs which are likely to delta' '
@@ -848,6 +882,13 @@ test_expect_success 'git cat-file --batch-check --follow-symlinks works for brok
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch-check --follow-symlinks -Z works for broken in-repo, same-dir links' '
+	printf "HEAD:broken-same-dir-link\0" >in &&
+	printf "dangling 25\0HEAD:broken-same-dir-link\0" >expect &&
+	git cat-file --batch-check --follow-symlinks -Z <in >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'git cat-file --batch-check --follow-symlinks works for same-dir links-to-links' '
 	echo HEAD:link-to-link | git cat-file --batch-check --follow-symlinks >actual &&
 	test_cmp found actual
@@ -862,6 +903,15 @@ test_expect_success 'git cat-file --batch-check --follow-symlinks works for pare
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch-check --follow-symlinks -Z works for parent-dir links' '
+	echo HEAD:dir/parent-dir-link | git cat-file --batch-check --follow-symlinks >actual &&
+	test_cmp found actual &&
+	printf "notdir 29\0HEAD:dir/parent-dir-link/nope\0" >expect &&
+	printf "HEAD:dir/parent-dir-link/nope\0" >in &&
+	git cat-file --batch-check --follow-symlinks -Z <in >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'git cat-file --batch-check --follow-symlinks works for .. links' '
 	echo dangling 22 >expect &&
 	echo HEAD:dir/link-dir/nope >>expect &&
@@ -976,6 +1026,13 @@ test_expect_success 'git cat-file --batch-check --follow-symlink breaks loops' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch-check --follow-symlink -Z breaks loops' '
+	printf "loop 10\0HEAD:loop1\0" >expect &&
+	printf "HEAD:loop1\0" >in &&
+	git cat-file --batch-check --follow-symlinks -Z <in >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'git cat-file --batch --follow-symlink returns correct sha and mode' '
 	echo HEAD:morx | git cat-file --batch >expect &&
 	echo HEAD:morx | git cat-file --batch --follow-symlinks >actual &&
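
The `-Z` tests for `--follow-symlinks` above hard-code byte counts such as `dangling 25`, `notdir 29` and `loop 10`. As a hedged illustration that is not part of the patch, the sketch below rebuilds such an expected record in shell. It assumes what the C hunk shows for these three reply kinds: the count is the byte length of the queried name, and both the header and the name are terminated by the output delimiter, here NUL. The helper name `symlink_record` is hypothetical.

```shell
#!/bin/sh
# symlink_record <kind> <name>
# Emit "<kind> SP <byte length of name> NUL <name> NUL", the layout the
# -Z tests expect for "dangling", "loop" and "notdir" replies.
# Hypothetical helper, for illustration only.
symlink_record () {
	len=$(printf '%s' "$2" | wc -c | tr -d ' ')
	printf '%s %s\000%s\000' "$1" "$len" "$2"
}

# Show one record with NULs turned into newlines for readability.
symlink_record dangling HEAD:broken-same-dir-link | tr '\000' '\n'
```

Deriving the count from the name instead of typing it by hand is one way to keep such expectations in sync when a test path is renamed; the tests in this commit keep the literal numbers.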