Because the perl language can handle binary data correctly,
add the function perl_quote_buf_with_len(), which can specify
the length of the data and prevent the data from being truncated
at '\0' to help `--format="%(raw)"` support `--perl`.
Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new formatting option `%(raw)`, which will print the raw
object data without any changes. It will help further to migrate
all cat-file formatting logic from cat-file to ref-filter.
The raw data of blob, tree objects may contain '\0', but most of
the logic in `ref-filter` depends on the output of the atom being
text (specifically, no embedded NULs in it).
E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
add the data to the buffer. The raw data of a tree object is
`100644 one\0...`, only the `100644 one` will be added to the buffer,
which is incorrect.
Therefore, we need to find a way to record the length of the
atom_value's member `s`. Although strbuf can already record the
string and its length, if we want to replace the type of atom_value's
member `s` with strbuf, many places in ref-filter that are filled
with dynamically allocated mermory in `v->s` are not easy to replace.
At the same time, we need to check if `v->s == NULL` in
populate_value(), and strbuf cannot easily distinguish NULL and empty
strings, but c-style "const char *" can do it. So add a new member in
`struct atom_value`: `s_size`, which can record raw object size, it
can help us add raw object data to the buffer or compare two buffers
which contain raw object data.
Note that `--format=%(raw)` cannot be used with `--python`, `--shell`,
`--tcl`, and `--perl` because if the binary raw data is passed to a
variable in such languages, these may not support arbitrary binary data
in their string variable type.
Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Bagas Sanjaya <bagasdotme@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only tag and commit objects use `grab_sub_body_contents()` to grab
object contents in the current codebase. We want to teach the
function to also handle blobs and trees to get their raw data,
without parsing a blob (whose contents looks like a commit or a tag)
incorrectly as a commit or a tag. So it's needed to pass a
`struct expand_data *data` instread of only `void *buf` to both
`grab_sub_body_contents()` and `grab_values()` to be able to check
the object type.
Skip the block of code that is specific to handling commits and tags
early when the given object is of a wrong type to help later
addition to handle other types of objects in this function.
Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test clean-up.
* ab/test-lib-updates:
test-lib: split up and deprecate test_create_repo()
test-lib: do not show advice about init.defaultBranch under --verbose
test-lib: reformat argument list in test_create_repo()
submodule tests: use symbolic-ref --short to discover branch name
test-lib functions: add --printf option to test_commit
describe tests: convert setup to use test_commit
test-lib functions: add an --annotated option to "test_commit"
test-lib-functions: document test_commit --no-tag
test-lib-functions: reword "test_commit --append" docs
test-lib tests: remove dead GIT_TEST_FRAMEWORK_SELFTEST variable
test-lib: bring $remove_trash out of retirement
The "-m" option in "git log -m" that does not specify which format,
if any, of diff is desired did not have any visible effect; it now
implies some form of diff (by default "--patch") is produced.
* so/log-m-implies-p:
diff-merges: let "-m" imply "-p"
diff-merges: rename "combined_imply_patch" to "merges_imply_patch"
stash list: stop passing "-m" to "git log"
git-svn: stop passing "-m" to "git rev-list"
diff-merges: move specific diff-index "-m" handling to diff-index
t4013: test "git diff-index -m"
t4013: test "git diff-tree -m"
t4013: test "git log -m --stat"
t4013: test "git log -m --raw"
t4013: test that "-m" alone has no effect in "git log"
Optimize out repeated rename detection in a sequence of mergy
operations.
* en/ort-perf-batch-11:
merge-ort, diffcore-rename: employ cached renames when possible
merge-ort: handle interactions of caching and rename/rename(1to1) cases
merge-ort: add helper functions for using cached renames
merge-ort: preserve cached renames for the appropriate side
merge-ort: avoid accidental API mis-use
merge-ort: add code to check for whether cached renames can be reused
merge-ort: populate caches of rename detection results
merge-ort: add data structures for in-memory caching of rename detection
t6429: testcases for remembering renames
fast-rebase: write conflict state to working tree, index, and HEAD
fast-rebase: change assert() to BUG()
Documentation/technical: describe remembering renames optimization
t6423: rename file within directory that other side renamed
"git fetch" over protocol v2 left its side of the socket open after
it finished speaking, which unnecessarily wasted the resource on
the other side.
* jk/fetch-pack-v2-half-close-early:
fetch-pack: signal v2 server that we are done making requests
Use the hashfile API in the codepath that writes the index file to
reduce code duplication.
* ds/write-index-with-hashfile-api:
read-cache: delete unused hashing methods
read-cache: use hashfile instead of git_hash_ctx
csum-file.h: increase hashfile buffer size
hashfile: use write_in_full()
Recent "git clone" left a temporary directory behind when the
transport layer returned an failure.
* jk/clone-clean-upon-transport-error:
clone: clean up directory after transport_fetch_refs() failure
"git send-email" learned the "--sendmail-cmd" command line option
and the "sendemail.sendmailCmd" configuration variable, which is a
more sensible approach than the current way of repurposing the
"smtp-server" that is meant to name the server to instead name the
command to talk to the server.
* ga/send-email-sendmail-cmd:
git-send-email: add option to specify sendmail command
The code to handle the "--format" option in "for-each-ref" and
friends made too many string comparisons on %(atom)s used in the
format string, which has been corrected by converting them into
enum when the format string is parsed.
* zh/ref-filter-atom-type:
ref-filter: introduce enum atom_type
ref-filter: add objectsize to used_atom
The description of "fast-forward" in the glossary has been updated.
* ry/clarify-fast-forward-in-glossary:
docs: improve fast-forward in glossary content
The parallel checkout codepath did not initialize object ID field
used to talk to the worker processes in a futureproof way.
* mt/parallel-checkout-with-padded-oidcpy:
parallel-checkout: send the new object_id algo field to the workers
We historically rejected a very short string as an author name
while accepting a patch e-mail, which has been loosened.
* ef/mailinfo-short-name:
mailinfo: don't discard names under 3 characters
Dash bug https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097
lets the shell erroneously perform field splitting on the expansion of a
command substitution during declaration of a local variable. It causes
the parallel-checkout tests to fail e.g. when running them with
/bin/dash on MacOS 11.4, where they error out like this:
./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name
That's because the output of wc -l contains leading spaces and the
returned number of lines is treated as another variable to declare, i.e.
as in "local workers= 0".
Work around it by enclosing the command substitution in quotes.
Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some platforms, like NonStop do not automatically restart fsync()
when interrupted by a signal, even when that signal is setup with
SA_RESTART.
This can lead to test breakage, e.g., where "--progress" is used,
thus SIGALRM is sent often, and can interrupt an fsync() syscall.
Make sure we deal with such a case by retrying the syscall
ourselves. Luckily, we call fsync() fron a single wrapper,
fsync_or_die(), so the fix is fairly isolated.
Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Taylor Blau <me@ttaylorr.com>
[jc: the above two did most of the work---I just tied the loose end]
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two "if (opt->all_objects)" blocks next
to each other, merge them into one to provide better
readability.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --batch code to print an object assumes we found out the type of
the object from calling oid_object_info_extended(). This is true for
the default format, but even in a custom format, we manually modify
the object_info struct to ask for the type.
This assumption was broken by 845de33a5b (cat-file: avoid noop calls
to sha1_object_info_extended, 2016-05-18). That commit skips the call
to oid_object_info_extended() entirely when --batch-all-objects is in
use, and the custom format does not include any placeholders that
require calling it.
Or when the custom format only include placeholders like %(objectname) or
%(rest), oid_object_info_extended() will not get the type of the object.
This results in an error when we try to confirm that the type didn't
change:
$ git cat-file --batch=batman --batch-all-objects
batman
fatal: object 000023961a changed type!?
and also has other subtle effects (e.g., we'd fail to stream a blob,
since we don't realize it's a blob in the first place).
We can fix this by flipping the order of the setup. The check for "do
we need to get the object info" must come _after_ we've decided
whether we need to look up the type.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent change to make git-completion.bash use $__git_cmd_idx
in more places broke a number of completions on zsh because it
modified __git_main but did not update __git_zsh_main.
Notably, completions for "add", "branch", "mv" and "push" were
broken as a result of this change.
In addition to the undefined variable usage, "git mv <tab>" also
prints the following error:
__git_count_arguments:7: bad math expression:
operand expected at `"1"'
_git_mv:[:7: unknown condition: -gt
Remove the quotes around $__git_cmd_idx in __git_count_arguments
and set __git_cmd_idx=1 early in __git_zsh_main to fix the
regressions from 59d85a2a05.
This was tested on zsh 5.7.1 (x86_64-apple-darwin19.0).
Suggested-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: David Aguilar <davvid@gmail.com>
Acked-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>