fast-export and fast-import have nice --import-marks flags which allow
for incremental migrations. However, if there is a mark in
fast-export's file of marks without a corresponding mark in the one for
fast-import, then we run the risk that fast-export tries to send new
objects relative to the mark it knows which fast-import does not,
causing fast-import to fail.
This arises in practice when there is a filter of some sort running
between the fast-export and fast-import processes which prunes some
commits programmatically. Provide such a filter with the ability to
alias pruned commits to their most recent non-pruned ancestor.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark identifiers are used in fast-export and fast-import to provide a
label to refer to earlier content. Blobs are given labels because they
need to be referenced in the commits where they first appear with a
given filename, and commits are given labels because they can be the
parents of other commits. Tags were never given labels, probably
because they were viewed as unnecessary, but that presents two problems:
1. It leaves us without a way of referring to previous tags if we
want to create a tag of a tag (or higher nestings).
2. It leaves us with no way of recording that a tag has already been
imported when using --export-marks and --import-marks.
Fix these problems by allowing an optional mark label for tags.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The grammar for commits used a '?' rather than a '*' on the `merge`
directive line, despite the fact that the code allows multiple `merge`
directives in order to support n-way merges. In fact, elsewhere in
git-fast-import.txt there is an explicit declaration that "an unlimited
number of `merge` commands per commit are permitted by fast-import".
Fix the grammar to match the intent and implementation.
Reported-by: Joachim Klein <joachim.klein@automata.tools>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Inspired by 21416f0a07 ("restore: fix typo in docs", 2019-08-03), I ran
"git grep -E '(\b[a-zA-Z]+) \1\b' -- Documentation/" to find other cases
where words were duplicated, e.g. "the the", and in most cases removed
one of the repeated words.
There were many false positives by this grep command, including
deliberate repeated words like "really really" or valid uses of "that
that" which I left alone, of course.
I also did not correct any of the legitimate, accidentally repeated
words in old RelNotes.
Signed-off-by: Mark Rushakoff <mark.rushakoff@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since git supports commit messages with an encoding other than UTF-8,
allow fast-import to import such commits. This may be useful for folks
who do not want to reencode commit messages from an external system, and
may also be useful to achieve reversible history rewrites (e.g. sha1sum
<-> sha256sum transitions or subtree work) with git repositories that
have used specialized encodings in their commit history.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend the "PACKFILE OPTIMIZATION" section in "fast-import" to explain
that simply running "git gc --aggressive" after a "fast-import" should
properly optimize the repository. This is simpler and more effective
than the existing "repack" advice (which I'm keeping as it helps
explain things) because it e.g. also packs the newly imported refs.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When get-mark was introduced in commit 28c7b1f7b7 ("fast-import: add a
get-mark command", 2015-07-01), it followed the precedent of the
cat-blob command to be allowed on any line other than in the middle of a
data directive; see commit 777f80d742 ("fast-import: Allow cat-blob
requests at arbitrary points in stream", 2010-11-28). It was useful to
allow cat-blob directives in the middle of a commit to get more data
that would be used in writing the current commit object. get-mark is
not similarly useful since fast-import can already use either object id
or mark. Further, trying to allow this command anywhere caused parsing
bugs. Fix the parsing problems by only allowing get-mark commands to
appear when other commands have completed.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 777f80d742 ("fast-import: Allow cat-blob requests at
arbitrary points in stream", 2010-11-28), fast-import started allowing
cat-blob commands to appear on the start of any line except in the
middle of a "data" command. It could be in the middle of various
directives that were part of a tag command, or in the middle of
checkpoints or progresses (each of which allow an optional second empty
newline), or even immediately after the mark command of a blob before
the data directive appeared (raising the question of what if it used the
mark for the blob that just barely appeared in the stream that we do not
yet have the data for). None of these locations make any sense as
places to put cat-blob requests.
The purpose of this change as stated in that commit message was to
[save] frontends from having to loop over everything they want to
commit in the next commit and cat-ing the necessary objects in
advance.
However, that can be achieved by simply allowing cat-blob requests to
appear whenever a filemodify directive is allowed. Further, it avoids
setting a bad precedent for other commands to follow (e.g. get-mark); a
precedent which caused parsing problems in corner cases.
Technically, inline filemodify directives add a slight wrinkle in that
frontends might want to have cat-blob directives appear after the start
of the filemodify and before the data directive contained within it. I
think it would have been better to disallow such a case (it would be
trivial to use cat-blob before the filemodify instead), but since there
is evidence this was used, for backwards compatibility let's support
that case too.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The docs claimed `ls` commands could appear almost anywhere, but the
code told a different story. Modify the docs to match the code.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Knowing the original names (hashes) of commits can sometimes enable
post-filtering that would otherwise be difficult or impossible. In
particular, the desire to rewrite commit messages which refer to other
prior commits (on top of whatever other filtering is being done) is
very difficult without knowing the original names of each commit.
In addition, knowing the original names (hashes) of blobs can allow
filtering by blob-id without requiring re-hashing the content of the
blob, and is thus useful as a small optimization.
Once we add original ids for both commits and blobs, we may as well
add them for tags too for completeness. Perhaps someone will have a
use for them.
This commit teaches a new --show-original-ids option to fast-export
which will make it add a 'original-oid <hash>' line to blob, commits,
and tags. It also teaches fast-import to parse (and ignore) such
lines.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The standard for command documentation synopses appears to be:
[...] means optional
<...> means replaceable
[<...>] means both optional and replaceable
So fix a number of doc pages that use incorrect variations of the
above.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When formatted as a man page, 1st section header is always in upper
case even if we write it otherwise. Make all 1st section headers
uppercase to keep it close to the final output.
This does affect html since case is kept there, but I still think it's
a good idea to maintain a consistent style for 1st section headers.
Some sections perhaps should become second sections instead, where
case is kept, and for better organization. I will update if anyone has
suggestions about this.
While at there I also make some header more consistent (e.g. examples
vs example) and fix a couple minor things here and there.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 618e613a70, 10 years ago, the default for pack depth used for
git-pack-objects and git-repack was changed from 10 to 50, while
leaving fast-import's default to 10.
There doesn't seem to be a reason besides oversight for the change not
having happened in fast-import as well.
Interestingly, fast-import uses pack.depth when it's set, and the
git-config manual says the default for pack.depth is 50. While the
git-fast-import manual does say the default depth is 10, the
inconsistency is also confusing.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
More mark-up updates to typeset strings that are expected to
literally typed by the end user in fixed-width font.
* mm/doc-tt:
doc: typeset HEAD and variants as literal
CodingGuidelines: formatting HEAD in documentation
doc: typeset long options with argument as literal
doc: typeset '--' as literal
doc: typeset long command-line options as literal
doc: typeset short command-line options as literal
Documentation/git-mv.txt: fix whitespace indentation
This was obtained with:
perl -pi -e "s/'--'/\`--\`/g" *.txt
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With many incremental imports, small packs become highly
inefficient due to the need to readdir scan and load many
indices to locate even a single object. Frequent repacking and
consolidation may be prohibitively expensive in terms of disk
I/O, especially in large repositories where the initial packs
were aggressively optimized and marked with .keep files.
In those cases, users may be better served with loose objects
and relying on "git gc --auto".
This changes the default behavior of fast-import for small
imports found in test cases, so adjustments to t9300 were
necessary.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fast-import" learned to respond to the get-mark command via
its cat-blob-fd interface.
* mh/fast-import-get-mark:
fast-import: add a get-mark command
It is sometimes useful for importers to be able to read the SHA-1
corresponding to a mark that they have created via fast-import. For
example, they might want to embed the SHA-1 into the commit message of
a later commit. Or it might be useful for internal bookkeeping uses,
or for logging.
Add a "get-mark" command to "git fast-import" that allows the importer
to ask for the value of a mark that has been created earlier.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Older versions of AsciiDoc would convert the "--" in
"--option" into an emdash. According to 565e135
(Documentation: quote double-dash for AsciiDoc, 2011-06-29),
this is fixed in AsciiDoc 8.3.0. According to bf17126, we
don't support anything older than 8.4.1 anyway, so we no
longer need to worry about quoting.
Even though this does not change the output at all, there
are a few good reasons to drop the quoting:
1. It makes the source prettier to read.
2. We don't quote consistently, which may be confusing when
reading the source.
3. Asciidoctor does not like the quoting, and renders a
literal backslash.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In AsciiDoc, it is OK to say:
this is my title
-------------------------
but AsciiDoctor is more strict. Let's match the underline to
the title (which also makes the source prettier to read).
The output from AsciiDoc is the same either way.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merges with an absurd number of parents are still a bad idea because
they do not render well in tools like gitk, but if they are present
in the repository being imported into git then there's no need to
avoid reproducing them faithfully.
In olden times, before v1.6.0-rc0~194 (2008-06-27), git commit-tree
and higher-level tools built on top of it were limited to writing 16
parents for a commit. Nowadays normal git operations are happy to
write more parents when asked, so the motivation for this note in the
fast-import documentation is gone and we can remove it.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In particular, git-fast-import and -export link to each
other, and gitremote-helpers links to existing remote
helpers, and vice versa. Also link to fast-import from the
remote helper spec, as this is relevant for remote helpers
using the fast-import format.
Signed-off-by: Max Horn <max@quendi.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow remote-helper/fast-import based transport to rename the refs
while transferring the history.
* fc/remote-helper-refmap:
transport-helper: remove unnecessary strbuf resets
transport-helper: add support to delete branches
fast-export: add support to delete refs
fast-import: add support to delete refs
transport-helper: add support to push symbolic refs
transport-helper: add support for old:new refspec
fast-export: add new --refspec option
fast-export: improve argument parsing
"timezone" is two words, not one (i.e. "time zone" is correct).
Correct this in these files:
-- date-formats.txt
-- git-blame.txt
-- git-cvsimport.txt
-- git-fast-import.txt
-- git-svn.txt
-- gitweb.conf.txt
-- rev-list-options.txt
Signed-off-by: Jason St. John <jstjohn@purdue.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We liberally use "committish" and "commit-ish" (and "treeish" and
"tree-ish"); as these are non-words, let's unify these terms to
their dashed form. More importantly, clarify the documentation on
object peeling using these terms.
* rh/ishes-doc:
glossary: fix and clarify the definition of 'ref'
revisions.txt: fix and clarify <rev>^{<type>}
glossary: more precise definition of tree-ish (a.k.a. treeish)
use 'commit-ish' instead of 'committish'
use 'tree-ish' instead of 'treeish'
glossary: define commit-ish (a.k.a. committish)
glossary: mention 'treeish' as an alternative to 'tree-ish'
Replace 'committish' in documentation and comments with 'commit-ish'
to match gitglossary(7) and to be consistent with 'tree-ish'.
The only remaining instances of 'committish' are:
* variable, function, and macro names
* "(also committish)" in the definition of commit-ish in
gitglossary[7]
Signed-off-by: Richard Hansen <rhansen@bbn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most cases, "feature <foo>" does not just require that the feature
exists, but also changes the behavior by enabling it.
Cases where the feature is only requested like cat-blob, notes or ls are
clearly documented below.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The options in git-fast-import(1) are not currently arranged in a
logical order, which has caused the '--done' options to be documented
twice (commit 3266de10).
Rearrange them into logical groups under subheadings.
Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The descriptions of '--relative-marks' and '--no-relative-marks' make
more sense when read together instead of as two independent options.
Combine them into a single description block.
Signed-off-by: John Keeping <john@keeping.me.uk>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The '--done' option to git-fast-import is documented twice in its manual
page. Combine the best bits of each description, keeping the location
of the instance that was added first.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Email addresses in documentation are converted into mailto: hyperlinks
in the HTML output and footnotes in man pages. This isn't desirable for
cases where the address is used as an example and is not valid.
Particularly annoying is the example "jane@laptop.(none)" which appears
in git-shortlog(1) as "jane@laptop[1].(none)", with note 1 saying:
1. jane@laptop
mailto:jane@laptop
Fix this by escaping these email addresses with a leading backslash, to
prevent Asciidoc expanding them as inline macros.
In the case of mailmap.txt, render the address monospaced so that it
matches the block examples surrounding that paragraph.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation mentioned only newlines and double quotes as
characters needing escaping, but the backslash also needs it. Also, the
documentation was not clearly saying that double quotes around the file
name were required (double quotes in the examples could be interpreted as
part of the sentence, not part of the actual string).
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our documentation was written for an ancient version of AsciiDoc,
making the source not very readable.
By Jeff King
* jk/doc-asciidoc-inline-literal:
docs: stop using asciidoc no-inline-literal
In asciidoc 7, backticks like `foo` produced a typographic
effect, but did not otherwise affect the syntax. In asciidoc
8, backticks introduce an "inline literal" inside which markup
is not interpreted. To keep compatibility with existing
documents, asciidoc 8 has a "no-inline-literal" attribute to
keep the old behavior. We enabled this so that the
documentation could be built on either version.
It has been several years now, and asciidoc 7 is no longer
in wide use. We can now decide whether or not we want
inline literals on their own merits, which are:
1. The source is much easier to read when the literal
contains punctuation. You can use `master~1` instead
of `master{tilde}1`.
2. They are less error-prone. Because of point (1), we
tend to make mistakes and forget the extra layer of
quoting.
This patch removes the no-inline-literal attribute from the
Makefile and converts every use of backticks in the
documentation to an inline literal (they must be cleaned up,
or the example above would literally show "{tilde}" in the
output).
Problematic sites were found by grepping for '`.*[{\\]' and
examined and fixed manually. The results were then verified
by comparing the output of "html2text" on the set of
generated html pages. Doing so revealed that in addition to
making the source more readable, this patch fixes several
formatting bugs:
- HTML rendering used the ellipsis character instead of
literal "..." in code examples (like "git log A...B")
- some code examples used the right-arrow character
instead of '->' because they failed to quote
- api-config.txt did not quote tilde, and the resulting
HTML contained a bogus snippet like:
<tt><sub></tt> foo <tt></sub>bar</tt>
which caused some parsers to choke and omit whole
sections of the page.
- git-commit.txt confused ``foo`` (backticks inside a
literal) with ``foo'' (matched double-quotes)
- mentions of `A U Thor <author@example.com>` used to
erroneously auto-generate a mailto footnote for
author@example.com
- the description of --word-diff=plain incorrectly showed
the output as "[-removed-] and {added}", not "{+added+}".
- using "prime" notation like:
commit `C` and its replacement `C'`
confused asciidoc into thinking that everything between
the first backtick and the final apostrophe were meant
to be inside matched quotes
- asciidoc got confused by the escaping of some of our
asterisks. In particular,
`credential.\*` and `credential.<url>.\*`
properly escaped the asterisk in the first case, but
literally passed through the backslash in the second
case.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If fast-import's command pipe and the frontend's cat-blob/ls response
pipe are both filled, there can be a deadlock. Luckily all existing
frontends consume any pending cat-blob/ls responses completely before
writing the next command.
Document the requirements so future frontend authors and users can be
spared from the problem, too. It is not always easy to catch that
kind of bug by testing.
To set the scene, add some words of explanation to help the novice
understand that "cat-blob" and "ls" output are meant for consumption
by the frontend.
Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* di/fast-import-ident:
fsck: improve committer/author check
fsck: add a few committer name tests
fast-import: check committer name more strictly
fast-import: don't fail on omitted committer name
fast-import: add input format tests