Commit Graph

106490 Commits

Author SHA1 Message Date
Jeff Hostetler 2f67571a79 gvfs:trace2:data: add vfs stats
Report virtual filesystem summary data.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2020-01-14 08:51:22 -05:00
Jeff Hostetler 13a32d2ec1 gvfs:trace2:data: status serialization
Add trace information around status serialization.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2020-01-14 08:51:22 -05:00
Jeff Hostetler 0961a9b46d gvfs:trace2:data: status deserialization information
Add trace2 region and data events describing attempts to deserialize
status data using a status cache.

A category:status, label:deserialize region is pushed around the
deserialize code.

Deserialization results when reading from a file are:
    category:status, path   = <path>
    category:status, polled = <number_of_attempts>
    category:status, result = "ok" | "reject"

Deserialization results when reading from STDIN are:
    category:status, path   = "STDIN"
    category:status, result = "ok" | "reject"

Status will fall back and run a normal status scan when a "reject"
is reported (unless "--deserialize-wait=fail").

If "ok" is reported, status was able to use the status cache and
avoid scanning the workdir.

Additionally, a cmd_mode is emitted for each step: collection,
deserialization, and serialization.  For example, if deserialization
is attempted and fails and status falls back to actually computing
the status, a cmd_mode message containing "deserialize" is issued
and then a cmd_mode for "collect" is issued.

Also, if deserialization fails, a data message containing the
rejection reason is emitted.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2020-01-14 08:51:22 -05:00
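The event sequence described in the commit above can be modeled roughly as follows. This is only an illustrative sketch, not the trace2 implementation: the function name, the tuple layout, the `reject_reason` key, and the exact ordering of events are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# (event type, key or category, value) -- a made-up layout for illustration.
Event = Tuple[str, str, str]


def model_status_events(path: Optional[str], ok: bool, polled: int = 1,
                        reject_reason: str = "stale",
                        wait_fail: bool = False) -> List[Event]:
    """Model the events emitted for one deserialization attempt.

    path=None stands for reading the cache from STDIN.
    """
    events: List[Event] = [
        ("cmd_mode", "", "deserialize"),
        ("region_enter", "status", "deserialize"),
    ]
    reading_stdin = path is None
    events.append(("data", "path", "STDIN" if reading_stdin else path))
    if not reading_stdin:
        # The polled count is only reported when reading from a file.
        events.append(("data", "polled", str(polled)))
    events.append(("data", "result", "ok" if ok else "reject"))
    events.append(("region_leave", "status", "deserialize"))
    if not ok:
        # Hypothetical key name for the rejection-reason data message.
        events.append(("data", "reject_reason", reject_reason))
        if not wait_fail:
            # Fall back to a normal status scan.
            events.append(("cmd_mode", "", "collect"))
    return events
```

With a rejected file cache, the modeled sequence ends with a "collect" cmd_mode; with `wait_fail=True` it stops after the rejection reason.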
Jeff Hostetler 28e93dfc6c gvfs:trace2:data: add trace2 tracing around read_object_process
Add trace2 region around read_object_process to collect
time spent waiting for missing objects to be dynamically
fetched.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2020-01-14 08:51:22 -05:00
Kevin Willford 5482d57c6f Merge pull request #98 Add explanation of branches and using forks
BRANCHES.md: Add explanation of branches and using forks
2020-01-14 08:51:22 -05:00
Ben Peart 0c339e0608 Merge pull request #91 from benpeart/block-commands
gvfs: block unsupported commands when running in a GVFS repo
2020-01-14 08:51:22 -05:00
Kevin Willford 2009015df8 BRANCHES.md: Add explanation of branches and using forks 2020-01-14 08:51:22 -05:00
Kevin Willford 2b9d086bb8 Merge pull request #68 send-pack do not check for sha1 file when GVFS_MISSING_OK set
send-pack: do not check for sha1 file when GVFS_MISSING_OK set
2020-01-14 08:51:22 -05:00
Ben Peart 5b05fa082a gvfs: block unsupported commands when running in a GVFS repo
The following commands and options are not currently supported when working
in a GVFS repo.  Add code to detect and block these commands from executing.

1) fsck
2) gc
4) prune
5) repack
6) submodule
8) update-index --split-index
9) update-index --index-version (other than 4)
10) update-index --[no-]skip-worktree
11) worktree

Signed-off-by: Ben Peart <benpeart@microsoft.com>
2020-01-14 08:51:22 -05:00
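A check of the kind described in the commit above might look like the sketch below. It models only the list in the commit message; the function name, the `in_gvfs_repo` flag, and the argument parsing are assumptions for illustration and do not reflect how git itself parses options.

```python
from typing import List

# Commands from the commit message that are blocked outright.
BLOCKED_COMMANDS = {"fsck", "gc", "prune", "repack", "submodule", "worktree"}


def is_blocked(argv: List[str], in_gvfs_repo: bool) -> bool:
    """Return True if the git command line should be refused (hypothetical helper)."""
    if not in_gvfs_repo or not argv:
        return False
    cmd, args = argv[0], argv[1:]
    if cmd in BLOCKED_COMMANDS:
        return True
    if cmd == "update-index":
        for i, arg in enumerate(args):
            if arg in ("--split-index", "--skip-worktree", "--no-skip-worktree"):
                return True
            if arg == "--index-version":
                # Only index version 4 is allowed.
                value = args[i + 1] if i + 1 < len(args) else None
                if value != "4":
                    return True
    return False
```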
Johannes Schindelin ab39364cc0 Merge pull request #24 Match multi-pack-index feature from upstream
This includes commits that fixup!-revert all the midx-related commits from our GVFS branch and replaces them with the exact commits that are being merged upstream. This should automatically remove the commits during our next version rebase-and-merge action.

Changes upstream:
- The builtin is called 'git multi-pack-index'.
- The command-line takes a 'write' verb and an '--object-dir' parameter.
- We no longer have a 'midx-head' or '*.midx' files.
- Instead, we have a 'multi-pack-index' file in the pack-dir.
- It no longer makes sense to specify '--update-head'
2020-01-14 08:51:22 -05:00
Kevin Willford c17eb7cc21 send-pack: do not check for sha1 file when GVFS_MISSING_OK set 2020-01-14 08:51:22 -05:00
Derrick Stolee f6c830e1db Merge pull request #36 Avoid `sane_execvp` in `git rebase` and `git stash`
William Baker reported that the non-built-in rebase and stash fail to
run the post-command hook (which is important for VFS for Git, though).

The reason is that an `exec()` will replace the current process by the
newly-exec'ed one (our Windows-specific emulation cannot do that, and
does not even try, so this is only an issue on Linux/macOS). As a
consequence, not even the atexit() handlers are run, including the
one running the post-command hook.

To work around that, let's spawn the legacy rebase/stash and exit with
the reported exit code.
2020-01-14 08:51:22 -05:00
Derrick Stolee 7c9dca420b fsck: use ERROR_MULTI_PACK_INDEX
The multi-pack-index was added to the data verified by git-fsck in
ea5ae6c3 "fsck: verify multi-pack-index". This implementation was
based on the implementation for verifying the commit-graph, and a
copy-paste error kept the ERROR_COMMIT_GRAPH flag as the bit set
when an error appears in the multi-pack-index.

Add a new flag, ERROR_MULTI_PACK_INDEX, and use that instead.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
2020-01-14 08:51:22 -05:00
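The effect of giving each verification step its own error bit can be illustrated with a small sketch. The flag values and the `verify` helper below are placeholders chosen for the example, not the values used in fsck.

```python
from enum import IntFlag


class FsckError(IntFlag):
    # Placeholder bit values; the real flags live in git's fsck builtin.
    COMMIT_GRAPH = 0x1
    MULTI_PACK_INDEX = 0x2


def verify(commit_graph_ok: bool, multi_pack_index_ok: bool) -> FsckError:
    """Combine per-check results into one bitmask (hypothetical helper)."""
    errors = FsckError(0)
    if not commit_graph_ok:
        errors |= FsckError.COMMIT_GRAPH
    if not multi_pack_index_ok:
        # Before the fix, this branch would have set COMMIT_GRAPH instead.
        errors |= FsckError.MULTI_PACK_INDEX
    return errors
```

With distinct bits, a caller inspecting the result can tell a multi-pack-index failure apart from a commit-graph failure.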
Jeff Hostetler b87f0faed9 Merge pull request #7 from jeffhostetler/gvfs-status-serialize-wait
status: deserialization wait
2020-01-14 08:51:22 -05:00
Johannes Schindelin bcba7d3adf rebase/stash: make post-command hook work again
William Baker reported that the non-built-in rebase and stash fail to
run the post-command hook (which is important for VFS for Git, though).

The reason is that an `exec()` will replace the current process by the
newly-exec'ed one (our Windows-specific emulation cannot do that, and
does not even try, so this is only an issue on Linux/macOS). As a
consequence, not even the atexit() handlers are run, including the
one running the post-command hook.

To work around that, let's spawn the legacy rebase/stash and exit with
the reported exit code.
2020-01-14 08:51:22 -05:00
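The difference between exec and spawn-then-exit can be reproduced outside git. The sketch below uses Python processes as stand-ins: the wrapper script, the hook message, and the exit code 3 are all made up for the demonstration, and the exec branch behaves as described only on POSIX systems.

```python
import subprocess
import sys
import textwrap

# Stand-in for the built-in command: registers an exit handler (the
# "post-command hook"), then hands off to a "legacy" command that exits 3.
WRAPPER = textwrap.dedent("""
    import atexit, os, subprocess, sys
    atexit.register(lambda: print("post-command hook ran", flush=True))
    legacy = [sys.executable, "-c", "import sys; sys.exit(3)"]
    if sys.argv[1] == "exec":
        # Replaces this process; the atexit handler is never run.
        os.execv(legacy[0], legacy)
    # Spawn the legacy command, wait for it, then exit normally with its code.
    sys.exit(subprocess.run(legacy).returncode)
""")


def run_wrapper(mode: str):
    """Run the wrapper in 'exec' or 'spawn' mode; return (exit code, stdout)."""
    proc = subprocess.run([sys.executable, "-c", WRAPPER, mode],
                          capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()
```

In both modes the exit code of the legacy command is propagated, but only the spawn variant gives the exit handler a chance to run.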
Jeff Hostetler 7dcd715dbc Merge pull request #1 from jeffhostetler/gvfs-serialize-exclude
serialize-status: serialize global and repo-local exclude file metadata
2020-01-14 08:51:21 -05:00
Jeff Hostetler c9f635c8dd status: deserialization wait
Teach `git status --deserialize` to either wait indefinitely
or immediately fail if the status serialization cache file
is stale.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2020-01-14 08:51:21 -05:00
Jeff Hostetler d11ff4d601 Merge pull request #6 from jeffhostetler/gvfs-serialize-status-rename
GVFS serialize status updates
2020-01-14 08:51:21 -05:00
Jeff Hostetler 607ed43623 serialize-status: serialize global and repo-local exclude file metadata
Changes to the global or repo-local excludes files can change the
results returned by "git status" for untracked files.  Therefore,
it is important that the exclude-file values used during serialization
are still current at the time of deserialization.

Teach "git status --serialize" to report metadata on the user's global
exclude file (which defaults to "$XDG_CONFIG_HOME/git/ignore") and for the
repo-local excludes file (which is in ".git/info/exclude").  Serialize
will record the pathnames and mtimes for these files in the serialization
header (next to the mtime data for the .git/index file).

Teach "git status --deserialize" to validate this new metadata.  If either
exclude file has changed since the serialization-cache-file was written,
then deserialize will reject the cache file and force a full/normal status
run.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2020-01-14 08:51:21 -05:00
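The validation idea in the commit above (record pathnames and mtimes at serialization time, compare them at deserialization time) can be sketched as follows. The function names and the dictionary used as a "header" are assumptions for the example; the actual cache file format is not shown here.

```python
import os
from typing import Dict, Iterable, Optional


def snapshot_exclude_files(paths: Iterable[str]) -> Dict[str, Optional[int]]:
    """Record the mtime (in ns) of each exclude file, or None if it is missing."""
    header: Dict[str, Optional[int]] = {}
    for path in paths:
        try:
            header[path] = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            header[path] = None
    return header


def cache_is_current(header: Dict[str, Optional[int]]) -> bool:
    """Return True if no recorded exclude file changed since the snapshot."""
    return snapshot_exclude_files(header.keys()) == header
```

If `cache_is_current` returns False, the caller would reject the cache and run a full status scan.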
Johannes Schindelin ce030e57ee Merge 'gvfs/ds/generation-numbers-update'
Revert the previous commits involving generation numbers and apply the
commits that are in upstream `next`.
2020-01-14 08:51:21 -05:00
Jeff Hostetler fbf83a36e3 status: add comments for ahead_behind_flags in serialization
The "ahead_behind_flags" field of "struct wt_status" does not
need to be stored in the serialization cache file, since it is
a display property.  Update the code comments in both serialize
and deserialize to reflect that.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2020-01-14 08:51:21 -05:00
Johannes Schindelin 92c7dc47b7 Merge 'virtual-file-system-support'
Add virtual file system settings and hook proc.  On index load,
clear/set the skip worktree bits based on the virtual file system data.
Use virtual file system data to update skip-worktree bit in
unpack-trees. Use virtual file system data to exclude files and folders
not explicitly requested.

Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
2020-01-14 08:51:21 -05:00
Derrick Stolee 6f32da2dbe commit: add generation to pop_most_recent_commit()
The method pop_most_recent_commit() is confusingly named, in that it
pops the most-recent commit, but also adds that commit's parents to
the list. This is used by a few commit walks, especially the one in
ref_newer(). 'git push' uses ref_newer() to check if a force-push is
necessary, and in the case of a force-push being needed, the current
algorithm walks every reachable commit. This is especially severe in
the case of an amended commit: they have the same parent, but we still
walk to the very end of the graph!

Add a 'min_generation' parameter to pop_most_recent_commit() to limit
the commits that are walked to those with generation number at least
'min_generation'. This greatly reduces the number of commits walked by
a force-push.

There may be more work to improve this algorithm in the future, but for
now this is enough for most cases. This direction has the benefit that
it does not affect the non-force-push case at all. Future directions
should consider improving that case as well.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
2020-01-14 08:51:21 -05:00
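The pruning described in the commit above can be illustrated with a simplified model. The `Commit` class, the heap-based queue, and `is_reachable` (a stand-in for the walk in ref_newer()) are assumptions for this sketch; only the idea of skipping parents below `min_generation` comes from the commit message.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Commit:
    name: str
    date: int
    generation: int
    parents: List["Commit"] = field(default_factory=list)
    seen: bool = False


Queue = List[Tuple[int, int, Commit]]


def pop_most_recent_commit(queue: Queue, min_generation: int) -> Commit:
    """Pop the newest commit and enqueue its unseen parents.

    Parents with a generation number below min_generation are not enqueued.
    """
    _, _, commit = heapq.heappop(queue)
    for parent in commit.parents:
        if not parent.seen and parent.generation >= min_generation:
            parent.seen = True
            heapq.heappush(queue, (-parent.date, id(parent), parent))
    return commit


def is_reachable(old: Commit, new: Commit) -> bool:
    """Return True if `old` can be reached by walking back from `new`."""
    new.seen = True
    queue: Queue = [(-new.date, id(new), new)]
    while queue:
        if pop_most_recent_commit(queue, old.generation) is old:
            return True
    return False
```

For an amended commit, `old` and `new` share a parent whose generation is lower than that of `old`, so the walk stops after one step instead of visiting the whole history.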
Jeff Hostetler 2acc657e20 status: fix rename reporting when using serialization cache
Fix "git status --deserialize" to correctly report both pathnames
for renames.  Add a test case to verify.

A change was made upstream that added an additional "rename_status"
field to the "struct wt_status_change_data" structure.  It is used
during the various print routines to decide if 2 pathnames need to
be printed.

    5134ccde64
    wt-status.c: rename rename-related fields in wt_status_change_data

The fix here is to add that field to the status cache data.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2020-01-14 08:51:21 -05:00
Johannes Schindelin 501f87ad15 Merge branch 'ahead-behind-and-serialized-status' 2020-01-14 08:51:21 -05:00
Jameson Miller 53136aa648 vfs: fix case where directories not handled correctly
The vfs does not correctly handle the case when there is a file
that begins with the same prefix as a directory. For example, the
following setup would encounter this issue:

    A directory contains a file named `dir1.sln` and a directory
    named `dir1/`.

    The directory `dir1` contains other files.

    The directory `dir1` is in the virtual file system list

The contents of `dir1` should be in the virtual file system, but
it is not. The contents of this directory do not have the skip
worktree bit cleared as expected. The problem is in the
`apply_virtualfilesystem(...)` function where it does not include
the trailing slash of the directory name when looking up the
position in the index to start clearing the skip worktree bit.

The fix is to include the trailing slash when finding the first
index entry from `index_name_pos(...)`.
2020-01-14 08:51:21 -05:00
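The prefix problem described in the commit above follows from byte-order sorting: `.` sorts before `/`, so `dir1.sln` sits between the lookup position for `dir1` and the entries under `dir1/`. The sketch below reproduces this with a sorted list of paths; the scan loop is an assumed simplification of what `apply_virtualfilesystem(...)` does, not its actual code.

```python
import bisect
from typing import List


def entries_under(index: List[str], directory: str,
                  include_trailing_slash: bool) -> List[str]:
    """Collect entries under `directory`, scanning forward from the lookup position."""
    prefix = directory + "/"
    key = prefix if include_trailing_slash else directory
    pos = bisect.bisect_left(index, key)
    found = []
    # Stop at the first entry that is not inside the directory.
    while pos < len(index) and index[pos].startswith(prefix):
        found.append(index[pos])
        pos += 1
    return found


# Paths sorted the way index entries are: by byte value.
INDEX = sorted(["dir1.sln", "dir1/a.c", "dir1/b.c", "dir2/c.c"])
```

Without the trailing slash the scan starts at `dir1.sln` and stops immediately; with it, the scan starts at the first entry inside `dir1/`.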
Johannes Schindelin b92639cf44 Merge branch 'serialize_status_gvfs' 2020-01-14 08:51:21 -05:00
Jeff Hostetler c11afa2df8 status: reject deserialize in V2 and conflicts
Teach status deserialize code to reject status cache
when printing in porcelain V2 and there are unresolved
conflicts in the cache file.  A follow-on task might
extend the cache format to include this additional data.

See code for longer explanation.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2020-01-14 08:51:21 -05:00
Kevin Willford 06187e1845 virtualfilesystem: check if directory is included
Add check to see if a directory is included in the virtualfilesystem
before checking the directory hashmap.  This allows a directory entry
like foo/ to find all untracked files in subdirectories.
2020-01-14 08:51:21 -05:00
Kevin Willford ba96ed82bf cache-tree: remove use of strbuf_addf in update_one
String formatting can be a performance issue when there are
hundreds of thousands of trees.

Change to stop using the strbuf_addf and just add the strings
or characters individually.

There are a limited number of modes, so add a switch for the
known ones and a default case for anything that is not a
mode known to git.

In one scenario regarding a huge worktree, this reduces the
time required for a `git checkout <branch>` from 44 seconds
to 38 seconds, i.e. it is a non-negligible performance
improvement.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
2020-01-14 08:51:21 -05:00
Jeff Hostetler db3c3fc3cd status: add status serialization mechanism
Teach STATUS to optionally serialize the results of a
status computation to a file.

Teach STATUS to optionally read an existing serialization
file and simply print the results, rather than actually
scanning.

This is intended for immediate status results on extremely
large repos and assumes the use of a service/daemon to
maintain a fresh current status snapshot.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2020-01-14 08:51:21 -05:00
Jeff Hostetler b7a47a3992 status: serialize to path
Teach status serialization to take an optional pathname on
the command line to direct that cache data be written there
rather than to stdout.  When used this way, normal status
results will still be written to stdout.

When no path is given, only binary serialization data is
written to stdout.

Usage:
    git status --serialize[=<path>]

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2020-01-14 08:51:21 -05:00
Ben Peart 949b596eae virtualfilesystem: fix bug with symlinks being ignored
The virtual file system code incorrectly treated symlinks as directories
instead of regular files.  This meant symlinks were not included even if
they are listed in the list of files returned by the core.virtualFilesystem
hook proc.  Fixes #25

Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
2020-01-14 08:51:21 -05:00
Kevin Willford fa5d4abca0 gvfs: refactor loading the core.gvfs config value
This code change makes sure that the config value for core_gvfs
is always loaded before checking it.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
2020-01-14 08:51:21 -05:00
Jameson Miller e18a0aa695 Teach ahead-behind and serialized status to play nicely together 2020-01-14 08:51:21 -05:00
Ben Peart 96ba4eadec virtualfilesystem: don't run the virtual file system hook if the index has been redirected
Fixes #13

Some git commands spawn helpers and redirect the index to a different
location.  These include "difftool -d" and the sequencer
(i.e. `git rebase -i`, `git cherry-pick` and `git revert`) and others.
In those instances we don't want to update their temporary index with
our virtualization data.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
2020-01-14 08:51:21 -05:00
Johannes Schindelin 76a05bd87f Merge 'sparse-checkout-fixes' into HEAD 2020-01-14 08:51:21 -05:00
Ben Peart 527e28e21c Update the virtualfilesystem support
We now specify that it needs to be run from the root of the git work
tree. This enables the hook to be found even if the current working
directory is not the root of the repo (like when running 'git diff' with
Beyond Compare configured as the diff tool).

Also simplify how the argv[] parameter is created.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
2020-01-14 08:51:21 -05:00
Johannes Schindelin 26d9337494 Merge 'pre-post-command-hooks' into HEAD 2020-01-14 08:51:21 -05:00
Kevin Willford 6f1c19016c Do not remove files outside the sparse-checkout
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
2020-01-14 08:51:21 -05:00
Ben Peart c827fd2eb7 Add virtual file system settings and hook proc
On index load, clear/set the skip worktree bits based on the virtual
file system data. Use virtual file system data to update skip-worktree
bit in unpack-trees. Use virtual file system data to exclude files and
folders not explicitly requested.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
2020-01-14 08:51:21 -05:00
Johannes Schindelin 9d3a1c38b8 Merge 'read-object-hook' into HEAD 2020-01-14 08:51:21 -05:00
Johannes Schindelin 27b8f6d795 pre-command: always respect core.hooksPath
We need to respect that config setting even if we already know that we
have a repository, but have not yet read the config.

The regression test was written by Alejandro Pauly.

Signed-off-by: Johannes Schindelin <johasc@microsoft.com>
2020-01-14 08:51:21 -05:00
Kevin Willford eb9bcc7626 Fix reset when using the sparse-checkout feature.
When using the sparse checkout feature the git reset command will add
entries to the index that will have the skip-worktree bit off but will
leave the working directory empty.  File data is lost because the index
version of the files has been changed but there is nothing that is in
the working directory.  This will cause the next status call to show
either "deleted" for files that were modified or deleted, or nothing
for files that were added.  The added files should be shown as untracked
and modified files should be shown as modified.

To fix this, while the reset is running, if there is no file in the
working directory and it will be missing with the new index entry or
was not missing in the previous version, we create the previous index
version of the file in the working directory so that status will report
correctly and the files will be available for the user to deal with.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
2020-01-14 08:51:21 -05:00
Ben Peart edcfc9df8c gvfs: ensure all filters and EOL conversions are blocked
Ensure all filters and EOL conversions are blocked when running under
GVFS so that our projected file sizes will match the actual file size
when it is hydrated on the local machine.

Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
2020-01-14 08:51:21 -05:00
Johannes Schindelin b90204f9df sha1_file: when writing objects, skip the read_object_hook
If we are going to write an object there is no use in calling
the read object hook to get an object from a potentially remote
source.  We would rather just write out the object and avoid the
potential round trip for an object that doesn't exist.

This change adds a flag to the check_and_freshen() and
freshen_loose_object() functions' signatures so that the hook
is bypassed when the functions are called before writing loose
objects. The check for a local object is still performed so we
don't overwrite something that has already been written to one
of the objects directories.

Based on a patch by Kevin Willford.

Signed-off-by: Johannes Schindelin <johasc@microsoft.com>
2020-01-14 08:51:21 -05:00
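The flag described in the commit above can be pictured with a small model. Everything here is hypothetical: `has_object`, the set used as a local object store, and the callable hook are stand-ins for illustration, not the signatures of check_and_freshen() or freshen_loose_object().

```python
from typing import Callable, Set


def has_object(oid: str, local_store: Set[str],
               read_object_hook: Callable[[str], bool],
               skip_hook: bool = False) -> bool:
    """Check for an object locally; consult the hook only when allowed."""
    if oid in local_store:
        # The local check always runs, so existing objects are not rewritten.
        return True
    if skip_hook:
        # About to write the object anyway: avoid the potential round trip.
        return False
    return read_object_hook(oid)
```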
Alejandro Pauly 6f4062d600 Pass PID of git process to hooks.
Signed-off-by: Alejandro Pauly <alpauly@microsoft.com>
2020-01-14 08:51:21 -05:00
Kevin Willford 71383272cb sparse-checkout: avoid writing entries with the skip-worktree bit
When using the sparse-checkout feature git should not write to the working
directory for files with the skip-worktree bit on.  With the skip-worktree
bit on the file may or may not be in the working directory and if it is
not we don't want or need to create it by calling checkout_entry.

There are two callers of checkout_target, both of which check that the
file does not exist before calling it: load_current, which makes a call
to lstat right before calling checkout_target, and check_preimage, which
will only run checkout_target if stat_ret is less than zero.  It sets
stat_ret to zero and only if !stat->cached will it lstat the file and
set stat_ret to something other than zero.

This patch checks if the skip-worktree bit is on in checkout_target and
just returns so that the entry does not end up in the working directory.
This is so that apply will not create a file in the working directory,
then update the index but not keep the working directory up to date with
the changes that happened in the index.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
2020-01-14 08:51:21 -05:00
Ben Peart 425c1571a5 Add support for read-object as a background process to retrieve missing objects
This commit converts the existing read_object hook proc model for
downloading missing blobs to use a background process that is started
the first time git encounters a missing blob and stays running until git
exits.  Git and the read-object process communicate via stdin/stdout and
a versioned, capability negotiated interface as documented in
Documentation/technical/read-object-protocol.txt.  The advantage of this
over the previous hook proc is that it saves the overhead of spawning a
new hook process for every missing blob.

The model for the background process was refactored from the recent git
LFS work.  I refactored that code into a shared module (sub-process.c/h)
and then updated convert.c to consume the new library.  I then used the
same sub-process module when implementing the read-object background
process.

Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
2020-01-14 08:51:21 -05:00
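The saving described in the commit above comes from starting the helper once and reusing it for every missing object. The sketch below shows that shape with a toy helper and a made-up one-line-per-request exchange; it is not the versioned, capability-negotiated protocol documented in Documentation/technical/read-object-protocol.txt, and the class and helper script are hypothetical.

```python
import subprocess
import sys

# Toy helper: answers each object id read from stdin with one line on stdout.
HELPER = r"""
import sys
while True:
    line = sys.stdin.readline()
    if not line:
        break
    sys.stdout.write("fetched " + line.strip() + "\n")
    sys.stdout.flush()
"""


class ReadObjectProcess:
    """Start the helper on first use and keep it running until close()."""

    def __init__(self) -> None:
        self._proc = None
        self.spawn_count = 0

    def get(self, oid: str) -> str:
        if self._proc is None:
            # Spawned lazily, the first time an object is requested.
            self._proc = subprocess.Popen(
                [sys.executable, "-c", HELPER],
                stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
            self.spawn_count += 1
        self._proc.stdin.write(oid + "\n")
        self._proc.stdin.flush()
        return self._proc.stdout.readline().strip()

    def close(self) -> None:
        if self._proc is not None:
            self._proc.stdin.close()
            self._proc.stdout.close()
            self._proc.wait()
            self._proc = None
```

However many objects are requested, `spawn_count` stays at 1, which is the overhead the per-object hook model could not avoid.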
Johannes Schindelin da43a06898 t0400: verify that the hook is called correctly from a subdirectory
Suggested by Ben Peart.

Signed-off-by: Johannes Schindelin <johasc@microsoft.com>
2020-01-14 08:51:21 -05:00