Migrate applicable secrets to a new 'release' workflow environment. This
is a security measure to help ensure secrets cannot be accessed by those
without proper permissions.
An example of a passing `build-git-installers` workflow with these
changes can be found
[here](https://github.com/ldennington/git/actions/runs/5182147378) (I
set up my fork with the same migrated secret values as this repo).
Note that the old actions secrets will be left in this repo until the
next successful release, at which point they can be safely removed.
Teach `gvfs-helper` to ignore the optional `.idx` files that may be
included in a `prefetch` response and always use `git index-pack` to
create them from the `.pack` files received in the data stream.
This is a little wasteful in terms of client-side compute and of the
network bandwidth, but allows us to use the full packfile verification
code contained within `git index-pack` to ensure that the received
packfiles are valid.
The GVFS cache server can return multiple pairs of (.pack, .idx)
files. If both are provided, `gvfs-helper` assumes that they are
valid without any validation. This might cause problems if the
.pack file is corrupt inside the data stream. (This might happen
if the cache server sends extra unexpected STDERR data or if the
.pack file is corrupt on the cache server's disk.)
All of the .pack file verification logic is already contained
within `git index-pack`, so let's ignore the .idx from the data
stream and force compute it.
This defeats the purpose of some of the data caching on the cache
server, but safety is more important.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Add "mayhem" keys to generate corrupt packfiles and/or corrupt idx
files in prefetch by trashing the trailing checksum SHA.
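
As a rough illustration of what "trashing the trailing checksum" means, the sketch below assumes a SHA-1 repository, where a packfile ends with a 20-byte SHA-1 of all preceding bytes. The helper names and the payload are hypothetical; this is not the `gvfs-helper` or test-server code, and `git index-pack` performs far more validation than this single check.

```python
import hashlib

TRAILER_LEN = 20  # length of a SHA-1 digest in bytes


def append_trailer(body: bytes) -> bytes:
    """Return body followed by the SHA-1 of body, mimicking a packfile trailer."""
    return body + hashlib.sha1(body).digest()


def trailer_is_valid(data: bytes) -> bool:
    """Check that the last 20 bytes are the SHA-1 of everything before them."""
    if len(data) < TRAILER_LEN:
        return False
    body, trailer = data[:-TRAILER_LEN], data[-TRAILER_LEN:]
    return hashlib.sha1(body).digest() == trailer


def trash_trailer(data: bytes) -> bytes:
    """Corrupt the trailing checksum by flipping every bit of its last byte."""
    return data[:-1] + bytes([data[-1] ^ 0xFF])


# Hypothetical payload; a real packfile has a structured header and objects.
good = append_trailer(b"PACK" + b"hypothetical payload")
bad = trash_trailer(good)

good_ok = trailer_is_valid(good)
bad_ok = trailer_is_valid(bad)
```

A receiver that recomputes the checksum rejects `bad` while accepting `good`, which is the kind of detection the new tests exercise.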
Add unit tests to t5799 to verify that `gvfs-helper` detects these
corrupt pack/idx files.
Currently, only the (bad-pack, no-idx) case is correctly detected,
because `gvfs-helper` needs to locally compute the idx file itself.
A test for the (bad-pack, any-idx) case was also added (as a known
breakage) because `gvfs-helper` assumes that when the cache server
provides both, it doesn't need to verify them. We will fix that
assumption in the next commit.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
This adds a new builtin, `git update-microsoft-git`, that executes the platform-specific upgrade steps to get the latest version of `microsoft-git`.
On Windows, this means running `git update-git-for-windows`, which was updated to use the `microsoft/git` releases page when appropriate. See #321 for details.
On macOS, this means running a sequence of `brew` commands. These are adapted from the `UpgradeVerb` in `microsoft/scalar`, with an important simplification: we don't need to differentiate between the `scalar` and `scalar-azrepos` cask.
When the virtualfilesystem is enabled, the previous implementation of
clear_ce_flags would iterate all of the cache entries and query whether
each one is in the virtual filesystem to determine whether to clear one
of the SKIP_WORKTREE bits. For each cache entry, we would do a hash
lookup for each parent directory in the is_included_in_virtualfilesystem
function.
The former approach is slow for a typical Windows OS enlistment with
3 million files where only a small percentage is in the virtual
filesystem. The cost is
O(n_index_entries * n_chars_per_path * n_parent_directories_per_path).
In this change, we use the same approach as apply_virtualfilesystem,
which iterates the set of entries in the virtualfilesystem and searches
in the cache for the corresponding entries in order to clear their
flags. This approach has a cost of
O(n_virtual_filesystem_entries * n_chars_per_path * log(n_index_entries)).
The apply_virtualfilesystem code was refactored a bit and modified to
clear flags for all names that 'alias' a given virtual filesystem name
when ignore_case is set.
n_virtual_filesystem_entries is typically much less than
n_index_entries, in which case the new approach is much faster. We wind
up building the name hash for the index, but this occurs quickly thanks
to the multi-threading.
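
The difference between the two iteration strategies can be sketched as follows. This is a simplified model, not the C implementation: the function names are hypothetical, the index is modeled as a sorted list of paths, virtual filesystem directory entries are assumed to end in `/`, and a binary search stands in for the index lookup.

```python
from bisect import bisect_left


def parent_dirs(path):
    """Yield each parent directory of path, e.g. 'a/b/c' -> 'a/', 'a/b/'."""
    parts = path.split("/")
    for i in range(1, len(parts)):
        yield "/".join(parts[:i]) + "/"


def included_by_scanning_index(index_paths, virtual_entries):
    """Former approach: visit every index entry and probe a hash set for
    the path itself and for each of its parent directories."""
    virtual = set(virtual_entries)
    return [
        p for p in index_paths
        if p in virtual or any(d in virtual for d in parent_dirs(p))
    ]


def included_by_scanning_virtual(index_paths, virtual_entries):
    """New approach: visit each virtual filesystem entry and binary-search
    the sorted index for the matching entries."""
    found = set()
    for entry in virtual_entries:
        pos = bisect_left(index_paths, entry)
        if entry.endswith("/"):
            # Paths under a directory are contiguous in sorted order.
            while pos < len(index_paths) and index_paths[pos].startswith(entry):
                found.add(index_paths[pos])
                pos += 1
        elif pos < len(index_paths) and index_paths[pos] == entry:
            found.add(entry)
    return sorted(found)


index = sorted(["a/b.c", "a/d/e.c", "f.c", "g/h.c"])
virtual = ["a/", "g/h.c"]

old_result = included_by_scanning_index(index, virtual)
new_result = included_by_scanning_virtual(index, virtual)
```

Both functions select the same entries; the second does work proportional to the number of virtual filesystem entries rather than the number of index entries.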
This PR updates our `vfs-2.29.0` branch's version of `git maintenance` to match the latest in upstream. Unfortunately, not all of these commits made it to the `2.30` release candidate, but there are more commits from the series making it in. They will cause a conflict in the `vfs-2.30.0` rebase, so merge them in here. This also includes the `fixup!` reverts of the earlier versions.
Finally, I also noticed that we started depending on `git maintenance start` in Scalar for macOS, but we never checked that this worked with the shared object cache. It doesn't! 😨 The very tip commit of this PR includes logic to make `git maintenance run` care about `gvfs.sharedCache`. Functional test updates in Scalar will follow.
Add virtual file system settings and hook proc. On index load,
clear/set the skip worktree bits based on the virtual file system data.
Use virtual file system data to update skip-worktree bit in
unpack-trees. Use virtual file system data to exclude files and folders
not explicitly requested.
The hook was first contributed in private, but was extended via the
following pull requests:
#15, #27, #33, #70
Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
This replaces #493 (can't reopen a PR after a force-push...).
I updated this commit with a more firm version of the fix. This hopefully answers Victoria's excellent concerns with the previous approach.
I did not manage to get an automated test for this, but I did carefully verify this manually with a few commits in a VFS for Git enlistment (with different files every time). I updated the commit message with more details about why this works.
---
This fork contains changes specific to monorepo scenarios. If you are an
external contributor, then please detail your reason for submitting to
this fork:
* [X] This change only applies to the virtualization hook and VFS for Git.
Resolves #490.
Most of these were done in private before microsoft/git. However,
the following pull requests modified the core feature:
#85, #89, #91, #98, #243, #263
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
The maintenance.lock file exists to prevent concurrent maintenance
processes from writing to a repository at the same time. However, it has
the downside of causing maintenance to start failing without recovery if
there is any reason why the maintenance command failed without cleaning
up the lock-file.
This change makes maintenance delete a lock file that was last modified
more than 6 hours ago. This will auto-heal repositories that are stuck
with failed maintenance (it may fail again, but then we will get a
message other than "the lock file exists").
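
The staleness check described here could look roughly like the sketch below. It is an illustration only: the function name is hypothetical, the file modification time is assumed to be the staleness signal, and the actual change lives in Git's C code.

```python
import os
import tempfile
import time

STALE_AFTER_SECONDS = 6 * 60 * 60  # "over 6 hours ago" from the description


def remove_stale_lock(lock_path, now=None):
    """Delete lock_path if it was last modified more than six hours ago.

    Returns True if the lock file was removed, False otherwise.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.stat(lock_path).st_mtime
    except FileNotFoundError:
        return False  # nothing to clean up
    if now - mtime <= STALE_AFTER_SECONDS:
        return False  # lock is recent; assume another process holds it
    os.unlink(lock_path)
    return True


with tempfile.TemporaryDirectory() as tmp:
    lock = os.path.join(tmp, "maintenance.lock")
    open(lock, "w").close()

    # A freshly written lock must be left alone.
    fresh_removed = remove_stale_lock(lock)

    # Backdate the lock by seven hours and try again.
    seven_hours_ago = time.time() - 7 * 60 * 60
    os.utime(lock, (seven_hours_ago, seven_hours_ago))
    stale_removed = remove_stale_lock(lock)
    still_exists = os.path.exists(lock)
```

Keeping the threshold in one constant makes it easy to experiment with a longer window, such as 24 hours.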
The intention here is to get users out of a bad state without weakening the maintenance lock _too_ much. I'm open to other thoughts, including expanding the timeframe from 6 to 24 hours.
---
The `fixup!` commit fixes the use of the `gvfs.sharedcache` config value around the loose-objects task. Specifically, the loose-objects step might fail because it is calling `git pack-objects {garbage}/pack/loose` which fails.
During the 2.35.0 rebase, we ejected 570f64b (Fix reset when using the
sparse-checkout feature., 2017-03-15) because of a similar change
upstream that actually works with the expected behavior of
sparse-checkout.
That commit only ever existed in microsoft/git, but when it was
considered for upstream we realized that it behaved strangely for a
sparse-checkout scenario.
The root problem is that during a mixed reset, 'git reset <commit>'
updates the index to agree with <commit> but leaves the worktree the
same as it was before. The issue with sparse-checkout is that some files
might not be in the worktree and thus the information from those files
would be "lost".
The upstream decision was to leave these files as ignored, because
that's what the SKIP_WORKTREE bit means: don't put these files in the
worktree and ignore their contents. If there already were files in the
worktree, then Git does not change them. The case for "losing" data is
if a committed change outside of the sparse-checkout was in the previous
HEAD position. However, this information could be recovered from the
reflog.
The case where this is different is in a virtualized filesystem. The
virtualization is projecting the index contents onto the filesystem, so
we need to do something different here. In a virtual environment, every
file is considered "important" and we abuse the SKIP_WORKTREE bit to
indicate that Git does not need to process a projected file. When a file
is populated, the virtual filesystem hook provides the information for
removing the SKIP_WORKTREE bit.
In the case of these mixed resets, we have the issue where we change the
projection of the worktree for these cache entries that change. If a
file is populated in the worktree, then the populated file will persist
and appear in a follow-up 'git status'. However, if the file is not
populated and only projected, we change the projection from the current
value to the new value, leaving a clean 'git status'.
The previous version of this commit includes a call to checkout_entry(),
which populates the file. This causes the file to be actually in the
working tree and no longer projected.
To make this work with the upstream changes, stop setting the
skip-worktree bit for the new cache entry. This seemed to work fine
without this change, but it's likely due to some indirection with the
virtual filesystem. Better to do the best-possible thing here so we
don't hide a corner-case bug by accident.
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
* t1092: remove the 'git update-index' test that currently fails
because the command ignores the bad path, but doesn't return a
failure.
* dir.c: prevent matching against sparse-checkout patterns when the
virtual filesystem is enabled. Should prevent some corner case
issues.
* t1092: add quiet mode for some rebase tests because the stderr
output can change in some of the modes.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
The maintenance.lock file exists to prevent concurrent maintenance
processes from writing to a repository at the same time. However, it has
the downside of causing maintenance to start failing without recovery if
there is any reason why the maintenance command failed without cleaning
up the lock-file.
This change makes maintenance delete a lock file that was last modified
more than 6 hours ago. This will auto-heal repositories that are stuck
with failed maintenance (it may fail again, but then we will get a
message other than "the lock file exists").
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
During a run of the Scalar functional tests, we hit a case where the
inexact rename detection of a 'git cherry-pick' command slowed to the
point of writing its delayed progress, failing the test because stderr
differed from the control case. Showing progress like this when stderr
is not a terminal is non-standard for Git, so inject an isatty(2) when
initializing the progress option in sequencer.c.
Unfortunately, there is no '--quiet' option in 'git cherry-pick'
currently wired up. This could be considered in the future, and the
isatty(2) could be moved to that position. This would also be needed for
commands like 'git rebase', so we leave that for another time.
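
The idea of gating progress output on whether stderr is a terminal can be sketched as below. This is a Python analogue with a hypothetical function name, not the change made in sequencer.c; it relies only on `os.isatty`, which wraps the same check as the C `isatty()` call.

```python
import os


def should_show_progress(fd=2):
    """Show progress only when the descriptor (stderr by default) is a terminal."""
    return os.isatty(fd)


# A pipe is never a terminal, which mirrors a test harness capturing stderr.
read_end, write_end = os.pipe()
progress_on_pipe = should_show_progress(write_end)
os.close(read_end)
os.close(write_end)
```

When stderr is captured by a test harness, as in the functional tests, the check returns false and no delayed progress lines are written.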
Reported-by: Victoria Dye <vdye@github.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Test cases specific to handling untracked files in `git stash` a) ensure
that files outside the sparse checkout definition are handled as expected
and b) document the index expansion inside of `git stash -u`. Note that, in b),
it is not the full repository index that is expanded - it is the temporary,
standalone index containing the stashed untracked files only.
Signed-off-by: Victoria Dye <vdye@github.com>
This branch is exactly #410, but with one more commit: enabling the sparse index by default in d59110a81b42fe80cae0b788869c28c3b1eb8785.
Having this in the `vfs-2.33.0` branch helps build confidence that the sparse index is doing what it should be doing by running in the Scalar functional tests and in our test branches.
If we want to cut a new `microsoft/git` release without enabling the sparse index, we can simply revert this commit.
This verifies that `diff` and `diff --staged` behave the same in sparse
index repositories in the following partially-staged scenarios (i.e. the
index, HEAD, and working directory differ at a given path):
1. Path is within sparse-checkout cone.
2. Path is outside sparse-checkout cone.
3. A merge conflict exists for paths outside sparse-checkout cone.
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
```
6e74958f59 p2000: add 'git checkout -' test and decrease depth
3e1d03c41b p2000: compress repo names
cd94f82005 commit: integrate with sparse-index
65e79b8037 sparse-index: recompute cache-tree
e9a9981477 checkout: stop expanding sparse indexes
4b801c854f t1092: document bad 'git checkout' behavior
71e301501c unpack-trees: resolve sparse-directory/file conflicts
5e96df4df5 t1092: test merge conflicts outside cone
defab1b86d add: allow operating on a sparse-only index
9fc4313c88 pathspec: stop calling ensure_full_index
0ec03ab021 add: ignore outside the sparse-checkout in refresh()
adf5b15ac3 add: remove ensure_full_index() with --renormalize
```
These commits are equivalent to those already in `next` via gitgitgadget/git#999.
```
80b8d6c56b Merge branch 'sparse-index/add' into stolee/sparse-index/add
```
This merge resolves conflicts with some work that happened in parallel, but is already in upstream `master`.
```
c407b2cb34 t7519: rewrite sparse index test
9dad0d2a3d sparse-index: silently return when not using cone-mode patterns
29749209c6 sparse-index: silently return when cache tree fails
e7cdaa0141 unpack-trees: fix nested sparse-dir search
347410c5d5 sparse-checkout: create helper methods
4537233f04 attr: be careful about sparse directories
5282a8614f sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag
3a2f3164fc sparse-checkout: clear tracked sparse dirs
fb47b56719 sparse-checkout: add config to disable deleting dirs
```
These commits are the ones under review as of gitgitgadget/git#1009. Recent review made this less stable. It's a slightly different and more robust version of #396.
> Note: I'm still not done with the feedback for upstream, but the remaining feedback is "can we add tests that cover these tricky technical bits?" and in `microsoft/git` these are already covered by the Scalar functional tests (since that's how they were found).
```
080b02c7b6 diff: ignore sparse paths in diffstat
d91a647494 merge: make sparse-aware with ORT
df49b5f030 merge-ort: expand only for out-of-cone conflicts
cdecb85d77 t1092: add cherry-pick, rebase tests
0c1ecfbdc6 sequencer: ensure full index if not ORT strategy
406dfbede9 sparse-index: integrate with cherry-pick and rebase
```
These commits integrate with `git merge`, `git cherry-pick`, `git revert`, and `git rebase` as of gitgitgadget/git#1019. This got some feedback that changed how the tests were working so they are more robust. This led to a new commit (0c1ecfbdc6).
```
cbb0ab3b06 Merge branch 'sparse-index/merge' into vfs-2.33.0
acb8623f22 t7524: test no longer fails
```
Finally, the commits are merged into `vfs-2.33.0`, and we also include a fix to a `microsoft/git` test that is no longer broken.
There is some strangeness when expanding a sparse-index that exists
within a submodule. We will need to resolve that later, but for now,
let's do a better job of explicitly disabling the sparse-index when
requested, and do so in t7817.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Upstream, a20f704 (add: warn when asked to update SKIP_WORKTREE entries,
2021-04-08) modified how 'git add <pathspec>' works with cache entries
marked with the SKIP_WORKTREE bit. The intention is to prevent a user
from accidentally adding a path that is outside their sparse-checkout
definition but somehow matches an existing index entry.
This breaks when using the virtual filesystem in VFS for Git. It is
rare, but we could be in a scenario where the user has staged a change
and then the file is projected away. If the user re-adds the file, then
this warning causes the command to fail with the advice message.
Disable this logic when core_virtualfilesystem is enabled.
This should allow the VFS for Git functional tests to pass (at least
the ones in the default run). I'll create a `-pr` installer build to
check before merging this.
The diff_populate_filespec() method is used to describe the diff after a
merge operation is complete, especially when a conflict appears. In
order to avoid expanding a sparse index, the reuse_worktree_file() needs
to be adapted to ignore files that are outside of the sparse-checkout
cone. The file names and OIDs used for this check come from the merged
tree in the case of the ORT strategy, not the index, hence the ability
to look into these paths without having already expanded the index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
These are two highly-requested items from an internal team considering a
move to Scalar using Azure Repos.
1. Remove the requirement that we create a `src` directory at clone time.
2. Allow `git worktree` even when using the GVFS protocol.
These are not difficult to implement. The `--no-src` option could even
be submitted upstream (though the commit will need to drop one bit about
an interaction with the local cache path).
Upstream, a20f704 (add: warn when asked to update SKIP_WORKTREE entries,
2021-04-08) modified how 'git add <pathspec>' works with cache entries
marked with the SKIP_WORKTREE bit. The intention is to prevent a user
from accidentally adding a path that is outside their sparse-checkout
definition but somehow matches an existing index entry.
A similar change for 'git rm' happened in d5f4b82 (rm: honor sparse
checkout patterns, 2021-04-08).
This breaks when using the virtual filesystem in VFS for Git. It is
rare, but we could be in a scenario where the user has staged a change
and then the file is projected away. If the user re-adds the file, then
this warning causes the command to fail with the advice message.
Disable this logic when core_virtualfilesystem is enabled.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
The clean_tracked_sparse_directories() method deletes the tracked
directories that go out of scope when the sparse-checkout cone changes,
at least in cone mode. This is new behavior, but is recommended based on
our understanding of how users are interacting with the feature in most
cases.
It is possible that some users will object to the new behavior, so
create a new configuration option 'index.deleteSparseDirectories' that
can be set to 'false' to make clean_tracked_sparse_directories() do
nothing. This will keep all untracked files in the working tree and
cause performance problems with the sparse index, but those trade-offs
are for the user to decide.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
When running 'scalar reconfigure -a', such as at install time, Scalar
has warning messages about the repository missing (or not containing a
.git directory). Failures can also happen while trying to modify the
repository-local config for that repository.
These warnings may seem confusing to users who don't understand what
they mean or how to stop them.
Add a warning that instructs the user how to remove the warning in
future installations.
Some users have strong aversions to Scalar's opinion that the repository
should be in a 'src' directory, even though it creates a clean slate for
placing build outputs in adjacent directories.
The --no-src option allows users to opt-out of the default behavior. The
parse-opt logic automatically figures out that '--src' is a possible
flag that negates '--no-src'.
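
For readers unfamiliar with paired boolean flags, the behavior is analogous to the following Python `argparse` sketch. This is only an analogy: Git uses its own parse-options API in C, the program name here is hypothetical, and `argparse` derives the pair in the opposite direction (declaring `--src` yields `--no-src`). It requires Python 3.9 or later.

```python
import argparse

parser = argparse.ArgumentParser(prog="clone-sketch")  # hypothetical program name
parser.add_argument(
    "--src",
    action=argparse.BooleanOptionalAction,  # also registers --no-src
    default=True,  # the default behavior creates the 'src' directory
    help="place the repository in a 'src' subdirectory",
)

default_args = parser.parse_args([])
no_src_args = parser.parse_args(["--no-src"])
# When both flags are given, the last one on the command line wins.
last_wins_args = parser.parse_args(["--no-src", "--src"])
```

The positive flag is mainly useful for overriding an earlier `--no-src`, for example one supplied by an alias or script.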
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
This allows fixing settings after a Scalar upgrade, or after botching
the enlistments configuration.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When running 'scalar reconfigure -a', such as at install time, Scalar
has warning messages about the repository missing (or not containing a
.git directory). Failures can also happen while trying to modify the
repository-local config for that repository.
These warnings may seem confusing to users who don't understand what
they mean or how to stop them.
Add a warning that instructs the user how to remove the warning in
future installations.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
We should not be putting the .scalarCache inside the enlistment as a
sibling to the 'src' directory. This only happens in "unattended" mode,
but it also negates any benefit of a shared object cache because each
enlistment absolutely does not share any objects with others.
Move the shared object cache in this case to a level above the
enlistment, so at least there is some hope that it can be reused. This
is also critical to the upcoming --no-src option, since the shared
object cache cannot be located within the Git repository.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>