Test cases specific to handling untracked files in `git stash` a) ensure
that files outside the sparse checkout definition are handled as expected
and b) document the index expansion inside of `git stash -u`. Note that, in b),
it is not the full repository index that is expanded - it is the temporary,
standalone index containing the stashed untracked files only.
Signed-off-by: Victoria Dye <vdye@github.com>
This branch is exactly #410, but with one more commit: enabling the sparse index by default in d59110a81b42fe80cae0b788869c28c3b1eb8785.
Having this in the `vfs-2.33.0` branch helps build confidence that the sparse index is doing what it should be doing by running in the Scalar functional tests and in our test branches.
If we want to cut a new `microsoft/git` release without enabling the sparse index, we can simply revert this commit.
This verifies that `diff` and `diff --staged` behave the same in sparse
index repositories in the following partially-staged scenarios (i.e. the
index, HEAD, and working directory differ at a given path):
1. Path is within sparse-checkout cone.
2. Path is outside sparse-checkout cone.
3. A merge conflict exists for paths outside sparse-checkout cone.
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
```
6e74958f59 p2000: add 'git checkout -' test and decrease depth
3e1d03c41b p2000: compress repo names
cd94f82005 commit: integrate with sparse-index
65e79b8037 sparse-index: recompute cache-tree
e9a9981477 checkout: stop expanding sparse indexes
4b801c854f t1092: document bad 'git checkout' behavior
71e301501c unpack-trees: resolve sparse-directory/file conflicts
5e96df4df5 t1092: test merge conflicts outside cone
defab1b86d add: allow operating on a sparse-only index
9fc4313c88 pathspec: stop calling ensure_full_index
0ec03ab021 add: ignore outside the sparse-checkout in refresh()
adf5b15ac3 add: remove ensure_full_index() with --renormalize
```
These commits are equivalent to those already in `next` via gitgitgadget/git#999.
```
80b8d6c56b Merge branch 'sparse-index/add' into stolee/sparse-index/add
```
This merge resolves conflicts with some work that happened in parallel, but is already in upstream `master`.
```
c407b2cb34 t7519: rewrite sparse index test
9dad0d2a3d sparse-index: silently return when not using cone-mode patterns
29749209c6 sparse-index: silently return when cache tree fails
e7cdaa0141 unpack-trees: fix nested sparse-dir search
347410c5d5 sparse-checkout: create helper methods
4537233f04 attr: be careful about sparse directories
5282a8614f sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag
3a2f3164fc sparse-checkout: clear tracked sparse dirs
fb47b56719 sparse-checkout: add config to disable deleting dirs
```
These commits are the ones under review as of gitgitgadget/git#1009. Recent review made this less stable. It's a slightly different and more robust version of #396.
> Note: I'm still not done with the feedback for upstream, but the remaining feedback is "can we add tests that cover these tricky technical bits?" and in `microsoft/git` these are already covered by the Scalar functional tests (since that's how they were found).
```
080b02c7b6 diff: ignore sparse paths in diffstat
d91a647494 merge: make sparse-aware with ORT
df49b5f030 merge-ort: expand only for out-of-cone conflicts
cdecb85d77 t1092: add cherry-pick, rebase tests
0c1ecfbdc6 sequencer: ensure full index if not ORT strategy
406dfbede9 sparse-index: integrate with cherry-pick and rebase
```
These commits integrate with `git merge`, `git cherry-pick`, `git revert`, and `git rebase` as of gitgitgadget/git#1019. This got some feedback that changed how the tests were working so they are more robust. This led to a new commit (0c1ecfbdc6).
```
cbb0ab3b06 Merge branch 'sparse-index/merge' into vfs-2.33.0
acb8623f22 t7524: test no longer fails
```
Finally, the commits are merged into `vfs-2.33.0`, and we also include a fix for a `microsoft/git` test that is no longer broken.
There is some strangeness when expanding a sparse-index that exists
within a submodule. We will need to resolve that later, but for now,
let's do a better job of explicitly disabling the sparse-index when
requested, and do so in t7817.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Upstream, a20f704 (add: warn when asked to update SKIP_WORKTREE entries,
2021-04-08) modified how 'git add <pathspec>' works with cache entries
marked with the SKIP_WORKTREE bit. The intention is to prevent a user
from accidentally adding a path that is outside their sparse-checkout
definition but somehow matches an existing index entry.
This breaks when using the virtual filesystem in VFS for Git. It is
rare, but we could be in a scenario where the user has staged a change
and then the file is projected away. If the user re-adds the file, then
this warning causes the command to fail with the advice message.
Disable this logic when core_virtualfilesystem is enabled.
This should allow the VFS for Git functional tests to pass (at least
the ones in the default run). I'll create a `-pr` installer build to
check before merging this.
The diff_populate_filespec() method is used to describe the diff after a
merge operation is complete, especially when a conflict appears. In
order to avoid expanding a sparse index, reuse_worktree_file() needs
to be adapted to ignore files that are outside of the sparse-checkout
cone. The file names and OIDs used for this check come from the merged
tree in the case of the ORT strategy, not the index, hence the ability
to look into these paths without having already expanded the index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
The clean_tracked_sparse_directories() method deletes the tracked
directories that go out of scope when the sparse-checkout cone changes,
at least in cone mode. This is new behavior, but is recommended based on
our understanding of how users are interacting with the feature in most
cases.
It is possible that some users will object to the new behavior, so
create a new configuration option 'index.deleteSparseDirectories' that
can be set to 'false' to make clean_tracked_sparse_directories() do
nothing. This will keep all untracked files in the working tree and
cause performance problems with the sparse index, but those trade-offs
are for the user to decide.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Upstream, a20f704 (add: warn when asked to update SKIP_WORKTREE entries,
2021-04-08) modified how 'git add <pathspec>' works with cache entries
marked with the SKIP_WORKTREE bit. The intention is to prevent a user
from accidentally adding a path that is outside their sparse-checkout
definition but somehow matches an existing index entry.
A similar change for 'git rm' happened in d5f4b82 (rm: honor sparse
checkout patterns, 2021-04-08).
This breaks when using the virtual filesystem in VFS for Git. It is
rare, but we could be in a scenario where the user has staged a change
and then the file is projected away. If the user re-adds the file, then
this warning causes the command to fail with the advice message.
Disable this logic when core_virtualfilesystem is enabled.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Azure Repos does not support partial clones at the moment, but it does
support the GVFS protocol. To that end, the Microsoft fork of Git has a
`gvfs-helper` command that is optionally used to perform essentially the
same functionality as partial clone.
Let's verify that `scalar clone` detects that situation and enables the
GVFS helper.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This finalizes the port of the `QueryVstsInfo()` function: we already
taught `gvfs-helper` to access the `vsts/info` endpoint on demand, we
implemented proper JSON parsing, and now it is time to hook it all up.
To that end, we also provide a default local cache root directory. It
works the same way as the .NET version of Scalar: it uses
C:\scalarCache on Windows,
~/.scalarCache/ on macOS and
~/.cache/scalar on Linux
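
As a rough illustration of the mapping above, the following sketch picks
the default cache root for a given platform. The enum, the function name
`default_cache_root_sketch()`, and passing the home directory in as a
parameter are assumptions made for this example; they are not taken from
the code in this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical platform tag; real code would more likely use #ifdefs. */
enum platform_sketch { SKETCH_WINDOWS, SKETCH_MACOS, SKETCH_LINUX };

/*
 * Write the default cache root for `platform` into `out`.
 * `home` stands in for the expansion of "~" and is ignored on Windows.
 * Returns the length reported by snprintf(), or -1 for an unknown platform.
 */
static int default_cache_root_sketch(enum platform_sketch platform,
				     const char *home,
				     char *out, size_t outlen)
{
	assert(out != NULL && outlen > 0);

	switch (platform) {
	case SKETCH_WINDOWS:
		return snprintf(out, outlen, "%s", "C:\\scalarCache");
	case SKETCH_MACOS:
		assert(home != NULL);
		return snprintf(out, outlen, "%s/.scalarCache", home);
	case SKETCH_LINUX:
		assert(home != NULL);
		return snprintf(out, outlen, "%s/.cache/scalar", home);
	}
	return -1;
}
```
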
Modified to include call to is_unattended() that was removed from a
previous commit.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, both the forward slash and the backslash are directory
separators, which means that `a\b\c` really is inside `a/b`. Therefore,
we need to special-case the directory separators in the helper function
`cmp_icase()` that is used in the loop in `dir_inside_of()`.
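
A minimal sketch of the idea, assuming a comparison that is always
case-insensitive and a simplified containment check. The `_sketch` names
are hypothetical stand-ins; the real `cmp_icase()` and `dir_inside_of()`
have different signatures and honor `ignore_case`.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* On Windows, both '/' and '\\' separate directories. */
static int is_dir_sep_sketch(char c)
{
	return c == '/' || c == '\\';
}

/* Return 0 when the two characters should be treated as equal. */
static int cmp_icase_sketch(char a, char b)
{
	if (a == b)
		return 0;
	if (is_dir_sep_sketch(a) && is_dir_sep_sketch(b))
		return 0; /* the special case: '/' matches '\\' */
	return tolower((unsigned char)a) - tolower((unsigned char)b);
}

/*
 * Return 1 if `path` is `dir` itself or lies inside it, else 0.
 * Simplified stand-in for the loop in dir_inside_of().
 */
static int path_is_inside_sketch(const char *path, const char *dir)
{
	const char *dir_start = dir;

	assert(path != NULL && dir != NULL);

	while (*dir && *path && !cmp_icase_sketch(*dir, *path)) {
		dir++;
		path++;
	}
	if (*dir)
		return 0; /* `dir` was not fully matched */
	if (dir > dir_start && is_dir_sep_sketch(dir[-1]))
		return 1; /* `dir` ended in a separator */
	return !*path || is_dir_sep_sketch(*path);
}
```

With this sketch, `path_is_inside_sketch("a\\b\\c", "a/b")` reports
containment, while `"a/bc"` is correctly rejected as a mere string prefix.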
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We already have the `config` command that accesses the `gvfs/config`
endpoint.
To implement `scalar`, we also need to be able to access the `vsts/info`
endpoint. Let's add a command to do precisely that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This comes in handy, as we want to verify that `scalar clone` also works
against a GVFS-enabled remote repository.
Note that we have to set `MSYS2_ENV_CONV_EXCL` to prevent MSYS2 from
mangling `PATH_TRANSLATED`: The value _does_ look like a Unix-style
path, but no, MSYS2 must not be allowed to convert that into a Windows
path: `http-backend` needs it in the unmodified form. (The MSYS2 runtime
comes in when `git` is run via `bin-wrappers/git`, which is a shell
script.)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
With this change, we come a big step closer to feature parity with
Scalar: this allows cloning from Azure Repos (which do not support
partial clones at time of writing).
We use the just-implemented JSON parser to parse the response we got
from the `gvfs/config` endpoint. Please note that this response might,
or might not, contain information about a cache server. The presence or
absence of said cache server, however, has nothing to do with the
ability to speak the GVFS protocol (but the presence of the
`gvfs/config` endpoint does that).
An alternative considered during the development of this patch was to
perform simple string matching instead of parsing the JSON-formatted
data. However, this would have been fragile, as the response contains
free-form text (e.g. the repository's description) which might contain
parts that would confuse a simple string matcher (but not a proper JSON
parser).
Note: we need to limit the re-try logic in `git clone` to handle only
the non-GVFS case: the call to `set_config()` to un-set the partial
clone settings would otherwise fail because those settings would not
exist in the GVFS protocol case. This will at least give us a clearer
reason why such a fetch fails.
Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
This merges the upstreamable part of the Scalar patches.
Minor merge conflicts (caused by the gvfs-helper) were resolved
trivially.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
No grown-up C project comes without its own JSON parser.
Just kidding!
We need to parse a JSON result when determining which cache server to
use. While searching for the needles `"CacheServers":[`,
`"Url":"` and `"GlobalDefault":true` _happens_ to work right now, it is
fragile, as it depends on no whitespace padding and on the order of the
fields remaining as-is.
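
To make the fragility concrete, here is a small sketch of such a needle
search. The helper name `naive_has_global_default()` is hypothetical and
only illustrates the approach being replaced.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical naive check: look for the exact needle, byte for byte. */
static int naive_has_global_default(const char *json)
{
	assert(json != NULL);
	return strstr(json, "\"GlobalDefault\":true") != NULL;
}
```

A compact response containing `"GlobalDefault":true` matches, but the
equally valid `"GlobalDefault": true` (one space after the colon) does
not, even though both mean the same thing to a JSON parser.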
Let's implement a super simple JSON parser (at the cost of being
slightly inefficient) for that purpose. To avoid allocating a ton of
memory, we implement a callback-based one. And to save on complexity,
let's not even bother validating the input properly (we will just go
ahead and instead rely on Azure Repos to produce correct JSON).
Note: An alternative would have been to use existing solutions such as
JSON-C, CentiJSON or JSMN. However, they are all a lot larger than the
current solution. The smallest, JSMN, which does not even provide parsed
string values (something we actually need) weighs in with 471 lines,
while we get away with 182 + 29 lines for the C and the header file,
respectively.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Microsoft's fork of Git is not quite Git for Windows, therefore we want
to tell the keen reader all about it. :-)
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
We have long inherited the pull request template from
git-for-windows/git, but we should probably do a better job of
explaining why a PR in microsoft/git exists instead of an
upstream contribution.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The steps to update the microsoft-git cask are:
1. brew update
2. brew upgrade --cask microsoft-git
This is adapted from the UpgradeVerb within microsoft/scalar. There is
one important simplification: Scalar needed to check 'brew list --cask'
to find out if the 'scalar' cask or the 'scalar-azrepos' cask was
installed (which determined if the 'microsoft-git' cask was a necessary
dependency). We do not need that here, since we are already in the
microsoft-git cask.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Add basic installer validation to release pipeline for Windows, macOS, and
Linux (Debian package only). Validation runs the installers and any necessary
setup and checks that the installed version matches the expected version.
In 'git-update-git-for-windows', there is a recently_seen variable that
is loaded from Git config. This is intended to allow users to say "No, I
don't want that version of Git for Windows." If users say no, then they
are not reminded. Ever.
We want users of microsoft/git to be notified repeatedly until they
upgrade. The first notification might be dismissed because they don't
want to interrupt their work. They should get the picture within a few
reminders and upgrade in a timely fashion.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
We have been using the default issue template from git-for-windows/git,
but we should ask different questions than Git for Windows. Update the
issue template to ask these helpful questions.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
On Windows, we have the 'git update-git-for-windows' command. It is
poorly named within the microsoft/git fork, because the script has been
updated to look at the GitHub releases of microsoft/git, not
git-for-windows/git.
Still, it handles all the complicated details about downloading,
verifying, and running the installer.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Set the `GIT_BUILT_FROM_COMMIT` based on the version specified in the `make
dist` output archive header. This ensures the commit hash is shown in
`git version --build-options`.
Signed-off-by: Victoria Dye <vdye@github.com>
Update build-git-installers workflow to publish `microsoft/git`'s GPG public
key as part of each release. Add explanation for how to use this key to verify
the Debian package's signature to the README.
Just do the boilerplate stuff of making a new builtin, including
documentation and integration with git.c.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Update `git archive` tree-ish argument from `HEAD^{tree}` to `HEAD`. By
using a commit (rather than tree) reference, the commit hash will be stored
as an extended pax header, extractable with `git get-tar-commit-id`.
The intended use-case for this change is building `git` from the output of
`make dist` - in combination with the ability to specify a fallback
`GIT_BUILT_FROM_COMMIT`, a user can extract the commit ID used to build the
archive and set it as `GIT_BUILT_FROM_COMMIT`. The result is fully-populated
information for the commit hash in `git version --build-options`.
Signed-off-by: Victoria Dye <vdye@github.com>
Allow specification of a custom `GIT_BUILT_FROM_COMMIT` string to replace
the output of `git rev-parse HEAD`. This allows a build of `git` from
somewhere other than an active clone of `git` (e.g. from the archive created
with `make dist`) to include commit information in
`git version --build-options`.
Signed-off-by: Victoria Dye <vdye@github.com>
- sign using Azure-stored certificates & client
- sign on Windows agent via python script
- job skipped if credentials for accessing certificate aren't present
Co-authored-by: Lessley Dennington <ldennington@github.com>
This was disabled by a0da6deeec (ci: only run win+VS build & tests in
Git for Windows' fork, 2022-12-19) to avoid other forks doing too many
builds. But we want to keep these builds for the microsoft/git fork.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
When the virtualfilesystem is enabled the previous implementation of
clear_ce_flags would iterate all of the cache entries and query whether
each one is in the virtual filesystem to determine whether to clear one
of the SKIP_WORKTREE bits. For each cache entry, we would do a hash
lookup for each parent directory in the is_included_in_virtualfilesystem
function.
The former approach is slow for a typical Windows OS enlistment with
3 million files where only a small percentage is in the virtual
filesystem. The cost is
O(n_index_entries * n_chars_per_path * n_parent_directories_per_path).
In this change, we use the same approach as apply_virtualfilesystem,
which iterates the set of entries in the virtualfilesystem and searches
in the cache for the corresponding entries in order to clear their
flags. This approach has a cost of
O(n_virtual_filesystem_entries * n_chars_per_path * log(n_index_entries)).
The apply_virtualfilesystem code was refactored a bit and modified to
clear flags for all names that 'alias' a given virtual filesystem name
when ignore_case is set.
n_virtual_filesystem_entries is typically much less than
n_index_entries, in which case the new approach is much faster. We wind
up building the name hash for the index, but this occurs quickly thanks
to the multi-threading.
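
The shape of the new approach can be sketched as follows, assuming a
sorted array of entries and a plain binary search. The struct, the flag
value, and the function names are hypothetical simplifications; the real
code works on `struct cache_entry` and also handles the `ignore_case`
aliases through the name hash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_SKIP_WORKTREE 0x1u

/* Hypothetical, stripped-down stand-in for an index entry. */
struct entry_sketch {
	const char *name;
	unsigned int flags;
};

/* Binary search in an index sorted by name; returns position or -1. */
static int find_entry_sketch(const struct entry_sketch *index, size_t n,
			     const char *name)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(index[mid].name, name);

		if (!cmp)
			return (int)mid;
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}

/*
 * Iterate the (small) virtual filesystem list rather than the (large)
 * index, clearing the flag on each entry found. Returns the number of
 * entries whose flag was cleared.
 */
static size_t clear_flags_sketch(struct entry_sketch *index, size_t n_index,
				 const char **virtual_names, size_t n_virtual)
{
	size_t i, cleared = 0;

	assert(index != NULL && virtual_names != NULL);

	for (i = 0; i < n_virtual; i++) {
		int pos = find_entry_sketch(index, n_index, virtual_names[i]);

		if (pos >= 0 && (index[pos].flags & SKETCH_SKIP_WORKTREE)) {
			index[pos].flags &= ~SKETCH_SKIP_WORKTREE;
			cleared++;
		}
	}
	return cleared;
}
```

The outer loop runs once per virtual filesystem entry and each lookup is
logarithmic in the index size, which is where the second cost formula
comes from.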
Signed-off-by: Neeraj Singh <neerajsi@ntdev.microsoft.com>
For Scalar and VFS for Git, we use an alternate as a shared object
cache. We need to enable the maintenance builtin to work on that
shared object cache, especially in the background.
'scalar run <task>' would set GIT_OBJECT_DIRECTORY to handle this.
We set GIT_OBJECT_DIRECTORY based on the gvfs.sharedCache config,
but we also need the checks in pack_loose() to look at that object
directory instead of the current ODB's.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
The GVFS cache server can return multiple pairs of (.pack, .idx)
files. If both are provided, `gvfs-helper` assumes that they are
valid without any validation. This might cause problems if the
.pack file is corrupt inside the data stream. (This might happen
if the cache server sends extra unexpected STDERR data or if the
.pack file is corrupt on the cache server's disk.)
All of the .pack file verification logic is already contained
within `git index-pack`, so let's ignore the .idx from the data
stream and force compute it.
This defeats the purpose of some of the data caching on the cache
server, but safety is more important.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
- include `scalar`
- build signed .dmg & .pkg for target OS version 10.6
- upload artifacts to workflow
Co-authored-by: Lessley Dennington <ldennington@github.com>
It really does not make sense to run that workflow in any fork of
git-for-windows/git. Typically, it is enough to simply disable it (since
it is a scheduled workflow, it is disabled by default in any new fork).
However, in microsoft/git, we switch the default branch whenever we
rebase to a new upstream version, and every time we do so, this
scheduled workflow gets re-enabled.
Let's just delete it in microsoft/git and never be bothered by it again.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Add a GitHub workflow that is triggered on the `release` event to
automatically update the `microsoft-git` Homebrew Cask on the
`microsoft/git` Tap.
A secret `HOMEBREW_TOKEN` with push permissions to the
`microsoft/homebrew-git` repository must exist. For now, a pull request
is created to allow for last-minute manual verification.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
When building Git as a universal binary on macOS, the binary supports more than
one target architecture. This is a bit of a problem for the `HOST_CPU`
setting that is woefully unprepared for such a situation, as it wants to
show the architecture hard-coded at build time.
In preparation for releasing universal builds, work around this by
special-casing `universal` and replacing it at run-time with the known
values `x86_64` or `arm64`.
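
One way such a run-time replacement could look is sketched below. The
function name is hypothetical, and the predefined macros are the ones GCC
and Clang provide; the actual patch may detect the architecture
differently.

```c
#include <assert.h>
#include <string.h>

/*
 * Map the build-time value "universal" to the architecture of the slice
 * that is actually running; pass every other value through unchanged.
 */
static const char *resolve_host_cpu_sketch(const char *host_cpu)
{
	assert(host_cpu != NULL);

	if (strcmp(host_cpu, "universal"))
		return host_cpu; /* single-architecture build: keep as-is */

#if defined(__x86_64__)
	return "x86_64";
#elif defined(__aarch64__) || defined(__arm64__)
	return "arm64";
#else
	return host_cpu; /* unknown architecture: leave the value untouched */
#endif
}
```

Because each slice of a universal binary is compiled separately, the
preprocessor branch is resolved per architecture even though the
`HOST_CPU` string itself is shared.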
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
- trigger on tag matching basic "vfs" version pattern
- validate tag is annotated & matches stricter checks
- include `scalar`
- build x86_64 & portable git installers, upload artifacts to workflow
Update Apr 18, 2022: these steps are built explicitly on 'windows-2019'
agents (rather than 'windows-latest') to ensure the correct version of
Visual Studio is used (verified in the pipeline via 'type -p mspdb140.dll').
Additionally, due to a known (but not-yet-fixed) issue downloading the
'build-installers' flavor of the Git for Windows SDK with the
'git-for-windows/setup-git-for-windows-sdk' Action, the SDK used is the
'full' flavor.
`sparse` complains with an error message like this:
gvfs-helper.c:2912:17: error: expression using sizeof on a
function
The culprit is this line:
curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, fwrite);
Similar lines exist in `http-push.c` and other files that are in
upstream Git, and to avoid these bogus warnings, they are already
exempted from `sparse`'s tender, loving care. We simply add
`gvfs-helper.c` to that list.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Teach helper/test-gvfs-protocol to be able to send corrupted
loose blobs.
Add unit test for gvfs-helper to detect receipt of a corrupted loose blob.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add "mayhem" keys to generate corrupt packfiles and/or corrupt idx
files in prefetch by trashing the trailing checksum SHA.
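
A minimal sketch of what "trashing the trailing checksum" can mean,
assuming a 20-byte SHA-1 trailer and an in-memory buffer. The function
name and the choice of flipping every bit are assumptions for
illustration, not the test helper's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SHA1_RAWSZ 20

/*
 * Invert the last 20 bytes of `buf` so the trailing checksum no longer
 * matches the content. Returns 0 on success, -1 if the buffer is too
 * short to contain a checksum.
 */
static int trash_trailing_checksum_sketch(unsigned char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL);

	if (len < SKETCH_SHA1_RAWSZ)
		return -1;
	for (i = len - SKETCH_SHA1_RAWSZ; i < len; i++)
		buf[i] ^= 0xff;
	return 0;
}
```
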
Add unit tests to t5799 to verify that `gvfs-helper` detects these
corrupt pack/idx files.
Currently, only the (bad-pack, no-idx) case is correctly detected,
because `gvfs-helper` needs to locally compute the idx file itself.
A test for the (bad-pack, any-idx) case was also added (as a known
breakage) because `gvfs-helper` assumes that when the cache server
provides both, it doesn't need to verify them. We will fix that
assumption in the next commit.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>