The version currently in use is not the version referenced by the version of StackExchange.Redis.
[This conversation](https://github.com/StackExchange/StackExchange.Redis/issues/1120) suggests that this mismatch may be what causes the connectivity issues.
Related work items: #1719347
Starting with Visual Studio 16.6 the .editorconfig can contain a copyright text that is enforced by the IDE.
This PR adds the new option. I also regenerated the full solution and fixed all the known violations of our copyright header.
The feature is available only in 16.6, but you can stay on an older version without any issues; this feature just won't be available to you.
Many parts of our codebase assume that the deserialized path table is non-null. The code that tries to deserialize the path table is always guarded by a try/catch that handles deserialization failure.
Currently, if the serialized path table is corrupted, deserializing it can produce a null path table. The code that deserializes the path table then treats the deserialization as successful, but the downstream code that consumes the path table fails with a contract exception (many parts of our code assert that the path table is non-null).
```
Exception:System.Diagnostics.ContractsLight.ContractException: Precondition failed.
at \.\Public\Src\Engine\Scheduler\IncrementalScheduling\GraphAgnosticIncrementalSchedulingState.cs:1773
at void System.Diagnostics.ContractsLight.ContractRuntimeHelper.TriggerFailure(ContractFailureKind kind, string msg, string userMessage, string conditionTxt)
at void System.Diagnostics.ContractsLight.ContractRuntimeHelper.ReportFailure(ContractFailureKind kind, string msg, string conditionTxt, Provenance provenance)
at void BuildXL.Scheduler.IncrementalScheduling.GraphAgnosticIncrementalSchedulingState.ProcessGraphChange(LoggingContext loggingContext, PipGraph pipGraph, PathTable internalPathTable, DirtyNodeTracker dirtyNodeTracker, PipProducers pipProducers, ConcurrentBigMap<PipStableId, PipGraphSequenceNumber> cleanPips, ConcurrentBigSet<PipStableId> materializedPips, ConcurrentBigMap<AbsolutePath, PipGraphSequenceNumber> cleanSourceFiles, PipOrigins pipOrigins, IncrementalSchedulingPathMapping<PipStableId> dynamicallyObservedFiles, IncrementalSchedulingPathMapping<PipStableId> dynamicallyProbedFiles, IncrementalSchedulingPathMapping<PipStableId> dynamicallyObservedEnumerations, DirtiedPips dirtiedPips, List<ValueTuple<Guid, DateTime>> graphLogs, PipGraphSequenceNumber pipGraphSequenceNumber, out int indexToGraphLogs) in \.\Public\Src\Engine\Scheduler\IncrementalScheduling\GraphAgnosticIncrementalSchedulingState.cs:line 1774
at GraphAgnosticIncrementalSchedulingState BuildXL.Scheduler.IncrementalScheduling.GraphAgnosticIncrementalSchedulingState.Load(LoggingContext loggingContext, FileEnvelopeId atomicSaveToken, PipGraph pipGraph, IConfiguration configuration, PreserveOutputsInfo preserveOutputSalt, string incrementalSchedulingStatePath, bool analysisModeOnly, ITempCleaner tempDirectoryCleaner) in \.\Public\Src\Engine\Scheduler\IncrementalScheduling\GraphAgnosticIncrementalSchedulingState.cs:line 1587
```
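The failure mode above can be illustrated with a minimal Python sketch (the repository itself is C#, so this is not the actual change). It assumes a hypothetical low-level reader that signals corruption by returning `None`, and shows the null result being converted into the same failure path that the guarded caller already handles. All names (`read_path_table`, `load_path_table`, `load_state`, the `PT1:` marker) are invented for illustration.

```python
from typing import List, Optional, Tuple


class DeserializationError(Exception):
    """Raised when persisted state cannot be turned into a usable object."""


def read_path_table(blob: bytes) -> Optional[List[str]]:
    # Hypothetical low-level reader: returns None on corrupted input
    # instead of raising, which is the behavior described above.
    if not blob.startswith(b"PT1:"):
        return None
    try:
        return blob[4:].decode("utf-8").split(";")
    except UnicodeDecodeError:
        return None


def load_path_table(blob: bytes) -> List[str]:
    # Treat a null result as a deserialization failure so that callers
    # never observe a "successful" load with a missing path table.
    table = read_path_table(blob)
    if table is None:
        raise DeserializationError("serialized path table is corrupted")
    return table


def load_state(blob: bytes) -> Tuple[str, List[str]]:
    # The guarded caller: any deserialization failure falls back to fresh state.
    try:
        return ("loaded", load_path_table(blob))
    except DeserializationError:
        return ("fresh", [])
```

With this shape, downstream code can keep asserting that the path table is non-null, because a corrupted table never gets past the guarded load.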
`FileSystemContentStore` is used in many places, and by default all of its operations, such as placing files, should be traced.
The flag was introduced to reduce the amount of traces in casaas, and this PR does not change that behavior, because the behavior is controlled by `DistributedContentSettings` configuration.
This is a workaround for GitHub issue 878. If you have Windows language packs installed on your machine, various tools like cl.exe will access files during the build.
When up-to-date, shrinkwrap-deps.json is a legitimate witness of a project's transitive dependencies. Add an option to use this file instead of tracking all dependencies. This option is safe when running in a lab environment, where rush install/update is always run before bxl.
This brings a couple of benefits:
* A lot less tracking/hashing, so perf may be better
* Cache hits/misses work with compatible dependency changes. In this scenario, Rush allows projects to consume dependencies which don't exactly match the rush lock file, but are considered semantically equivalent.
Related work items: #1697633
This PR makes the master the last choice for all P2P copies. This implies we:
- Track which machine is the master by looking at who produced the last checkpoint, and make this part of the ClusterState
- Make the identity of the master accessible to the DistributedContentCopier via the IDistributedContentHost
- Change prioritization on all copies, when the feature is enabled
This is done only on copies that would perform IO against the LLS drive where the content database is located.
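As a rough illustration of the prioritization change, here is a minimal Python sketch (not the repository's C# code). It assumes copy sources are an ordered list of machine names and that the current master's identity is known or `None`. The function name, the parameters, and the feature flag are hypothetical.

```python
from typing import List, Optional


def order_copy_sources(
    locations: List[str],
    master: Optional[str],
    master_last_enabled: bool = True,
) -> List[str]:
    """Return candidate copy sources with the master moved to the end.

    The relative order of all other machines is preserved. If the feature
    is disabled or the master is unknown, the order is left unchanged.
    """
    if not master_last_enabled or master is None:
        return list(locations)
    others = [machine for machine in locations if machine != master]
    masters = [machine for machine in locations if machine == master]
    return others + masters
```

For example, `order_copy_sources(["m1", "master", "m2"], "master")` yields `["m1", "m2", "master"]`, so the master is tried only after every other candidate.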
Related work items: #1721129
Add a NetCore-only HexToBytes overload that allows (re)use of a preallocated buffer. It is currently used in AnyBuild for cases where a series of hashes is built up, and is likely to be useful as an optimization to reduce allocations in similar cases as the codebase moves toward NetCore.
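The calling pattern of such an overload can be sketched in Python as follows. This only illustrates the "caller supplies the buffer" shape. It is not the actual HexToBytes signature, and Python does not show the same allocation savings as the .NET version would. The function name and the return-value convention are assumptions.

```python
# Lookup table for both lower- and upper-case hex digits.
_NIBBLE = {c: int(c, 16) for c in "0123456789abcdefABCDEF"}


def hex_to_bytes_into(hex_text: str, buffer: bytearray) -> int:
    """Decode hex_text into the start of buffer and return the byte count.

    Raises ValueError if the text has odd length or does not fit, and
    KeyError if it contains a character that is not a hex digit.
    """
    if len(hex_text) % 2 != 0:
        raise ValueError("hex text must have an even number of characters")
    count = len(hex_text) // 2
    if count > len(buffer):
        raise ValueError("buffer is too small for the decoded bytes")
    for i in range(count):
        high = _NIBBLE[hex_text[2 * i]]
        low = _NIBBLE[hex_text[2 * i + 1]]
        buffer[i] = (high << 4) | low
    return count
```

A caller decoding a series of hashes would allocate one `bytearray` up front and pass it to every call, reading back only the first `count` bytes each time.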
Upgrade QTest to 20.5.20: Avoid diag logging in Heisenbug
Office tests were timing out due to diag logging being enabled after internal failures. We disabled diagnostic logging for heisenbug queues due to the amount of overhead involved and little benefit.
Sets a limit of 500 warnings and 500 errors to be emitted as issues by the DevOps listener. This is to prevent slow rendering of the build results page.
A message of the same type is added to let end users know the number of errors/warnings rendered has been truncated.
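A minimal Python sketch of this capping behavior, assuming a listener that forwards each issue through a callback. The class name, the callback shape, and the exact wording of the truncation notice are hypothetical, and whether the notice counts toward the limit in the real listener is not stated above.

```python
from typing import Callable, Dict


class CappedIssueReporter:
    """Forward at most max_per_kind errors and max_per_kind warnings."""

    def __init__(self, emit: Callable[[str, str], None], max_per_kind: int = 500):
        self._emit = emit
        self._max = max_per_kind
        self._counts: Dict[str, int] = {"error": 0, "warning": 0}

    def report(self, kind: str, message: str) -> None:
        self._counts[kind] += 1
        seen = self._counts[kind]
        if seen <= self._max:
            self._emit(kind, message)
        elif seen == self._max + 1:
            # Emit one notice of the same type so users know output was truncated.
            self._emit(
                kind,
                f"Only the first {self._max} {kind}s are shown; "
                f"further {kind}s are omitted.",
            )
        # Anything beyond that is dropped silently.
```

Errors and warnings are counted independently, so hitting the warning limit does not suppress errors.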
Related work items: #1717244
This has been validated to work by deploying to a machine in CBTest and checking that we can operate and that the RocksDb logs show the correct version. It has also been validated that we can downgrade from the new version to the current version and still work without issue, both locally and on CBTest.
Related work items: #1704285
This PR fixes two types of issues:
1. When the service fails to start successfully, in some cases the subsequent shutdown fails because of a double-shutdown issue.
This PR adds an Id to all "shutdownable" resources to simplify investigating such issues, and also fixes one of the most common ones by not shutting down stores in `OneLevelCache` when the stores were not fully initialized.
2. The shutdown logic in GRPC layer was not fully implemented: when `GrpcContentServer` is shut down, all the pending operations were still not cancelled, because the shutdown cancellation token was never propagated.
So this PR addresses this issue as well.
An example of issue 1:
```
Critical error occurred: OneLevelCache.ShutdownAsync stop 44.3285ms. result=[Error=[ContractException: Assertion failed: Cannot shut down 'RocksDbMemoizationDatabase' because ShutdownAsync method was already called on this instance..
at \.\Public\Src\Cache\ContentStore\Library\Utils\StartupShutdownSlimBase.cs:111] Diagnostics=[System.Diagnostics.ContractsLight.ContractException: Assertion failed: Cannot shut down 'RocksDbMemoizationDatabase' because ShutdownAsync method was already called on this instance..
at \.\Public\Src\Cache\ContentStore\Library\Utils\StartupShutdownSlimBase.cs:111
at void System.Diagnostics.ContractsLight.ContractRuntimeHelper.TriggerFailure(ContractFailureKind kind, string msg, string userMessage, string conditionTxt)
at void System.Diagnostics.ContractsLight.ContractRuntimeHelper.ReportFailure(ContractFailureKind kind, string msg, string conditionTxt, Provenance provenance)
at void System.Diagnostics.ContractsLight.ContractFluentExtensions.Assert(in AssertionFailure result, string message)
at async Task<BoolResult> BuildXL.Cache.ContentStore.Utils.StartupShutdownSlimBase.ShutdownAsync(Context context)
at async Task<BoolResult> BuildXL.Cache.MemoizationStore.Sessions.OneLevelCacheBase.ShutdownCoreAsync(OperationContext context)
at async Task<T> BuildXL.Cache.ContentStore.Tracing.Internal.PerformOperationBuilderBase<TResult, TBuilder>.RunOperationAndConvertExceptionToErrorAsync<T>(Func<Task<T>> operation)]]..
```
An example of issue 2:
```
Critical error occurred: GrpcContentServer.CopyFileAsync stop 14621.8642ms. result=[Error=[InvalidOperationException: Shutdown has already been called] Diagnostics=[System.InvalidOperationException: Shutdown has already been called
at void Grpc.Core.Internal.CompletionQueueSafeHandle.BeginOp()
at void Grpc.Core.Internal.CallSafeHandle.StartSendMessage(ISendCompletionCallback callback, SliceBufferSafeHandle payload, WriteFlags writeFlags, bool sendEmptyInitialMetadata)
at Task Grpc.Core.Internal.AsyncCallBase<TWrite, TRead>.SendMessageInternalAsync(TWrite msg, WriteFlags writeFlags)
at Task Grpc.Core.Internal.ServerResponseStream<TRequest, TResponse>.WriteAsync(TResponse message)
at async Task<(ValueTuple<long, long> Chunks)> BuildXL.Cache.ContentStore.Service.Grpc.GrpcContentServer.StreamContentAsync(Stream input, byte[] buffer, IServerStreamWriter<CopyFileResponse> responseStream, CancellationToken ct)
at async Task<T> BuildXL.Cache.ContentStore.Tracing.Internal.PerformOperationBuilderBase<TResult, TBuilder>.RunOperationAndConvertExceptionToErrorAsync<T>(Func<Task<T>> operation)]]. Hash=VSO0:76AB506E7182CF9B48FD, GZip=off..
```
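The fix for issue 1 can be sketched in Python as follows (the repository is C#, so this is an illustration only). It assumes each resource tracks whether startup completed and whether shutdown was already requested, and that the cache shuts down only the stores that were fully initialized. All class and attribute names are hypothetical.

```python
import itertools


class Resource:
    """A "shutdownable" resource with an id that can be used in diagnostics."""

    _ids = itertools.count(1)

    def __init__(self, name: str, fail_on_startup: bool = False):
        self.name = name
        self.id = next(Resource._ids)
        self.started = False
        self.shutdown_called = False
        self._fail_on_startup = fail_on_startup

    def startup(self) -> None:
        if self._fail_on_startup:
            raise RuntimeError(f"'{self.name}' (id={self.id}) failed to start")
        self.started = True

    def shutdown(self) -> None:
        if self.shutdown_called:
            raise RuntimeError(
                f"Cannot shut down '{self.name}' (id={self.id}): "
                "shutdown was already called on this instance"
            )
        self.shutdown_called = True


class TwoStoreCache:
    """Hypothetical cache that owns a content store and a memoization store."""

    def __init__(self, content: Resource, memoization: Resource):
        self.content = content
        self.memoization = memoization

    def startup(self) -> None:
        self.content.startup()
        self.memoization.startup()

    def shutdown(self) -> None:
        # Skip stores that never finished startup or were already shut down.
        for store in (self.memoization, self.content):
            if store.started and not store.shutdown_called:
                store.shutdown()
```

If the memoization store fails during startup, a later `shutdown()` on the cache still shuts down the content store but leaves the memoization store alone, which avoids the double-shutdown failure.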
Currently, a number of tests generate critical errors.
This PR fixes all the known occurrences and also fixes the flaky behavior of the following test: `LocalLocationStoreDistributedContentTests.ProactiveReplicationTest`.
The original implementation was not resilient if the content on disk was modified.
In this case, the cache can "replace" just part of the file ending up with corrupted content.
Update QTest package
Includes the following changes:
1. Fix for Code Coverage DFA
2. Remove redundant field from QTestcontextInfo
Updated with a fix for an earlier regression that caused validation with the Office WAC queue to fail.
I could not find the Bxl_OfficeTenant_Wac_Release queue today so I tested with Office Motif build queue here: https://cloudbuild.microsoft.com/build/cc56cda5-af0b-44df-9aec-625c2a5a7a1c?bq=Bxl_OfficeTenant_Motif