* update to latest Azure Storage Tables client
* Apply suggestions from code review
Co-authored-by: Jacob Viau <javia@microsoft.com>
* use fail-fast when parsing records, add unit tests, and fix a number of bugs
* Add some .ConfigureAwait(false) to AzureBlobLoadPublisher
* address PR feedback.
Co-authored-by: Jacob Viau <javia@microsoft.com>
* temp commit
* fix stuff
* update retry logic
* update tests
* rename providers to layers
* add display names for pipeline stages
* minor updates to comments and remove unnecessary usings
* temporarily disable the size checking
* edit and reorder CI pipeline for troubleshooting
* do not run replay checker on the default host fixture
* more tinkering with order of tasks in CI pipeline
* update test projects to latest runtime and packages
* remove replay checker from in-memory tests since it does not actually run anyway
* temporarily remove ConcurrentTestsFaster.EachScenarioOnce since it has been observed to hang
* update pipeline durations
* reenable EachScenarioOnce with fixes
* remove excessive tests
* Update tooling for Hello samples
* update tooling in load generator and performance tests
* remove unneeded dependency on Microsoft.Azure.Functions.Extensions
* harden unit test logger
* draft commit
* revise so we can distinguish types of recovery and avoid sending too much stuff
* temp commit
* finish implementation of EH recovery
* update tracing for perf, clients, and load monitor
* fix EH checkpointing, and fix replay behavior
* update timeout handling for queries and requests
* fix timeouts for queries
* fix restart on lost events
* fix accidentally committed stuff
* make blob logger resilient against MemoryStream size overflow
* use storage account instead of storage emulator for CI
* update to FASTER v2 and implement MemoryTracker
* update history size calculation in tests
* fix renamed variable
* work around incorrect status on read completion callback
* minor updates to tests
* adjust timeout for latest bits
* Recover fewer pages.
* handle one more transient timeout on storage calls.
* fix compaction
(cherry picked from commit acec2f0ac06c25e39bda3edcb4ab6a7a8b71d5bb)
* fix client issues causing hangs
(cherry picked from commit 6434c56033ee9fac995659764ab2aedd75fb0d94)
* update to latest prerelease
(cherry picked from commit 0ad283aaa9956022d1e5bf14b616961a59f071d6)
* add EH overlap to work around bug
* add tolerance to MemoryReduction test
* overhaul exception handling in query events
* remove deadlock risks in cancellation token handlers
* compaction must be categorized like scan
* load fewer pages on recovery
* trace event id string when receiving packet
* add condition to size check to account for concurrent modifications
* keep termination continuation direct for tracing and for debugging
* retry slow reads to work around FASTER hang issues
* update tracing
* instrument hang detection
* update timeout limits
* refactor cancellation
* make the faultinjector back off of startup after 2 failures
* overhaul shutdown sequence
* add hang detection to azure storage device
* add CompactThenFail test
* simplify checkpoint state machine as compaction now already moves BeginAddress
* fix bug in PartitionErrorHandler
* Arm assertions in release build also, and run EH CI tests on release
* extend hang detection limit to 90s and remove the slowest ScaleSmallScenarios test
* update compaction progress reporting
* recognize yet another type of transient storage error
* use release build and remove concurrent test from EH tests on CI
* implement thread count watcher
* fix bug in AzureStorageDevice.RemoveSegmentsAsync and add to tracking
* remove .ConfigureAwait(false)
* update PerformanceTests http bindings to support arbitrary prefixes with start/purge/query/count/await
* fix CI
* fix nullref in dispose
* implement FASTER log compaction state machine.
* fix test
* fix bug in ActivitiesState, must not overwrite ReportedLoad field
* improve tracing of outbox events
* improve event detail tracing
* improve tracing of storage operations
* improve checkpoint removal tracing
* fix state machine logic for removing obsolete checkpoints
* backport fix to EmitCurrentState
* fix batch.SendingEventId assignment: should not be gated by replay
* use explicit interface for IOrchestrationService, and add some eventhubs tracing
* remove ConfigureAwait and make tracing for received events more visible
* update offset provider
* two more tracing changes
* undo semantics change on WaitForOrchestration
* improve ping so it is useful for detecting hangs
* catch timeouts
* more tracing in EventHubsProcessor
* fix details in storage format error
* support error injection and replay checking in DF deployments
* reduce tracing detail of exceptions for warnings
* add replay condition to tracing of discarded work items
* fix replay checker to handle serialized tracked objects
* avoid propagation of hangs by adding termination wrapper on all awaits in FasterStorage
* remove instanceprotected assertions since they may not be local to a partition
* add check for termination condition to EH checkpoint saving, and add tracing to event processor host registration
* fix race condition in MemoryTransport
* implement fault injection tests
* fix tests
* update fault injection tests
* revise MemoryTransport failure handling
* fix serialization of RecoveryCompleted
* refactor concurrent tests into separate file
* revise tracing and timeouts in tests
* fix fault injection tests
* change timeout behavior of wait for orchestration to bring it in line with other backends
* fix bug that causes test to hang when recovering
* revise the recovery mechanism in MemoryTransport
* connect Partition.Assert to TestHooks so it can fail unit tests
* fix missing check for replaying when processing RecoveryCompleted event
* revise fault injector: add timeout to startup and do not inject lease renewals
* annotate tests with whether they support any transport
* adjust timeouts and update pipeline
* add labels to all the assertions and hook them up so they trip unit tests
* remove non-fixture query test from "AnyTransport" test category
* fix small errors and add some tracing in performance tests
* edit perftests
* update concurrency default
* change throughput scale for word hash
* update series, and use public access blob for gutenberg
* update bank and fanoutfanin series
* limit number of words counted in each book to allow better load balancing
* fix error and select EP2 for alt
* update wordcount definitions for EP2 and EP3
* update wordcount definition for EP1
* add load generator app
* launch query and prefetch events earlier to improve latency
* add nicer query endpoint to perf tests
(cherry picked from commit 9851a1de91)
* add tracing for when a sender is starting to send an event.
* support use of multiple client channels; fix range of partitions; support pipelined return of query results
* use custom serialization to compress query results
* fix bug in test
* Add random scheduler
* add loadmonitor component
* add scheduling options & filehash tweaks
* keep partition prefix consistent with all the other tracing
* fix bug (failed to update estimated load when remote activities complete)
* Add aggressive scheduler mode
* fixed a typo
* basic load monitor
* checkpoint
* fix bug in ETW tracing
* change conditions for load monitor reporting
* use full tracing
* trace more information for OffloadCommandReceived
* change tracing of task messages to provide more consistent results
* several package updates, including DurableTask.Core 2.5.6
* update to account for tracing changes
* remove load monitor interval and add idle message
* add latency field to ActivityCompleted and RemoteActivityResultReceived
* added offload algorithms based on waiting time
* fix overflow in filehash
* fixes to loadmonitor logic, autoscaler, and temporary info sender
* fix missing file and update scale tracing
* change loadmonitor hosting so it moves less often
* implement overlay of pending commands on offload estimation
* improve precision of load estimation and distinguish between stationary and mobile.
* remove unnecessary leftover line of code.
* replace push based algo with pull based
* fix so we don't pull more than mobile
* fix dumb mistake
* fix formula
* need more conservative default estimate for completion time, otherwise offload is too aggressive
* remove unnecessary constants.
* add tracing for RTT
* tweak parameters for RTT and smoothing
* batchworker tracing for senders
* use array instead of dictionary, fix concurrent modification exception
* revise sender for load monitor events to send only latest state, and to do rate limiting
* fix tracing
* turn off all loadmonitor activity when the parameter ActivityScheduler is not set to LoadMonitor
* implemented "Static" setting for ActivityScheduler
* reduce severity of tracing in loadmonitor
* make filehash scale configurable
* update names, remove old algorithms
* improve terminology and implement solicitation
* fixes and simplifications
* minor updates
* simplify data structure for local activities
* use smarter concurrency defaults, same as for default backend
* fix tracing in LoadMonitor
* update filehash a bit to make it more uniform
* add test series for comparing local vs locavore
Co-authored-by: Sebastian Burckhardt <sburckha@microsoft.com>
* Go back to System.Threading.Channels version 4.7.1
* throw exception on 32bit process
* update to latest Microsoft.NET.Sdk.Functions
* use trace verbosity for unit tests
* fix implementation of local file storage for FASTER
* fix test configuration in DurableTask.Netherite.AzureFunctions
* generalize the CI pipeline so it runs all tests again.