This PR reuses the same `IRandomValueGenerator` that systematic testing strategies use in the runtime (instead of calling `Random` directly), so things are consistent (and in the future we easily plugin different random -- not that we currently need to do that, but you never know).
Also changes the name of the public static `RandomValueGenerator` API (used by `ControlledTask` objects) to `ControlledRandomValueGenerator`, first to match the naming of `ControlledTask` APIs and second to make it clear that this is a random generator that is controlled during testing.
Also adding an internal helper for creating a new instance of the production runtime (inside the existing runtime factory class), so that we can set the random value generator in as few places as possible.
**Note:** I am not trying to reinvent or improve the `IRandomValueGenerator` interface in this PR, rather just make sure we reuse what we currently have for consistency. Any improvements to the interface are outside the scope of this PR.
This PR removes the `IgnoreEvent` and `DeverEvent` APIs from the `Actor` type. I was playing with them when originally created the `Actor` type but forgot to remove them when we merged.
Basically, these APIs should not exist on an actor (similar to how actors in other frameworks/languages do not have them), as ignoring and deferring events is a state machine concept (in the sense that an event type can be ignored or deferred on some state, and then restored on another, but base actors do not have states). If someone really wanted (say to ignore an event because the actor doesn't care about it) they could just add that in the control flow of an event handler (e.g. if type of dequeued event is foo, then return).
Another bug that this PR fixes, is that because these APIs where on the base `Actor` type, where also exposed on the `StateMachine` type which is confusing and wrong (as they cannot alter the statically typed ignoring/deferring of state machines), so they were no-ops on the subclass.
This PR reduces some array allocations in the testing runtime. Its also nukes some experimental strategies that we never really used (not even for paper evaluations) to cleanup the codebase. These strategies are in the P# codebase, so if we ever really need to bring them back and polish them, we can easily do them. I would normally clean them up in a separate PR, but doing it here as I had to change an API so they could not compile anymore due to a type difference.
This change reduced the peak working set quite a lot.
This PR removes some string allocations in debug calls (some debug calls where already doing this, this makes stuff more consistent).
E.g. on the `CoffeeMachine` sample I was able to reduce the total allocations from `708.117` to `642.253` MB for 100 iterations.
This PR makes some changes towards a PR focused on perf improvements. The way actor id is output is simplified from e.g. `'FooActor(0)'` to `FooActor(0)` (basically removing the brackets `''`).
replace ILogger with TextWriter and move the implementation to ActorRuntimeLogTextFormatter to unify all logging done inside LogWriter and prepare the way for xml and json versions of this thing.
Related work items: #2125
This PR simplifies* the logic behind the `ControlledTask` implementation to make it easier to add upcoming features and improve the performance.
The PR removes some unnecessary intermediate classes as well as logic to streamline the design (e.g. removed the `RuntimeProvider`, and also various classes related to the interception of controlled tasks during testing -- this is now done much simpler via the `ITaskController`). This simplification also results in some performance improvements as well as much cleaner code inside the `_Runtime` classes. Global accesses to the runtime object from the task classes are also now reduced, which is nicer.
The PR also fixes a bug where the returned controlled task ID was not the expected in some scenarios (now it matches the `Task.Run` behavior), and adds some tests to avoid regressions.
The PR also removes `ControlledLock` which was experimental and not a real .NET API. We can bring back some more robust APIs for exposing the capability to declare controlled locks in a future PR.
*There are more that can be done, but I leave that for after v1 for the benefit of time.
Related work items: #1943
Graph is consolidating events from one state to another so it shows up as one link, but if those are actually different event types it was forgetting that. This fix changes the Graph data structure so it can remember multiple event id's on a single link. The coverage report then uses that to produce the right result. The handling of Monitors was also slightly wrong where the [hot], [cold] state labels were not matching with the coverage report MachineStates info, so this has been changed to a graph label only, rather than being part of the node id.
Also changed things so that you can now do this on the coyote test command line:
```
--explore --coverage activity
```
and you get the combined coverage of all explorations, regardless of whether any bugs were found.
Related work items: #2037
Fix some documentation bugs, typos, and move "state machine" docs into their own top level page so they are more discoverable, also complete the state machine docs by also talking about push and pop.
Add Transition.None and hide the Transition.Type since there is no need to make that stuff public, in fact it just leads to confusion because one starts to hunt for ways to change that field but there is no way. Also added more documentation to Transition class and how to use it properly.
Related work items: #1946
This PR is removes the `pstrace` test output (which is not supported in P# for a long time now, and was dropped from Coyote) and the `BugTrace` data structure that was gathering data for `pstrace`. @<Chris Lovett>, to give you some context that was an early attempt for some trace visualization but we dropped it. Removing these (besides cleaning up unnecessary code) should reduce runtime/memory overhead during testing.
If we bring trace visualization back, we can use Chris' new logging infrastructure for this purpose (e.g. a logger that gathers these data and spits out some file used for visualization).
Related work items: #1149
This PR adds some missing APIs from `ControlledTask`, namely the `Wait` methods.
The PR also addresses a super sneaky bug in the controlled task implementation (and also adds a few new tests to guard against regressions).
A simple repro:
```c#
private class SharedEntry
{
public int Value = 0;
}
private static async ControlledTask WriteAsync(SharedEntry entry, int value)
{
await ControlledTask.Delay(10);
entry.Value = value;
}
private static async ControlledTask InvokeWriteAsync(SharedEntry entry, int value)
{
await WriteAsync(entry, value);
}
[Test]
public static async ControlledTask RunExample()
{
SharedEntry entry = new SharedEntry();
ControlledTask task1 = InvokeWriteAsync(entry, 3);
ControlledTask task2 = InvokeWriteAsync(entry, 5);
await ControlledTask.WhenAll(task1, task2);
Specification.Assert(entry.Value == 5, "Value is {0} instead of 5.", entry.Value);
}
```
The bug causes the assertion to never fail. Same behavior happens with some other combinations e.g. `Yield` instead of `Delay`.
The issue is that in the intermediate (`InvokeWriteAsync`) method (could be more than 1 of them) the "rewriting" that coyote does during systematic testing does not "package" the “after-await-logic” into a continuation and then return to the caller (`RunExample` here). Instead it "blocks" the current task executing until the await completes, which is wrong semantics, and causes the missing interleaving. Although here the assertion is missed, it could potentially lead to a worse scenario, e.g. a deadlock not getting resolved (imagine if `WriteAsync(3)` was waiting on a lock that only `WriteAsync(5)` releases 😊).
The pre-fix logic only did the packaging (which generates a special task under the hood, which is controlled and explored systematically) when a `ControlledTask` was created explicitly (`Run`, `Yield`, `Delay`) but not implicitly as in the above intermediate example, thus causing the issue! The actual logic (and production .NET semantics) is a bit more complicated than my description.
The fix is quite simple conceptually, imagine an async callstack like this:
```
WriteAsync
InvokeWriteAsync
RunExample
```
Coyote should do the packaging when an await happens in any method (which can be anonymous lambda), but not on the “root” method (`RunExample`, in this case). If it happens in the root, then it can hang the scheduler (as there is no caller method to return too). My current fix is to do a limited traversal of the `StackTrace` and differentiate between the cases.
The PR also does some minor related cleanup.
This PR fixes some of the APIs of the `TestingEngineFactory`. Basically after supporting `ControlledTask`, we switched the expected test function pointers to `ControlledTask` instead of `Task` and forgot to update it here. This PR solves this issue.
This PR fixes a reported bug where the seed was not reported when the testing engine run programmatically. This was fixed by simplifying setting the seed (there is no need for it to be nullable, it always has a value now, which is set upon construction of the configuration object).
Fix bug in handling of raised events so the links show up correctly. The problem was OnHandleRaisedEvent was trying to create a link immediately, but we don't yet know the impact of this raised event. For example, it might result in a OnEventGotoState transition, in which case we want the link on the diagram to reflect that. Fortunately the fix was very simple since I had a concept of dequeued events already.
Also added animating raft demo under http://127.0.0.1:4000/coyote/learn/core/demo.
Also added the beginnings of some tutorial content - more of a template really...
This PR changes the `StateMachine` API to make all transitions (`RaiseEvent`, `GotoState`, `PushState`, `PopState` and `Halt`) typed (via the new `StateMachine.Transition` type) towards improving the programmability of this programming model. The same applies to the `Monitor` API. Quite a bit of runtime complexity was also reduced based on the fact that the C# type checker does some of the work for us now.
This basically allows a `StateMachine` to be declared as follows:
```c#
private class M : StateMachine
{
[Start]
[OnEntry(nameof(InitOnEntry))]
private class Init : State
{
}
private Transition InitOnEntry() => this.GotoState<Done>();
private class Done : State
{
}
}
```
Related work items: #1439
This PR changes the API of `Actor` and `Monitor` to allow an event to be passed as in-parameter in an action. The previous `ReceivedEvent` property has been removed.
Related work items: #871
This PR removes the `RaiseEvent` API from `Actor` as it is specific to state-machine logic. It introduces a `Halt` API that is basically syntactic sugar for `RaiseEvent(HaltEvent)` (and is required by `Actor` to halt without having to go through the queue). The PR also optimizes `HaltEvent` by making the constructor private, now can use it via a singleton instance `HaltEvent.Instance` which minimizes allocations.
This PR adds lots of systematic testing tests for the actor type. Also minor update in the `Coyote.nuspec` to point to the website and repository properly.
This PR introduces the `Actor` type, which is the base class for all actor types. `StateMachine` has now become a subclass of `Actor`, and most of the actor-related functionality has moved inside `Actor`, leaving only state machine related functionality in the subclass.
The three `SharedObjects` testing implementations (`SharedCounter`, `SharedDictionary` and `SharedRegister`) converted from `StateMachine` to `Actor`.
There is only a limited amount of new `Actor` tests added for two reasons:
- `StateMachine` is built on top of `Actor`, and most of the non-state-transition related functionality is inside `Actor`, which means that all `StateMachine` tests also test common `Actor` functionality.
- For the interest of time to get this PR merged, more tests will be added in upcoming PRs.
I have not updated the docs, we can do that in another PR.
Also I plan to do some more refactoring and polishing in both `Actor` and `StateMachine` (e.g. `Event` passed as in-param, and `OnInitializeAsync` meaningfully used for `StateMachine` custom initialization), but I leave that for subsequent PRs.
Also did some refactoring in tests, to increase code reuse (`E` and `Unit`, which were the most commonly used events, and do not take a payload, are now moved to a common reusable across tests event `UnitEvent`). More maximizing code between tests in future PRs.
Related work items: #1169, #1174
This fixes a hang in parallel coyote testing on Linux. Seems a socket Listener on Linux doesn't terminate when you call Close, it just hangs. Switching to async Accept solved the problem.
Related work items: #1296
The Graph writer was getting confused about "sender state" and monitor state so the monitor states were showing up in the sender. There was also missing information during halt on what the current state was before the machine was halted, this missing information is gathered when we pop unhandled events, since Halt is always an unhandled event.
This PR renames the following APIs:
- `IActorRuntime.CreateMachine*` becomes `IActorRuntime.CreateStateMachine*`. When we introduce actors we can think if we want to create a common creation for both state machines and actors (as Akash suggested in his email). But for now I want to have it explicit that its creating a state machine.
- `StateMachine.CreateMachine` becomes `StateMachine.CreateStateMachine`. Similar to above.
- `StateMachine.Raise` becomes `StateMachine.RaiseEvent`. Becoming more descriptive.
- `StateMachine.Receive` becomes `StateMachine.ReceiveEventAsync`. Becoming more descriptive and also adds the `Async` suffix to follow best practises for C# `async`/`await` programming..
- `StateMachine.Send` becomes `StateMachine.SendEvent`. Becoming more descriptive, and also matches the corresponding runtime API.
- `Monitor.Raise` becomes `Monitor.RaiseEvent`. Becoming more descriptive.
Related work items: #1168
Cleanup time! I checked all known customer repositories, can't find anyone who is using this. Will eventually be replaced by a much better `Actor` type.
Related work items: #1163
This PR renames `IMachineRuntime` to `IActorRuntime`. It also changes comments to mention actors or state machines (depending on the context). Also renamed `IMachineRuntimeLog` to `IActorRuntimeLog` and `RuntimeLogWriter` to `ActorRuntimeLogWriter`.
Related work items: #1156
This PR renames MachineId (in the code, as well as some comments and in the docs) to ActorId **only**. It should not touch any of the stuff in Chris Machine -> StateMachine PR, so the two should be able to be merged in parallel.
More refactoring in separate PRs.
Related work items: #1157
This is phase 1 of the rename, it is the just the very mechanical change of Machine to StateMachine ONLY. We can do the continuation of this direction in subsequent PR's.
Related work items: #1151, #1155