Commit graph

422 commits

Author SHA1 Message Date
Jonathan Tims 308795a40f
Merge branch 'main' into jotims/pr/remove-binaryformatter 2022-06-23 19:44:10 +01:00
dependabot[bot] 6809c84080
Bump Newtonsoft.Json from 11.0.2 to 13.0.1 in /test/Tests (#407)
Bumps [Newtonsoft.Json](https://github.com/JamesNK/Newtonsoft.Json) from 11.0.2 to 13.0.1.
- [Release notes](https://github.com/JamesNK/Newtonsoft.Json/releases)
- [Commits](https://github.com/JamesNK/Newtonsoft.Json/compare/11.0.2...13.0.1)

---
updated-dependencies:
- dependency-name: Newtonsoft.Json
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-23 18:20:24 +01:00
Jonathan Tims d28ebd20a0 wip 2022-06-22 08:21:51 +01:00
Jonathan Tims 2f58f64afb w 2022-06-22 07:54:27 +01:00
Tom Minka 5156776f4c
Fixed DependencyAnalysisTransform and SchedulingTransform (#405) 2022-05-29 18:40:37 +01:00
Tom Minka 55d26a7138
Added Variable.ParallelCopy (#404)
* VectorGaussian.SetToRatio correctly enforces forceProper
* BeliefPropagationGateEnterPartialOp.ValueAverageConditional double-checks for conflicting values
* Added ReplicateOp_Divide_NoInit
* PlusProductHierarchyTest is OpenBug
* MatrixVectorProduct has optional check
* Deterministic gates track the number of cases
2022-05-03 22:27:20 +01:00
Tom Minka ca25068ead
WetGlassSprinklerRainModel has flexible state counts (#402)
* Expanded CausalityExample
* Variable.Dirichlet and Discrete check the array size
* IterativeProcessTransform ensures Marginal methods have unique names
* DependencyGraph.initializedEdges overrides required edges
* DependencyAnalysisTransform requires gated definitions to fully precede their uses
2022-04-08 20:14:57 +01:00
Tom Minka f008e6c719
macOS-latest (#399) 2022-03-02 22:04:11 +00:00
Tom Minka 68a10ac833
Rprop stepsize is lower bounded (#398)
Reduced the number of false "excess memory" warnings from GateTransform.
2022-03-02 21:10:46 +00:00
Tom Minka 45d2e593ba
Improved accuracy of MMath.WeightedAverage and GaussianProductOp (#395)
* RpropBufferData.EnsureConvergence = false
* GetJaggedItemsOp and GetItemsFromJaggedOp use IReadOnlyList
2022-02-16 14:58:03 +00:00
Ivan Korostelev 5b23c2c517
Micro-optimizations for sequence and collection distributions (#394)
1. The implementation of `SequenceDistribution` (of which the `StringDistribution` specialization is the most important)
   uses the curiously recurring template pattern (https://en.wikipedia.org/wiki/Curiously_recurring_template_pattern)
   by specifying the `TThis` type parameter. Where `SequenceDistribution` creates new instances,
   it uses `new TThis()`, which goes through the very slow `Activator.CreateInstance<>` method.
   To avoid this, a helper, `Util.New<T>`, is introduced. It uses a runtime code generation trick
   (https://stackoverflow.com/a/1280832) that is at least an order of magnitude faster than what the compiler does.
2. `CollectionElementMappingInfo.ElementMapping` is switched from a list of lists to a list of read-only arrays.
   This brings two performance benefits: a) there is less indirection when reading the element mapping, and
   b) the common mappings can now be cached and reused.
2022-02-11 14:52:26 +00:00
Andrii Kurdiumov 1b4c479339
.NET 5 instead of .NET Core 3.1 (#273)
* Rename netcoreapp3.1 to net5.0 in all files
* Remove references to Microsoft.NET.Sdk.WindowsDesktop
* WinForms/WPF examples use net5.0-windows
2022-01-26 13:22:26 +00:00
Tom Minka 9b28c62294
Improved support for computing derivatives (#390)
* Inferred constant variables are not inlined
* Added Factor.ProbLessThan, ProbBetween, Quantile, Integral, Apply.
* ModelCompiler handles Delegate-valued variables.
* RatioGaussianOp handles ratio instead of ProductOp.
* ExpOp_Slow and LogOp_EP can compute derivatives
* Refactored ConstantFoldingTransform out of ModelAnalysisTransform
* CodeBuilder.MethodRefExpr takes arguments.
* Added VariableInformation.NeedsMarginalDividedByPrior and CodeRecognizer.NeedsMarginalDividedByPrior
* MessageTransform does not use a distribution for the forward message of a constant.
2022-01-25 21:41:52 +00:00
Tom Minka fee27bdf93
makeApiDocs uses newer docfx version (#389)
* DefaultFactorManager allows deterministic factors to have Full support
* DefaultFactorManager puts each factor on a separate line
* docs README mentions ShowFactorManager
2022-01-11 09:42:38 +00:00
Tom Minka b6dd9630f5
DiscreteEnumAreEqualOp supports VMP (#387) 2022-01-08 18:47:02 +00:00
Tom Minka 079250ef67
Observed Beta variables support a Gamma-distributed pseudocount if the other pseudocount is always 1 (#386)
* ShowFactorManager shows more patterns
* Tidy up code
2022-01-07 18:07:48 +00:00
Jonathan Tims fc93e12851
Switch release build to pool MicroBuild2022-1ES
2022-01-07 14:22:35 +00:00
Tom Minka 56fccb62d8
FileArray.GetTempFolder returns a folder in the current directory (#384) 2022-01-06 22:49:41 +00:00
Jonathan Tims 4b27c953d1
Switch release build to ubuntu-latest (#383)
2022-01-06 22:40:41 +00:00
Sam Webster 7e49df44ca
Update compiler options pipeline (#381)
Remove clean-up tasks, as they are failing and are no longer required after the switch to hosted pool build VMs
2022-01-06 12:25:12 +00:00
Tom Minka 791d11a8a1
MMath.WeightedAverage fix (#379) 2021-12-26 20:03:11 +00:00
Tom Minka 5f394f9b8a
Added IClassifierMapping.LabelToString and ParseLabel. (#378)
BinaryNativeClassifierMapping and MulticlassNativeClassifierMapping serialize labels as strings. Incremented CustomSerializationVersion for each.
BinaryNativeClassifierMapping does not serialize labels when TLabel is bool.
IReader.ReadObject is generic.
2021-12-14 11:33:00 +00:00
Tom Minka 3c825f2d29
BayesPointMachineClassifier.CreateGaussianPriorBinaryClassifier is public. (#376)
* Added Variable.Max(int,int).
* Compiler warns about excess memory consumption in more cases when it should, and fewer cases when it shouldn't.
* TransformBrowser shows attributes by default.
* Updated FactorDocs
2021-12-10 23:37:00 +00:00
Jonathan Tims cc132f77f8
Switch to explicit pools in YAML for PR build (#374) 2021-11-30 14:39:43 +00:00
Tom Minka 6c6fb65dd2
BayesPointMachineClassifier can be serialized as text (#373)
* BayesPointMachineClassifier.LoadBackwardCompatibleBinaryClassifier and SaveForwardCompatible use text or binary depending on the file extension
* Added IWriter and IReader, WrappedBinaryWriter, WrappedBinaryReader, WrappedTextWriter, WrappedTextReader
* Changed uses of BinaryWriter to IWriter
* Changed uses of BinaryReader to IReader
* LearnersTests uses same xunit version as other tests
2021-11-29 23:08:58 +00:00
Ivan Korostelev 85be0de052
Avoid accidental quadratic behavior in caching for Automata calculations (#370)
Introduce a "generational" strategy for clearing intermediate data in the reused preallocated array that holds
`Automaton.Condensation.FindStronglyConnectedComponents` state.

To avoid allocations, `FindStronglyConnectedComponents` employs a preallocated array with data about states.
It may use this cache multiple times when processing a single automaton. It typically uses only a handful
of entries in this array, but each time it grabs the array, it has to clear it in full.

With the previous clearing strategy, `ComputeEpsilonClosure` was quadratic in the worst case,
since it calls `FindStronglyConnectedComponents` as many times as there are states in the automaton (N),
and each call clears the array, which is also O(N), making the whole computation O(N^2).
2021-10-06 14:21:28 +01:00
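
The "generational" clearing idea from commit 85be0de052 can be illustrated with a minimal sketch. This is not the Infer.NET code; it is a Python illustration, and the class name `GenerationalArray` and its methods are hypothetical. The assumption is that each slot stores the generation in which it was last written, so that clearing only bumps a counter instead of touching all N slots.

```python
class GenerationalArray:
    """Fixed-size scratch array whose clear() runs in O(1).

    Hypothetical illustration: each slot remembers the generation in which
    it was last written. A slot whose stamp differs from the current
    generation is treated as empty, so clearing never touches the slots.
    """

    def __init__(self, size, default=None):
        self._values = [default] * size
        self._stamps = [0] * size   # generation of the last write per slot
        self._generation = 1        # starts above every initial stamp
        self._default = default

    def clear(self):
        # O(1): all existing stamps become stale at once.
        # (Python ints do not overflow; a fixed-width counter would need
        # a full reset when it wraps around.)
        self._generation += 1

    def __getitem__(self, index):
        if self._stamps[index] == self._generation:
            return self._values[index]
        return self._default        # stale slot reads as empty

    def __setitem__(self, index, value):
        self._values[index] = value
        self._stamps[index] = self._generation

    def __len__(self):
        return len(self._values)
```

With such a structure, a caller that reuses the array N times and touches only a few entries each time pays for the entries it touches rather than O(N) per reuse.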
Ivan Korostelev 2376606055
Remove Pair<> and introduce IntPair (#369)
C# now has a built-in type to represent pairs: `ValueTuple<>`. `Pair<>` was eliminated in favour of it.

At the same time, a new type, `IntPair`, was introduced. It is faster than `ValueTuple<int, int>` when `.Equals()` is called often.
It makes some common string operations that do many lookups by `IntPair` keys up to 10% faster.
2021-09-30 17:11:04 +01:00
Ivan Korostelev 376918bfd6
Reduce GC pressure induced by string operations (#362)
Many automata operations create large short-lived data structures.
These are now cached in thread-local static fields, which saves many allocations and consequently reduces GC pressure.

To make this possible, a new container, `GenerationalDictionary`, is introduced. It can be cleared in constant time and reuses its memory after `Clear`.

These changes reduce the memory allocated by `StringInferencePerformanceTests` from 4 GB to 2 GB. Another 1 GB is allocated by `FindStronglyConnectedComponents`, which will be fixed in a separate PR because caching is harder to introduce there. The remaining 1 GB is test setup (creation of test data).
2021-09-29 18:28:15 +01:00
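
The combination described in commit 376918bfd6, a per-thread cached container that clears in constant time, might look roughly like the following. This is a Python sketch under stated assumptions, not the actual `GenerationalDictionary`: the names `GenerationalDict` and `borrow_scratch_dict` are hypothetical, and stale entries are assumed to be invalidated by a generation stamp and overwritten on reuse.

```python
import threading


class GenerationalDict:
    """Dictionary with O(1) clear() that keeps its storage for reuse.

    Hypothetical illustration: entries carry a generation stamp, and
    entries from older generations are treated as absent.
    """

    def __init__(self):
        self._entries = {}      # key -> [generation, value]
        self._generation = 0
        self._live = 0          # number of entries in the current generation

    def clear(self):
        # O(1): old entries become stale; their storage is kept.
        self._generation += 1
        self._live = 0

    def __setitem__(self, key, value):
        entry = self._entries.get(key)
        if entry is None:
            self._entries[key] = [self._generation, value]
            self._live += 1
        else:
            if entry[0] != self._generation:
                entry[0] = self._generation   # revive a stale slot
                self._live += 1
            entry[1] = value

    def __getitem__(self, key):
        entry = self._entries.get(key)
        if entry is None or entry[0] != self._generation:
            raise KeyError(key)
        return entry[1]

    def __contains__(self, key):
        entry = self._entries.get(key)
        return entry is not None and entry[0] == self._generation

    def __len__(self):
        return self._live


# One scratch dictionary per thread, so no locking is needed.
_thread_cache = threading.local()


def borrow_scratch_dict():
    """Return this thread's scratch dictionary, cleared and ready for reuse."""
    scratch = getattr(_thread_cache, "scratch", None)
    if scratch is None:
        scratch = GenerationalDict()
        _thread_cache.scratch = scratch
    scratch.clear()
    return scratch
```

The trade-off in this sketch is that stale keys keep occupying memory until they are overwritten, which suits workloads that reuse a similar key set on every call.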
Tom Minka 591117241e
All-zero Discrete is not a point mass (#368) 2021-09-29 11:19:15 +01:00
Tom Minka 76ec66dcca
Changed the BCC confusion matrix prior (#367)
Changed the BCC confusion matrix prior so that TrueLabels can be inferred when LabelCount==2.
Fixed serialization of BCC posteriors.  BCC posteriors now save to the Results folder.
Fixed serialization example code.
2021-09-29 07:09:38 +01:00
Tom Minka c90ad0bf1e
Discrete distribution can be all zero (#364)
Added Discrete.Zero and IsZero.
IntegerPlusOp allows result to be the same object as an input.
2021-09-23 17:16:56 +01:00
Ivan Korostelev 8370c9e8e7
Simplify discrete char implementation (#361)
There are two changes:
1. (major) `LogProbabilityOverride` was removed from `DiscreteChar` and `ImmutableDiscreteChar`.
    This unsound functionality was unused and complicated the implementation of char distributions.
2. (minor) `ImmutableDiscreteChar.Multiply` may reuse an immutable discrete char in more cases out of the box.
   This reduces GC pressure slightly.
2021-09-14 21:53:05 +01:00
Tom Minka abae518039
OuterQuantiles handles extreme values (#363)
* Subarray checks that indices are distinct when debugging
* Renamed ConcatOpTests to StringConcatOpTests
* Crowdsourcing explains why accuracy and precision are NaN.
* Added build instructions for Visual Studio Code.
2021-09-13 18:20:37 +01:00
msdmkats 321be1463d
Extend SequenceDistribution API and bugfix (#352)
- Fixed automaton deserialization ignoring LogValueOverride
- Fixed SequenceDistribution.EnumerateSupport and TryEnumerateSupport having different side effects
- Added TryDeterminize, SetLogValueOverride, and ProjectOnTransducer methods to SequenceDistribution
- Added a parameterless overload for Automaton.TryDeterminize that returns the output of TryDeterminize(out TThis) and discards the information about whether that output is deterministic
2021-07-26 16:54:41 +03:00
Tom Minka 14d4164ebc
Improved handling of numerical overflow (#347)
Improved handling of numerical overflow in GaussianProductVmpOp.ProductAverageLogarithm, DoublePlusOp, Gaussian.FromDerivatives, Gaussian.GetDerivatives, GammaFromShapeAndRateOp_Slow.SampleAverageConditional
2021-07-13 00:27:14 +01:00
msdmkats d27df8ce03
Automata: immutability and optimized GetLogValue (#344)
* Immutable distribution interfaces

* DiscreteChar made immutable

* Automata made constant

* Automaton.GetLogValue optimized for cases of deterministic and epsilon-free automata

* Fixed Automaton.[Try]EnumerateSupport so that it won't produce duplicates for non-determinizable automata
2021-07-08 17:03:53 +03:00
msdmkats bd48c63627
Multi-representable weight function for sequence distribution (#339)
* Introduced IWeightFunction - interface for abstract weight functions used by SequenceDistribution

* Multi-representable weight function for sequence distribution that automatically switches between point mass, dictionary, and automaton representations as appropriate

* Early stops for automaton support enumeration

* Improved automata graphviz format

* Language writer correctly processes nested generics
2021-06-08 14:03:44 +03:00
Martin 1d9536b2b2
Fix SonarQube's "IEnumerable LINQs should be simplified" / Code Smell (#341) 2021-05-13 21:47:27 +01:00
Tom Minka 95a6d39f7a
Fixed an issue where IsBetweenGaussianOp would throw "ylInvSqrtVxlPlusAlphaX is NaN" (#338)
* Renamed IsBetweenOp -> IsBetweenGaussianOp
2021-04-19 20:08:07 +01:00
Jonathan Tims bd821dac12
Migrate to VSEngSS-MicroBuild2019
2021-04-13 18:04:19 +01:00
Jonathan Tims d4966ec295
Update status badges (for default branch rename)
2021-04-07 09:41:57 +01:00
Jonathan Tims 33bc436481
Update status badges (for default branch rename)
A different link is required now that the default branch has been renamed.
2021-04-07 09:28:00 +01:00
Jonathan Tims 50afbcd442
Disabled "all compiler options" build for PRs (#331) 2021-03-18 12:33:15 +00:00
Tom Minka 9b9c7a0571
Variable.Subarray, GetItems, Observed, and Constant are overloaded for IReadOnlyList arguments instead of IList (#330)
* Incremented version to 0.4
* Subarray and GetItems factors and operators take IReadOnlyList instead of IList.
* IMatchboxRecommenderMapping uses IReadOnlyList instead of IList.
* Moved Subarray and GetItems factors from Factor class to Collection class.
* Moved variable factors from Factor class to Clone class.
* Conversion.IsAssignableFrom handles covariance.
* Util.GetElementType and IsIList include IReadOnlyList.
* Code cleanup
* Refactored MessageTransform.ConvertMethodInvoke
* Removed Collection.Sort
2021-03-18 12:15:16 +00:00
Jonathan Tims dcb5ac5d5d
Move compiler options test to separate net461 process (#310)
# problem
The Azure DevOps VSTest runner does not handle the Compiler Options test well because the test takes a very long time to run.

# solution
Run the different options in parallel and in a separate process, so that the test finishes reliably and in a reasonable time.
2021-03-18 09:21:27 +00:00
Jonathan Tims b61f522d9c
Change references to "main" branch
2021-03-17 21:28:39 +00:00
Tom Minka 9cd729fd29
CopyPropagationTransform can substitute a value of different type if the context allows it (#328)
InferNet.Infer is generic.
ModelBuilder uses the correct overload of InferNet.Infer.
CodeBuilder.Method checks for correct number of arguments.
FileArray implements IReadOnlyList.
2021-03-16 15:06:59 +00:00
Tom Minka 4394af9e21
Added GetProbBetween to CanGetProbLessThan (#325)
CanGetQuantile and CanGetProbLessThan are generic.
2021-03-12 07:16:01 +00:00
Tom Minka b16f8074d4
Added MeanAccumulator and MMath.WeightedAverage (#324) 2021-03-11 18:49:31 +00:00
Tom Minka eeae7afcb5
Added TruncatedPoisson (#323)
* Added WordStrings and StringFormatTests.
* FactorManager does not allow point mass conversion of the return value argument of EP evidence methods (previously handled by MessageTransform).
* Code cleanup.  Renamed IdentityComparer to ReferenceEqualityComparer.
2021-03-05 18:29:54 +00:00