infer

Commit graph

Author	SHA1	Message	Date
Tom Minka	791d11a8a1	MMath.WeightedAverage fix (#379 )	2021-12-26 20:03:11 +00:00
Tom Minka	5f394f9b8a	Added IClassifierMapping.LabelToString and ParseLabel. (#378 ) BinaryNativeClassifierMapping and MulticlassNativeClassifierMapping serialize labels as strings. Incremented CustomSerializationVersion for each. BinaryNativeClassifierMapping does not serialize labels when TLabel is bool. IReader.ReadObject is generic.	2021-12-14 11:33:00 +00:00
Tom Minka	3c825f2d29	BayesPointMachineClassifier.CreateGaussianPriorBinaryClassifier is public. (#376 ) * Added Variable.Max(int,int). * Compiler warns about excess memory consumption in more cases when it should, and fewer cases when it shouldn't. * TransformBrowser shows attributes by default. * Updated FactorDocs	2021-12-10 23:37:00 +00:00
Jonathan Tims	cc132f77f8	Switch to explicit pools in YAML for PR build (#374 )	2021-11-30 14:39:43 +00:00
Tom Minka	6c6fb65dd2	BayesPointMachineClassifier can be serialized as text (#373 ) * BayesPointMachineClassifier.LoadBackwardCompatibleBinaryClassifier and SaveForwardCompatible use text or binary depending on the file extension * Added IWriter and IReader, WrappedBinaryWriter, WrappedBinaryReader, WrappedTextWriter, WrappedTextReader * Changed uses of BinaryWriter to IWriter * Changed uses of BinaryReader to IReader * LearnersTests uses same xunit version as other tests	2021-11-29 23:08:58 +00:00
Ivan Korostelev	85be0de052	Avoid accidental quadratic behavior in caching for Automata calculations (#370 ) Introduce "generational" strategy for clearing intermediate data in reused preallocated array for `Automaton.Condensation.FindStronglyConnectedComponents` state. To avoid allocations `FindStronglyConnectedComponents` employs a preallocated array with data about states. It may use this cache multiple times when processing a single automaton. It typically will use only a handful of entries in this array, but each time it grabs it, it has to clear it in full. With previous array clearance strategy in the worst case `ComputeEpsilonClosure` was a quadratic algorithm, since it will call `FindStronglyConnectedComponents` as many times as there are states in automaton - (N). And `FindStronglyConnectedComponents` will clear array which is also O(N), turning the whole thing into O(N^2)	2021-10-06 14:21:28 +01:00
Ivan Korostelev	2376606055	Remove Pair<> and introduce IntPair (#369 ) C# now has a built-in type to represent pairs - `ValueTuple<>`. `Pair<>` was eliminated in favour of it. At the same time a new type - `IntPair` was introduced. It is faster than `ValueTuple<int, int>` when `.Equals()` is called often. It makes some common string operations which do many lookups by `IntPair` keys up to 10% faster.	2021-09-30 17:11:04 +01:00
Ivan Korostelev	376918bfd6	Reduce GC pressure induced by string operations (#362 ) A lot of automata operations create large short-lived data structures. Those are now cached in thread local static fields. This saves a lot of allocations and consequently causes less GC pressure. To do so, a new container is also introduced, `GenerationalDictionary`, which can be cleared in constant time and reuses memory after the Clear. These changes reduce the amount of memory allocated by `StringInferencePerformanceTests` from 4 GB to 2 GB. Another 1 GB is allocated by `FindStronglyConnectedComponents`, which will be fixed in a separate PR, as it is harder to introduce caching there. Another 1 GB is the setup of the tests (creation of test data).	2021-09-29 18:28:15 +01:00
Tom Minka	591117241e	All-zero Discrete is not a point mass (#368 )	2021-09-29 11:19:15 +01:00
Tom Minka	76ec66dcca	Changed the BCC confusion matrix prior (#367 ) Changed the BCC confusion matrix prior so that TrueLabels can be inferred when LabelCount==2. Fixed serialization of BCC posteriors. BCC posteriors now save to the Results folder. Fixed serialization example code.	2021-09-29 07:09:38 +01:00
Tom Minka	c90ad0bf1e	Discrete distribution can be all zero (#364 ) Added Discrete.Zero and IsZero. IntegerPlusOp allows result to be the same object as an input.	2021-09-23 17:16:56 +01:00
Ivan Korostelev	8370c9e8e7	Simplify discrete char implementation (#361 ) There are two changes: 1. (major) `LogProbabilityOverride` was removed from `DiscreteChar` and `ImmutableDiscreteChar` This was an unsound functionality that is not used and was complicating the implementation of char distributions 2. (minor) `ImmutableDiscreteChar.Multiply` may reuse immutable discrete char in more cases out of the box. This reduces the GC pressure a little bit.	2021-09-14 21:53:05 +01:00
Tom Minka	abae518039	OuterQuantiles handles extreme values (#363 ) * Subarray checks that indices are distinct when debugging * Renamed ConcatOpTests to StringConcatOpTests * Crowdsourcing explains why accuracy and precision are NaN. * Added build instructions for Visual Studio Code.	2021-09-13 18:20:37 +01:00
msdmkats	321be1463d	Extend SequenceDistribution API and bugfix (#352 ) - Fixed automaton deserialization ignoring LogValueOverride - Fixed SequenceDistribution.EnumerateSupport and TryEnumerateSupport having different side effects - Added TryDeterminize, SetLogValueOverride, and ProjectOnTransducer methods to SequenceDistribution - Added a parameterless overload for Automaton.TryDeterminize that returns the output of TryDeterminize(out TThis) and discards information about deterministicity of said output	2021-07-26 16:54:41 +03:00
Tom Minka	14d4164ebc	Improved handling of numerical overflow (#347 ) Improved handling of numerical overflow in GaussianProductVmpOp.ProductAverageLogarithm, DoublePlusOp, Gaussian.FromDerivatives, Gaussian.GetDerivatives, GammaFromShapeAndRateOp_Slow.SampleAverageConditional	2021-07-13 00:27:14 +01:00
msdmkats	d27df8ce03	Automata: immutability and optimized GetLogValue (#344 ) * Immutable distribution interfaces * DiscreteChar made immutable * Automata made constant * Automaton.GetLogValue optimized for cases of deterministic and epsilon-free automata * Fixed Automaton.[Try]EnumerateSupport so that it won't produce duplicates for non-determinizable automata	2021-07-08 17:03:53 +03:00
msdmkats	bd48c63627	Multi-representable weight function for sequence distribution (#339 ) * Introduced IWeightFunction - interface for abstract weight functions used by SequenceDistribution * Multi-representable weight function for sequence distribution, that automatically switches between point mass, dictionary, and automaton representations as appropriate * Early stops for automaton support enumeration * Improved automata graphviz format * Language writer correctly processes nested generics	2021-06-08 14:03:44 +03:00
Martin	1d9536b2b2	Fix SonarQube's "IEnumerable LINQs should be simplified" / Code Smell (#341 )	2021-05-13 21:47:27 +01:00
Tom Minka	95a6d39f7a	Fixed an issue where IsBetweenGaussianOp would throw "ylInvSqrtVxlPlusAlphaX is NaN" (#338 ) * Renamed IsBetweenOp -> IsBetweenGaussianOp	2021-04-19 20:08:07 +01:00
Jonathan Tims	bd821dac12	Migrate to VSEngSS-MicroBuild2019	2021-04-13 18:04:19 +01:00
Jonathan Tims	d4966ec295	Update status badges (for default branch rename)	2021-04-07 09:41:57 +01:00
Jonathan Tims	33bc436481	Update status badges (for default branch rename) A different link is required now that the default branch has been renamed.	2021-04-07 09:28:00 +01:00
Jonathan Tims	50afbcd442	Disabled "all compiler options" build for PRs (#331 )	2021-03-18 12:33:15 +00:00
Tom Minka	9b9c7a0571	Variable.Subarray, GetItems, Observed, and Constant are overloaded for IReadOnlyList arguments instead of IList (#330 ) * Incremented version to 0.4 * Subarray and GetItems factors and operators take IReadOnlyList instead of IList. * IMatchboxRecommenderMapping uses IReadOnlyList instead of IList. * Moved Subarray and GetItems factors from Factor class to Collection class. * Moved variable factors from Factor class to Clone class. * Conversion.IsAssignableFrom handles covariance. * Util.GetElementType and IsIList include IReadOnlyList. * Code cleanup * Refactored MessageTransform.ConvertMethodInvoke * Removed Collection.Sort	2021-03-18 12:15:16 +00:00
Jonathan Tims	dcb5ac5d5d	Move compiler options test to separate net461 process (#310 ) # problem The Azure DevOps VSTest runner is not handling the Compiler Options test well, because it takes so long. Also the test takes a very long time to run. # solution Run different options in parallel, and in a separate process to finish reliably and in a reasonable time.	2021-03-18 09:21:27 +00:00
Jonathan Tims	b61f522d9c	Change references to "main" branch	2021-03-17 21:28:39 +00:00
Tom Minka	9cd729fd29	CopyPropagationTransform can substitute a value of different type if the context allows it (#328 ) InferNet.Infer is generic. ModelBuilder uses the correct overload of InferNet.Infer. CodeBuilder.Method checks for correct number of arguments. FileArray implements IReadOnlyList.	2021-03-16 15:06:59 +00:00
Tom Minka	4394af9e21	Added GetProbBetween to CanGetProbLessThan (#325 ) CanGetQuantile and CanGetProbLessThan are generic.	2021-03-12 07:16:01 +00:00
Tom Minka	b16f8074d4	Added MeanAccumulator and MMath.WeightedAverage (#324 )	2021-03-11 18:49:31 +00:00
Tom Minka	eeae7afcb5	Added TruncatedPoisson (#323 ) * Added WordStrings and StringFormatTests. * FactorManager does not allow point mass conversion of the return value argument of EP evidence methods (previously handled by MessageTransform). * Code cleanup. Renamed IdentityComparer to ReferenceEqualityComparer.	2021-03-05 18:29:54 +00:00
Tom Minka	3624808df0	Removed the unnecessary new() constraint on ICollectionDistribution.TransformElements and strengthened the method signature. (#321 ) * GammaPower.GetProbLessThan returns 0 for x < 0 * LDA example does not use DistributionRefArray * Updated FactorDocs * Updated Sequential code doc	2021-02-26 00:38:53 +00:00
Tom Minka	8f3189b8e0	StringFormatOp takes IReadOnlyList (#320 )	2021-02-09 15:10:29 +00:00
Tom Minka	7ae07237fe	SchedulingTransform identifies when a model with offset edges does not need iteration. (#319 ) Fixed ForwardBackwardTransform.	2021-02-09 00:40:02 +00:00
Ivan Korostelev	ac0c035bb9	Micro-optimizations for string distributions (#317 ) * Enumerating support of distribution does not require it to be normalized unless determinization is requested. Avoiding doing this speeds up enumerating some automata 10x. * States of `EpsilonClosure` can be stored in plain array instead of list. This minor change speeds up epsilon closure operations by 20% due to one less level of indirection and better memory allocation.	2021-02-08 14:19:42 +00:00
Ivan Korostelev	429b4debe7	Reimplement EnumerateSupport() (#315 ) Previous version was correct but had a pathological (exponential) runtime for some forms of automata with multiple branches with epsilon transitions. Also the code was unnecessarily complex, because it tried to compute unreachable states in presence of loops. Specific changes in this PR: - `EnumerateSupport` implementation was moved into its own file - `Automaton.EnumerateSupport` - `ComputeEndStateReachability` method has been resurrected. It is invoked only if loops are detected. In all other cases, it is trivial to check for end-state reachability during normal traversal - Traversal loop has been split in steps, some of which were moved into their own local methods. - Fast path for non-branchy part of automaton was implemented.	2021-02-08 12:06:09 +00:00
Tom Minka	b34f76f193	Channel2Transform creates unique array names (#316 ) Removed Cancels attribute from PowerPlateOp.EnterAverageConditional Sum_Expanded handles an empty array. Dirichlet.GetMean handles zero pseudo-counts. Moved TrueSkill tests into TrueSkillTests.	2021-02-04 08:10:25 +00:00
Tom Minka	9235b1732c	Added Python versions of string tutorials (#313 )	2021-01-21 13:05:40 +00:00
Tom Minka	0c75b5ef41	Basic support for difference of Beta-distributed variables (#309 )	2021-01-05 00:34:00 +00:00
Tom Minka	920e3facbc	DoubleIsBetweenOp with random bounds is more accurate (#306 )	2020-12-23 23:07:54 +00:00
Tom Minka	9b79818034	Update motif finder example (#304 ) * Motif Finder uses SequenceCount = 70 * DiscreteFromDirichletOp.ProbsAverageConditional handles PiecewiseVector * GammaPower.FromLogMeanMinusMeanLog handles infinite mean and negative power	2020-11-27 11:12:31 +00:00
Tom Minka	3baf283b6f	Fixed ExpOp with point mass arguments (#303 ) * Added Gamma.GetLogMeanMinusMeanLog and FromLogMeanMinusMeanLog	2020-11-23 22:02:32 +00:00
Tom Minka	69a7a1979f	Improved convergence rate of GammaPower.FromMeanAndMeanPower (#301 ) * Improved accuracy of ExpOp.ExpAverageConditional * Tutorials project uses Unicode OutputEncoding * Fixed MethodBodySynthesizer	2020-11-18 23:44:47 +00:00
Tom Minka	dcc3503b7d	Fixed TruncatedGaussian.SetToRatio (#300 ) * Improved accuracy of GammaPower.FromMeanAndMeanLog, ExpOp, ExpOp_Slow * Added MMath.DigammaInv	2020-11-11 08:01:47 +00:00
Tom Minka	b3d4436e1f	Generated code uses .NET arrays where possible (#298 ) * Improved accuracy of Gamma.FromMeanAndMeanLog, ExpOp, PlusGammaOp, PlusGammaVmpOp * Fixed GammaPower.SetMeanAndVariance for infinite variance * TruncatedGamma implements CanGetQuantile * TruncatedGamma.Sample is faster when shape==1 * Extracted PlusWrappedGaussianOp and PlusTruncatedGaussianOp from PlusDoubleOp * Gamma.FromMeanAndMeanLog takes an optimal logMean argument. Removed Gamma.FromLogMeanAndMeanLog. * Improved accuracy of GammaPower.FromMeanAndMeanLog. * ExpOp supports GammaPower output.	2020-10-29 19:21:12 +00:00
Jonathan Tims	0be0aefe1d	Rename ImmutableArray to ReadOnlyArray (#293 )	2020-09-30 18:54:20 +01:00
Jonathan Tims	568cb96cda	QuantileEstimator stores its random number generator (#291 ) This allows QuantileEstimator to produce the same result even if it is serialized/deserialized in the middle.	2020-09-25 15:42:09 +01:00
Tom Minka	f1c9073776	MethodInvoke.CanBeInlined returns true for more cases (#292 ) * Tests use Assert.Throws instead of try/catch * Test output is less verbose * Added SimplestBackwardChainTest3 * Fixed IterativeProcessTransform when variable has QueryTypes.MarginalDividedByPrior and ConstrainEqualRandom(variable, observed) * Discrete allows Dimension=0	2020-09-24 21:38:30 +01:00
msdmkats	e167ce18f3	Use ulp-based constants (#289 ) * Fix long overflow in OperatorTests.Longs() * Fixed printing big float arrays, extended series for ((exp(x) - 1) / x - 1) - 0.5 * GammaPower.GetLogProb uses ulp-based threshold * Moved a constant for maximum terms in NormalCdfMomentRatio outside of method body * Separate methods for Previous/NextDoubleWithPositiveDifference * More magic constants replaced with ulp-based ones * Fixed abs value comparison in GammaPower.GetLogProb * Ulp-based constants in IsBetween.XAverageConditional * Moved the definitions of Ulp1-dependent constants below that of Ulp1 * Replaced <= in GammaPower with a < as it used to be Co-authored-by: Dmitry Kats <ratkillerx@hotmail.com>	2020-09-15 10:18:21 +01:00
Tom Minka	9ca3230018	Update URL for BookCrossing dataset (#288 ) * Added Matrix.FromDiagonal	2020-09-09 23:54:45 +01:00
Tom Minka	2e03d713b9	Fixed Binomial.GetLogProb and Gamma.GetLogNormalizer (#287 ) * Removed references to NETFULL. * Don't define NET45. * Don't define NETCORE, NETSTANDARD, or NETSTANDARD2_0	2020-09-08 13:39:12 +01:00
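
Commits 85be0de052 (#370) and 376918bfd6 (#362) above both describe a "generational" way to reset reused scratch storage in constant time instead of wiping it on every use. The following Python sketch illustrates the general idea only. The class name `GenerationalArray` and all of its members are hypothetical; they are not taken from the Infer.NET sources, whose actual implementation is in C# and may differ.

```python
class GenerationalArray:
    """Fixed-size scratch array whose clear() runs in O(1).

    Hypothetical illustration of a generational clearing strategy;
    this is not the Infer.NET implementation.
    """

    def __init__(self, size, default=None):
        self._values = [default] * size
        self._stamps = [0] * size  # generation in which each slot was last written
        self._generation = 1       # current generation; stamp 0 means "never written"
        self._default = default

    def clear(self):
        # Invalidate every slot at once by moving to a new generation,
        # instead of overwriting all `size` entries.
        self._generation += 1

    def __getitem__(self, index):
        # A slot is live only if it was written during the current generation.
        if self._stamps[index] == self._generation:
            return self._values[index]
        return self._default

    def __setitem__(self, index, value):
        self._values[index] = value
        self._stamps[index] = self._generation


scratch = GenerationalArray(1000, default=-1)
scratch[3] = 42
before_clear = scratch[3]   # value written in the current generation
scratch.clear()             # constant time, regardless of array size
after_clear = scratch[3]    # stale slot reads as the default again
```

With this scheme, N uses of an N-element scratch array cost O(1) each to reset rather than O(N), which is how the quadratic behaviour described in #370 can be avoided. A fixed-width generation counter, as one would use in C#, would additionally need to handle wrap-around; Python integers do not overflow, so the sketch omits that.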


452 commits, all branches