Commit graph

452 commits

Author SHA1 Message Date
Philip Ball 3855935efd add random data point corruption ability, change to scatter 2019-08-30 15:53:21 +01:00
Robert Tinn e887a5ff94 Add model with student t likelihood to tutorial 2019-08-29 18:44:51 -07:00
Robert Tinn df775f7a06 Add student t likelihood as option 2019-08-29 13:29:54 -07:00
Rob Tinn 6b84def27a
Merge pull request #5 from r-tinn/r-tinn/regression-tutorial
R tinn/regression tutorial
2019-08-29 13:01:24 -07:00
Robert Tinn f9a7c81052 Add comment to stabilising factor for Positive Definite Matrix 2019-08-29 09:35:56 -07:00
Robert Tinn 8c9cfc5c74 Alter threading import position 2019-08-29 09:33:25 -07:00
Robert Tinn 0f884437fc Merge branch 'master' into r-tinn/regression-tutorial 2019-08-29 08:46:24 -07:00
Robert Tinn c741b81935 Add comments about AIS dataset 2019-08-29 08:44:57 -07:00
Robert Tinn f4d07ab2e4 Draft regression tutorial 2019-08-23 19:09:47 +01:00
Tom Minka a280a6272f Added GatedInnerProductVectorTest and missing overloads of InnerProductOp 2019-08-23 16:03:05 +01:00
Tom Minka 98e07ae335 AutomatonTests.Determinize10 is OpenBug 2019-08-22 18:09:39 +01:00
Tom Minka 9e7dfcf7f2 InnerProduct supports fixed output when B vector has zero or one nonzero element.
ExpOp_Slow handles GammaPower messages.
GammaPower implements division by constant.
2019-08-22 14:53:06 +01:00
Philip Ball 027ff32ea0
Merge pull request #2 from r-tinn/pjball/tidyup-inverse-fix
tidy up stabilising code for matrix inversion
2019-08-21 16:49:56 +01:00
Philip Ball 0296628640 tidy up stabilising code for matrix inversion 2019-08-21 16:47:58 +01:00
Philip Ball e36b48c2cc tidy up stabilising code for matrix inversion 2019-08-21 16:45:18 +01:00
Rob Tinn ab10db14a1
Merge pull request #1 from r-tinn/pjball/data-set-generator
Data set generator
2019-08-21 16:26:46 +01:00
Philip Ball 6718d65eea tidy up references (alphabetical) 2019-08-21 16:21:00 +01:00
Philip Ball a230384a50 remove regression example as we will use Rob's instead 2019-08-21 16:18:47 +01:00
Philip Ball 77881cd4d3 introduce plotting, create helper function for Vector linspace generation 2019-08-21 16:13:01 +01:00
Philip Ball c2bf3e45ac MWE for data generator, has some stability hotfixes for matrix inversion 2019-08-21 13:41:52 +01:00
Philip Ball daea7bda3f create a regression example 2019-08-21 10:46:57 +01:00
Tom Minka bb0a6fd597
Merge pull request #160 from dotnet/ShowFactorGraph
ShowFactorGraph appends array indexing to a variable's name when it is local to a ForEachBlock.
2019-08-07 18:31:51 +01:00
Tom Minka ac9dae99f7 ShowFactorGraph appends array indexing to a variable's name when it is local to a ForEachBlock.
WindowsVisualizer reuses code from DefaultVisualizer
2019-08-06 18:57:06 +01:00
Vijay 40ace0c839
Merge pull request #158 from dotnet/a-vishar-readme-update
Updated readme to remove debug badges.
2019-06-26 13:03:01 +01:00
Vijay c312783608
Updated readme
Updated readme to remove debug build badges.
2019-06-26 12:27:01 +01:00
Tom Minka d5cdbad37f
Merge pull request #157 from dotnet/minka
Improved numerical accuracy of GaussianOp, DoubleIsBetweenOp, and MMath.NormalCdf2
2019-06-25 23:59:54 +01:00
Tom Minka 397642ff2a Improved numerical accuracy of GaussianOp, DoubleIsBetweenOp, and MMath.NormalCdf2.
Fixed cases where MMath.NormalCdf, DoubleIsBetweenOp, IsPositiveOp, DoublePlusOp would throw.
JaggedSubarrayOp uses ForceProper.
PlusDoubleOp gracefully handles improper distributions on input.
Added MMath.NormalCdfIntegral, NormalCdfDiff, NormalCdfExtended.
Added ExtendedDouble class.
MeanVarianceAccumulator correctly handles weight=0.
Region.GetLogVolume has a lower bound.
Variables with PointEstimate do not use JaggedSubarrayWithMarginal.
2019-06-25 23:23:09 +01:00
Tom Minka dbd465f0b9
Merge pull request #156 from dotnet/quantiles
OuterQuantiles and InnerQuantiles can be serialized and compared.
2019-06-13 23:01:25 +01:00
Tom Minka 63fe682bed OuterQuantiles and InnerQuantiles can be serialized and compared.
Their second constructor is now a factory method.
2019-06-13 22:35:24 +01:00
Tom Minka 7e79cb8b4b
Merge pull request #155 from dotnet/quantiles
InnerQuantiles.GetQuantile is increasing
2019-06-08 12:02:22 +01:00
Tom Minka 8e7c7d5a96 InnerQuantiles.GetQuantile is increasing 2019-06-08 11:32:12 +01:00
Tom Minka 17ba73a3ab
Merge pull request #154 from dotnet/quantiles
InnerQuantiles and OuterQuantiles throw on infinite quantiles.  InnerQuantiles.GetLogProb and GetQuantile are non-decreasing.
2019-06-07 21:34:57 +01:00
Tom Minka 8e237a0d02 Added GammaPower.GetMode 2019-06-07 20:49:23 +01:00
Tom Minka b7ab86af1b InnerQuantiles and OuterQuantiles throw on infinite quantiles. InnerQuantiles.GetLogProb and GetQuantile are non-decreasing. 2019-06-07 20:36:09 +01:00
Ivan Korostelev 70ae46cd79
Fix DiscreteChar.Sample() (#153)
Due to a refactoring mistake, `sampleProb / prob * intervalLength`
turned into `sampleProb / (prob * intervalLength)`, which is obviously incorrect.

Fixed that + added a rudimentary test for `DiscreteChar.Sample()`
2019-05-28 15:52:38 +01:00
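The precedence difference described in the `DiscreteChar.Sample()` fix above can be illustrated with a minimal sketch. The function and parameter names below are hypothetical and only mirror the two expressions quoted in the commit message; this is not the library's code.

```python
def scale_intended(sample_prob, prob, interval_length):
    # Division and multiplication associate left to right, so this is
    # (sample_prob / prob) * interval_length.
    return sample_prob / prob * interval_length


def scale_after_refactoring(sample_prob, prob, interval_length):
    # The extra parentheses move interval_length into the denominator,
    # which changes the result whenever interval_length != 1.
    return sample_prob / (prob * interval_length)
```

The two forms agree only when `interval_length` is 1, which may be why a mistake of this kind can go unnoticed without a dedicated test.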
Tom Minka 5b67978a61
Merge pull request #151 from loopylangur/pareto-fix
Pareto distribution bug fix
2019-05-07 12:33:31 +01:00
loopylangur 30cce7edc7 pareto bug fix 2019-05-04 17:05:05 -05:00
Elena Pochernina 97c8995e4f Added test for StringFormatOp parallel run (#147)
Made the `ArgumentCountToNames` cache thread-safe.
After that, another exception appeared with the `ArgsToValidatingAutomaton` dictionary (null reference exception). Making the dictionary concurrent fixed this error.
2019-04-24 11:20:23 +01:00
Ivan Korostelev e486a155b9
Automaton determinization improvements (#144)
* Determinization doesn't change language of the automaton anymore.
  2 new tests were added which check that it doesn't happen.
  Previously all very low probability transitions (with probability of less than e^-35) were removed.
  That was done because of 2 reasons:
  - some probabilities were represented in linear space. And e^-35 is as low resolution
    as you can get with regular doubles. (That is, if one probability = 1 and another is e^-35,
    then when you add them together, second one is indistinguishable from zero).
    This was fixed when discrete charprobabilities
  - Trying to determinize some non-determinizable automata led to an explosion of low-probability
    states and transitions, which resulted in very poor performance
    (e.g. the `AutomatonNormalizationPerformance3` test). Now a smarter strategy for detecting
    these non-determinizable automata is used: during traversal, all sets of states reached from the root
    are remembered. If the automaton comes to the same set of states but with different weights,
    then it stops immediately, because otherwise it would be caught in an infinite loop.
* `Equals()` and `GetHashCode()` for `WeightedStateSet` take into account only high 32 bits of weight.
  This, coupled with normalization of weights, allows reusing already added states with
  very close weights. This speeds up the "PropertyInferencePerformanceTest" 2.5x due to
  smaller intermediate automata.
* Weighted sets of size one are handled specially in `TryDeterminize` - they don't need to be
  determinized and can be copied into result almost as is. (Unless they have non-deterministic
  transitions. Simple heuristic of "has different destination states" is used to detect that
  and fallback to slow general path).
* Representation for `WeightedStateSet` is changed from (int -> float) dictionary to
  sorted array of (int, float) pairs. As an optimization, a common case of single-element
  set does not allocate any arrays.
* Determinization code for `ListAutomaton` was removed, because it has never worked
2019-04-23 15:30:16 +01:00
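The idea of comparing weighted state sets by only the high 32 bits of each weight, after normalization, can be sketched as follows. This is a minimal illustration on plain Python floats, assuming normalization by the maximum weight; the names `high32` and `state_set_key` are hypothetical, and the actual `WeightedStateSet` implementation may differ.

```python
import struct


def high32(weight: float) -> int:
    """Return the upper 32 bits of the IEEE 754 double encoding of weight."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", weight))
    return bits >> 32


def state_set_key(pairs):
    """Build a hashable key from (state_index, weight) pairs.

    Pairs are sorted by state index, weights are normalized by the
    maximum weight (an assumption of this sketch), and only the high
    32 bits of each normalized weight take part in the key.
    """
    ordered = sorted(pairs)
    max_weight = max(weight for _, weight in ordered)
    return tuple((state, high32(weight / max_weight)) for state, weight in ordered)
```

Two sets whose weights differ only by a common scale factor or by a tiny relative error map to the same key, so a dictionary keyed this way treats them as one already-visited state.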
Tom Minka 526bfefd3b
Merge pull request #150 from novikov-alexander/master
Irrelevant exception while setting TraitCount to 0 is fixed
2019-04-23 09:57:30 +01:00
Alexander Novikov 446599cd36 Irrelevant exception while setting TraitCount to 0 is fixed 2019-04-23 10:37:23 +03:00
Ivan Korostelev faf55fffb9
Do not fail in MergeParallelTransitions when transition weights are high (#149)
Transition weights may become infinite when going from log space into value space. And
`SetToSum()`, which is used by `MergeParallelTransitions`, can't handle 2 infinite weights at once.
To solve this, transition weights are normalized by the max weight prior to passing into `SetToSum()`.
2019-04-20 00:10:13 +01:00
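Normalizing by the maximum weight before summing, as described in the commit above, is the standard log-sum-exp technique. A minimal sketch, assuming weights are held as plain log-space floats (the function name `log_sum` is hypothetical and is not the library's `SetToSum()`):

```python
import math


def log_sum(log_weights):
    """Return log(sum(exp(w))) for a non-empty list of log-space weights."""
    max_log = max(log_weights)
    if math.isinf(max_log):
        # Either every weight is zero (-inf) or at least one is infinite (+inf);
        # in both cases the maximum is already the answer.
        return max_log
    # After subtracting the maximum, every exponent is <= 0, so exp() cannot overflow.
    return max_log + math.log(sum(math.exp(w - max_log) for w in log_weights))
```

Without the normalization, `math.exp(1000.0)` overflows, whereas `log_sum([1000.0, 1000.0])` evaluates to `1000 + log(2)`.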
Tom Minka 8398c1d4f7
ModelCompiler.TraceAllMessages activates Tracing.Trace for all variables (#145)
Tracing.Trace uses System.Diagnostics.Trace.
Added DifficultyAbility.fs to TestFSharp.
2019-04-01 22:52:22 +01:00
Ivan Korostelev 9854cb768a
Faster PointMass operations for DiscreteChar (#143)
Point masses are the most common form of char distributions. Common operations for them
have been sped up:
* `IsPointMass` and `Point` are now computed at DiscreteChar construction time and
  stored in `char? Storage.Point`. `Storage.Ranges` still contains a range with the point mass,
  because many operations are not specialized for points.
* `FindProb()`, `GetLogAverageOf()` and `SetToProduct()` now have fast-path for pointmass distributions.
* `DiscreteChar` and `DiscreteChar.Storage` implement `IEquatable<>`
2019-03-28 21:47:25 +00:00
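The point-mass fast path described above can be sketched as follows. This is an illustrative toy, not the `DiscreteChar` implementation: the class `CharDist`, its range layout, and the method names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass(frozen=True)
class CharDist:
    # Each range is (start_code, end_code_exclusive, probability_per_char).
    ranges: Tuple[Tuple[int, int, float], ...]
    # Cached point, computed once at construction time.
    point: Optional[str] = field(init=False)

    def __post_init__(self):
        point = None
        if len(self.ranges) == 1:
            start, end, prob = self.ranges[0]
            if end - start == 1 and prob == 1.0:
                point = chr(start)
        # The dataclass is frozen, so the cached field is set via object.__setattr__.
        object.__setattr__(self, "point", point)

    @property
    def is_point_mass(self) -> bool:
        return self.point is not None

    def find_prob(self, ch: str) -> float:
        if self.point is not None:
            # Fast path: no range scan is needed for a point mass.
            return 1.0 if ch == self.point else 0.0
        code = ord(ch)
        for start, end, prob in self.ranges:
            if start <= code < end:
                return prob
        return 0.0
```

The ranges are kept even for a point mass, so operations without a specialized path can still use the general representation.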
Ivan Korostelev 9ab76c6b7c
Change transitions range representation in Automaton.StateData (#140)
* Pair (FirstTransitionIndex, LastTransitionIndex) changed to pair
  (FirstTransitionIndex, TransitionsCount).
* Introduced Automaton.Builder.LinkedStateData struct which
  mirrors Automaton.StateData but represents transitions as linked list.
  Previously Automaton.StateData was reused for this purpose.
  That was confusing.
2019-03-26 19:20:27 +00:00
Ivan Korostelev 26c093e17f
GetLogProb should return stored log value directly (#139)
Since DiscreteChar was refactored, GetLogProb() can directly return the saved
logarithm of probability, instead of first exponentiating it and then taking the logarithm.
2019-03-26 18:44:08 +00:00
Ivan Korostelev 8d1857adbf
Make ToLower() work with more char classes (#135)
Previously `NotSupportedException` was thrown for `WordChar` and `Uniform` char classes.
Now versions of these distributions with upper chars excluded are returned.

Note: implementation of `ToLower()` for `Unknown` char class is still somewhat broken:
- it ignores probability outside ranges.
- it can get slow if char ranges are big.
2019-03-22 14:23:15 +00:00
Ivan Korostelev a0246ff22d
Infer determinization state in simple cases (#136)
If both inputs are determinized, then the product is also determinized.
2019-03-22 10:24:16 +00:00
Ivan Korostelev 946b5dfb83
Represent ReadOnlyArraySegment range as a (begin, length) tuple
1. Previously it was represented as (begin, end) tuple.
2. Add index checks in debug builds
2019-03-21 16:15:44 +00:00
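A (begin, length) segment with debug-only index checks, as described in the commit above, might look like the following sketch. The class name `ArraySegmentView` is hypothetical; Python's `assert` statements play the role of debug-build checks here because they are skipped when the interpreter runs with `-O`.

```python
class ArraySegmentView:
    """Read-only view over items[begin : begin + length]."""

    def __init__(self, items, begin, length):
        # Checked only in "debug" runs; asserts are skipped under `python -O`.
        assert 0 <= begin and 0 <= length and begin + length <= len(items)
        self._items = items
        self._begin = begin
        self._length = length

    def __len__(self):
        # The length is stored directly, so no (end - begin) subtraction is needed.
        return self._length

    def __getitem__(self, index):
        assert 0 <= index < self._length, "index out of range"
        return self._items[self._begin + index]

    def __iter__(self):
        for offset in range(self._length):
            yield self._items[self._begin + offset]
```

For example, `ArraySegmentView([10, 20, 30, 40], 1, 2)` exposes the two elements `20` and `30`.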
Ivan Korostelev 036f9bfd71
Store log probabilities inside DiscreteChar (#137)
In some edge cases StringAutomaton needs to represent extremely
low probabilities of character transitions. To be able to do that,
instead of storing probabilities as double values they are stored
as `Weight` structs already used by automatons. Weight stores logarithm
of value instead of value itself.
2019-03-21 13:49:06 +00:00
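The benefit of storing the logarithm instead of the value can be shown with a small sketch. The `LogWeight` class below is a hypothetical stand-in written for this example and is not the library's `Weight` struct.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LogWeight:
    # Logarithm of the represented probability.
    log_value: float

    @staticmethod
    def from_value(value: float) -> "LogWeight":
        return LogWeight(math.log(value) if value > 0 else -math.inf)

    def __mul__(self, other: "LogWeight") -> "LogWeight":
        # Multiplying probabilities is adding their logarithms.
        return LogWeight(self.log_value + other.log_value)

    def to_value(self) -> float:
        # May underflow to 0.0 for extremely small probabilities.
        return math.exp(self.log_value)
```

With `tiny = LogWeight(-500.0)`, the product `tiny * tiny` keeps the exact log value `-1000.0`, while the same product computed on plain doubles underflows to `0.0`.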