* fixing pypi readme errors and adding a test
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* installing dynamic versioning
Signed-off-by: Amit Sharma <amit_sharma@live.com>
---------
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* Add compatibility between GCM and CausalModel class
Before, converting causal graphs between the GCM and CausalModel parts required some additional work.
Now, a GCM can be used to initialize a CausalModel or CausalGraph and vice versa.
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
* minor changes and updating docstrings
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* fixed error when incompatible type is added and added a test for it
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* fixed formatting issues
Signed-off-by: Amit Sharma <amit_sharma@live.com>
---------
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Signed-off-by: Amit Sharma <amit_sharma@live.com>
Co-authored-by: Amit Sharma <amit_sharma@live.com>
It no longer raises a division-by-zero error. Other changes:
- Add new parameter indicating whether the method requires data for all nodes in the graph or also allows a subset of data.
- If no tests were performed, the summary now returns "Cannot be evaluated".
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
* removed deepiv and updated flaky test
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* black reformatting
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* removed all outputs from nb
Signed-off-by: Amit Sharma <amit_sharma@live.com>
---------
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* linked to up-to-date list of estimators
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* updated docs
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* using absolute paths
Signed-off-by: Amit Sharma <amit_sharma@live.com>
---------
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* fixed bug where CATE is not returned by lr
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* added test
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* formatted file
Signed-off-by: Amit Sharma <amit_sharma@live.com>
---------
Signed-off-by: Amit Sharma <amit_sharma@live.com>
- Slightly update and revise existing GCM notebooks
- Moving mediation analysis, direct arrow strength and ICC to their own "Quantify Causal Influence" section
- Adding brief overview to describe differences between the quantification methods
- Change navigation image to reflect newest changes
- Adding related notebooks links to some of the causal task entries
- Adding a direct arrow strength example to the ICC notebook
- Adding a brief overview of the available root cause analysis and explanation methods
- Smaller revision of other GCM entries, such as the basic example
- Smaller typo fixes and missing reference fixes
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
- Reduce it to only the essential points
- Add image as overview of the offered features
- Made connection between GCM and PO framework more consistent
- Revise the GCM example to an executable code snippet
- Removed conda installation guide, because conda is still at version 0.8
- Extended some references to include GCM related work
- Fix build status icon
- Changed github references to py-why (was still pointing to microsoft)
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
- Add new Discrete Additive Noise Model class that enforces the outputs to be discrete. This should help in generating more consistent data.
- As part of this, revised the auto assignment function and revised its docstring.
- Revise the auto assignment summary.
- Revise the evaluation summary.
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
* Deprecate CausalGraph
The effect estimation API is now based on a functional API that expects a networkx graph as input.
- The graph should now be defined via a networkx graph. Most identification methods now expect an additional "observed_nodes" parameter accordingly.
- CausalModel and CausalGraph still exist and should be compatible with the old API.
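
As a rough illustration of the new input style, the sketch below defines a causal graph as a plain networkx DiGraph and keeps the observed nodes in a separate list. The node names are made up for this example, and no DoWhy function is called here; which functions accept "observed_nodes" is not shown in this message.

```python
import networkx as nx

# Hypothetical example graph: Z confounds X and Y, U is an unobserved confounder.
causal_graph = nx.DiGraph(
    [("Z", "X"), ("Z", "Y"), ("X", "Y"), ("U", "X"), ("U", "Y")]
)

# Nodes for which data is available; passed separately from the graph.
observed_nodes = ["Z", "X", "Y"]

# A causal graph is expected to be acyclic.
is_dag = nx.is_directed_acyclic_graph(causal_graph)

# Everything in the graph that is not observed.
unobserved_nodes = set(causal_graph.nodes) - set(observed_nodes)
```

Keeping the observed nodes outside the graph object means the graph itself stays a standard networkx structure without extra node attributes.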
---------
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Signed-off-by: Amit Sharma <amit_sharma@live.com>
Co-authored-by: Amit Sharma <amit_sharma@live.com>
* auto identify the effect modifier columns
Signed-off-by: Amit Sharma <amit_sharma@live.com>
* fixed formatting errors
Signed-off-by: Amit Sharma <amit_sharma@live.com>
---------
Signed-off-by: Amit Sharma <amit_sharma@live.com>
In addition to CRPS and depending on the node data type, it now also reports the MSE, NMSE, R2 and F1 score.
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Before, the Support Vector Classifier did not produce probabilities, which are required for different algorithms in the GCM module. This changes the 'probability' parameter to True.
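
A minimal sketch of the effect of this parameter in scikit-learn, using made-up toy data. It only shows the scikit-learn behavior and is not the GCM wrapper code itself.

```python
import numpy as np
from sklearn.svm import SVC

# Toy two-class data (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (30, 2)), rng.normal(2, 1, (30, 2))])
y = np.array([0] * 30 + [1] * 30)

# probability=True enables predict_proba on the fitted classifier.
clf = SVC(probability=True, random_state=0).fit(X, y)
proba = clf.predict_proba(X[:5])  # one row per sample, one column per class
```

Note that scikit-learn fits an additional internal calibration step when this option is enabled, so fitting is slower than with the default setting.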
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
If the confidence intervals are misspecified, e.g., a lower bound greater than the upper bound, the method threw an error before. This, however, can sometimes happen due to precision errors in some algorithms and lead to random build failures. This change fixes the issue and ignores invalid intervals accordingly.
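
The idea of ignoring invalid intervals can be sketched as follows. The function name and the example values are hypothetical and only illustrate the filtering step, not the actual implementation.

```python
def drop_invalid_intervals(intervals):
    """Keep only intervals whose lower bound does not exceed the upper bound.

    Hypothetical helper for illustration; intervals are (lower, upper) pairs.
    """
    return [(lower, upper) for lower, upper in intervals if lower <= upper]


# The second interval is invalid (lower > upper) and gets dropped.
intervals = [(0.1, 0.4), (0.5, 0.3), (0.2, 0.2)]
valid_intervals = drop_invalid_intervals(intervals)
```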
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Before, the scorer was not able to handle numpy object types directly. However, GCM often uses the object dtype to ensure support of mixing categorical and float values. This fixes the handling of object dtypes by explicitly converting them to floats first.
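
A small sketch of the conversion idea with numpy, using a simple mean squared error as a stand-in for the scorer. The helper name and the data are assumptions made for this example.

```python
import numpy as np


def to_float_array(values):
    """Convert an array (possibly of dtype object) to a float array."""
    return np.asarray(values).astype(float)


# Arrays with dtype=object, as they can occur with mixed-type data.
y_true = np.array([1, 2.5, 3], dtype=object)
y_pred = np.array([1.0, 2.0, 3.5], dtype=object)

# Convert explicitly before computing the score.
y_true_float = to_float_array(y_true)
mse = float(np.mean((y_true_float - to_float_array(y_pred)) ** 2))
```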
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Before, the method threw an error when all samples were equal. However, in these cases, it should rather return a KL divergence of 0.
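
To illustrate the degenerate case, here is a sketch that uses a simple Gaussian approximation of the KL divergence. This is a stand-in chosen for brevity and not the estimator used in the module; the function name is hypothetical.

```python
import math
from statistics import fmean, pvariance


def gaussian_kl_divergence(x, y):
    """KL divergence between Gaussians fitted to two 1-D sample lists.

    Illustrative stand-in estimator, not the module's actual method.
    """
    # Degenerate case: every sample in both sets has the same value,
    # so the two empirical distributions are identical.
    if all(v == x[0] for v in x) and all(v == x[0] for v in y):
        return 0.0

    var_x, var_y = pvariance(x), pvariance(y)
    if var_x == 0 or var_y == 0:
        raise ValueError("Gaussian approximation needs non-zero variance.")

    mean_x, mean_y = fmean(x), fmean(y)
    return 0.5 * (
        math.log(var_y / var_x) + (var_x + (mean_x - mean_y) ** 2) / var_y - 1.0
    )


kl_constant = gaussian_kl_divergence([2.0] * 5, [2.0] * 5)
kl_shifted = gaussian_kl_divergence([0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0])
```

Without the early return, the constant case would hit a zero variance and fail, which mirrors the kind of error described above.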
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
This module adds a new method for evaluating a fitted gcm. Here, we evaluate the performance of causal mechanisms, the underlying modeling assumptions (if possible), the goodness of the generated joint distribution and the graph structure. This utilizes some of the existing methods, but also introduces new ones.
This further adds a new user guide and notebook entries demonstrating the usage.
Introducing the module required some changes in other modules and implementations, which are mostly fixes and improvements.
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
The build sometimes fails randomly due to a timeout in the unit tests of the unit change methods of the GCM module. This only happens in the GitHub builds and is most likely caused by the parallelization of the underlying RandomForestRegressors being fitted.
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
This change aims at providing a better overview of the notebooks by displaying them as separate cards instead of a card carousel.
Other changes:
- Introductory examples and real-world-inspired examples are now more prominent with individual images and a grid layout with 2 examples per row.
- All other examples are now in a grid layout with 3 examples per row.
- Clear outputs of some notebooks.
- Fix issue with rendering counterfactual example notebook.
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Before, when creating a linear regressor with fixed parameters, these parameters were overridden when fit to data. Now, the parameters remain fixed.
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
This should work better with multivariate data and mixed data types. However, it is generally slower than the knn approach.
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
The graph module uses Protocols, which are natively supported since Python 3.8 (the required minimum version for DoWhy). To support Protocols in earlier Python versions, an extension provided Protocol support. However, this extension can cause compatibility issues with newer versions of other packages.
DoWhy now uses the Protocol implementation that is available since Python 3.8.
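
For reference, a minimal sketch of the standard library Protocol in use. The class and function names are hypothetical and are not the ones defined in the graph module.

```python
from typing import Protocol, runtime_checkable  # available since Python 3.8


@runtime_checkable
class HasNodes(Protocol):
    """Hypothetical protocol: anything that exposes a 'nodes' attribute."""

    @property
    def nodes(self): ...


class TinyGraph:
    """Satisfies HasNodes structurally, without inheriting from it."""

    def __init__(self, nodes):
        self._nodes = list(nodes)

    @property
    def nodes(self):
        return self._nodes


def count_nodes(graph: HasNodes) -> int:
    return len(list(graph.nodes))


graph = TinyGraph(["X", "Y", "Z"])
node_count = count_nodes(graph)
```

Because the check is structural, any graph object with a matching interface can be passed in, which is what makes a Protocol-based graph API independent of a specific graph class.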
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Changing the value from "dot" to None. The current default value requires the dot extension to be installed. However, if it is not installed, an error is raised. By default, this should not happen since it is an optional dependency.
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>