Functional api/refute estimate (#672)

* refactor bootstrap, data_subset, random_common_cause refuters, placebo_treatment_refuter, add_unobserved_common_cause

Signed-off-by: Andres Morales <andresmor@microsoft.com>

* Add documentation

Signed-off-by: Andres Morales <andresmor@microsoft.com>

* Add refute_estimate function

Signed-off-by: Andres Morales <andresmor@microsoft.com>

* Add missing parameter

Signed-off-by: Andres Morales <andresmor@microsoft.com>

* Rename methods for identify_effect

Signed-off-by: Andres Morales <andresmor@microsoft.com>

* Fix notebook imports

Signed-off-by: Andres Morales <andresmor@microsoft.com>

Signed-off-by: Andres Morales <andresmor@microsoft.com>
Andres Morales 2022-10-06 22:30:22 -06:00 committed by GitHub
Parent 133e7b9a4e
Commit e1652ec3c6
No known key found for this signature
GPG key ID: 4AEE18F83AFDEB23
18 changed files: 2620 additions and 1731 deletions


@ -0,0 +1,341 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Functional API Preview\n",
"\n",
"This notebook is part of a set of notebooks that provides a preview of the proposed functional API for dowhy. For details on the new API for DoWhy, check out https://github.com/py-why/dowhy/wiki/API-proposal-for-v1 It is a work-in-progress and is updated as we add new functionality. We welcome your feedback through Discord or on the Discussions page.\n",
"This functional API is designed with backwards compatibility. So both the old and new API will continue to co-exist and work for the immediate new releases. Gradually the old API using CausalModel will be deprecated in favor of the new API. \n",
"\n",
"The current Functional API covers:\n",
"* Identify Effect:\n",
" * `identify_effect(...)`: Run the identify effect algorithm using defaults just provide the graph, treatment and outcome.\n",
" * `auto_identify_effect(...)`: More configurable version of `identify_effect(...)`.\n",
" * `id_identify_effect(...)`: Identify Effect using the ID-Algorithm.\n",
"* Refute Estimate:\n",
" * `refute_estimate`: Function to run a set of the refuters below with the default parameters.\n",
" * `refute_bootstrap`: Refute an estimate by running it on a random sample of the data containing measurement error in the confounders.\n",
" * `refute_data_subset`: Refute an estimate by rerunning it on a random subset of the original data.\n",
" * `refute_random_common_cause`: Refute an estimate by introducing a randomly generated confounder (that may have been unobserved).\n",
" * `refute_placebo_treatment`: Refute an estimate by replacing treatment with a randomly-generated placebo variable.\n",
" * `sensitivity_simulation`: Add an unobserved confounder for refutation (Simulation of an unobserved confounder).\n",
" * `sensitivity_linear_partial_r2`: Add an unobserved confounder for refutation (Linear partial R2 : Sensitivity Analysis for linear models).\n",
" * `sensitivity_non_parametric_partial_r2`: Add an unobserved confounder for refutation (Non-Parametric partial R2 based : Sensitivity Analyis for non-parametric models).\n",
" * `sensitivity_e_value`: Computes E-value for point estimate and confidence limits. Benchmarks E-values against measured confounders using Observed Covariate E-values. Plots E-values and Observed\n",
" Covariate E-values.\n",
" * `refute_dummy_outcome`: Refute an estimate by introducing a randomly generated confounder (that may have been unobserved)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import Dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Functional API imports\n",
"from dowhy.causal_identifier import (\n",
" BackdoorAdjustment,\n",
" EstimandType,\n",
" identify_effect,\n",
" identify_effect_auto,\n",
" identify_effect_id,\n",
") # import effect identifier\n",
"from dowhy.causal_refuters import (\n",
" refute_bootstrap,\n",
" refute_data_subset,\n",
" refute_random_common_cause,\n",
" refute_placebo_treatment,\n",
" sensitivity_e_value,\n",
" sensitivity_linear_partial_r2,\n",
" sensitivity_non_parametric_partial_r2,\n",
" sensitivity_simulation,\n",
" refute_dummy_outcome,\n",
" refute_estimate,\n",
") # import refuters\n",
"\n",
"from dowhy.causal_graph import CausalGraph\n",
"\n",
"# Other imports required\n",
"from dowhy.datasets import linear_dataset\n",
"from dowhy import CausalModel # We still need this as we haven't created the functional API for effect estimation\n",
"import econml\n",
"\n",
"# Config dict to set the logging level\n",
"import logging.config\n",
"\n",
"DEFAULT_LOGGING = {\n",
" \"version\": 1,\n",
" \"disable_existing_loggers\": False,\n",
" \"loggers\": {\n",
" \"\": {\n",
" \"level\": \"WARN\",\n",
" },\n",
" },\n",
"}\n",
"\n",
"logging.config.dictConfig(DEFAULT_LOGGING)\n",
"# Disabling warnings output\n",
"import warnings\n",
"from sklearn.exceptions import DataConversionWarning\n",
"\n",
"warnings.filterwarnings(action=\"ignore\", category=DataConversionWarning)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the Datasets"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Parameters for creating the Dataset\n",
"TREATMENT_IS_BINARY = True\n",
"BETA = 10\n",
"NUM_SAMPLES = 500\n",
"NUM_CONFOUNDERS = 3\n",
"NUM_INSTRUMENTS = 2\n",
"NUM_EFFECT_MODIFIERS = 2\n",
"\n",
"# Creating a Linear Dataset with the given parameters\n",
"data = linear_dataset(\n",
" beta=BETA,\n",
" num_common_causes=NUM_CONFOUNDERS,\n",
" num_instruments=NUM_INSTRUMENTS,\n",
" num_effect_modifiers=NUM_EFFECT_MODIFIERS,\n",
" num_samples=NUM_SAMPLES,\n",
" treatment_is_binary=True,\n",
")\n",
"\n",
"treatment_name = data[\"treatment_name\"]\n",
"outcome_name = data[\"outcome_name\"]\n",
"\n",
"graph = CausalGraph(\n",
" treatment_name=treatment_name,\n",
" outcome_name=outcome_name,\n",
" graph=data[\"gml_graph\"],\n",
" effect_modifier_names=data[\"effect_modifier_names\"],\n",
" common_cause_names=data[\"common_causes_names\"],\n",
" observed_node_names=data[\"df\"].columns.tolist(),\n",
")"
]
},
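{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional sanity check: inspect the first rows of the simulated dataframe\n",
"# before running identification (all names used here come from the cells above).\n",
"data[\"df\"].head()"
]
},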
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Identify Effect - Functional API (Preview)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Default identify_effect call example:\n",
"identified_estimand = identify_effect(graph, treatment_name, outcome_name)\n",
"\n",
"# auto_identify_effect example with extra parameters:\n",
"identified_estimand_auto = identify_effect_auto(\n",
" graph,\n",
" treatment_name,\n",
" outcome_name,\n",
" estimand_type=EstimandType.NONPARAMETRIC_ATE,\n",
" backdoor_adjustment=BackdoorAdjustment.BACKDOOR_EFFICIENT,\n",
")\n",
"\n",
"# id_identify_effect example:\n",
"identified_estimand_id = identify_effect_id(\n",
" graph, treatment_name, outcome_name\n",
") # Note that the return type for id_identify_effect is IDExpression and not IdentifiedEstimand\n",
"\n",
"print(identified_estimand)"
]
},
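{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# The ID algorithm returns an IDExpression rather than an IdentifiedEstimand,\n",
"# so we print it separately to compare the two identification outputs.\n",
"print(identified_estimand_id)"
]
},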
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Estimate Effect\n",
"\n",
"Estimate Effect is performed by using the causal_model api as there is not functional equivalent yet"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# We will still need CausalModel as the Functional Effect Estimation is still Work-In-Progress\n",
"causal_model = CausalModel(data=data[\"df\"], treatment=treatment_name, outcome=outcome_name, graph=data[\"gml_graph\"])\n",
"\n",
"estimate = causal_model.estimate_effect(identified_estimand, method_name=\"backdoor.propensity_score_matching\")\n",
"\n",
"print(estimate)"
]
},
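{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Quick comparison of the point estimate with the BETA parameter used to simulate the data\n",
"print(f\"Estimated effect: {estimate.value}, data-generating BETA: {BETA}\")"
]
},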
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Refute Estimate - Functional API (Preview)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# You can call the refute_estimate function for executing several refuters using default parameters\n",
"# Currently this function does not support sensitivity_* functions\n",
"refutation_results = refute_estimate(\n",
" data[\"df\"],\n",
" identified_estimand,\n",
" estimate,\n",
" treatment_name=treatment_name,\n",
" outcome_name=outcome_name,\n",
" refuters=[refute_bootstrap, refute_data_subset],\n",
")\n",
"\n",
"for result in refutation_results:\n",
" print(result)\n",
"\n",
"# Or you can execute refute methods directly\n",
"# You can change the refute_bootstrap - refute_data_subset for any of the other refuters and add the missing parameters\n",
"\n",
"bootstrap_refutation = refute_bootstrap(data[\"df\"], identified_estimand, estimate)\n",
"print(bootstrap_refutation)\n",
"\n",
"data_subset_refutation = refute_data_subset(data[\"df\"], identified_estimand, estimate)\n",
"print(data_subset_refutation)"
]
},
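{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# As an illustration, the placebo refuter can be called in the same way. This sketch assumes it\n",
"# follows the same (data, identified_estimand, estimate, ...) pattern as the refuters above and\n",
"# additionally takes the treatment name(s), as suggested by its signature in this change.\n",
"placebo_refutation = refute_placebo_treatment(\n",
"    data[\"df\"], identified_estimand, estimate, treatment_names=treatment_name\n",
")\n",
"print(placebo_refutation)"
]
},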
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Backwards Compatibility\n",
"\n",
"This section shows replicating the same results using only the CausalModel API"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Identify Effect"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"identified_estimand_causal_model_api = (\n",
" causal_model.identify_effect()\n",
") # graph, treatment and outcome comes from the causal_model object\n",
"\n",
"print(identified_estimand_causal_model_api)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Estimate Effect"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"estimate_causal_model_api = causal_model.estimate_effect(\n",
" identified_estimand, method_name=\"backdoor.propensity_score_matching\"\n",
")\n",
"\n",
"print(estimate)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Refute Estimate"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"bootstrap_refutation_causal_model_api = causal_model.refute_estimate(identified_estimand, estimate, \"bootstrap_refuter\")\n",
"print(bootstrap_refutation_causal_model_api)\n",
"\n",
"data_subset_refutation_causal_model_api = causal_model.refute_estimate(\n",
" identified_estimand, estimate, \"data_subset_refuter\"\n",
")\n",
"print(data_subset_refutation_causal_model_api)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.10 ('dowhy-_zBapv7Q-py3.8')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"vscode": {
"interpreter": {
"hash": "dcb481ad5d98e2afacd650b2c07afac80a299b7b701b553e333fc82865502500"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}


@ -1,217 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Functional API Preview\n",
"\n",
"This notebook provides a preview of the proposed functional API for dowhy. For details on the new API for DoWhy, check out https://github.com/py-why/dowhy/wiki/API-proposal-for-v1 It is a work-in-progress and is updated as we add new functionality. We welcome your feedback through Discord or on the Discussions page.\n",
"This functional API is designed with backwards compatibility. So both the old and new API will continue to co-exist and work for the immediate new releases. Gradually the old API using CausalModel will be deprecated in favor of the new API."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup data for tests"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import dowhy as dw\n",
"from dowhy.causal_identifier import (\n",
" AutoIdentifier,\n",
" BackdoorAdjustment,\n",
" IDIdentifier,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from dowhy.causal_graph import CausalGraph\n",
"import dowhy.datasets\n",
"\n",
"data = dowhy.datasets.linear_dataset(\n",
" beta=4,\n",
" num_common_causes=2,\n",
" num_instruments=1,\n",
" num_effect_modifiers=1,\n",
" num_samples=500,\n",
" treatment_is_binary=True,\n",
" stddev_treatment_noise=10,\n",
" num_discrete_common_causes=1,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"treatment_name = data[\"treatment_name\"]\n",
"outcome_name = data[\"outcome_name\"]\n",
"\n",
"graph = CausalGraph(\n",
" treatment_name=treatment_name,\n",
" outcome_name=outcome_name,\n",
" graph=data[\"gml_graph\"],\n",
" effect_modifier_names=data[\"effect_modifier_names\"],\n",
" common_cause_names=data[\"common_causes_names\"],\n",
" observed_node_names=data[\"df\"].columns.tolist(),\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Functional API Usage\n",
"\n",
"This section describes how to use the **experimental** functional API"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# New functional API identify_effect function this calls auto_identify_effect with pre-set parameters\n",
"\n",
"identified_estimand_auto = dw.identify_effect(graph=graph, treatment_name=treatment_name, outcome_name=outcome_name)\n",
"print(identified_estimand_auto)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# New functional API (AutoIdentifier with default parameters backdoor_adjustment = DEFAULT)\n",
"\n",
"identified_estimand_auto = dw.auto_identify_effect(\n",
" graph=graph, treatment_name=treatment_name, outcome_name=outcome_name, estimand_type=dw.EstimandType.NONPARAMETRIC_ATE\n",
")\n",
"print(identified_estimand_auto)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# New functional API (AutoIdentifier with custom parameters)\n",
"\n",
"identified_estimand_auto = dw.auto_identify_effect(\n",
" graph=graph,\n",
" treatment_name=treatment_name,\n",
" outcome_name=outcome_name,\n",
" estimand_type=dw.EstimandType.NONPARAMETRIC_NIE,\n",
" backdoor_adjustment=BackdoorAdjustment.BACKDOOR_MIN_EFFICIENT,\n",
")\n",
"print(identified_estimand_auto)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# New functional API (IDIdentifier)\n",
"identified_estimand_id = dw.id_identify_effect(\n",
" graph=graph,\n",
" treatment_name=treatment_name,\n",
" outcome_name=outcome_name,\n",
")\n",
"print(identified_estimand_id)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Backwards compatibility\n",
"\n",
"This section shows the same results as the new Functional API with the old `CausalModel` API"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# CausalModel API with AutoIdentifier\n",
"\n",
"model = dw.CausalModel(\n",
" data=data[\"df\"],\n",
" treatment=data[\"treatment_name\"],\n",
" outcome=data[\"outcome_name\"],\n",
" graph=data[\"gml_graph\"],\n",
")\n",
"\n",
"identified_estimand = model.identify_effect()\n",
"print(identified_estimand)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# CausalModel API with IDIdentifier \n",
"\n",
"model = dw.CausalModel(\n",
" data=data[\"df\"],\n",
" treatment=data[\"treatment_name\"],\n",
" outcome=data[\"outcome_name\"],\n",
" graph=data[\"gml_graph\"],\n",
")\n",
"\n",
"identified_estimand = model.identify_effect(method_name=\"id-algorithm\")\n",
"print(identified_estimand)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.10 ('dowhy-_zBapv7Q-py3.8')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "dcb481ad5d98e2afacd650b2c07afac80a299b7b701b553e333fc82865502500"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}


@ -1,6 +1,6 @@
import logging
from dowhy.causal_identifier import EstimandType, auto_identify_effect, id_identify_effect, identify_effect
from dowhy.causal_identifier import EstimandType, identify_effect, identify_effect_auto, identify_effect_id
from dowhy.causal_model import CausalModel
logging.getLogger(__name__).addHandler(logging.NullHandler())
@ -12,4 +12,4 @@ logging.getLogger(__name__).addHandler(logging.NullHandler())
__version__ = "0.0.0"
__all__ = ["EstimandType", "auto_identify_effect", "id_identify_effect", "identify_effect", "CausalModel"]
__all__ = ["EstimandType", "identify_effect_auto", "identify_effect_id", "identify_effect", "CausalModel"]


@ -2,16 +2,16 @@ from dowhy.causal_identifier.auto_identifier import (
AutoIdentifier,
BackdoorAdjustment,
EstimandType,
auto_identify_effect,
identify_effect_auto,
)
from dowhy.causal_identifier.id_identifier import IDIdentifier, id_identify_effect
from dowhy.causal_identifier.id_identifier import IDIdentifier, identify_effect_id
from dowhy.causal_identifier.identified_estimand import IdentifiedEstimand
from dowhy.causal_identifier.identify_effect import identify_effect
__all__ = [
"AutoIdentifier",
"auto_identify_effect",
"id_identify_effect",
"identify_effect_auto",
"identify_effect_id",
"BackdoorAdjustment",
"EstimandType",
"IdentifiedEstimand",


@ -89,7 +89,7 @@ class AutoIdentifier:
conditional_node_names: List[str] = None,
**kwargs,
):
estimand = auto_identify_effect(
estimand = identify_effect_auto(
graph,
treatment_name,
outcome_name,
@ -126,7 +126,7 @@ class AutoIdentifier:
)
def auto_identify_effect(
def identify_effect_auto(
graph: CausalGraph,
treatment_name: Union[str, List[str]],
outcome_name: Union[str, List[str]],
@ -137,7 +137,7 @@ def auto_identify_effect(
optimize_backdoor: bool = False,
costs: Optional[List] = None,
**kwargs,
):
) -> IdentifiedEstimand:
"""Main method that returns an identified estimand (if one exists).
If estimand_type is non-parametric ATE, then uses backdoor, instrumental variable and frontdoor identification methods, to check if an identified estimand exists, based on the causal graph.


@ -102,16 +102,16 @@ class IDIdentifier:
node_names: Optional[Union[str, List[str]]] = None,
**kwargs,
):
return id_identify_effect(graph, treatment_name, outcome_name, node_names, **kwargs)
return identify_effect_id(graph, treatment_name, outcome_name, node_names, **kwargs)
def id_identify_effect(
def identify_effect_id(
graph: CausalGraph,
treatment_name: Union[str, List[str]],
outcome_name: Union[str, List[str]],
node_names: Optional[Union[str, List[str]]] = None,
**kwargs,
):
) -> IDExpression:
"""
Implementation of the ID algorithm.
Link - https://ftp.cs.ucla.edu/pub/stat_ser/shpitser-thesis.pdf


@ -1,7 +1,8 @@
from typing import List, Protocol, Union
from dowhy.causal_graph import CausalGraph
from dowhy.causal_identifier.auto_identifier import BackdoorAdjustment, EstimandType, auto_identify_effect
from dowhy.causal_identifier.auto_identifier import BackdoorAdjustment, EstimandType, identify_effect_auto
from dowhy.causal_identifier.identified_estimand import IdentifiedEstimand
class CausalIdentifier(Protocol):
@ -31,7 +32,7 @@ def identify_effect(
graph: CausalGraph,
treatment_name: Union[str, List[str]],
outcome_name: Union[str, List[str]],
):
) -> IdentifiedEstimand:
"""Identify the causal effect to be estimated based on a CausalGraph
:param graph: CausalGraph to be analyzed
@ -39,7 +40,7 @@ def identify_effect(
:param outcome: name of the outcome
:returns: a probability expression (estimand) for the causal effect if identified, else NULL
"""
return auto_identify_effect(
return identify_effect_auto(
graph,
treatment_name,
outcome_name,


@ -1,5 +1,7 @@
import logging
import random
from enum import Enum
from typing import List, Union
import numpy as np
import scipy.stats as st
@ -7,6 +9,16 @@ import scipy.stats as st
from dowhy.utils.api import parse_state
class SignificanceTestType(Enum):
AUTO = "auto"
BOOTSTRAP = "bootstrap"
NORMAL = "normal_test"
logger = logging.getLogger(__name__)
class CausalRefuter:
"""Base class for different refutation methods.
@ -15,6 +27,9 @@ class CausalRefuter:
# todo: add docstring for common parameters here and remove from child refuter classes
This class is kept for backwards compatibility with CausalModel.
It will be deprecated in the future in favor of the standalone refute_*() functions.
"""
# Default value for the number of simulations to be conducted
@ -37,8 +52,6 @@ class CausalRefuter:
self._random_seed = kwargs["random_seed"]
np.random.seed(self._random_seed)
self.logger = logging.getLogger(__name__)
# Concatenate the confounders, instruments and effect modifiers
try:
self._variables_of_interest = (
@ -47,206 +60,218 @@ class CausalRefuter:
+ self._estimate.params["effect_modifiers"]
)
except AttributeError as attr_error:
self.logger.error(attr_error)
logger.error(attr_error)
def choose_variables(self, required_variables):
"""
This method provides a way to choose the confounders whose values we wish to
modify for finding its effect on the ability of the treatment to affect the outcome.
"""
invert = None
if required_variables is False:
self.logger.info(
"All variables required: Running bootstrap adding noise to confounders, instrumental variables and effect modifiers."
)
return None
elif required_variables is True:
self.logger.info(
"All variables required: Running bootstrap adding noise to confounders, instrumental variables and effect modifiers."
)
return self._variables_of_interest
elif type(required_variables) is int:
if len(self._variables_of_interest) < required_variables:
self.logger.error(
"Too many variables passed.\n The number of variables is: {}.\n The number of variables passed: {}".format(
len(self._variables_of_interest), required_variables
)
)
raise ValueError(
"The number of variables in the required_variables is greater than the number of confounders, instrumental variables and effect modifiers"
)
else:
# Shuffle the confounders
random.shuffle(self._variables_of_interest)
return self._variables_of_interest[:required_variables]
elif type(required_variables) is list:
# Check if all are select or deselect variables
if all(variable[0] == "-" for variable in required_variables):
invert = True
required_variables = [variable[1:] for variable in required_variables]
elif all(variable[0] != "-" for variable in required_variables):
invert = False
else:
self.logger.error("{} has both select and delect variables".format(required_variables))
raise ValueError(
"It appears that there are some select and deselect variables. Note you can either select or delect variables at a time, but not both"
)
# Check if all the required_variables belong to confounders, instrumental variables or effect
if set(required_variables) - set(self._variables_of_interest) != set([]):
self.logger.error(
"{} are not confounder, instrumental variable or effect modifier".format(
list(set(required_variables) - set(self._variables_of_interest))
)
)
raise ValueError(
"At least one of required_variables is not a valid variable name, or it is not a confounder, instrumental variable or effect modifier"
)
if invert is False:
return required_variables
elif invert is True:
return list(set(self._variables_of_interest) - set(required_variables))
else:
self.logger.error("Incorrect type: {}. Expected an int,list or bool".format(type(required_variables)))
raise TypeError("Expected int, list or bool. Got an unexpected datatype")
return choose_variables(required_variables, self._variables_of_interest)
def test_significance(self, estimate, simulations, test_type="auto", significance_level=0.05):
"""Tests the statistical significance of the estimate obtained to the simulations produced by a refuter.
The basis behind using the sample statistics of the refuter when we are in fact testing the estimate,
is due to the fact that, we would ideally expect them to follow the same distribition.
For refutation tests (e.g., placebo refuters), consider the null distribution as a distribution of effect
estimates over multiple simulations with placebo treatment, and compute how likely the true estimate (e.g.,
zero for placebo test) is under the null. If the probability of true effect estimate is lower than the
p-value, then estimator method fails the test.
For sensitivity analysis tests (e.g., bootstrap, subset or common cause refuters), the null distribution captures
the distribution of effect estimates under the "true" dataset (e.g., with an additional confounder or different
sampling), and we compute the probability of the obtained estimate under this distribution. If the probability is
lower than the p-value, then the estimator method fails the test.
Null Hypothesis- The estimate is a part of the distribution
Alternative Hypothesis- The estimate does not fall in the distribution.
:param 'estimate': CausalEstimate
The estimate obtained from the estimator for the original data.
:param 'simulations': np.array
An array containing the result of the refuter for the simulations
:param 'test_type': string, default 'auto'
The type of test the user wishes to perform.
:param 'significance_level': float, default 0.05
The significance level for the statistical test
:returns: significance_dict: Dict
A Dict containing the p_value and a boolean that indicates if the result is statistically significant
"""
# Initializing the p_value
p_value = 0
if test_type == "auto":
num_simulations = len(simulations)
if num_simulations >= 100: # Bootstrapping
self.logger.info(
"Making use of Bootstrap as we have more than 100 examples.\n \
Note: The greater the number of examples, the more accurate are the confidence estimates"
)
# Perform Bootstrap Significance Test with the original estimate and the set of refutations
p_value = self.perform_bootstrap_test(estimate, simulations)
else:
self.logger.warning(
"We assume a Normal Distribution as the sample has less than 100 examples.\n \
Note: The underlying distribution may not be Normal. We assume that it approaches normal with the increase in sample size."
)
# Perform Normal Tests of Significance with the original estimate and the set of refutations
p_value = self.perform_normal_distribution_test(estimate, simulations)
elif test_type == "bootstrap":
self.logger.info(
"Performing Bootstrap Test with {} samples\n \
Note: The greater the number of examples, the more accurate are the confidence estimates".format(
len(simulations)
)
)
# Perform Bootstrap Significance Test with the original estimate and the set of refutations
p_value = self.perform_bootstrap_test(estimate, simulations)
elif test_type == "normal_test":
self.logger.info(
"Performing Normal Test with {} samples\n \
Note: We assume that the underlying distribution is Normal.".format(
len(simulations)
)
)
# Perform Normal Tests of Significance with the original estimate and the set of refutations
p_value = self.perform_normal_distribution_test(estimate, simulations)
else:
raise NotImplementedError
significance_dict = {"p_value": p_value, "is_statistically_significant": p_value <= significance_level}
return significance_dict
return test_significance(estimate, simulations, SignificanceTestType(test_type), significance_level)
def perform_bootstrap_test(self, estimate, simulations):
# Get the number of simulations
num_simulations = len(simulations)
# Sort the simulations
simulations.sort()
# Obtain the median value
median_refute_values = simulations[int(num_simulations / 2)]
# Performing a two sided test
if estimate.value > median_refute_values:
# np.searchsorted tells us the index if it were a part of the array
# We select side to be left as we want to find the first value that matches
estimate_index = np.searchsorted(simulations, estimate.value, side="left")
# We subtact 1 as we are finding the value from the right tail
p_value = 1 - (estimate_index / num_simulations)
else:
# We take the side to be right as we want to find the last index that matches
estimate_index = np.searchsorted(simulations, estimate.value, side="right")
# We get the probability with respect to the left tail.
p_value = estimate_index / num_simulations
# return twice the determined quantile as this is a two sided test
return 2 * p_value
return perform_bootstrap_test(estimate, simulations)
def perform_normal_distribution_test(self, estimate, simulations):
# Get the mean for the simulations
mean_refute_values = np.mean(simulations)
# Get the standard deviation for the simulations
std_dev_refute_values = np.std(simulations)
# Get the Z Score [(val - mean)/ std_dev ]
z_score = (estimate.value - mean_refute_values) / std_dev_refute_values
if z_score > 0: # Right Tail
p_value = 1 - st.norm.cdf(z_score)
else: # Left Tail
p_value = st.norm.cdf(z_score)
return p_value
return perform_normal_distribution_test(estimate, simulations)
def refute_estimate(self, show_progress_bar=False):
raise NotImplementedError
def choose_variables(required_variables: Union[bool, int, list], variables_of_interest: List):
"""
This function provides a way to choose the confounders whose values we wish to
modify in order to find their effect on the ability of the treatment to affect the outcome.
"""
invert = None
if required_variables is False:
logger.info(
"All variables required: Running bootstrap adding noise to confounders, instrumental variables and effect modifiers."
)
return None
elif required_variables is True:
logger.info(
"All variables required: Running bootstrap adding noise to confounders, instrumental variables and effect modifiers."
)
return variables_of_interest
elif type(required_variables) is int:
if len(variables_of_interest) < required_variables:
logger.error(
"Too many variables passed.\n The number of variables is: {}.\n The number of variables passed: {}".format(
len(variables_of_interest), required_variables
)
)
raise ValueError(
"The number of variables in the required_variables is greater than the number of confounders, instrumental variables and effect modifiers"
)
else:
# Shuffle the confounders
return random.sample(variables_of_interest, required_variables)
elif type(required_variables) is list:
# Check if all are select or deselect variables
if all(variable[0] == "-" for variable in required_variables):
invert = True
required_variables = [variable[1:] for variable in required_variables]
elif all(variable[0] != "-" for variable in required_variables):
invert = False
else:
logger.error("{} has both select and delect variables".format(required_variables))
raise ValueError(
"It appears that there are some select and deselect variables. Note you can either select or delect variables at a time, but not both"
)
# Check if all the required_variables belong to confounders, instrumental variables or effect
if set(required_variables) - set(variables_of_interest) != set([]):
logger.error(
"{} are not confounder, instrumental variable or effect modifier".format(
list(set(required_variables) - set(variables_of_interest))
)
)
raise ValueError(
"At least one of required_variables is not a valid variable name, or it is not a confounder, instrumental variable or effect modifier"
)
if invert is False:
return required_variables
elif invert is True:
return list(set(variables_of_interest) - set(required_variables))
def perform_bootstrap_test(estimate, simulations: List):
# Get the number of simulations
num_simulations = len(simulations)
# Sort the simulations
simulations.sort()
# Obtain the median value
median_refute_values = simulations[int(num_simulations / 2)]
# Performing a two sided test
if estimate.value > median_refute_values:
# np.searchsorted tells us the index if it were a part of the array
# We select side to be left as we want to find the first value that matches
estimate_index = np.searchsorted(simulations, estimate.value, side="left")
# We subtract from 1 as we are finding the value from the right tail
p_value = 1 - (estimate_index / num_simulations)
else:
# We take the side to be right as we want to find the last index that matches
estimate_index = np.searchsorted(simulations, estimate.value, side="right")
# We get the probability with respect to the left tail.
p_value = estimate_index / num_simulations
# return twice the determined quantile as this is a two sided test
return 2 * p_value
def perform_normal_distribution_test(estimate, simulations: List):
# Get the mean for the simulations
mean_refute_values = np.mean(simulations)
# Get the standard deviation for the simulations
std_dev_refute_values = np.std(simulations)
# Get the Z Score [(val - mean)/ std_dev ]
z_score = (estimate.value - mean_refute_values) / std_dev_refute_values
if z_score > 0: # Right Tail
p_value = 1 - st.norm.cdf(z_score)
else: # Left Tail
p_value = st.norm.cdf(z_score)
return p_value
def test_significance(
estimate,
simulations: List,
test_type: SignificanceTestType = SignificanceTestType.AUTO,
significance_level: float = 0.05,
):
"""Tests the statistical significance of the estimate obtained to the simulations produced by a refuter.
We use the sample statistics of the refuter when we are in fact testing the estimate
because, ideally, we would expect them to follow the same distribution.
For refutation tests (e.g., placebo refuters), consider the null distribution as a distribution of effect
estimates over multiple simulations with placebo treatment, and compute how likely the true estimate (e.g.,
zero for placebo test) is under the null. If the probability of true effect estimate is lower than the
p-value, then estimator method fails the test.
For sensitivity analysis tests (e.g., bootstrap, subset or common cause refuters), the null distribution captures
the distribution of effect estimates under the "true" dataset (e.g., with an additional confounder or different
sampling), and we compute the probability of the obtained estimate under this distribution. If the probability is
lower than the p-value, then the estimator method fails the test.
Null Hypothesis- The estimate is a part of the distribution
Alternative Hypothesis- The estimate does not fall in the distribution.
:param 'estimate': CausalEstimate
The estimate obtained from the estimator for the original data.
:param 'simulations': np.array
An array containing the result of the refuter for the simulations
:param 'test_type': string, default 'auto'
The type of test the user wishes to perform.
:param 'significance_level': float, default 0.05
The significance level for the statistical test
:returns: significance_dict: Dict
A Dict containing the p_value and a boolean that indicates if the result is statistically significant
"""
# Initializing the p_value
p_value = 0
if test_type == SignificanceTestType.AUTO:
num_simulations = len(simulations)
if num_simulations >= 100: # Bootstrapping
logger.info(
"Making use of Bootstrap as we have more than 100 examples.\n \
Note: The greater the number of examples, the more accurate are the confidence estimates"
)
# Perform Bootstrap Significance Test with the original estimate and the set of refutations
p_value = perform_bootstrap_test(estimate, simulations)
else:
logger.warning(
"We assume a Normal Distribution as the sample has less than 100 examples.\n \
Note: The underlying distribution may not be Normal. We assume that it approaches normal with the increase in sample size."
)
# Perform Normal Tests of Significance with the original estimate and the set of refutations
p_value = perform_normal_distribution_test(estimate, simulations)
elif test_type == SignificanceTestType.BOOTSTRAP:
logger.info(
"Performing Bootstrap Test with {} samples\n \
Note: The greater the number of examples, the more accurate are the confidence estimates".format(
len(simulations)
)
)
# Perform Bootstrap Significance Test with the original estimate and the set of refutations
p_value = perform_bootstrap_test(estimate, simulations)
elif test_type == SignificanceTestType.NORMAL:
logger.info(
"Performing Normal Test with {} samples\n \
Note: We assume that the underlying distribution is Normal.".format(
len(simulations)
)
)
# Perform Normal Tests of Significance with the original estimate and the set of refutations
p_value = perform_normal_distribution_test(estimate, simulations)
significance_dict = {"p_value": p_value, "is_statistically_significant": p_value <= significance_level}
return significance_dict
class CausalRefutation:
"""Class for storing the result of a refutation method."""


@ -2,6 +2,19 @@ import string
from importlib import import_module
from dowhy.causal_refuter import CausalRefuter
from dowhy.causal_refuters.add_unobserved_common_cause import (
AddUnobservedCommonCause,
sensitivity_e_value,
sensitivity_linear_partial_r2,
sensitivity_non_parametric_partial_r2,
sensitivity_simulation,
)
from dowhy.causal_refuters.bootstrap_refuter import BootstrapRefuter, refute_bootstrap
from dowhy.causal_refuters.data_subset_refuter import DataSubsetRefuter, refute_data_subset
from dowhy.causal_refuters.dummy_outcome_refuter import DummyOutcomeRefuter, refute_dummy_outcome
from dowhy.causal_refuters.placebo_treatment_refuter import PlaceboTreatmentRefuter, refute_placebo_treatment
from dowhy.causal_refuters.random_common_cause import RandomCommonCause, refute_random_common_cause
from dowhy.causal_refuters.refute_estimate import refute_estimate
def get_class_object(method_name, *args, **kwargs):
@ -17,3 +30,23 @@ def get_class_object(method_name, *args, **kwargs):
except (AttributeError, AssertionError, ImportError):
raise ImportError("{} is not an existing causal refuter.".format(method_name))
return refuter_class
__all__ = [
"AddUnobservedCommonCause",
"BootstrapRefuter",
"DataSubsetRefuter",
"PlaceboTreatmentRefuter",
"RandomCommonCause",
"DummyOutcomeRefuter",
"refute_bootstrap",
"refute_data_subset",
"refute_random_common_cause",
"refute_placebo_treatment",
"sensitivity_simulation",
"sensitivity_linear_partial_r2",
"sensitivity_non_parametric_partial_r2",
"sensitivity_e_value",
"refute_dummy_outcome",
"refute_estimate",
]

File diff suppressed because it is too large


@ -1,11 +1,18 @@
import logging
import random
from typing import List, Optional, Union
import numpy as np
import pandas as pd
from joblib import Parallel, delayed
from sklearn.utils import resample
from tqdm.auto import tqdm
from dowhy.causal_estimator import CausalEstimator
from dowhy.causal_refuter import CausalRefutation, CausalRefuter
from dowhy.causal_estimator import CausalEstimate, CausalEstimator
from dowhy.causal_identifier.identified_estimand import IdentifiedEstimand
from dowhy.causal_refuter import CausalRefutation, CausalRefuter, choose_variables, test_significance
logger = logging.getLogger(__name__)
class BootstrapRefuter(CausalRefuter):
@ -58,87 +65,167 @@ class BootstrapRefuter(CausalRefuter):
super().__init__(*args, **kwargs)
self._num_simulations = kwargs.pop("num_simulations", CausalRefuter.DEFAULT_NUM_SIMULATIONS)
self._sample_size = kwargs.pop("sample_size", len(self._data))
required_variables = kwargs.pop("required_variables", True)
self._required_variables = kwargs.pop("required_variables", True)
self._noise = kwargs.pop("noise", BootstrapRefuter.DEFAULT_STD_DEV)
self._probability_of_change = kwargs.pop("probability_of_change", None)
self._random_state = kwargs.pop("random_state", None)
self.logger = logging.getLogger(__name__)
self._chosen_variables = self.choose_variables(required_variables)
if self._chosen_variables is None:
self.logger.info("INFO: There are no chosen variables")
else:
self.logger.info("INFO: The chosen variables are: " + ",".join(self._chosen_variables))
if self._probability_of_change is None:
if self._noise > 1:
self.logger.error(
"Error in using noise:{} for Binary Flip. The value is greater than 1".format(self._noise)
)
raise ValueError("The value for Binary Flip cannot be greater than 1")
else:
self._probability_of_change = self._noise
elif self._probability_of_change > 1:
self.logger.error(
"The probability of flip is: {}, However, this value cannot be greater than 1".format(
self._probability_of_change
)
)
raise ValueError("Probability of Flip cannot be greater than 1")
def refute_estimate(self, *args, **kwargs):
if self._sample_size > len(self._data):
self.logger.warning("The sample size is larger than the population size")
sample_estimates = np.zeros(self._num_simulations)
self.logger.info(
"Refutation over {} simulated datasets of size {} each".format(self._num_simulations, self._sample_size)
def refute_estimate(self, show_progress_bar: bool = False, *args, **kwargs):
refute = refute_bootstrap(
data=self._data,
target_estimand=self._target_estimand,
estimate=self._estimate,
num_simulations=self._num_simulations,
random_state=self._random_state,
sample_size=self._sample_size,
required_variables=self._required_variables,
noise=self._noise,
probability_of_change=self._probability_of_change,
show_progress_bar=show_progress_bar,
n_jobs=self._n_jobs,
verbose=self._verbose,
)
for index in range(self._num_simulations):
if self._random_state is None:
new_data = resample(self._data, n_samples=self._sample_size)
else:
new_data = resample(self._data, n_samples=self._sample_size, random_state=self._random_state)
if self._chosen_variables is not None:
for variable in self._chosen_variables:
if ("float" or "int") in new_data[variable].dtype.name:
scaling_factor = new_data[variable].std()
new_data[variable] += np.random.normal(
loc=0.0, scale=self._noise * scaling_factor, size=self._sample_size
)
elif "bool" in new_data[variable].dtype.name:
probs = np.random.uniform(0, 1, self._sample_size)
new_data[variable] = np.where(
probs < self._probability_of_change, np.logical_not(new_data[variable]), new_data[variable]
)
elif "category" in new_data[variable].dtype.name:
categories = new_data[variable].unique()
# Find the set difference for each row
changed_data = new_data[variable].apply(lambda row: list(set(categories) - set([row])))
# Choose one out of the remaining
changed_data = changed_data.apply(lambda row: random.choice(row))
new_data[variable] = np.where(probs < self._probability_of_change, changed_data)
new_data[variable].astype("category")
new_estimator = CausalEstimator.get_estimator_object(new_data, self._target_estimand, self._estimate)
new_effect = new_estimator.estimate_effect()
sample_estimates[index] = new_effect.value
refute = CausalRefutation(
self._estimate.value, np.mean(sample_estimates), refutation_type="Refute: Bootstrap Sample Dataset"
)
# We want to see if the estimate falls in the same distribution as the one generated by the refuter
# Ideally that should be the case as running bootstrap should not have a significant effect on the ability
# of the treatment to affect the outcome
refute.add_significance_test_results(self.test_significance(self._estimate, sample_estimates))
refute.add_refuter(self)
return refute
def _refute_once(
data: pd.DataFrame,
target_estimand: IdentifiedEstimand,
estimate: CausalEstimate,
chosen_variables: Optional[List] = None,
random_state: Optional[Union[int, np.random.RandomState]] = None,
sample_size: Optional[int] = None,
noise: float = 0.1,
probability_of_change: Optional[float] = None,
):
if random_state is None:
new_data = resample(data, n_samples=sample_size)
else:
new_data = resample(data, n_samples=sample_size, random_state=random_state)
if chosen_variables is not None:
for variable in chosen_variables:
if ("float" or "int") in new_data[variable].dtype.name:
scaling_factor = new_data[variable].std()
new_data[variable] += np.random.normal(loc=0.0, scale=noise * scaling_factor, size=sample_size)
elif "bool" in new_data[variable].dtype.name:
probs = np.random.uniform(0, 1, sample_size)
new_data[variable] = np.where(
probs < probability_of_change, np.logical_not(new_data[variable]), new_data[variable]
)
elif "category" in new_data[variable].dtype.name:
categories = new_data[variable].unique()
# Find the set difference for each row
changed_data = new_data[variable].apply(lambda row: list(set(categories) - set([row])))
# Choose one out of the remaining
changed_data = changed_data.apply(lambda row: random.choice(row))
new_data[variable] = np.where(probs < probability_of_change, changed_data)
new_data[variable].astype("category")
new_estimator = CausalEstimator.get_estimator_object(new_data, target_estimand, estimate)
new_effect = new_estimator.estimate_effect()
return new_effect.value
def refute_bootstrap(
data: pd.DataFrame,
target_estimand: IdentifiedEstimand,
estimate: CausalEstimate,
num_simulations: int = 100,
random_state: Optional[Union[int, np.random.RandomState]] = None,
sample_size: Optional[int] = None,
required_variables: bool = True,
noise: float = 0.1,
probability_of_change: Optional[float] = None,
show_progress_bar: bool = False,
n_jobs: int = 1,
verbose: int = 0,
**_,
) -> CausalRefutation:
"""Refute an estimate by running it on a random sample of the data containing measurement error in the
confounders. This allows us to assess the ability of the estimator to recover the effect of the
treatment on the outcome.
:param data: pd.DataFrame: Data to run the refutation
:param target_estimand: IdentifiedEstimand: Identified estimand to run the refutation
:param estimate: CausalEstimate: Estimate to run the refutation
:param num_simulations: The number of simulations to be run, ``CausalRefuter.DEFAULT_NUM_SIMULATIONS`` by default
:param random_state: The seed value to be added if we wish to repeat the same random behavior. For this purpose, we repeat the same seed in the pseudo-random generator.
:param sample_size: The size of each bootstrap sample and is the size of the original data by default
:param required_variables: The list of variables to be used as the input for ``y~f(W)``
This is ``True`` by default, which in turn selects all variables except the treatment and the outcome
1. An integer argument refers to how many variables will be used for estimating the value of the outcome
2. A list explicitly refers to which variables will be used to estimate the outcome
Furthermore, it gives the ability to explicitly select or deselect the covariates present in the estimation of the
outcome. This is done by either adding or explicitly removing variables from the list as shown below:
.. note::
* We need to pass required_variables = ``[W0,W1]`` if we want ``W0`` and ``W1``.
* We need to pass required_variables = ``[-W0,-W1]`` if we want all variables excluding ``W0`` and ``W1``.
3. If the value is True, we wish to include all variables to estimate the value of the outcome.
.. warning:: A ``False`` value is ``INVALID`` and will result in an ``error``.
:param noise: The standard deviation of the noise to be added to the data and is ``BootstrapRefuter.DEFAULT_STD_DEV`` by default
:param probability_of_change: It specifies the probability with which we change the data for a boolean or categorical variable
It is ``noise`` by default, only if the value of ``noise`` is less than 1.
:param n_jobs: The maximum number of concurrently running jobs. If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all (this is the default).
:param verbose: The verbosity level: if non zero, progress messages are printed. Above 50, the output is sent to stdout. The frequency of the messages increases with the verbosity level. If it is more than 10, all iterations are reported. The default is 0.
"""
if sample_size is None:
sample_size = len(data)
chosen_variables = choose_variables(
required_variables,
target_estimand.get_backdoor_variables()
+ target_estimand.instrumental_variables
+ estimate.params["effect_modifiers"],
)
if chosen_variables is None:
logger.info("INFO: There are no chosen variables")
else:
logger.info("INFO: The chosen variables are: " + ",".join(chosen_variables))
if probability_of_change is None and noise > 1:
logger.error("Error in using noise:{} for Binary Flip. The value is greater than 1".format(noise))
raise ValueError("The value for Binary Flip cannot be greater than 1")
elif probability_of_change is None and noise <= 1:
probability_of_change = noise
elif probability_of_change > 1:
logger.error(
"The probability of flip is: {}, However, this value cannot be greater than 1".format(probability_of_change)
)
raise ValueError("Probability of Flip cannot be greater than 1")
if sample_size > len(data):
logger.warning("The sample size is larger than the population size")
logger.info("Refutation over {} simulated datasets of size {} each".format(num_simulations, sample_size))
sample_estimates = Parallel(n_jobs=n_jobs, verbose=verbose)(
delayed(_refute_once)(
data, target_estimand, estimate, chosen_variables, random_state, sample_size, noise, probability_of_change
)
for _ in tqdm(
range(num_simulations),
colour=CausalRefuter.PROGRESS_BAR_COLOR,
disable=not show_progress_bar,
desc="Refuting Estimates: ",
)
)
sample_estimates = np.array(sample_estimates)
refute = CausalRefutation(
estimate.value, np.mean(sample_estimates), refutation_type="Refute: Bootstrap Sample Dataset"
)
# We want to see if the estimate falls in the same distribution as the one generated by the refuter
# Ideally that should be the case as running bootstrap should not have a significant effect on the ability
# of the treatment to affect the outcome
refute.add_significance_test_results(test_significance(estimate, sample_estimates))
return refute


@ -1,11 +1,16 @@
import logging
from typing import Optional, Union
import numpy as np
import pandas as pd
from joblib import Parallel, delayed
from tqdm.auto import tqdm
from dowhy.causal_estimator import CausalEstimator
from dowhy.causal_refuter import CausalRefutation, CausalRefuter
from dowhy.causal_estimator import CausalEstimate, CausalEstimator
from dowhy.causal_identifier import IdentifiedEstimand
from dowhy.causal_refuter import CausalRefutation, CausalRefuter, test_significance
logger = logging.getLogger(__name__)
class DataSubsetRefuter(CausalRefuter):
@ -38,47 +43,87 @@ class DataSubsetRefuter(CausalRefuter):
self._num_simulations = kwargs.pop("num_simulations", CausalRefuter.DEFAULT_NUM_SIMULATIONS)
self._random_state = kwargs.pop("random_state", None)
self.logger = logging.getLogger(__name__)
def refute_estimate(self, show_progress_bar=False):
sample_estimates = np.zeros(self._num_simulations)
self.logger.info(
"Refutation over {} simulated datasets of size {} each".format(
self._subset_fraction, self._subset_fraction * len(self._data.index)
)
def refute_estimate(self, show_progress_bar: bool = False):
refute = refute_data_subset(
data=self._data,
target_estimand=self._target_estimand,
estimate=self._estimate,
subset_fraction=self._subset_fraction,
num_simulations=self._num_simulations,
random_state=self._random_state,
show_progress_bar=show_progress_bar,
n_jobs=self._n_jobs,
verbose=self._verbose,
)
def refute_once():
if self._random_state is None:
new_data = self._data.sample(frac=self._subset_fraction)
else:
new_data = self._data.sample(frac=self._subset_fraction, random_state=self._random_state)
new_estimator = CausalEstimator.get_estimator_object(new_data, self._target_estimand, self._estimate)
new_effect = new_estimator.estimate_effect()
return new_effect.value
# Run refutation in parallel
sample_estimates = Parallel(n_jobs=self._n_jobs, verbose=self._verbose)(
delayed(refute_once)()
for _ in tqdm(
range(self._num_simulations),
colour=CausalRefuter.PROGRESS_BAR_COLOR,
disable=not show_progress_bar,
desc="Refuting Estimates: ",
)
)
sample_estimates = np.array(sample_estimates)
refute = CausalRefutation(
self._estimate.value, np.mean(sample_estimates), refutation_type="Refute: Use a subset of data"
)
# We want to see if the estimate falls in the same distribution as the one generated by the refuter
# Ideally that should be the case as choosing a subset should not have a significant effect on the ability
# of the treatment to affect the outcome
refute.add_significance_test_results(self.test_significance(self._estimate, sample_estimates))
refute.add_refuter(self)
return refute
def _refute_once(
data: pd.DataFrame,
target_estimand: IdentifiedEstimand,
estimate: CausalEstimate,
subset_fraction: float,
random_state: Optional[Union[int, np.random.RandomState]],
):
if random_state is None:
new_data = data.sample(frac=subset_fraction)
else:
new_data = data.sample(frac=subset_fraction, random_state=random_state)
new_estimator = CausalEstimator.get_estimator_object(new_data, target_estimand, estimate)
new_effect = new_estimator.estimate_effect()
return new_effect.value
def refute_data_subset(
data: pd.DataFrame,
target_estimand: IdentifiedEstimand,
estimate: CausalEstimate,
subset_fraction: float = 0.8,
num_simulations: int = 100,
random_state: Optional[Union[int, np.random.RandomState]] = None,
show_progress_bar: bool = False,
n_jobs: int = 1,
verbose: int = 0,
**_,
) -> CausalRefutation:
"""Refute an estimate by rerunning it on a random subset of the original data.
:param data: pd.DataFrame: Data to run the refutation
:param target_estimand: IdentifiedEstimand: Identified estimand to run the refutation
:param estimate: CausalEstimate: Estimate to run the refutation
:param subset_fraction: Fraction of the data to be used for re-estimation, which is ``DataSubsetRefuter.DEFAULT_SUBSET_FRACTION`` by default.
:param num_simulations: The number of simulations to be run, ``CausalRefuter.DEFAULT_NUM_SIMULATIONS`` by default
:param random_state: The seed value to be added if we wish to repeat the same random behavior. For this purpose, we repeat the same seed in the pseudo-random generator.
:param n_jobs: The maximum number of concurrently running jobs. If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all (this is the default).
:param verbose: The verbosity level: if non zero, progress messages are printed. Above 50, the output is sent to stdout. The frequency of the messages increases with the verbosity level. If it is more than 10, all iterations are reported. The default is 0.
"""
logger.info(
"Refutation over {} simulated datasets of size {} each".format(
subset_fraction, subset_fraction * len(data.index)
)
)
# Run refutation in parallel
sample_estimates = Parallel(n_jobs=n_jobs, verbose=verbose)(
delayed(_refute_once)(data, target_estimand, estimate, subset_fraction, random_state)
for _ in tqdm(
range(num_simulations),
colour=CausalRefuter.PROGRESS_BAR_COLOR,
disable=not show_progress_bar,
desc="Refuting Estimates: ",
)
)
sample_estimates = np.array(sample_estimates)
refute = CausalRefutation(estimate.value, np.mean(sample_estimates), refutation_type="Refute: Use a subset of data")
# We want to see if the estimate falls in the same distribution as the one generated by the refuter
# Ideally that should be the case as choosing a subset should not have a significant effect on the ability
# of the treatment to affect the outcome
refute.add_significance_test_results(test_significance(estimate, sample_estimates))
return refute

File diff suppressed because it is too large


@ -1,5 +1,7 @@
import copy
import logging
from enum import Enum
from typing import Dict, List, Optional, Union
import numpy as np
import pandas as pd
@ -7,9 +9,27 @@ from joblib import Parallel, delayed
from tqdm.auto import tqdm
from dowhy.causal_estimator import CausalEstimate, CausalEstimator
from dowhy.causal_refuter import CausalRefutation, CausalRefuter
from dowhy.causal_identifier.identified_estimand import IdentifiedEstimand
from dowhy.causal_refuter import CausalRefutation, CausalRefuter, test_significance
from dowhy.utils.api import parse_state
logger = logging.getLogger(__name__)
# Default value of the p value taken for the distribution
DEFAULT_PROBABILITY_OF_BINOMIAL = 0.5
# Number of Trials: Number of cointosses to understand if a sample gets the treatment
DEFAULT_NUMBER_OF_TRIALS = 1
# Mean of the Normal Distribution
DEFAULT_MEAN_OF_NORMAL = 0
# Standard Deviation of the Normal Distribution
DEFAULT_STD_DEV_OF_NORMAL = 0
class PlaceboType(Enum):
DEFAULT = "Random Data"
PERMUTE = "permute"
class PlaceboTreatmentRefuter(CausalRefuter):
"""Refute an estimate by replacing treatment with a randomly-generated placebo variable.
@ -32,15 +52,6 @@ class PlaceboTreatmentRefuter(CausalRefuter):
:type verbose: int, optional
"""
# Default value of the p value taken for the distribution
DEFAULT_PROBABILITY_OF_BINOMIAL = 0.5
# Number of Trials: Number of cointosses to understand if a sample gets the treatment
DEFAULT_NUMBER_OF_TRIALS = 1
# Mean of the Normal Distribution
DEFAULT_MEAN_OF_NORMAL = 0
# Standard Deviation of the Normal Distribution
DEFAULT_STD_DEV_OF_NORMAL = 0
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self._placebo_type = kwargs.pop("placebo_type", None)
@@ -52,159 +63,184 @@
self.logger = logging.getLogger(__name__)
def refute_estimate(self, show_progress_bar=False):
# only permute is supported for iv methods
if self._target_estimand.identifier_method.startswith("iv"):
if self._placebo_type != "permute":
self.logger.error(
"Only placebo_type=''permute'' is supported for creating placebo for instrumental variable estimation methods"
)
raise ValueError(
"Only placebo_type=''permute'' is supported for creating placebo for instrumental variable estimation methods."
)
# We need to change the identified estimand
# We make a copy as a safety measure, we don't want to change the
# original DataFrame
identified_estimand = copy.deepcopy(self._target_estimand)
identified_estimand.treatment_variable = ["placebo"]
if self._target_estimand.identifier_method.startswith("iv"):
identified_estimand.instrumental_variables = [
"placebo_" + s for s in identified_estimand.instrumental_variables
]
# For IV methods, the estimating_instrument_names should also be
# changed. So we change it inside the estimate and then restore it
# back at the end of this method.
if (
self._estimate.params["method_params"] is not None
and "iv_instrument_name" in self._estimate.params["method_params"]
):
self._estimate.params["method_params"]["iv_instrument_name"] = [
"placebo_" + s for s in parse_state(self._estimate.params["method_params"]["iv_instrument_name"])
]
self.logger.info(
"Refutation over {} simulated datasets of {} treatment".format(self._num_simulations, self._placebo_type)
refute = refute_placebo_treatment(
data=self._data,
target_estimand=self._target_estimand,
estimate=self._estimate,
treatment_names=self._treatment_name,
num_simulations=self._num_simulations,
placebo_type=PlaceboType(self._placebo_type),
random_state=self._random_state,
show_progress_bar=show_progress_bar,
n_jobs=self._n_jobs,
verbose=self._verbose,
)
num_rows = self._data.shape[0]
treatment_name = self._treatment_name[0] # Extract the name of the treatment variable
type_dict = dict(self._data.dtypes)
def refute_once():
if self._placebo_type == "permute":
permuted_idx = None
if self._random_state is None:
permuted_idx = np.random.choice(self._data.shape[0], size=self._data.shape[0], replace=False)
else:
permuted_idx = self._random_state.choice(
self._data.shape[0], size=self._data.shape[0], replace=False
)
new_treatment = self._data[self._treatment_name].iloc[permuted_idx].values
if self._target_estimand.identifier_method.startswith("iv"):
new_instruments_values = (
self._data[self._estimate.estimator.estimating_instrument_names].iloc[permuted_idx].values
)
new_instruments_df = pd.DataFrame(
new_instruments_values,
columns=[
"placebo_" + s
for s in self._data[self._estimate.estimator.estimating_instrument_names].columns
],
)
else:
if "float" in type_dict[treatment_name].name:
self.logger.info(
"Using a Normal Distribution with Mean:{} and Variance:{}".format(
PlaceboTreatmentRefuter.DEFAULT_MEAN_OF_NORMAL,
PlaceboTreatmentRefuter.DEFAULT_STD_DEV_OF_NORMAL,
)
)
new_treatment = (
np.random.randn(num_rows) * PlaceboTreatmentRefuter.DEFAULT_STD_DEV_OF_NORMAL
+ PlaceboTreatmentRefuter.DEFAULT_MEAN_OF_NORMAL
)
elif "bool" in type_dict[treatment_name].name:
self.logger.info(
"Using a Binomial Distribution with {} trials and {} probability of success".format(
PlaceboTreatmentRefuter.DEFAULT_NUMBER_OF_TRIALS,
PlaceboTreatmentRefuter.DEFAULT_PROBABILITY_OF_BINOMIAL,
)
)
new_treatment = np.random.binomial(
PlaceboTreatmentRefuter.DEFAULT_NUMBER_OF_TRIALS,
PlaceboTreatmentRefuter.DEFAULT_PROBABILITY_OF_BINOMIAL,
num_rows,
).astype(bool)
elif "int" in type_dict[treatment_name].name:
self.logger.info(
"Using a Discrete Uniform Distribution lying between {} and {}".format(
self._data[treatment_name].min(), self._data[treatment_name].max()
)
)
new_treatment = np.random.randint(
low=self._data[treatment_name].min(), high=self._data[treatment_name].max(), size=num_rows
)
elif "category" in type_dict[treatment_name].name:
categories = self._data[treatment_name].unique()
self.logger.info(
"Using a Discrete Uniform Distribution with the following categories:{}".format(categories)
)
sample = np.random.choice(categories, size=num_rows)
new_treatment = pd.Series(sample, index=self._data.index).astype("category")
# Create a new column in the data by the name of placebo
new_data = self._data.assign(placebo=new_treatment)
if self._target_estimand.identifier_method.startswith("iv"):
new_data = pd.concat((new_data, new_instruments_df), axis=1)
# Sanity check the data
self.logger.debug(new_data[0:10])
new_estimator = CausalEstimator.get_estimator_object(new_data, identified_estimand, self._estimate)
new_effect = new_estimator.estimate_effect()
return new_effect.value
# Run refutation in parallel
sample_estimates = Parallel(n_jobs=self._n_jobs, verbose=self._verbose)(
delayed(refute_once)()
for _ in tqdm(
range(self._num_simulations),
disable=not show_progress_bar,
colour=CausalRefuter.PROGRESS_BAR_COLOR,
desc="Refuting Estimates: ",
)
)
# for _ in range(self._num_simulations))
sample_estimates = np.array(sample_estimates)
# Restoring the value of iv_instrument_name
if self._target_estimand.identifier_method.startswith("iv"):
if (
self._estimate.params["method_params"] is not None
and "iv_instrument_name" in self._estimate.params["method_params"]
):
self._estimate.params["method_params"]["iv_instrument_name"] = [
s.replace("placebo_", "", 1)
for s in parse_state(self._estimate.params["method_params"]["iv_instrument_name"])
]
refute = CausalRefutation(
self._estimate.value, np.mean(sample_estimates), refutation_type="Refute: Use a Placebo Treatment"
)
# Note: We hardcode the estimate value to ZERO as we want to check if it falls in the distribution of the refuter
# Ideally we should expect that ZERO should fall in the distribution of the effect estimates as we have severed any causal
# relationship between the treatment and the outcome.
dummy_estimator = CausalEstimate(
estimate=0,
control_value=self._estimate.control_value,
treatment_value=self._estimate.treatment_value,
target_estimand=self._estimate.target_estimand,
realized_estimand_expr=self._estimate.realized_estimand_expr,
)
refute.add_significance_test_results(self.test_significance(dummy_estimator, sample_estimates))
refute.add_refuter(self)
return refute
def _refute_once(
data: pd.DataFrame,
target_estimand: IdentifiedEstimand,
estimate: CausalEstimate,
treatment_names: List[str],
type_dict: Dict,
placebo_type: PlaceboType = PlaceboType.DEFAULT,
random_state: Optional[Union[int, np.random.RandomState]] = None,
):
if placebo_type == PlaceboType.PERMUTE:
permuted_idx = None
if random_state is None:
permuted_idx = np.random.choice(data.shape[0], size=data.shape[0], replace=False)
else:
permuted_idx = random_state.choice(data.shape[0], size=data.shape[0], replace=False)
new_treatment = data[treatment_names].iloc[permuted_idx].values
if target_estimand.identifier_method.startswith("iv"):
new_instruments_values = data[estimate.estimator.estimating_instrument_names].iloc[permuted_idx].values
new_instruments_df = pd.DataFrame(
new_instruments_values,
columns=["placebo_" + s for s in data[estimate.estimator.estimating_instrument_names].columns],
)
else:
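# placebo_type == DEFAULT: sample placebo values from a distribution matched to the treatment column's dtype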
if "float" in type_dict[treatment_names[0]].name:
logger.info(
"Using a Normal Distribution with Mean:{} and Variance:{}".format(
DEFAULT_MEAN_OF_NORMAL,
DEFAULT_STD_DEV_OF_NORMAL,
)
)
new_treatment = np.random.randn(data.shape[0]) * DEFAULT_STD_DEV_OF_NORMAL + DEFAULT_MEAN_OF_NORMAL
elif "bool" in type_dict[treatment_names[0]].name:
logger.info(
"Using a Binomial Distribution with {} trials and {} probability of success".format(
DEFAULT_NUMBER_OF_TRIALS,
DEFAULT_PROBABILITY_OF_BINOMIAL,
)
)
new_treatment = np.random.binomial(
DEFAULT_NUMBER_OF_TRIALS,
DEFAULT_PROBABILITY_OF_BINOMIAL,
data.shape[0],
).astype(bool)
elif "int" in type_dict[treatment_names[0]].name:
logger.info(
"Using a Discrete Uniform Distribution lying between {} and {}".format(
data[treatment_names[0]].min(), data[treatment_names[0]].max()
)
)
new_treatment = np.random.randint(
low=data[treatment_names[0]].min(), high=data[treatment_names[0]].max(), size=data.shape[0]
)
elif "category" in type_dict[treatment_names[0]].name:
categories = data[treatment_names[0]].unique()
logger.info("Using a Discrete Uniform Distribution with the following categories:{}".format(categories))
sample = np.random.choice(categories, size=data.shape[0])
new_treatment = pd.Series(sample, index=data.index).astype("category")
# Create a new column in the data by the name of placebo
new_data = data.assign(placebo=new_treatment)
if target_estimand.identifier_method.startswith("iv"):
new_data = pd.concat((new_data, new_instruments_df), axis=1)
# Sanity check the data
logger.debug(new_data[0:10])
new_estimator = CausalEstimator.get_estimator_object(new_data, target_estimand, estimate)
new_effect = new_estimator.estimate_effect()
return new_effect.value
def refute_placebo_treatment(
data: pd.DataFrame,
target_estimand: IdentifiedEstimand,
estimate: CausalEstimate,
treatment_names: List,
num_simulations: int = 100,
placebo_type: PlaceboType = PlaceboType.DEFAULT,
random_state: Optional[Union[int, np.random.RandomState]] = None,
show_progress_bar: bool = False,
n_jobs: int = 1,
verbose: int = 0,
**_,
) -> CausalRefutation:
"""Refute an estimate by replacing treatment with a randomly-generated placebo variable.
:param data: pd.DataFrame: Data to run the refutation
:param target_estimand: IdentifiedEstimand: Identified estimand to run the refutation
:param estimate: CausalEstimate: Estimate to run the refutation
:param treatment_names: list: List of treatments
:param num_simulations: The number of simulations to be run, which defaults to ``CausalRefuter.DEFAULT_NUM_SIMULATIONS``
:param placebo_type: Default is to generate random values for the treatment. If placebo_type is "permute", then the original treatment values are permuted by row.
:param random_state: The seed value to be used if we wish to repeat the same random behavior; passing the same seed to the pseudo-random generator reproduces the same results.
:param n_jobs: The maximum number of concurrently running jobs. If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all (this is the default).
:param verbose: The verbosity level: if non-zero, progress messages are printed. Above 50, the output is sent to stdout. The frequency of the messages increases with the verbosity level. If it is more than 10, all iterations are reported. The default is 0.
"""
# only permute is supported for iv methods
if target_estimand.identifier_method.startswith("iv"):
if placebo_type != PlaceboType.PERMUTE:
logger.error(
"Only placebo_type=''permute'' is supported for creating placebo for instrumental variable estimation methods"
)
raise ValueError(
"Only placebo_type=''permute'' is supported for creating placebo for instrumental variable estimation methods."
)
# We need to change the identified estimand
# We make a copy as a safety measure, we don't want to change the
# original DataFrame
identified_estimand = copy.deepcopy(target_estimand)
identified_estimand.treatment_variable = ["placebo"]
if target_estimand.identifier_method.startswith("iv"):
identified_estimand.instrumental_variables = [
"placebo_" + s for s in identified_estimand.instrumental_variables
]
# For IV methods, the estimating_instrument_names should also be
# changed. Create a copy to avoid modifying original object
if estimate.params["method_params"] is not None and "iv_instrument_name" in estimate.params["method_params"]:
estimate = copy.deepcopy(estimate)
estimate.params["method_params"]["iv_instrument_name"] = [
"placebo_" + s for s in parse_state(estimate.params["method_params"]["iv_instrument_name"])
]
logger.info("Refutation over {} simulated datasets of {} treatment".format(num_simulations, placebo_type))
type_dict = dict(data.dtypes)
# Run refutation in parallel
sample_estimates = Parallel(n_jobs=n_jobs, verbose=verbose)(
delayed(_refute_once)(
data, identified_estimand, estimate, treatment_names, type_dict, placebo_type, random_state
)
for _ in tqdm(
range(num_simulations),
disable=not show_progress_bar,
colour=CausalRefuter.PROGRESS_BAR_COLOR,
desc="Refuting Estimates: ",
)
)
sample_estimates = np.array(sample_estimates)
refute = CausalRefutation(
estimate.value, np.mean(sample_estimates), refutation_type="Refute: Use a Placebo Treatment"
)
# Note: We hardcode the estimate value to ZERO as we want to check if it falls in the distribution of the refuter
# Ideally, ZERO should fall within the distribution of the effect estimates, since we have severed any causal
# relationship between the treatment and the outcome.
dummy_estimator = CausalEstimate(
estimate=0,
control_value=estimate.control_value,
treatment_value=estimate.treatment_value,
target_estimand=estimate.target_estimand,
realized_estimand_expr=estimate.realized_estimand_expr,
)
refute.add_significance_test_results(test_significance(dummy_estimator, sample_estimates))
return refute
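
A similar sketch for the functional placebo refuter defined above. The wrapper function and its arguments are hypothetical; PlaceboType and refute_placebo_treatment come from this module, and the data frame, identified estimand, and estimate can be produced as in the data-subset sketch earlier in this diff.

from dowhy.causal_refuters.placebo_treatment_refuter import PlaceboType, refute_placebo_treatment

def placebo_check(df, identified_estimand, estimate, treatment_name):
    # Permute the observed treatment column row-wise instead of sampling random placebo values;
    # permutation is also the only mode supported for IV-based estimands.
    return refute_placebo_treatment(
        data=df,
        target_estimand=identified_estimand,
        estimate=estimate,
        treatment_names=[treatment_name],
        placebo_type=PlaceboType.PERMUTE,
        num_simulations=20,
    )

Because the placebo severs the treatment-outcome link, the significance test above compares the simulated estimates against a hard-coded zero effect rather than against the original estimate.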

View file

@@ -1,12 +1,17 @@
import copy
import logging
from typing import Optional, Union
import numpy as np
import pandas as pd
from joblib import Parallel, delayed
from tqdm.auto import tqdm
from dowhy.causal_estimator import CausalEstimator
from dowhy.causal_refuter import CausalRefutation, CausalRefuter
from dowhy.causal_estimator import CausalEstimate, CausalEstimator
from dowhy.causal_identifier.identified_estimand import IdentifiedEstimand
from dowhy.causal_refuter import CausalRefutation, CausalRefuter, test_significance
logger = logging.getLogger(__name__)
class RandomCommonCause(CausalRefuter):
@@ -36,46 +41,88 @@ class RandomCommonCause(CausalRefuter):
self.logger = logging.getLogger(__name__)
def refute_estimate(self, show_progress_bar=False):
num_rows = self._data.shape[0]
self.logger.info(
"Refutation over {} simulated datasets, each with a random common cause added".format(self._num_simulations)
refute = refute_random_common_cause(
self._data,
self._target_estimand,
self._estimate,
self._num_simulations,
self._random_state,
show_progress_bar,
self._n_jobs,
self._verbose,
)
new_backdoor_variables = self._target_estimand.get_backdoor_variables() + ["w_random"]
identified_estimand = copy.deepcopy(self._target_estimand)
# Adding a new backdoor variable to the identified estimand
identified_estimand.set_backdoor_variables(new_backdoor_variables)
def refute_once():
if self._random_state is None:
new_data = self._data.assign(w_random=np.random.randn(num_rows))
else:
new_data = self._data.assign(w_random=self._random_state.normal(size=num_rows))
new_estimator = CausalEstimator.get_estimator_object(new_data, identified_estimand, self._estimate)
new_effect = new_estimator.estimate_effect()
return new_effect.value
# Run refutation in parallel
sample_estimates = Parallel(n_jobs=self._n_jobs, verbose=self._verbose)(
delayed(refute_once)()
for _ in tqdm(
range(self._num_simulations),
colour=CausalRefuter.PROGRESS_BAR_COLOR,
disable=not show_progress_bar,
desc="Refuting Estimates: ",
)
)
sample_estimates = np.array(sample_estimates)
refute = CausalRefutation(
self._estimate.value, np.mean(sample_estimates), refutation_type="Refute: Add a random common cause"
)
# We want to see if the estimate falls in the same distribution as the one generated by the refuter
# Ideally that should be the case as choosing a subset should not have a significant effect on the ability
# of the treatment to affect the outcome
refute.add_significance_test_results(self.test_significance(self._estimate, sample_estimates))
refute.add_refuter(self)
return refute
def _refute_once(
data: pd.DataFrame,
target_estimand: IdentifiedEstimand,
estimate: CausalEstimate,
random_state: Optional[np.random.RandomState] = None,
):
if random_state is None:
new_data = data.assign(w_random=np.random.randn(data.shape[0]))
else:
new_data = data.assign(w_random=random_state.normal(size=data.shape[0]))
new_estimator = CausalEstimator.get_estimator_object(new_data, target_estimand, estimate)
new_effect = new_estimator.estimate_effect()
return new_effect.value
def refute_random_common_cause(
data: pd.DataFrame,
target_estimand: IdentifiedEstimand,
estimate: CausalEstimate,
num_simulations: int = 100,
random_state: Optional[Union[int, np.random.RandomState]] = None,
show_progress_bar: bool = False,
n_jobs: int = 1,
verbose: int = 0,
**_,
) -> CausalRefutation:
"""Refute an estimate by introducing a randomly generated confounder
(that may have been unobserved).
:param data: pd.DataFrame: Data to run the refutation
:param target_estimand: IdentifiedEstimand: Identified estimand to run the refutation
:param estimate: CausalEstimate: Estimate to run the refutation
:param num_simulations: The number of simulations to be run, which defaults to ``CausalRefuter.DEFAULT_NUM_SIMULATIONS``
:param random_state: The seed value to be used if we wish to repeat the same random behavior; passing the same seed to the pseudo-random generator reproduces the same results.
:param n_jobs: The maximum number of concurrently running jobs. If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all (this is the default).
:param verbose: The verbosity level: if non-zero, progress messages are printed. Above 50, the output is sent to stdout. The frequency of the messages increases with the verbosity level. If it is more than 10, all iterations are reported. The default is 0.
"""
logger.info("Refutation over {} simulated datasets, each with a random common cause added".format(num_simulations))
new_backdoor_variables = target_estimand.get_backdoor_variables() + ["w_random"]
identified_estimand = copy.deepcopy(target_estimand)
# Adding a new backdoor variable to the identified estimand
identified_estimand.set_backdoor_variables(new_backdoor_variables)
if isinstance(random_state, int):
random_state = np.random.RandomState(seed=random_state)
# Run refutation in parallel
sample_estimates = Parallel(n_jobs=n_jobs, verbose=verbose)(
delayed(_refute_once)(data, identified_estimand, estimate, random_state)
for _ in tqdm(
range(num_simulations),
colour=CausalRefuter.PROGRESS_BAR_COLOR,
disable=not show_progress_bar,
desc="Refuting Estimates: ",
)
)
sample_estimates = np.array(sample_estimates)
refute = CausalRefutation(
estimate.value, np.mean(sample_estimates), refutation_type="Refute: Add a random common cause"
)
# We want to see if the estimate falls in the same distribution as the one generated by the refuter
# Ideally that should be the case, since adding an irrelevant random common cause should not have a significant
# effect on the ability of the treatment to affect the outcome
refute.add_significance_test_results(test_significance(estimate, sample_estimates))
return refute
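
The random-common-cause refuter follows the same calling convention; the helper below is a hypothetical wrapper around the function added in this file.

from dowhy.causal_refuters.random_common_cause import refute_random_common_cause

def random_common_cause_check(df, identified_estimand, estimate):
    # Every simulation adds a standard-normal column "w_random" as an extra backdoor variable;
    # the re-estimated effects should stay close to the original estimate.
    return refute_random_common_cause(
        data=df,
        target_estimand=identified_estimand,
        estimate=estimate,
        num_simulations=20,
        random_state=42,
    )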

View file

@@ -0,0 +1,64 @@
from typing import Any, Callable, List, Optional, Union
import pandas as pd
from dowhy.causal_estimator import CausalEstimate
from dowhy.causal_identifier.identified_estimand import IdentifiedEstimand
from dowhy.causal_refuter import CausalRefutation
from dowhy.causal_refuters.add_unobserved_common_cause import sensitivity_simulation
from dowhy.causal_refuters.bootstrap_refuter import refute_bootstrap
from dowhy.causal_refuters.data_subset_refuter import refute_data_subset
from dowhy.causal_refuters.dummy_outcome_refuter import refute_dummy_outcome
from dowhy.causal_refuters.placebo_treatment_refuter import refute_placebo_treatment
from dowhy.causal_refuters.random_common_cause import refute_random_common_cause
ALL_REFUTERS = [
sensitivity_simulation,
refute_bootstrap,
refute_data_subset,
refute_dummy_outcome,
refute_placebo_treatment,
refute_random_common_cause,
# TODO: Sensitivity Analyzers excluded from list due to different return type
]
def refute_estimate(
data: pd.DataFrame,
target_estimand: IdentifiedEstimand,
estimate: CausalEstimate,
treatment_name: Optional[str] = None,
outcome_name: Optional[str] = None,
refuters: List[Callable[..., Union[CausalRefutation, List[CausalRefutation]]]] = ALL_REFUTERS,
**kwargs,
) -> List[CausalRefutation]:
"""Executes a list of refuters using the default parameters
Only refuters that return a CausalRefutation or a list of CausalRefutation objects are supported
:param data: pd.DataFrame: Data to run the refutation
:param target_estimand: IdentifiedEstimand: Identified estimand to run the refutation
:param estimate: CausalEstimate: Estimate to run the refutation
:param treatment_name: str: Name of the treatment (Optional)
:param outcome_name: str: Name of the outcome (Optional)
:param refuters: list: List of refuters to execute
:param kwargs: Optional keyword arguments passed to every refuter, overriding its defaults
"""
refuter_kwargs = {
"data": data,
"target_estimand": target_estimand,
"estimate": estimate,
"treatment_name": treatment_name,
"outcome_name": outcome_name,
**kwargs,
}
results = []
for refuter in refuters:
refute = refuter(**refuter_kwargs)
if isinstance(refute, list):
results.extend(refute)
else:
results.append(refute)
return results
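
To show the combined entry point, here is a hedged sketch. The file path of this new module is not visible in this excerpt, so importing refute_estimate from the dowhy.causal_refuters package is an assumption; the individual refuter imports match those used inside the module itself.

from dowhy.causal_refuters import refute_estimate  # assumed export location for the new module
from dowhy.causal_refuters.data_subset_refuter import refute_data_subset
from dowhy.causal_refuters.random_common_cause import refute_random_common_cause

def run_basic_refuters(df, identified_estimand, estimate):
    # Restrict the run to refuters that need nothing beyond data, estimand, and estimate
    return refute_estimate(
        data=df,
        target_estimand=identified_estimand,
        estimate=estimate,
        refuters=[refute_data_subset, refute_random_common_cause],
    )

Each element of the returned list is a CausalRefutation; refuters that need extra inputs (for example treatment_names for the placebo refuter) can receive them through the keyword arguments that refute_estimate forwards to every refuter.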

View file

@@ -94,7 +94,7 @@ def confidence_intervals(
position=0,
leave=True,
disable=not config.show_progress_bars,
desc="Estimating boostrap interval...",
desc="Estimating bootstrap interval...",
)
)
)

View file

@@ -63,7 +63,7 @@ def kernel_based(
:param bootstrap_num_runs: Number of bootstrap runs (only relevant if use_bootstrap is True).
:param bootstrap_num_samples_per_run: Number of samples used in a bootstrap run (only relevant if use_bootstrap is
True).
:param bootstrap_n_jobs: Number of parallel jobs for the boostrap runs.
:param bootstrap_n_jobs: Number of parallel jobs for the bootstrap runs.
:param p_value_adjust_func: A callable that expects a numpy array of multiple p-values and returns one p-value. This
is typically used as a family-wise error rate control method.
:return: The p-value for the null hypothesis that X and Y are independent (given Z).
@@ -172,7 +172,7 @@ def approx_kernel_based(
p-value is constructed based on the provided p_value_adjust_func function.
:param bootstrap_num_runs: Number of bootstrap runs (only relevant if use_bootstrap is True).
:param bootstrap_num_samples: Maximum number of used samples per bootstrap run.
:param bootstrap_n_jobs: Number of parallel jobs for the boostrap runs.
:param bootstrap_n_jobs: Number of parallel jobs for the bootstrap runs.
:param p_value_adjust_func: A callable that expects a numpy array of multiple p-values and returns one p-value. This
is typically used as a family-wise error rate control method.
:return: The p-value for the null hypothesis that X and Y are independent (given Z).
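
The two docstring fixes above touch the kernel-based (conditional) independence tests. As a rough illustration of the API they document, and assuming the functions are importable from dowhy.gcm.independence_test (that module path is not visible in this excerpt), a conditional test of X independent of Y given Z might look like:

import numpy as np
from dowhy.gcm.independence_test import kernel_based  # assumed import path

rng = np.random.default_rng(0)
z = rng.normal(size=500)
x = z + rng.normal(size=500)
y = z + rng.normal(size=500)

# X and Y are marginally dependent but independent given Z, so the conditional
# test should tend to return a large p-value (no evidence against independence).
p_value = kernel_based(x, y, z, bootstrap_num_runs=5)
print(p_value)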