Add 'Typical usage' section in GCM introduction

Signed-off-by: Peter Goetz <pego@amazon.com>
2022-09-05 16:08:10 +02:00 · 2022-09-05 16:08:10 +02:00 · 560b3460aa
--- a/docs/source/user_guide/gcm_based_inference/introduction.rst
+++ b/docs/source/user_guide/gcm_based_inference/introduction.rst
@@ -170,5 +170,52 @@ This intervention says: "I'll ignore any causal effects of X on Y, and set every
 to 2.34." So the distribution of X will remain unchanged, whereas values of Y will be at a fixed
 value and Z will respond according to its causal model.

+These are the basic steps that need to happen. While we can run them explicitly, they often get
+executed as part of other operations, e.g. when fitting and re-fitting while computing confidence
+intervals. The next section therefore dives into a more typical usage pattern of the ``dowhy.gcm`` package.
+
+Typical usage of the ``dowhy.gcm`` package
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In practice, we often do not execute the steps above explicitly and in this order. Instead, they
+get called by other abstractions. For example, when answering causal questions, we usually want
+confidence intervals to quantify our confidence in the results. In this case, ``fit`` is called on
+our behalf, and we don't call it directly.
+
+Modeling an SCM
+---------------
+
+The graph modeling is the same as in `Step 1: Modeling cause-effect relationships as a structural
+causal model (SCM)`_. First, we construct the causal graph:
+
+>>> causal_model = gcm.StructuralCausalModel(nx.DiGraph([('X', 'Y'), ('Y', 'Z')])) # X → Y → Z
+
+Answering a causal query with confidence intervals
+--------------------------------------------------
+
+When we answer a causal query without computing confidence intervals, what we effectively get
+are point estimates. On their own, these do not tell us how much confidence to place in our results.
+Instead of calling ``fit`` explicitly, we can have it executed for us by going through the API for
+confidence intervals. Let's say we want to understand the direct arrow strengths between nodes and
+quantify our confidence in those results. This is how we would do it:
+
+>>> strength_median, strength_intervals = gcm.confidence_intervals(
+...     gcm.bootstrap_training_and_sampling(gcm.direct_arrow_strength,
+...                                         causal_model,
+...                                         bootstrap_training_data=data,
+...                                         target_node='Y'))
+>>> strength_median, strength_intervals
+({('X', 'Y'): 45.90886398636573, ('Z', 'Y'): 15.47129383737619},
+{('X', 'Y'): array([42.88319632, 50.43890079]), ('Z', 'Y'): array([13.44202416, 17.74266107])})
+
+In this case, ``fit`` will be called within ``bootstrap_training_and_sampling``, so there is no need
+to do this ourselves.
+
+The calling sequence of ``confidence_intervals`` and ``bootstrap_training_and_sampling`` is not
+trivial, but it exploits the fact that our APIs are composable. If not everything makes sense
+to you yet, we recommend simply treating this calling sequence as a ready-to-use construct. Read it as:
+"Get confidence intervals via bootstrapping training and sampling of direct arrow strength". For a
+deeper understanding of this construct, see section :doc:`estimating_confidence_intervals`.
+
 With this knowledge, we can now dive deep into the meaning and usages of causal queries in section
 :doc:`answering_causal_questions/index`.