Initial checkin of some documentation

2003-03-27 23:43:32 +00:00 · 2003-03-27 23:43:32 +00:00 · 6b3bf90a25
--- a/extensions/transformiix/docs/optimized-stylesheets.html
+++ b/extensions/transformiix/docs/optimized-stylesheets.html
--- a/extensions/transformiix/docs/optimized-xpath.html
+++ b/extensions/transformiix/docs/optimized-xpath.html
@ -0,0 +1,490 @@
+<html>
+  <head>
+    <title>Optimizing XPath</title>
+    <style>
+      .comment {
+        font-style: italic;
+      }
+      h4 {
+        margin: 0;
+        padding: 0;
+      }
+    </style>
+  </head>
+  <body>
+
+    <h1>Optimizing XPath</h1>
+
+    <h2>Overview</h2>
+      <p>
+        This document outlines optimizations that we can perform to execute
+        XPath expressions faster.
+      </p>
+
+    <h2>Stage 1, DONE</h2>
+
+      <h3>Summary</h3>
+        <p>
+          Speed up retrieval of orderInfo objects by storing them in their
+          respective nodes instead of in a hash.
+        </p>
+
+      <h3>Details</h3>
+        <p>
+          We currently spend a great deal of time searching the
+          DOMHelper::orders hash for the orderInfo object of a
+          specific node. If we moved the ownership and retrieval of these
+          orderInfo objects to the Node class instead, we would probably save
+          a lot of time. I.e. instead of calling
+          <code>myDOMHelper->getDocumentOrder(node)</code> you call
+          <code>node->getDocumentOrder()</code>, which then returns the
+          orderInfo object.
+        </p>
+
+        <p>
+          It would also be nice if we at the same time fixed some bugs in the
+          orderInfo objects and in the function that sorts nodes using them.
+        </p>
+
+        <p>
+          Bugs filed for this are 88964 and 94471.
+        </p>
+
+
+
+    <h2>Stage 2, DONE</h2>
+
+
+      <h3>Summary</h3>
+        <p>
+          Speed up document-order sorting by having the XPath engine always
+          return document-ordered nodesets.
+        </p>
+
+      <h3>Details</h3>
+        <p>
+          Currently the nodesets returned from the XPath engine are totally
+          unordered (or rather, have undefined order) which forces the XSLT
+          code to sort the nodesets. This is quite expensive since it requires
+          us to generate orderInfo objects for every node. Considering that
+          many XPath classes actually return nodesets that are already
+          ordered in document order (or reversed document order) this seems a
+          bit unnecessary.
+        </p>
+
+        <p>
+          However we still need to handle the classes that don't by default
+          return document-ordered nodesets. A good example of this is the id()
+          function. For example "id('foo bar')" produces two nodes, and the
+          id-function has no idea how they relate in terms of document order.
+          Another example is "foo | bar", where the UnionExpr object gets two
+          nodesets (ordered in document order since all XPath classes should
+          now return ordered nodesets) and needs to merge them into a single
+          ordered nodeset.
+        </p>
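+        <p>
+          As an illustration of the merge that UnionExpr needs, here is a
+          minimal C++ sketch. It assumes, purely for the example, that each
+          node is represented by its document-order index; the names
+          <code>NodeSet</code> and <code>mergeOrdered</code> are hypothetical
+          and are not Transformiix classes.
+        </p>

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in: a node is identified by its document-order index,
// so a document-ordered nodeset is simply a sorted vector of ints.
using NodeSet = std::vector<int>;

// Merge two document-ordered nodesets into one ordered nodeset.
// A node that appears in both inputs is emitted only once.
NodeSet mergeOrdered(const NodeSet& a, const NodeSet& b) {
    NodeSet out;
    out.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            out.push_back(a[i++]);
        } else if (b[j] < a[i]) {
            out.push_back(b[j++]);
        } else {            // same node in both sets
            out.push_back(a[i]);
            ++i;
            ++j;
        }
    }
    while (i < a.size()) out.push_back(a[i++]);
    while (j < b.size()) out.push_back(b[j++]);
    return out;
}
```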
+
+    <h2>Stage 3</h2>
+
+      <h3>Summary</h3>
+        <p>
+          Speed up evaluation of XPath expressions by using specialized
+          classes for common optimizable expressions.
+        </p>
+
+      <h3>Details</h3>
+        <p>
+          Some common expressions can be executed faster if we have
+          classes that are specialized for them. For example the expression
+          "@foo" can be evaluated by simply calling
+          <code>context->getAttributeNode("foo")</code>, whereas we currently
+          walk all attributes of the context node and
+          filter each node using an AttributeExpr. Below is a list of
+          expressions that I can think of that are optimizable, but there are
+          probably more.
+        </p>
+
+        <p>
+          One thing that we IMHO should keep in mind is to only put effort into
+          optimising expressions that are actually used in real-world
+          stylesheets. For example "foo | foo", "foo | bar[0]" and
+          "foo[position()]" can all be optimised to "foo", but since no one
+          should be so stupid as to write such an expression we shouldn't
+          spend time or code size on that. Of course we should return the
+          correct result according to spec for those expressions, we just
+          shouldn't bother with evaluating them fast.
+        </p>
+
+
+        <p>
+          Apart from finding expressions that we can evaluate more cleverly
+          there is also the problem of how and where we create these
+          optimised objects instead of the unoptimised, general ones we create
+          now. And what are these optimised classes, should they be normal
+          Expr classes or should they be something else? We could also add
+          "optional" methods to Expr which have default implementations in
+          Expr, for example a ::isContextSensitive() which returns MB_TRUE
+          unless overridden. However we probably can't answer all this until
+          we know which expressions we want to optimise and how we want to
+          optimise them.
+        </p>
+
+        <p>
+          These expressions can be optimised:
+        </p>
+
+        <p>
+          <h4>Class:</h4>
+          <span>
+            Steps along the attribute axis which don't contain wildcards
+          </span>
+          <h4>Example:</h4>
+          <span>
+            @foo
+          </span>
+          <h4>What we do today:</h4>
+          <span>
+            Walk through the attributes NamedNodeMap and filter each node using a
+          NameTest.
+          </span>
+          <h4>What we could do:</h4>
+          <span>
+            Call getAttributeNode (or actually getAttributeNodeNS) on the
+            contextnode and return a nodeset containing just the returned node, or
+            an empty nodeset if NULL is returned.
+          </span>
+        </p>
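+        <p>
+          A rough C++ sketch of the direct lookup follows. The
+          <code>Attr</code> and <code>Element</code> types and the
+          <code>evalAttributeStep</code> function are simplified stand-ins
+          invented for this example; the real code would go through the DOM's
+          getAttributeNodeNS and the engine's own nodeset class.
+        </p>

```cpp
#include <map>
#include <string>
#include <vector>

// Simplified, hypothetical stand-ins for DOM attribute and element nodes.
struct Attr {
    std::string name;
    std::string value;
};

struct Element {
    std::map<std::string, Attr> attrs;

    // Returns the attribute with the given name, or nullptr if there is none.
    const Attr* getAttributeNode(const std::string& name) const {
        auto it = attrs.find(name);
        return it == attrs.end() ? nullptr : &it->second;
    }
};

// Evaluate "@name" against a context element without walking all attributes:
// the result holds the one matching attribute, or is empty if none exists.
std::vector<const Attr*> evalAttributeStep(const Element& context,
                                           const std::string& name) {
    std::vector<const Attr*> result;
    if (const Attr* attr = context.getAttributeNode(name)) {
        result.push_back(attr);
    }
    return result;
}
```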
+
+        <p>
+          <h4>Class:</h4>
+          <span>
+            Union expressions where each expression consists of a LocationStep and
+          all LocationSteps have the same axis. None of the LocationSteps have any
+          predicates (well, this could be relaxed a bit)
+          </span>
+          <h4>Example:</h4>
+          <span>
+            foo | bar | baz
+          </span>
+          <h4>What we do today:</h4>
+          <span>
+            Evaluate each LocationStep separately and thus walk the same path through
+            the document each time. During the walking the NodeTest is applied to
+            filter out the correct nodes. The resulting nodesets are then merged and
+            thus we generate orderInfo objects for most nodes.
+          </span>
+          <h4>What we could do:</h4>
+          <span>
+            Have just one LocationStep object which contains a NodeTest that is a
+            "UnionNodeTest" which contains a list of NodeTests. The UnionNodeTest
+            then tests each NodeTest until it finds one that returns true. If none
+            do then false is returned.
+            This results in just one walk along the axis and no need to generate any
+            orderInfo objects.
+          </span>
+        </p>
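+        <p>
+          The UnionNodeTest idea could look roughly like the C++ sketch below.
+          The <code>Node</code>, <code>NodeTest</code> and
+          <code>NameTest</code> types here are deliberately minimal stand-ins,
+          and <code>filterChildren</code> is a hypothetical helper that plays
+          the role of the single walk along the axis.
+        </p>

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Minimal stand-in for a DOM node: only the name matters here.
struct Node {
    std::string name;
};

struct NodeTest {
    virtual ~NodeTest() = default;
    virtual bool matches(const Node& node) const = 0;
};

struct NameTest : NodeTest {
    std::string name;
    explicit NameTest(std::string n) : name(std::move(n)) {}
    bool matches(const Node& node) const override { return node.name == name; }
};

// Tries each contained NodeTest in turn; true as soon as one matches.
struct UnionNodeTest : NodeTest {
    std::vector<std::unique_ptr<NodeTest>> tests;
    bool matches(const Node& node) const override {
        for (const auto& test : tests) {
            if (test->matches(node)) return true;
        }
        return false;
    }
};

// A single walk over the children; the result is in document order because
// the children are visited in document order, so no merge is needed.
std::vector<const Node*> filterChildren(const std::vector<Node>& children,
                                        const NodeTest& test) {
    std::vector<const Node*> out;
    for (const Node& child : children) {
        if (test.matches(child)) out.push_back(&child);
    }
    return out;
}
```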
+
+        <p>
+          <h4>Class:</h4>
+          <span>
+            Steps where the predicates aren't context-node-list sensitive.
+          </span>
+          <h4>Example:</h4>
+          <span>
+            foo[@bar]
+          </span>
+          <h4>What we do today:</h4>
+          <span>
+            Build a nodeset of all nodes that match 'foo' and then filter the
+          nodeset through the predicate and thus do some node shuffling.
+          </span>
+          <h4>What we could do:</h4>
+          <span>
+            Create a "PredicatedNodeTest" that contains a NodeTest and a list of
+            predicates. The PredicatedNodeTest returns true if both the NodeTest
+            returns true and all predicates evaluate to true. Then let the
+            LocationStep have that PredicatedNodeTest as NodeTest and no predicates.
+            This will save us the predicate filtering and thus some node shuffling.
+            (Note how this combines nicely with the previous optimisation...)
+            (Actually this can be done even if some predicates are context-list
+            sensitive, but only for the predicates up until the first one that is.)
+          </span>
+        </p>
+
+        <p>
+          <h4>Class:</h4>
+          <span>
+            PathExprs that only contain steps from the child:: and attribute::
+            axes.
+          </span>
+          <h4>Example:</h4>
+          <span>
+            foo/bar/baz
+          </span>
+          <h4>What we do today:</h4>
+          <span>
+            For each step we evaluate the step once for every node in a nodeset
+            (for example for the second step the nodeset is the list of all "foo"
+            children) and then merge the resulting nodesets while making sure that
+            we keep the nodes in document order (and thus generate orderInfo
+            objects).
+          </span>
+          <h4>What we could do:</h4>
+          <span>
+            The same thing except that we don't merge the resulting nodesets, but
+            rather just concatenate them. We always know that the resulting nodesets
+            follow each other in document order.
+          </span>
+        </p>
+
+        <p>
+          <h4>Class:</h4>
+          <span>
+            List of predicates where some predicates are not context-list sensitive
+          </span>
+          <h4>Example:</h4>
+          <span>
+            foo[position() > 3][@bar][.//baz][position() > last() div 2][.//@fud]
+          </span>
+          <h4>What we do today:</h4>
+          <span>
+            Apply each predicate separately, requiring us to shuffle nodes five times
+            in the above example.
+          </span>
+          <h4>What we could do:</h4>
+          <span>
+            Merge all predicates that are not context-list sensitive into the
+            previous predicate. The above predicate list could be merged into the
+            following predicate list:
+            foo[(position() > 3) and (@bar) and (.//baz)][(position() > last() div 2) and (.//@fud)]
+            which only requires two node shuffles.
+          </span>
+        </p>
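+        <p>
+          The merging rule can be sketched in C++ as follows. Predicates are
+          represented as plain strings plus a flag, which is an assumption made
+          only to keep the example short; <code>Predicate</code> and
+          <code>mergePredicates</code> are hypothetical names. The context size
+          is written with XPath's <code>last()</code> function.
+        </p>

```cpp
#include <string>
#include <vector>

// Hypothetical representation: the predicate source text plus a flag saying
// whether it depends on the context node list (position or size).
struct Predicate {
    std::string expr;
    bool listSensitive;
};

// Fold every predicate that is not context-list sensitive into the predicate
// before it using "and". A list-sensitive predicate always starts a new
// group, because it must see the node list filtered by the earlier groups.
std::vector<std::string> mergePredicates(const std::vector<Predicate>& preds) {
    std::vector<std::string> merged;
    for (const Predicate& p : preds) {
        if (!merged.empty() && !p.listSensitive) {
            merged.back() += " and (" + p.expr + ")";
        } else {
            merged.push_back("(" + p.expr + ")");
        }
    }
    return merged;
}
```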
+
+        <p>
+          <h4>Class:</h4>
+          <span>
+            Predicates that are only context-list-position sensitive and not
+            context-list-size sensitive
+          </span>
+          <h4>Example:</h4>
+          <span>
+            foo[position() > 5][position() mod 2]
+          </span>
+          <h4>What we do today:</h4>
+          <span>
+            Build the entire list of nodes that matches "foo" and then apply the
+            predicates
+          </span>
+          <h4>What we could do:</h4>
+          <span>
+            Apply the predicates during the initial build of the first nodeset. We
+            would have to keep track of how many nodes have passed each predicate
+            and somehow override the code that calculates the context-list-position.
+          </span>
+        </p>
+
+        <p>
+          <h4>Class:</h4>
+          <span>
+            Predicates that are constants
+          </span>
+          <h4>Example:</h4>
+          <span>
+            foo[5]
+          </span>
+          <h4>What we do today:</h4>
+          <span>
+            Perform the appropriate walk and build the entire nodeset. Then apply
+          the predicate.
+          </span>
+          <h4>What we could do:</h4>
+          <span>
+            There are three types of constant results: 1) numerical values, 2)
+            results with a true boolean value and 3) results with a false boolean
+            value. In the case of 1) we should only walk up until the n:th node (5
+            in the above example) and then stop. For 2) we should completely ignore
+            the predicate and for 3) we should return an empty nodeset without doing
+            any walking. In some cases we can't decide at parsetime whether a
+            constant expression will return a numerical value or not, for example
+            for "foo[$pos]", so the decision between 1), 2) and 3) would have to be
+            made at evaltime. However we should be able to decide at parsetime
+            whether it's a constant or not.
+            Note that while evaluating a LocationStep, the predicate [//foo] can be
+            considered constant.
+          </span>
+        </p>
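+        <p>
+          For case 1), the early stop could look like the C++ sketch below. It
+          only handles a child step with a name test, and <code>Node</code>,
+          <code>nthMatch</code> and the <code>visited</code> counter are
+          assumptions introduced for the example.
+        </p>

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Minimal stand-in for a DOM node: only the name matters here.
struct Node {
    std::string name;
};

// Evaluate a step such as foo[n] for a constant numeric n (1-based).
// The walk stops as soon as the n-th matching child has been found.
// If `visited` is non-null it receives the number of children examined.
// Returns nullptr when fewer than n children match (or when n is 0).
const Node* nthMatch(const std::vector<Node>& children,
                     const std::string& name,
                     std::size_t n,
                     std::size_t* visited = nullptr) {
    std::size_t seen = 0;
    std::size_t matches = 0;
    const Node* found = nullptr;
    for (const Node& child : children) {
        ++seen;
        if (child.name == name && ++matches == n) {
            found = &child;
            break;  // no need to look at the remaining children
        }
    }
    if (visited) *visited = seen;
    return found;
}
```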
+
+        <p>
+          <h4>Class:</h4>
+          <span>
+            PathExprs that contain '//' followed by an unpredicated child-step.
+          </span>
+          <h4>Example:</h4>
+          <span>
+            .//bar
+          </span>
+          <h4>What we do today:</h4>
+          <span>
+            We walk the entire subtree below the contextnode and at every node we
+            evaluate the 'bar'-expression which walks all the children of the
+            contextnode. This means that we'll walk the entire subtree twice.
+          </span>
+          <h4>What we could do:</h4>
+          <span>
+            Change the expression into "./descendant::bar". This means that we'll
+            only walk the tree once. This can only be done if there are no
+            predicates since the context-node-list will be different for
+            predicates in the new expression.
+            Note that this combines nicely with the optimization for steps whose
+            predicates aren't context-node-list sensitive.
+          </span>
+        </p>
+
+        <p>
+          <h4>Class:</h4>
+          <span>
+            PathExprs where the first step is '.'
+          </span>
+          <h4>Example:</h4>
+          <span>
+            ./*
+          </span>
+          <h4>What we do today:</h4>
+          <span>
+            Evaluate the step "." which always returns the same node and then
+            evaluate the rest of the PathExpr.
+          </span>
+          <h4>What we could do:</h4>
+          <span>
+            Remove the '.'-step and simply evaluate the other steps. In the example
+            we could even remove the entire PathExpr-object and replace it with a
+            single Step-object.
+          </span>
+        </p>
+
+        <p>
+          <h4>Class:</h4>
+          <span>
+            Steps along the attribute axis which don't contain wildcards and
+            where we only care about the boolean value.
+          </span>
+          <h4>Example:</h4>
+          <span>
+            foo[@bar], @foo or @bar
+          </span>
+          <h4>What we do today:</h4>
+          <span>
+            Evaluate the step and create a nodeset. Then get the bool-value of
+            the nodeset by checking if the nodeset contains any nodes.
+          </span>
+          <h4>What we could do:</h4>
+          <span>
+            Simply check if the current element has an attribute of the
+            requested name and return a bool-result.
+          </span>
+        </p>
+
+        <p>
+          <h4>Class:</h4>
+          <span>
+            Steps where we only care about the boolean value.
+          </span>
+          <h4>Example:</h4>
+          <span>
+            foo[processing-instruction()]
+          </span>
+          <h4>What we do today:</h4>
+          <span>
+            Evaluate the step and create a nodeset. Then get the bool-value of
+            the nodeset by checking if the nodeset contains any nodes.
+          </span>
+          <h4>What we could do:</h4>
+          <span>
+            Walk along the axis until we find a node that matches the nodetest.
+            If one is found we can stop the walking and return a true
+            bool-result immediately, otherwise a false bool-result is returned.
+            It might not be worth implementing all axes unless we can reuse
+            code from the normal Step-code.
+          </span>
+        </p>
+
+    <h2>Stage 4</h2>
+
+      <h3>Summary</h3>
+        <p>
+          Refcount <code>ExprResult</code>s to reduce the number of objects
+          created during evaluation.
+        </p>
+
+      <h3>Details</h3>
+        <p>
+          Right now every subexpression creates a new object during evaluation.
+          If we refcounted objects we would often be able to reuse the same
+          objects across multiple evaluations. We should also keep global
+          result-objects for true and false; that way expressions that return
+          bool-values would never have to create any objects.
+        </p>
+
+        <p>
+          This does however require that the returned objects aren't modified
+          since they might be used elsewhere. This is not a big problem in the
+          current code where we pretty much only modify nodesets in a couple
+          of places.
+        </p>
+
+        <p>
+          To be able to reuse objects across subexpressions we could have an
+          <code>ExprResult::ensureModifyable</code>-function. This would
+          return the same object if the refcount is 1, and create a new object
+          to return otherwise. This is especially useful for nodesets, which
+          would be mostly used by a single object at a time. But it could be
+          just as useful for other types, though then we might need an
+          <code>ExprResult::ensureModifyableOfType(ExprResult::ResultType)</code>-function
+          that only returned itself if it has a refcount of 1 and is of the
+          requested type.
+        </p>
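+        <p>
+          The copy-only-when-shared behaviour can be sketched in C++ as below.
+          This is not the proposed <code>ExprResult</code> API: it uses
+          <code>std::shared_ptr</code> as a stand-in for whatever refcounting
+          the engine would use, a hypothetical <code>NodeSetResult</code> type,
+          and a free function instead of a member, and it ignores threading.
+        </p>

```cpp
#include <memory>
#include <vector>

// Hypothetical result type; nodes are plain ints to keep the sketch short.
struct NodeSetResult {
    std::vector<int> nodes;
};

// Return an object that the caller may modify safely.
// If the caller holds the only reference, the same object is handed back;
// otherwise a private copy is made so other holders are not affected.
std::shared_ptr<NodeSetResult> ensureModifyable(
        const std::shared_ptr<NodeSetResult>& result) {
    if (result.use_count() == 1) {
        return result;  // sole owner: reuse in place
    }
    return std::make_shared<NodeSetResult>(*result);  // shared: copy first
}
```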
+
+    <h2>Stage 5</h2>
+
+      <h3>Summary</h3>
+        <p>
+          Detect when we can concatenate nodesets instead of merging them in
+          PathExpr.
+        </p>
+
+      <h3>Details</h3>
+        <p>
+          Why can we for expressions like "foo/bar/baz" concatenate the resulting
+          nodesets without having to check nodeorder? Because at every step two
+          statements are true:
+          <ol>
+            <li>We iterate a nodeset where no node is an ancestor of another</li>
+            <li>The LocationStep only returns nodes that are members of the subtree
+                below the context-node</li>
+          </ol>
+        </p>
+
+        <p>
+          For example, while evaluating the second step in "foo/bar/baz" we
+          iterate a nodelist containing all "foo" children of the original
+          contextnode, i.e. none can be an ancestor of another. And the
+          LocationStep "bar" only returns children of the contextnode.
+        </p>
+
+        <p>
+          So, it would be nice if we could detect when this occurs as often as
+          possible. For example the expression "id(foo)/bar/baz" fulfils those
+          requirements if the nodeset returned from id() doesn't contain any
+          ancestors of other nodes in the nodeset, which probably often is the
+          case in real-world stylesheets.
+        </p>
+
+        <p>
+          We should perform this check on every step to be able to take advantage
+          of it as often as possible. For example, in the expression
+          "id(@boss)/ancestor::team/members" we can't use this optimisation at the
+          second step since the ancestor axis returns nodes that are not members
+          of the contextnode's subtree. However we will probably be able to use the
+          optimisation at the third step if the iterated nodeset contains only
+          one node (and thus can't contain ancestors of its members).
+        </p>
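+        <p>
+          One possible way to test the first condition (no node in the iterated
+          nodeset is an ancestor of another) is sketched below in C++. The
+          <code>Node</code> type with a bare parent pointer and the function
+          name <code>noNodeIsAncestorOfAnother</code> are assumptions for the
+          example, and the cost of the check itself is not considered here.
+        </p>

```cpp
#include <unordered_set>
#include <vector>

// Minimal stand-in for a DOM node: only the parent link matters here.
struct Node {
    const Node* parent;
};

// True if no node in the set is an ancestor of another node in the set.
// When this holds, and the next step only returns nodes from the subtree
// below its context node, the per-node results can simply be concatenated.
bool noNodeIsAncestorOfAnother(const std::vector<const Node*>& nodes) {
    std::unordered_set<const Node*> members(nodes.begin(), nodes.end());
    for (const Node* node : nodes) {
        // Walk up the ancestor chain looking for another member of the set.
        for (const Node* a = node->parent; a != nullptr; a = a->parent) {
            if (members.count(a) != 0) return false;
        }
    }
    return true;
}
```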
+  </body>
+</html>