docs: expand handbook entry on monotonic aggregates

This commit is contained in:
james 2020-02-13 18:06:44 +00:00
Parent e1644dd68b
Commit b32683fc9f
1 changed file with 65 additions and 8 deletions


@ -393,16 +393,73 @@ aggregation in a simpler form:
Monotonic aggregates
====================
In addition to the standard aggregates, QL also supports monotonic aggregates.
These are a slightly different way of computing aggregates which have some advantages.
For example, you can use monotonic aggregates :ref:`recursively <recursion>`.
You can't do this with normal aggregates.
In addition to standard aggregates, QL also supports monotonic aggregates.
Monotonic aggregates differ from standard aggregates in the way that they deal with the
values generated by the ``<expression>`` part of the formula:
For more information and examples, see `Monotonic aggregates in QL
<https://help.semmle.com/QL/learn-ql/advanced/monotonic-aggregates.html>`_.
- Standard aggregates take the ``<expression>`` values for each ``<formula>`` value and
  flatten them into a list. The aggregation function is applied once, to all of the values.
- Monotonic aggregates take one ``<expression>`` value for each value given by the ``<formula>``,
  and create every possible combination of these values. The aggregation
  function is applied to each of the resulting combinations.
.. TODO: Eventually replace this link with just the relevant examples.
(Some of the content is a duplicate of the above discussion on aggregates.)
In general, if the ``<expression>`` is total and functional, then monotonic aggregates are
equivalent to standard aggregates. Results differ when there is not precisely one ``<expression>``
value for each value generated by the ``<formula>``:
- If there are missing ``<expression>`` values (that is, some value generated by the
  ``<formula>`` has no ``<expression>`` value), monotonic aggregates
  won't compute a result, because you cannot create a combination of values
  that includes exactly one ``<expression>`` value for each value generated by the ``<formula>``.
- If there is more than one ``<expression>`` value for a value generated by the ``<formula>``,
  you can create multiple combinations of values, each including exactly one ``<expression>``
  value for each value generated by the ``<formula>``. Here, the aggregation function is applied
  to each of the resulting combinations, so the aggregate can have more than one result.
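The difference between the two semantics can be modelled outside QL. The following is a minimal Python sketch, not QL and not how the QL evaluator is implemented. The function names ``standard_sum`` and ``monotonic_sum`` and the price data are hypothetical, chosen only for illustration. The range is a list of values, and the expression is a function returning the list of its values for one range value.

```python
from itertools import product

# Hypothetical data: each fruit maps to the list of prices the
# "expression" produces for it (zero, one, or several values).
PRICES = {"apple": [100], "orange": [100, 1], "banana": []}


def price(fruit):
    """Model of the <expression>: all values for one range value."""
    return PRICES[fruit]


def standard_sum(range_values, expression):
    """Flatten all expression values into one list and sum it once."""
    flat = [v for x in range_values for v in expression(x)]
    return {sum(flat)}


def monotonic_sum(range_values, expression):
    """Sum every combination that picks exactly one expression value
    per range value. No such combination means no result."""
    choices = [expression(x) for x in range_values]
    return {sum(combo) for combo in product(*choices)}


# Exactly one value per range value: both semantics agree.
print(sorted(standard_sum(["apple"], price)))             # [100]
print(sorted(monotonic_sum(["apple"], price)))            # [100]

# More than one value for "orange": one result per combination.
print(sorted(standard_sum(["orange", "apple"], price)))   # [201]
print(sorted(monotonic_sum(["orange", "apple"], price)))  # [101, 200]

# Missing value for "banana": the monotonic aggregate has no result.
print(sorted(standard_sum(["apple", "banana"], price)))   # [100]
print(sorted(monotonic_sum(["apple", "banana"], price)))  # []
```

Both functions return a set of results so that they can be compared directly: the standard model always yields a single sum, while the monotonic model yields one sum per combination.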
Recursive monotonic aggregates
------------------------------
Monotonic aggregates may be used :ref:`recursively <recursion>`, but the recursive call may only appear in the
expression, and not in the range. The recursive semantics for aggregates are the same as the
recursive semantics for the rest of QL. For example, we might define a predicate to calculate
the distance of a node in a graph from the leaves as follows:
.. code-block:: ql

   int depth(Node n) {
     if not exists(n.getAChild())
     then result = 0
     else result = 1 + max(Node child | child = n.getAChild() | depth(child))
   }
Here the recursive call is in the expression, which is legal. If you understand how aggregates
work in the non-recursive case, you should not find it difficult to use them recursively.
However, it is worth seeing how the evaluation of a recursive aggregation proceeds.
Consider the depth example we just saw with the following graph as input (arrows point from children to parents):
.. |image0| image:: ../images/monotonic-aggregates-graph.png
|image0|
Then the evaluation of the ``depth`` predicate proceeds as follows:
+-----------+--------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| **Stage** | **depth** | **Comments** |
+===========+============================================+==========================================================================================================================================================================+
| 0 |   | We always begin with the empty set. |
+-----------+--------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1 | ``(0, b), (0, d), (0, e)`` | The nodes with no children have depth 0. The recursive step for **a** and **c** fails to produce a value, since some of their children do not have values for ``depth``. |
+-----------+--------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2 | ``(0, b), (0, d), (0, e), (1, c)`` | The recursive step for **c** succeeds, since ``depth`` now has a value for all its children (**d** and **e**). The recursive step for **a** still fails. |
+-----------+--------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 3 | ``(0, b), (0, d), (0, e), (1, c), (2, a)`` | The recursive step for **a** succeeds, since ``depth`` now has a value for all its children (**b** and **c**). |
+-----------+--------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Here, we can see that at the intermediate stages it is very important for the aggregate to
fail if some of the children lack a value. This prevents erroneous values from being added.
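The staged evaluation in the table can be reproduced with a small Python model. This is a sketch under stated assumptions, not the QL evaluator: the ``CHILDREN`` dictionary is a hypothetical encoding of the example graph (**a** has children **b** and **c**, and **c** has children **d** and **e**), and ``step`` and ``evaluate`` are illustrative names. Because ``depth`` has at most one value per node here, a dictionary is enough to hold the tuples derived so far.

```python
# Hypothetical encoding of the example graph: node -> list of children.
CHILDREN = {"a": ["b", "c"], "b": [], "c": ["d", "e"], "d": [], "e": []}


def step(depth):
    """Apply the body of `depth` once, using only the values in `depth`."""
    new = dict(depth)
    for node, kids in CHILDREN.items():
        if not kids:
            # Base case: a node with no children has depth 0.
            new[node] = 0
        elif all(kid in depth for kid in kids):
            # The aggregate only succeeds once every child has a value.
            new[node] = 1 + max(depth[kid] for kid in kids)
        # Otherwise the aggregate fails and no value is added for `node`.
    return new


def evaluate():
    """Iterate from the empty set until no new values are derived."""
    stages = [{}]
    while True:
        nxt = step(stages[-1])
        if nxt == stages[-1]:
            return stages
        stages.append(nxt)


for number, stage in enumerate(evaluate()):
    print(number, sorted((value, node) for node, value in stage.items()))
```

If the ``all(...)`` check were dropped and the maximum were taken over only the children that already have a value, **a** would receive the value 1 at stage 2 (from **b** alone), which is the kind of erroneous value that the failing aggregate prevents.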
.. index:: any