flow_dissector: rst'ify documentation
Rename bpf_flow_dissector.txt to bpf_flow_dissector.rst and fix
formatting. Also, link it from the Documentation/networking/index.rst.
Tested with 'make htmldocs' to make sure it looks reasonable.
Fixes: ae82899bbe
("flow_dissector: document BPF flow dissector environment")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This commit is contained in:
Родитель
a090dbf25c
Коммит
5eed789862
|
@ -0,0 +1,126 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
==================
|
||||
BPF Flow Dissector
|
||||
==================
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
Flow dissector is a routine that parses metadata out of the packets. It's
|
||||
used in the various places in the networking subsystem (RFS, flow hash, etc).
|
||||
|
||||
BPF flow dissector is an attempt to reimplement C-based flow dissector logic
|
||||
in BPF to gain all the benefits of BPF verifier (namely, limits on the
|
||||
number of instructions and tail calls).
|
||||
|
||||
API
|
||||
===
|
||||
|
||||
BPF flow dissector programs operate on an ``__sk_buff``. However, only the
|
||||
limited set of fields is allowed: ``data``, ``data_end`` and ``flow_keys``.
|
||||
``flow_keys`` is ``struct bpf_flow_keys`` and contains flow dissector input
|
||||
and output arguments.
|
||||
|
||||
The inputs are:
|
||||
* ``nhoff`` - initial offset of the networking header
|
||||
* ``thoff`` - initial offset of the transport header, initialized to nhoff
|
||||
* ``n_proto`` - L3 protocol type, parsed out of L2 header
|
||||
|
||||
Flow dissector BPF program should fill out the rest of the ``struct
|
||||
bpf_flow_keys`` fields. Input arguments ``nhoff/thoff/n_proto`` should be
|
||||
also adjusted accordingly.
|
||||
|
||||
The return code of the BPF program is either BPF_OK to indicate successful
|
||||
dissection, or BPF_DROP to indicate parsing error.
|
||||
|
||||
__sk_buff->data
|
||||
===============
|
||||
|
||||
In the VLAN-less case, this is what the initial state of the BPF flow
|
||||
dissector looks like::
|
||||
|
||||
+------+------+------------+-----------+
|
||||
| DMAC | SMAC | ETHER_TYPE | L3_HEADER |
|
||||
+------+------+------------+-----------+
|
||||
^
|
||||
|
|
||||
+-- flow dissector starts here
|
||||
|
||||
|
||||
.. code:: c
|
||||
|
||||
skb->data + flow_keys->nhoff point to the first byte of L3_HEADER
|
||||
flow_keys->thoff = nhoff
|
||||
flow_keys->n_proto = ETHER_TYPE
|
||||
|
||||
In case of VLAN, flow dissector can be called with the two different states.
|
||||
|
||||
Pre-VLAN parsing::
|
||||
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
| DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
^
|
||||
|
|
||||
+-- flow dissector starts here
|
||||
|
||||
.. code:: c
|
||||
|
||||
skb->data + flow_keys->nhoff point the to first byte of TCI
|
||||
flow_keys->thoff = nhoff
|
||||
flow_keys->n_proto = TPID
|
||||
|
||||
Please note that TPID can be 802.1AD and, hence, BPF program would
|
||||
have to parse VLAN information twice for double tagged packets.
|
||||
|
||||
|
||||
Post-VLAN parsing::
|
||||
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
| DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
^
|
||||
|
|
||||
+-- flow dissector starts here
|
||||
|
||||
.. code:: c
|
||||
|
||||
skb->data + flow_keys->nhoff point the to first byte of L3_HEADER
|
||||
flow_keys->thoff = nhoff
|
||||
flow_keys->n_proto = ETHER_TYPE
|
||||
|
||||
In this case VLAN information has been processed before the flow dissector
|
||||
and BPF flow dissector is not required to handle it.
|
||||
|
||||
|
||||
The takeaway here is as follows: BPF flow dissector program can be called with
|
||||
the optional VLAN header and should gracefully handle both cases: when single
|
||||
or double VLAN is present and when it is not present. The same program
|
||||
can be called for both cases and would have to be written carefully to
|
||||
handle both cases.
|
||||
|
||||
|
||||
Reference Implementation
|
||||
========================
|
||||
|
||||
See ``tools/testing/selftests/bpf/progs/bpf_flow.c`` for the reference
|
||||
implementation and ``tools/testing/selftests/bpf/flow_dissector_load.[hc]``
|
||||
for the loader. bpftool can be used to load BPF flow dissector program as well.
|
||||
|
||||
The reference implementation is organized as follows:
|
||||
* ``jmp_table`` map that contains sub-programs for each supported L3 protocol
|
||||
* ``_dissect`` routine - entry point; it does input ``n_proto`` parsing and
|
||||
does ``bpf_tail_call`` to the appropriate L3 handler
|
||||
|
||||
Since BPF at this point doesn't support looping (or any jumping back),
|
||||
jmp_table is used instead to handle multiple levels of encapsulation (and
|
||||
IPv6 options).
|
||||
|
||||
|
||||
Current Limitations
|
||||
===================
|
||||
BPF flow dissector doesn't support exporting all the metadata that in-kernel
|
||||
C-based implementation can export. Notable example is single VLAN (802.1Q)
|
||||
and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
|
||||
for a set of information that's currently can be exported from the BPF context.
|
|
@ -1,115 +0,0 @@
|
|||
==================
|
||||
BPF Flow Dissector
|
||||
==================
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
Flow dissector is a routine that parses metadata out of the packets. It's
|
||||
used in the various places in the networking subsystem (RFS, flow hash, etc).
|
||||
|
||||
BPF flow dissector is an attempt to reimplement C-based flow dissector logic
|
||||
in BPF to gain all the benefits of BPF verifier (namely, limits on the
|
||||
number of instructions and tail calls).
|
||||
|
||||
API
|
||||
===
|
||||
|
||||
BPF flow dissector programs operate on an __sk_buff. However, only the
|
||||
limited set of fields is allowed: data, data_end and flow_keys. flow_keys
|
||||
is 'struct bpf_flow_keys' and contains flow dissector input and
|
||||
output arguments.
|
||||
|
||||
The inputs are:
|
||||
* nhoff - initial offset of the networking header
|
||||
* thoff - initial offset of the transport header, initialized to nhoff
|
||||
* n_proto - L3 protocol type, parsed out of L2 header
|
||||
|
||||
Flow dissector BPF program should fill out the rest of the 'struct
|
||||
bpf_flow_keys' fields. Input arguments nhoff/thoff/n_proto should be also
|
||||
adjusted accordingly.
|
||||
|
||||
The return code of the BPF program is either BPF_OK to indicate successful
|
||||
dissection, or BPF_DROP to indicate parsing error.
|
||||
|
||||
__sk_buff->data
|
||||
===============
|
||||
|
||||
In the VLAN-less case, this is what the initial state of the BPF flow
|
||||
dissector looks like:
|
||||
+------+------+------------+-----------+
|
||||
| DMAC | SMAC | ETHER_TYPE | L3_HEADER |
|
||||
+------+------+------------+-----------+
|
||||
^
|
||||
|
|
||||
+-- flow dissector starts here
|
||||
|
||||
skb->data + flow_keys->nhoff point to the first byte of L3_HEADER.
|
||||
flow_keys->thoff = nhoff
|
||||
flow_keys->n_proto = ETHER_TYPE
|
||||
|
||||
|
||||
In case of VLAN, flow dissector can be called with the two different states.
|
||||
|
||||
Pre-VLAN parsing:
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
| DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
^
|
||||
|
|
||||
+-- flow dissector starts here
|
||||
|
||||
skb->data + flow_keys->nhoff point the to first byte of TCI.
|
||||
flow_keys->thoff = nhoff
|
||||
flow_keys->n_proto = TPID
|
||||
|
||||
Please note that TPID can be 802.1AD and, hence, BPF program would
|
||||
have to parse VLAN information twice for double tagged packets.
|
||||
|
||||
|
||||
Post-VLAN parsing:
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
| DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
^
|
||||
|
|
||||
+-- flow dissector starts here
|
||||
|
||||
skb->data + flow_keys->nhoff point the to first byte of L3_HEADER.
|
||||
flow_keys->thoff = nhoff
|
||||
flow_keys->n_proto = ETHER_TYPE
|
||||
|
||||
In this case VLAN information has been processed before the flow dissector
|
||||
and BPF flow dissector is not required to handle it.
|
||||
|
||||
|
||||
The takeaway here is as follows: BPF flow dissector program can be called with
|
||||
the optional VLAN header and should gracefully handle both cases: when single
|
||||
or double VLAN is present and when it is not present. The same program
|
||||
can be called for both cases and would have to be written carefully to
|
||||
handle both cases.
|
||||
|
||||
|
||||
Reference Implementation
|
||||
========================
|
||||
|
||||
See tools/testing/selftests/bpf/progs/bpf_flow.c for the reference
|
||||
implementation and tools/testing/selftests/bpf/flow_dissector_load.[hc] for
|
||||
the loader. bpftool can be used to load BPF flow dissector program as well.
|
||||
|
||||
The reference implementation is organized as follows:
|
||||
* jmp_table map that contains sub-programs for each supported L3 protocol
|
||||
* _dissect routine - entry point; it does input n_proto parsing and does
|
||||
bpf_tail_call to the appropriate L3 handler
|
||||
|
||||
Since BPF at this point doesn't support looping (or any jumping back),
|
||||
jmp_table is used instead to handle multiple levels of encapsulation (and
|
||||
IPv6 options).
|
||||
|
||||
|
||||
Current Limitations
|
||||
===================
|
||||
BPF flow dissector doesn't support exporting all the metadata that in-kernel
|
||||
C-based implementation can export. Notable example is single VLAN (802.1Q)
|
||||
and double VLAN (802.1AD) tags. Please refer to the 'struct bpf_flow_keys'
|
||||
for a set of information that's currently can be exported from the BPF context.
|
|
@ -9,6 +9,7 @@ Contents:
|
|||
netdev-FAQ
|
||||
af_xdp
|
||||
batman-adv
|
||||
bpf_flow_dissector
|
||||
can
|
||||
can_ucan_protocol
|
||||
device_drivers/freescale/dpaa2/index
|
||||
|
|
Загрузка…
Ссылка в новой задаче