bpf-next-for-netdev
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
bpf-next 2023-01-04

We've added 45 non-merge commits during the last 21 day(s) which contain
a total of 50 files changed, 1454 insertions(+), 375 deletions(-).

The main changes are:

1) Fixes, improvements and refactoring of parts of BPF verifier's
   state equivalence checks, from Andrii Nakryiko.

2) Fix a few corner cases in libbpf's BTF-to-C converter in particular
   around padding handling and enums, also from Andrii Nakryiko.

3) Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to better
   support decap on GRE tunnel devices not operating in collect metadata,
   from Christian Ehrig.

4) Improve x86 JIT's codegen for PROBE_MEM runtime error checks,
   from Dave Marchevsky.

5) Remove the need for trace_printk_lock for bpf_trace_printk
   and bpf_trace_vprintk helpers, from Jiri Olsa.

6) Add proper documentation for BPF_MAP_TYPE_SOCK{MAP,HASH} maps,
   from Maryam Tahhan.

7) Improvements in libbpf's btf_parse_elf error handling, from Changbin Du.

8) Bigger batch of improvements to BPF tracing code samples,
   from Daniel T. Lee.

9) Add LoongArch support to libbpf's bpf_tracing helper header,
   from Hengqi Chen.

10) Fix a libbpf compiler warning in perf_event_open_probe on arm32,
    from Khem Raj.

11) Optimize bpf_local_storage_elem by removing 56 bytes of padding,
    from Martin KaFai Lau.

12) Use pkg-config to locate libelf for resolve_btfids build,
    from Shen Jiamin.

13) Various libbpf improvements around API documentation and errno
    handling, from Xin Liu.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (45 commits)
  libbpf: Return -ENODATA for missing btf section
  libbpf: Add LoongArch support to bpf_tracing.h
  libbpf: Restore errno after pr_warn.
  libbpf: Added the description of some API functions
  libbpf: Fix invalid return address register in s390
  samples/bpf: Use BPF_KSYSCALL macro in syscall tracing programs
  samples/bpf: Fix tracex2 by using BPF_KSYSCALL macro
  samples/bpf: Change _kern suffix to .bpf with syscall tracing program
  samples/bpf: Use vmlinux.h instead of implicit headers in syscall tracing program
  samples/bpf: Use kyscall instead of kprobe in syscall tracing program
  bpf: rename list_head -> graph_root in field info types
  libbpf: fix errno is overwritten after being closed.
  bpf: fix regs_exact() logic in regsafe() to remap IDs correctly
  bpf: perform byte-by-byte comparison only when necessary in regsafe()
  bpf: reject non-exact register type matches in regsafe()
  bpf: generalize MAYBE_NULL vs non-MAYBE_NULL rule
  bpf: reorganize struct bpf_reg_state fields
  bpf: teach refsafe() to take into account ID remapping
  bpf: Remove unused field initialization in bpf's ctl_table
  selftests/bpf: Add jit probe_mem corner case tests to s390x denylist
  ...
====================

Link: https://lore.kernel.org/r/20230105000926.31350-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

@@ -0,0 +1,498 @@
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright Red Hat

==============================================
BPF_MAP_TYPE_SOCKMAP and BPF_MAP_TYPE_SOCKHASH
==============================================

.. note::
   - ``BPF_MAP_TYPE_SOCKMAP`` was introduced in kernel version 4.14
   - ``BPF_MAP_TYPE_SOCKHASH`` was introduced in kernel version 4.18

``BPF_MAP_TYPE_SOCKMAP`` and ``BPF_MAP_TYPE_SOCKHASH`` maps can be used to
redirect skbs between sockets or to apply policy at the socket level based on
the result of a BPF (verdict) program with the help of the BPF helpers
``bpf_sk_redirect_map()``, ``bpf_sk_redirect_hash()``,
``bpf_msg_redirect_map()`` and ``bpf_msg_redirect_hash()``.

``BPF_MAP_TYPE_SOCKMAP`` is backed by an array that uses an integer key as the
index to look up a reference to a ``struct sock``. The map values are socket
descriptors. Similarly, ``BPF_MAP_TYPE_SOCKHASH`` is a hash-backed BPF map that
holds references to sockets via their socket descriptors.

.. note::
   The value type is either __u32 or __u64; the latter (__u64) is to support
   returning socket cookies to userspace. Returning the ``struct sock *`` that
   the map holds to user-space is neither safe nor useful.

These maps may have BPF programs attached to them, specifically a parser program
and a verdict program. The parser program determines how much data has been
parsed and therefore how much data needs to be queued to come to a verdict. The
verdict program is essentially the redirect program and can return a verdict
of ``__SK_DROP``, ``__SK_PASS``, or ``__SK_REDIRECT``.

When a socket is inserted into one of these maps, its socket callbacks are
replaced and a ``struct sk_psock`` is attached to it. Additionally, this
``sk_psock`` inherits the programs that are attached to the map.

A sock object may be in multiple maps, but can only inherit a single
parse or verdict program. If adding a sock object to a map would result
in having multiple parser programs the update will return an EBUSY error.

The supported programs to attach to these maps are:

.. code-block:: c

	struct sk_psock_progs {
		struct bpf_prog	*msg_parser;
		struct bpf_prog	*stream_parser;
		struct bpf_prog	*stream_verdict;
		struct bpf_prog	*skb_verdict;
	};

.. note::
   Users are not allowed to attach ``stream_verdict`` and ``skb_verdict``
   programs to the same map.

The attach types for the map programs are:

- ``msg_parser`` program - ``BPF_SK_MSG_VERDICT``.
- ``stream_parser`` program - ``BPF_SK_SKB_STREAM_PARSER``.
- ``stream_verdict`` program - ``BPF_SK_SKB_STREAM_VERDICT``.
- ``skb_verdict`` program - ``BPF_SK_SKB_VERDICT``.
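
For example, attaching a ``skb_verdict`` program from user space goes through
``bpf_prog_attach()`` with the matching attach type (a minimal sketch;
``verdict_fd`` and ``map_fd`` are assumed to be valid file descriptors obtained
elsewhere):

.. code-block:: c

	/* A verdict-only setup does not need a parser program. */
	err = bpf_prog_attach(verdict_fd, map_fd, BPF_SK_SKB_VERDICT, 0);
	if (err)
		fprintf(stderr, "Failed to attach skb_verdict prog: %s\n", strerror(errno));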

There are additional helpers available to use with the parser and verdict
programs: ``bpf_msg_apply_bytes()`` and ``bpf_msg_cork_bytes()``. With
``bpf_msg_apply_bytes()`` BPF programs can tell the infrastructure how many
bytes the given verdict should apply to. The helper ``bpf_msg_cork_bytes()``
handles a different case where a BPF program cannot reach a verdict on a msg
until it receives more bytes AND the program doesn't want to forward the packet
until it is known to be good.

Finally, the helpers ``bpf_msg_pull_data()`` and ``bpf_msg_push_data()`` are
available to ``BPF_PROG_TYPE_SK_MSG`` BPF programs to pull in data and set the
start and end pointers to given values or to add metadata to the ``struct
sk_msg_buff *msg``.

All these helpers will be described in more detail below.

Usage
=====
Kernel BPF
----------
bpf_msg_redirect_map()
^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	long bpf_msg_redirect_map(struct sk_msg_buff *msg, struct bpf_map *map, u32 key, u64 flags)

This helper is used in programs implementing policies at the socket level. If
the message ``msg`` is allowed to pass (i.e., if the verdict BPF program
returns ``SK_PASS``), redirect it to the socket referenced by ``map`` (of type
``BPF_MAP_TYPE_SOCKMAP``) at index ``key``. Both ingress and egress interfaces
can be used for redirection. The ``BPF_F_INGRESS`` value in ``flags`` is used
to select the ingress path otherwise the egress path is selected. This is the
only flag supported for now.

Returns ``SK_PASS`` on success, or ``SK_DROP`` on error.
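
As an illustration, a minimal ``sk_msg`` verdict program built around this
helper could look as follows (a sketch; the map ``sock_map_tx`` and the fixed
index are assumptions of this example):

.. code-block:: c

	struct {
		__uint(type, BPF_MAP_TYPE_SOCKMAP);
		__uint(max_entries, 2);
		__type(key, __u32);
		__type(value, __u64);
	} sock_map_tx SEC(".maps");

	SEC("sk_msg")
	int bpf_prog_msg_verdict(struct sk_msg_md *msg)
	{
		__u32 idx = 0;

		/* Redirect the message to the socket stored at index 0,
		 * delivering it on that socket's ingress path.
		 */
		return bpf_msg_redirect_map(msg, &sock_map_tx, idx, BPF_F_INGRESS);
	}

Programs like this are of type ``BPF_PROG_TYPE_SK_MSG`` and are attached to the
map with the ``BPF_SK_MSG_VERDICT`` attach type listed above.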

bpf_sk_redirect_map()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	long bpf_sk_redirect_map(struct sk_buff *skb, struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the socket referenced by ``map`` (of type
``BPF_MAP_TYPE_SOCKMAP``) at index ``key``. Both ingress and egress interfaces
can be used for redirection. The ``BPF_F_INGRESS`` value in ``flags`` is used
to select the ingress path otherwise the egress path is selected. This is the
only flag supported for now.

Returns ``SK_PASS`` on success, or ``SK_DROP`` on error.

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Socket entries of type ``struct sock *`` can be retrieved using the
``bpf_map_lookup_elem()`` helper.

bpf_sock_map_update()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	long bpf_sock_map_update(struct bpf_sock_ops *skops, struct bpf_map *map, void *key, u64 flags)

Add an entry to, or update a ``map`` referencing sockets. The ``skops`` is used
as a new value for the entry associated to ``key``. The ``flags`` argument can
be one of the following:

- ``BPF_ANY``: Create a new element or update an existing element.
- ``BPF_NOEXIST``: Create a new element only if it did not exist.
- ``BPF_EXIST``: Update an existing element.

If the ``map`` has BPF programs (parser and verdict), those will be inherited
by the socket being added. If the socket is already attached to BPF programs,
this results in an error.

Returns 0 on success, or a negative error in case of failure.
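
For instance, a ``sockops`` program can use this helper to insert every newly
established TCP connection into a sockmap (a sketch; the map ``sock_map_rx``
from the examples below and the fixed key are assumptions):

.. code-block:: c

	SEC("sockops")
	int bpf_prog_sockops(struct bpf_sock_ops *skops)
	{
		__u32 idx = 0;

		/* Add the socket once the connection is established. */
		if (skops->op == BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB ||
		    skops->op == BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB)
			bpf_sock_map_update(skops, &sock_map_rx, &idx, BPF_ANY);

		return 0;
	}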

bpf_sock_hash_update()
^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	long bpf_sock_hash_update(struct bpf_sock_ops *skops, struct bpf_map *map, void *key, u64 flags)

Add an entry to, or update a sockhash ``map`` referencing sockets. The ``skops``
is used as a new value for the entry associated to ``key``.

The ``flags`` argument can be one of the following:

- ``BPF_ANY``: Create a new element or update an existing element.
- ``BPF_NOEXIST``: Create a new element only if it did not exist.
- ``BPF_EXIST``: Update an existing element.

If the ``map`` has BPF programs (parser and verdict), those will be inherited
by the socket being added. If the socket is already attached to BPF programs,
this results in an error.

Returns 0 on success, or a negative error in case of failure.

bpf_msg_redirect_hash()
^^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	long bpf_msg_redirect_hash(struct sk_msg_buff *msg, struct bpf_map *map, void *key, u64 flags)

This helper is used in programs implementing policies at the socket level. If
the message ``msg`` is allowed to pass (i.e., if the verdict BPF program returns
``SK_PASS``), redirect it to the socket referenced by ``map`` (of type
``BPF_MAP_TYPE_SOCKHASH``) using hash ``key``. Both ingress and egress
interfaces can be used for redirection. The ``BPF_F_INGRESS`` value in
``flags`` is used to select the ingress path otherwise the egress path is
selected. This is the only flag supported for now.

Returns ``SK_PASS`` on success, or ``SK_DROP`` on error.

bpf_sk_redirect_hash()
^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	long bpf_sk_redirect_hash(struct sk_buff *skb, struct bpf_map *map, void *key, u64 flags)

This helper is used in programs implementing policies at the skb socket level.
If the sk_buff ``skb`` is allowed to pass (i.e., if the verdict BPF program
returns ``SK_PASS``), redirect it to the socket referenced by ``map`` (of type
``BPF_MAP_TYPE_SOCKHASH``) using hash ``key``. Both ingress and egress
interfaces can be used for redirection. The ``BPF_F_INGRESS`` value in
``flags`` is used to select the ingress path otherwise the egress path is
selected. This is the only flag supported for now.

Returns ``SK_PASS`` on success, or ``SK_DROP`` on error.

bpf_msg_apply_bytes()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	long bpf_msg_apply_bytes(struct sk_msg_buff *msg, u32 bytes)

For socket policies, apply the verdict of the BPF program to the next ``bytes``
(number of bytes) of message ``msg``. For example, this helper can be used in
the following cases:

- A single ``sendmsg()`` or ``sendfile()`` system call contains multiple
  logical messages that the BPF program is supposed to read and for which it
  should apply a verdict.
- A BPF program only cares to read the first ``bytes`` of a ``msg``. If the
  message has a large payload, then setting up and calling the BPF program
  repeatedly for all bytes, even though the verdict is already known, would
  create unnecessary overhead.

Returns 0
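
For instance, a verdict program that only needs to inspect the start of a
message can make its verdict cover a larger chunk up front (a sketch; the
1024-byte figure is an arbitrary assumption):

.. code-block:: c

	SEC("sk_msg")
	int bpf_prog_apply(struct sk_msg_md *msg)
	{
		/* The SK_PASS verdict below is applied to the next 1024
		 * bytes, so the program is not re-run for every chunk of
		 * a large payload.
		 */
		bpf_msg_apply_bytes(msg, 1024);

		return SK_PASS;
	}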

bpf_msg_cork_bytes()
^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	long bpf_msg_cork_bytes(struct sk_msg_buff *msg, u32 bytes)

For socket policies, prevent the execution of the verdict BPF program for
message ``msg`` until the number of ``bytes`` have been accumulated.

This can be used when one needs a specific number of bytes before a verdict can
be assigned, even if the data spans multiple ``sendmsg()`` or ``sendfile()``
calls.

Returns 0
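
A typical corking pattern waits for a complete application-level record before
judging it (a sketch; the 512-byte record size is an assumption):

.. code-block:: c

	SEC("sk_msg")
	int bpf_prog_cork(struct sk_msg_md *msg)
	{
		/* Postpone the final verdict until 512 bytes have been
		 * accumulated, even if they span several sendmsg() calls.
		 */
		if (msg->size < 512)
			bpf_msg_cork_bytes(msg, 512);

		return SK_PASS;
	}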

bpf_msg_pull_data()
^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	long bpf_msg_pull_data(struct sk_msg_buff *msg, u32 start, u32 end, u64 flags)

For socket policies, pull in non-linear data from user space for ``msg`` and set
pointers ``msg->data`` and ``msg->data_end`` to ``start`` and ``end`` bytes
offsets into ``msg``, respectively.

If a program of type ``BPF_PROG_TYPE_SK_MSG`` is run on a ``msg`` it can only
parse data that the (``data``, ``data_end``) pointers have already consumed.
For ``sendmsg()`` hooks this is likely the first scatterlist element. But for
calls relying on the ``sendpage`` handler (e.g., ``sendfile()``) this will be
the range (**0**, **0**) because the data is shared with user space and by
default the objective is to avoid allowing user space to modify data while (or
after) BPF verdict is being decided. This helper can be used to pull in data
and to set the start and end pointers to given values. Data will be copied if
necessary (i.e., if data was not linear and if start and end pointers do not
point to the same chunk).

A call to this helper is susceptible to change the underlying packet buffer.
Therefore, at load time, all checks on pointers previously done by the verifier
are invalidated and must be performed again, if the helper is used in
combination with direct packet access.

All values for ``flags`` are reserved for future usage, and must be left at
zero.

Returns 0 on success, or a negative error in case of failure.
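
Before parsing message contents with direct packet access, a program would
typically pull the range it needs and then re-validate the data pointers
(a sketch, assuming the first four bytes carry an application header):

.. code-block:: c

	SEC("sk_msg")
	int bpf_prog_pull(struct sk_msg_md *msg)
	{
		__u32 *hdr;

		/* Make the first 4 bytes of the message linear and readable. */
		if (bpf_msg_pull_data(msg, 0, 4, 0))
			return SK_DROP;

		/* The helper may have changed the underlying buffer, so the
		 * bounds check below is mandatory before dereferencing.
		 */
		hdr = msg->data;
		if ((void *)(hdr + 1) > msg->data_end)
			return SK_DROP;

		return *hdr == 0xcafe ? SK_PASS : SK_DROP;
	}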

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Look up a socket entry in the sockmap or sockhash map.

Returns the socket entry associated to ``key``, or NULL if no entry was found.

bpf_map_update_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)

Add or update a socket entry in a sockmap or sockhash.

The ``flags`` argument can be one of the following:

- ``BPF_ANY``: Create a new element or update an existing element.
- ``BPF_NOEXIST``: Create a new element only if it did not exist.
- ``BPF_EXIST``: Update an existing element.

Returns 0 on success, or a negative error in case of failure.

bpf_map_delete_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	long bpf_map_delete_elem(struct bpf_map *map, const void *key)

Delete a socket entry from a sockmap or a sockhash.

Returns 0 on success, or a negative error in case of failure.

User space
----------
bpf_map_update_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags)

Sockmap entries can be added or updated using the ``bpf_map_update_elem()``
function. The ``key`` parameter is the index value of the sockmap array, and
the ``value`` parameter is the FD value of that socket.

Under the hood, the sockmap update function uses the socket FD value to
retrieve the associated socket and its attached psock.

The ``flags`` argument can be one of the following:

- ``BPF_ANY``: Create a new element or update an existing element.
- ``BPF_NOEXIST``: Create a new element only if it did not exist.
- ``BPF_EXIST``: Update an existing element.

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	int bpf_map_lookup_elem(int fd, const void *key, void *value)

Sockmap entries can be retrieved using the ``bpf_map_lookup_elem()`` function.

.. note::
   The entry returned is a socket cookie rather than a socket itself.

bpf_map_delete_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

	int bpf_map_delete_elem(int fd, const void *key)

Sockmap entries can be deleted using the ``bpf_map_delete_elem()`` function.

Returns 0 on success, or a negative error in case of failure.
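
For example, reading back the socket cookie stored at index 0 might look like
this (a sketch; ``map_fd`` is assumed to refer to a sockmap with ``__u64``
values):

.. code-block:: c

	__u64 cookie;
	__u32 key = 0;
	int err;

	err = bpf_map_lookup_elem(map_fd, &key, &cookie);
	if (!err)
		printf("socket cookie at index 0: %llu\n", cookie);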

Examples
========

Kernel BPF
----------
Several examples of the use of sockmap APIs can be found in:

- `tools/testing/selftests/bpf/progs/test_sockmap_kern.h`_
- `tools/testing/selftests/bpf/progs/sockmap_parse_prog.c`_
- `tools/testing/selftests/bpf/progs/sockmap_verdict_prog.c`_
- `tools/testing/selftests/bpf/progs/test_sockmap_listen.c`_
- `tools/testing/selftests/bpf/progs/test_sockmap_update.c`_

The following code snippet shows how to declare a sockmap.

.. code-block:: c

	struct {
		__uint(type, BPF_MAP_TYPE_SOCKMAP);
		__uint(max_entries, 1);
		__type(key, __u32);
		__type(value, __u64);
	} sock_map_rx SEC(".maps");

The following code snippet shows a sample parser program.

.. code-block:: c

	SEC("sk_skb/stream_parser")
	int bpf_prog_parser(struct __sk_buff *skb)
	{
		return skb->len;
	}

The following code snippet shows a simple verdict program that interacts with a
sockmap to redirect traffic to another socket based on the local port.

.. code-block:: c

	SEC("sk_skb/stream_verdict")
	int bpf_prog_verdict(struct __sk_buff *skb)
	{
		__u32 lport = skb->local_port;
		__u32 idx = 0;

		if (lport == 10000)
			return bpf_sk_redirect_map(skb, &sock_map_rx, idx, 0);

		return SK_PASS;
	}

The following code snippet shows how to declare a sockhash map.

.. code-block:: c

	struct socket_key {
		__u32 src_ip;
		__u32 dst_ip;
		__u32 src_port;
		__u32 dst_port;
	};

	struct {
		__uint(type, BPF_MAP_TYPE_SOCKHASH);
		__uint(max_entries, 1);
		__type(key, struct socket_key);
		__type(value, __u64);
	} sock_hash_rx SEC(".maps");

The following code snippet shows a simple verdict program that interacts with a
sockhash to redirect traffic to another socket based on a hash of some of the
skb parameters.

.. code-block:: c

	static inline
	void extract_socket_key(struct __sk_buff *skb, struct socket_key *key)
	{
		key->src_ip = skb->remote_ip4;
		key->dst_ip = skb->local_ip4;
		key->src_port = skb->remote_port >> 16;
		key->dst_port = (bpf_htonl(skb->local_port)) >> 16;
	}

	SEC("sk_skb/stream_verdict")
	int bpf_prog_verdict(struct __sk_buff *skb)
	{
		struct socket_key key;

		extract_socket_key(skb, &key);

		return bpf_sk_redirect_hash(skb, &sock_hash_rx, &key, 0);
	}

User space
----------
Several examples of the use of sockmap APIs can be found in:

- `tools/testing/selftests/bpf/prog_tests/sockmap_basic.c`_
- `tools/testing/selftests/bpf/test_sockmap.c`_
- `tools/testing/selftests/bpf/test_maps.c`_

The following code sample shows how to create a sockmap, attach a parser and
verdict program, as well as add a socket entry.

.. code-block:: c

	int create_sample_sockmap(int sock, int parse_prog_fd, int verdict_prog_fd)
	{
		int index = 0;
		int map, err;

		map = bpf_map_create(BPF_MAP_TYPE_SOCKMAP, NULL, sizeof(int), sizeof(int), 1, NULL);
		if (map < 0) {
			fprintf(stderr, "Failed to create sockmap: %s\n", strerror(errno));
			return -1;
		}

		err = bpf_prog_attach(parse_prog_fd, map, BPF_SK_SKB_STREAM_PARSER, 0);
		if (err) {
			fprintf(stderr, "Failed to attach parser prog to map: %s\n", strerror(errno));
			goto out;
		}

		err = bpf_prog_attach(verdict_prog_fd, map, BPF_SK_SKB_STREAM_VERDICT, 0);
		if (err) {
			fprintf(stderr, "Failed to attach verdict prog to map: %s\n", strerror(errno));
			goto out;
		}

		err = bpf_map_update_elem(map, &index, &sock, BPF_NOEXIST);
		if (err) {
			fprintf(stderr, "Failed to update sockmap: %s\n", strerror(errno));
			goto out;
		}

	out:
		close(map);
		return err;
	}

References
==========

- https://github.com/jrfastab/linux-kernel-xdp/commit/c89fd73cb9d2d7f3c716c3e00836f07b1aeb261f
- https://lwn.net/Articles/731133/
- http://vger.kernel.org/lpc_net2018_talks/ktls_bpf_paper.pdf
- https://lwn.net/Articles/748628/
- https://lore.kernel.org/bpf/20200218171023.844439-7-jakub@cloudflare.com/

.. _`tools/testing/selftests/bpf/progs/test_sockmap_kern.h`: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/progs/test_sockmap_kern.h
.. _`tools/testing/selftests/bpf/progs/sockmap_parse_prog.c`: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/progs/sockmap_parse_prog.c
.. _`tools/testing/selftests/bpf/progs/sockmap_verdict_prog.c`: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/progs/sockmap_verdict_prog.c
.. _`tools/testing/selftests/bpf/prog_tests/sockmap_basic.c`: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
.. _`tools/testing/selftests/bpf/test_sockmap.c`: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/test_sockmap.c
.. _`tools/testing/selftests/bpf/test_maps.c`: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/test_maps.c
.. _`tools/testing/selftests/bpf/progs/test_sockmap_listen.c`: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
.. _`tools/testing/selftests/bpf/progs/test_sockmap_update.c`: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/progs/test_sockmap_update.c

@@ -1003,6 +1003,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 	u8 b2 = 0, b3 = 0;
 	u8 *start_of_ldx;
 	s64 jmp_offset;
+	s16 insn_off;
 	u8 jmp_cond;
 	u8 *func;
 	int nops;

@@ -1369,57 +1370,52 @@ st: if (is_imm8(insn->off))
 		case BPF_LDX | BPF_PROBE_MEM | BPF_W:
 		case BPF_LDX | BPF_MEM | BPF_DW:
 		case BPF_LDX | BPF_PROBE_MEM | BPF_DW:
-			if (BPF_MODE(insn->code) == BPF_PROBE_MEM) {
-				/* Though the verifier prevents negative insn->off in BPF_PROBE_MEM
-				 * add abs(insn->off) to the limit to make sure that negative
-				 * offset won't be an issue.
-				 * insn->off is s16, so it won't affect valid pointers.
-				 */
-				u64 limit = TASK_SIZE_MAX + PAGE_SIZE + abs(insn->off);
-				u8 *end_of_jmp1, *end_of_jmp2;
+			insn_off = insn->off;
 
+			if (BPF_MODE(insn->code) == BPF_PROBE_MEM) {
 				/* Conservatively check that src_reg + insn->off is a kernel address:
-				 * 1. src_reg + insn->off >= limit
-				 * 2. src_reg + insn->off doesn't become small positive.
-				 * Cannot do src_reg + insn->off >= limit in one branch,
-				 * since it needs two spare registers, but JIT has only one.
+				 *   src_reg + insn->off >= TASK_SIZE_MAX + PAGE_SIZE
+				 * src_reg is used as scratch for src_reg += insn->off and restored
+				 * after emit_ldx if necessary
 				 */
+
+				u64 limit = TASK_SIZE_MAX + PAGE_SIZE;
+				u8 *end_of_jmp;
+
+				/* At end of these emitted checks, insn->off will have been added
+				 * to src_reg, so no need to do relative load with insn->off offset
+				 */
+				insn_off = 0;
 
 				/* movabsq r11, limit */
 				EMIT2(add_1mod(0x48, AUX_REG), add_1reg(0xB8, AUX_REG));
 				EMIT((u32)limit, 4);
 				EMIT(limit >> 32, 4);
+
+				if (insn->off) {
+					/* add src_reg, insn->off */
+					maybe_emit_1mod(&prog, src_reg, true);
+					EMIT2_off32(0x81, add_1reg(0xC0, src_reg), insn->off);
+				}
+
 				/* cmp src_reg, r11 */
 				maybe_emit_mod(&prog, src_reg, AUX_REG, true);
 				EMIT2(0x39, add_2reg(0xC0, src_reg, AUX_REG));
-				/* if unsigned '<' goto end_of_jmp2 */
-				EMIT2(X86_JB, 0);
-				end_of_jmp1 = prog;
-
-				/* mov r11, src_reg */
-				emit_mov_reg(&prog, true, AUX_REG, src_reg);
-				/* add r11, insn->off */
-				maybe_emit_1mod(&prog, AUX_REG, true);
-				EMIT2_off32(0x81, add_1reg(0xC0, AUX_REG), insn->off);
-				/* jmp if not carry to start_of_ldx
-				 * Otherwise ERR_PTR(-EINVAL) + 128 will be the user addr
-				 * that has to be rejected.
-				 */
-				EMIT2(0x73 /* JNC */, 0);
-				end_of_jmp2 = prog;
+
+				/* if unsigned '>=', goto load */
+				EMIT2(X86_JAE, 0);
+				end_of_jmp = prog;
 
 				/* xor dst_reg, dst_reg */
 				emit_mov_imm32(&prog, false, dst_reg, 0);
 				/* jmp byte_after_ldx */
 				EMIT2(0xEB, 0);
 
-				/* populate jmp_offset for JB above to jump to xor dst_reg */
-				end_of_jmp1[-1] = end_of_jmp2 - end_of_jmp1;
-				/* populate jmp_offset for JNC above to jump to start_of_ldx */
+				/* populate jmp_offset for JAE above to jump to start_of_ldx */
 				start_of_ldx = prog;
-				end_of_jmp2[-1] = start_of_ldx - end_of_jmp2;
+				end_of_jmp[-1] = start_of_ldx - end_of_jmp;
 			}
-			emit_ldx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
+			emit_ldx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn_off);
 			if (BPF_MODE(insn->code) == BPF_PROBE_MEM) {
 				struct exception_table_entry *ex;
 				u8 *_insn = image + proglen + (start_of_ldx - temp);

@@ -1428,6 +1424,18 @@ st: if (is_imm8(insn->off))
 				/* populate jmp_offset for JMP above */
 				start_of_ldx[-1] = prog - start_of_ldx;
 
+				if (insn->off && src_reg != dst_reg) {
+					/* sub src_reg, insn->off
+					 * Restore src_reg after "add src_reg, insn->off" in prev
+					 * if statement. But if src_reg == dst_reg, emit_ldx
+					 * above already clobbered src_reg, so no need to restore.
+					 * If add src_reg, insn->off was unnecessary, no need to
+					 * restore either.
+					 */
+					maybe_emit_1mod(&prog, src_reg, true);
+					EMIT2_off32(0x81, add_1reg(0xE8, src_reg), insn->off);
+				}
+
 				if (!bpf_prog->aux->extable)
 					break;

@@ -189,7 +189,7 @@ struct btf_field_kptr {
 	u32 btf_id;
 };
 
-struct btf_field_list_head {
+struct btf_field_graph_root {
 	struct btf *btf;
 	u32 value_btf_id;
 	u32 node_offset;

@@ -201,7 +201,7 @@ struct btf_field {
 	enum btf_field_type type;
 	union {
 		struct btf_field_kptr kptr;
-		struct btf_field_list_head list_head;
+		struct btf_field_graph_root graph_root;
 	};
 };

@@ -2795,10 +2795,18 @@ struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
 
 #define MAX_BPRINTF_VARARGS		12
+#define MAX_BPRINTF_BUF			1024
+
+struct bpf_bprintf_data {
+	u32 *bin_args;
+	char *buf;
+	bool get_bin_args;
+	bool get_buf;
+};
 
 int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
-			u32 **bin_buf, u32 num_args);
-void bpf_bprintf_cleanup(void);
+			u32 num_args, struct bpf_bprintf_data *data);
+void bpf_bprintf_cleanup(struct bpf_bprintf_data *data);
 
 /* the implementation of the opaque uapi struct bpf_dynptr */
 struct bpf_dynptr_kern {

@@ -92,6 +92,26 @@ struct bpf_reg_state {
 
 		u32 subprogno; /* for PTR_TO_FUNC */
 	};
+	/* For scalar types (SCALAR_VALUE), this represents our knowledge of
+	 * the actual value.
+	 * For pointer types, this represents the variable part of the offset
+	 * from the pointed-to object, and is shared with all bpf_reg_states
+	 * with the same id as us.
+	 */
+	struct tnum var_off;
+	/* Used to determine if any memory access using this register will
+	 * result in a bad access.
+	 * These refer to the same value as var_off, not necessarily the actual
+	 * contents of the register.
+	 */
+	s64 smin_value; /* minimum possible (s64)value */
+	s64 smax_value; /* maximum possible (s64)value */
+	u64 umin_value; /* minimum possible (u64)value */
+	u64 umax_value; /* maximum possible (u64)value */
+	s32 s32_min_value; /* minimum possible (s32)value */
+	s32 s32_max_value; /* maximum possible (s32)value */
+	u32 u32_min_value; /* minimum possible (u32)value */
+	u32 u32_max_value; /* maximum possible (u32)value */
 	/* For PTR_TO_PACKET, used to find other pointers with the same variable
 	 * offset, so they can share range knowledge.
 	 * For PTR_TO_MAP_VALUE_OR_NULL this is used to share which map value we

@@ -144,26 +164,6 @@ struct bpf_reg_state {
 	 * allowed and has the same effect as bpf_sk_release(sk).
 	 */
 	u32 ref_obj_id;
-	/* For scalar types (SCALAR_VALUE), this represents our knowledge of
-	 * the actual value.
-	 * For pointer types, this represents the variable part of the offset
-	 * from the pointed-to object, and is shared with all bpf_reg_states
-	 * with the same id as us.
-	 */
-	struct tnum var_off;
-	/* Used to determine if any memory access using this register will
-	 * result in a bad access.
-	 * These refer to the same value as var_off, not necessarily the actual
-	 * contents of the register.
-	 */
-	s64 smin_value; /* minimum possible (s64)value */
-	s64 smax_value; /* maximum possible (s64)value */
-	u64 umin_value; /* minimum possible (u64)value */
-	u64 umax_value; /* maximum possible (u64)value */
-	s32 s32_min_value; /* minimum possible (s32)value */
-	s32 s32_max_value; /* maximum possible (s32)value */
-	u32 u32_min_value; /* minimum possible (u32)value */
-	u32 u32_max_value; /* maximum possible (u32)value */
 	/* parentage chain for liveness checking */
 	struct bpf_reg_state *parent;
 	/* Inside the callee two registers can be both PTR_TO_STACK like

@@ -2001,6 +2001,9 @@ union bpf_attr {
  *		sending the packet. This flag was added for GRE
  *		encapsulation, but might be used with other protocols
  *		as well in the future.
+ *		**BPF_F_NO_TUNNEL_KEY**
+ *			Add a flag to tunnel metadata indicating that no tunnel
+ *			key should be set in the resulting tunnel header.
  *
  *		Here is a typical usage on the transmit path:
  *

@@ -5764,6 +5767,7 @@ enum {
 	BPF_F_ZERO_CSUM_TX	= (1ULL << 1),
 	BPF_F_DONT_FRAGMENT	= (1ULL << 2),
 	BPF_F_SEQ_NUMBER	= (1ULL << 3),
+	BPF_F_NO_TUNNEL_KEY	= (1ULL << 4),
 };
 
 /* BPF_FUNC_skb_get_tunnel_key flags. */

@@ -580,8 +580,8 @@ static struct bpf_local_storage_map *__bpf_local_storage_map_alloc(union bpf_att
 		raw_spin_lock_init(&smap->buckets[i].lock);
 	}
 
-	smap->elem_size =
-		sizeof(struct bpf_local_storage_elem) + attr->value_size;
+	smap->elem_size = offsetof(struct bpf_local_storage_elem,
+				   sdata.data[attr->value_size]);
 
 	return smap;
 }

@@ -3228,7 +3228,7 @@ struct btf_field_info {
 		struct {
 			const char *node_name;
 			u32 value_btf_id;
-		} list_head;
+		} graph_root;
 	};
 };

@@ -3335,8 +3335,8 @@ static int btf_find_list_head(const struct btf *btf, const struct btf_type *pt,
 		return -EINVAL;
 	info->type = BPF_LIST_HEAD;
 	info->off = off;
-	info->list_head.value_btf_id = id;
-	info->list_head.node_name = list_node;
+	info->graph_root.value_btf_id = id;
+	info->graph_root.node_name = list_node;
 	return BTF_FIELD_FOUND;
 }

@@ -3604,13 +3604,14 @@ static int btf_parse_list_head(const struct btf *btf, struct btf_field *field,
 	u32 offset;
 	int i;
 
-	t = btf_type_by_id(btf, info->list_head.value_btf_id);
+	t = btf_type_by_id(btf, info->graph_root.value_btf_id);
 	/* We've already checked that value_btf_id is a struct type. We
 	 * just need to figure out the offset of the list_node, and
 	 * verify its type.
 	 */
 	for_each_member(i, t, member) {
-		if (strcmp(info->list_head.node_name, __btf_name_by_offset(btf, member->name_off)))
+		if (strcmp(info->graph_root.node_name,
+			   __btf_name_by_offset(btf, member->name_off)))
 			continue;
 		/* Invalid BTF, two members with same name */
 		if (n)

@@ -3627,9 +3628,9 @@ static int btf_parse_list_head(const struct btf *btf, struct btf_field *field,
 		if (offset % __alignof__(struct bpf_list_node))
 			return -EINVAL;
 
-		field->list_head.btf = (struct btf *)btf;
-		field->list_head.value_btf_id = info->list_head.value_btf_id;
-		field->list_head.node_offset = offset;
+		field->graph_root.btf = (struct btf *)btf;
+		field->graph_root.value_btf_id = info->graph_root.value_btf_id;
+		field->graph_root.node_offset = offset;
 	}
 	if (!n)
 		return -ENOENT;

@@ -3736,11 +3737,11 @@ int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec)
 
 		if (!(rec->fields[i].type & BPF_LIST_HEAD))
 			continue;
-		btf_id = rec->fields[i].list_head.value_btf_id;
+		btf_id = rec->fields[i].graph_root.value_btf_id;
 		meta = btf_find_struct_meta(btf, btf_id);
 		if (!meta)
 			return -EFAULT;
-		rec->fields[i].list_head.value_rec = meta->record;
+		rec->fields[i].graph_root.value_rec = meta->record;
 
 		if (!(rec->field_mask & BPF_LIST_NODE))
 			continue;

@@ -756,19 +756,20 @@ static int bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
 /* Per-cpu temp buffers used by printf-like helpers to store the bprintf binary
  * arguments representation.
  */
-#define MAX_BPRINTF_BUF_LEN	512
+#define MAX_BPRINTF_BIN_ARGS	512
 
 /* Support executing three nested bprintf helper calls on a given CPU */
 #define MAX_BPRINTF_NEST_LEVEL	3
 struct bpf_bprintf_buffers {
-	char tmp_bufs[MAX_BPRINTF_NEST_LEVEL][MAX_BPRINTF_BUF_LEN];
+	char bin_args[MAX_BPRINTF_BIN_ARGS];
+	char buf[MAX_BPRINTF_BUF];
 };
-static DEFINE_PER_CPU(struct bpf_bprintf_buffers, bpf_bprintf_bufs);
+
+static DEFINE_PER_CPU(struct bpf_bprintf_buffers[MAX_BPRINTF_NEST_LEVEL], bpf_bprintf_bufs);
 static DEFINE_PER_CPU(int, bpf_bprintf_nest_level);
 
-static int try_get_fmt_tmp_buf(char **tmp_buf)
+static int try_get_buffers(struct bpf_bprintf_buffers **bufs)
 {
-	struct bpf_bprintf_buffers *bufs;
 	int nest_level;
 
 	preempt_disable();
|
|||
preempt_enable();
|
||||
return -EBUSY;
|
||||
}
|
||||
bufs = this_cpu_ptr(&bpf_bprintf_bufs);
|
||||
*tmp_buf = bufs->tmp_bufs[nest_level - 1];
|
||||
*bufs = this_cpu_ptr(&bpf_bprintf_bufs[nest_level - 1]);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
void bpf_bprintf_cleanup(void)
|
||||
void bpf_bprintf_cleanup(struct bpf_bprintf_data *data)
|
||||
{
|
||||
if (this_cpu_read(bpf_bprintf_nest_level)) {
|
||||
this_cpu_dec(bpf_bprintf_nest_level);
|
||||
preempt_enable();
|
||||
}
|
||||
if (!data->bin_args && !data->buf)
|
||||
return;
|
||||
if (WARN_ON_ONCE(this_cpu_read(bpf_bprintf_nest_level) == 0))
|
||||
return;
|
||||
this_cpu_dec(bpf_bprintf_nest_level);
|
||||
preempt_enable();
|
||||
}
|
||||
|
||||
/*
|
||||
|
@ -798,18 +800,20 @@ void bpf_bprintf_cleanup(void)
|
|||
* Returns a negative value if fmt is an invalid format string or 0 otherwise.
|
||||
*
|
||||
* This can be used in two ways:
|
||||
* - Format string verification only: when bin_args is NULL
|
||||
* - Format string verification only: when data->get_bin_args is false
|
||||
* - Arguments preparation: in addition to the above verification, it writes in
|
||||
* bin_args a binary representation of arguments usable by bstr_printf where
|
||||
* pointers from BPF have been sanitized.
|
||||
* data->bin_args a binary representation of arguments usable by bstr_printf
|
||||
* where pointers from BPF have been sanitized.
|
||||
*
|
||||
* In argument preparation mode, if 0 is returned, safe temporary buffers are
|
||||
* allocated and bpf_bprintf_cleanup should be called to free them after use.
|
||||
*/
|
||||
int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
|
||||
u32 **bin_args, u32 num_args)
|
||||
u32 num_args, struct bpf_bprintf_data *data)
|
||||
{
|
||||
bool get_buffers = (data->get_bin_args && num_args) || data->get_buf;
|
||||
char *unsafe_ptr = NULL, *tmp_buf = NULL, *tmp_buf_end, *fmt_end;
|
||||
struct bpf_bprintf_buffers *buffers = NULL;
|
||||
size_t sizeof_cur_arg, sizeof_cur_ip;
|
||||
int err, i, num_spec = 0;
|
||||
u64 cur_arg;
|
||||
|

@@ -820,14 +824,19 @@ int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
 		return -EINVAL;
 	fmt_size = fmt_end - fmt;
 
-	if (bin_args) {
-		if (num_args && try_get_fmt_tmp_buf(&tmp_buf))
-			return -EBUSY;
+	if (get_buffers && try_get_buffers(&buffers))
+		return -EBUSY;
 
-		tmp_buf_end = tmp_buf + MAX_BPRINTF_BUF_LEN;
-		*bin_args = (u32 *)tmp_buf;
+	if (data->get_bin_args) {
+		if (num_args)
+			tmp_buf = buffers->bin_args;
+		tmp_buf_end = tmp_buf + MAX_BPRINTF_BIN_ARGS;
+		data->bin_args = (u32 *)tmp_buf;
 	}
 
+	if (data->get_buf)
+		data->buf = buffers->buf;
+
 	for (i = 0; i < fmt_size; i++) {
 		if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) {
 			err = -EINVAL;

@@ -1021,31 +1030,33 @@ nocopy_fmt:
 	err = 0;
 out:
 	if (err)
-		bpf_bprintf_cleanup();
+		bpf_bprintf_cleanup(data);
 	return err;
 }
 
 BPF_CALL_5(bpf_snprintf, char *, str, u32, str_size, char *, fmt,
-	   const void *, data, u32, data_len)
+	   const void *, args, u32, data_len)
 {
+	struct bpf_bprintf_data data = {
+		.get_bin_args = true,
+	};
 	int err, num_args;
-	u32 *bin_args;
 
 	if (data_len % 8 || data_len > MAX_BPRINTF_VARARGS * 8 ||
-	    (data_len && !data))
+	    (data_len && !args))
 		return -EINVAL;
 	num_args = data_len / 8;
 
 	/* ARG_PTR_TO_CONST_STR guarantees that fmt is zero-terminated so we
 	 * can safely give an unbounded size.
 	 */
-	err = bpf_bprintf_prepare(fmt, UINT_MAX, data, &bin_args, num_args);
+	err = bpf_bprintf_prepare(fmt, UINT_MAX, args, num_args, &data);
 	if (err < 0)
 		return err;
 
-	err = bstr_printf(str, str_size, fmt, bin_args);
+	err = bstr_printf(str, str_size, fmt, data.bin_args);
 
-	bpf_bprintf_cleanup();
+	bpf_bprintf_cleanup(&data);
 
 	return err + 1;
 }

@@ -1745,12 +1756,12 @@ unlock:
 	while (head != orig_head) {
 		void *obj = head;
 
-		obj -= field->list_head.node_offset;
+		obj -= field->graph_root.node_offset;
 		head = head->next;
 		/* The contained type can also have resources, including a
 		 * bpf_list_head which needs to be freed.
 		 */
-		bpf_obj_free_fields(field->list_head.value_rec, obj);
+		bpf_obj_free_fields(field->graph_root.value_rec, obj);
 		/* bpf_mem_free requires migrate_disable(), since we can be
 		 * called from map free path as well apart from BPF program (as
 		 * part of map ops doing bpf_obj_free_fields).

@@ -5319,7 +5319,6 @@ static struct ctl_table bpf_syscall_table[] = {
 	{
 		.procname	= "bpf_stats_enabled",
-		.data		= &bpf_stats_enabled_key.key,
 		.maxlen		= sizeof(bpf_stats_enabled_key),
 		.mode		= 0644,
 		.proc_handler	= bpf_stats_handler,
 	},

@@ -1402,9 +1402,11 @@ static void ___mark_reg_known(struct bpf_reg_state *reg, u64 imm)
  */
 static void __mark_reg_known(struct bpf_reg_state *reg, u64 imm)
 {
-	/* Clear id, off, and union(map_ptr, range) */
+	/* Clear off and union(map_ptr, range) */
 	memset(((u8 *)reg) + sizeof(reg->type), 0,
 	       offsetof(struct bpf_reg_state, var_off) - sizeof(reg->type));
+	reg->id = 0;
+	reg->ref_obj_id = 0;
 	___mark_reg_known(reg, imm);
 }

@@ -1750,11 +1752,13 @@ static void __mark_reg_unknown(const struct bpf_verifier_env *env,
 			       struct bpf_reg_state *reg)
 {
 	/*
-	 * Clear type, id, off, and union(map_ptr, range) and
+	 * Clear type, off, and union(map_ptr, range) and
 	 * padding between 'type' and union
 	 */
 	memset(reg, 0, offsetof(struct bpf_reg_state, var_off));
 	reg->type = SCALAR_VALUE;
+	reg->id = 0;
+	reg->ref_obj_id = 0;
 	reg->var_off = tnum_unknown;
 	reg->frameno = 0;
 	reg->precise = !env->bpf_capable;

@@ -7612,6 +7616,7 @@ static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
 	struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
 	struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
 	struct bpf_map *fmt_map = fmt_reg->map_ptr;
+	struct bpf_bprintf_data data = {};
 	int err, fmt_map_off, num_args;
 	u64 fmt_addr;
 	char *fmt;

@@ -7636,7 +7641,7 @@ static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
 	/* We are also guaranteed that fmt+fmt_map_off is NULL terminated, we
 	 * can focus on validating the format specifiers.
 	 */
-	err = bpf_bprintf_prepare(fmt, UINT_MAX, NULL, NULL, num_args);
+	err = bpf_bprintf_prepare(fmt, UINT_MAX, NULL, num_args, &data);
 	if (err < 0)
 		verbose(env, "Invalid format string\n");

@@ -8771,21 +8776,22 @@ static int process_kf_arg_ptr_to_list_node(struct bpf_verifier_env *env,
 
 	field = meta->arg_list_head.field;
 
-	et = btf_type_by_id(field->list_head.btf, field->list_head.value_btf_id);
+	et = btf_type_by_id(field->graph_root.btf, field->graph_root.value_btf_id);
 	t = btf_type_by_id(reg->btf, reg->btf_id);
-	if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, 0, field->list_head.btf,
-				  field->list_head.value_btf_id, true)) {
+	if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, 0, field->graph_root.btf,
+				  field->graph_root.value_btf_id, true)) {
 		verbose(env, "operation on bpf_list_head expects arg#1 bpf_list_node at offset=%d "
 			"in struct %s, but arg is at offset=%d in struct %s\n",
-			field->list_head.node_offset, btf_name_by_offset(field->list_head.btf, et->name_off),
+			field->graph_root.node_offset,
+			btf_name_by_offset(field->graph_root.btf, et->name_off),
 			list_node_off, btf_name_by_offset(reg->btf, t->name_off));
 		return -EINVAL;
 	}
 
-	if (list_node_off != field->list_head.node_offset) {
+	if (list_node_off != field->graph_root.node_offset) {
 		verbose(env, "arg#1 offset=%d, but expected bpf_list_node at offset=%d in struct %s\n",
-			list_node_off, field->list_head.node_offset,
-			btf_name_by_offset(field->list_head.btf, et->name_off));
+			list_node_off, field->graph_root.node_offset,
+			btf_name_by_offset(field->graph_root.btf, et->name_off));
 		return -EINVAL;
 	}
 	/* Set arg#1 for expiration after unlock */

@@ -9227,9 +9233,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 
 			mark_reg_known_zero(env, regs, BPF_REG_0);
 			regs[BPF_REG_0].type = PTR_TO_BTF_ID | MEM_ALLOC;
-			regs[BPF_REG_0].btf = field->list_head.btf;
-			regs[BPF_REG_0].btf_id = field->list_head.value_btf_id;
-			regs[BPF_REG_0].off = field->list_head.node_offset;
+			regs[BPF_REG_0].btf = field->graph_root.btf;
+			regs[BPF_REG_0].btf_id = field->graph_root.value_btf_id;
+			regs[BPF_REG_0].off = field->graph_root.node_offset;
 		} else if (meta.func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) {
 			mark_reg_known_zero(env, regs, BPF_REG_0);
 			regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_TRUSTED;

@@ -12941,6 +12947,13 @@ static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
 {
 	unsigned int i;
 
+	/* either both IDs should be set or both should be zero */
+	if (!!old_id != !!cur_id)
+		return false;
+
+	if (old_id == 0) /* cur_id == 0 as well */
+		return true;
+
 	for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
 		if (!idmap[i].old) {
 			/* Reached an empty slot; haven't seen this id before */

@@ -13052,79 +13065,74 @@ next:
 	}
 }
 
+static bool regs_exact(const struct bpf_reg_state *rold,
+		       const struct bpf_reg_state *rcur,
+		       struct bpf_id_pair *idmap)
+{
+	return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
+	       check_ids(rold->id, rcur->id, idmap) &&
+	       check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap);
+}
+
 /* Returns true if (rold safe implies rcur safe) */
 static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 		    struct bpf_reg_state *rcur, struct bpf_id_pair *idmap)
 {
-	bool equal;
-
 	if (!(rold->live & REG_LIVE_READ))
 		/* explored state didn't use this */
 		return true;
-
-	equal = memcmp(rold, rcur, offsetof(struct bpf_reg_state, parent)) == 0;
-
 	if (rold->type == NOT_INIT)
 		/* explored state can't have used this */
 		return true;
 	if (rcur->type == NOT_INIT)
 		return false;
+
+	/* Enforce that register types have to match exactly, including their
+	 * modifiers (like PTR_MAYBE_NULL, MEM_RDONLY, etc), as a general
+	 * rule.
+	 *
+	 * One can make a point that using a pointer register as unbounded
+	 * SCALAR would be technically acceptable, but this could lead to
+	 * pointer leaks because scalars are allowed to leak while pointers
+	 * are not. We could make this safe in special cases if root is
+	 * calling us, but it's probably not worth the hassle.
+	 *
+	 * Also, register types that are *not* MAYBE_NULL could technically be
+	 * safe to use as their MAYBE_NULL variants (e.g., PTR_TO_MAP_VALUE
+	 * is safe to be used as PTR_TO_MAP_VALUE_OR_NULL, provided both point
+	 * to the same map).
+	 * However, if the old MAYBE_NULL register then got NULL checked,
+	 * doing so could have affected others with the same id, and we can't
+	 * check for that because we lost the id when we converted to
+	 * a non-MAYBE_NULL variant.
+	 * So, as a general rule we don't allow mixing MAYBE_NULL and
+	 * non-MAYBE_NULL registers as well.
+	 */
+	if (rold->type != rcur->type)
+		return false;
+
 	switch (base_type(rold->type)) {
 	case SCALAR_VALUE:
-		if (equal)
+		if (regs_exact(rold, rcur, idmap))
 			return true;
 		if (env->explore_alu_limits)
 			return false;
-		if (rcur->type == SCALAR_VALUE) {
-			if (!rold->precise)
-				return true;
-			/* new val must satisfy old val knowledge */
-			return range_within(rold, rcur) &&
-			       tnum_in(rold->var_off, rcur->var_off);
-		} else {
-			/* We're trying to use a pointer in place of a scalar.
-			 * Even if the scalar was unbounded, this could lead to
-			 * pointer leaks because scalars are allowed to leak
-			 * while pointers are not. We could make this safe in
-			 * special cases if root is calling us, but it's
-			 * probably not worth the hassle.
-			 */
-			return false;
-		}
+		if (!rold->precise)
+			return true;
+		/* new val must satisfy old val knowledge */
+		return range_within(rold, rcur) &&
+		       tnum_in(rold->var_off, rcur->var_off);
 	case PTR_TO_MAP_KEY:
 	case PTR_TO_MAP_VALUE:
-		/* a PTR_TO_MAP_VALUE could be safe to use as a
-		 * PTR_TO_MAP_VALUE_OR_NULL into the same map.
-		 * However, if the old PTR_TO_MAP_VALUE_OR_NULL then got NULL-
-		 * checked, doing so could have affected others with the same
-		 * id, and we can't check for that because we lost the id when
-		 * we converted to a PTR_TO_MAP_VALUE.
-		 */
-		if (type_may_be_null(rold->type)) {
-			if (!type_may_be_null(rcur->type))
-				return false;
-			if (memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)))
-				return false;
-			/* Check our ids match any regs they're supposed to */
-			return check_ids(rold->id, rcur->id, idmap);
-		}
-
 		/* If the new min/max/var_off satisfy the old ones and
 		 * everything else matches, we are OK.
 		 * 'id' is not compared, since it's only used for maps with
 		 * bpf_spin_lock inside map element and in such cases if
 		 * the rest of the prog is valid for one map element then
 		 * it's valid for all map elements regardless of the key
 		 * used in bpf_map_lookup()
 		 */
-		return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
+		return memcmp(rold, rcur, offsetof(struct bpf_reg_state, var_off)) == 0 &&
 		       range_within(rold, rcur) &&
-		       tnum_in(rold->var_off, rcur->var_off);
+		       tnum_in(rold->var_off, rcur->var_off) &&
+		       check_ids(rold->id, rcur->id, idmap);
 	case PTR_TO_PACKET_META:
 	case PTR_TO_PACKET:
-		if (rcur->type != rold->type)
-			return false;
 		/* We must have at least as much range as the old ptr
 		 * did, so that any accesses which were safe before are
 		 * still safe. This is true even if old range < old off,

@@ -13139,7 +13147,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 		if (rold->off != rcur->off)
 			return false;
 		/* id relations must be preserved */
-		if (rold->id && !check_ids(rold->id, rcur->id, idmap))
+		if (!check_ids(rold->id, rcur->id, idmap))
 			return false;
 		/* new val must satisfy old val knowledge */
 		return range_within(rold, rcur) &&

@@ -13148,15 +13156,10 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 		/* two stack pointers are equal only if they're pointing to
 		 * the same stack frame, since fp-8 in foo != fp-8 in bar
 		 */
-		return equal && rold->frameno == rcur->frameno;
+		return regs_exact(rold, rcur, idmap) && rold->frameno == rcur->frameno;
 	default:
-		/* Only valid matches are exact, which memcmp() */
-		return equal;
+		return regs_exact(rold, rcur, idmap);
 	}
-
-	/* Shouldn't get here; if we do, say it's not safe */
-	WARN_ON_ONCE(1);
-	return false;
 }
 
 static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,

@@ -13222,12 +13225,20 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 	return true;
 }
 
-static bool refsafe(struct bpf_func_state *old, struct bpf_func_state *cur)
+static bool refsafe(struct bpf_func_state *old, struct bpf_func_state *cur,
+		    struct bpf_id_pair *idmap)
 {
+	int i;
+
 	if (old->acquired_refs != cur->acquired_refs)
 		return false;
-	return !memcmp(old->refs, cur->refs,
-		       sizeof(*old->refs) * old->acquired_refs);
+
+	for (i = 0; i < old->acquired_refs; i++) {
+		if (!check_ids(old->refs[i].id, cur->refs[i].id, idmap))
+			return false;
+	}
+
+	return true;
 }
 
 /* compare two verifier states

@@ -13269,7 +13280,7 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
 	if (!stacksafe(env, old, cur, env->idmap_scratch))
 		return false;
 
-	if (!refsafe(old, cur))
+	if (!refsafe(old, cur, env->idmap_scratch))
 		return false;
 
 	return true;

@@ -369,8 +369,6 @@ static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
 	return &bpf_probe_write_user_proto;
 }
 
-static DEFINE_RAW_SPINLOCK(trace_printk_lock);
-
 #define MAX_TRACE_PRINTK_VARARGS	3
 #define BPF_TRACE_PRINTK_SIZE		1024
 

@@ -378,23 +376,22 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
 	   u64, arg2, u64, arg3)
 {
 	u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
-	u32 *bin_args;
-	static char buf[BPF_TRACE_PRINTK_SIZE];
-	unsigned long flags;
+	struct bpf_bprintf_data data = {
+		.get_bin_args = true,
+		.get_buf = true,
+	};
 	int ret;
 
-	ret = bpf_bprintf_prepare(fmt, fmt_size, args, &bin_args,
-				  MAX_TRACE_PRINTK_VARARGS);
+	ret = bpf_bprintf_prepare(fmt, fmt_size, args,
+				  MAX_TRACE_PRINTK_VARARGS, &data);
 	if (ret < 0)
 		return ret;
 
-	raw_spin_lock_irqsave(&trace_printk_lock, flags);
-	ret = bstr_printf(buf, sizeof(buf), fmt, bin_args);
+	ret = bstr_printf(data.buf, MAX_BPRINTF_BUF, fmt, data.bin_args);
 
-	trace_bpf_trace_printk(buf);
-	raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
+	trace_bpf_trace_printk(data.buf);
 
-	bpf_bprintf_cleanup();
+	bpf_bprintf_cleanup(&data);
 
 	return ret;
 }

@@ -427,30 +424,29 @@ const struct bpf_func_proto *bpf_get_trace_printk_proto(void)
 	return &bpf_trace_printk_proto;
 }
 
-BPF_CALL_4(bpf_trace_vprintk, char *, fmt, u32, fmt_size, const void *, data,
+BPF_CALL_4(bpf_trace_vprintk, char *, fmt, u32, fmt_size, const void *, args,
 	   u32, data_len)
 {
-	static char buf[BPF_TRACE_PRINTK_SIZE];
-	unsigned long flags;
+	struct bpf_bprintf_data data = {
+		.get_bin_args = true,
+		.get_buf = true,
+	};
 	int ret, num_args;
-	u32 *bin_args;
 
 	if (data_len & 7 || data_len > MAX_BPRINTF_VARARGS * 8 ||
-	    (data_len && !data))
+	    (data_len && !args))
 		return -EINVAL;
 	num_args = data_len / 8;
 
-	ret = bpf_bprintf_prepare(fmt, fmt_size, data, &bin_args, num_args);
+	ret = bpf_bprintf_prepare(fmt, fmt_size, args, num_args, &data);
 	if (ret < 0)
 		return ret;
 
-	raw_spin_lock_irqsave(&trace_printk_lock, flags);
-	ret = bstr_printf(buf, sizeof(buf), fmt, bin_args);
+	ret = bstr_printf(data.buf, MAX_BPRINTF_BUF, fmt, data.bin_args);
 
-	trace_bpf_trace_printk(buf);
-	raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
+	trace_bpf_trace_printk(data.buf);
 
-	bpf_bprintf_cleanup();
+	bpf_bprintf_cleanup(&data);
 
 	return ret;
 }
@ -472,23 +468,25 @@ const struct bpf_func_proto *bpf_get_trace_vprintk_proto(void)
|
|||
}
|
||||
|
||||
BPF_CALL_5(bpf_seq_printf, struct seq_file *, m, char *, fmt, u32, fmt_size,
|
||||
const void *, data, u32, data_len)
|
||||
const void *, args, u32, data_len)
|
||||
{
|
||||
struct bpf_bprintf_data data = {
|
||||
.get_bin_args = true,
|
||||
};
|
||||
int err, num_args;
|
||||
u32 *bin_args;
|
||||
|
||||
if (data_len & 7 || data_len > MAX_BPRINTF_VARARGS * 8 ||
|
||||
(data_len && !data))
|
||||
(data_len && !args))
|
||||
return -EINVAL;
|
||||
num_args = data_len / 8;
|
||||
|
||||
err = bpf_bprintf_prepare(fmt, fmt_size, data, &bin_args, num_args);
|
||||
err = bpf_bprintf_prepare(fmt, fmt_size, args, num_args, &data);
|
||||
if (err < 0)
|
||||
return err;
|
||||
|
||||
seq_bprintf(m, fmt, bin_args);
|
||||
seq_bprintf(m, fmt, data.bin_args);
|
||||
|
||||
bpf_bprintf_cleanup();
|
||||
bpf_bprintf_cleanup(&data);
|
||||
|
||||
return seq_has_overflowed(m) ? -EOVERFLOW : 0;
|
||||
}
|
||||
|
|
|
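After this refactor a caller no longer supplies its own static buffer and spinlock; it asks bpf_bprintf_prepare() for the bin_args/buf storage via struct bpf_bprintf_data and hands it back with bpf_bprintf_cleanup(). A hedged sketch of the calling convention as visible in the hunks above (the helper name is made up):

/* Sketch of the new calling convention; my_print_helper is hypothetical. */
static int my_print_helper(char *fmt, u32 fmt_size, u64 *args, u32 num_args)
{
	struct bpf_bprintf_data data = {
		.get_bin_args = true,	/* decode varargs into data.bin_args */
		.get_buf = true,	/* borrow an output buffer */
	};
	int ret;

	ret = bpf_bprintf_prepare(fmt, fmt_size, args, num_args, &data);
	if (ret < 0)
		return ret;

	ret = bstr_printf(data.buf, MAX_BPRINTF_BUF, fmt, data.bin_args);
	trace_bpf_trace_printk(data.buf);

	bpf_bprintf_cleanup(&data);	/* releases what prepare acquired */
	return ret;
}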
@@ -4615,7 +4615,8 @@ BPF_CALL_4(bpf_skb_set_tunnel_key, struct sk_buff *, skb,
struct ip_tunnel_info *info;

if (unlikely(flags & ~(BPF_F_TUNINFO_IPV6 | BPF_F_ZERO_CSUM_TX |
BPF_F_DONT_FRAGMENT | BPF_F_SEQ_NUMBER)))
BPF_F_DONT_FRAGMENT | BPF_F_SEQ_NUMBER |
BPF_F_NO_TUNNEL_KEY)))
return -EINVAL;
if (unlikely(size != sizeof(struct bpf_tunnel_key))) {
switch (size) {

@@ -4653,6 +4654,8 @@ BPF_CALL_4(bpf_skb_set_tunnel_key, struct sk_buff *, skb,
info->key.tun_flags &= ~TUNNEL_CSUM;
if (flags & BPF_F_SEQ_NUMBER)
info->key.tun_flags |= TUNNEL_SEQ;
if (flags & BPF_F_NO_TUNNEL_KEY)
info->key.tun_flags &= ~TUNNEL_KEY;

info->key.tun_id = cpu_to_be64(from->tunnel_id);
info->key.tos = from->tunnel_tos;

@@ -125,21 +125,21 @@ always-y += sockex1_kern.o
always-y += sockex2_kern.o
always-y += sockex3_kern.o
always-y += tracex1_kern.o
always-y += tracex2_kern.o
always-y += tracex2.bpf.o
always-y += tracex3_kern.o
always-y += tracex4_kern.o
always-y += tracex5_kern.o
always-y += tracex6_kern.o
always-y += tracex7_kern.o
always-y += sock_flags_kern.o
always-y += test_probe_write_user_kern.o
always-y += trace_output_kern.o
always-y += test_probe_write_user.bpf.o
always-y += trace_output.bpf.o
always-y += tcbpf1_kern.o
always-y += tc_l2_redirect_kern.o
always-y += lathist_kern.o
always-y += offwaketime_kern.o
always-y += spintest_kern.o
always-y += map_perf_test_kern.o
always-y += map_perf_test.bpf.o
always-y += test_overhead_tp_kern.o
always-y += test_overhead_raw_tp_kern.o
always-y += test_overhead_kprobe_kern.o

@@ -147,7 +147,7 @@ always-y += parse_varlen.o parse_simple.o parse_ldabs.o
always-y += test_cgrp2_tc_kern.o
always-y += xdp1_kern.o
always-y += xdp2_kern.o
always-y += test_current_task_under_cgroup_kern.o
always-y += test_current_task_under_cgroup.bpf.o
always-y += trace_event_kern.o
always-y += sampleip_kern.o
always-y += lwt_len_hist_kern.o

@@ -0,0 +1 @@
/* dummy .h to trick /usr/include/features.h to work with 'clang -target bpf' */

@@ -4,14 +4,12 @@
* modify it under the terms of version 2 of the GNU General Public
* License as published by the Free Software Foundation.
*/
#include <linux/skbuff.h>
#include <linux/netdevice.h>
#include "vmlinux.h"
#include <errno.h>
#include <linux/version.h>
#include <uapi/linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
#include "trace_common.h"

#define MAX_ENTRIES 1000
#define MAX_NR_CPUS 1024

@@ -102,8 +100,8 @@ struct {
__uint(max_entries, MAX_ENTRIES);
} lru_hash_lookup_map SEC(".maps");

SEC("kprobe/" SYSCALL(sys_getuid))
int stress_hmap(struct pt_regs *ctx)
SEC("ksyscall/getuid")
int BPF_KSYSCALL(stress_hmap)
{
u32 key = bpf_get_current_pid_tgid();
long init_val = 1;

@@ -120,8 +118,8 @@ int stress_hmap(struct pt_regs *ctx)
return 0;
}

SEC("kprobe/" SYSCALL(sys_geteuid))
int stress_percpu_hmap(struct pt_regs *ctx)
SEC("ksyscall/geteuid")
int BPF_KSYSCALL(stress_percpu_hmap)
{
u32 key = bpf_get_current_pid_tgid();
long init_val = 1;

@@ -137,8 +135,8 @@ int stress_percpu_hmap(struct pt_regs *ctx)
return 0;
}

SEC("kprobe/" SYSCALL(sys_getgid))
int stress_hmap_alloc(struct pt_regs *ctx)
SEC("ksyscall/getgid")
int BPF_KSYSCALL(stress_hmap_alloc)
{
u32 key = bpf_get_current_pid_tgid();
long init_val = 1;

@@ -154,8 +152,8 @@ int stress_hmap_alloc(struct pt_regs *ctx)
return 0;
}

SEC("kprobe/" SYSCALL(sys_getegid))
int stress_percpu_hmap_alloc(struct pt_regs *ctx)
SEC("ksyscall/getegid")
int BPF_KSYSCALL(stress_percpu_hmap_alloc)
{
u32 key = bpf_get_current_pid_tgid();
long init_val = 1;

@@ -170,11 +168,10 @@ int stress_percpu_hmap_alloc(struct pt_regs *ctx)
}
return 0;
}

SEC("kprobe/" SYSCALL(sys_connect))
int stress_lru_hmap_alloc(struct pt_regs *ctx)
SEC("ksyscall/connect")
int BPF_KSYSCALL(stress_lru_hmap_alloc, int fd, struct sockaddr_in *uservaddr,
int addrlen)
{
struct pt_regs *real_regs = (struct pt_regs *)PT_REGS_PARM1_CORE(ctx);
char fmt[] = "Failed at stress_lru_hmap_alloc. ret:%d\n";
union {
u16 dst6[8];

@@ -187,14 +184,11 @@ int stress_lru_hmap_alloc(struct pt_regs *ctx)
u32 key;
};
} test_params;
struct sockaddr_in6 *in6;
struct sockaddr_in6 *in6 = (struct sockaddr_in6 *)uservaddr;
u16 test_case;
int addrlen, ret;
long val = 1;
u32 key = 0;

in6 = (struct sockaddr_in6 *)PT_REGS_PARM2_CORE(real_regs);
addrlen = (int)PT_REGS_PARM3_CORE(real_regs);
int ret;

if (addrlen != sizeof(*in6))
return 0;

@@ -251,8 +245,8 @@ done:
return 0;
}

SEC("kprobe/" SYSCALL(sys_gettid))
int stress_lpm_trie_map_alloc(struct pt_regs *ctx)
SEC("ksyscall/gettid")
int BPF_KSYSCALL(stress_lpm_trie_map_alloc)
{
union {
u32 b32[2];

@@ -273,8 +267,8 @@ int stress_lpm_trie_map_alloc(struct pt_regs *ctx)
return 0;
}

SEC("kprobe/" SYSCALL(sys_getpgid))
int stress_hash_map_lookup(struct pt_regs *ctx)
SEC("ksyscall/getpgid")
int BPF_KSYSCALL(stress_hash_map_lookup)
{
u32 key = 1, i;
long *value;

@@ -286,8 +280,8 @@ int stress_hash_map_lookup(struct pt_regs *ctx)
return 0;
}

SEC("kprobe/" SYSCALL(sys_getppid))
int stress_array_map_lookup(struct pt_regs *ctx)
SEC("ksyscall/getppid")
int BPF_KSYSCALL(stress_array_map_lookup)
{
u32 key = 1, i;
long *value;
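The conversions above swap the arch-specific SEC("kprobe/" SYSCALL(...)) dance for libbpf's ksyscall support: the BPF_KSYSCALL macro from bpf_tracing.h hides the syscall-wrapper pt_regs indirection and hands the program typed syscall arguments directly. A minimal sketch in the same style (the program name is illustrative):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

/* Typed syscall args arrive directly; no PT_REGS_PARM*_CORE() needed. */
SEC("ksyscall/write")
int BPF_KSYSCALL(count_writes, unsigned int fd, const char *buf, size_t count)
{
	bpf_printk("write(fd=%u, len=%lu)", fd, count);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";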
@@ -443,7 +443,7 @@ int main(int argc, char **argv)
if (argc > 4)
max_cnt = atoi(argv[4]);

snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
snprintf(filename, sizeof(filename), "%s.bpf.o", argv[0]);
obj = bpf_object__open_file(filename, NULL);
if (libbpf_get_error(obj)) {
fprintf(stderr, "ERROR: opening BPF object file failed\n");

@@ -5,12 +5,11 @@
* License as published by the Free Software Foundation.
*/

#include <linux/ptrace.h>
#include <uapi/linux/bpf.h>
#include "vmlinux.h"
#include <linux/version.h>
#include <bpf/bpf_helpers.h>
#include <uapi/linux/utsname.h>
#include "trace_common.h"
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

struct {
__uint(type, BPF_MAP_TYPE_CGROUP_ARRAY);

@@ -27,8 +26,8 @@ struct {
} perf_map SEC(".maps");

/* Writes the last PID that called sync to a map at index 0 */
SEC("kprobe/" SYSCALL(sys_sync))
int bpf_prog1(struct pt_regs *ctx)
SEC("ksyscall/sync")
int BPF_KSYSCALL(bpf_prog1)
{
u64 pid = bpf_get_current_pid_tgid();
int idx = 0;

@@ -14,14 +14,14 @@
int main(int argc, char **argv)
{
pid_t remote_pid, local_pid = getpid();
int cg2 = -1, idx = 0, rc = 1;
struct bpf_link *link = NULL;
struct bpf_program *prog;
int cg2, idx = 0, rc = 1;
struct bpf_object *obj;
char filename[256];
int map_fd[2];

snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
snprintf(filename, sizeof(filename), "%s.bpf.o", argv[0]);
obj = bpf_object__open_file(filename, NULL);
if (libbpf_get_error(obj)) {
fprintf(stderr, "ERROR: opening BPF object file failed\n");

@@ -103,7 +103,9 @@ int main(int argc, char **argv)
rc = 0;

err:
close(cg2);
if (cg2 != -1)
close(cg2);

cleanup_cgroup_environment();

cleanup:

@@ -42,11 +42,6 @@ static inline void INIT_LIST_HEAD(struct list_head *list)
list->prev = list;
}

static inline int list_empty(const struct list_head *head)
{
return head->next == head;
}

static inline void __list_add(struct list_head *new,
struct list_head *prev,
struct list_head *next)

@@ -13,7 +13,6 @@
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
#include "trace_common.h"

#define MAX_NR_PORTS 65536

@@ -4,14 +4,12 @@
* modify it under the terms of version 2 of the GNU General Public
* License as published by the Free Software Foundation.
*/
#include <linux/skbuff.h>
#include <linux/netdevice.h>
#include <uapi/linux/bpf.h>
#include "vmlinux.h"
#include <string.h>
#include <linux/version.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
#include "trace_common.h"

struct {
__uint(type, BPF_MAP_TYPE_HASH);

@@ -28,25 +26,23 @@ struct {
* This example sits on a syscall, and the syscall ABI is relatively stable
* of course, across platforms, and over time, the ABI may change.
*/
SEC("kprobe/" SYSCALL(sys_connect))
int bpf_prog1(struct pt_regs *ctx)
SEC("ksyscall/connect")
int BPF_KSYSCALL(bpf_prog1, int fd, struct sockaddr_in *uservaddr,
int addrlen)
{
struct pt_regs *real_regs = (struct pt_regs *)PT_REGS_PARM1_CORE(ctx);
void *sockaddr_arg = (void *)PT_REGS_PARM2_CORE(real_regs);
int sockaddr_len = (int)PT_REGS_PARM3_CORE(real_regs);
struct sockaddr_in new_addr, orig_addr = {};
struct sockaddr_in *mapped_addr;

if (sockaddr_len > sizeof(orig_addr))
if (addrlen > sizeof(orig_addr))
return 0;

if (bpf_probe_read_user(&orig_addr, sizeof(orig_addr), sockaddr_arg) != 0)
if (bpf_probe_read_user(&orig_addr, sizeof(orig_addr), uservaddr) != 0)
return 0;

mapped_addr = bpf_map_lookup_elem(&dnat_map, &orig_addr);
if (mapped_addr != NULL) {
memcpy(&new_addr, mapped_addr, sizeof(new_addr));
bpf_probe_write_user(sockaddr_arg, &new_addr,
bpf_probe_write_user(uservaddr, &new_addr,
sizeof(new_addr));
}
return 0;

@@ -24,7 +24,7 @@ int main(int ac, char **argv)
mapped_addr_in = (struct sockaddr_in *)&mapped_addr;
tmp_addr_in = (struct sockaddr_in *)&tmp_addr;

snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
snprintf(filename, sizeof(filename), "%s.bpf.o", argv[0]);
obj = bpf_object__open_file(filename, NULL);
if (libbpf_get_error(obj)) {
fprintf(stderr, "ERROR: opening BPF object file failed\n");

@@ -1,13 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
#ifndef __TRACE_COMMON_H
#define __TRACE_COMMON_H

#ifdef __x86_64__
#define SYSCALL(SYS) "__x64_" __stringify(SYS)
#elif defined(__s390x__)
#define SYSCALL(SYS) "__s390x_" __stringify(SYS)
#else
#define SYSCALL(SYS) __stringify(SYS)
#endif

#endif

@@ -1,8 +1,6 @@
#include <linux/ptrace.h>
#include "vmlinux.h"
#include <linux/version.h>
#include <uapi/linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "trace_common.h"

struct {
__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);

@@ -11,7 +9,7 @@ struct {
__uint(max_entries, 2);
} my_map SEC(".maps");

SEC("kprobe/" SYSCALL(sys_write))
SEC("ksyscall/write")
int bpf_prog1(struct pt_regs *ctx)
{
struct S {

@@ -51,7 +51,7 @@ int main(int argc, char **argv)
char filename[256];
FILE *f;

snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
snprintf(filename, sizeof(filename), "%s.bpf.o", argv[0]);
obj = bpf_object__open_file(filename, NULL);
if (libbpf_get_error(obj)) {
fprintf(stderr, "ERROR: opening BPF object file failed\n");

@@ -4,13 +4,11 @@
* modify it under the terms of version 2 of the GNU General Public
* License as published by the Free Software Foundation.
*/
#include <linux/skbuff.h>
#include <linux/netdevice.h>
#include "vmlinux.h"
#include <linux/version.h>
#include <uapi/linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include "trace_common.h"
#include <bpf/bpf_core_read.h>

struct {
__uint(type, BPF_MAP_TYPE_HASH);

@@ -78,15 +76,14 @@ struct {
__uint(max_entries, 1024);
} my_hist_map SEC(".maps");

SEC("kprobe/" SYSCALL(sys_write))
int bpf_prog3(struct pt_regs *ctx)
SEC("ksyscall/write")
int BPF_KSYSCALL(bpf_prog3, unsigned int fd, const char *buf, size_t count)
{
long write_size = PT_REGS_PARM3(ctx);
long init_val = 1;
long *value;
struct hist_key key;

key.index = log2l(write_size);
key.index = log2l(count);
key.pid_tgid = bpf_get_current_pid_tgid();
key.uid_gid = bpf_get_current_uid_gid();
bpf_get_current_comm(&key.comm, sizeof(key.comm));

@@ -123,7 +123,7 @@ int main(int ac, char **argv)
int i, j = 0;
FILE *f;

snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
snprintf(filename, sizeof(filename), "%s.bpf.o", argv[0]);
obj = bpf_object__open_file(filename, NULL);
if (libbpf_get_error(obj)) {
fprintf(stderr, "ERROR: opening BPF object file failed\n");

@@ -51,7 +51,7 @@ int main(int ac, char **argv)
struct bpf_program *prog;
struct bpf_object *obj;
char filename[256];
int map_fd, i, j = 0;
int map_fd, j = 0;

snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
obj = bpf_object__open_file(filename, NULL);

@@ -82,7 +82,7 @@ int main(int ac, char **argv)
j++;
}

for (i = 0; ; i++) {
while (1) {
print_old_objects(map_fd);
sleep(1);
}

@@ -289,3 +289,6 @@ FORCE:
.PHONY: all FORCE bootstrap clean install-bin install uninstall
.PHONY: doc doc-clean doc-install doc-uninstall
.DEFAULT_GOAL := all

# Delete partially updated (corrupted) files on error
.DELETE_ON_ERROR:

@@ -56,13 +56,17 @@ $(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(LIBBPF_OU
DESTDIR=$(LIBBPF_DESTDIR) prefix= EXTRA_CFLAGS="$(CFLAGS)" \
$(abspath $@) install_headers

LIBELF_FLAGS := $(shell $(HOSTPKG_CONFIG) libelf --cflags 2>/dev/null)
LIBELF_LIBS := $(shell $(HOSTPKG_CONFIG) libelf --libs 2>/dev/null || echo -lelf)

CFLAGS += -g \
-I$(srctree)/tools/include \
-I$(srctree)/tools/include/uapi \
-I$(LIBBPF_INCLUDE) \
-I$(SUBCMD_SRC)
-I$(SUBCMD_SRC) \
$(LIBELF_FLAGS)

LIBS = -lelf -lz
LIBS = $(LIBELF_LIBS) -lz

export srctree OUTPUT CFLAGS Q
include $(srctree)/tools/build/Makefile.include

@@ -2001,6 +2001,9 @@ union bpf_attr {
* sending the packet. This flag was added for GRE
* encapsulation, but might be used with other protocols
* as well in the future.
* **BPF_F_NO_TUNNEL_KEY**
* Add a flag to tunnel metadata indicating that no tunnel
* key should be set in the resulting tunnel header.
*
* Here is a typical usage on the transmit path:
*

@@ -5764,6 +5767,7 @@ enum {
BPF_F_ZERO_CSUM_TX = (1ULL << 1),
BPF_F_DONT_FRAGMENT = (1ULL << 2),
BPF_F_SEQ_NUMBER = (1ULL << 3),
BPF_F_NO_TUNNEL_KEY = (1ULL << 4),
};

/* BPF_FUNC_skb_get_tunnel_key flags. */
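For the new flag, transmit-path usage looks like the hedged snippet below, a condensed version of the selftest added later in this series: passing BPF_F_NO_TUNNEL_KEY leaves TUNNEL_KEY cleared, so the egress header carries no key, matching GRE devices configured without a key.

SEC("tc")
int set_tunnel_no_key(struct __sk_buff *skb)
{
	struct bpf_tunnel_key key = {};

	key.remote_ipv4 = 0xac100164; /* 172.16.1.100 */
	key.tunnel_ttl = 64;

	/* Do not set a tunnel key in the resulting header. */
	if (bpf_skb_set_tunnel_key(skb, &key, sizeof(key),
				   BPF_F_ZERO_CSUM_TX | BPF_F_NO_TUNNEL_KEY) < 0)
		return TC_ACT_SHOT;

	return TC_ACT_OK;
}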
@@ -32,6 +32,9 @@
#elif defined(__TARGET_ARCH_arc)
#define bpf_target_arc
#define bpf_target_defined
#elif defined(__TARGET_ARCH_loongarch)
#define bpf_target_loongarch
#define bpf_target_defined
#else

/* Fall back to what the compiler says */

@@ -62,6 +65,9 @@
#elif defined(__arc__)
#define bpf_target_arc
#define bpf_target_defined
#elif defined(__loongarch__)
#define bpf_target_loongarch
#define bpf_target_defined
#endif /* no compiler target */

#endif

@@ -137,7 +143,7 @@ struct pt_regs___s390 {
#define __PT_PARM3_REG gprs[4]
#define __PT_PARM4_REG gprs[5]
#define __PT_PARM5_REG gprs[6]
#define __PT_RET_REG grps[14]
#define __PT_RET_REG gprs[14]
#define __PT_FP_REG gprs[11] /* Works only with CONFIG_FRAME_POINTER */
#define __PT_RC_REG gprs[2]
#define __PT_SP_REG gprs[15]

@@ -258,6 +264,23 @@ struct pt_regs___arm64 {
/* arc does not select ARCH_HAS_SYSCALL_WRAPPER. */
#define PT_REGS_SYSCALL_REGS(ctx) ctx

#elif defined(bpf_target_loongarch)

/* https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html */

#define __PT_PARM1_REG regs[4]
#define __PT_PARM2_REG regs[5]
#define __PT_PARM3_REG regs[6]
#define __PT_PARM4_REG regs[7]
#define __PT_PARM5_REG regs[8]
#define __PT_RET_REG regs[1]
#define __PT_FP_REG regs[22]
#define __PT_RC_REG regs[4]
#define __PT_SP_REG regs[3]
#define __PT_IP_REG csr_era
/* loongarch does not select ARCH_HAS_SYSCALL_WRAPPER. */
#define PT_REGS_SYSCALL_REGS(ctx) ctx

#endif

#if defined(bpf_target_defined)
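With these defines in place, the generic PT_REGS_* accessors resolve to LoongArch registers (r4-r8 for arguments, r1 for the return address, per the linked ABI document), so existing kprobe programs compile unchanged. A small hedged sketch:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("kprobe/do_unlinkat")
int probe_unlinkat(struct pt_regs *ctx)
{
	/* On loongarch this expands to ctx->regs[5] (__PT_PARM2_REG). */
	struct filename *name = (struct filename *)PT_REGS_PARM2(ctx);

	bpf_printk("unlinkat name ptr: %p", name);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";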
@@ -688,8 +688,21 @@ int btf__align_of(const struct btf *btf, __u32 id)
if (align <= 0)
return libbpf_err(align);
max_align = max(max_align, align);

/* if field offset isn't aligned according to field
* type's alignment, then struct must be packed
*/
if (btf_member_bitfield_size(t, i) == 0 &&
(m->offset % (8 * align)) != 0)
return 1;
}

/* if struct/union size isn't a multiple of its alignment,
* then struct must be packed
*/
if ((t->size % max_align) != 0)
return 1;

return max_align;
}
default:

@@ -990,7 +1003,8 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
err = 0;

if (!btf_data) {
err = -ENOENT;
pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
err = -ENODATA;
goto done;
}
btf = btf_new(btf_data->d_buf, btf_data->d_size, base_btf);
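The btf__align_of() change reports alignment 1 (packed) as soon as any non-bitfield member sits at an offset that is not a multiple of its type's alignment, or when the total size is not a multiple of the largest member alignment. BTF carries no explicit "packed" flag, so alignment must be inferred from layout; a hedged illustration of the case this catches:

/* Illustrative input for the heuristic above. `nested` is packed in the
 * source, but its members happen to be naturally laid out, so from BTF it
 * looks int-aligned (align 4). In `outer` it sits at offset 1, and
 * 1 % 4 != 0, so `outer` must be emitted as packed to reproduce offsets.
 */
struct nested {
	int x1;
	int x2;
} __attribute__((packed));

struct outer {
	char c;			/* offset 0 */
	struct nested n;	/* offset 1 */
};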
@@ -13,6 +13,7 @@
#include <ctype.h>
#include <endian.h>
#include <errno.h>
#include <limits.h>
#include <linux/err.h>
#include <linux/btf.h>
#include <linux/kernel.h>

@@ -833,14 +834,9 @@ static bool btf_is_struct_packed(const struct btf *btf, __u32 id,
const struct btf_type *t)
{
const struct btf_member *m;
int align, i, bit_sz;
int max_align = 1, align, i, bit_sz;
__u16 vlen;

align = btf__align_of(btf, id);
/* size of a non-packed struct has to be a multiple of its alignment*/
if (align && t->size % align)
return true;

m = btf_members(t);
vlen = btf_vlen(t);
/* all non-bitfield fields have to be naturally aligned */

@@ -849,8 +845,11 @@ static bool btf_is_struct_packed(const struct btf *btf, __u32 id,
bit_sz = btf_member_bitfield_size(t, i);
if (align && bit_sz == 0 && m->offset % (8 * align) != 0)
return true;
max_align = max(align, max_align);
}

/* size of a non-packed struct has to be a multiple of its alignment */
if (t->size % max_align != 0)
return true;
/*
* if original struct was marked as packed, but its layout is
* naturally aligned, we'll detect that it's not packed

@@ -858,44 +857,97 @@ static bool btf_is_struct_packed(const struct btf *btf, __u32 id,
return false;
}

static int chip_away_bits(int total, int at_most)
{
return total % at_most ? : at_most;
}

static void btf_dump_emit_bit_padding(const struct btf_dump *d,
int cur_off, int m_off, int m_bit_sz,
int align, int lvl)
int cur_off, int next_off, int next_align,
bool in_bitfield, int lvl)
{
int off_diff = m_off - cur_off;
int ptr_bits = d->ptr_sz * 8;
const struct {
const char *name;
int bits;
} pads[] = {
{"long", d->ptr_sz * 8}, {"int", 32}, {"short", 16}, {"char", 8}
};
int new_off, pad_bits, bits, i;
const char *pad_type;

if (off_diff <= 0)
/* no gap */
return;
if (m_bit_sz == 0 && off_diff < align * 8)
/* natural padding will take care of a gap */
return;
if (cur_off >= next_off)
return; /* no gap */

while (off_diff > 0) {
const char *pad_type;
int pad_bits;
/* For filling out padding we want to take advantage of
* natural alignment rules to minimize unnecessary explicit
* padding. First, we find the largest type (among long, int,
* short, or char) that can be used to force naturally aligned
* boundary. Once determined, we'll use such type to fill in
* the remaining padding gap. In some cases we can rely on
* compiler filling some gaps, but sometimes we need to force
* alignment to close natural alignment with markers like
* `long: 0` (this is always the case for bitfields). Note
* that even if struct itself has, let's say 4-byte alignment
* (i.e., it only uses up to int-aligned types), using `long:
* X;` explicit padding doesn't actually change struct's
* overall alignment requirements, but compiler does take into
* account that type's (long, in this example) natural
* alignment requirements when adding implicit padding. We use
* this fact heavily and don't worry about ruining correct
* struct alignment requirement.
*/
for (i = 0; i < ARRAY_SIZE(pads); i++) {
pad_bits = pads[i].bits;
pad_type = pads[i].name;

if (ptr_bits > 32 && off_diff > 32) {
pad_type = "long";
pad_bits = chip_away_bits(off_diff, ptr_bits);
} else if (off_diff > 16) {
pad_type = "int";
pad_bits = chip_away_bits(off_diff, 32);
} else if (off_diff > 8) {
pad_type = "short";
pad_bits = chip_away_bits(off_diff, 16);
} else {
pad_type = "char";
pad_bits = chip_away_bits(off_diff, 8);
new_off = roundup(cur_off, pad_bits);
if (new_off <= next_off)
break;
}

if (new_off > cur_off && new_off <= next_off) {
/* We need explicit `<type>: 0` aligning mark if next
* field is right on alignment offset and its
* alignment requirement is less strict than <type>'s
* alignment (so compiler won't naturally align to the
* offset we expect), or if subsequent `<type>: X`,
* will actually completely fit in the remaining hole,
* making compiler basically ignore `<type>: X`
* completely.
*/
if (in_bitfield ||
(new_off == next_off && roundup(cur_off, next_align * 8) != new_off) ||
(new_off != next_off && next_off - new_off <= new_off - cur_off))
/* but for bitfields we'll emit explicit bit count */
btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type,
in_bitfield ? new_off - cur_off : 0);
cur_off = new_off;
}

/* Now we know we start at naturally aligned offset for a chosen
* padding type (long, int, short, or char), and so the rest is just
* a straightforward filling of remaining padding gap with full
* `<type>: sizeof(<type>);` markers, except for the last one, which
* might need smaller than sizeof(<type>) padding.
*/
while (cur_off != next_off) {
bits = min(next_off - cur_off, pad_bits);
if (bits == pad_bits) {
btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, pad_bits);
cur_off += bits;
continue;
}
/* For the remainder padding that doesn't cover entire
* pad_type bit length, we pick the smallest necessary type.
* This is pure aesthetics, we could have just used `long`,
* but having smallest necessary one communicates better the
* scale of the padding gap.
*/
for (i = ARRAY_SIZE(pads) - 1; i >= 0; i--) {
pad_type = pads[i].name;
pad_bits = pads[i].bits;
if (pad_bits < bits)
continue;

btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, bits);
cur_off += bits;
break;
}
btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, pad_bits);
off_diff -= pad_bits;
}
}

@@ -915,9 +967,11 @@ static void btf_dump_emit_struct_def(struct btf_dump *d,
{
const struct btf_member *m = btf_members(t);
bool is_struct = btf_is_struct(t);
int align, i, packed, off = 0;
bool packed, prev_bitfield = false;
int align, i, off = 0;
__u16 vlen = btf_vlen(t);

align = btf__align_of(d->btf, id);
packed = is_struct ? btf_is_struct_packed(d->btf, id, t) : 0;

btf_dump_printf(d, "%s%s%s {",

@@ -927,41 +981,47 @@ static void btf_dump_emit_struct_def(struct btf_dump *d,

for (i = 0; i < vlen; i++, m++) {
const char *fname;
int m_off, m_sz;
int m_off, m_sz, m_align;
bool in_bitfield;

fname = btf_name_of(d, m->name_off);
m_sz = btf_member_bitfield_size(t, i);
m_off = btf_member_bit_offset(t, i);
align = packed ? 1 : btf__align_of(d->btf, m->type);
m_align = packed ? 1 : btf__align_of(d->btf, m->type);

btf_dump_emit_bit_padding(d, off, m_off, m_sz, align, lvl + 1);
in_bitfield = prev_bitfield && m_sz != 0;

btf_dump_emit_bit_padding(d, off, m_off, m_align, in_bitfield, lvl + 1);
btf_dump_printf(d, "\n%s", pfx(lvl + 1));
btf_dump_emit_type_decl(d, m->type, fname, lvl + 1);

if (m_sz) {
btf_dump_printf(d, ": %d", m_sz);
off = m_off + m_sz;
prev_bitfield = true;
} else {
m_sz = max((__s64)0, btf__resolve_size(d->btf, m->type));
off = m_off + m_sz * 8;
prev_bitfield = false;
}

btf_dump_printf(d, ";");
}

/* pad at the end, if necessary */
if (is_struct) {
align = packed ? 1 : btf__align_of(d->btf, id);
btf_dump_emit_bit_padding(d, off, t->size * 8, 0, align,
lvl + 1);
}
if (is_struct)
btf_dump_emit_bit_padding(d, off, t->size * 8, align, false, lvl + 1);

/*
* Keep `struct empty {}` on a single line,
* only print newline when there are regular or padding fields.
*/
if (vlen || t->size)
if (vlen || t->size) {
btf_dump_printf(d, "\n");
btf_dump_printf(d, "%s}", pfx(lvl));
btf_dump_printf(d, "%s}", pfx(lvl));
} else {
btf_dump_printf(d, "}");
}
if (packed)
btf_dump_printf(d, " __attribute__((packed))");
}

@@ -1073,6 +1133,43 @@ static void btf_dump_emit_enum_def(struct btf_dump *d, __u32 id,
else
btf_dump_emit_enum64_val(d, t, lvl, vlen);
btf_dump_printf(d, "\n%s}", pfx(lvl));

/* special case enums with special sizes */
if (t->size == 1) {
/* one-byte enums can be forced with mode(byte) attribute */
btf_dump_printf(d, " __attribute__((mode(byte)))");
} else if (t->size == 8 && d->ptr_sz == 8) {
/* enum can be 8-byte sized if one of the enumerator values
* doesn't fit in 32-bit integer, or by adding mode(word)
* attribute (but probably only on 64-bit architectures); do
* our best here to try to satisfy the contract without adding
* unnecessary attributes
*/
bool needs_word_mode;

if (btf_is_enum(t)) {
/* enum can't represent 64-bit values, so we need word mode */
needs_word_mode = true;
} else {
/* enum64 needs mode(word) if none of its values has
* non-zero upper 32-bits (which means that all values
* fit in 32-bit integers and won't cause compiler to
* bump enum to be 64-bit naturally
*/
int i;

needs_word_mode = true;
for (i = 0; i < vlen; i++) {
if (btf_enum64(t)[i].val_hi32 != 0) {
needs_word_mode = false;
break;
}
}
}
if (needs_word_mode)
btf_dump_printf(d, " __attribute__((mode(word)))");
}

}

static void btf_dump_emit_fwd_def(struct btf_dump *d, __u32 id,
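The enum-size handling above is easiest to see on concrete declarations; the selftest updates further down in this series contain exactly these cases. A hedged illustration of what the dumper now emits:

/* BTF size 1 forces mode(byte); an enum64 whose values all fit in
 * 32 bits needs mode(word) to stay 8 bytes wide when re-compiled.
 */
enum e_byte {
	EBYTE_1,
	EBYTE_2,
} __attribute__((mode(byte)));	/* 1-byte enum */

enum e_word {
	EWORD_1,
	EWORD_2,
} __attribute__((mode(word)));	/* 8-byte enum on 64-bit targets */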
@@ -9903,7 +9903,7 @@ static int perf_event_open_probe(bool uprobe, bool retprobe, const char *name,
char errmsg[STRERR_BUFSIZE];
int type, pfd;

if (ref_ctr_off >= (1ULL << PERF_UPROBE_REF_CTR_OFFSET_BITS))
if ((__u64)ref_ctr_off >= (1ULL << PERF_UPROBE_REF_CTR_OFFSET_BITS))
return -EINVAL;

memset(&attr, 0, attr_sz);

@@ -96,6 +96,12 @@ enum libbpf_print_level {
typedef int (*libbpf_print_fn_t)(enum libbpf_print_level level,
const char *, va_list ap);

/**
* @brief **libbpf_set_print()** sets user-provided log callback function to
* be used for libbpf warnings and informational messages.
* @param fn The log print function. If NULL, libbpf won't print anything.
* @return Pointer to old print function.
*/
LIBBPF_API libbpf_print_fn_t libbpf_set_print(libbpf_print_fn_t fn);

/* Hide internal to user */

@@ -174,6 +180,14 @@ struct bpf_object_open_opts {
};
#define bpf_object_open_opts__last_field kernel_log_level

/**
* @brief **bpf_object__open()** creates a bpf_object by opening
* the BPF ELF object file pointed to by the passed path and loading it
* into memory.
* @param path BPF object file path.
* @return pointer to the new bpf_object; or NULL is returned on error,
* error code is stored in errno
*/
LIBBPF_API struct bpf_object *bpf_object__open(const char *path);

/**

@@ -203,10 +217,21 @@ LIBBPF_API struct bpf_object *
bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz,
const struct bpf_object_open_opts *opts);

/* Load/unload object into/from kernel */
/**
* @brief **bpf_object__load()** loads BPF object into kernel.
* @param obj Pointer to a valid BPF object instance returned by
* **bpf_object__open*()** APIs
* @return 0, on success; negative error code, otherwise, error code is
* stored in errno
*/
LIBBPF_API int bpf_object__load(struct bpf_object *obj);

LIBBPF_API void bpf_object__close(struct bpf_object *object);
/**
* @brief **bpf_object__close()** closes a BPF object and releases all
* resources.
* @param obj Pointer to a valid BPF object
*/
LIBBPF_API void bpf_object__close(struct bpf_object *obj);

/* pin_maps and unpin_maps can both be called with a NULL path, in which case
* they will use the pin_path attribute of each map (and ignore all maps that
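The new doc comments describe the core object lifecycle; a hedged end-to-end sketch of the documented calls (the object path is illustrative):

#include <stdio.h>
#include <bpf/libbpf.h>

static int my_print(enum libbpf_print_level level, const char *fmt, va_list ap)
{
	return vfprintf(stderr, fmt, ap);	/* route libbpf logs to stderr */
}

int main(void)
{
	struct bpf_object *obj;

	libbpf_set_print(my_print);		/* returns the old callback */

	obj = bpf_object__open("prog.bpf.o");	/* NULL + errno on error */
	if (!obj)
		return 1;

	if (bpf_object__load(obj)) {		/* 0 on success */
		bpf_object__close(obj);
		return 1;
	}

	bpf_object__close(obj);			/* releases all resources */
	return 0;
}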
@@ -382,3 +382,6 @@ LIBBPF_1.1.0 {
user_ring_buffer__reserve_blocking;
user_ring_buffer__submit;
} LIBBPF_1.0.0;

LIBBPF_1.2.0 {
} LIBBPF_1.1.0;

@@ -39,14 +39,14 @@ static const char *libbpf_strerror_table[NR_ERRNO] = {

int libbpf_strerror(int err, char *buf, size_t size)
{
int ret;

if (!buf || !size)
return libbpf_err(-EINVAL);

err = err > 0 ? err : -err;

if (err < __LIBBPF_ERRNO__START) {
int ret;

ret = strerror_r(err, buf, size);
buf[size - 1] = '\0';
return libbpf_err_errno(ret);

@@ -56,12 +56,20 @@ int libbpf_strerror(int err, char *buf, size_t size)
const char *msg;

msg = libbpf_strerror_table[ERRNO_OFFSET(err)];
snprintf(buf, size, "%s", msg);
ret = snprintf(buf, size, "%s", msg);
buf[size - 1] = '\0';
/* The length of the buf and msg is positive.
* A negative number may be returned only when the
* size exceeds INT_MAX. Not likely to appear.
*/
if (ret >= size)
return libbpf_err(-ERANGE);
return 0;
}

snprintf(buf, size, "Unknown libbpf error %d", err);
ret = snprintf(buf, size, "Unknown libbpf error %d", err);
buf[size - 1] = '\0';
if (ret >= size)
return libbpf_err(-ERANGE);
return libbpf_err(-ENOENT);
}
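With the fix above, callers can rely on a -ERANGE return to detect truncation instead of silently getting a clipped message. A hedged usage sketch:

#include <errno.h>
#include <stdio.h>
#include <bpf/libbpf.h>

static void report(int err)
{
	char tiny[8], msg[128];

	/* A too-small buffer now yields -ERANGE (message truncated but
	 * still NUL-terminated) instead of silently clipping.
	 */
	if (libbpf_strerror(err, tiny, sizeof(tiny)) == -ERANGE)
		fprintf(stderr, "buffer too small for error %d\n", err);

	if (!libbpf_strerror(err, msg, sizeof(msg)))
		fprintf(stderr, "error %d: %s\n", err, msg);
}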
@@ -543,6 +543,7 @@ static inline int ensure_good_fd(int fd)
fd = fcntl(fd, F_DUPFD_CLOEXEC, 3);
saved_errno = errno;
close(old_fd);
errno = saved_errno;
if (fd < 0) {
pr_warn("failed to dup FD %d to FD > 2: %d\n", old_fd, -saved_errno);
errno = saved_errno;
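The one-line fix restores errno immediately after the close() that might clobber it; the pattern generalizes whenever a cleanup call sits between a failing call and the code that inspects errno. A hedged standalone sketch of the same idiom:

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Duplicate fd above the standard descriptors without losing the
 * fcntl() error cause to the intervening close().
 */
static int dup_above_stdio(int old_fd)
{
	int saved_errno, fd;

	fd = fcntl(old_fd, F_DUPFD_CLOEXEC, 3);
	saved_errno = errno;	/* capture before close() can clobber it */
	close(old_fd);
	errno = saved_errno;	/* callers see the real failure cause */
	return fd;
}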
@@ -4,6 +4,6 @@
#define __LIBBPF_VERSION_H

#define LIBBPF_MAJOR_VERSION 1
#define LIBBPF_MINOR_VERSION 1
#define LIBBPF_MINOR_VERSION 2

#endif /* __LIBBPF_VERSION_H */

@@ -26,6 +26,7 @@ get_func_args_test # trampoline
get_func_ip_test # get_func_ip_test__attach unexpected error: -524 (trampoline)
get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace)
htab_update # failed to attach: ERROR: strerror_r(-524)=22 (trampoline)
jit_probe_mem # jit_probe_mem__open_and_load unexpected error: -524 (kfunc)
kfree_skb # attach fentry unexpected error: -524 (trampoline)
kfunc_call # 'bpf_prog_active': not found in kernel BTF (?)
kfunc_dynptr_param # JIT does not support calling kernel function (kfunc)

@@ -626,3 +626,6 @@ EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) $(HOST_SCRATCH_DIR) \
liburandom_read.so)

.PHONY: docs docs-clean

# Delete partially updated (corrupted) files on error
.DELETE_ON_ERROR:
@@ -0,0 +1,28 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
#include <test_progs.h>
#include <network_helpers.h>

#include "jit_probe_mem.skel.h"

void test_jit_probe_mem(void)
{
LIBBPF_OPTS(bpf_test_run_opts, opts,
.data_in = &pkt_v4,
.data_size_in = sizeof(pkt_v4),
.repeat = 1,
);
struct jit_probe_mem *skel;
int ret;

skel = jit_probe_mem__open_and_load();
if (!ASSERT_OK_PTR(skel, "jit_probe_mem__open_and_load"))
return;

ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.test_jit_probe_mem), &opts);
ASSERT_OK(ret, "jit_probe_mem ret");
ASSERT_OK(opts.retval, "jit_probe_mem opts.retval");
ASSERT_EQ(skel->data->total_sum, 192, "jit_probe_mem total_sum");

jit_probe_mem__destroy(skel);
}
@@ -53,7 +53,7 @@ struct bitfields_only_mixed_types {
*/
/* ------ END-EXPECTED-OUTPUT ------ */
struct bitfield_mixed_with_others {
long: 4; /* char is enough as a backing field */
char: 4; /* char is enough as a backing field */
int a: 4;
/* 8-bit implicit padding */
short b; /* combined with previous bitfield */

@@ -58,7 +58,81 @@ union jump_code_union {
} __attribute__((packed));
};

/*------ END-EXPECTED-OUTPUT ------ */
/* ----- START-EXPECTED-OUTPUT ----- */
/*
*struct nested_packed_but_aligned_struct {
*	int x1;
*	int x2;
*};
*
*struct outer_implicitly_packed_struct {
*	char y1;
*	struct nested_packed_but_aligned_struct y2;
*} __attribute__((packed));
*
*/
/* ------ END-EXPECTED-OUTPUT ------ */

struct nested_packed_but_aligned_struct {
int x1;
int x2;
} __attribute__((packed));

struct outer_implicitly_packed_struct {
char y1;
struct nested_packed_but_aligned_struct y2;
};
/* ----- START-EXPECTED-OUTPUT ----- */
/*
*struct usb_ss_ep_comp_descriptor {
*	char: 8;
*	char bDescriptorType;
*	char bMaxBurst;
*	short wBytesPerInterval;
*};
*
*struct usb_host_endpoint {
*	long: 64;
*	char: 8;
*	struct usb_ss_ep_comp_descriptor ss_ep_comp;
*	long: 0;
*} __attribute__((packed));
*
*/
/* ------ END-EXPECTED-OUTPUT ------ */

struct usb_ss_ep_comp_descriptor {
char: 8;
char bDescriptorType;
char bMaxBurst;
int: 0;
short wBytesPerInterval;
} __attribute__((packed));

struct usb_host_endpoint {
long: 64;
char: 8;
struct usb_ss_ep_comp_descriptor ss_ep_comp;
long: 0;
};

/* ----- START-EXPECTED-OUTPUT ----- */
struct nested_packed_struct {
int a;
char b;
} __attribute__((packed));

struct outer_nonpacked_struct {
short a;
struct nested_packed_struct b;
};

struct outer_packed_struct {
short a;
struct nested_packed_struct b;
} __attribute__((packed));

/* ------ END-EXPECTED-OUTPUT ------ */

int f(struct {
struct packed_trailing_space _1;

@@ -69,6 +143,10 @@ int f(struct {
union union_is_never_packed _6;
union union_does_not_need_packing _7;
union jump_code_union _8;
struct outer_implicitly_packed_struct _9;
struct usb_host_endpoint _10;
struct outer_nonpacked_struct _11;
struct outer_packed_struct _12;
} *_)
{
return 0;
@@ -19,7 +19,7 @@ struct padded_implicitly {
/*
*struct padded_explicitly {
*	int a;
*	int: 32;
*	long: 0;
*	int b;
*};
*

@@ -28,41 +28,28 @@ struct padded_implicitly {

struct padded_explicitly {
int a;
int: 1; /* algo will explicitly pad with full 32 bits here */
int: 1; /* algo will emit aligning `long: 0;` here */
int b;
};

/* ----- START-EXPECTED-OUTPUT ----- */
/*
*struct padded_a_lot {
*	int a;
*	long: 32;
*	long: 64;
*	long: 64;
*	int b;
*};
*
*/
/* ------ END-EXPECTED-OUTPUT ------ */

struct padded_a_lot {
int a;
/* 32 bit of implicit padding here, which algo will make explicit */
long: 64;
long: 64;
int b;
};

/* ------ END-EXPECTED-OUTPUT ------ */

/* ----- START-EXPECTED-OUTPUT ----- */
/*
*struct padded_cache_line {
*	int a;
*	long: 32;
*	long: 64;
*	long: 64;
*	long: 64;
*	int b;
*	long: 32;
*	long: 64;
*	long: 64;
*	long: 64;

@@ -85,7 +72,7 @@ struct padded_cache_line {
*struct zone {
*	int a;
*	short b;
*	short: 16;
*	long: 0;
*	struct zone_padding __pad__;
*};
*

@@ -108,6 +95,131 @@ struct padding_wo_named_members {
long: 64;
};

struct padding_weird_1 {
int a;
long: 64;
short: 16;
short b;
};

/* ------ END-EXPECTED-OUTPUT ------ */

/* ----- START-EXPECTED-OUTPUT ----- */
/*
*struct padding_weird_2 {
*	long: 56;
*	char a;
*	long: 56;
*	char b;
*	char: 8;
*};
*
*/
/* ------ END-EXPECTED-OUTPUT ------ */
struct padding_weird_2 {
int: 32; /* these paddings will be collapsed into `long: 56;` */
short: 16;
char: 8;
char a;
int: 32; /* these paddings will be collapsed into `long: 56;` */
short: 16;
char: 8;
char b;
char: 8;
};

/* ----- START-EXPECTED-OUTPUT ----- */
struct exact_1byte {
char x;
};

struct padded_1byte {
char: 8;
};

struct exact_2bytes {
short x;
};

struct padded_2bytes {
short: 16;
};

struct exact_4bytes {
int x;
};

struct padded_4bytes {
int: 32;
};

struct exact_8bytes {
long x;
};

struct padded_8bytes {
long: 64;
};

struct ff_periodic_effect {
int: 32;
short magnitude;
long: 0;
short phase;
long: 0;
int: 32;
int custom_len;
short *custom_data;
};

struct ib_wc {
long: 64;
long: 64;
int: 32;
int byte_len;
void *qp;
union {} ex;
long: 64;
int slid;
int wc_flags;
long: 64;
char smac[6];
long: 0;
char network_hdr_type;
};

struct acpi_object_method {
long: 64;
char: 8;
char type;
short reference_count;
char flags;
short: 0;
char: 8;
char sync_level;
long: 64;
void *node;
void *aml_start;
union {} dispatch;
long: 64;
int aml_length;
};

struct nested_unpacked {
int x;
};

struct nested_packed {
struct nested_unpacked a;
char c;
} __attribute__((packed));

struct outer_mixed_but_unpacked {
struct nested_packed b1;
short a1;
struct nested_packed b2;
};

/* ------ END-EXPECTED-OUTPUT ------ */

int f(struct {

@@ -117,6 +229,20 @@ int f(struct {
struct padded_cache_line _4;
struct zone _5;
struct padding_wo_named_members _6;
struct padding_weird_1 _7;
struct padding_weird_2 _8;
struct exact_1byte _100;
struct padded_1byte _101;
struct exact_2bytes _102;
struct padded_2bytes _103;
struct exact_4bytes _104;
struct padded_4bytes _105;
struct exact_8bytes _106;
struct padded_8bytes _107;
struct ff_periodic_effect _200;
struct ib_wc _201;
struct acpi_object_method _202;
struct outer_mixed_but_unpacked _203;
} *_)
{
return 0;
@@ -25,6 +25,39 @@ typedef enum {
H = 2,
} e3_t;

/* ----- START-EXPECTED-OUTPUT ----- */
/*
*enum e_byte {
*	EBYTE_1 = 0,
*	EBYTE_2 = 1,
*} __attribute__((mode(byte)));
*
*/
/* ----- END-EXPECTED-OUTPUT ----- */
enum e_byte {
EBYTE_1,
EBYTE_2,
} __attribute__((mode(byte)));

/* ----- START-EXPECTED-OUTPUT ----- */
/*
*enum e_word {
*	EWORD_1 = 0LL,
*	EWORD_2 = 1LL,
*} __attribute__((mode(word)));
*
*/
/* ----- END-EXPECTED-OUTPUT ----- */
enum e_word {
EWORD_1,
EWORD_2,
} __attribute__((mode(word))); /* force to use 8-byte backing for this enum */

/* ----- START-EXPECTED-OUTPUT ----- */
enum e_big {
EBIG_1 = 1000000000000ULL,
};

typedef int int_t;

typedef volatile const int * volatile const crazy_ptr_t;

@@ -224,6 +257,9 @@ struct root_struct {
enum e2 _2;
e2_t _2_1;
e3_t _2_2;
enum e_byte _100;
enum e_word _101;
enum e_big _102;
struct struct_w_typedefs _3;
anon_struct_t _7;
struct struct_fwd *_8;
@@ -0,0 +1,61 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
#include <vmlinux.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_helpers.h>

static struct prog_test_ref_kfunc __kptr_ref *v;
long total_sum = -1;

extern struct prog_test_ref_kfunc *bpf_kfunc_call_test_acquire(unsigned long *sp) __ksym;
extern void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p) __ksym;

SEC("tc")
int test_jit_probe_mem(struct __sk_buff *ctx)
{
struct prog_test_ref_kfunc *p;
unsigned long zero = 0, sum;

p = bpf_kfunc_call_test_acquire(&zero);
if (!p)
return 1;

p = bpf_kptr_xchg(&v, p);
if (p)
goto release_out;

/* Direct map value access of kptr, should be PTR_UNTRUSTED */
p = v;
if (!p)
return 1;

asm volatile (
"r9 = %[p];"
"%[sum] = 0;"

/* r8 = p->a */
"r8 = *(u32 *)(r9 + 0);"
"%[sum] += r8;"

/* r8 = p->b */
"r8 = *(u32 *)(r9 + 4);"
"%[sum] += r8;"

"r9 += 8;"
/* r9 = p->a */
"r9 = *(u32 *)(r9 - 8);"
"%[sum] += r9;"

: [sum] "=r"(sum)
: [p] "r"(p)
: "r8", "r9"
);

total_sum = sum;
return 0;
release_out:
bpf_kfunc_call_test_release(p);
return 1;
}

char _license[] SEC("license") = "GPL";
@@ -81,6 +81,27 @@ int gre_set_tunnel(struct __sk_buff *skb)
return TC_ACT_OK;
}

SEC("tc")
int gre_set_tunnel_no_key(struct __sk_buff *skb)
{
int ret;
struct bpf_tunnel_key key;

__builtin_memset(&key, 0x0, sizeof(key));
key.remote_ipv4 = 0xac100164; /* 172.16.1.100 */
key.tunnel_ttl = 64;

ret = bpf_skb_set_tunnel_key(skb, &key, sizeof(key),
BPF_F_ZERO_CSUM_TX | BPF_F_SEQ_NUMBER |
BPF_F_NO_TUNNEL_KEY);
if (ret < 0) {
log_err(ret);
return TC_ACT_SHOT;
}

return TC_ACT_OK;
}

SEC("tc")
int gre_get_tunnel(struct __sk_buff *skb)
{
@@ -66,15 +66,20 @@ config_device()

add_gre_tunnel()
{
tun_key=
if [ -n "$1" ]; then
tun_key="key $1"
fi

# at_ns0 namespace
ip netns exec at_ns0 \
ip link add dev $DEV_NS type $TYPE seq key 2 \
ip link add dev $DEV_NS type $TYPE seq $tun_key \
local 172.16.1.100 remote 172.16.1.200
ip netns exec at_ns0 ip link set dev $DEV_NS up
ip netns exec at_ns0 ip addr add dev $DEV_NS 10.1.1.100/24

# root namespace
ip link add dev $DEV type $TYPE key 2 external
ip link add dev $DEV type $TYPE $tun_key external
ip link set dev $DEV up
ip addr add dev $DEV 10.1.1.200/24
}

@@ -238,7 +243,7 @@ test_gre()

check $TYPE
config_device
add_gre_tunnel
add_gre_tunnel 2
attach_bpf $DEV gre_set_tunnel gre_get_tunnel
ping $PING_ARG 10.1.1.100
check_err $?

@@ -253,6 +258,30 @@ test_gre()
echo -e ${GREEN}"PASS: $TYPE"${NC}
}

test_gre_no_tunnel_key()
{
TYPE=gre
DEV_NS=gre00
DEV=gre11
ret=0

check $TYPE
config_device
add_gre_tunnel
attach_bpf $DEV gre_set_tunnel_no_key gre_get_tunnel
ping $PING_ARG 10.1.1.100
check_err $?
ip netns exec at_ns0 ping $PING_ARG 10.1.1.200
check_err $?
cleanup

if [ $ret -ne 0 ]; then
echo -e ${RED}"FAIL: $TYPE"${NC}
return 1
fi
echo -e ${GREEN}"PASS: $TYPE"${NC}
}

test_ip6gre()
{
TYPE=ip6gre

@@ -589,6 +618,7 @@ cleanup()
ip link del ipip6tnl11 2> /dev/null
ip link del ip6ip6tnl11 2> /dev/null
ip link del gretap11 2> /dev/null
ip link del gre11 2> /dev/null
ip link del ip6gre11 2> /dev/null
ip link del ip6gretap11 2> /dev/null
ip link del geneve11 2> /dev/null

@@ -641,6 +671,10 @@ bpf_tunnel_test()
test_gre
errors=$(( $errors + $? ))

echo "Testing GRE tunnel (without tunnel keys)..."
test_gre_no_tunnel_key
errors=$(( $errors + $? ))

echo "Testing IP6GRE tunnel..."
test_ip6gre
errors=$(( $errors + $? ))