WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Ivan Khoronzhuk	54b7fbd448	samples/bpf: Drop unnecessarily inclusion for bpf_load Drop inclusion for bpf_load -I$(objtree)/usr/include as it is included for all objects anyway, with above line: KBUILD_HOSTCFLAGS += -I$(objtree)/usr/include Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191011002808.28206-7-ivan.khoronzhuk@linaro.org	2019-10-12 16:08:59 -07:00
Ivan Khoronzhuk	0e865aedad	samples/bpf: Use __LINUX_ARM_ARCH__ selector for arm For arm, -D__LINUX_ARM_ARCH__=X is min version used as instruction set selector and is absolutely required while parsing some parts of headers. It's present in KBUILD_CFLAGS but not in autoconf.h, so let's retrieve it from and add to programs cflags. In another case errors like "SMP is not supported" for armv7 and bunch of other errors are issued resulting to incorrect final object. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191011002808.28206-6-ivan.khoronzhuk@linaro.org	2019-10-12 16:08:59 -07:00
Ivan Khoronzhuk	2a560df7c1	samples/bpf: Use own EXTRA_CFLAGS for clang commands It can overlap with CFLAGS used for libraries built with gcc if not now then in next patches. Correct it here for simplicity. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191011002808.28206-5-ivan.khoronzhuk@linaro.org	2019-10-12 16:08:59 -07:00
Ivan Khoronzhuk	518c13401e	samples/bpf: Use --target from cross-compile For cross compiling the target triple can be inherited from cross-compile prefix as it's done in CLANG_FLAGS from kernel makefile. So copy-paste this decision from kernel Makefile. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191011002808.28206-4-ivan.khoronzhuk@linaro.org	2019-10-12 16:08:59 -07:00
Ivan Khoronzhuk	39e0c3649f	samples/bpf: Fix cookie_uid_helper_example obj build Don't list userspace "cookie_uid_helper_example" object in list for bpf objects. 'always' target is used for listing bpf programs, but 'cookie_uid_helper_example.o' is a user space ELF file, and covered by rule `per_socket_stats_example`, so shouldn't be in 'always'. Let us remove `always += cookie_uid_helper_example.o`, which avoids breaking cross compilation due to mismatched includes. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191011002808.28206-3-ivan.khoronzhuk@linaro.org	2019-10-12 16:08:59 -07:00
Ivan Khoronzhuk	cdd5b2d1fc	samples/bpf: Fix HDR_PROBE "echo" echo should be replaced with echo -e to handle '\n' correctly, but instead, replace it with printf as some systems can't handle echo -e. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191011002808.28206-2-ivan.khoronzhuk@linaro.org	2019-10-12 16:08:58 -07:00
Andrii Nakryiko	e01a75c159	libbpf: Move bpf_{helpers, helper_defs, endian, tracing}.h into libbpf Move bpf_helpers.h, bpf_tracing.h, and bpf_endian.h into libbpf. Move bpf_helper_defs.h generation into libbpf's Makefile. Ensure all those headers are installed along the other libbpf headers. Also, adjust selftests and samples include path to include libbpf now. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191008175942.1769476-6-andriin@fb.com	2019-10-08 23:16:03 +02:00
Andrii Nakryiko	3ac4dbe3dd	selftests/bpf: Split off tracing-only helpers into bpf_tracing.h Split-off PT_REGS-related helpers into bpf_tracing.h header. Adjust selftests and samples to include it where necessary. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191008175942.1769476-5-andriin@fb.com	2019-10-08 23:16:03 +02:00
Andrii Nakryiko	36b5d47113	selftests/bpf: samples/bpf: Split off legacy stuff from bpf_helpers.h Split off few legacy things from bpf_helpers.h into separate bpf_legacy.h file: - load_{byte\|half\|word}; - remove extra inner_idx and numa_node fields from bpf_map_def and introduce bpf_map_def_legacy for use in samples; - move BPF_ANNOTATE_KV_PAIR into bpf_legacy.h. Adjust samples and selftests accordingly by either including bpf_legacy.h and using bpf_map_def_legacy, or switching to BTF-defined maps altogether. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191008175942.1769476-3-andriin@fb.com	2019-10-08 23:16:03 +02:00
Daniel T. Lee	8fdf5b780a	samples: bpf: Add max_pckt_size option at xdp_adjust_tail Currently, at xdp_adjust_tail_kern.c, MAX_PCKT_SIZE is limited to 600. To make this size flexible, static global variable 'max_pcktsz' is added. By updating new packet size from the user space, xdp_adjust_tail_kern.o will use this value as a new max packet size. This static global variable can be accesible from .data section with bpf_object__find_map* from user space, since it is considered as internal map (accessible with .bss/.data/.rodata suffix). If no '-P <MAX_PCKT_SIZE>' option is used, the size of maximum packet will be 600 as a default. For clarity, change the helper to fetch map from 'bpf_map__next' to 'bpf_object__find_map_fd_by_name'. Also, changed the way to test prog_fd, map_fd from '!= 0' to '< 0', since fd could be 0 when stdin is closed. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191007172117.3916-1-danieltimlee@gmail.com	2019-10-07 20:22:27 -07:00
Anton Ivanov	4564a8bb57	samples/bpf: Trivial - fix spelling mistake in usage Fix spelling mistake. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191007082636.14686-1-anton.ivanov@cambridgegreys.com	2019-10-07 20:12:55 -07:00
KP Singh	98beb3edeb	samples/bpf: Add a workaround for asm_inline This was added in commit `eb11186930` ("compiler-types.h: add asm_inline definition") and breaks samples/bpf as clang does not support asm __inline. Fixes: `eb11186930` ("compiler-types.h: add asm_inline definition") Co-developed-by: Florent Revest <revest@google.com> Signed-off-by: Florent Revest <revest@google.com> Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191002191652.11432-1-kpsingh@chromium.org	2019-10-03 17:37:11 +02:00
Björn Töpel	e55190f26f	samples/bpf: Fix build for task_fd_query_user.c Add missing include for <linux/perf_event.h> which was removed from perf-sys.h in commit `91854f9a07` ("perf tools: Move everything related to sys_perf_event_open() to perf-sys.h"). Fixes: `91854f9a07` ("perf tools: Move everything related to sys_perf_event_open() to perf-sys.h") Reported-by: KP Singh <kpsingh@google.com> Reported-by: Florent Revest <revest@google.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: KP Singh <kpsingh@google.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191001112249.27341-1-bjorn.topel@gmail.com	2019-10-03 16:27:03 +02:00
Ciara Loftus	5a712e1363	samples/bpf: fix xdpsock l2fwd tx for unaligned mode Preserve the offset of the address of the received descriptor, and include it in the address set for the tx descriptor, so the kernel can correctly locate the start of the packet data. Fixes: `03895e63ff` ("samples/bpf: add buffer recycling for unaligned chunks to xdpsock") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-09-16 09:35:10 +02:00
Kevin Laatz	3945b37a97	samples/bpf: use hugepages in xdpsock app This patch modifies xdpsock to use mmap instead of posix_memalign. With this change, we can use hugepages when running the application in unaligned chunks mode. Using hugepages makes it more likely that we have physically contiguous memory, which supports the unaligned chunk mode better. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-31 01:08:27 +02:00
Kevin Laatz	03895e63ff	samples/bpf: add buffer recycling for unaligned chunks to xdpsock This patch adds buffer recycling support for unaligned buffers. Since we don't mask the addr to 2k at umem_reg in unaligned mode, we need to make sure we give back the correct (original) addr to the fill queue. We achieve this using the new descriptor format and associated masks. The new format uses the upper 16-bits for the offset and the lower 48-bits for the addr. Since we have a field for the offset, we no longer need to modify the actual address. As such, all we have to do to get back the original address is mask for the lower 48 bits (i.e. strip the offset and we get the address on it's own). Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-31 01:08:26 +02:00
Kevin Laatz	c543f54698	samples/bpf: add unaligned chunks mode support to xdpsock This patch adds support for the unaligned chunks mode. The addition of the unaligned chunks option will allow users to run the application with more relaxed chunk placement in the XDP umem. Unaligned chunks mode can be used with the '-u' or '--unaligned' command line options. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-31 01:08:26 +02:00
Ivan Khoronzhuk	bb4b5c08a8	samples: bpf: syscall_nrs: use mmap2 if defined For arm32 xdp sockets mmap2 is preferred, so use it if it's defined. Declaration of __NR_mmap can be skipped and it breaks build. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-21 14:31:38 +02:00
Magnus Karlsson	46738f73ea	samples/bpf: add use of need_wakeup flag in xdpsock This commit adds using the need_wakeup flag to the xdpsock sample application. It is turned on by default as we think it is a feature that seems to always produce a performance benefit, if the application has been written taking advantage of it. It can be turned off in the sample app by using the '-m' command line option. The txpush and l2fwd sub applications have also been updated to support poll() with multiple sockets. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-17 23:07:32 +02:00
Jesper Dangaard Brouer	abcce733ad	samples/bpf: xdp_fwd explain bpf_fib_lookup return codes Make it clear that this XDP program depend on the network stack to do the ARP resolution. This is connected with the BPF_FIB_LKUP_RET_NO_NEIGH return code from bpf_fib_lookup(). Another common mistake (seen via XDP-tutorial) is that users don't realize that sysctl net.ipv{4,6}.conf.all.forwarding setting is honored by bpf_fib_lookup. Reported-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-09 18:05:03 +02:00
Jesper Dangaard Brouer	a32a32cb26	samples/bpf: make xdp_fwd more practically usable via devmap lookup This address the TODO in samples/bpf/xdp_fwd_kern.c, which points out that the chosen egress index should be checked for existence in the devmap. This can now be done via taking advantage of Toke's work in commit `0cdbb4b09a` ("devmap: Allow map lookups from eBPF"). This change makes xdp_fwd more practically usable, as this allows for a mixed environment, where IP-forwarding fallback to network stack, if the egress device isn't configured to use XDP. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-09 18:05:03 +02:00
Jesper Dangaard Brouer	3783d43752	samples/bpf: xdp_fwd rename devmap name to be xdp_tx_ports The devmap name 'tx_port' came from a copy-paste from xdp_redirect_map which only have a single TX port. Change name to xdp_tx_ports to make it more descriptive. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-09 18:05:03 +02:00
Andrii Nakryiko	c17bec549c	samples/bpf: switch trace_output sample to perf_buffer API Convert trace_output sample to libbpf's perf_buffer API. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-07-23 16:05:42 -07:00
Andrii Nakryiko	f58a4d51d8	samples/bpf: convert xdp_sample_pkts_user to perf_buffer API Convert xdp_sample_pkts_user to libbpf's perf_buffer API. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-07-23 16:05:42 -07:00
Ilya Leoshkevich	81f522f96f	samples/bpf: build with -D__TARGET_ARCH_$(SRCARCH) While $ARCH can be relatively flexible (see Makefile and tools/scripts/Makefile.arch), $SRCARCH always corresponds to a directory name under arch/. Therefore, build samples with -D__TARGET_ARCH_$(SRCARCH), since that matches the expectations of bpf_helpers.h. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-07-15 23:13:17 +02:00
David S. Miller	af144a9834	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two cases of overlapping changes, nothing fancy. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-07-08 19:48:57 -07:00
David S. Miller	c3ead2df97	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2019-07-03 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix the interpreter to properly handle BPF_ALU32 \| BPF_ARSH on BE architectures, from Jiong. 2) Fix several bugs in the x32 BPF JIT for handling shifts by 0, from Luke and Xi. 3) Fix NULL pointer deref in btf_type_is_resolve_source_only(), from Stanislav. 4) Properly handle the check that forwarding is enabled on the device in bpf_ipv6_fib_lookup() helper code, from Anton. 5) Fix UAPI bpf_prog_info fields alignment for archs that have 16 bit alignment such as m68k, from Baruch. 6) Fix kernel hanging in unregister_netdevice loop while unregistering device bound to XDP socket, from Ilya. 7) Properly terminate tail update in xskq_produce_flush_desc(), from Nathan. 8) Fix broken always_inline handling in test_lwt_seg6local, from Jiri. 9) Fix bpftool to use correct argument in cgroup errors, from Jakub. 10) Fix detaching dummy prog in XDP redirect sample code, from Prashant. 11) Add Jonathan to AF_XDP reviewers, from Björn. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-07-03 12:09:00 -07:00
Stanislav Fomichev	d78e3f0614	samples/bpf: fix tcp_bpf.readme detach command Copy-paste, should be detach, not attach. Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-07-03 16:52:02 +02:00
Stanislav Fomichev	395338843d	samples/bpf: add sample program that periodically dumps TCP stats Uses new RTT callback to dump stats every second. $ mkdir -p /tmp/cgroupv2 $ mount -t cgroup2 none /tmp/cgroupv2 $ mkdir -p /tmp/cgroupv2/foo $ echo $$ >> /tmp/cgroupv2/foo/cgroup.procs $ bpftool prog load ./tcp_dumpstats_kern.o /sys/fs/bpf/tcp_prog $ bpftool cgroup attach /tmp/cgroupv2/foo sock_ops pinned /sys/fs/bpf/tcp_prog $ bpftool prog tracelog $ # run neper/netperf/etc Used neper to compare performance with and without this program attached and didn't see any noticeable performance impact. Sample output: <idle>-0 [015] ..s. 2074.128800: 0: dsack_dups=0 delivered=242526 <idle>-0 [015] ..s. 2074.128808: 0: delivered_ce=0 icsk_retransmits=0 <idle>-0 [015] ..s. 2075.130133: 0: dsack_dups=0 delivered=323599 <idle>-0 [015] ..s. 2075.130138: 0: delivered_ce=0 icsk_retransmits=0 <idle>-0 [005] .Ns. 2076.131440: 0: dsack_dups=0 delivered=404648 <idle>-0 [005] .Ns. 2076.131447: 0: delivered_ce=0 icsk_retransmits=0 Cc: Eric Dumazet <edumazet@google.com> Cc: Priyaranjan Jha <priyarjha@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-07-03 16:52:02 +02:00
brakmo	71634d7f92	bpf: Add support for fq's EDT to HBM Adds support for fq's Earliest Departure Time to HBM (Host Bandwidth Manager). Includes a new BPF program supporting EDT, and also updates corresponding programs. It will drop packets with an EDT of more than 500us in the future unless the packet belongs to a flow with less than 2 packets in flight. This is done so each flow has at least 2 packets in flight, so they will not starve, and also to help prevent delayed ACK timeouts. It will also work with ECN enabled traffic, where the packets will be CE marked if their EDT is more than 50us in the future. The table below shows some performance numbers. The flows are back to back RPCS. One server sending to another, either 2 or 4 flows. One flow is a 10KB RPC, the rest are 1MB RPCs. When there are more than one flow of a given RPC size, the numbers represent averages. The rate limit applies to all flows (they are in the same cgroup). Tests ending with "-edt" ran with the new BPF program supporting EDT. Tests ending with "-hbt" ran on top HBT qdisc with the specified rate (i.e. no HBM). The other tests ran with the HBM BPF program included in the HBM patch-set. EDT has limited value when using DCTCP, but it helps in many cases when using Cubic. It usually achieves larger link utilization and lower 99% latencies for the 1MB RPCs. HBM ends up queueing a lot of packets with its default parameter values, reducing the goodput of the 10KB RPCs and increasing their latency. Also, the RTTs seen by the flows are quite large. Aggr 10K 10K 10K 1MB 1MB 1MB Limit rate drops RTT rate P90 P99 rate P90 P99 Test rate Flows Mbps % us Mbps us us Mbps ms ms -------- ---- ----- ---- ----- --- ---- ---- ---- ---- ---- ---- cubic 1G 2 904 0.02 108 257 511 539 647 13.4 24.5 cubic-edt 1G 2 982 0.01 156 239 656 967 743 14.0 17.2 dctcp 1G 2 977 0.00 105 324 408 744 653 14.5 15.9 dctcp-edt 1G 2 981 0.01 142 321 417 811 660 15.7 17.0 cubic-htb 1G 2 919 0.00 1825 40 2822 4140 879 9.7 9.9 cubic 200M 2 155 0.30 220 81 532 655 74 283 450 cubic-edt 200M 2 188 0.02 222 87 1035 1095 101 84 85 dctcp 200M 2 188 0.03 111 77 912 939 111 76 325 dctcp-edt 200M 2 188 0.03 217 74 1416 1738 114 76 79 cubic-htb 200M 2 188 0.00 5015 8 14ms 15ms 180 48 50 cubic 1G 4 952 0.03 110 165 516 546 262 38 154 cubic-edt 1G 4 973 0.01 190 111 1034 1314 287 65 79 dctcp 1G 4 951 0.00 103 180 617 905 257 37 38 dctcp-edt 1G 4 967 0.00 163 151 732 1126 272 43 55 cubic-htb 1G 4 914 0.00 3249 13 7ms 8ms 300 29 34 cubic 5G 4 4236 0.00 134 305 490 624 1310 10 17 cubic-edt 5G 4 4865 0.00 156 306 425 759 1520 10 16 dctcp 5G 4 4936 0.00 128 485 221 409 1484 7 9 dctcp-edt 5G 4 4924 0.00 148 390 392 623 1508 11 26 v1 -> v2: Incorporated Andrii's suggestions v2 -> v3: Incorporated Yonghong's suggestions v3 -> v4: Removed credit update that is not needed Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-07-03 15:03:00 +02:00
Maxim Mikityanskiy	123e8da1d3	xsk: Change the default frame size to 4096 and allow controlling it The typical XDP memory scheme is one packet per page. Change the AF_XDP frame size in libbpf to 4096, which is the page size on x86, to allow libbpf to be used with the drivers with the packet-per-page scheme. Add a command line option -f to xdpsock to allow to specify a custom frame size. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:26 +02:00
Daniel T. Lee	9e859e8f19	samples: bpf: make the use of xdp samples consistent Currently, each xdp samples are inconsistent in the use. Most of the samples fetch the interface with it's name. (ex. xdp1, xdp2skb, xdp_redirect_cpu, xdp_sample_pkts, etc.) But some of the xdp samples are fetching the interface with ifindex by command argument. This commit enables xdp samples to fetch interface with it's name without changing the original index interface fetching. (<ifname\|ifindex> fetching in the same way as xdp_sample_pkts_user.c does.) Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-26 15:39:15 +02:00
Michal Rostecki	7ae9f2817a	samples: bpf: Remove bpf_debug macro in favor of bpf_printk ibumad example was implementing the bpf_debug macro which is exactly the same as the bpf_printk macro available in bpf_helpers.h. This change makes use of bpf_printk instead of bpf_debug. Signed-off-by: Michal Rostecki <mrostecki@opensuse.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-06-24 18:18:30 -07:00
Prashant Bhole	20f6239d49	samples/bpf: xdp_redirect, correctly get dummy program id When we terminate xdp_redirect, it ends up with following message: "Program on iface OUT changed, not removing" This results in dummy prog still attached to OUT interface. It is because signal handler checks if the programs are the same that we had attached. But while fetching dummy_prog_id, current code uses prog_fd instead of dummy_prog_fd. This patch passes the correct fd. Fixes: `3b7a8ec2de` ("samples/bpf: Check the prog id before exiting") Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-24 15:56:56 +02:00
David S. Miller	92ad6325cb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Minor SPDX change conflict. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-22 08:59:24 -04:00
Linus Torvalds	c884d8ac7f	SPDX update for 5.2-rc6 Another round of SPDX updates for 5.2-rc6 Here is what I am guessing is going to be the last "big" SPDX update for 5.2. It contains all of the remaining GPLv2 and GPLv2+ updates that were "easy" to determine by pattern matching. The ones after this are going to be a bit more difficult and the people on the spdx list will be discussing them on a case-by-case basis now. Another 5000+ files are fixed up, so our overall totals are: Files checked: 64545 Files with SPDX: 45529 Compared to the 5.1 kernel which was: Files checked: 63848 Files with SPDX: 22576 This is a huge improvement. Also, we deleted another 20000 lines of boilerplate license crud, always nice to see in a diffstat. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXQyQYA8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymnGQCghETUBotn1p3hTjY56VEs6dGzpHMAnRT0m+lv kbsjBGEJpLbMRB2krnaU =RMcT -----END PGP SIGNATURE----- Merge tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull still more SPDX updates from Greg KH: "Another round of SPDX updates for 5.2-rc6 Here is what I am guessing is going to be the last "big" SPDX update for 5.2. It contains all of the remaining GPLv2 and GPLv2+ updates that were "easy" to determine by pattern matching. The ones after this are going to be a bit more difficult and the people on the spdx list will be discussing them on a case-by-case basis now. Another 5000+ files are fixed up, so our overall totals are: Files checked: 64545 Files with SPDX: 45529 Compared to the 5.1 kernel which was: Files checked: 63848 Files with SPDX: 22576 This is a huge improvement. Also, we deleted another 20000 lines of boilerplate license crud, always nice to see in a diffstat" * tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (65 commits) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 507 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 506 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 503 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 502 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 501 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 498 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 497 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 496 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 495 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 491 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 490 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 489 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 488 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 487 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 486 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 485 ...	2019-06-21 09:58:42 -07:00
David S. Miller	dca73a65a6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2019-06-19 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) new SO_REUSEPORT_DETACH_BPF setsocktopt, from Martin. 2) BTF based map definition, from Andrii. 3) support bpf_map_lookup_elem for xskmap, from Jonathan. 4) bounded loops and scalar precision logic in the verifier, from Alexei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-20 00:06:27 -04:00
Thomas Gleixner	7f904d7e1f	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505 Based on 1 normalized pattern(s): gplv2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 58 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081207.556988620@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-19 17:11:22 +02:00
David S. Miller	13091aa305	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Honestly all the conflicts were simple overlapping changes, nothing really interesting to report. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 20:20:36 -07:00
Linus Torvalds	da0f382029	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Lots of bug fixes here: 1) Out of bounds access in __bpf_skc_lookup, from Lorenz Bauer. 2) Fix rate reporting in cfg80211_calculate_bitrate_he(), from John Crispin. 3) Use after free in psock backlog workqueue, from John Fastabend. 4) Fix source port matching in fdb peer flow rule of mlx5, from Raed Salem. 5) Use atomic_inc_not_zero() in fl6_sock_lookup(), from Eric Dumazet. 6) Network header needs to be set for packet redirect in nfp, from John Hurley. 7) Fix udp zerocopy refcnt, from Willem de Bruijn. 8) Don't assume linear buffers in vxlan and geneve error handlers, from Stefano Brivio. 9) Fix TOS matching in mlxsw, from Jiri Pirko. 10) More SCTP cookie memory leak fixes, from Neil Horman. 11) Fix VLAN filtering in rtl8366, from Linus Walluij. 12) Various TCP SACK payload size and fragmentation memory limit fixes from Eric Dumazet. 13) Use after free in pneigh_get_next(), also from Eric Dumazet. 14) LAPB control block leak fix from Jeremy Sowden" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (145 commits) lapb: fixed leak of control-blocks. tipc: purge deferredq list for each grp member in tipc_group_delete ax25: fix inconsistent lock state in ax25_destroy_timer neigh: fix use-after-free read in pneigh_get_next tcp: fix compile error if !CONFIG_SYSCTL hv_sock: Suppress bogus "may be used uninitialized" warnings be2net: Fix number of Rx queues used for flow hashing net: handle 802.1P vlan 0 packets properly tcp: enforce tcp_min_snd_mss in tcp_mtu_probing() tcp: add tcp_min_snd_mss sysctl tcp: tcp_fragment() should apply sane memory limits tcp: limit payload size of sacked skbs Revert "net: phylink: set the autoneg state in phylink_phy_change" bpf: fix nested bpf tracepoints with per-cpu data bpf: Fix out of bounds memory access in bpf_sk_storage vsock/virtio: set SOCK_DONE on peer shutdown net: dsa: rtl8366: Fix up VLAN filtering net: phylink: set the autoneg state in phylink_phy_change net: add high_order_alloc_disable sysctl/static key tcp: add tcp_tx_skb_cache sysctl ...	2019-06-17 15:55:34 -07:00
Daniel T. Lee	4d18f6de6a	samples: bpf: refactor header include path Currently, header inclusion in each file is inconsistent. For example, "libbpf.h" header is included as multiple ways. #include "bpf/libbpf.h" #include "libbpf.h" Due to commit `b552d33c80` ("samples/bpf: fix include path in Makefile"), $(srctree)/tools/lib/bpf/ path had been included during build, path "bpf/" in header isn't necessary anymore. This commit removes path "bpf/" in header inclusion. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:28:36 +02:00
Daniel T. Lee	fa206dccd8	samples: bpf: remove unnecessary include options in Makefile Due to recent change of include path at commit `b552d33c80` ("samples/bpf: fix include path in Makefile"), some of the previous include options became unnecessary. This commit removes duplicated include options in Makefile. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:28:36 +02:00
Prashant Bhole	b552d33c80	samples/bpf: fix include path in Makefile Recent commit included libbpf.h in selftests/bpf/bpf_util.h. Since some samples use bpf_util.h and samples/bpf/Makefile doesn't have libbpf.h path included, build was failing. Let's add the path in samples/bpf/Makefile. Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-15 01:14:06 +02:00
Jakub Kicinski	0ed3cc4abc	samples: bpf: don't run probes at the local make stage Quentin reports that commit `07c3bbdb1a` ("samples: bpf: print a warning about headers_install") is producing the false positive when make is invoked locally, from the samples/bpf/ directory. When make is run locally it hits the "all" target, which will recursively invoke make through the full build system. Speed up the "local" run which doesn't actually build anything, and avoid false positives by skipping all the probes if not in kbuild environment (cover both the new warning and the BTF probes). Reported-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-06-10 23:37:20 -07:00
David S. Miller	38e406f600	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2019-06-07 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix several bugs in riscv64 JIT code emission which forgot to clear high 32-bits for alu32 ops, from Björn and Luke with selftests covering all relevant BPF alu ops from Björn and Jiong. 2) Two fixes for UDP BPF reuseport that avoid calling the program in case of __udp6_lib_err and UDP GRO which broke reuseport_select_sock() assumption that skb->data is pointing to transport header, from Martin. 3) Two fixes for BPF sockmap: a use-after-free from sleep in psock's backlog workqueue, and a missing restore of sk_write_space when psock gets dropped, from Jakub and John. 4) Fix unconnected UDP sendmsg hook API which is insufficient as-is since it breaks standard applications like DNS if reverse NAT is not performed upon receive, from Daniel. 5) Fix an out-of-bounds read in __bpf_skc_lookup which in case of AF_INET6 fails to verify that the length of the tuple is long enough, from Lorenz. 6) Fix libbpf's libbpf__probe_raw_btf to return an fd instead of 0/1 (for {un,}successful probe) as that is expected to be propagated as an fd to load_sk_storage_btf() and thus closing the wrong descriptor otherwise, from Michal. 7) Fix bpftool's JSON output for the case when a lookup fails, from Krzesimir. 8) Minor misc fixes in docs, samples and selftests, from various others. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-07 14:46:47 -07:00
David S. Miller	a6cdeeb16b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Some ISDN files that got removed in net-next had some changes done in mainline, take the removals. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-07 11:00:14 -07:00
Jakub Kicinski	07c3bbdb1a	samples: bpf: print a warning about headers_install It seems like periodically someone posts patches to "fix" header includes. The issue is that samples expect the include path to have the uAPI headers (from usr/) first, and then tools/ headers, so that locally installed uAPI headers take precedence. This means that if users didn't run headers_install they will see all sort of strange compilation errors, e.g.: HOSTCC samples/bpf/test_lru_dist samples/bpf/test_lru_dist.c:39:8: error: redefinition of ‘struct list_head’ struct list_head { ^~~~~~~~~ In file included from samples/bpf/test_lru_dist.c:9:0: ../tools/include/linux/types.h:69:8: note: originally defined here struct list_head { ^~~~~~~~~ Try to detect this situation, and print a helpful warning. v2: just use HOSTCC (Jiong). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-06 02:15:14 +02:00
Thomas Gleixner	5b497af42f	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of version 2 of the gnu general public license as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 64 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141901.894819585@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:36:38 +02:00
Colin Ian King	2ed99339e9	bpf: hbm: fix spelling mistake "notifcations" -> "notificiations" There is a spelling mistake in the help information, fix this. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-04 16:55:32 +02:00
brakmo	d58c6f7212	bpf: Add more stats to HBM Adds more stats to HBM, including average cwnd and rtt of all TCP flows, percents of packets that are ecn ce marked and distribution of return values. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-05-31 16:41:29 -07:00
brakmo	ffd81558d5	bpf: Add cn support to hbm_out_kern.c Update hbm_out_kern.c to support returning cn notifications. Also updates relevant files to allow disabling cn notifications. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-05-31 16:41:29 -07:00
Thomas Gleixner	25763b3c86	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of version 2 of the gnu general public license as published by the free software foundation extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 107 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528171438.615055994@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:29:53 -07:00
Roman Gushchin	ba0c0cc05d	selftests/bpf: convert test_cgrp2_attach2 example into kselftest Convert test_cgrp2_attach2 example into a proper test_cgroup_attach kselftest. It's better because we do run kselftest on a constant basis, so there are better chances to spot a potential regression. Also make it slightly less verbose to conform kselftests output style. Output example: $ ./test_cgroup_attach #override:PASS #multi:PASS test_cgroup_attach:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-05-28 09:30:02 -07:00
Daniel T. Lee	37b54aed12	samples/bpf: fix a couple of style issues in bpf_load This commit fixes a few style problems in samples/bpf/bpf_load.c: - Magic string use of 'DEBUGFS' - Useless zero initialization of a global variable - Minor style fix with whitespace Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-05-28 11:10:37 +02:00
Matteo Croce	d9a6f413f8	samples: bpf: add ibumad sample to .gitignore This commit adds ibumad to .gitignore which is currently ommited from the ignore file. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-05-24 19:58:03 -07:00
Michal Rostecki	c87f60a77d	samples: bpf: Do not define bpf_printk macro The bpf_printk macro was moved to bpf_helpers.h which is included in all example programs. Signed-off-by: Michal Rostecki <mrostecki@opensuse.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-05-24 13:47:17 -07:00
Matteo Croce	a195cefff4	samples, bpf: suppress compiler warning GCC 9 fails to calculate the size of local constant strings and produces a false positive: samples/bpf/task_fd_query_user.c: In function ‘test_debug_fs_uprobe’: samples/bpf/task_fd_query_user.c:242:67: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 215 [-Wformat-truncation=] 242 \| snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/events/%ss/%s/id", \| ^~ 243 \| event_type, event_alias); \| ~~~~~~~~~~~ samples/bpf/task_fd_query_user.c:242:2: note: ‘snprintf’ output between 45 and 300 bytes into a destination of size 256 242 \| snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/events/%ss/%s/id", \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 243 \| event_type, event_alias); \| ~~~~~~~~~~~~~~~~~~~~~~~~ Workaround this by lowering the buffer size to a reasonable value. Related GCC Bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83431 Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-05-21 16:55:20 +02:00
Chang-Hsien Tsai	f7c2d64bac	samples, bpf: fix to change the buffer size for read() If the trace for read is larger than 4096, the return value sz will be 4096. This results in off-by-one error on buf: static char buf[4096]; ssize_t sz; sz = read(trace_fd, buf, sizeof(buf)); if (sz > 0) { buf[sz] = 0; puts(buf); } Signed-off-by: Chang-Hsien Tsai <luke.tw@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-05-21 16:36:48 +02:00
Linus Torvalds	dce45af5c2	5.2 Merge Window pull request This has been a smaller cycle than normal. One new driver was accepted, which is unusual, and at least one more driver remains in review on the list. - Driver fixes for hns, hfi1, nes, rxe, i40iw, mlx5, cxgb4, vmw_pvrdma - Many patches from MatthewW converting radix tree and IDR users to use xarray - Introduction of tracepoints to the MAD layer - Build large SGLs at the start for DMA mapping and get the driver to split them - Generally clean SGL handling code throughout the subsystem - Support for restricting RDMA devices to net namespaces for containers - Progress to remove object allocation boilerplate code from drivers - Change in how the mlx5 driver shows representor ports linked to VFs - mlx5 uapi feature to access the on chip SW ICM memory - Add a new driver for 'EFA'. This is HW that supports user space packet processing through QPs in Amazon's cloud -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlzTIU0ACgkQOG33FX4g mxrGKQ/8CqpyvuCyZDW5ovO4DI4YlzYSPXehWlwxA4CWhU1AYTujutnNOdZdngnz atTthOlJpZWJV26orvvzwIOi4qX/5UjLXEY3HYdn07JP1Z4iT7E3P4W2sdU3vdl3 j8bU7xM7ZWmnGxrBZ6yQlVRadEhB8+HJIZWMw+wx66cIPnvU+g9NgwouH67HEEQ3 PU8OCtGBwNNR508WPiZhjqMDfi/3BED4BfCihFhMbZEgFgObjRgtCV0M33SSXKcR IO2FGNVuDAUBlND3vU9guW1+M77xE6p1GvzkIgdCp6qTc724NuO5F2ngrpHKRyZT CxvBhAJI6tAZmjBVnmgVJex7rA8p+y/8M/2WD6GE3XSO89XVOkzNBiO2iTMeoxXr +CX6VvP2BWwCArxsfKMgW3j0h/WVE9w8Ciej1628m1NvvKEV4AGIJC1g93lIJkRN i3RkJ5PkIrdBrTEdKwDu1FdXQHaO7kGgKvwzJ7wBFhso8BRMrMfdULiMbaXs2Bw1 WdL5zoSe/bLUpPZxcT9IjXRxY5qR0FpIOoo6925OmvyYe/oZo1zbitS5GGbvV90g tkq6Jb+aq8ZKtozwCo+oMcg9QPLYNibQsnkL3QirtURXWCG467xdgkaJLdF6s5Oh cp+YBqbR/8HNMG/KQlCfnNQKp1ci8mG3EdthQPhvdcZ4jtbqnSI= =TS64 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "This has been a smaller cycle than normal. One new driver was accepted, which is unusual, and at least one more driver remains in review on the list. Summary: - Driver fixes for hns, hfi1, nes, rxe, i40iw, mlx5, cxgb4, vmw_pvrdma - Many patches from MatthewW converting radix tree and IDR users to use xarray - Introduction of tracepoints to the MAD layer - Build large SGLs at the start for DMA mapping and get the driver to split them - Generally clean SGL handling code throughout the subsystem - Support for restricting RDMA devices to net namespaces for containers - Progress to remove object allocation boilerplate code from drivers - Change in how the mlx5 driver shows representor ports linked to VFs - mlx5 uapi feature to access the on chip SW ICM memory - Add a new driver for 'EFA'. This is HW that supports user space packet processing through QPs in Amazon's cloud" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (186 commits) RDMA/ipoib: Allow user space differentiate between valid dev_port IB/core, ipoib: Do not overreact to SM LID change event RDMA/device: Don't fire uevent before device is fully initialized lib/scatterlist: Remove leftover from sg_page_iter comment RDMA/efa: Add driver to Kconfig/Makefile RDMA/efa: Add the efa module RDMA/efa: Add EFA verbs implementation RDMA/efa: Add common command handlers RDMA/efa: Implement functions that submit and complete admin commands RDMA/efa: Add the ABI definitions RDMA/efa: Add the com service API definitions RDMA/efa: Add the efa_com.h file RDMA/efa: Add the efa.h header file RDMA/efa: Add EFA device definitions RDMA: Add EFA related definitions RDMA/umem: Remove hugetlb flag RDMA/bnxt_re: Use core helpers to get aligned DMA address RDMA/i40iw: Use core helpers to get aligned DMA address within a supported page size RDMA/verbs: Add a DMA iterator to return aligned contiguous memory blocks RDMA/umem: Add API to find best driver supported page size in an MR ...	2019-05-09 09:02:46 -07:00
Daniel T. Lee	ead442a0f9	samples: bpf: add hbm sample to .gitignore This commit adds hbm to .gitignore which is currently ommited from the ignore file. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-04-25 23:38:00 +02:00
Alexei Starovoitov	636e78b1cd	samples/bpf: fix build with new clang clang started to error on invalid asm clobber usage in x86 headers and many bpf program samples failed to build with the message: CLANG-bpf /data/users/ast/bpf-next/samples/bpf/xdp_redirect_kern.o In file included from /data/users/ast/bpf-next/samples/bpf/xdp_redirect_kern.c:14: In file included from ../include/linux/in.h:23: In file included from ../include/uapi/linux/in.h:24: In file included from ../include/linux/socket.h:8: In file included from ../include/linux/uio.h:14: In file included from ../include/crypto/hash.h:16: In file included from ../include/linux/crypto.h:26: In file included from ../include/linux/uaccess.h:5: In file included from ../include/linux/sched.h:15: In file included from ../include/linux/sem.h:5: In file included from ../include/uapi/linux/sem.h:5: In file included from ../include/linux/ipc.h:9: In file included from ../include/linux/refcount.h:72: ../arch/x86/include/asm/refcount.h:72:36: error: asm-specifier for input or output variable conflicts with asm clobber list r->refs.counter, e, "er", i, "cx"); ^ ../arch/x86/include/asm/refcount.h:86:27: error: asm-specifier for input or output variable conflicts with asm clobber list r->refs.counter, e, "cx"); ^ 2 errors generated. Override volatile() to workaround the problem. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-04-05 16:28:36 +02:00
Daniel T. Lee	e67b2c7154	samples, selftests/bpf: add NULL check for ksym_search Since, ksym_search added with verification logic for symbols existence, it could return NULL when the kernel symbols are not loaded. This commit will add NULL check logic after ksym_search. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-04-04 16:43:47 +02:00
Ira Weiny	0ac01febd4	BPF: Add sample code for new ib_umad tracepoint Provide a count of class types for a summary of MAD packets. The example shows one way to filter the trace data based on management class. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-03-27 15:53:26 -03:00
Daniel T. Lee	ab99e7a8f7	samples: bpf: add xdp_sample_pkts to .gitignore This commit adds xdp_sample_pkts to .gitignore which is currently ommited from the ignore file. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-03-21 19:35:36 -07:00
Colin Ian King	5b4f21b2a5	bpf: hbm: fix spelling mistake "deault" -> "default" There are a couple of typos, fix these. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-03-07 10:35:00 +01:00
brakmo	4ffd44cfd1	bpf: HBM test script Script for testing HBM (Host Bandwidth Manager) framework. It creates a cgroup to use for testing and load a BPF program to limit egress bandwidht. It then uses iperf3 or netperf to create loads. The output is the goodput in Mbps (unless -D is used). It can work on a single host using loopback or among two hosts (with netperf). When using loopback, it is recommended to also introduce a delay of at least 1ms (-d=1), otherwise the assigned bandwidth is likely to be underutilized. USAGE: $name [out] [-b=<prog>\|--bpf=<prog>] [-c=<cc>\|--cc=<cc>] [-D] [-d=<delay>\|--delay=<delay>] [--debug] [-E] [-f=<#flows>\|--flows=<#flows>] [-h] [-i=<id>\|--id=<id >] [-l] [-N] [-p=<port>\|--port=<port>] [-P] [-q=<qdisc>] [-R] [-s=<server>\|--server=<server] [--stats] [-t=<time>\|--time=<time>] [-w] [cubic\|dctcp] Where: out Egress (default egress) -b or --bpf BPF program filename to load and attach. Default is nrm_out_kern.o for egress, -c or -cc TCP congestion control (cubic or dctcp) -d or --delay Add a delay in ms using netem -D In addition to the goodput in Mbps, it also outputs other detailed information. This information is test dependent (i.e. iperf3 or netperf). --debug Print BPF trace buffer -E Enable ECN (not required for dctcp) -f or --flows Number of concurrent flows (default=1) -i or --id cgroup id (an integer, default is 1) -l Do not limit flows using loopback -N Use netperf instead of iperf3 -h Help -p or --port iperf3 port (default is 5201) -P Use an iperf3 instance for each flow -q Use the specified qdisc. -r or --rate Rate in Mbps (default 1s 1Gbps) -R Use TCP_RR for netperf. 1st flow has req size of 10KB, rest of 1MB. Reply in all cases is 1 byte. More detailed output for each flow can be found in the files netperf.<cg>.<flow>, where <cg> is the cgroup id as specified with the -i flag, and <flow> is the flow id starting at 1 and increasing by 1 for flow (as specified by -f). -s or --server hostname of netperf server. Used to create netperf test traffic between to hosts (default is within host) netserver must be running on the host. --stats Get HBM stats (marked, dropped, etc.) -t or --time duration of iperf3 in seconds (default=5) -w Work conserving flag. cgroup can increase its bandwidth beyond the rate limit specified while there is available bandwidth. Current implementation assumes there is only one NIC (eth0), but can be extended to support multiple NICs. This is just a proof of concept. cubic or dctcp specify TCP CC to use Examples: ./do_hbm_test.sh -l -d=1 -D --stats Runs a 5 second test, using a single iperf3 flow and with the default rate limit of 1Gbps and a delay of 1ms (using netem) using the default TCP congestion control on the loopback device (hence we use "-l" to enforce bandwidth limit on loopback device). Since no direction is specified, it defaults to egress. Since no TCP CC algorithm is specified it uses the system default (Cubic for this test). With no -D flag, only the value of the AGGREGATE OUTPUT would show. id refers to the cgroup id and is useful when running multi cgroup tests (supported by a future patch). This patchset does not support calling TCP's congesion window reduction, even when packets are dropped by the BPF program, resulting in a large number of packets dropped. It is recommended that the current HBM implemenation only be used with ECN enabled flows. A future patch will add support for reducing TCP's cwnd and will increase the performance of non-ECN enabled flows. Output: Details for HBM in cgroup 1 id:1 rate_mbps:493 duration:4.8 secs packets:11355 bytes_MB:590 pkts_dropped:4497 bytes_dropped_MB:292 pkts_marked_percent: 39.60 bytes_marked_percent: 49.49 pkts_dropped_percent: 39.60 bytes_dropped_percent: 49.49 PING AVG DELAY:2.075 AGGREGATE_GOODPUT:505 ./do_nrm_test.sh -l -d=1 -D --stats dctcp Same as above but using dctcp. Note that fewer bytes are dropped (0.01% vs. 49%). Output: Details for HBM in cgroup 1 id:1 rate_mbps:945 duration:4.9 secs packets:16859 bytes_MB:578 pkts_dropped:1 bytes_dropped_MB:0 pkts_marked_percent: 28.74 bytes_marked_percent: 45.15 pkts_dropped_percent: 0.01 bytes_dropped_percent: 0.01 PING AVG DELAY:2.083 AGGREGATE_GOODPUT:965 ./do_nrm_test.sh -d=1 -D --stats As first example, but without limiting loopback device (i.e. no "-l" flag). Since there is no bandwidth limiting, no details for HBM are printed out. Output: Details for HBM in cgroup 1 PING AVG DELAY:2.019 AGGREGATE_GOODPUT:42655 ./do_hbm.sh -l -d=1 -D --stats -f=2 Uses iper3 and does 2 flows ./do_hbm.sh -l -d=1 -D --stats -f=4 -P Uses iperf3 and does 4 flows, each flow as a separate process. ./do_hbm.sh -l -d=1 -D --stats -f=4 -N Uses netperf, 4 flows ./do_hbm.sh -f=1 -r=2000 -t=5 -N -D --stats dctcp -s=<server-name> Uses netperf between two hosts. The remote host name is specified with -s= and you need to start the program netserver manually on the remote host. It will use 1 flow, a rate limit of 2Gbps and dctcp. ./do_hbm.sh -f=1 -r=2000 -t=5 -N -D --stats -w dctcp \ -s=<server-name> As previous, but allows use of extra bandwidth. For this test the rate is 8Gbps vs. 1Gbps of the previous test. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-03-02 10:48:27 -08:00
brakmo	a1270fe95b	bpf: User program for testing HBM The program nrm creates a cgroup and attaches a BPF program to the cgroup for testing HBM (Host Bandwidth Manager) for egress traffic. One still needs to create network traffic. This can be done through netesto, netperf or iperf3. A follow-up patch contains a script to create traffic. USAGE: hbm [-d] [-l] [-n <id>] [-r <rate>] [-s] [-t <secs>] [-w] [-h] [prog] Where: -d Print BPF trace debug buffer -l Also limit flows doing loopback -n <#> To create cgroup "/hbm#" and attach prog. Default is /nrm1 This is convenient when testing HBM in more than 1 cgroup -r <rate> Rate limit in Mbps -s Get HBM stats (marked, dropped, etc.) -t <time> Exit after specified seconds (deault is 0) -w Work conserving flag. cgroup can increase its bandwidth beyond the rate limit specified while there is available bandwidth. Current implementation assumes there is only NIC (eth0), but can be extended to support multiple NICs. Currrently only supported for egress. Note, this is just a proof of concept. -h Print this info prog BPF program file name. Name defaults to hbm_out_kern.o More information about HBM can be found in the paper "BPF Host Resource Management" presented at the 2018 Linux Plumbers Conference, Networking Track (http://vger.kernel.org/lpc_net2018_talks/LPC%20BPF%20Network%20Resource%20Paper.pdf) Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-03-02 10:48:27 -08:00
brakmo	187d0738ff	bpf: Sample HBM BPF program to limit egress bw A cgroup skb BPF program to limit cgroup output bandwidth. It uses a modified virtual token bucket queue to limit average egress bandwidth. The implementation uses credits instead of tokens. Negative credits imply that queueing would have happened (this is a virtual queue, so no queueing is done by it. However, queueing may occur at the actual qdisc (which is not used for rate limiting). This implementation uses 3 thresholds, one to start marking packets and the other two to drop packets: CREDIT - <--------------------------\|------------------------> + \| \| \| 0 \| Large pkt \| \| drop thresh \| Small pkt drop Mark threshold thresh The effect of marking depends on the type of packet: a) If the packet is ECN enabled, then the packet is ECN ce marked. The current mark threshold is tuned for DCTCP. c) Else, it is dropped if it is a large packet. If the credit is below the drop threshold, the packet is dropped. Note that dropping a packet through the BPF program does not trigger CWR (Congestion Window Reduction) in TCP packets. A future patch will add support for triggering CWR. This BPF program actually uses 2 drop thresholds, one threshold for larger packets (>= 120 bytes) and another for smaller packets. This protects smaller packets such as SYNs, ACKs, etc. The default bandwidth limit is set at 1Gbps but this can be changed by a user program through a shared BPF map. In addition, by default this BPF program does not limit connections using loopback. This behavior can be overwritten by the user program. There is also an option to calculate some statistics, such as percent of packets marked or dropped, which the user program can access. A latter patch provides such a program (hbm.c) Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-03-02 10:48:27 -08:00
Yonghong Song	b74e21ab7d	samples/bpf: silence compiler warning for xdpsock_user.c Compiling xdpsock_user.c with 4.8.5, I hit the following compilation warning: HOSTCC samples/bpf/xdpsock_user.o /data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c: In function ‘main’: /data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c:449:6: warning: ‘idx_cq’ may be used unini tialized in this function [-Wmaybe-uninitialized] u32 idx_cq, idx_fq; ^ /data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c:606:7: warning: ‘idx_rx’ may be used unini tialized in this function [-Wmaybe-uninitialized] u32 idx_rx, idx_tx = 0; ^ /data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c:506:6: warning: ‘idx_rx’ may be used unini tialized in this function [-Wmaybe-uninitialized] u32 idx_rx, idx_fq = 0; As an example, the code pattern looks like: u32 idx_cq; ... ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq); if (ret) { ... } ... idx_fq ... The compiler warns since it does not know whether &idx_fq is assigned or not inside the library function xsk_ring_prod__reserve(). Let us assign an initial value 0 to such auto variables to silence compiler warning. Fixes: `248c7f9c0e` ("samples/bpf: convert xdpsock to use libbpf for AF_XDP access") Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-03-02 01:07:10 +01:00
Jakub Kicinski	1a9b268c90	samples: bpf: use libbpf where easy Some samples don't really need the magic of bpf_load, switch them to libbpf. v2: - specify program types. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-03-01 00:53:45 +01:00
Jakub Kicinski	ea9b636201	samples: bpf: remove load_sock_ops in favour of bpftool bpftool can do all the things load_sock_ops used to do, and more. Point users to bpftool instead of maintaining this sample utility. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-03-01 00:53:45 +01:00
Jakub Kicinski	5c3cf87d47	samples: bpf: force IPv4 in ping ping localhost may default of IPv6 on modern systems, but samples are trying to only parse IPv4. Force IPv4. samples/bpf/tracex1_user.c doesn't interpret the packet so we don't care which IP version will be used there. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-03-01 00:53:45 +01:00
Daniel T. Lee	d2e614cb07	samples: bpf: fix: broken sample regarding removed function Currently, running sample "task_fd_query" and "tracex3" occurs the following error. On kernel v5.0-rc* this sample will be unavailable due to the removal of function 'blk_start_request' at commit "a1ce35f". (function removed, as "Single Queue IO scheduler" no longer exists) $ sudo ./task_fd_query failed to create kprobe 'blk_start_request' error 'No such file or directory' This commit will change the function 'blk_start_request' to 'blk_mq_start_request' to fix the broken sample. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-27 17:27:22 +01:00
Magnus Karlsson	248c7f9c0e	samples/bpf: convert xdpsock to use libbpf for AF_XDP access This commit converts the xdpsock sample application to use the AF_XDP functions present in libbpf. This cuts down the size of it by nearly 300 lines of code. The default ring sizes plus the batch size has been increased and the size of the umem area has decreased. This so that the sample application will provide higher throughput. Note also that the shared umem code has been removed from the sample as this is not supported by libbpf at this point in time. Tested-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-25 23:21:42 +01:00
Toke Høiland-Jørgensen	915654fd71	samples/bpf: Fix dummy program unloading for xdp_redirect samples The xdp_redirect and xdp_redirect_map sample programs both load a dummy program onto the egress interfaces. However, the unload code checks these programs against the wrong id number, and thus refuses to unload them. Fix the comparison to avoid this. Fixes: `3b7a8ec2de` ("samples/bpf: Check the prog id before exiting") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-22 16:21:59 +01:00
Maciej Fijalkowski	3b7a8ec2de	samples/bpf: Check the prog id before exiting Check the program id within the signal handler on polling xdp samples that were previously converted to libbpf usage. Avoid the situation of unloading the program that was not attached by sample that is exiting. Handle also the case where bpf_get_link_xdp_id didn't exit with an error but the xdp program was not found on an interface. Reported-by: Michal Papaj <michal.papaj@intel.com> Reported-by: Jakub Spizewski <jakub.spizewski@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-01 23:37:51 +01:00
Maciej Fijalkowski	743e568c15	samples/bpf: Add a "force" flag to XDP samples Make xdp samples consistent with iproute2 behavior and set the XDP_FLAGS_UPDATE_IF_NOEXIST by default when setting the xdp program on interface. Provide an option for user to force the program loading, which as a result will not include the mentioned flag in bpf_set_link_xdp_fd call. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-01 23:37:51 +01:00
Maciej Fijalkowski	6a5457618f	samples/bpf: Extend RLIMIT_MEMLOCK for xdp_{sample_pkts, router_ipv4} There is a common problem with xdp samples that happens when user wants to run a particular sample and some bpf program is already loaded. The default 64kb RLIMIT_MEMLOCK resource limit will cause a following error (assuming that xdp sample that is failing was converted to libbpf usage): libbpf: Error in bpf_object__probe_name():Operation not permitted(1). Couldn't load basic 'r0 = 0' BPF program. libbpf: failed to load object './xdp_sample_pkts_kern.o' Fix it in xdp_sample_pkts and xdp_router_ipv4 by setting RLIMIT_MEMLOCK to RLIM_INFINITY. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-01 23:37:51 +01:00
Maciej Fijalkowski	bbaf6029c4	samples/bpf: Convert XDP samples to libbpf usage Some of XDP samples that are attaching the bpf program to the interface via libbpf's bpf_set_link_xdp_fd are still using the bpf_load.c for loading and manipulating the ebpf program and maps. Convert them to do this through libbpf usage and remove bpf_load from the picture. While at it remove what looks like debug leftover in xdp_redirect_map_user.c In xdp_redirect_cpu, change the way that the program to be loaded onto interface is chosen - user now needs to pass the program's section name instead of the relative number. In case of typo print out the section names to choose from. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-01 23:37:51 +01:00
Jesper Dangaard Brouer	7313798b14	samples/bpf: xdp_redirect_cpu have not need for read_trace_pipe The sample xdp_redirect_cpu is not using helper bpf_trace_printk. Thus it makes no sense that the --debug option us reading from /sys/kernel/debug/tracing/trace_pipe via read_trace_pipe. Simply remove it. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-01 23:37:51 +01:00
David S. Miller	ec7146db15	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2019-01-29 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Teach verifier dead code removal, this also allows for optimizing / removing conditional branches around dead code and to shrink the resulting image. Code store constrained architectures like nfp would have hard time doing this at JIT level, from Jakub. 2) Add JMP32 instructions to BPF ISA in order to allow for optimizing code generation for 32-bit sub-registers. Evaluation shows that this can result in code reduction of ~5-20% compared to 64 bit-only code generation. Also add implementation for most JITs, from Jiong. 3) Add support for __int128 types in BTF which is also needed for vmlinux's BTF conversion to work, from Yonghong. 4) Add a new command to bpftool in order to dump a list of BPF-related parameters from the system or for a specific network device e.g. in terms of available prog/map types or helper functions, from Quentin. 5) Add AF_XDP sock_diag interface for querying sockets from user space which provides information about the RX/TX/fill/completion rings, umem, memory usage etc, from Björn. 6) Add skb context access for skb_shared_info->gso_segs field, from Eric. 7) Add support for testing flow dissector BPF programs by extending existing BPF_PROG_TEST_RUN infrastructure, from Stanislav. 8) Split BPF kselftest's test_verifier into various subgroups of tests in order better deal with merge conflicts in this area, from Jakub. 9) Add support for queue/stack manipulations in bpftool, from Stanislav. 10) Document BTF, from Yonghong. 11) Dump supported ELF section names in libbpf on program load failure, from Taeung. 12) Silence a false positive compiler warning in verifier's BTF handling, from Peter. 13) Fix help string in bpftool's feature probing, from Prashant. 14) Remove duplicate includes in BPF kselftests, from Yue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-28 19:38:33 -08:00
Jiong Wang	6ea848b5ce	selftests: bpf: functional and min/max reasoning unit tests for JMP32 This patch adds unit tests for new JMP32 instructions. This patch also added the new BPF_JMP32_REG and BPF_JMP32_IMM macros to samples/bpf/bpf_insn.h so that JMP32 insn builders are available to tests under 'samples' directory. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-26 13:33:02 -08:00
Yonghong Song	6bf3bbe1f4	samples/bpf: workaround clang asm goto compilation errors x86 compilation has required asm goto support since 4.17. Since clang does not support asm goto, at 4.17, Commit `b1ae32dbab` ("x86/cpufeature: Guard asm_volatile_goto usage for BPF compilation") worked around the issue by permitting an alternative implementation without asm goto for clang. At 5.0, more asm goto usages appeared. [yhs@148 x86]$ egrep -r asm_volatile_goto include/asm/cpufeature.h: asm_volatile_goto("1: jmp 6f\n" include/asm/jump_label.h: asm_volatile_goto("1:" include/asm/jump_label.h: asm_volatile_goto("1:" include/asm/rmwcc.h: asm_volatile_goto (fullop "; j" #cc " %l[cc_label]" \ include/asm/uaccess.h: asm_volatile_goto("\n" \ include/asm/uaccess.h: asm_volatile_goto("\n" \ [yhs@148 x86]$ Compiling samples/bpf directories, most bpf programs failed compilation with error messages like: In file included from /home/yhs/work/bpf-next/samples/bpf/xdp_sample_pkts_kern.c:2: In file included from /home/yhs/work/bpf-next/include/linux/ptrace.h:6: In file included from /home/yhs/work/bpf-next/include/linux/sched.h:15: In file included from /home/yhs/work/bpf-next/include/linux/sem.h:5: In file included from /home/yhs/work/bpf-next/include/uapi/linux/sem.h:5: In file included from /home/yhs/work/bpf-next/include/linux/ipc.h:9: In file included from /home/yhs/work/bpf-next/include/linux/refcount.h:72: /home/yhs/work/bpf-next/arch/x86/include/asm/refcount.h:70:9: error: 'asm goto' constructs are not supported yet return GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl", ^ /home/yhs/work/bpf-next/arch/x86/include/asm/rmwcc.h:67:2: note: expanded from macro 'GEN_BINARY_SUFFIXED_RMWcc' __GEN_RMWcc(op " %[val], %[var]\n\t" suffix, var, cc, \ ^ /home/yhs/work/bpf-next/arch/x86/include/asm/rmwcc.h:21:2: note: expanded from macro '__GEN_RMWcc' asm_volatile_goto (fullop "; j" #cc " %l[cc_label]" \ ^ /home/yhs/work/bpf-next/include/linux/compiler_types.h:188:37: note: expanded from macro 'asm_volatile_goto' #define asm_volatile_goto(x...) asm goto(x) Most implementation does not even provide an alternative implementation. And it is also not practical to make changes for each call site. This patch workarounded the asm goto issue by redefining the macro like below: #define asm_volatile_goto(x...) asm volatile("invalid use of asm_volatile_goto") If asm_volatile_goto is not used by bpf programs, which is typically the case, nothing bad will happen. If asm_volatile_goto is used by bpf programs, which is incorrect, the compiler will issue an error since "invalid use of asm_volatile_goto" is not valid assembly codes. With this patch, all bpf programs under samples/bpf can pass compilation. Note that bpf programs under tools/testing/selftests/bpf/ compiled fine as they do not access kernel internal headers. Fixes: `e769742d35` ("Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs"") Fixes: `18fe58229d` ("x86, asm: change the GEN_*_RMWcc() macros to not quote the condition") Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-01-15 20:57:30 +01:00
Ioana Ciornei	11b36abc24	samples: bpf: user proper argument index Use optind as index for argv instead of a hardcoded value. When the program has options this leads to improper parameter handling. Fixes: `dc378a1ab5` ("samples: bpf: get ifindex from ifname") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Acked-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-01-10 15:54:47 +01:00
Stanislav Fomichev	a8911d6d58	selftests/bpf: fix incorrect users of create_and_get_cgroup We have some tests that assume create_and_get_cgroup returns -1 on error which is incorrect (it returns 0 on error). Since fd might be zero in general case, change create_and_get_cgroup to return -1 on error and fix the users that assume 0 on error. Fixes: `f269099a7e` ("tools/bpf: add a selftest for bpf_get_current_cgroup_id() helper") Fixes: `7d2c6cfc54` ("bpf: use --cgroup in test_suite if supplied") v2: - instead of fixing the uses that assume -1 on error, convert the users that assume 0 on error (fd might be zero in general case) Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-07 13:15:55 -08:00
Linus Torvalds	668c35f69c	Kbuild updates for v4.21 Kbuild core: - remove unneeded $(call cc-option,...) switches - consolidate Clang compiler flags into CLANG_FLAGS - announce the deprecation of SUBDIRS - fix single target build for external module - simplify the dependencies of 'prepare' stage targets - allow fixdep to directly write to ..cmd files - simplify dependency generation for CONFIG_TRIM_UNUSED_KSYMS - change if_changed_rule to accept multi-line recipe - move .SECONDARY special target to scripts/Kbuild.include - remove redundant 'set -e' - improve parallel execution for CONFIG_HEADERS_CHECK - misc cleanups Treewide fixes and cleanups - set Clang flags correctly for PowerPC boot images - fix UML build error with CONFIG_GCC_PLUGINS - remove unneeded patterns from .gitignore files - refactor firmware/Makefile - remove unneeded rules for offsets.s - avoid unneeded regeneration of intermediate .s files - clean up ./Kbuild Modpost: - remove unused -M, -K options - fix false positive warnings about section mismatch - use simple devtable lookup instead of linker magic - misc cleanups Coccinelle: - relax boolinit.cocci checks for overall consistency - fix warning messages of boolinit.cocci Other tools: - improve -dirty check of scripts/setlocalversion - add a tool to generate compile_commands.json from ..cmd files -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJcJiVeAAoJED2LAQed4NsGXEMP/1fdkBfZGxYlxp4w1tEBeRKZ GInxxMfHiHZlu7nFPQ3k1mncczZjeLbnKcLaybnPmqBSfQe1F7tN0ux1sHtWkVfI DvqoKaHObZ3lEMEVxCHXQ2bKrN/j9nB2xSxzr4dvc9DscW8dwKElZuKVE7nHKdhl z70xsalxHGEGY6hptJrucbv8KTBOSleZ8Gaat79sEDkDSLCTjxXB3WcVMWqDT0M/ IqN5QCwiPjZC3UCwuqq6+vnG1gyvDUORcbrVgHrBIKxLYAABYzugB5IlLpi8B31C ZUDmijSTandHd4SG5gw5uZoyYuK4YhZeI7g4yNyXSqnPurmcJxrGdpiuwRcE7xet 5yB2uaNbAFO6Fbz4gUdDEvryA9IZEPPn1Z0Btfpp7UOxiWEqE61oHReCNdkiad94 Oonl+ROZw5UOT3AZD3xCZTf9bTnQoCFccHTmnbaKqqSiDWxxLUXum7sNfnJ6GXEb fNQDuwuxtsPStaWoU93QNrGyl5e8jYTKFdphUCnZZ2/hE8ygoQ06PYmeBAk+UKfN UI2IqR8PHgmA97XJg+fHhbw6X4nQcPsg/usLqxGItAZNO2O4uXgaN24dOcHG74Bu 2Dk/gQQB2sVB/Dxfmlaxvvj98MVg7IMtpusAT2bQ7miiSS3EqPFT8KQMZYLICWuZ u7QQe20yCho3ZULmsRwH =0PCt -----END PGP SIGNATURE----- Merge tag 'kbuild-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: "Kbuild core: - remove unneeded $(call cc-option,...) switches - consolidate Clang compiler flags into CLANG_FLAGS - announce the deprecation of SUBDIRS - fix single target build for external module - simplify the dependencies of 'prepare' stage targets - allow fixdep to directly write to ..cmd files - simplify dependency generation for CONFIG_TRIM_UNUSED_KSYMS - change if_changed_rule to accept multi-line recipe - move .SECONDARY special target to scripts/Kbuild.include - remove redundant 'set -e' - improve parallel execution for CONFIG_HEADERS_CHECK - misc cleanups Treewide fixes and cleanups - set Clang flags correctly for PowerPC boot images - fix UML build error with CONFIG_GCC_PLUGINS - remove unneeded patterns from .gitignore files - refactor firmware/Makefile - remove unneeded rules for offsets.s - avoid unneeded regeneration of intermediate .s files - clean up ./Kbuild Modpost: - remove unused -M, -K options - fix false positive warnings about section mismatch - use simple devtable lookup instead of linker magic - misc cleanups Coccinelle: - relax boolinit.cocci checks for overall consistency - fix warning messages of boolinit.cocci Other tools: - improve -dirty check of scripts/setlocalversion - add a tool to generate compile_commands.json from ..cmd files" * tag 'kbuild-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (51 commits) kbuild: remove unused cmd_gentimeconst kbuild: remove $(obj)/ prefixes in ./Kbuild treewide: add intermediate .s files to targets treewide: remove explicit rules for offsets.s firmware: refactor firmware/Makefile firmware: remove unnecessary patterns from .gitignore scripts: remove unnecessary ihex2fw and check-lc_ctypes from .gitignore um: remove unused filechk_gen_header in Makefile scripts: add a tool to produce a compile_commands.json file kbuild: add -Werror=implicit-int flag unconditionally kbuild: add -Werror=strict-prototypes flag unconditionally kbuild: add -fno-PIE flag unconditionally scripts: coccinelle: Correct warning message scripts: coccinelle: only suggest true/false in files that already use them kbuild: handle part-of-module correctly for .ll and *.symtypes kbuild: refactor part-of-module kbuild: refactor quiet_modtag kbuild: remove redundant quiet_modtag for $(obj-m) kbuild: refactor Makefile.asm-generic user/Makefile: Fix typo and capitalization in comment section ...	2018-12-29 12:03:17 -08:00
Masahiro Yamada	2c667d77fc	treewide: add intermediate .s files to targets Avoid unneeded recreation of these in the incremental build. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2018-12-23 10:12:08 +09:00
Masahiro Yamada	4d4b5c2e3b	treewide: remove explicit rules for *offsets.s These explicit rules are unneeded because scripts/Makefile.build provides a pattern rule to create %.s from %.c Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2018-12-23 10:12:03 +09:00
Daniel T. Lee	d59dd69d55	samples: bpf: fix: seg fault with NULL pointer arg When NULL pointer accidentally passed to write_kprobe_events, due to strlen(NULL), segmentation fault happens. Changed code returns -1 to deal with this situation. Bug issued with Smatch, static analysis. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-03 23:58:03 +01:00
Matteo Croce	dc378a1ab5	samples: bpf: get ifindex from ifname Find the ifindex with if_nametoindex() instead of requiring the numeric ifindex. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-30 22:06:41 -08:00
Matteo Croce	d606ee5c1d	samples: bpf: improve xdp1 example Store only the total packet count for every protocol, instead of the whole per-cpu array. Use bpf_map_get_next_key() to iterate the map, instead of looking up all the protocols. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-30 22:06:41 -08:00
Daniel T. Lee	5a86381321	samples: bpf: fix: error handling regarding kprobe_events Currently, kprobe_events failure won't be handled properly. Due to calling system() indirectly to write to kprobe_events, it can't be identified whether an error is derived from kprobe or system. // buf = "echo '%c:%s %s' >> /s/k/d/t/kprobe_events" err = system(buf); if (err < 0) { printf("failed to create kprobe .."); return -1; } For example, running ./tracex7 sample in ext4 partition, "echo p:open_ctree open_ctree >> /s/k/d/t/kprobe_events" gets 256 error code system() failure. => The error comes from kprobe, but it's not handled correctly. According to man of system(3), it's return value just passes the termination status of the child shell rather than treating the error as -1. (don't care success) Which means, currently it's not working as desired. (According to the upper code snippet) ex) running ./tracex7 with ext4 env. # Current Output sh: echo: I/O error failed to open event open_ctree # Desired Output failed to create kprobe 'open_ctree' error 'No such file or directory' The problem is, error can't be verified whether from child ps or system. But using write() directly can verify the command failure, and it will treat all error as -1. So I suggest using write() directly to 'kprobe_events' rather than calling system(). Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-11-23 22:39:09 +01:00
Yonghong Song	9ce6ae22c8	tools/bpf: do not use pahole if clang/llvm can generate BTF sections Add additional checks in tools/testing/selftests/bpf and samples/bpf such that if clang/llvm compiler can generate BTF sections, do not use pahole. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-20 10:54:39 -08:00
Shannon Nelson	bce6a14996	bpf_load: add map name to load_maps error message To help when debugging bpf/xdp load issues, have the load_map() error message include the number and name of the map that failed. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-11-07 22:34:54 +01:00
Bo YU	20cdeb5408	bpf, tracex3_user: erase "ARRAY_SIZE" redefined There is a warning when compiling bpf sample programs in sample/bpf: make -C /home/foo/bpf/samples/bpf/../../tools/lib/bpf/ RM='rm -rf' LDFLAGS= srctree=/home/foo/bpf/samples/bpf/../../ O= HOSTCC /home/foo/bpf/samples/bpf/tracex3_user.o /home/foo/bpf/samples/bpf/tracex3_user.c:20:0: warning: "ARRAY_SIZE" redefined #define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x))) In file included from /home/foo/bpf/samples/bpf/tracex3_user.c:18:0: ./tools/testing/selftests/bpf/bpf_util.h:48:0: note: this is the location of the previous definition # define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) Signed-off-by: Bo YU <tsu.yubo@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-04 16:31:57 +02:00
Roman Gushchin	5fcbd29b37	samples/bpf: extend test_cgrp2_attach2 test to use per-cpu cgroup storage This commit extends the test_cgrp2_attach2 test to cover per-cpu cgroup storage. Bpf program will use shared and per-cpu cgroup storages simultaneously, so a better coverage of corresponding core code will be achieved. Expected output: $ ./test_cgrp2_attach2 Attached DROP prog. This ping in cgroup /foo should fail... ping: sendmsg: Operation not permitted Attached DROP prog. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS prog. This ping in cgroup /foo/bar should pass... Detached PASS from /foo/bar while DROP is attached to /foo. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS from /foo/bar and detached DROP from /foo. This ping in cgroup /foo/bar should pass... ### override:PASS ### multi:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-01 16:18:33 +02:00
Prashant Bhole	32c0097983	samples/bpf: fix compilation failure following commit: commit `d58e468b11` ("flow_dissector: implements flow dissector BPF hook") added struct bpf_flow_keys which conflicts with the struct with same name in sockex2_kern.c and sockex3_kern.c similar to commit: commit `534e0e52bc` ("samples/bpf: fix a compilation failure") we tried the rename it "flow_keys" but it also conflicted with struct having same name in include/net/flow_dissector.h. Hence renaming the struct to "flow_key_record". Also, this commit doesn't fix the compilation error completely because the similar struct is present in sockex3_kern.c. Hence renaming it in both files sockex3_user.c and sockex3_kern.c Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-21 22:51:16 +02:00
Yonghong Song	534e0e52bc	samples/bpf: fix a compilation failure samples/bpf build failed with the following errors: $ make samples/bpf/ ... HOSTCC samples/bpf/sockex3_user.o /data/users/yhs/work/net-next/samples/bpf/sockex3_user.c:16:8: error: redefinition of ‘struct bpf_flow_keys’ struct bpf_flow_keys { ^ In file included from /data/users/yhs/work/net-next/samples/bpf/sockex3_user.c:4:0: ./usr/include/linux/bpf.h:2338:9: note: originally defined here struct bpf_flow_keys flow_keys; ^ make[3]: ** [samples/bpf/sockex3_user.o] Error 1 Commit `d58e468b11` ("flow_dissector: implements flow dissector BPF hook") introduced struct bpf_flow_keys in include/uapi/linux/bpf.h and hence caused the naming conflict with samples/bpf/sockex3_user.c. The fix is to rename struct bpf_flow_keys in samples/bpf/sockex3_user.c to flow_keys to avoid the conflict. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-18 17:50:02 +02:00
YueHaibing	664e787845	samples/bpf: remove duplicated includes Remove duplicated includes. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-18 17:49:33 +02:00
Prashant Bhole	11c3f51136	samples/bpf: xdpsock, minor fixes - xsks_map size was fixed to 4, changed it MAX_SOCKS - Remove redundant definition of MAX_SOCKS in xdpsock_user.c - In dump_stats(), add NULL check for xsks[i] Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-01 01:36:08 +02:00
Nikita V. Shirokov	acb4ea9564	bpf: add TCP_SAVE_SYN/TCP_SAVED_SYN sample program Sample program which shows TCP_SAVE_SYN/TCP_SAVED_SYN usage example: bpf program which is doing TOS/TCLASS reflection (server would reply with a same TOS/TCLASS as client). Signed-off-by: Nikita V. Shirokov <tehnerd@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-01 01:36:04 +02:00
Björn Töpel	58c50ae4a0	samples/bpf: add -c/--copy -z/--zero-copy flags to xdpsock The -c/--copy -z/--zero-copy flags enforces either copy or zero-copy mode. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-08-29 12:25:53 -07:00
Jesper Dangaard Brouer	817b89beb9	samples/bpf: all XDP samples should unload xdp/bpf prog on SIGTERM It is common XDP practice to unload/deattach the XDP bpf program, when the XDP sample program is Ctrl-C interrupted (SIGINT) or killed (SIGTERM). The samples/bpf programs xdp_redirect_cpu and xdp_rxq_info, forgot to trap signal SIGTERM (which is the default signal used by the kill command). This was discovered by Red Hat QA, which automated scripts depend on killing the XDP sample program after a timeout period. Fixes: `fad3917e36` ("samples/bpf: add cpumap sample program xdp_redirect_cpu") Fixes: `0fca931a6f` ("samples/bpf: program demonstrating access to xdp_rxq_info") Reported-by: Jean-Tsung Hsiao <jhsiao@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-16 21:55:32 +02:00
Linus Torvalds	9a76aba02a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: - Gustavo A. R. Silva keeps working on the implicit switch fallthru changes. - Support 802.11ax High-Efficiency wireless in cfg80211 et al, From Luca Coelho. - Re-enable ASPM in r8169, from Kai-Heng Feng. - Add virtual XFRM interfaces, which avoids all of the limitations of existing IPSEC tunnels. From Steffen Klassert. - Convert GRO over to use a hash table, so that when we have many flows active we don't traverse a long list during accumluation. - Many new self tests for routing, TC, tunnels, etc. Too many contributors to mention them all, but I'm really happy to keep seeing this stuff. - Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu. - Lots of cleanups and fixes in L2TP code from Guillaume Nault. - Add IPSEC offload support to netdevsim, from Shannon Nelson. - Add support for slotting with non-uniform distribution to netem packet scheduler, from Yousuk Seung. - Add UDP GSO support to mlx5e, from Boris Pismenny. - Support offloading of Team LAG in NFP, from John Hurley. - Allow to configure TX queue selection based upon RX queue, from Amritha Nambiar. - Support ethtool ring size configuration in aquantia, from Anton Mikaev. - Support DSCP and flowlabel per-transport in SCTP, from Xin Long. - Support list based batching and stack traversal of SKBs, this is very exciting work. From Edward Cree. - Busyloop optimizations in vhost_net, from Toshiaki Makita. - Introduce the ETF qdisc, which allows time based transmissions. IGB can offload this in hardware. From Vinicius Costa Gomes. - Add parameter support to devlink, from Moshe Shemesh. - Several multiplication and division optimizations for BPF JIT in nfp driver, from Jiong Wang. - Lots of prepatory work to make more of the packet scheduler layer lockless, when possible, from Vlad Buslov. - Add ACK filter and NAT awareness to sch_cake packet scheduler, from Toke Høiland-Jørgensen. - Support regions and region snapshots in devlink, from Alex Vesker. - Allow to attach XDP programs to both HW and SW at the same time on a given device, with initial support in nfp. From Jakub Kicinski. - Add TLS RX offload and support in mlx5, from Ilya Lesokhin. - Use PHYLIB in r8169 driver, from Heiner Kallweit. - All sorts of changes to support Spectrum 2 in mlxsw driver, from Ido Schimmel. - PTP support in mv88e6xxx DSA driver, from Andrew Lunn. - Make TCP_USER_TIMEOUT socket option more accurate, from Jon Maxwell. - Support for templates in packet scheduler classifier, from Jiri Pirko. - IPV6 support in RDS, from Ka-Cheong Poon. - Native tproxy support in nf_tables, from Máté Eckl. - Maintain IP fragment queue in an rbtree, but optimize properly for in-order frags. From Peter Oskolkov. - Improvde handling of ACKs on hole repairs, from Yuchung Cheng" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits) bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT" hv/netvsc: Fix NULL dereference at single queue mode fallback net: filter: mark expected switch fall-through xen-netfront: fix warn message as irq device name has '/' cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0 net: dsa: mv88e6xxx: missing unlock on error path rds: fix building with IPV6=m inet/connection_sock: prefer _THIS_IP_ to current_text_addr net: dsa: mv88e6xxx: bitwise vs logical bug net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd() ieee802154: hwsim: using right kind of iteration net: hns3: Add vlan filter setting by ethtool command -K net: hns3: Set tx ring' tc info when netdev is up net: hns3: Remove tx ring BD len register in hns3_enet net: hns3: Fix desc num set to default when setting channel net: hns3: Fix for phy link issue when using marvell phy driver net: hns3: Fix for information of phydev lost problem when down/up net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero net: hns3: Add support for serdes loopback selftest bnxt_en: take coredump_record structure off stack ...	2018-08-15 15:04:25 -07:00
Linus Torvalds	e026bcc561	Kbuild updates for v4.19 - verify depmod is installed before modules_install - support build salt in case build ids must be unique between builds - allow users to specify additional host compiler flags via HOSTFLAGS, and rename internal variables to KBUILD_HOSTFLAGS - update buildtar script to drop vax support, add arm64 support - update builddeb script for better debarch support - document the pit-fall of if_changed usage - fix parallel build of UML with O= option - make 'samples' target depend on headers_install to fix build errors - remove deprecated host-progs variable - add a new coccinelle script for refcount_t vs atomic_t check - improve double-test coccinelle script - misc cleanups and fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJbdFZ0AAoJED2LAQed4NsGcHYP/23txxk3GRP7O4UkfPw9Rtky MHiXTgcoy2vbG+l12BgzWX+qFii8XTUe3dQtK4HnGQFUIBtEBV/hpZPJtxfgGSev Zou5cv1kr5rNzTkCn//TG3O6/WIkTBCe2hahDCtmGDI3kd/cPK4dHbU/q6KpaqIJ qzZYBXIvCeu2GM8idQoCRrwdMpgu1pBz1gz2sDje1yHH2toI7T6cXHRLQDgx+HPq LIP7W9GUsoDdXjecvPD51LiW89E6BUxETBh5Ft9r9uzwB5ylQQMcw6Qyu2DiYDUX PPsHCMiolYV+Ttcy+vj/67KOvKmEaFotssck+RD/xDCF17zKhRkup+YM8kPLHTVZ TcAUZadbnT6U/s2W6GFwvVbN/P7cc3aif+aNCC/Pl23yagp3pydlSCocYxQgiVR7 /rx48haYDEgu/MJ1X0dOpSO0ErY7zu2OoAlNerW+D9QizwbP+WtZO/CJH8SxQRuN dQ1xmyNrie+ODgi9tbc4eBrsb+1rioX927TP5MbJcfXt5CTsxDmIqop5XwyYIoQN ZWWlzC8Ii3P2trAVpBgM2IEbngSxwr6T9Wbf1ScJnPKr/o1rq+pBk49cYstTz3kQ OwJ8gPwUrkW4R+hlD7L6mL/WcrKzZBQS0Ij1QW2kVSEhRrsKo99psE1/rGehnHu9 KGB0LYYCqGSOHR4zOjg0 =VjfG -----END PGP SIGNATURE----- Merge tag 'kbuild-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - verify depmod is installed before modules_install - support build salt in case build ids must be unique between builds - allow users to specify additional host compiler flags via HOSTFLAGS, and rename internal variables to KBUILD_HOSTFLAGS - update buildtar script to drop vax support, add arm64 support - update builddeb script for better debarch support - document the pit-fall of if_changed usage - fix parallel build of UML with O= option - make 'samples' target depend on headers_install to fix build errors - remove deprecated host-progs variable - add a new coccinelle script for refcount_t vs atomic_t check - improve double-test coccinelle script - misc cleanups and fixes * tag 'kbuild-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits) coccicheck: return proper error code on fail Coccinelle: doubletest: reduce side effect false positives kbuild: remove deprecated host-progs variable kbuild: make samples really depend on headers_install um: clean up archheaders recipe kbuild: add %asm-generic to no-dot-config-targets um: fix parallel building with O= option scripts: Add Python 3 support to tracing/draw_functrace.py builddeb: Add automatic support for sh{3,4}{,eb} architectures builddeb: Add automatic support for riscv* architectures builddeb: Add automatic support for m68k architecture builddeb: Add automatic support for or1k architecture builddeb: Add automatic support for sparc64 architecture builddeb: Add automatic support for mips{,64}r6{,el} architectures builddeb: Add automatic support for mips64el architecture builddeb: Add automatic support for ppc64 and powerpcspe architectures builddeb: Introduce functions to simplify kconfig tests in set_debarch builddeb: Drop check for 32-bit s390 builddeb: Change architecture detection fallback to use dpkg-architecture builddeb: Skip architecture detection when KBUILD_DEBARCH is set ...	2018-08-15 12:09:03 -07:00
David S. Miller	c1617fb4c5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-08-13 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add driver XDP support for veth. This can be used in conjunction with redirect of another XDP program e.g. sitting on NIC so the xdp_frame can be forwarded to the peer veth directly without modification, from Toshiaki. 2) Add a new BPF map type REUSEPORT_SOCKARRAY and prog type SK_REUSEPORT in order to provide more control and visibility on where a SO_REUSEPORT sk should be located, and the latter enables to directly select a sk from the bpf map. This also enables map-in-map for application migration use cases, from Martin. 3) Add a new BPF helper bpf_skb_ancestor_cgroup_id() that returns the id of cgroup v2 that is the ancestor of the cgroup associated with the skb at the ancestor_level, from Andrey. 4) Implement BPF fs map pretty-print support based on BTF data for regular hash table and LRU map, from Yonghong. 5) Decouple the ability to attach BTF for a map from the key and value pretty-printer in BPF fs, and enable further support of BTF for maps for percpu and LPM trie, from Daniel. 6) Implement a better BPF sample of using XDP's CPU redirect feature for load balancing SKB processing to remote CPU. The sample implements the same XDP load balancing as Suricata does which is symmetric hash based on IP and L4 protocol, from Jesper. 7) Revert adding NULL pointer check with WARN_ON_ONCE() in __xdp_return()'s critical path as it is ensured that the allocator is present, from Björn. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-13 10:07:23 -07:00
David S. Miller	6a92ef08a1	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	2018-08-11 17:52:00 -07:00
Jesper Dangaard Brouer	1bca4e6b18	samples/bpf: xdp_redirect_cpu load balance like Suricata This implement XDP CPU redirection load-balancing across available CPUs, based on the hashing IP-pairs + L4-protocol. This equivalent to xdp-cpu-redirect feature in Suricata, which is inspired by the Suricata 'ippair' hashing code. An important property is that the hashing is flow symmetric, meaning that if the source and destination gets swapped then the selected CPU will remain the same. This is helps locality by placing both directions of a flows on the same CPU, in a forwarding/routing scenario. The hashing INITVAL (15485863 the 10^6th prime number) was fairly arbitrary choosen, but experiments with kernel tree pktgen scripts (pktgen_sample04_many_flows.sh +pktgen_sample05_flow_per_thread.sh) showed this improved the distribution. This patch also change the default loaded XDP program to be this load-balancer. As based on different user feedback, this seems to be the expected behavior of the sample xdp_redirect_cpu. Link: https://github.com/OISF/suricata/commit/796ec08dd7a63 Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:07:49 +02:00
Jesper Dangaard Brouer	1139568658	samples/bpf: add Paul Hsieh's (LGPL 2.1) hash function SuperFastHash Adjusted function call API to take an initval. This allow the API user to set the initial value, as a seed. This could also be used for inputting the previous hash. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:07:49 +02:00
Jesper Dangaard Brouer	37d7ff2595	samples/bpf: xdp_redirect_cpu adjustment to reproduce teardown race easier The teardown race in cpumap is really hard to reproduce. These changes makes it easier to reproduce, for QA. The --stress-mode now have a case of a very small queue size of 8, that helps to trigger teardown flush to encounter a full queue, which results in calling xdp_return_frame API, in a non-NAPI protect context. Also increase MAX_CPUS, as my QA department have larger machines than me. Tested-by: Jean-Tsung Hsiao <jhsiao@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-09 21:50:44 +02:00
Roman Gushchin	28ba068760	samples/bpf: extend test_cgrp2_attach2 test to use cgroup storage The test_cgrp2_attach test covers bpf cgroup attachment code well, so let's re-use it for testing allocation/releasing of cgroup storage. The extension is pretty straightforward: the bpf program will use the cgroup storage to save the number of transmitted bytes. Expected output: $ ./test_cgrp2_attach2 Attached DROP prog. This ping in cgroup /foo should fail... ping: sendmsg: Operation not permitted Attached DROP prog. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS prog. This ping in cgroup /foo/bar should pass... Detached PASS from /foo/bar while DROP is attached to /foo. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS from /foo/bar and detached DROP from /foo. This ping in cgroup /foo/bar should pass... ### override:PASS ### multi:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-03 00:47:33 +02:00
Jakub Kicinski	6748182c2d	samples: bpf: convert xdpsock_user.c to libbpf Convert xdpsock_user.c to use libbpf instead of bpf_load.o. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-27 07:18:44 +02:00
Jakub Kicinski	e1a40ef418	samples: bpf: convert xdp_fwd_user.c to libbpf Convert xdp_fwd_user.c to use libbpf instead of bpf_load.o. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-27 07:18:44 +02:00
Taeung Song	9778cfdfc9	samples/bpf: Add BTF build flags to Makefile To smoothly test BTF supported binary on samples/bpf, let samples/bpf/Makefile probe llc, pahole and llvm-objcopy for BPF support and use them like tools/testing/selftests/bpf/Makefile changed from the commit `c0fa1b6c3e` ("bpf: btf: Add BTF tests"). Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-27 03:50:19 +02:00
Brian Brooks	598135e744	samples/bpf: xdpsock: order memory on AArch64 Define u_smp_rmb() and u_smp_wmb() to respective barrier instructions. This ensures the processor will order accesses to queue indices against accesses to queue ring entries. Signed-off-by: Brian Brooks <brian.brooks@linaro.org> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-27 03:49:02 +02:00
David S. Miller	eae249b27f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-07-20 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add sharing of BPF objects within one ASIC: this allows for reuse of the same program on multiple ports of a device, and therefore gains better code store utilization. On top of that, this now also enables sharing of maps between programs attached to different ports of a device, from Jakub. 2) Cleanup in libbpf and bpftool's Makefile to reduce unneeded feature detections and unused variable exports, also from Jakub. 3) First batch of RCU annotation fixes in prog array handling, i.e. there are several __rcu markers which are not correct as well as some of the RCU handling, from Roman. 4) Two fixes in BPF sample files related to checking of the prog_cnt upper limit from sample loader, from Dan. 5) Minor cleanup in sockmap to remove a set but not used variable, from Colin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-20 23:58:30 -07:00
David S. Miller	c4c5551df1	Merge ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux All conflicts were trivial overlapping changes, so reasonably easy to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-20 21:17:12 -07:00
Laura Abbott	8377bd2b9e	kbuild: Rename HOST_LOADLIBES to KBUILD_HOSTLDLIBS In preparation for enabling command line LDLIBS, re-name HOST_LOADLIBES to KBUILD_HOSTLDLIBS as the internal use only flags. Also rename existing usage to HOSTLDLIBS for consistency. This should not have any visible effects. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2018-07-18 01:18:05 +09:00
Laura Abbott	96f14fe738	kbuild: Rename HOSTCFLAGS to KBUILD_HOSTCFLAGS In preparation for enabling command line CFLAGS, re-name HOSTCFLAGS to KBUILD_HOSTCFLAGS as the internal use only flags. This should not have any visible effects. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2018-07-18 01:18:05 +09:00
Dan Carpenter	ee583014a9	samples/bpf: test_cgrp2_sock2: fix an off by one "prog_cnt" is the number of elements which are filled out in prog_fd[] so the test should be >= instead of >. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-07-16 15:01:09 -07:00
Dan Carpenter	b0294bc1ad	samples: bpf: ensure that we don't load over MAX_PROGS programs I can't see that we check prog_cnt to ensure it doesn't go over MAX_PROGS. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-07-16 15:00:56 -07:00
Jesper Dangaard Brouer	d23b27c02f	samples/bpf: xdp_redirect_cpu handle parsing of double VLAN tagged packets People noticed that the code match on IEEE 802.1ad (ETH_P_8021AD) ethertype, and this implies Q-in-Q or double tagged VLANs. Thus, we better parse the next VLAN header too. It is even marked as a TODO. This is relevant for real world use-cases, as XDP cpumap redirect can be used when the NIC RSS hashing is broken. E.g. the ixgbe driver HW cannot handle double tagged VLAN packets, and places everything into a single RX queue. Using cpumap redirect, users can redistribute traffic across CPUs to solve this, which is faster than the network stacks RPS solution. It is left as an exerise how to distribute the packets across CPUs. It would be convenient to use the RX hash, but that is not _yet_ exposed to XDP programs. For now, users can code their own hash, as I've demonstrated in the Suricata code (where Q-in-Q is handled correctly). Reported-by: Florian Maury <florian.maury-cv@x-cli.eu> Reported-by: Marek Majkowski <marek@cloudflare.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-14 00:52:54 +02:00
Taeung Song	b9626f45ab	samples/bpf: Fix tc and ip paths in xdp2skb_meta.sh The below path error can occur: # ./xdp2skb_meta.sh --dev eth0 --list ./xdp2skb_meta.sh: line 61: /usr/sbin/tc: No such file or directory So just use command names instead of absolute paths of tc and ip. In addition, it allow callers to redefine $TC and $IP paths Fixes: `36e04a2d78` ("samples/bpf: xdp2skb_meta shows transferring info from XDP to SKB") Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-10 09:19:01 +02:00
Taeung Song	c48424d993	samples/bpf: add .gitignore file For untracked executables of samples/bpf, add this. Untracked files: (use "git add <file>..." to include in what will be committed) samples/bpf/cpustat samples/bpf/fds_example samples/bpf/lathist samples/bpf/load_sock_ops ... Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-05 09:58:53 +02:00
Taeung Song	02a2f000a3	samples/bpf: Check the error of write() and read() test_task_rename() and test_urandom_read() can be failed during write() and read(), So check the result of them. Reviewed-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-05 09:58:52 +02:00
Taeung Song	492b7e8945	samples/bpf: Check the result of system() To avoid the below build warning message, use new generate_load() checking the return value. ignoring return value of ‘system’, declared with attribute warn_unused_result And it also refactors the duplicate code of both test_perf_event_all_cpu() and test_perf_event_task() Cc: Teng Qin <qinteng@fb.com> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-05 09:58:52 +02:00
Taeung Song	4d5d33a085	samples/bpf: add missing <linux/if_vlan.h> This fixes build error regarding redefinition: CLANG-bpf samples/bpf/parse_varlen.o samples/bpf/parse_varlen.c:111:8: error: redefinition of 'vlan_hdr' struct vlan_hdr { ^ ./include/linux/if_vlan.h:38:8: note: previous definition is here So remove duplicate 'struct vlan_hdr' in sample code and include if_vlan.h Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-05 09:58:52 +02:00
David S. Miller	b68034087a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-07-03 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Various improvements to bpftool and libbpf, that is, bpftool build speed improvements, missing BPF program types added for detection by section name, ability to load programs from '.text' section is made to work again, and better bash completion handling, from Jakub. 2) Improvements to nfp JIT's map read handling which allows for optimizing memcpy from map to packet, from Jiong. 3) New BPF sample is added which demonstrates XDP in combination with bpf_perf_event_output() helper to sample packets on all CPUs, from Toke. 4) Add a new BPF kselftest case for tracking connect(2) BPF hooks infrastructure in combination with TFO, from Andrey. 5) Extend the XDP/BPF xdp_rxq_info sample code with a cmdline option to read payload from packet data in order to use it for benchmarking. Also for '--action XDP_TX' option implement swapping of MAC addresses to avoid drops on some hardware seen during testing, from Jesper. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 08:53:53 +09:00
Magnus Karlsson	c03079c9d9	samples/bpf: deal with EBUSY return code from sendmsg in xdpsock sample Sendmsg in the SKB path of AF_XDP can now return EBUSY when a packet was discarded and completed by the driver. Just ignore this message in the sample application. Fixes: `b4b8faa1de` ("samples/bpf: sample application and documentation for AF_XDP sockets") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Reported-by: Pavel Odintsov <pavel@fastnetmon.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-07-02 18:37:12 -07:00
David Ahern	4c79579b44	bpf: Change bpf_fib_lookup to return lookup status For ACLs implemented using either FIB rules or FIB entries, the BPF program needs the FIB lookup status to be able to drop the packet. Since the bpf_fib_lookup API has not reached a released kernel yet, change the return code to contain an encoding of the FIB lookup result and return the nexthop device index in the params struct. In addition, inform the BPF program of any post FIB lookup reason as to why the packet needs to go up the stack. The fib result for unicast routes must have an egress device, so remove the check that it is non-NULL. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-29 00:02:02 +02:00
Jesper Dangaard Brouer	509fda105b	samples/bpf: xdp_rxq_info action XDP_TX must adjust MAC-addrs XDP_TX requires also changing the MAC-addrs, else some hardware may drop the TX packet before reaching the wire. This was observed with driver mlx5. If xdp_rxq_info select --action XDP_TX the swapmac functionality is activated. It is also possible to manually enable via cmdline option --swapmac. This is practical if wanting to measure the overhead of writing/updating payload for other action types. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-28 23:50:20 +02:00
Jesper Dangaard Brouer	0d25c43ab9	samples/bpf: extend xdp_rxq_info to read packet payload There is a cost associated with reading the packet data payload that this test ignored. Add option --read to allow enabling reading part of the payload. This sample/tool helps us analyse an issue observed with a NIC mlx5 (ConnectX-5 Ex) and an Intel(R) Xeon(R) CPU E5-1650 v4. With no_touch of data: Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:no_touch XDP stats CPU pps issue-pps XDP-RX CPU 0 14,465,157 0 XDP-RX CPU 1 14,464,728 0 XDP-RX CPU 2 14,465,283 0 XDP-RX CPU 3 14,465,282 0 XDP-RX CPU 4 14,464,159 0 XDP-RX CPU 5 14,465,379 0 XDP-RX CPU total 86,789,992 When not touching data, we observe that the CPUs have idle cycles. When reading data the CPUs are 100% busy in softirq. With reading data: Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:read XDP stats CPU pps issue-pps XDP-RX CPU 0 9,620,639 0 XDP-RX CPU 1 9,489,843 0 XDP-RX CPU 2 9,407,854 0 XDP-RX CPU 3 9,422,289 0 XDP-RX CPU 4 9,321,959 0 XDP-RX CPU 5 9,395,242 0 XDP-RX CPU total 56,657,828 The effect seen above is a result of cache-misses occuring when more RXQs are being used. Based on perf-event observations, our conclusion is that the CPUs DDIO (Direct Data I/O) choose to deliver packet into main memory, instead of L3-cache. We also found, that this can be mitigated by either using less RXQs or by reducing NICs the RX-ring size. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-28 23:50:20 +02:00
Toke Høiland-Jørgensen	1e54ad251a	samples/bpf: Add xdp_sample_pkts example Add an example program showing how to sample packets from XDP using the perf event buffer. The example userspace program just prints the ethernet header for every packet sampled. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-27 11:01:03 +02:00
Björn Töpel	9f5232cc7f	samples/bpf: xdpsock: use skb Tx path for XDP_SKB Make sure that XDP_SKB also uses the skb Tx path. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-05 15:48:57 +02:00
Magnus Karlsson	a65ea68b8d	samples/bpf: minor *_nb_free performance fix Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-04 17:21:02 +02:00
Björn Töpel	a412ef54fc	samples/bpf: adapted to new uapi Here, the xdpsock sample application is adjusted to the new descriptor format. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-04 17:21:02 +02:00
David Ahern	bd3a08aaa9	bpf: flowlabel in bpf_fib_lookup should be flowinfo As Michal noted the flow struct takes both the flow label and priority. Update the bpf_fib_lookup API to note that it is flowinfo and not just the flow label. Cc: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-06-03 18:29:07 -07:00
David S. Miller	90fed9c946	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-05-24 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Björn Töpel cleans up AF_XDP (removes rebind, explicit cache alignment from uapi, etc). 2) David Ahern adds mtu checks to bpf_ipv{4,6}_fib_lookup() helpers. 3) Jesper Dangaard Brouer adds bulking support to ndo_xdp_xmit. 4) Jiong Wang adds support for indirect and arithmetic shifts to NFP 5) Martin KaFai Lau cleans up BTF uapi and makes the btf_header extensible. 6) Mathieu Xhonneux adds an End.BPF action to seg6local with BPF helpers allowing to edit/grow/shrink a SRH and apply on a packet generic SRv6 actions. 7) Sandipan Das adds support for bpf2bpf function calls in ppc64 JIT. 8) Yonghong Song adds BPF_TASK_FD_QUERY command for introspection of tracing events. 9) other misc fixes from Gustavo A. R. Silva, Sirio Balmelli, John Fastabend, and Magnus Karlsson ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-24 22:20:51 -04:00
Jesper Dangaard Brouer	a570e48fee	samples/bpf: xdp_monitor use err code from tracepoint xdp:xdp_devmap_xmit Update xdp_monitor to use the recently added err code introduced in tracepoint xdp:xdp_devmap_xmit, to show if the drop count is caused by some driver general delivery problem. Other kind of drops will likely just be more normal TX space issues. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-05-24 18:36:15 -07:00
Jesper Dangaard Brouer	9940fbf633	samples/bpf: xdp_monitor use tracepoint xdp:xdp_devmap_xmit The xdp_monitor sample/tool is updated to use the new tracepoint xdp:xdp_devmap_xmit the previous patch just introduced. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-05-24 18:36:15 -07:00
Yonghong Song	ecb96f7fe1	samples/bpf: add a samples/bpf test for BPF_TASK_FD_QUERY This is mostly to test kprobe/uprobe which needs kernel headers. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-05-24 18:18:20 -07:00
Björn Töpel	1c4917da36	samples/bpf: adapt xdpsock to the new uapi Adapt xdpsock to use the new getsockopt introduced in the previous commit. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-22 10:25:06 +02:00
David S. Miller	6f6e434aa2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net S390 bpf_jit.S is removed in net-next and had changes in 'net', since that code isn't used any more take the removal. TLS data structures split the TX and RX components in 'net-next', put the new struct members from the bug fix in 'net' into the RX part. The 'net-next' tree had some reworking of how the ERSPAN code works in the GRE tunneling code, overlapping with a one-line headroom calculation fix in 'net'. Overlapping changes in __sock_map_ctx_update_elem(), keep the bits that read the prog members via READ_ONCE() into local variables before using them. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-21 16:01:54 -04:00
Björn Töpel	dac09149d9	xsk: clean up SPDX headers Clean up SPDX-License-Identifier and removing licensing leftovers. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-18 16:07:02 +02:00
David Ahern	44edef77bd	samples/bpf: Decrement ttl in fib forwarding example Only consider forwarding packets if ttl in received packet is > 1 and decrement ttl before handing off to bpf_redirect_map. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-16 22:04:52 +02:00
Jakub Kicinski	768759edb9	samples: bpf: make the build less noisy Building samples with clang ignores the $(Q) setting, always printing full command to the output. Make it less verbose. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-05-14 22:52:10 -07:00
Jakub Kicinski	0cc54db181	samples: bpf: move libbpf from object dependencies to libs Make complains that it doesn't know how to make libbpf.a: scripts/Makefile.host:106: target 'samples/bpf/../../tools/lib/bpf/libbpf.a' doesn't match the target pattern Now that we have it as a dependency of the sources simply add libbpf.a to libraries not objects. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-05-14 22:52:10 -07:00
Jakub Kicinski	787360f8c2	samples: bpf: fix build after move to compiling full libbpf.a There are many ways users may compile samples, some of them got broken by commit `5f9380572b` ("samples: bpf: compile and link against full libbpf"). Improve path resolution and make libbpf building a dependency of source files to force its build. Samples should now again build with any of: cd samples/bpf; make make samples/bpf/ make -C samples/bpf cd samples/bpf; make O=builddir make samples/bpf/ O=builddir make -C samples/bpf O=builddir export KBUILD_OUTPUT=builddir make samples/bpf/ make -C samples/bpf Fixes: `5f9380572b` ("samples: bpf: compile and link against full libbpf") Reported-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-05-14 22:52:10 -07:00
Jakub Kicinski	8d93045077	samples: bpf: rename libbpf.h to bpf_insn.h The libbpf.h file in samples is clashing with libbpf's header. Since it only includes a subset of filter.h instruction helpers rename it to bpf_insn.h. Drop the unnecessary include of bpf/bpf.h. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-05-14 22:52:10 -07:00
Jakub Kicinski	2bf3e2ef42	samples: bpf: include bpf/bpf.h instead of local libbpf.h There are two files in the tree called libbpf.h which is becoming problematic. Most samples don't actually need the local libbpf.h they simply include it to get to bpf/bpf.h. Include bpf/bpf.h directly instead. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-05-14 22:52:10 -07:00
Prashant Bhole	53ea24c20c	samples/bpf: xdp_monitor, accept short options Updated optstring parameter for getopt_long() to accept short options. Also updated usage() function. Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-14 23:41:59 +02:00
Alexei Starovoitov	b1ae32dbab	x86/cpufeature: Guard asm_volatile_goto usage for BPF compilation Workaround for the sake of BPF compilation which utilizes kernel headers, but clang does not support ASM GOTO and fails the build. Fixes: `d0266046ad` ("x86: Remove FAST_FEATURE_TESTS") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: daniel@iogearbox.net Cc: peterz@infradead.org Cc: netdev@vger.kernel.org Cc: bp@alien8.de Cc: yhs@fb.com Cc: kernel-team@fb.com Cc: torvalds@linux-foundation.org Cc: davem@davemloft.net Link: https://lkml.kernel.org/r/20180513193222.1997938-1-ast@kernel.org	2018-05-13 21:49:14 +02:00
Jakub Kicinski	be5bca44aa	samples: bpf: convert some XDP samples from bpf_load to libbpf Now that we can use full powers of libbpf in BPF samples, we should perhaps make the simplest XDP programs not depend on bpf_load helpers. This way newcomers will be exposed to the recommended library from the start. Use of bpf_prog_load_xattr() will also make it trivial to later on request offload of the programs by simply adding ifindex to the xattr. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:44:17 +02:00
Jakub Kicinski	d0cabbb021	tools: bpf: move the event reading loop to libbpf There are two copies of event reading loop - in bpftool and trace_helpers "library". Consolidate them and move the code to libbpf. Return codes from trace_helpers are kept, but renamed to include LIBBPF prefix. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:40:52 +02:00
Jakub Kicinski	5f9380572b	samples: bpf: compile and link against full libbpf samples/bpf currently cherry-picks object files from tools/lib/bpf to link against. Just compile the full library and link statically against it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:40:52 +02:00
Jakub Kicinski	74662ea5d4	samples: bpf: rename struct bpf_map_def to avoid conflict with libbpf Both tools/lib/bpf/libbpf.h and samples/bpf/bpf_load.h define their own version of struct bpf_map_def. The version in bpf_load.h has more fields. libbpf does not support inner maps and its definition of struct bpf_map_def lacks the related fields. Rename the definition in bpf_load.h (samples/bpf) to avoid conflicts. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:40:51 +02:00
David Ahern	fe616055f7	samples/bpf: Add example of ipv4 and ipv6 forwarding in XDP Simple example of fast-path forwarding. It has a serious flaw in not verifying the egress device index supports XDP forwarding. If the egress device does not packets are dropped. Take this only as a simple example of fast-path forwarding. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:10:57 +02:00
Magnus Karlsson	b4b8faa1de	samples/bpf: sample application and documentation for AF_XDP sockets This is a sample application for AF_XDP sockets. The application supports three different modes of operation: rxdrop, txonly and l2fwd. To show-case a simple round-robin load-balancing between a set of sockets in an xskmap, set the RR_LB compile time define option to 1 in "xdpsock.h". v2: The entries variable was calculated twice in {umem,xq}_nb_avail. Co-authored-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-05-03 15:55:25 -07:00
Yonghong Song	34745aed51	samples/bpf: fix kprobe attachment issue on x64 Commit `d5a00528b5` ("syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_() to __x64_sys_()") renamed a lot of syscall function sys_() to __x64_sys_(). This caused several kprobe based samples/bpf tests failing. This patch fixed the problem in bpf_load.c. For x86_64 architecture, function name __x64_sys_() will be first used for kprobe event creation. If the creation is successful, it will be used. Otherwise, function name sys_() will be used for kprobe event creation. Fixes: `d5a00528b5` ("syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_() to __x64_sys_()") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-04-29 20:36:53 -07:00
Yonghong Song	28dbf861de	samples/bpf: move common-purpose trace functions to selftests There is no functionality change in this patch. The common-purpose trace functions, including perf_event polling and ksym lookup, are moved from trace_output_user.c and bpf_load.c to selftests/bpf/trace_helpers.c so that these function can be reused later in selftests. Acked-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-04-29 08:45:54 -07:00
Wang Sheng-Hui	c0885f61bb	samples, bpf: remove redundant ret assignment in bpf_load_program() 2 redundant ret assignments removed: * 'ret = 1' before the logic 'if (data_maps)', and if any errors jump to label 'done'. No 'ret = 1' needed before the error jump. * After the '/* load programs */' part, if everything goes well, then the BPF code will be loaded and 'ret' set to 0 by load_and_attach(). If something goes wrong, 'ret' set to none-O, the redundant 'ret = 0' after the for clause will make the error skipped. For example, if some BPF code cannot provide supported program types in ELF SEC("unknown"), the for clause will not call load_and_attach() to load the BPF code. 1 should be returned to callees instead of 0. Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-27 00:47:06 +02:00
William Tu	b05cd74043	samples/bpf: remove the bpf tunnel testsuite. Move the testsuite to selftests/bpf/{test_tunnel_kern.c, test_tunnel.sh} Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-27 00:11:15 +02:00
Eyal Birger	29a36f9eef	samples/bpf: extend test_tunnel_bpf.sh with xfrm state test Add a test for fetching xfrm state parameters from a tc program running on ingress. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-24 22:26:58 +02:00
Nikita V. Shirokov	c6ffd1ff78	bpf: add bpf_xdp_adjust_tail sample prog adding bpf's sample program which is using bpf_xdp_adjust_tail helper by generating ICMPv4 "packet to big" message if ingress packet's size is bigger then 600 bytes Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-18 23:34:17 +02:00
Jesper Dangaard Brouer	8de0e8ba97	samples/bpf: fix xdp_monitor user output for tracepoint exception The variable rec_i contains an XDP action code not an error. Thus, using err2str() was wrong, it should have been action2str(). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-18 14:48:06 +02:00
Wang Sheng-Hui	97c33610ac	samples/bpf: correct comment in sock_example.c The program run against loopback interace "lo", not "eth0". Correct the comment. Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-18 14:45:26 +02:00
Alexei Starovoitov	4662a4e538	samples/bpf: raw tracepoint test add empty raw_tracepoint bpf program to test overhead similar to kprobe and traditional tracepoint tests Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-03-28 22:55:19 +02:00
Colin Ian King	20cfb7a04f	samples/bpf: fix spelling mistake: "revieve" -> "receive" Trivial fix to spelling mistake in error message text Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-03-28 15:30:25 +02:00
John Fastabend	4c4c3c276c	bpf: sockmap sample, add option to attach SK_MSG program Add sockmap option to use SK_MSG program types. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-03-19 21:14:40 +01:00
Teng Qin	12fe12253c	samples/bpf: add example to test reading address This commit adds additional test in the trace_event example, by attaching the bpf program to MEM_UOPS_RETIRED.LOCK_LOADS event with PERF_SAMPLE_ADDR requested, and print the lock address value read from the bpf program to trace_pipe. Signed-off-by: Teng Qin <qinteng@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-03-08 02:22:34 +01:00
William Tu	5f280b60d2	samples/bpf: add gre sequence number test. The patch adds tests for GRE sequence number support for metadata mode tunnel. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-04 18:35:02 -05:00
Prashant Bhole	c8745e07d5	samples/bpf: detach prog from cgroup test_cgrp2_sock.sh and test_cgrp2_sock2.sh tests keep the program attached to cgroup even after completion. Using detach functionality of test_cgrp2_sock in both scripts. Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-03-02 00:16:36 +01:00
Leo Yan	c535077789	samples/bpf: Add program for CPU state statistics CPU is active when have running tasks on it and CPUFreq governor can select different operating points (OPP) according to different workload; we use 'pstate' to present CPU state which have running tasks with one specific OPP. On the other hand, CPU is idle which only idle task on it, CPUIdle governor can select one specific idle state to power off hardware logics; we use 'cstate' to present CPU idle state. Based on trace events 'cpu_idle' and 'cpu_frequency' we can accomplish the duration statistics for every state. Every time when CPU enters into or exits from idle states, the trace event 'cpu_idle' is recorded; trace event 'cpu_frequency' records the event for CPU OPP changing, so it's easily to know how long time the CPU stays in the specified OPP, and the CPU must be not in any idle state. This patch is to utilize the mentioned trace events for pstate and cstate statistics. To achieve more accurate profiling data, the program uses below sequence to insure CPU running/idle time aren't missed: - Before profiling the user space program wakes up all CPUs for once, so can avoid to missing account time for CPU staying in idle state for long time; the program forces to set 'scaling_max_freq' to lowest frequency and then restore 'scaling_max_freq' to highest frequency, this can ensure the frequency to be set to lowest frequency and later after start to run workload the frequency can be easily to be changed to higher frequency; - User space program reads map data and update statistics for every 5s, so this is same with other sample bpf programs for avoiding big overload introduced by bpf program self; - When send signal to terminate program, the signal handler wakes up all CPUs, set lowest frequency and restore highest frequency to 'scaling_max_freq'; this is exactly same with the first step so avoid to missing account CPU pstate and cstate time during last stage. Finally it reports the latest statistics. The program has been tested on Hikey board with octa CA53 CPUs, below is one example for statistics result, the format mainly follows up Jesper Dangaard Brouer suggestion. Jesper reminds to 'get printf to pretty print with thousands separators use %' and setlocale(LC_NUMERIC, "en_US")', tried three different arm64 GCC toolchains (5.4.0 20160609, 6.2.1 20161016, 6.3.0 20170516) but all of them cannot support printf flag character %' on arm64 platform, so go back print number without grouping mode. CPU states statistics: state(ms) cstate-0 cstate-1 cstate-2 pstate-0 pstate-1 pstate-2 pstate-3 pstate-4 CPU-0 767 6111 111863 561 31 756 853 190 CPU-1 241 10606 107956 484 125 646 990 85 CPU-2 413 19721 98735 636 84 696 757 89 CPU-3 84 11711 79989 17516 909 4811 5773 341 CPU-4 152 19610 98229 444 53 649 708 1283 CPU-5 185 8781 108697 666 91 671 677 1365 CPU-6 157 21964 95825 581 67 566 684 1284 CPU-7 125 15238 102704 398 20 665 786 1197 Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-02-26 10:54:02 +01:00
Tushar Dave	8ac2441e0b	samples/bpf: adjust rlimit RLIMIT_MEMLOCK for xdp_redirect Default rlimit RLIMIT_MEMLOCK is 64KB, causes bpf map failure. e.g. [root@labbpf]# ./xdp_redirect $(</sys/class/net/eth2/ifindex) \ > $(</sys/class/net/eth3/ifindex) failed to create a map: 1 Operation not permitted The failure is seen when executing xdp_redirect while xdp_monitor is already runnig. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-02-13 17:42:01 -08:00
William Tu	9c33ca4317	sample/bpf: fix erspan metadata The commit `c69de58ba8` ("net: erspan: use bitfield instead of mask and offset") changes the erspan header to use bitfield, and commit `d350a82302` ("net: erspan: create erspan metadata uapi header") creates a uapi header file. The above two commit breaks the current erspan test. This patch fixes it by adapting the above two changes. Fixes: `ac80c2a165` ("samples/bpf: add erspan v2 sample code") Fixes: `ef88f89c83` ("samples/bpf: extend test_tunnel_bpf.sh with ERSPAN") Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-06 11:32:49 -05:00
Eric Leblond	b259c2ffd9	samples/bpf: use bpf_set_link_xdp_fd Use bpf_set_link_xdp_fd instead of set_link_xdp_fd to remove some code duplication and benefit of netlink ext ack errors message. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-02-02 17:53:48 -08:00
Eric Leblond	bbf48c18ee	libbpf: add error reporting in XDP Parse netlink ext attribute to get the error message returned by the card. Code is partially take from libnl. We add netlink.h to the uapi include of tools. And we need to avoid include of userspace netlink header to have a successful build of sample so nlattr.h has a define to avoid the inclusion. Using a direct define could have been an issue as NLMSGERR_ATTR_MAX can change in the future. We also define SOL_NETLINK if not defined to avoid to have to copy socket.h for a fixed value. Signed-off-by: Eric Leblond <eric@regit.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-02-02 17:53:48 -08:00
Mickaël Salaün	c25ef6a5e6	samples/bpf: Partially fixes the bpf.o build Do not build lib/bpf/bpf.o with this Makefile but use the one from the library directory. This avoid making a buggy bpf.o file (e.g. missing symbols). This patch is useful if some code (e.g. Landlock tests) needs both the bpf.o (from tools/lib/bpf) and the bpf_load.o (from samples/bpf). Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-26 23:57:10 +01:00
Jesper Dangaard Brouer	417f1d9f21	samples/bpf: xdp_monitor include cpumap tracepoints in monitoring The xdp_redirect_cpu sample have some "builtin" monitoring of the tracepoints for xdp_cpumap_, but it is practical to have an external tool that can monitor these transpoint as an easy way to troubleshoot an application using XDP + cpumap. Specifically I need such external tool when working on Suricata and XDP cpumap redirect. Extend the xdp_monitor tool sample with monitoring of these xdp_cpumap_ tracepoints. Model the output format like xdp_redirect_cpu. Given I needed to handle per CPU decoding for cpumap, this patch also add per CPU info on the existing monitor events. This resembles part of the builtin monitoring output from sample xdp_rxq_info. Thus, also covering part of that sample in an external monitoring tool. Performance wise, the cpumap tracepoints uses bulking, which cause them to have very little overhead. Thus, they are enabled by default. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-20 02:10:55 +01:00
Jesper Dangaard Brouer	e2e3224122	samples/bpf: xdp2skb_meta comment explain why pkt-data pointers are invalidated Improve the 'unknown reason' comment, with an actual explaination of why the ctx pkt-data pointers need to be loaded after the helper function bpf_xdp_adjust_meta(). Based on the explaination Daniel gave. Fixes: `36e04a2d78` ("samples/bpf: xdp2skb_meta shows transferring info from XDP to SKB") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-18 01:49:09 +01:00
Luis de Bethencourt	4c38f74c91	samples/bpf: Fix trailing semicolon The trailing semicolon is an empty statement that does no operation. Removing it since it doesn't do anything. Signed-off-by: Luis de Bethencourt <luisbg@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-17 00:23:58 +01:00
Jesper Dangaard Brouer	36e04a2d78	samples/bpf: xdp2skb_meta shows transferring info from XDP to SKB Creating a bpf sample that shows howto use the XDP 'data_meta' infrastructure, created by Daniel Borkmann. Very few drivers support this feature, but I wanted a functional sample to begin with, when working on adding driver support. XDP data_meta is about creating a communication channel between BPF programs. This can be XDP tail-progs, but also other SKB based BPF hooks, like in this case the TC clsact hook. In this sample I show that XDP can store info named "mark", and TC/clsact chooses to use this info and store it into the skb->mark. It is a bit annoying that XDP and TC samples uses different tools/libs when attaching their BPF hooks. As the XDP and TC programs need to cooperate and agree on a struct-layout, it is best/easiest if the two programs can be contained within the same BPF restricted-C file. As the bpf-loader, I choose to not use bpf_load.c (or libbpf), but instead wrote a bash shell scripted named xdp2skb_meta.sh, which demonstrate howto use the iproute cmdline tools 'tc' and 'ip' for loading BPF programs. To make it easy for first time users, the shell script have command line parsing, and support --verbose and --dry-run mode, if you just want to see/learn the tc+ip command syntax: # ./xdp2skb_meta.sh --dev ixgbe2 --dry-run # Dry-run mode: enable VERBOSE and don't call TC+IP tc qdisc del dev ixgbe2 clsact tc qdisc add dev ixgbe2 clsact tc filter add dev ixgbe2 ingress prio 1 handle 1 bpf da obj ./xdp2skb_meta_kern.o sec tc_mark # Flush XDP on device: ixgbe2 ip link set dev ixgbe2 xdp off ip link set dev ixgbe2 xdp obj ./xdp2skb_meta_kern.o sec xdp_mark Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-11 01:02:25 +01:00
Jesper Dangaard Brouer	0fca931a6f	samples/bpf: program demonstrating access to xdp_rxq_info This sample program can be used for monitoring and reporting how many packets per sec (pps) are received per NIC RX queue index and which CPU processed the packet. In itself it is a useful tool for quickly identifying RSS imbalance issues, see below. The default XDP action is XDP_PASS in-order to provide a monitor mode. For benchmarking purposes it is possible to specify other XDP actions on the cmdline --action. Output below shows an imbalance RSS case where most RXQ's deliver to CPU-0 while CPU-2 only get packets from a single RXQ. Looking at things from a CPU level the two CPUs are processing approx the same amount, BUT looking at the rx_queue_index levels it is clear that RXQ-2 receive much better service, than other RXQs which all share CPU-0. Running XDP on dev:i40e1 (ifindex:3) action:XDP_PASS XDP stats CPU pps issue-pps XDP-RX CPU 0 900,473 0 XDP-RX CPU 2 906,921 0 XDP-RX CPU total 1,807,395 RXQ stats RXQ:CPU pps issue-pps rx_queue_index 0:0 180,098 0 rx_queue_index 0:sum 180,098 rx_queue_index 1:0 180,098 0 rx_queue_index 1:sum 180,098 rx_queue_index 2:2 906,921 0 rx_queue_index 2:sum 906,921 rx_queue_index 3:0 180,098 0 rx_queue_index 3:sum 180,098 rx_queue_index 4:0 180,082 0 rx_queue_index 4:sum 180,082 rx_queue_index 5:0 180,093 0 rx_queue_index 5:sum 180,093 Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-05 15:21:22 -08:00
David S. Miller	59436c9ee1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2017-12-18 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Allow arbitrary function calls from one BPF function to another BPF function. As of today when writing BPF programs, __always_inline had to be used in the BPF C programs for all functions, unnecessarily causing LLVM to inflate code size. Handle this more naturally with support for BPF to BPF calls such that this __always_inline restriction can be overcome. As a result, it allows for better optimized code and finally enables to introduce core BPF libraries in the future that can be reused out of different projects. x86 and arm64 JIT support was added as well, from Alexei. 2) Add infrastructure for tagging functions as error injectable and allow for BPF to return arbitrary error values when BPF is attached via kprobes on those. This way of injecting errors generically eases testing and debugging without having to recompile or restart the kernel. Tags for opting-in for this facility are added with BPF_ALLOW_ERROR_INJECTION(), from Josef. 3) For BPF offload via nfp JIT, add support for bpf_xdp_adjust_head() helper call for XDP programs. First part of this work adds handling of BPF capabilities included in the firmware, and the later patches add support to the nfp verifier part and JIT as well as some small optimizations, from Jakub. 4) The bpftool now also gets support for basic cgroup BPF operations such as attaching, detaching and listing current BPF programs. As a requirement for the attach part, bpftool can now also load object files through 'bpftool prog load'. This reuses libbpf which we have in the kernel tree as well. bpftool-cgroup man page is added along with it, from Roman. 5) Back then commit `e87c6bc385` ("bpf: permit multiple bpf attachments for a single perf event") added support for attaching multiple BPF programs to a single perf event. Given they are configured through perf's ioctl() interface, the interface has been extended with a PERF_EVENT_IOC_QUERY_BPF command in this work in order to return an array of one or multiple BPF prog ids that are currently attached, from Yonghong. 6) Various minor fixes and cleanups to the bpftool's Makefile as well as a new 'uninstall' and 'doc-uninstall' target for removing bpftool itself or prior installed documentation related to it, from Quentin. 7) Add CONFIG_CGROUP_BPF=y to the BPF kernel selftest config file which is required for the test_dev_cgroup test case to run, from Naresh. 8) Fix reporting of XDP prog_flags for nfp driver, from Jakub. 9) Fix libbpf's exit code from the Makefile when libelf was not found in the system, also from Jakub. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-18 10:51:06 -05:00
William Tu	ac80c2a165	samples/bpf: add erspan v2 sample code Extend the existing tests for ipv4 ipv6 erspan version 2. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 12:34:00 -05:00
Josef Bacik	965de87e54	samples/bpf: add a test for bpf_override_return This adds a basic test for bpf_override_return to verify it works. We override the main function for mounting a btrfs fs so it'll return -ENOMEM and then make sure that trying to mount a btrfs fs will fail. Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2017-12-12 09:02:40 -08:00
William Tu	d37e3bb774	samples/bpf: add ip6erspan sample code Extend the existing tests for ip6erspan. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-06 14:45:29 -05:00
David S. Miller	7cda4cee13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Small overlapping change conflict ('net' changed a line, 'net-next' added a line right afterwards) in flexcan.c Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 10:44:19 -05:00
David S. Miller	d671965b54	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2017-12-03 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Addition of a software model for BPF offloads in order to ease testing code changes in that area and make semantics more clear. This is implemented in a new driver called netdevsim, which can later also be extended for other offloads. SR-IOV support is added as well to netdevsim. BPF kernel selftests for offloading are added so we can track basic functionality as well as exercising all corner cases around BPF offloading, from Jakub. 2) Today drivers have to drop the reference on BPF progs they hold due to XDP on device teardown themselves. Change this in order to make XDP handling inside the drivers less error prone, and move disabling XDP to the core instead, also from Jakub. 3) Misc set of BPF verifier improvements and cleanups as preparatory work for upcoming BPF-to-BPF calls. Among others, this set also improves liveness marking such that pruning can be slightly more effective. Register and stack liveness information is now included in the verifier log as well, from Alexei. 4) nfp JIT improvements in order to identify load/store sequences in the BPF prog e.g. coming from memcpy lowering and optimizing them through the NPU's command push pull (CPP) instruction, from Jiong. 5) Cleanups to test_cgrp2_attach2.c BPF sample code in oder to remove bpf_prog_attach() magic values and replacing them with actual proper attach flag instead, from David. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-04 12:07:10 -05:00
William Tu	56ddd30280	samples/bpf: extend test_tunnel_bpf.sh with ip6gre Extend existing tests for vxlan, gre, geneve, ipip, erspan, to include ip6 gre and gretap tunnel. Signed-off-by: William Tu <u9012063@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-04 11:04:19 -05:00
David S. Miller	c2eb6d07a6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2017-12-02 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix a compilation warning in xdp redirect tracepoint due to missing bpf.h include that pulls in struct bpf_map, from Xie. 2) Limit the maximum number of attachable BPF progs for a given perf event as long as uabi is not frozen yet. The hard upper limit is now 64 and therefore the same as with BPF multi-prog for cgroups. Also add related error checking for the sample BPF loader when enabling and attaching to the perf event, from Yonghong. 3) Specifically set the RLIMIT_MEMLOCK for the test_verifier_log case, so that the test case can always pass and not fail in some environments due to too low default limit, also from Yonghong. 4) Fix up a missing license header comment for kernel/bpf/offload.c, from Jakub. 5) Several fixes for bpftool, among others a crash on incorrect arguments when json output is used, error message handling fixes on unknown options and proper destruction of json writer for some exit cases, all from Quentin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-03 13:08:30 -05:00
Yonghong Song	0ec9552b43	samples/bpf: add error checking for perf ioctl calls in bpf loader load_bpf_file() should fail if ioctl with command PERF_EVENT_IOC_ENABLE and PERF_EVENT_IOC_SET_BPF fails. When they do fail, proper error messages are printed. With this change, the below "syscall_tp" run shows that the maximum number of bpf progs attaching to the same perf tracepoint is indeed enforced. $ ./syscall_tp -i 64 prog #0: map ids 4 5 ... prog #63: map ids 382 383 $ ./syscall_tp -i 65 prog #0: map ids 4 5 ... prog #64: map ids 388 389 ioctl PERF_EVENT_IOC_SET_BPF failed err Argument list too long Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 02:59:21 +01:00
David Ahern	51de082570	samples/bpf: Convert magic numbers to names in multi-prog cgroup test case Attach flag 1 == BPF_F_ALLOW_OVERRIDE; attach flag 2 == BPF_F_ALLOW_MULTI. Update the calls to bpf_prog_attach() in test_cgrp2_attach2.c to use the names over the magic numbers. Fixes: `39323e788c` ("samples/bpf: add multi-prog cgroup test case") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-30 19:58:11 +01:00
Masahiro Yamada	bf070bb0e6	kbuild: remove all dummy assignments to obj- Now kbuild core scripts create empty built-in.o where necessary. Remove "obj- := dummy.o" tricks. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2017-11-18 11:46:06 +09:00
Dan Carpenter	fae45363ae	xdp: sample: Missing curly braces in read_route() The assert statement is supposed to be part of the else branch but the curly braces were accidentally left off. Fixes: `3e29cd0e65` ("xdp: Sample xdp program implementing ip forward") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-14 22:00:13 +09:00
David S. Miller	f3edacbd69	bpf: Revert bpf_overrid_function() helper changes. NACK'd by x86 maintainer. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-11 18:24:55 +09:00
Lawrence Brakmo	03e982eed4	bpf: Fix tcp_clamp_kern.c sample program The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: `390ee7e29f` ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-11 15:52:41 +09:00
Lawrence Brakmo	e1853319fc	bpf: Fix tcp_iw_kern.c sample program The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: `390ee7e29f` ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-11 15:52:41 +09:00
Lawrence Brakmo	2ff969fbe2	bpf: Fix tcp_cong_kern.c sample program The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: `390ee7e29f` ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-11 15:52:41 +09:00
Lawrence Brakmo	a4174f0560	bpf: Fix tcp_bufs_kern.c sample program The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: `390ee7e29f` ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-11 15:52:41 +09:00
Lawrence Brakmo	016e661bb0	bpf: Fix tcp_rwnd_kern.c sample program The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: `390ee7e29f` ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-11 15:52:41 +09:00
Lawrence Brakmo	7863f46bac	bpf: Fix tcp_synrto_kern.c sample program The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: `390ee7e29f` ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-11 15:52:41 +09:00
Josef Bacik	eafb3401fa	samples/bpf: add a test for bpf_override_return This adds a basic test for bpf_override_return to verify it works. We override the main function for mounting a btrfs fs so it'll return -ENOMEM and then make sure that trying to mount a btrfs fs will fail. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Josef Bacik <jbacik@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-11 12:18:06 +09:00
Lawrence Brakmo	aaf151b9e6	bpf: Rename tcp_bbf.readme to tcp_bpf.readme The original patch had the wrong filename. Fixes: `bfdf756938` ("bpf: create samples/bpf/tcp_bpf.readme") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-08 12:13:41 +09:00
Christina Jacob	3e29cd0e65	xdp: Sample xdp program implementing ip forward Implements port to port forwarding with route table and arp table lookup for ipv4 packets using bpf_redirect helper function and lpm_trie map. Signed-off-by: Christina Jacob <Christina.Jacob@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-08 10:39:41 +09:00
Roman Gushchin	9d1f159419	bpf: move cgroup_helpers from samples/bpf/ to tools/testing/selftesting/bpf/ The purpose of this move is to use these files in bpf tests. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 23:26:51 +09:00
David S. Miller	2a171788ba	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Files removed in 'net-next' had their license header updated in 'net'. We take the remove from 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-04 09:26:51 +09:00
Greg Kroah-Hartman	b24413180f	License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-02 11:10:55 +01:00
Tushar Dave	21d72af7dc	samples/bpf: adjust rlimit RLIMIT_MEMLOCK for xdp_redirect_map Default rlimit RLIMIT_MEMLOCK is 64KB, causes bpf map failure. e.g. [root@labbpf]# ./xdp_redirect_map $(</sys/class/net/eth2/ifindex) \ > $(</sys/class/net/eth3/ifindex) failed to create a map: 1 Operation not permitted The failure is 100% when multiple xdp programs are running. Fix it. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 12:17:32 +09:00
Tushar Dave	6dfca831c0	samples/bpf: adjust rlimit RLIMIT_MEMLOCK for xdp1 Default rlimit RLIMIT_MEMLOCK is 64KB, causes bpf map failure. e.g. [root@lab bpf]#./xdp1 -N $(</sys/class/net/eth2/ifindex) failed to create a map: 1 Operation not permitted Fix it. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 12:17:13 +09:00
Yonghong Song	a678be5cc7	bpf: add a test case to test single tp multiple bpf attachment The bpf sample program syscall_tp is modified to show attachment of more than bpf programs for a particular kernel tracepoint. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-25 10:47:47 +09:00
Lawrence Brakmo	bfdf756938	bpf: create samples/bpf/tcp_bpf.readme Readme file explaining how to create a cgroupv2 and attach one of the tcp_*_kern.o socket_ops BPF program. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked_by: Alexei Starovoitov <ast@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-22 03:12:05 +01:00
Lawrence Brakmo	c890063e44	bpf: sample BPF_SOCKET_OPS_BASE_RTT program Sample socket_ops BPF program to test the BPF helper function bpf_getsocketops and the new socket_ops op BPF_SOCKET_OPS_BASE_RTT. The program provides a base RTT of 80us when the calling flow is within a DC (as determined by the IPV6 prefix) and the congestion algorithm is "nv". Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked_by: Alexei Starovoitov <ast@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-22 03:12:05 +01:00
Jesper Dangaard Brouer	fad3917e36	samples/bpf: add cpumap sample program xdp_redirect_cpu This sample program show how to use cpumap and the associated tracepoints. It provides command line stats, which shows how the XDP-RX process, cpumap-enqueue and cpumap kthread dequeue is cooperating on a per CPU basis. It also utilize the xdp_exception and xdp_redirect_err transpoints to allow users quickly to identify setup issues. One issue with ixgbe driver is that the driver reset the link when loading XDP. This reset the procfs smp_affinity settings. Thus, after loading the program, these must be reconfigured. The easiest workaround it to reduce the RX-queue to e.g. two via: # ethtool --set-channels ixgbe1 combined 2 And then add CPUs above 0 and 1, like: # xdp_redirect_cpu --dev ixgbe1 --prog 2 --cpu 2 --cpu 3 --cpu 4 Another issue with ixgbe is that the page recycle mechanism is tied to the RX-ring size. And the default setting of 512 elements is too small. This is the same issue with regular devmap XDP_REDIRECT. To overcome this I've been using 1024 rx-ring size: # ethtool -G ixgbe1 rx 1024 tx 1024 V3: - whitespace cleanups - bpf tracepoint cannot access top part of struct V4: - report on kthread sched events, according to tracepoint change - report average bulk enqueue size V5: - bpf_map_lookup_elem on cpumap not allowed from bpf_prog use separate map to mark CPUs not available V6: - correct kthread sched summary output V7: - Added a --stress-mode for concurrently changing underlying cpumap Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-18 12:12:18 +01:00
Abhijit Ayarekar	9db9583839	bpf: Add -target to clang switch while cross compiling. Update to llvm excludes assembly instructions. llvm git revision is below commit 65fad7c26569 ("bpf: add inline-asm support") This change will be part of llvm release 6.0 __ASM_SYSREG_H define is not required for native compile. -target switch includes appropriate target specific files while cross compiling Tested on x86 and arm64. Signed-off-by: Abhijit Ayarekar <abhijit.ayarekar@caviumnetworks.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 21:04:32 +01:00
Yonghong Song	81b9cf8028	bpf: add a test case for helper bpf_perf_prog_read_value The bpf sample program trace_event is enhanced to use the new helper to print out enabled/running time. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-07 23:05:57 +01:00
Yonghong Song	020a32d958	bpf: add a test case for helper bpf_perf_event_read_value The bpf sample program tracex6 is enhanced to use the new helper to read enabled/running time as well. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-07 23:05:57 +01:00
Jesper Dangaard Brouer	c4eb7f4643	samples/bpf: xdp_monitor increase memory rlimit Other concurrent running programs, like perf or the XDP program what needed to be monitored, might take up part of the max locked memory limit. Thus, the xdp_monitor tool have to set the RLIMIT_MEMLOCK to RLIM_INFINITY, as it cannot determine a more sane limit. Using the man exit(3) specified EXIT_FAILURE return exit code, and correct other users too. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 10:04:36 -07:00
Jesper Dangaard Brouer	280b058d48	samples/bpf: xdp_monitor also record xdp_exception tracepoint Also monitor the tracepoint xdp_exception. This tracepoint is usually invoked by the drivers. Programs themselves can activate this by returning XDP_ABORTED, which will drop the packet but also trigger the tracepoint. This is useful for distinguishing intentional (XDP_DROP) vs. ebpf-program error cases that cased a drop (XDP_ABORTED). Drivers also use this tracepoint for reporting on XDP actions that are unknown to the specific driver. This can help the user to detect if a driver e.g. doesn't implement XDP_REDIRECT yet. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 10:04:35 -07:00
Jesper Dangaard Brouer	f4ce0a0116	samples/bpf: xdp_monitor first 8 bytes are not accessible by bpf The first 8 bytes of the tracepoint context struct are not accessible by the bpf code. This is a choice that dates back to the original inclusion of this code. See explaination in: commit `98b5c2c65c` ("perf, bpf: allow bpf programs attach to tracepoints") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 10:04:35 -07:00
Alexei Starovoitov	dfc069998e	samples/bpf: use bpf_prog_query() interface use BPF_PROG_QUERY command to strengthen test coverage Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-04 16:05:06 -07:00
Alexei Starovoitov	39323e788c	samples/bpf: add multi-prog cgroup test case create 5 cgroups, attach 6 progs and check that progs are executed as: cgrp1 (MULTI progs A, B) -> cgrp2 (OVERRIDE prog C) -> cgrp3 (MULTI prog D) -> cgrp4 (OVERRIDE prog E) -> cgrp5 (NONE prog F) the event in cgrp5 triggers execution of F,D,A,B in that order. if prog F is detached, the execution is E,D,A,B if prog F and D are detached, the execution is E,A,B if prog F, E and D are detached, the execution is C,A,B Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-04 16:05:05 -07:00
Stephen Hemminger	0929567a7a	samples/bpf: fix warnings in xdp_monitor_user Make local functions static to fix HOSTCC samples/bpf/xdp_monitor_user.o samples/bpf/xdp_monitor_user.c:64:7: warning: no previous prototype for ‘gettime’ [-Wmissing-prototypes] __u64 gettime(void) ^~~~~~~ samples/bpf/xdp_monitor_user.c:209:6: warning: no previous prototype for ‘print_bpf_prog_info’ [-Wmissing-prototypes] void print_bpf_prog_info(void) ^~~~~~~~~~~~~~~~~~~ Fixes: `3ffab54602` ("samples/bpf: xdp_monitor tool based on tracepoints") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-01 23:08:24 -07:00
Martin KaFai Lau	88cda1c9da	bpf: libbpf: Provide basic API support to specify BPF obj name This patch extends the libbpf to provide API support to allow specifying BPF object name. In tools/lib/bpf/libbpf, the C symbol of the function and the map is used. Regarding section name, all maps are under the same section named "maps". Hence, section name is not a good choice for map's name. To be consistent with map, bpf_prog also follows and uses its function symbol as the prog's name. This patch adds logic to collect function's symbols in libbpf. There is existing codes to collect the map's symbols and no change is needed. The bpf_load_program_name() and bpf_map_create_name() are added to take the name argument. For the other bpf_map_create_xxx() variants, a name argument is directly added to them. In samples/bpf, bpf_load.c in particular, the symbol is also used as the map's name and the map symbols has already been collected in the existing code. For bpf_prog, bpf_load.c does not collect the function symbol name. We can consider to collect them later if there is a need to continue supporting the bpf_load.c. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-29 06:17:05 +01:00
Joel Fernandes	8bf2ac25a9	samples/bpf: Add documentation on cross compilation Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Joel Fernandes <joelaf@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-21 11:59:16 -07:00
Joel Fernandes	b655fc1c2e	samples/bpf: Fix pt_regs issues when cross-compiling BPF samples fail to build when cross-compiling for ARM64 because of incorrect pt_regs param selection. This is because clang defines __x86_64__ and bpf_headers thinks we're building for x86. Since clang is building for the BPF target, it shouldn't make assumptions about what target the BPF program is going to run on. To fix this, lets pass ARCH so the header knows which target the BPF program is being compiled for and can use the correct pt_regs code. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Joel Fernandes <joelaf@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-21 11:59:16 -07:00
Joel Fernandes	876e88e327	samples/bpf: Enable cross compiler support When cross compiling, bpf samples use HOSTCC for compiling the non-BPF part of the sample, however what we really want is to use the cross compiler to build for the cross target since that is what will load and run the BPF sample. Detect this and compile samples correctly. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Joel Fernandes <joelaf@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-21 11:59:16 -07:00
Joel Fernandes	95ec669685	samples/bpf: Use getppid instead of getpgrp for array map stress When cross-compiling the bpf sample map_perf_test for aarch64, I find that __NR_getpgrp is undefined. This causes build errors. This syscall is deprecated and requires defining __ARCH_WANT_SYSCALL_DEPRECATED. To avoid having to define that, just use a different syscall (getppid) for the array map stress test. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Joel Fernandes <joelaf@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-21 11:59:16 -07:00
Martin KaFai Lau	637cd8c312	bpf: Add lru_hash_lookup performance test Create a new case to test the LRU lookup performance. At the beginning, the LRU map is fully loaded (i.e. the number of keys is equal to map->max_entries). The lookup is done through key 0 to num_map_entries and then repeats from 0 again. This patch also creates an anonymous struct to properly name the test params in stress_lru_hmap_alloc() in map_perf_test_kern.c. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 09:57:38 -07:00
David Ahern	0adc3dd900	samples/bpf: Update cgroup socket examples to use uid gid helper Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 06:05:15 +01:00
David Ahern	33aeb5e30a	samples/bpf: Update cgrp2 socket tests Update cgrp2 bpf sock tests to check that device, mark and priority can all be set on a socket via bpf programs attached to a cgroup. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 06:05:15 +01:00
David Ahern	f776d460b8	samples/bpf: Add option to dump socket settings Add option to dump socket settings. Will be used in the next patch to verify bpf programs are correctly setting mark, priority and device based on the cgroup attachment for the program run. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 06:05:15 +01:00
David Ahern	609b1c3275	samples/bpf: Add detach option to test_cgrp2_sock Add option to detach programs from a cgroup. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 06:05:15 +01:00
David Ahern	fa38aa17bc	samples/bpf: Update sock test to allow setting mark and priority Update sock test to set mark and priority on socket create. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 06:05:15 +01:00
Tariq Toukan	3edcf18ec2	samples/bpf: Fix compilation issue in redirect dummy program Fix compilation error below: $ make samples/bpf/ LLVM ERROR: 'xdp_redirect_dummy' label emitted multiple times to assembly file make[1]: * [samples/bpf/xdp_redirect_kern.o] Error 1 make: * [samples/bpf/] Error 2 Fixes: `306da4e685` ("samples/bpf: xdp_redirect load XDP dummy prog on TX device") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-31 11:56:57 -07:00
Jesper Dangaard Brouer	3ffab54602	samples/bpf: xdp_monitor tool based on tracepoints This tool xdp_monitor demonstrate how to use the different xdp_redirect tracepoints xdp_redirect{,_map}{,_err} from a BPF program. The default mode is to only monitor the error counters, to avoid affecting the per packet performance. Tracepoints comes with a base overhead of 25 nanosec for an attached bpf_prog, and 48 nanosec for using a full perf record (with non-matching filter). Thus, default loading the --stats mode could affect the maximum performance. This version of the tool is very simple and count all types of errors as one. It will be natural to extend this later with the different types of errors that can occur, which should help users quickly identify common mistakes. Because the TP_STRUCT was kept in sync all the tracepoints loads the same BPF code. It would also be natural to extend the map version to demonstrate how the map information could be used. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:51:29 -07:00
Jesper Dangaard Brouer	306da4e685	samples/bpf: xdp_redirect load XDP dummy prog on TX device For supporting XDP_REDIRECT, a device driver must (obviously) implement the "TX" function ndo_xdp_xmit(). An additional requirement is you cannot TX out a device, unless it also have a xdp bpf program attached. This dependency is caused by the driver code need to setup XDP resources before it can ndo_xdp_xmit. Update bpf samples xdp_redirect and xdp_redirect_map to automatically attach a dummy XDP program to the configured ifindex_out device. Use the XDP flag XDP_FLAGS_UPDATE_IF_NOEXIST on the dummy load, to avoid overriding an existing XDP prog on the device. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:51:29 -07:00
William Tu	ef88f89c83	samples/bpf: extend test_tunnel_bpf.sh with ERSPAN Extend existing tests for vxlan, gre, geneve, ipip to include ERSPAN tunnel. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-28 15:04:52 -07:00
Martin KaFai Lau	ad17d0e6c7	bpf: Allow numa selection in INNER_LRU_HASH_PREALLOC test of map_perf_test This patch makes the needed changes to allow each process of the INNER_LRU_HASH_PREALLOC test to provide its numa node id when creating the lru map. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-19 21:35:43 -07:00
John Fastabend	69e8cc134b	bpf: sockmap sample program This program binds a program to a cgroup and then matches hard coded IP addresses and adds these to a sockmap. This will receive messages from the backend and send them to the client. client:X <---> frontend:10000 client:X <---> backend:10001 To keep things simple this is only designed for 1:1 connections using hard coded values. A more complete example would allow many backends and clients. To run, # sockmap <cgroup2_dir> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-16 11:27:53 -07:00
Yonghong Song	1da236b6be	bpf: add a test case for syscalls/sys_{enter\|exit}_* tracepoints Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 14:09:48 -07:00
David S. Miller	29fda25a2d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two minor conflicts in virtio_net driver (bug fix overlapping addition of a helper) and MAINTAINERS (new driver edit overlapping revamp of PHY entry). Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 10:07:50 -07:00
William Tu	cc75f8514d	samples/bpf: fix bpf tunnel cleanup test_tunnel_bpf.sh fails to remove the vxlan11 tunnel device, causing the next geneve tunnelling test case fails. In addition, the geneve reserved bit in tcbpf2_kern.c should be zero, according to the RFC. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 22:02:47 -07:00
Andy Gospodarek	24251c2647	samples/bpf: add option for native and skb mode for redirect apps When testing with a driver that has both native and generic redirect support: $ sudo ./samples/bpf/xdp_redirect -N 5 6 input: 5 output: 6 ifindex 6: 4961879 pkt/s ifindex 6: 6391319 pkt/s ifindex 6: 6419468 pkt/s $ sudo ./samples/bpf/xdp_redirect -S 5 6 input: 5 output: 6 ifindex 6: 1845435 pkt/s ifindex 6: 3882850 pkt/s ifindex 6: 3893974 pkt/s $ sudo ./samples/bpf/xdp_redirect_map -N 5 6 input: 5 output: 6 map[0] (vports) = 4, map[1] (map) = 5, map[2] (count) = 0 ifindex 6: 2207374 pkt/s ifindex 6: 6212869 pkt/s ifindex 6: 6286515 pkt/s $ sudo ./samples/bpf/xdp_redirect_map -S 5 6 input: 5 output: 6 map[0] (vports) = 4, map[1] (map) = 5, map[2] (count) = 0 ifindex 6: 5052528 pkt/s ifindex 6: 5736631 pkt/s ifindex 6: 5739962 pkt/s Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 13:46:27 -07:00
John Fastabend	9d6e005287	xdp: bpf redirect with map sample program Signed-off-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andy Gospodarek <andy@greyhouse.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:48:06 -07:00
John Fastabend	832622e6bd	xdp: sample program for new bpf_redirect helper This implements a sample program for testing bpf_redirect. It reports the number of packets redirected per second and as input takes the ifindex of the device to run the xdp program on and the ifindex of the interface to redirect packets to. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andy Gospodarek <andy@greyhouse.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:48:06 -07:00
Yonghong Song	533350227d	samples/bpf: fix a build issue With latest net-next: ==== clang -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include -I./arch/x86/include -I./arch/x86/include/generated/uapi -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -Isamples/bpf \ -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \ -Wno-compare-distinct-pointer-types \ -Wno-gnu-variable-sized-type-not-at-end \ -Wno-address-of-packed-member -Wno-tautological-compare \ -Wno-unknown-warning-option \ -O2 -emit-llvm -c samples/bpf/tcp_synrto_kern.c -o -\| llc -march=bpf -filetype=obj -o samples/bpf/tcp_synrto_kern.o samples/bpf/tcp_synrto_kern.c:20:10: fatal error: 'bpf_endian.h' file not found ^~~~~~~~~~~~~~ 1 error generated. ==== net has the same issue. Add support for ntohl and htonl in tools/testing/selftests/bpf/bpf_endian.h. Also move bpf_helpers.h from samples/bpf to selftests/bpf and change compiler include logic so that programs in samples/bpf can access the headers in selftests/bpf, but not the other way around. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-11 20:51:29 -07:00
Lawrence Brakmo	f856e46978	bpf: fix return in load_bpf_file The function load_bpf_file ignores the return value of load_and_attach(), so even if load_and_attach() returns an error, load_bpf_file() will return 0. Now, load_bpf_file() can call load_and_attach() multiple times and some can succeed and some could fail. I think the correct behavor is to return error on the first failed load_and_attach(). v2: Added missing SOB Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-05 09:05:28 +01:00
Lawrence Brakmo	6c4a01b278	bpf: Sample bpf program to set sndcwnd clamp Sample BPF program, tcp_clamp_kern.c, to demostrate the use of setting the sndcwnd clamp. This program assumes that if the first 5.5 bytes of the host's IPv6 addresses are the same, then the hosts are in the same datacenter and sets sndcwnd clamp to 100 packets, SYN and SYN-ACK RTOs to 10ms and send/receive buffer sizes to 150KB. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-01 16:15:14 -07:00
Lawrence Brakmo	7bc62e2854	bpf: Sample BPF program to set initial cwnd Sample BPF program that assumes hosts are far away (i.e. large RTTs) and sets initial cwnd and initial receive window to 40 packets, send and receive buffers to 1.5MB. In practice there would be a test to insure the hosts are actually far enough away. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-01 16:15:14 -07:00

... 3 4 5 6 7 ...

624 Коммитов