WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
John Fastabend	0608c69c9a	bpf: sk_msg, sock{map\|hash} redirect through ULP A sockmap program that redirects through a kTLS ULP enabled socket will not work correctly because the ULP layer is skipped. This fixes the behavior to call through the ULP layer on redirect to ensure any operations required on the data stream at the ULP layer continue to be applied. To do this we add an internal flag MSG_SENDPAGE_NOPOLICY to avoid calling the BPF layer on a redirected message. This is required to avoid calling the BPF layer multiple times (possibly recursively) which is not the current/expected behavior without ULPs. In the future we may add a redirect flag if users _do_ want the policy applied again but this would need to work for both ULP and non-ULP sockets and be opt-in to avoid breaking existing programs. Also to avoid polluting the flag space with an internal flag we reuse the flag space overlapping MSG_SENDPAGE_NOPOLICY with MSG_WAITFORONE. Here WAITFORONE is specific to recv path and SENDPAGE_NOPOLICY is only used for sendpage hooks. The last thing to verify is user space API is masked correctly to ensure the flag can not be set by user. (Note this needs to be true regardless because we have internal flags already in-use that user space should not be able to set). But for completeness we have two UAPI paths into sendpage, sendfile and splice. In the sendfile case the function do_sendfile() zero's flags, ./fs/read_write.c: static ssize_t do_sendfile(int out_fd, int in_fd, loff_t ppos, size_t count, loff_t max) { ... fl = 0; #if 0 / * We need to debate whether we can enable this or not. The * man page documents EAGAIN return for the output at least, * and the application is arguably buggy if it doesn't expect * EAGAIN on a non-blocking file descriptor. / if (in.file->f_flags & O_NONBLOCK) fl = SPLICE_F_NONBLOCK; #endif file_start_write(out.file); retval = do_splice_direct(in.file, &pos, out.file, &out_pos, count, fl); } In the splice case the pipe_to_sendpage "actor" is used which masks flags with SPLICE_F_MORE. ./fs/splice.c: static int pipe_to_sendpage(struct pipe_inode_info pipe, struct pipe_buffer buf, struct splice_desc sd) { ... more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0; ... } Confirming what we expect that internal flags are in fact internal to socket side. Fixes: `d3b18ad31f` ("tls: add bpf support to sk_msg handling") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 23:47:09 +01:00
John Fastabend	a136678c0b	bpf: sk_msg, zap ingress queue on psock down In addition to releasing any cork'ed data on a psock when the psock is removed we should also release any skb's in the ingress work queue. Otherwise the skb's eventually get free'd but late in the tear down process so we see the WARNING due to non-zero sk_forward_alloc. void sk_stream_kill_queues(struct sock *sk) { ... WARN_ON(sk->sk_forward_alloc); ... } Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 23:47:09 +01:00
John Fastabend	552de91068	bpf: sk_msg, fix socket data_ready events When a skb verdict program is in-use and either another BPF program redirects to that socket or the new SK_PASS support is used the data_ready callback does not wake up application. Instead because the stream parser/verdict is using the sk data_ready callback we wake up the stream parser/verdict block. Fix this by adding a helper to check if the stream parser block is enabled on the sk and if so call the saved pointer which is the upper layers wake up function. This fixes application stalls observed when an application is waiting for data in a blocking read(). Fixes: `d829e9c411` ("tls: convert to generic sk_msg interface") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 23:47:09 +01:00
John Fastabend	51199405f9	bpf: skb_verdict, support SK_PASS on RX BPF path Add SK_PASS verdict support to SK_SKB_VERDICT programs. Now that support for redirects exists we can implement SK_PASS as a redirect to the same socket. This simplifies the BPF programs and avoids an extra map lookup on RX path for simple visibility cases. Further, reduces user (BPF programmer in this context) confusion when their program drops skb due to lack of support. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 23:47:09 +01:00
John Fastabend	7a69c0f250	bpf: skmsg, replace comments with BUILD bug Enforce comment on structure layout dependency with a BUILD_BUG_ON to ensure the condition is maintained. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 23:47:09 +01:00
John Fastabend	bc1b4f013b	bpf: sk_msg, improve offset chk in _is_valid_access The check for max offset in sk_msg_is_valid_access uses sizeof() which is incorrect because it would allow accessing possibly past the end of the struct in the padded case. Further, it doesn't preclude accessing any padding that may be added in the middle of a struct. All told this makes it fragile to rely on. To fix this explicitly check offsets with fields using the bpf_ctx_range() and bpf_ctx_range_till() macros. For reference the current structure layout looks as follows (reported by pahole) struct sk_msg_md { union { void * data; /* 8 / }; / 0 8 / union { void data_end; /* 8 / }; / 8 8 / __u32 family; / 16 4 / __u32 remote_ip4; / 20 4 / __u32 local_ip4; / 24 4 / __u32 remote_ip6[4]; / 28 16 / __u32 local_ip6[4]; / 44 16 / __u32 remote_port; / 60 4 / / --- cacheline 1 boundary (64 bytes) --- / __u32 local_port; / 64 4 / __u32 size; / 68 4 / / size: 72, cachelines: 2, members: 10 / / last cacheline: 8 bytes */ }; So there should be no padding at the moment but fixing this now prevents future errors. Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 23:47:08 +01:00
John Fastabend	9ee79a65d1	bpf: sk_msg, fix sk_msg_md access past end test Currently, the test to ensure reads past the end of the sk_msg_md data structure fail is incorrectly expecting success. Fix this typo and use correct expected error. Fixes: `945a47d87c` ("bpf: sk_msg, add tests for size field") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 23:47:08 +01:00
Jesper Dangaard Brouer	77ea5f4cbe	bpf/cpumap: make sure frame_size for build_skb is aligned if headroom isn't The frame_size passed to build_skb must be aligned, else it is possible that the embedded struct skb_shared_info gets unaligned. For correctness make sure that xdpf->headroom in included in the alignment. No upstream drivers can hit this, as all XDP drivers provide an aligned headroom. This was discovered when playing with implementing XDP support for mvneta, which have a 2 bytes DSA header, and this Marvell ARM64 platform didn't like doing atomic operations on an unaligned skb_shinfo(skb)->dataref addresses. Fixes: `1c601d829a` ("bpf: cpumap xdp_buff to skb conversion and allocation") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 23:19:12 +01:00
Daniel Borkmann	d70f4ece9d	Merge branch 'bpf-jset-verifier' Jakub Kicinski says: ==================== This is a v2 of the patch set to teach the verifier about BPF_JSET instruction. There is also a number of tests include for both basic functioning of the instruction and the verifier logic. The NFP JIT handling of JSET is tweaked. Last patch adds missing file to gitignore. Reposting part of previous series without the dead code elimination. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:30 +01:00
Jakub Kicinski	489c066cfd	selftests: bpf: add missing executables to .gitignore commit `435f90a338` ("selftests/bpf: add a test case for sock_ops perf-event notification") missed adding new test to gitignore. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:29 +01:00
Jakub Kicinski	4987eaccd2	nfp: bpf: optimize codegen for JSET with a constant The top word of the constant can only have bits set if sign extension set it to all-1, therefore we don't really have to mask the top half of the register. We can just OR it into the result as is. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:29 +01:00
Jakub Kicinski	6e774845b3	nfp: bpf: remove the trivial JSET optimization The verifier will now understand the JSET instruction, so don't mark the dead branch in the JIT as noop. We won't generate any code, anyway. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:28 +01:00
Jakub Kicinski	9b38c4056b	bpf: verifier: reorder stack size check with dead code sanitization Reorder the calls to check_max_stack_depth() and sanitize_dead_code() to separate functions which can rewrite instructions from pure checks. No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:28 +01:00
Jakub Kicinski	14507e35bd	selftests: bpf: verifier: add tests for JSET interpretation Validate that the verifier reasons correctly about the bounds and removes dead code based on results of JSET instruction. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:28 +01:00
Jakub Kicinski	960ea05656	bpf: verifier: teach the verifier to reason about the BPF_JSET instruction Some JITs (nfp) try to optimize code on their own. It could make sense in case of BPF_JSET instruction which is currently not interpreted by the verifier, meaning for instance that dead could would not be detected if it was under BPF_JSET branch. Teach the verifier basics of BPF_JSET, JIT optimizations will be removed shortly. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:28 +01:00
Jakub Kicinski	5a8d5209ac	selftests: bpf: add trivial JSET tests We seem to have no JSET instruction test, and LLVM does not generate it at all, so let's add a simple hand-coded test to make sure JIT implementations are correct. v2: - extend test_verifier to handle multiple inputs and add the sample there (Daniel) - add a sign extension case Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:28 +01:00
Martin KaFai Lau	9df95e8ec5	bpf: sparc64: Enable sparc64 jit to provide bpf_line_info This patch enables sparc64's bpf_int_jit_compile() to provide bpf_line_info by calling bpf_prog_fill_jited_linfo(). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 02:04:53 +01:00
Alexei Starovoitov	6f1f78efbb	Merge branch 'line_info-check-for-ld_imm64' Martin KaFai Lau says: ==================== This series ensures the line_info (passed by the userspace during bpf_prog_load) cannot have its line_info.insn_off pointing to a zero bpf insn code. F.e. a broken userspace tool might generate a line_info.insn_off that points to the second 8 bytes of a BPF_LD_IMM64. The first patch is the kernel change. The second patch is a new test case. ==================== Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-12-19 15:45:43 -08:00
Martin KaFai Lau	e30f5640e3	bpf: Add BPF_LD_IMM64 to the line_info test This patch adds a BPF_LD_IMM64 case to the line_info test to ensure the kernel rejects linfo_info.insn_off pointing to the 2nd 8 bytes of the BPF_LD_IMM64. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-12-19 15:42:55 -08:00
Martin KaFai Lau	fdbaa0beb7	bpf: Ensure line_info.insn_off cannot point to insn with zero code This patch rejects a line_info if the bpf insn code referred by line_info.insn_off is 0. F.e. a broken userspace tool might generate a line_info.insn_off that points to the second 8 bytes of a BPF_LD_IMM64. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-12-19 15:42:55 -08:00
Ivan Babrou	9e88b9312a	tools: bpftool: do not force gcc as CC This allows transparent cross-compilation with CROSS_COMPILE by relying on `7ed1c1901f` ("tools: fix cross-compile var clobbering"). Signed-off-by: Ivan Babrou <ivan@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-19 21:57:25 +01:00
Björn Töpel	e2ce367488	xsk: simplify AF_XDP socket teardown Prior this commit, when the struct socket object was being released, the UMEM did not have its reference count decreased. Instead, this was done in the struct sock sk_destruct function. There is no reason to keep the UMEM reference around when the socket is being orphaned, so in this patch the xdp_put_mem is called in the xsk_release function. This results in that the xsk_destruct function can be removed! Note that, it still holds that a struct xsk_sock reference might still linger in the XSKMAP after the UMEM is released, e.g. if a user does not clear the XSKMAP prior to closing the process. This sock will be in a "released" zombie like state, until the XSKMAP is removed. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-19 21:45:17 +01:00
Yonghong Song	76c43ae84e	bpf: log struct/union attribute for forward type Current btf internal verbose logger logs fwd type as [2] FWD A type_id=0 where A is the type name. Commit `9d5f9f701b` ("bpf: btf: fix struct/union/fwd types with kind_flag") introduced kind_flag which can be used to distinguish whether a forward type is a struct or union. Also, "type_id=0" does not carry any meaningful information for fwd type as btf_type.type = 0 is simply enforced during btf verification and is not used anywhere else. This commit changed the log to [2] FWD A struct if kind_flag = 0, or [2] FWD A union if kind_flag = 1. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-19 00:47:56 +01:00
Daniel Borkmann	dd4bfda9cf	Merge branch 'bpf-sk-msg-size-member' John Fastabend says: ==================== This adds a size field to the sk_msg_md data structure used by SK_MSG programs. Without this in the zerocopy case and in the copy case where multiple iovs are in use its difficult to know how much data can be pulled in. The normal method of reading data and data_end only give the current contiguous buffer. BPF programs can attempt to pull in extra data but have to guess if it exists. This can result in multiple "guesses" its much better if we know upfront the size of the sk_msg. ==================== Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-19 00:30:05 +01:00
John Fastabend	945a47d87c	bpf: sk_msg, add tests for size field This adds tests to read the size field to test_verifier. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-19 00:27:23 +01:00
John Fastabend	584e46813e	bpf: add tools lib/include support sk_msg_md size field Add the size field to sk_msg_md for tools. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-19 00:27:23 +01:00
John Fastabend	3bdbd0228e	bpf: sockmap, metadata support for reporting size of msg This adds metadata to sk_msg_md for BPF programs to read the sk_msg size. When the SK_MSG program is running under an application that is using sendfile the data is not copied into sk_msg buffers by default. Rather the BPF program uses sk_msg_pull_data to read the bytes in. This avoids doing the costly memcopy instructions when they are not in fact needed. However, if we don't know the size of the sk_msg we have to guess if needed bytes are available by doing a pull request which may fail. By including the size of the sk_msg BPF programs can check the size before issuing sk_msg_pull_data requests. Additionally, the same applies for sendmsg calls when the application provides multiple iovs. Here the BPF program needs to pull in data to update data pointers but its not clear where the data ends without a size parameter. In many cases "guessing" is not easy to do and results in multiple calls to pull and without bounded loops everything gets fairly tricky. Clean this up by including a u32 size field. Note, all writes into sk_msg_md are rejected already from sk_msg_is_valid_access so nothing additional is needed there. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-19 00:27:23 +01:00
Jiong Wang	0bae2d4d62	bpf: correct slot_type marking logic to allow more stack slot sharing Verifier is supposed to support sharing stack slot allocated to ptr with SCALAR_VALUE for privileged program. However this doesn't happen for some cases. The reason is verifier is not clearing slot_type STACK_SPILL for all bytes, it only clears part of them, while verifier is using: slot_type[0] == STACK_SPILL as a convention to check one slot is ptr type. So, the consequence of partial clearing slot_type is verifier could treat a partially overridden ptr slot, which should now be a SCALAR_VALUE slot, still as ptr slot, and rejects some valid programs. Before this patch, test_xdp_noinline.o under bpf selftests, bpf_lxc.o and bpf_netdev.o under Cilium bpf repo, when built with -mattr=+alu32 are rejected due to this issue. After this patch, they all accepted. There is no processed insn number change before and after this patch on Cilium bpf programs. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-12-18 14:45:01 -08:00
Matt Mullins	a38d1107f9	bpf: support raw tracepoints in modules Distributions build drivers as modules, including network and filesystem drivers which export numerous tracepoints. This enables bpf(BPF_RAW_TRACEPOINT_OPEN) to attach to those tracepoints. Signed-off-by: Matt Mullins <mmullins@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-12-18 14:08:12 -08:00
Daniel Borkmann	a137401d85	Merge branch 'bpf-bpftool-mount-tracefs' Quentin Monnet says: ==================== This series focus on mounting (or not mounting) tracefs with bpftool. First patch makes bpftool attempt to mount tracefs if tracefs is not found when running "bpftool prog tracelog". Second patch adds an option to bpftool to prevent it from attempting to mount any file system (tracefs or bpffs), in case this behaviour is undesirable for some users. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 14:47:19 +01:00
Quentin Monnet	33221307c3	tools: bpftool: add an option to prevent auto-mount of bpffs, tracefs In order to make life easier for users, bpftool automatically attempts to mount the BPF virtual file system, if it is not mounted already, before trying to pin objects in it. Similarly, it attempts to mount tracefs if necessary before trying to dump the trace pipe to the console. While mounting file systems on-the-fly can improve user experience, some administrators might prefer to avoid that. Let's add an option to block these mount attempts. Note that it does not prevent automatic mounting of tracefs by debugfs for the "bpftool prog tracelog" command. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 14:47:17 +01:00
Quentin Monnet	be3245e22d	tools: bpftool: attempt to mount tracefs if required for tracelog cmd As a follow-up to commit `30da46b5dc` ("tools: bpftool: add a command to dump the trace pipe"), attempt to mount the tracefs virtual file system if it is not detected on the system before trying to dump content of the tracing pipe on an invocation of "bpftool prog tracelog". Usually, tracefs in automatically mounted by debugfs when the user tries to access it (e.g. "ls /sys/kernel/debug/tracing" mounts the tracefs). So if we failed to find it, it is probably that debugfs is not here either. Therefore, we just attempt a single mount, at a location that does not involve debugfs: /sys/kernel/tracing. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 14:47:17 +01:00
Yonghong Song	0d7410ea6e	tools/bpf: check precise {func, line, jited_line}_info_rec_size in test_btf Current btf func_info, line_info and jited_line are designed to be extensible. The record sizes for {func,line}_info are passed to kernel, and the record sizes for {func,line,jited_line}_info are returned to userspace during bpf_prog_info query. In bpf selftests test_btf.c, when testing whether kernel returns a legitimate {func,line, jited_line)_info rec_size, the test only compares to the minimum allowed size. If the returned rec_size is smaller than the minimum allowed size, it is considered incorrect. The minimum allowed size for these three info sizes are equal to current value of sizeof(struct bpf_func_info), sizeof(struct bpf_line_info) and sizeof(__u64). The original thinking was that in the future when rec_size is increased in kernel, the same test should run correctly. But this sacrificed the precision of testing under the very kernel the test is shipped with, and bpf selftest is typically run with the same repo kernel. So this patch changed the testing of rec_size such that the kernel returned value should be equal to the size defined by tools uapi header bpf.h which syncs with kernel uapi header. Martin discovered a bug in one of rec_size comparisons. Instead of comparing to minimum func_info rec_size 8, it compares to 4. This patch fixed that issue as well. Fixes: `999d82cbc0` ("tools/bpf: enhance test_btf file testing to test func info") Fixes: `05687352c6` ("bpf: Refactor and bug fix in test_func_type in test_btf.c") Fixes: `4d6304c763` ("bpf: Add unit tests for bpf_line_info") Suggested-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 14:46:32 +01:00
Prashant Bhole	07a09d1b73	bpf: libbpf: fix memleak by freeing line_info This patch fixes a memory leak in libbpf by freeing up line_info member of struct bpf_program while unloading a program. Fixes: `3d65014146` ("bpf: libbpf: Add btf_line_info support to libbpf") Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 01:16:23 +01:00
Daniel Borkmann	37c7b1caea	Merge branch 'bpf-btf-type-fixes' Yonghong Song says: ==================== Commit `69b693f0ae` ("bpf: btf: Introduce BPF Type Format (BTF)") introduced BTF, a debug info format for BTF. The original design has a couple of issues though. First, the bitfield size is only encoded in int type. If the struct member bitfield type is enum, pahole ([1]) or llvm is forced to replace enum with int type. As a result, the original type information gets lost. Second, the original BTF design does not envision the possibility of BTF=>header_file conversion ([2]), hence does not encode "struct" or "union" info for a forward type. Such information is necessary to convert BTF to a header file. This patch set fixed the issue by introducing kind_flag, using one bit in type->info. When kind_flag, the struct/union btf_member->offset will encode both bitfield_size and bit_offset, covering both int and enum base types. The kind_flag is also used to indicate whether the forward type is a union (when set) or a struct. Patch #1 refactors function btf_int_bits_seq_show() so Patch #2 can reuse part of the function. Patch #2 implemented kind_flag support for struct/union/fwd types. Patch #3 added kind_flag support for cgroup local storage map pretty print. Patch #4 syncs kernel uapi btf.h to tools directory. Patch #5 added unit tests for kind_flag. Patch #6 added tests for kernel bpffs based pretty print with kind_flag. Patch #7 refactors function btf_dumper_int_bits() so Patch #8 can reuse part of the function. Patch #8 added bpftool support of pretty print with kind_flag set. [1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=b18354f64cc215368c3bc0df4a7e5341c55c378c [2] https://lwn.net/SubscriberLink/773198/fe3074838f5c3f26/ Change logs: v2 -> v3: . Relocated comments about bitfield_size/bit_offset interpretation of the "offset" field right before the "offset" struct member. . Added missing byte alignment checking for non-bitfield enum member of a struct with kind_flag set. . Added two test cases in unit tests for struct type, kind_flag set, non-bitfield int/enum member, not-byte aligned bit offsets. . Added comments to help understand there is no overflow for total_bits_offset in bpftool function btf_dumper_int_bits(). . Added explanation of typedef type dumping fix in Patch #8 commit message. v1 -> v2: . If kind_flag is set for a structure, ensure an int member, whether it is a bitfield or not, is a regular int type. . Added support so cgroup local storage map pretty print works with kind_flag. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 01:12:01 +01:00
Yonghong Song	8772c8bc09	tools: bpftool: support pretty print with kind_flag set The following example shows map pretty print with structures which include bitfield members. enum A { A1, A2, A3, A4, A5 }; typedef enum A ___A; struct tmp_t { char a1:4; int a2:4; int :4; __u32 a3:4; int b; ___A b1:4; enum A b2:4; }; struct bpf_map_def SEC("maps") tmpmap = { .type = BPF_MAP_TYPE_ARRAY, .key_size = sizeof(__u32), .value_size = sizeof(struct tmp_t), .max_entries = 1, }; BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t); and the following map update in the bpf program: key = 0; struct tmp_t t = {}; t.a1 = 2; t.a2 = 4; t.a3 = 6; t.b = 7; t.b1 = 8; t.b2 = 10; bpf_map_update_elem(&tmpmap, &key, &t, 0); With this patch, I am able to print out the map values correctly with this patch: bpftool map dump id 187 [{ "key": 0, "value": { "a1": 0x2, "a2": 0x4, "a3": 0x6, "b": 7, "b1": 0x8, "b2": 0xa } } ] Previously, if a function prototype argument has a typedef type, the prototype is not printed since function __btf_dumper_type_only() bailed out with error if the type is a typedef. This commit corrected this behavior by printing out typedef properly. The following example shows forward type and typedef type can be properly printed in function prototype with modified test_btf_haskv.c. struct t; union u; __attribute__((noinline)) static int test_long_fname_1(struct dummy_tracepoint_args arg, struct t p1, union u p2, __u32 unused) ... int _dummy_tracepoint(struct dummy_tracepoint_args arg) { return test_long_fname_1(arg, 0, 0, 0); } $ bpftool p d xlated id 24 ... int test_long_fname_1(struct dummy_tracepoint_args * arg, struct t * p1, union u * p2, __u32 unused) ... Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 01:12:00 +01:00
Yonghong Song	9f95e37e31	tools: bpftool: refactor btf_dumper_int_bits() The core dump funcitonality in btf_dumper_int_bits() is refactored into a separate function btf_dumper_bitfield() which will be used by the next patch. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 01:12:00 +01:00
Yonghong Song	d0ebce687e	tools/bpf: test kernel bpffs map pretty print with struct kind_flag The new tests are added to test bpffs map pretty print in kernel with kind_flag for structure type. $ test_btf -p ...... BTF pretty print array(#1)......OK BTF pretty print array(#2)......OK PASS:8 SKIP:0 FAIL:0 Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 01:12:00 +01:00
Yonghong Song	cd9de5d3d6	tools/bpf: add test_btf unit tests for kind_flag This patch added unit tests for different types handling type->info.kind_flag. The following new tests are added: $ test_btf ... BTF raw test[82] (invalid int kind_flag): OK BTF raw test[83] (invalid ptr kind_flag): OK BTF raw test[84] (invalid array kind_flag): OK BTF raw test[85] (invalid enum kind_flag): OK BTF raw test[86] (valid fwd kind_flag): OK BTF raw test[87] (invalid typedef kind_flag): OK BTF raw test[88] (invalid volatile kind_flag): OK BTF raw test[89] (invalid const kind_flag): OK BTF raw test[90] (invalid restrict kind_flag): OK BTF raw test[91] (invalid func kind_flag): OK BTF raw test[92] (invalid func_proto kind_flag): OK BTF raw test[93] (valid struct kind_flag, bitfield_size = 0): OK BTF raw test[94] (valid struct kind_flag, int member, bitfield_size != 0): OK BTF raw test[95] (valid union kind_flag, int member, bitfield_size != 0): OK BTF raw test[96] (valid struct kind_flag, enum member, bitfield_size != 0): OK BTF raw test[97] (valid union kind_flag, enum member, bitfield_size != 0): OK BTF raw test[98] (valid struct kind_flag, typedef member, bitfield_size != 0): OK BTF raw test[99] (valid union kind_flag, typedef member, bitfield_size != 0): OK BTF raw test[100] (invalid struct type, bitfield_size greater than struct size): OK BTF raw test[101] (invalid struct type, kind_flag bitfield base_type int not regular): OK BTF raw test[102] (invalid struct type, kind_flag base_type int not regular): OK BTF raw test[103] (invalid union type, bitfield_size greater than struct size): OK ... PASS:122 SKIP:0 FAIL:0 The second parameter name of macro BTF_INFO_ENC(kind, root, vlen) in selftests test_btf.c is also renamed from "root" to "kind_flag". Note that before this patch "root" is not used and always 0. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 01:11:59 +01:00
Yonghong Song	128b343dbe	tools/bpf: sync btf.h header from kernel to tools Sync include/uapi/linux/btf.h to tools/include/uapi/linux/btf.h. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 01:11:59 +01:00
Yonghong Song	ffa0c1cf59	bpf: enable cgroup local storage map pretty print with kind_flag Commit 970289fc0a83 ("bpf: add bpffs pretty print for cgroup local storage maps") added bpffs pretty print for cgroup local storage maps. The commit worked for struct without kind_flag set. This patch refactored and made pretty print also work with kind_flag set for the struct. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 01:11:59 +01:00
Yonghong Song	9d5f9f701b	bpf: btf: fix struct/union/fwd types with kind_flag This patch fixed two issues with BTF. One is related to struct/union bitfield encoding and the other is related to forward type. Issue #1 and solution: ====================== Current btf encoding of bitfield follows what pahole generates. For each bitfield, pahole will duplicate the type chain and put the bitfield size at the final int or enum type. Since the BTF enum type cannot encode bit size, pahole workarounds the issue by generating an int type whenever the enum bit size is not 32. For example, -bash-4.4$ cat t.c typedef int ___int; enum A { A1, A2, A3 }; struct t { int a[5]; ___int b:4; volatile enum A c:4; } g; -bash-4.4$ gcc -c -O2 -g t.c The current kernel supports the following BTF encoding: $ pahole -JV t.o [1] TYPEDEF ___int type_id=2 [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED [3] ENUM A size=4 vlen=3 A1 val=0 A2 val=1 A3 val=2 [4] STRUCT t size=24 vlen=3 a type_id=5 bits_offset=0 b type_id=9 bits_offset=160 c type_id=11 bits_offset=164 [5] ARRAY (anon) type_id=2 index_type_id=2 nr_elems=5 [6] INT sizetype size=8 bit_offset=0 nr_bits=64 encoding=(none) [7] VOLATILE (anon) type_id=3 [8] INT int size=1 bit_offset=0 nr_bits=4 encoding=(none) [9] TYPEDEF ___int type_id=8 [10] INT (anon) size=1 bit_offset=0 nr_bits=4 encoding=SIGNED [11] VOLATILE (anon) type_id=10 Two issues are in the above: . by changing enum type to int, we lost the original type information and this will not be ideal later when we try to convert BTF to a header file. . the type duplication for bitfields will cause BTF bloat. Duplicated types cannot be deduplicated later if the bitfield size is different. To fix this issue, this patch implemented a compatible change for BTF struct type encoding: . the bit 31 of struct_type->info, previously reserved, now is used to indicate whether bitfield_size is encoded in btf_member or not. . if bit 31 of struct_type->info is set, btf_member->offset will encode like: bit 0 - 23: bit offset bit 24 - 31: bitfield size if bit 31 is not set, the old behavior is preserved: bit 0 - 31: bit offset So if the struct contains a bit field, the maximum bit offset will be reduced to (2^24 - 1) instead of MAX_UINT. The maximum bitfield size will be 256 which is enough for today as maximum bitfield in compiler can be 128 where int128 type is supported. This kernel patch intends to support the new BTF encoding: $ pahole -JV t.o [1] TYPEDEF ___int type_id=2 [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED [3] ENUM A size=4 vlen=3 A1 val=0 A2 val=1 A3 val=2 [4] STRUCT t kind_flag=1 size=24 vlen=3 a type_id=5 bitfield_size=0 bits_offset=0 b type_id=1 bitfield_size=4 bits_offset=160 c type_id=7 bitfield_size=4 bits_offset=164 [5] ARRAY (anon) type_id=2 index_type_id=2 nr_elems=5 [6] INT sizetype size=8 bit_offset=0 nr_bits=64 encoding=(none) [7] VOLATILE (anon) type_id=3 Issue #2 and solution: ====================== Current forward type in BTF does not specify whether the original type is struct or union. This will not work for type pretty print and BTF-to-header-file conversion as struct/union must be specified. $ cat tt.c struct t; union u; int foo(struct t t, union u u) { return 0; } $ gcc -c -g -O2 tt.c $ pahole -JV tt.o [1] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED [2] FWD t type_id=0 [3] PTR (anon) type_id=2 [4] FWD u type_id=0 [5] PTR (anon) type_id=4 To fix this issue, similar to issue #1, type->info bit 31 is used. If the bit is set, it is union type. Otherwise, it is a struct type. $ pahole -JV tt.o [1] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED [2] FWD t kind_flag=0 type_id=0 [3] PTR (anon) kind_flag=0 type_id=2 [4] FWD u kind_flag=1 type_id=0 [5] PTR (anon) kind_flag=0 type_id=4 Pahole/LLVM change: =================== The new kind_flag functionality has been implemented in pahole and llvm: https://github.com/yonghong-song/pahole/tree/bitfield https://github.com/yonghong-song/llvm/tree/bitfield Note that pahole hasn't implemented func/func_proto kind and .BTF.ext. So to print function signature with bpftool, the llvm compiler should be used. Fixes: `69b693f0ae` ("bpf: btf: Introduce BPF Type Format (BTF)") Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 01:11:59 +01:00
Yonghong Song	f97be3ab04	bpf: btf: refactor btf_int_bits_seq_show() Refactor function btf_int_bits_seq_show() by creating function btf_bitfield_seq_show() which has no dependence on btf and btf_type. The function btf_bitfield_seq_show() will be in later patch to directly dump bitfield member values. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-18 01:11:59 +01:00
Daniel Borkmann	6c4fc209fc	bpf: remove useless version check for prog load Existing libraries and tracing frameworks work around this kernel version check by automatically deriving the kernel version from uname(3) or similar such that the user does not need to do it manually; these workarounds also make the version check useless at the same time. Moreover, most other BPF tracing types enabling bpf_probe_read()-like functionality have /not/ adapted this check, and in general these days it is well understood anyway that all the tracing programs are not stable with regards to future kernels as kernel internal data structures are subject to change from release to release. Back at last netconf we discussed [0] and agreed to remove this check from bpf_prog_load() and instead document it here in the uapi header that there is no such guarantee for stable API for these programs. [0] http://vger.kernel.org/netconf2018_files/DanielBorkmann_netconf2018.pdf Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-12-17 13:41:35 -08:00
Daniel Borkmann	034565da0f	Merge branch 'bpf-bpftool-cleanups' Quentin Monnet says: ==================== This series contains several minor fixes for bpftool source and documentation. The first patches focus on documentation: addition of an option in the page for "bpftool prog", clean up and update of the same page, and addition of an example of prog array map manipulation in "bpftool map" page. The last two fix warnings susceptible to appear when libbfd is not present (patch 4), or with additional warning flags passed to the compiler (last patch). ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-15 01:31:50 +01:00
Quentin Monnet	c101189bc9	tools: bpftool: fix -Wmissing declaration warnings Help compiler check arguments for several utility functions used to print items to the console by adding the "printf" attribute when declaring those functions. Also, declare as "static" two functions that are only used in prog.c. All of them discovered by compiling bpftool with -Wmissing-format-attribute -Wmissing-declarations. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-15 01:31:49 +01:00
Quentin Monnet	8c03ecf712	tools: bpftool: fix warning on struct bpf_prog_linfo definition The following warning appears when compiling bpftool without BFD support: main.h:198:23: warning: 'struct bpf_prog_linfo' declared inside parameter list will not be visible outside of this definition or declaration const struct bpf_prog_linfo *prog_linfo, Fix it by declaring struct bpf_prog_linfo even in the case BFD is not supported. Fixes: `b053b439b7` ("bpf: libbpf: bpftool: Print bpf_line_info during prog dump") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-15 01:31:49 +01:00
Quentin Monnet	bd0fb9d007	tools: bpftool: add a prog array map update example to documentation Add an example in map documentation to show how to use bpftool in order to update the references to programs hold by prog array maps. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-15 01:31:49 +01:00
Quentin Monnet	32870ba407	tools: bpftool: fix examples in documentation for bpftool prog Bring various fixes to the manual page for "bpftool prog" set of commands: - Fix typos ("dum" -> "dump") - Harmonise indentation and format for command output - Update date format for program load time - Add instruction numbers on program dumps - Fix JSON format for the example program listing Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-15 01:31:48 +01:00
Quentin Monnet	bc6cd66460	tools: bpftool: add doc for -m option to bpftool-prog.rst The --mapcompat\|-m option has been documented on the main bpftool.rst page, and on the interactive help. As this option is useful for loading programs with maps with the "bpftool prog load" command, it should also appear in the related bpftool-prog.rst documentation page. Let's add it. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-15 01:31:48 +01:00

1 2 3 4 5 ...

799654 Коммитов Все ветки Поиск

799654 Коммитов

Все ветки