From 04b38d012556199ba4c31195940160e0c44c64f0 Mon Sep 17 00:00:00 2001 From: Paul Cercueil Date: Mon, 11 Jan 2021 17:28:39 +0000 Subject: [PATCH 1/2] seccomp: Add missing return in non-void function We don't actually care about the value, since the kernel will panic before that; but a value should nonetheless be returned, otherwise the compiler will complain. Fixes: 8112c4f140fa ("seccomp: remove 2-phase API") Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Paul Cercueil Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20210111172839.640914-1-paul@crapouillou.net --- kernel/seccomp.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/seccomp.c b/kernel/seccomp.c index 952dc1c90229..63b40d12896b 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -1284,6 +1284,8 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd, const bool recheck_after_trace) { BUG(); + + return -1; } #endif From a381b70a1cf88e4a2d54f24d59abdcad0ff2dfe6 Mon Sep 17 00:00:00 2001 From: wanghongzhe Date: Fri, 5 Feb 2021 11:34:09 +0800 Subject: [PATCH 2/2] seccomp: Improve performace by optimizing rmb() According to Kees's suggest, we started with the patch that just replaces rmb() with smp_rmb() and did a performance test with UnixBench. The results showed the overhead about 2.53% in rmb() test compared to the smp_rmb() one, in a x86-64 kernel with CONFIG_SMP enabled running inside a qemu-kvm vm. The test is a "syscall" testcase in UnixBench, which executes 5 syscalls in a loop during a certain timeout (100 second in our test) and counts the total number of executions of this 5-syscall sequence. We set a seccomp filter with all allow rule for all used syscalls in this test (which will go bitmap path) to make sure the rmb() will be executed. The details for the test: with rmb(): /txm # ./syscall_allow_min 100 COUNT|35861159|1|lps /txm # ./syscall_allow_min 100 COUNT|35545501|1|lps /txm # ./syscall_allow_min 100 COUNT|35664495|1|lps with smp_rmb(): /txm # ./syscall_allow_min 100 COUNT|36552771|1|lps /txm # ./syscall_allow_min 100 COUNT|36491247|1|lps /txm # ./syscall_allow_min 100 COUNT|36504746|1|lps For a x86-64 kernel with CONFIG_SMP enabled, the smp_rmb() is just a compiler barrier() which have no impact in runtime, while rmb() is a lfence which will prevent all memory access operations (not just load according the recently claim by Intel) behind itself. We can also figure it out in disassembly: with rmb(): 0000000000001430 <__seccomp_filter>: 1430: 41 57 push %r15 1432: 41 56 push %r14 1434: 41 55 push %r13 1436: 41 54 push %r12 1438: 55 push %rbp 1439: 53 push %rbx 143a: 48 81 ec 90 00 00 00 sub $0x90,%rsp 1441: 89 7c 24 10 mov %edi,0x10(%rsp) 1445: 89 54 24 14 mov %edx,0x14(%rsp) 1449: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax 1450: 00 00 1452: 48 89 84 24 88 00 00 mov %rax,0x88(%rsp) 1459: 00 145a: 31 c0 xor %eax,%eax * 145c: 0f ae e8 lfence 145f: 48 85 f6 test %rsi,%rsi 1462: 49 89 f4 mov %rsi,%r12 1465: 0f 84 42 03 00 00 je 17ad <__seccomp_filter+0x37d> 146b: 65 48 8b 04 25 00 00 mov %gs:0x0,%rax 1472: 00 00 1474: 48 8b 98 80 07 00 00 mov 0x780(%rax),%rbx 147b: 48 85 db test %rbx,%rbx with smp_rmb(); 0000000000001430 <__seccomp_filter>: 1430: 41 57 push %r15 1432: 41 56 push %r14 1434: 41 55 push %r13 1436: 41 54 push %r12 1438: 55 push %rbp 1439: 53 push %rbx 143a: 48 81 ec 90 00 00 00 sub $0x90,%rsp 1441: 89 7c 24 10 mov %edi,0x10(%rsp) 1445: 89 54 24 14 mov %edx,0x14(%rsp) 1449: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax 1450: 00 00 1452: 48 89 84 24 88 00 00 mov %rax,0x88(%rsp) 1459: 00 145a: 31 c0 xor %eax,%eax 145c: 48 85 f6 test %rsi,%rsi 145f: 49 89 f4 mov %rsi,%r12 1462: 0f 84 42 03 00 00 je 17aa <__seccomp_filter+0x37a> 1468: 65 48 8b 04 25 00 00 mov %gs:0x0,%rax 146f: 00 00 1471: 48 8b 98 80 07 00 00 mov 0x780(%rax),%rbx 1478: 48 85 db test %rbx,%rbx Signed-off-by: wanghongzhe Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/1612496049-32507-1-git-send-email-wanghongzhe@huawei.com --- kernel/seccomp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/seccomp.c b/kernel/seccomp.c index 63b40d12896b..1d60fc2c9987 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -1164,7 +1164,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd, * Make sure that any changes to mode from another thread have * been seen after SYSCALL_WORK_SECCOMP was seen. */ - rmb(); + smp_rmb(); if (!sd) { populate_seccomp_data(&sd_local);