From 5f34b1eb2f8d4bba7d6352e767ef84bee9096d97 Mon Sep 17 00:00:00 2001 From: Mark Rutland Date: Mon, 12 Jul 2021 10:00:43 +0100 Subject: [PATCH 1/8] arm64: fix strlen() with CONFIG_KASAN_HW_TAGS When the kernel is built with CONFIG_KASAN_HW_TAGS and the CPU supports MTE, memory accesses are checked at 16-byte granularity, and out-of-bounds accesses can result in tag check faults. Our current implementation of strlen() makes unaligned 16-byte accesses (within a naturally aligned 4096-byte window), and can trigger tag check faults. This can be seen at boot time, e.g. | BUG: KASAN: invalid-access in __pi_strlen+0x14/0x150 | Read at addr f4ff0000c0028300 by task swapper/0/0 | Pointer tag: [f4], memory tag: [fe] | | CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.13.0-09550-g03c2813535a2-dirty #20 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x1b0 | show_stack+0x1c/0x30 | dump_stack_lvl+0x68/0x84 | print_address_description+0x7c/0x2b4 | kasan_report+0x138/0x38c | __do_kernel_fault+0x190/0x1c4 | do_tag_check_fault+0x78/0x90 | do_mem_abort+0x44/0xb4 | el1_abort+0x40/0x60 | el1h_64_sync_handler+0xb0/0xd0 | el1h_64_sync+0x78/0x7c | __pi_strlen+0x14/0x150 | __register_sysctl_table+0x7c4/0x890 | register_leaf_sysctl_tables+0x1a4/0x210 | register_leaf_sysctl_tables+0xc8/0x210 | __register_sysctl_paths+0x22c/0x290 | register_sysctl_table+0x2c/0x40 | sysctl_init+0x20/0x30 | proc_sys_init+0x3c/0x48 | proc_root_init+0x80/0x9c | start_kernel+0x640/0x69c | __primary_switched+0xc0/0xc8 To fix this, we can reduce the (strlen-internal) MIN_PAGE_SIZE to 16 bytes when CONFIG_KASAN_HW_TAGS is selected. This will cause strlen() to align the base pointer downwards to a 16-byte boundary, and to discard the additional prefix bytes without counting them. All subsequent accesses will be 16-byte aligned 16-byte LDPs. While the comments say the body of the loop will access 32 bytes, this is performed as two 16-byte acceses, with the second made only if the first did not encounter a NUL byte, so the body of the loop will not over-read across a 16-byte boundary. No other string routines are affected. The other str*() routines will not make any access which straddles a 16-byte boundary, and the mem*() routines will only make acceses which straddle a 16-byte boundary when which is entirely within the bounds of the relevant base and size arguments. Fixes: 325a1de81287 ("arm64: Import updated version of Cortex Strings' strlen") Signed-off-by: Mark Rutland Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Catalin Marinas Cc: Dmitry Vyukov Cc: Marco Elver Cc: Robin Murphy Cc: Will Deacon Reviewed-by: Catalin Marinas Reviewed-by: Robin Murphy Link: https://lore.kernel.org/r/20210712090043.20847-1-mark.rutland@arm.com Signed-off-by: Will Deacon --- arch/arm64/lib/strlen.S | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/arch/arm64/lib/strlen.S b/arch/arm64/lib/strlen.S index 35fbdb7d6e1a..1648790e91b3 100644 --- a/arch/arm64/lib/strlen.S +++ b/arch/arm64/lib/strlen.S @@ -8,6 +8,7 @@ #include #include +#include /* Assumptions: * @@ -42,7 +43,16 @@ #define REP8_7f 0x7f7f7f7f7f7f7f7f #define REP8_80 0x8080808080808080 +/* + * When KASAN_HW_TAGS is in use, memory is checked at MTE_GRANULE_SIZE + * (16-byte) granularity, and we must ensure that no access straddles this + * alignment boundary. + */ +#ifdef CONFIG_KASAN_HW_TAGS +#define MIN_PAGE_SIZE MTE_GRANULE_SIZE +#else #define MIN_PAGE_SIZE 4096 +#endif /* Since strings are short on average, we check the first 16 bytes of the string for a NUL character. In order to do an unaligned ldp From e62e074814862cffd8e60a1bdf52d6b592a03675 Mon Sep 17 00:00:00 2001 From: Carlos Bilbao Date: Thu, 8 Jul 2021 07:15:42 -0400 Subject: [PATCH 2/8] arm64: Add missing header in two files Add missing header on include/asm/smp_plat.h, as it calls function cpu_logical_map(). Also include it on kernel/cpufeature.c since it has calls to functions cpu_panic_kernel() and cpu_die_early(). Both files call functions defined on this header, make the header dependencies less fragile. Signed-off-by: Carlos Bilbao Acked-by: Mark Rutland Link: https://lore.kernel.org/r/4325940.LvFx2qVVIh@iron-maiden Signed-off-by: Will Deacon --- arch/arm64/include/asm/smp_plat.h | 1 + arch/arm64/kernel/cpufeature.c | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/arm64/include/asm/smp_plat.h b/arch/arm64/include/asm/smp_plat.h index 99ad77df8f52..97ddc6c203b7 100644 --- a/arch/arm64/include/asm/smp_plat.h +++ b/arch/arm64/include/asm/smp_plat.h @@ -10,6 +10,7 @@ #include +#include #include struct mpidr_hash { diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 125d5c9471ac..0ead8bfedf20 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -81,6 +81,7 @@ #include #include #include +#include #include #include #include From c1132702c71f4b95db9435bac5fdc912881563e0 Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Mon, 12 Jul 2021 13:10:00 +0100 Subject: [PATCH 3/8] Revert "arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES)" This reverts commit 65688d2a05deb9f0671a7e2301eadbfe7e27c9e9. Unfortunately, the original Qualcomm Kryo cores integrated into the MSM8996 SoC feature an L2 cache with 128-byte lines which sits above the Point of Coherency. Consequently, we must restore ARCH_DMA_MINALIGN to its former ugly self so that non-coherent DMA can be performed safely on devices built using this SoC. Thanks to Jeffrey Hugo for confirming this with a hardware designer. Link: https://lore.kernel.org/r/CAOCk7NqdpUZFMSXfGjw0_1NaSK5gyTLgpS9kSdZn1jmBy-QkfA@mail.gmail.com/ Reported-by: Yassine Oudjana Link: https://lore.kernel.org/r/uHgsRacR8hJ7nW-I-pIcehzg-lNIn7NJvaL7bP9tfAftFsBjsgaY2qTjG9zyBgxHkjNL1WPNrD7YVv2JVD2_Wy-a5VTbcq-1xEi8ZnwrXBo=@protonmail.com Signed-off-by: Will Deacon --- arch/arm64/include/asm/cache.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h index a9c0716e7440..a074459f8f2f 100644 --- a/arch/arm64/include/asm/cache.h +++ b/arch/arm64/include/asm/cache.h @@ -47,7 +47,7 @@ * cache before the transfer is done, causing old data to be seen by * the CPU. */ -#define ARCH_DMA_MINALIGN L1_CACHE_BYTES +#define ARCH_DMA_MINALIGN (128) #ifdef CONFIG_KASAN_SW_TAGS #define ARCH_SLAB_MINALIGN (1ULL << KASAN_SHADOW_SCALE_SHIFT) From 8cdd23c23c3d481a43b4aa03dcb5738812831115 Mon Sep 17 00:00:00 2001 From: Nathan Chancellor Date: Mon, 12 Jul 2021 14:46:37 -0700 Subject: [PATCH 4/8] arm64: Restrict ARM64_BTI_KERNEL to clang 12.0.0 and newer Commit 97fed779f2a6 ("arm64: bti: Provide Kconfig for kernel mode BTI") disabled CONFIG_ARM64_BTI_KERNEL when CONFIG_GCOV_KERNEL was enabled and compiling with clang because of warnings that were seen with allmodconfig because LLVM was not emitting PAC/BTI instructions for compiler generated functions: | warning: some functions compiled with BTI and some compiled without BTI | warning: not setting BTI in feature flags This dependency was fine for avoiding the warnings with allmodconfig until commit 51c2ee6d121c ("Kconfig: Introduce ARCH_WANTS_NO_INSTR and CC_HAS_NO_PROFILE_FN_ATTR"), which prevents CONFIG_GCOV_KERNEL from being enabled with clang 12.0.0 or older because those versions do not support the no_profile_instrument_function attribute. As a result, CONFIG_ARM64_BTI_KERNEL gets enabled with allmodconfig and there are more warnings like the ones above due to CONFIG_KASAN, which suffers from the same problem as CONFIG_GCOV_KERNEL. This was most likely not noticed at the time because allmodconfig + CONFIG_GCOV_KERNEL=n was not tested. defconfig + CONFIG_KASAN=y is enough to reproduce the same warnings as above. The root cause of the warnings was resolved in LLVM during the 12.0.0 release so rather than play whack-a-mole with the dependencies, just update CONFIG_ARM64_BTI_KERNEL to require clang 12.0.0, which will have all of the issues ironed out. Link: https://github.com/ClangBuiltLinux/linux/issues/1428 Link: https://github.com/ClangBuiltLinux/continuous-integration2/runs/3010034706?check_suite_focus=true Link: https://github.com/ClangBuiltLinux/continuous-integration2/runs/3010035725?check_suite_focus=true Link: https://github.com/llvm/llvm-project/commit/a88c722e687e6780dcd6a58718350dc76fcc4cc9 Signed-off-by: Nathan Chancellor Reviewed-by: Nick Desaulniers Link: https://lore.kernel.org/r/20210712214636.3134425-1-nathan@kernel.org Signed-off-by: Will Deacon --- arch/arm64/Kconfig | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index e07e7de9ac49..b5b13a932561 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1605,7 +1605,8 @@ config ARM64_BTI_KERNEL depends on CC_HAS_BRANCH_PROT_PAC_RET_BTI # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94697 depends on !CC_IS_GCC || GCC_VERSION >= 100100 - depends on !(CC_IS_CLANG && GCOV_KERNEL) + # https://github.com/llvm/llvm-project/commit/a88c722e687e6780dcd6a58718350dc76fcc4cc9 + depends on !CC_IS_CLANG || CLANG_VERSION >= 120000 depends on (!FUNCTION_GRAPH_TRACER || DYNAMIC_FTRACE_WITH_REGS) help Build the kernel with Branch Target Identification annotations From 295cf156231ca3f9e3a66bde7fab5e09c41835e0 Mon Sep 17 00:00:00 2001 From: Robin Murphy Date: Mon, 12 Jul 2021 15:27:46 +0100 Subject: [PATCH 5/8] arm64: Avoid premature usercopy failure Al reminds us that the usercopy API must only return complete failure if absolutely nothing could be copied. Currently, if userspace does something silly like giving us an unaligned pointer to Device memory, or a size which overruns MTE tag bounds, we may fail to honour that requirement when faulting on a multi-byte access even though a smaller access could have succeeded. Add a mitigation to the fixup routines to fall back to a single-byte copy if we faulted on a larger access before anything has been written to the destination, to guarantee making *some* forward progress. We needn't be too concerned about the overall performance since this should only occur when callers are doing something a bit dodgy in the first place. Particularly broken userspace might still be able to trick generic_perform_write() into an infinite loop by targeting write() at an mmap() of some read-only device register where the fault-in load succeeds but any store synchronously aborts such that copy_to_user() is genuinely unable to make progress, but, well, don't do that... CC: stable@vger.kernel.org Reported-by: Chen Huang Suggested-by: Al Viro Reviewed-by: Catalin Marinas Signed-off-by: Robin Murphy Link: https://lore.kernel.org/r/dc03d5c675731a1f24a62417dba5429ad744234e.1626098433.git.robin.murphy@arm.com Signed-off-by: Will Deacon --- arch/arm64/lib/copy_from_user.S | 13 ++++++++++--- arch/arm64/lib/copy_in_user.S | 21 ++++++++++++++------- arch/arm64/lib/copy_to_user.S | 14 +++++++++++--- 3 files changed, 35 insertions(+), 13 deletions(-) diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S index 95cd62d67371..2cf999e41d30 100644 --- a/arch/arm64/lib/copy_from_user.S +++ b/arch/arm64/lib/copy_from_user.S @@ -29,7 +29,7 @@ .endm .macro ldrh1 reg, ptr, val - user_ldst 9998f, ldtrh, \reg, \ptr, \val + user_ldst 9997f, ldtrh, \reg, \ptr, \val .endm .macro strh1 reg, ptr, val @@ -37,7 +37,7 @@ .endm .macro ldr1 reg, ptr, val - user_ldst 9998f, ldtr, \reg, \ptr, \val + user_ldst 9997f, ldtr, \reg, \ptr, \val .endm .macro str1 reg, ptr, val @@ -45,7 +45,7 @@ .endm .macro ldp1 reg1, reg2, ptr, val - user_ldp 9998f, \reg1, \reg2, \ptr, \val + user_ldp 9997f, \reg1, \reg2, \ptr, \val .endm .macro stp1 reg1, reg2, ptr, val @@ -53,8 +53,10 @@ .endm end .req x5 +srcin .req x15 SYM_FUNC_START(__arch_copy_from_user) add end, x0, x2 + mov srcin, x1 #include "copy_template.S" mov x0, #0 // Nothing to copy ret @@ -63,6 +65,11 @@ EXPORT_SYMBOL(__arch_copy_from_user) .section .fixup,"ax" .align 2 +9997: cmp dst, dstin + b.ne 9998f + // Before being absolutely sure we couldn't copy anything, try harder +USER(9998f, ldtrb tmp1w, [srcin]) + strb tmp1w, [dst], #1 9998: sub x0, end, dst // bytes not copied ret .previous diff --git a/arch/arm64/lib/copy_in_user.S b/arch/arm64/lib/copy_in_user.S index 1f61cd0df062..dbea3799c3ef 100644 --- a/arch/arm64/lib/copy_in_user.S +++ b/arch/arm64/lib/copy_in_user.S @@ -30,33 +30,34 @@ .endm .macro ldrh1 reg, ptr, val - user_ldst 9998f, ldtrh, \reg, \ptr, \val + user_ldst 9997f, ldtrh, \reg, \ptr, \val .endm .macro strh1 reg, ptr, val - user_ldst 9998f, sttrh, \reg, \ptr, \val + user_ldst 9997f, sttrh, \reg, \ptr, \val .endm .macro ldr1 reg, ptr, val - user_ldst 9998f, ldtr, \reg, \ptr, \val + user_ldst 9997f, ldtr, \reg, \ptr, \val .endm .macro str1 reg, ptr, val - user_ldst 9998f, sttr, \reg, \ptr, \val + user_ldst 9997f, sttr, \reg, \ptr, \val .endm .macro ldp1 reg1, reg2, ptr, val - user_ldp 9998f, \reg1, \reg2, \ptr, \val + user_ldp 9997f, \reg1, \reg2, \ptr, \val .endm .macro stp1 reg1, reg2, ptr, val - user_stp 9998f, \reg1, \reg2, \ptr, \val + user_stp 9997f, \reg1, \reg2, \ptr, \val .endm end .req x5 - +srcin .req x15 SYM_FUNC_START(__arch_copy_in_user) add end, x0, x2 + mov srcin, x1 #include "copy_template.S" mov x0, #0 ret @@ -65,6 +66,12 @@ EXPORT_SYMBOL(__arch_copy_in_user) .section .fixup,"ax" .align 2 +9997: cmp dst, dstin + b.ne 9998f + // Before being absolutely sure we couldn't copy anything, try harder +USER(9998f, ldtrb tmp1w, [srcin]) +USER(9998f, sttrb tmp1w, [dst]) + add dst, dst, #1 9998: sub x0, end, dst // bytes not copied ret .previous diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S index 043da90f5dd7..9f380eecf653 100644 --- a/arch/arm64/lib/copy_to_user.S +++ b/arch/arm64/lib/copy_to_user.S @@ -32,7 +32,7 @@ .endm .macro strh1 reg, ptr, val - user_ldst 9998f, sttrh, \reg, \ptr, \val + user_ldst 9997f, sttrh, \reg, \ptr, \val .endm .macro ldr1 reg, ptr, val @@ -40,7 +40,7 @@ .endm .macro str1 reg, ptr, val - user_ldst 9998f, sttr, \reg, \ptr, \val + user_ldst 9997f, sttr, \reg, \ptr, \val .endm .macro ldp1 reg1, reg2, ptr, val @@ -48,12 +48,14 @@ .endm .macro stp1 reg1, reg2, ptr, val - user_stp 9998f, \reg1, \reg2, \ptr, \val + user_stp 9997f, \reg1, \reg2, \ptr, \val .endm end .req x5 +srcin .req x15 SYM_FUNC_START(__arch_copy_to_user) add end, x0, x2 + mov srcin, x1 #include "copy_template.S" mov x0, #0 ret @@ -62,6 +64,12 @@ EXPORT_SYMBOL(__arch_copy_to_user) .section .fixup,"ax" .align 2 +9997: cmp dst, dstin + b.ne 9998f + // Before being absolutely sure we couldn't copy anything, try harder + ldrb tmp1w, [srcin] +USER(9998f, sttrb tmp1w, [dst]) + add dst, dst, #1 9998: sub x0, end, dst // bytes not copied ret .previous From 59f44069e0527523f27948da7b77599a73dab157 Mon Sep 17 00:00:00 2001 From: Mark Rutland Date: Wed, 14 Jul 2021 15:38:41 +0100 Subject: [PATCH 6/8] arm64: mte: fix restoration of GCR_EL1 from suspend Since commit: bad1e1c663e0a72f ("arm64: mte: switch GCR_EL1 in kernel entry and exit") we saved/restored the user GCR_EL1 value at exception boundaries, and update_gcr_el1_excl() is no longer used for this. However it is used to restore the kernel's GCR_EL1 value when returning from a suspend state. Thus, the comment is misleading (and an ISB is necessary). When restoring the kernel's GCR value, we need an ISB to ensure this is used by subsequent instructions. We don't necessarily get an ISB by other means (e.g. if the kernel is built without support for pointer authentication). As __cpu_setup() initialised GCR_EL1.Exclude to 0xffff, until a context synchronization event, allocation tag 0 may be used rather than the desired set of tags. This patch drops the misleading comment, adds the missing ISB, and for clarity folds update_gcr_el1_excl() into its only user. Fixes: bad1e1c663e0 ("arm64: mte: switch GCR_EL1 in kernel entry and exit") Signed-off-by: Mark Rutland Cc: Andrey Konovalov Cc: Catalin Marinas Cc: Vincenzo Frascino Cc: Will Deacon Link: https://lore.kernel.org/r/20210714143843.56537-2-mark.rutland@arm.com Signed-off-by: Will Deacon --- arch/arm64/kernel/mte.c | 15 ++------------- 1 file changed, 2 insertions(+), 13 deletions(-) diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index 69b3fde8759e..36f51b0e438a 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -193,18 +193,6 @@ void mte_check_tfsr_el1(void) } #endif -static void update_gcr_el1_excl(u64 excl) -{ - - /* - * Note that the mask controlled by the user via prctl() is an - * include while GCR_EL1 accepts an exclude mask. - * No need for ISB since this only affects EL0 currently, implicit - * with ERET. - */ - sysreg_clear_set_s(SYS_GCR_EL1, SYS_GCR_EL1_EXCL_MASK, excl); -} - static void set_gcr_el1_excl(u64 excl) { current->thread.gcr_user_excl = excl; @@ -265,7 +253,8 @@ void mte_suspend_exit(void) if (!system_supports_mte()) return; - update_gcr_el1_excl(gcr_kernel_excl); + sysreg_clear_set_s(SYS_GCR_EL1, SYS_GCR_EL1_EXCL_MASK, gcr_kernel_excl); + isb(); } long set_mte_ctrl(struct task_struct *task, unsigned long arg) From 31a7f0f6c8f392f002c937f34f372943cf8be5a9 Mon Sep 17 00:00:00 2001 From: Mark Rutland Date: Wed, 14 Jul 2021 18:28:01 +0100 Subject: [PATCH 7/8] arm64: entry: add missing noinstr We intend that all the early exception handling code is marked as `noinstr`, but we forgot this for __el0_error_handler_common(), which is called before we have completed entry from user mode. If it were instrumented, we could run into problems with RCU, lockdep, etc. Mark it as `noinstr` to prevent this. The few other functions in entry-common.c which do not have `noinstr` are called once we've completed entry, and are safe to instrument. Fixes: bb8e93a287a5 ("arm64: entry: convert SError handlers to C") Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: Marc Zyngier Cc: Joey Gouly Cc: James Morse Cc: Will Deacon Link: https://lore.kernel.org/r/20210714172801.16475-1-mark.rutland@arm.com Signed-off-by: Will Deacon --- arch/arm64/kernel/entry-common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index 12ce14a98b7c..db8b2e2d02c2 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -604,7 +604,7 @@ asmlinkage void noinstr el0t_64_fiq_handler(struct pt_regs *regs) __el0_fiq_handler_common(regs); } -static void __el0_error_handler_common(struct pt_regs *regs) +static void noinstr __el0_error_handler_common(struct pt_regs *regs) { unsigned long esr = read_sysreg(esr_el1); From e6f85cbeb23bd74b8966cf1f15bf7d01399ff625 Mon Sep 17 00:00:00 2001 From: Mark Rutland Date: Thu, 15 Jul 2021 13:30:49 +0100 Subject: [PATCH 8/8] arm64: entry: fix KCOV suppression We suppress KCOV for entry.o rather than entry-common.o. As entry.o is built from entry.S, this is pointless, and permits instrumentation of entry-common.o, which is built from entry-common.c. Fix the Makefile to suppress KCOV for entry-common.o, as we had intended to begin with. I've verified with objdump that this is working as expected. Fixes: bf6fa2c0dda7 ("arm64: entry: don't instrument entry code with KCOV") Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: James Morse Cc: Marc Zyngier Cc: Will Deacon Link: https://lore.kernel.org/r/20210715123049.9990-1-mark.rutland@arm.com Signed-off-by: Will Deacon --- arch/arm64/kernel/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index cce308586fcc..3f1490bfb938 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -17,7 +17,7 @@ CFLAGS_syscall.o += -fno-stack-protector # It's not safe to invoke KCOV when portions of the kernel environment aren't # available or are out-of-sync with HW state. Since `noinstr` doesn't always # inhibit KCOV instrumentation, disable it for the entire compilation unit. -KCOV_INSTRUMENT_entry.o := n +KCOV_INSTRUMENT_entry-common.o := n KCOV_INSTRUMENT_idle.o := n # Object file lists.