WSL2-Linux-Kernel/mm
Carlos Llamas 43bed0a13a mm/mmap: undo ->mmap() when arch_validate_flags() fails
commit deb0f65628 upstream.

Commit c462ac288f ("mm: Introduce arch_validate_flags()") added a late
check in mmap_region() to let architectures validate vm_flags.  The check
needs to happen after calling ->mmap() as the flags can potentially be
modified during this callback.

If arch_validate_flags() check fails we unmap and free the vma.  However,
the error path fails to undo the ->mmap() call that previously succeeded
and depending on the specific ->mmap() implementation this translates to
reference increments, memory allocations and other operations what will
not be cleaned up.

There are several places (mainly device drivers) where this is an issue.
However, one specific example is bpf_map_mmap() which keeps count of the
mappings in map->writecnt.  The count is incremented on ->mmap() and then
decremented on vm_ops->close().  When arch_validate_flags() fails this
count is off since bpf_map_mmap_close() is never called.

One can reproduce this issue in arm64 devices with MTE support.  Here the
vm_flags are checked to only allow VM_MTE if VM_MTE_ALLOWED has been set
previously.  From userspace then is enough to pass the PROT_MTE flag to
mmap() syscall to trigger the arch_validate_flags() failure.

The following program reproduces this issue:

  #include <stdio.h>
  #include <unistd.h>
  #include <linux/unistd.h>
  #include <linux/bpf.h>
  #include <sys/mman.h>

  int main(void)
  {
	union bpf_attr attr = {
		.map_type = BPF_MAP_TYPE_ARRAY,
		.key_size = sizeof(int),
		.value_size = sizeof(long long),
		.max_entries = 256,
		.map_flags = BPF_F_MMAPABLE,
	};
	int fd;

	fd = syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
	mmap(NULL, 4096, PROT_WRITE | PROT_MTE, MAP_SHARED, fd, 0);

	return 0;
  }

By manually adding some log statements to the vm_ops callbacks we can
confirm that when passing PROT_MTE to mmap() the map->writecnt is off upon
->release():

With PROT_MTE flag:
  root@debian:~# ./bpf-test
  [  111.263874] bpf_map_write_active_inc: map=9 writecnt=1
  [  111.288763] bpf_map_release: map=9 writecnt=1

Without PROT_MTE flag:
  root@debian:~# ./bpf-test
  [  157.816912] bpf_map_write_active_inc: map=10 writecnt=1
  [  157.830442] bpf_map_write_active_dec: map=10 writecnt=0
  [  157.832396] bpf_map_release: map=10 writecnt=0

This patch fixes the above issue by calling vm_ops->close() when the
arch_validate_flags() check fails, after this we can proceed to unmap and
free the vma on the error path.

Link: https://lkml.kernel.org/r/20220930003844.1210987-1-cmllamas@google.com
Fixes: c462ac288f ("mm: Introduce arch_validate_flags()")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Liam Howlett <liam.howlett@oracle.com>
Cc: Christian Brauner (Microsoft) <brauner@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>	[5.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-10-26 12:34:24 +02:00
..
damon mm/damon: validate if the pmd entry is present before accessing 2022-10-26 12:34:24 +02:00
kasan kasan: prevent cpu_quarantine corruption when CPU offline and cache shrink occur at same time 2022-05-09 09:14:41 +02:00
kfence mm/kfence: reset PG_slab and memcg_data before freeing __kfence_pool 2022-05-25 09:57:23 +02:00
Kconfig kmap_local: don't assume kmap PTEs are linear arrays in memory 2021-11-25 09:48:43 +01:00
Kconfig.debug
Makefile mm: introduce Data Access MONitor (DAMON) 2021-09-08 11:50:24 -07:00
backing-dev.c writeback: avoid use-after-free after removing device 2022-08-31 17:16:47 +02:00
balloon_compaction.c
bootmem_info.c bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem 2022-08-31 17:16:48 +02:00
cleancache.c
cma.c Revert "mm/cma.c: remove redundant cma_mutex lock" 2022-06-09 10:23:27 +02:00
cma.h
cma_debug.c
cma_sysfs.c
compaction.c mm, compaction: fast_find_migrateblock() should return pfn in the target zone 2022-06-09 10:23:21 +02:00
debug.c mm/debug: sync up latest migrate_reason to migrate_reason_names 2021-09-24 16:13:35 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: remove pte entry from the page table 2022-02-08 18:34:05 +01:00
dmapool.c
early_ioremap.c mm/early_ioremap.c: remove redundant early_ioremap_shutdown() 2021-09-08 11:50:24 -07:00
fadvise.c
failslab.c
filemap.c mm/filemap: fix UAF in find_lock_entries 2022-07-12 16:34:47 +02:00
frontswap.c
gup.c mm: gup: fix the fast GUP race against THP collapse 2022-10-12 09:53:26 +02:00
gup_test.c
gup_test.h
highmem.c highmem: fix checks in __kmap_local_sched_{in,out} 2022-04-13 20:59:21 +02:00
hmm.c mm/hmm: fault non-owner device private entries 2022-08-03 12:03:54 +02:00
huge_memory.c mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all() 2022-10-12 09:53:28 +02:00
hugetlb.c mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte 2022-09-05 10:30:04 +02:00
hugetlb_cgroup.c
hugetlb_vmemmap.c
hugetlb_vmemmap.h
hwpoison-inject.c mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler 2022-07-12 16:35:05 +02:00
init-mm.c
internal.h mm/numa: automatically generate node migration order 2021-09-03 09:58:16 -07:00
interval_tree.c
io-mapping.c
ioremap.c mm: move ioremap_page_range to vmalloc.c 2021-09-08 11:50:24 -07:00
khugepaged.c mm: gup: fix the fast GUP race against THP collapse 2022-10-12 09:53:26 +02:00
kmemleak.c Revert "mm: kmemleak: take a full lowmem check in kmemleak_*_phys()" 2022-09-15 11:30:00 +02:00
ksm.c mm/ksm: remove old GCC 4.9+ check 2021-09-13 10:18:28 -07:00
list_lru.c
maccess.c ARM: 9115/1: mm/maccess: fix unaligned copy_{from,to}_kernel_nofault 2021-08-20 11:39:25 +01:00
madvise.c mm: fix madivse_pageout mishandling on non-LRU page 2022-10-05 10:39:39 +02:00
mapping_dirty_helpers.c
memblock.c memblock: use kfree() to release kmalloced memblock regions 2022-03-02 11:48:10 +01:00
memcontrol.c memcg: sync flush only if periodic flush is delayed 2022-04-27 14:38:57 +02:00
memfd.c memfd: fix F_SEAL_WRITE after shmem huge page allocated 2022-03-08 19:12:48 +01:00
memory-failure.c mm,hwpoison: check mm when killing accessing process 2022-10-05 10:39:39 +02:00
memory.c mm: fix page leak with multiple threads mapping the same page 2022-08-03 12:03:42 +02:00
memory_hotplug.c Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
mempolicy.c mm/mempolicy: fix get_nodes out of bound access 2022-08-17 14:23:47 +02:00
mempool.c
memremap.c mm/memremap: fix memunmap_pages() race with get_dev_pagemap() 2022-08-17 14:23:44 +02:00
memtest.c
migrate.c mm/migrate_device.c: flush TLB while holding PTL 2022-10-05 10:39:39 +02:00
mincore.c
mlock.c mm/mlock: fix potential imbalanced rlimit ucounts adjustment 2022-05-15 20:18:53 +02:00
mm_init.c
mmap.c mm/mmap: undo ->mmap() when arch_validate_flags() fails 2022-10-26 12:34:24 +02:00
mmap_lock.c mm: mmap_lock: fix disabling preemption directly 2021-07-23 17:43:28 -07:00
mmu_gather.c
mmu_notifier.c mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove() 2022-04-27 14:38:58 +02:00
mmzone.c
mprotect.c mm: don't try to NUMA-migrate COW pages that have other uses 2022-02-23 12:03:03 +01:00
mremap.c mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0) 2022-04-13 20:59:22 +02:00
msync.c
nommu.c Merge tag 'denywrite-for-5.15' of git://github.com/davidhildenbrand/linux 2021-09-04 11:35:47 -07:00
oom_kill.c oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup 2022-04-27 14:38:58 +02:00
page-writeback.c writeback: avoid use-after-free after removing device 2022-08-31 17:16:47 +02:00
page_alloc.c mm: prevent page_frag_alloc() from corrupting the memory 2022-10-05 10:39:39 +02:00
page_counter.c
page_ext.c mm/migrate: add CPU hotplug to demotion #ifdef 2021-10-18 20:22:02 -10:00
page_idle.c mm/idle_page_tracking: make PG_idle reusable 2021-09-08 11:50:24 -07:00
page_io.c mm: fix unexpected zeroed page mapping with zram swap 2022-04-20 09:34:18 +02:00
page_isolation.c Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
page_owner.c mm: remove pfn_valid_within() and CONFIG_HOLES_IN_ZONE 2021-09-08 11:50:22 -07:00
page_poison.c
page_reporting.c
page_reporting.h
page_vma_mapped.c
pagewalk.c mm: pagewalk: Fix race between unmap and page walker 2022-09-08 12:28:05 +02:00
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
pgalloc-track.h
pgtable-generic.c
process_vm_access.c
ptdump.c mm: pagewalk: Fix race between unmap and page walker 2022-09-08 12:28:05 +02:00
readahead.c mm: Protect operations adding pages to page cache with invalidate_lock 2021-07-13 13:14:27 +02:00
rmap.c mm/rmap: Fix anon_vma->degree ambiguity leading to double-reuse 2022-09-05 10:30:07 +02:00
rodata_test.c
secretmem.c mm: fix dereferencing possible ERR_PTR 2022-10-05 10:39:39 +02:00
shmem.c mm: shmem: fix missing cache flush in shmem_mfill_atomic_pte() 2022-05-15 20:18:53 +02:00
shuffle.c
shuffle.h
slab.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-27 14:38:51 +02:00
slab.h mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-27 14:38:51 +02:00
slab_common.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-27 14:38:51 +02:00
slob.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-27 14:38:51 +02:00
slub.c mm: slub: fix flush_cpu_slab()/__free_slab() invocations in task context. 2022-09-28 11:11:44 +02:00
sparse-vmemmap.c
sparse.c mm: introduce memmap_alloc() to unify memory map allocation 2021-09-03 09:58:15 -07:00
swap.c mm: fs: invalidate bh_lrus for only cold path 2021-09-24 16:13:35 -07:00
swap_cgroup.c
swap_slots.c mm: Replace deprecated CPU-hotplug functions. 2021-08-28 01:46:17 +02:00
swap_state.c mm: swap: get rid of livelock in swapin readahead 2022-03-23 09:16:41 +01:00
swapfile.c mm, memcg: inline swap-related functions to improve disabled memcg config 2021-09-03 09:58:12 -07:00
truncate.c Merge branch 'akpm' (patches from Andrew) 2021-09-03 10:08:28 -07:00
usercopy.c mm/usercopy: return 1 from hardened_usercopy __setup() handler 2022-04-08 14:24:14 +02:00
userfaultfd.c mm: userfaultfd: fix UFFDIO_CONTINUE on fallocated shmem pages 2022-07-21 21:24:11 +02:00
util.c mm: vmalloc: introduce array allocation functions 2022-07-12 16:35:01 +02:00
vmacache.c
vmalloc.c mm: defer kmemleak object creation of module_alloc() 2022-03-08 19:12:38 +01:00
vmpressure.c mm/vmpressure: replace vmpressure_to_css() with vmpressure_to_memcg() 2021-09-03 09:58:17 -07:00
vmscan.c mm,vmscan: fix divide by zero in get_scan_count 2021-09-08 18:45:53 -07:00
vmstat.c mm/vmstat: protect per cpu variables with preempt disable on RT 2021-09-08 15:32:34 -07:00
workingset.c memcg: sync flush only if periodic flush is delayed 2022-04-27 14:38:57 +02:00
z3fold.c
zbud.c
zpool.c
zsmalloc.c zsmalloc: fix races between asynchronous zspage free and page migration 2022-06-06 08:43:39 +02:00
zswap.c