WSL2-Linux-Kernel/mm
Vlastimil Babka 1d26c11295 mm, page_alloc: do not break __GFP_THISNODE by zonelist reset
commit 7810e6781e upstream.

In __alloc_pages_slowpath() we reset zonelist and preferred_zoneref for
allocations that can ignore memory policies.  The zonelist is obtained
from current CPU's node.  This is a problem for __GFP_THISNODE
allocations that want to allocate on a different node, e.g.  because the
allocating thread has been migrated to a different CPU.

This has been observed to break SLAB in our 4.4-based kernel, because
there it relies on __GFP_THISNODE working as intended.  If a slab page
is put on wrong node's list, then further list manipulations may corrupt
the list because page_to_nid() is used to determine which node's
list_lock should be locked and thus we may take a wrong lock and race.

Current SLAB implementation seems to be immune by luck thanks to commit
511e3a0588 ("mm/slab: make cache_grow() handle the page allocated on
arbitrary node") but there may be others assuming that __GFP_THISNODE
works as promised.

We can fix it by simply removing the zonelist reset completely.  There
is actually no reason to reset it, because memory policies and cpusets
don't affect the zonelist choice in the first place.  This was different
when commit 183f6371aa ("mm: ignore mempolicies when using
ALLOC_NO_WATERMARK") introduced the code, as mempolicies provided their
own restricted zonelists.

We might consider this for 4.17 although I don't know if there's
anything currently broken.

SLAB is currently not affected, but in kernels older than 4.7 that don't
yet have 511e3a0588 ("mm/slab: make cache_grow() handle the page
allocated on arbitrary node") it is.  That's at least 4.4 LTS.  Older
ones I'll have to check.

So stable backports should be more important, but will have to be
reviewed carefully, as the code went through many changes.  BTW I think
that also the ac->preferred_zoneref reset is currently useless if we
don't also reset ac->nodemask from a mempolicy to NULL first (which we
probably should for the OOM victims etc?), but I would leave that for a
separate patch.

Link: http://lkml.kernel.org/r/20180525130853.13915-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 183f6371aa ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK")
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-26 08:06:33 +08:00
..
kasan kasan: fix memory hotplug during boot 2018-05-30 07:51:50 +02:00
Kconfig mm: don't allow deferred pages with NEED_PER_CPU_KM 2018-05-22 18:53:58 +02:00
Kconfig.debug kmemcheck: rip it out 2018-02-22 15:42:24 +01:00
Makefile kmemcheck: rip it out 2018-02-22 15:42:24 +01:00
backing-dev.c bdi: Move cgroup bdi_writeback to a dedicated low concurrency workqueue 2018-06-26 08:06:32 +08:00
balloon_compaction.c mm/migrate: new migrate mode MIGRATE_SYNC_NO_COPY 2017-09-08 18:26:46 -07:00
bootmem.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cleancache.c fs: switch ->s_uuid to uuid_t 2017-06-05 16:59:12 +02:00
cma.c mm/cma.c: take __GFP_NOWARN into account in cma_alloc() 2017-10-13 16:18:32 -07:00
cma.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cma_debug.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compaction.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
debug.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
debug_page_ref.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dmapool.c lib/vsprintf.c: remove %Z support 2017-02-27 18:43:47 -08:00
early_ioremap.c mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep 2018-02-25 11:08:03 +01:00
fadvise.c mm/fadvise: discard partial page if endbyte is also EOF 2018-04-26 11:02:15 +02:00
failslab.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
filemap.c mm/filemap.c: fix NULL pointer in page_cache_tree_insert() 2018-04-24 09:36:39 +02:00
frame_vector.c mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()' 2018-03-03 10:24:21 +01:00
frontswap.c
gup.c proc: do not access cmdline nor environ from file-backed areas 2018-05-19 10:20:27 +02:00
highmem.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
hmm.c mm/hmm: hmm_pfns_bad() was accessing wrong struct 2018-04-24 09:36:22 +02:00
huge_memory.c mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty() 2018-06-05 11:41:59 +02:00
hugetlb.c hugetlbfs: check for pgoff value overflow 2018-03-28 18:24:38 +02:00
hugetlb_cgroup.c
hwpoison-inject.c mm: hwpoison: call shake_page() unconditionally 2017-05-03 15:52:12 -07:00
init-mm.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
internal.h mm, oom: do not rely on TIF_MEMDIE for memory reserves access 2017-09-06 17:27:30 -07:00
interval_tree.c lib/interval_tree: fast overlap detection 2017-09-08 18:26:49 -07:00
khugepaged.c mm, thp: do not cause memcg oom for thp 2018-05-30 07:52:19 +02:00
kmemleak-test.c
kmemleak.c mm/kmemleak.c: wait for scan completion before disabling free 2018-05-30 07:52:21 +02:00
ksm.c mm/ksm: fix interaction with THP 2018-05-30 07:52:24 +02:00
list_lru.c mm: memcontrol: use vmalloc fallback for large kmem memcg arrays 2017-10-03 17:54:25 -07:00
maccess.c
madvise.c mm/madvise.c: fix madvise() infinite loop under special circumstances 2017-12-05 11:26:29 +01:00
memblock.c Revert "mm: page_alloc: skip over regions of invalid pfns where possible" 2018-03-28 18:24:39 +02:00
memcontrol.c mm: memcg: add __GFP_NOWARN in __memcg_schedule_kmem_cache_create() 2018-06-21 04:02:46 +09:00
memory-failure.c x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages 2018-02-22 15:42:31 +01:00
memory.c mm: hide a #warning for COMPILE_TEST 2018-02-22 15:42:27 +01:00
memory_hotplug.c mm/memory_hotplug: define find_{smallest|biggest}_section_pfn as unsigned long 2017-10-03 17:54:26 -07:00
mempolicy.c mm/mempolicy.c: avoid use uninitialized preferred_node 2018-05-30 07:52:19 +02:00
mempool.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
memtest.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
migrate.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mincore.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mlock.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mm_init.c
mmap.c mmap: relax file size limit for regular files 2018-06-11 22:49:18 +02:00
mmu_context.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
mmu_notifier.c mm/mmu_notifier: kill invalidate_page 2017-08-31 16:13:00 -07:00
mmzone.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mprotect.c mm/mprotect: add a cond_resched() inside change_pmd_range() 2018-01-10 09:31:17 +01:00
mremap.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
msync.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nobootmem.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nommu.c Merge branch 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-09-14 18:13:32 -07:00
oom_kill.c mm, oom: fix concurrent munlock and oom reaper unmap, v3 2018-05-16 10:10:27 +02:00
page-writeback.c writeback: safer lock nesting 2018-04-24 09:36:39 +02:00
page_alloc.c mm, page_alloc: do not break __GFP_THISNODE by zonelist reset 2018-06-26 08:06:33 +08:00
page_counter.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
page_ext.c mm/page_ext.c: check if page_ext is not prepared 2017-11-24 08:37:05 +01:00
page_idle.c mm: thp: fix potential clearing to referenced flag in page_idle_clear_pte_refs_one() 2018-05-30 07:52:24 +02:00
page_io.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
page_isolation.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
page_owner.c mm/page_owner: fix recursion bug after changing skip entries 2018-05-30 07:52:21 +02:00
page_poison.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
page_vma_mapped.c mm, page_vma_mapped: Drop faulty pointer arithmetics in check_pte() 2018-01-23 19:58:21 +01:00
pagewalk.c mm/pagewalk.c: report holes in hugetlb ranges 2017-11-24 08:37:04 +01:00
percpu-internal.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
percpu-km.c percpu: add __GFP_NORETRY semantics to the percpu balancing path 2018-04-08 14:26:29 +02:00
percpu-stats.c percpu: fix starting offset for chunk statistics traversal 2017-09-27 14:45:57 -07:00
percpu-vm.c percpu: add __GFP_NORETRY semantics to the percpu balancing path 2018-04-08 14:26:29 +02:00
percpu.c percpu: include linux/sched.h for cond_resched() 2018-05-09 09:51:48 +02:00
pgtable-generic.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
process_vm_access.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h> 2017-03-02 08:42:28 +01:00
quicklist.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
readahead.c mm: don't cap request size based on read-ahead setting 2016-12-12 18:55:08 -08:00
rmap.c lib/interval_tree: fast overlap detection 2017-09-08 18:26:49 -07:00
rodata_test.c mm: fix RODATA_TEST failure "rodata_test: test data was not read only" 2017-10-03 17:54:24 -07:00
shmem.c mm/shmem: do not wait for lock_page() in shmem_unused_huge_shrink() 2018-03-28 18:24:39 +02:00
slab.c mm, slab: memcg_link the SLAB's kmem_cache 2018-05-30 07:52:21 +02:00
slab.h kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACK 2018-02-22 15:42:23 +01:00
slab_common.c kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACK 2018-02-22 15:42:23 +01:00
slob.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
slub.c kmemcheck: rip it out 2018-02-22 15:42:24 +01:00
sparse-vmemmap.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sparse.c mm: sections are not offlined during memory hotremove 2018-05-16 10:10:27 +02:00
swap.c mm: avoid marking swap cached page as lazyfree 2017-10-03 17:54:24 -07:00
swap_cgroup.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
swap_slots.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
swap_state.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
swapfile.c swap: divide-by-zero when zero length swap file on ssd 2018-05-30 07:52:23 +02:00
truncate.c mm/truncate.c: fix THP handling in invalidate_mapping_pages() 2017-07-10 16:32:32 -07:00
usercopy.c mm/usercopy: Drop extra is_vmalloc_or_module() check 2017-04-05 12:30:18 -07:00
userfaultfd.c userfaultfd: shmem: wire up shmem_mfill_zeropage_pte 2017-09-06 17:27:28 -07:00
util.c mm: rename global_page_state to global_zone_page_state 2017-09-06 17:27:29 -07:00
vmacache.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
vmalloc.c vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems 2018-02-25 11:08:04 +01:00
vmpressure.c mm, vmpressure: pass-through notification support 2017-07-10 16:32:31 -07:00
vmscan.c mm: fix the NULL mapping case in __isolate_lru_page() 2018-06-05 11:41:54 +02:00
vmstat.c mm/vmstat.c: fix vmstat_update() preemption BUG 2018-05-30 07:52:21 +02:00
workingset.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
z3fold.c z3fold: fix memory leak 2018-05-30 07:52:23 +02:00
zbud.c
zpool.c
zsmalloc.c zsmalloc: calling zs_map_object() from irq is a bug 2017-12-14 09:53:10 +01:00
zswap.c mm, swap, frontswap: fix THP swap if frontswap enabled 2018-02-28 10:19:41 +01:00