The source for the Linux kernel used in Windows Subsystem for Linux 2 (WSL2)
Перейти к файлу
Vlastimil Babka 7810e6781e mm, page_alloc: do not break __GFP_THISNODE by zonelist reset
In __alloc_pages_slowpath() we reset zonelist and preferred_zoneref for
allocations that can ignore memory policies.  The zonelist is obtained
from current CPU's node.  This is a problem for __GFP_THISNODE
allocations that want to allocate on a different node, e.g.  because the
allocating thread has been migrated to a different CPU.

This has been observed to break SLAB in our 4.4-based kernel, because
there it relies on __GFP_THISNODE working as intended.  If a slab page
is put on wrong node's list, then further list manipulations may corrupt
the list because page_to_nid() is used to determine which node's
list_lock should be locked and thus we may take a wrong lock and race.

Current SLAB implementation seems to be immune by luck thanks to commit
511e3a0588 ("mm/slab: make cache_grow() handle the page allocated on
arbitrary node") but there may be others assuming that __GFP_THISNODE
works as promised.

We can fix it by simply removing the zonelist reset completely.  There
is actually no reason to reset it, because memory policies and cpusets
don't affect the zonelist choice in the first place.  This was different
when commit 183f6371aa ("mm: ignore mempolicies when using
ALLOC_NO_WATERMARK") introduced the code, as mempolicies provided their
own restricted zonelists.

We might consider this for 4.17 although I don't know if there's
anything currently broken.

SLAB is currently not affected, but in kernels older than 4.7 that don't
yet have 511e3a0588 ("mm/slab: make cache_grow() handle the page
allocated on arbitrary node") it is.  That's at least 4.4 LTS.  Older
ones I'll have to check.

So stable backports should be more important, but will have to be
reviewed carefully, as the code went through many changes.  BTW I think
that also the ac->preferred_zoneref reset is currently useless if we
don't also reset ac->nodemask from a mempolicy to NULL first (which we
probably should for the OOM victims etc?), but I would leave that for a
separate patch.

Link: http://lkml.kernel.org/r/20180525130853.13915-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 183f6371aa ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK")
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-07 17:34:38 -07:00
Documentation mm: memcg: allow lowering memory.swap.max below the current usage 2018-06-07 17:34:37 -07:00
LICENSES LICENSES: Add Linux-OpenIB license text 2018-04-27 16:41:53 -06:00
arch mm: add pt_mm to struct page 2018-06-07 17:34:37 -07:00
block Changes for 4.18: 2018-06-05 13:24:20 -07:00
certs certs/blacklist_nohashes.c: fix const confusion in certs blacklist 2018-02-21 15:35:43 -08:00
crypto - Introduce arithmetic overflow test helper functions (Rasmus) 2018-06-06 17:27:14 -07:00
drivers zram: introduce zram memory tracking 2018-06-07 17:34:34 -07:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs userfaultfd: prevent non-cooperative events vs mcopy_atomic races 2018-06-07 17:34:38 -07:00
include userfaultfd: prevent non-cooperative events vs mcopy_atomic races 2018-06-07 17:34:38 -07:00
init Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
ipc Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-06-04 21:02:18 -07:00
kernel mm: split page_type out from _mapcount 2018-06-07 17:34:37 -07:00
lib Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
mm mm, page_alloc: do not break __GFP_THISNODE by zonelist reset 2018-06-07 17:34:38 -07:00
net net/9p/trans_xen.c: don't inclide rwlock.h directly 2018-06-07 17:34:34 -07:00
samples samples/bpf: xdpsock: use skb Tx path for XDP_SKB 2018-06-05 15:48:57 +02:00
scripts mm: split page_type out from _mapcount 2018-06-07 17:34:37 -07:00
security Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
sound media updates for v4.18-rc1 2018-06-07 12:34:37 -07:00
tools mm: mark pages in use for page tables 2018-06-07 17:34:37 -07:00
usr kbuild: rename built-in.o to built-in.a 2018-03-26 02:01:19 +09:00
virt Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2018-06-04 15:23:48 -07:00
.clang-format clang-format: add configuration file 2018-04-11 10:28:35 -07:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Add hch to .get_maintainer.ignore 2015-08-21 14:30:10 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap Merge branch 'asoc-4.17' into asoc-4.18 for compress dependencies 2018-04-26 12:24:28 +01:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS/CREDITS: Drop METAG ARCHITECTURE 2018-03-05 16:34:24 +00:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig kconfig: add basic helper macros to scripts/Kconfig.include 2018-05-29 03:31:19 +09:00
MAINTAINERS media updates for v4.18-rc1 2018-06-07 12:34:37 -07:00
Makefile Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.