WSL2-Linux-Kernel/mm
Roman Gushchin 3d8b38eb81 mm, oom: introduce memory.oom.group
For some workloads an intervention from the OOM killer can be painful.
Killing a random task can bring the workload into an inconsistent state.

Historically, there are two common solutions for this
problem:
1) enabling panic_on_oom,
2) using a userspace daemon to monitor OOMs and kill
   all outstanding processes.

Both approaches have their downsides: rebooting on each OOM is an obvious
waste of capacity, and handling everything in userspace is tricky and
requires a userspace agent that monitors all cgroups for OOMs.

In most cases an in-kernel post-OOM cleanup mechanism can eliminate the
need to enable panic_on_oom.  It can also simplify cgroup management for
userspace applications.

This commit introduces a new knob for the cgroup v2 memory controller:
memory.oom.group.  The knob determines whether the cgroup should be
treated as an indivisible workload by the OOM killer.  If set, all tasks
belonging to the cgroup or to its descendants (if the memory cgroup is not
a leaf cgroup) are killed together or not at all.
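
As an illustration of how userspace might turn the knob on, here is a minimal
sketch in C. It assumes the knob accepts "1" to enable and "0" to disable group
kill; the helper name set_oom_group and the cgroup path in the usage note are
hypothetical and not part of this commit.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write "1" or "0" to <cgroup_dir>/memory.oom.group.
 * Returns 0 on success, -1 on any failure.
 */
int set_oom_group(const char *cgroup_dir, int enable)
{
    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/memory.oom.group", cgroup_dir);

    /* Reject paths that did not fit into the buffer. */
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int ok = fputs(enable ? "1" : "0", f) >= 0;

    /* fclose() flushes the buffered write, so a rejected value shows up here. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

A caller could use it as set_oom_group("/sys/fs/cgroup/workload", 1), where
the mount point and the cgroup name "workload" are assumptions about the local
setup.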

To determine which cgroup has to be killed, we traverse the cgroup
hierarchy from the victim task's cgroup up to the OOMing cgroup (or root),
looking for the highest-level cgroup with memory.oom.group set.
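
The traversal described above can be sketched as a simplified userspace model
in C. This is not the kernel code from this commit: struct cgroup_node and
find_oom_group are hypothetical names, and passing NULL as the OOMing cgroup
is an assumed convention meaning "walk all the way to the root".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a memory cgroup. */
struct cgroup_node {
    const char *name;
    struct cgroup_node *parent;   /* NULL for the root cgroup */
    bool oom_group;               /* models memory.oom.group being set */
};

/*
 * Walk from the victim's cgroup up to the OOMing cgroup (inclusive) and
 * return the highest-level cgroup on that path with oom_group set, or
 * NULL if none has it. A NULL oom_domain means the walk ends at the root.
 */
struct cgroup_node *find_oom_group(struct cgroup_node *victim,
                                   struct cgroup_node *oom_domain)
{
    struct cgroup_node *group = NULL;

    for (struct cgroup_node *cg = victim; cg; cg = cg->parent) {
        /* Later (higher) matches overwrite earlier (lower) ones. */
        if (cg->oom_group)
            group = cg;
        /* Never look above the cgroup that is actually out of memory. */
        if (cg == oom_domain)
            break;
    }
    return group;
}
```

Because each match overwrites the previous one, the last match before the walk
stops is the highest-level cgroup on the path, which is the one the text says
should be killed as a unit.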

Tasks with OOM protection (oom_score_adj set to -1000) are treated as an
exception and are never killed.
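
The exception can be illustrated with a small userspace model in C. The
struct task_model type and the kill_group function are hypothetical; the
sketch only marks tasks as killed rather than sending signals, and the value
-1000 is taken from the text above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Lowest oom_score_adj value; tasks set to it are OOM-protected. */
#define OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical, simplified stand-in for a task in the selected cgroup. */
struct task_model {
    int pid;
    int oom_score_adj;
    bool killed;
};

/*
 * Mark every task in the group as killed, except those with OOM
 * protection. Returns the number of tasks marked.
 */
size_t kill_group(struct task_model *tasks, size_t n)
{
    size_t killed = 0;

    for (size_t i = 0; i < n; i++) {
        /* Protected tasks are skipped even when the group is killed. */
        if (tasks[i].oom_score_adj == OOM_SCORE_ADJ_MIN)
            continue;
        tasks[i].killed = true;
        killed++;
    }
    return killed;
}
```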

This patch doesn't change the OOM victim selection algorithm.

Link: http://lkml.kernel.org/r/20180802003201.817-4-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:45 -07:00
kasan kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN 2018-08-17 16:20:30 -07:00
Kconfig mm, swap: make CONFIG_THP_SWAP depend on CONFIG_SWAP 2018-08-17 16:20:32 -07:00
Kconfig.debug mm: clarify CONFIG_PAGE_POISONING and usage 2018-08-22 10:52:44 -07:00
Makefile
backing-dev.c
balloon_compaction.c
bootmem.c
cleancache.c
cma.c mm/cma: remove unsupported gfp_mask parameter from cma_alloc() 2018-08-17 16:20:32 -07:00
cma.h
cma_debug.c mm/cma: remove unsupported gfp_mask parameter from cma_alloc() 2018-08-17 16:20:32 -07:00
compaction.c
debug.c
debug_page_ref.c
dmapool.c
early_ioremap.c
fadvise.c mm/fadvise.c: fix signed overflow UBSAN complaint 2018-08-17 16:20:30 -07:00
failslab.c
filemap.c
frame_vector.c
frontswap.c
gup.c
gup_benchmark.c
highmem.c
hmm.c mm, oom: distinguish blockable mode for mmu notifiers 2018-08-22 10:52:44 -07:00
huge_memory.c mm, huge page: copy target sub-page last when copy huge page 2018-08-17 16:20:29 -07:00
hugetlb.c mm/hugetlb.c: don't zero 1GiB bootmem pages 2018-08-17 16:20:32 -07:00
hugetlb_cgroup.c
hwpoison-inject.c
init-mm.c
internal.h mm: remove __paginginit 2018-08-22 10:52:45 -07:00
interval_tree.c
khugepaged.c mm: thp: pass correct vm_flags to hugepage_vma_check() 2018-08-17 16:20:30 -07:00
kmemleak-test.c
kmemleak.c
ksm.c mm: fix page_freeze_refs and page_unfreeze_refs in comments 2018-08-22 10:52:44 -07:00
list_lru.c mm/list_lru: introduce list_lru_shrink_walk_irq() 2018-08-17 16:20:32 -07:00
maccess.c
madvise.c
memblock.c mm/memblock.c: replace u64 with phys_addr_t where appropriate 2018-08-17 16:20:30 -07:00
memcontrol.c mm, oom: introduce memory.oom.group 2018-08-22 10:52:45 -07:00
memfd.c
memory-failure.c mm: fix page_freeze_refs and page_unfreeze_refs in comments 2018-08-22 10:52:44 -07:00
memory.c Revert "mm: always flush VMA ranges affected by zap_page_range" 2018-08-17 16:20:31 -07:00
memory_hotplug.c mm/page_alloc: Introduce free_area_init_core_hotplug 2018-08-22 10:52:45 -07:00
mempolicy.c mm: access zone->node via zone_to_nid() and zone_set_nid() 2018-08-22 10:52:45 -07:00
mempool.c mm/mempool.c: add missing parameter description 2018-08-22 10:52:44 -07:00
memtest.c
migrate.c dax: remove VM_MIXEDMAP for fsdax and device dax 2018-08-17 16:20:27 -07:00
mincore.c
mlock.c dax: remove VM_MIXEDMAP for fsdax and device dax 2018-08-17 16:20:27 -07:00
mm_init.c mm: access zone->node via zone_to_nid() and zone_set_nid() 2018-08-22 10:52:45 -07:00
mmap.c mm, oom: remove oom_lock from oom_reaper 2018-08-22 10:52:44 -07:00
mmu_context.c
mmu_notifier.c mm, oom: distinguish blockable mode for mmu notifiers 2018-08-22 10:52:44 -07:00
mmzone.c
mprotect.c
mremap.c
msync.c
nobootmem.c
nommu.c mm: provide a fallback for PAGE_KERNEL_EXEC for architectures 2018-08-17 16:20:29 -07:00
oom_kill.c mm, oom: introduce memory.oom.group 2018-08-22 10:52:45 -07:00
page-writeback.c mm/page-writeback.c: update stale account_page_redirty() comment 2018-08-17 16:20:30 -07:00
page_alloc.c mm/page_alloc: Introduce free_area_init_core_hotplug 2018-08-22 10:52:45 -07:00
page_counter.c
page_ext.c mm/page_ext.c: constify lookup_page_ext() argument 2018-08-17 16:20:28 -07:00
page_idle.c
page_io.c
page_isolation.c
page_owner.c
page_poison.c
page_vma_mapped.c
pagewalk.c
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c
pgtable-generic.c
process_vm_access.c
quicklist.c
readahead.c
rmap.c
rodata_test.c
shmem.c mm: zero out the vma in vma_init() 2018-08-22 10:52:44 -07:00
slab.c
slab.h mm: introduce CONFIG_MEMCG_KMEM as combination of CONFIG_MEMCG && !CONFIG_SLOB 2018-08-17 16:20:30 -07:00
slab_common.c mm: introduce CONFIG_MEMCG_KMEM as combination of CONFIG_MEMCG && !CONFIG_SLOB 2018-08-17 16:20:30 -07:00
slob.c
slub.c mm, slub: restore the original intention of prefetch_freepointer() 2018-08-17 16:20:28 -07:00
sparse-vmemmap.c mm/sparse: delete old sparse_init and enable new one 2018-08-17 16:20:32 -07:00
sparse.c mm/sparse: delete old sparse_init and enable new one 2018-08-17 16:20:32 -07:00
swap.c
swap_cgroup.c
swap_slots.c mm, swap, get_swap_pages: use entry_size instead of cluster in parameter 2018-08-22 10:52:44 -07:00
swap_state.c
swapfile.c mm/swapfile.c: put_swap_page: share more between huge/normal code path 2018-08-22 10:52:44 -07:00
truncate.c
usercopy.c
userfaultfd.c
util.c
vmacache.c mm, vmacache: hash addresses based on pmd 2018-08-17 16:20:32 -07:00
vmalloc.c mm: provide a fallback for PAGE_KERNEL_EXEC for architectures 2018-08-17 16:20:29 -07:00
vmpressure.c
vmscan.c mm: fix page_freeze_refs and page_unfreeze_refs in comments 2018-08-22 10:52:44 -07:00
vmstat.c
workingset.c mm/list_lru: introduce list_lru_shrink_walk_irq() 2018-08-17 16:20:32 -07:00
z3fold.c
zbud.c
zpool.c
zsmalloc.c mm/zsmalloc.c: make several functions and a struct static 2018-08-17 16:20:30 -07:00
zswap.c