WSL2-Linux-Kernel/mm
Tim Chen 917d9290af mm: tune vm_committed_as percpu_counter batching size
Currently the per cpu counter's batch size for memory accounting is
configured as twice the number of cpus in the system.  However, for
system with very large memory, it is more appropriate to make it
proportional to the memory size per cpu in the system.

For example, for a x86_64 system with 64 cpus and 128 GB of memory, the
batch size is only 2*64 pages (0.5 MB).  So any memory accounting
changes of more than 0.5MB will overflow the per cpu counter into the
global counter.  Instead, for the new scheme, the batch size is
configured to be 0.4% of the memory/cpu = 8MB (128 GB/64 /256), which is
more inline with the memory size.

I've done a repeated brk test of 800KB (from will-it-scale test suite)
with 80 concurrent processes on a 4 socket Westmere machine with a total
of 40 cores.  Without the patch, about 80% of cpu is spent on spin-lock
contention within the vm_committed_as counter.  With the patch, there's
a 73x speedup on the benchmark and the lock contention drops off almost
entirely.

[akpm@linux-foundation.org: fix section mismatch]
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:32 -07:00
..
Kconfig mm: soft-dirty bits for user memory changes tracking 2013-07-03 16:07:26 -07:00
Kconfig.debug
Makefile memcg: add memory.pressure_level events 2013-04-29 15:54:38 -07:00
backing-dev.c writeback: expose the bdi_wq workqueue 2013-04-01 19:08:06 -07:00
balloon_compaction.c mm: introduce a common interface for balloon pages mobility 2012-12-11 17:22:26 -08:00
bootmem.c mm: Add alloc_bootmem_low_pages_nopanic() 2013-01-29 19:32:59 -08:00
bounce.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
cleancache.c mm: cleancache: clean up cleancache_enabled 2013-04-30 17:04:01 -07:00
compaction.c mm: add & use zone_end_pfn() and zone_spans_pfn() 2013-02-23 17:50:20 -08:00
debug-pagealloc.c
dmapool.c dmapool: make DMAPOOL_DEBUG detect corruption of free marker 2012-12-11 17:22:24 -08:00
fadvise.c teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long 2013-03-03 22:46:22 -05:00
failslab.c
filemap.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
filemap_xip.c lift sb_start_write() out of ->write() 2013-04-09 14:12:56 -04:00
fremap.c Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace programs" 2013-03-28 17:45:51 -07:00
frontswap.c frontswap: fix incorrect zeroing and allocation size for frontswap_map 2013-06-12 16:29:46 -07:00
highmem.c Some nice cleanups, and even a patch my wife did as a "live" demo for 2012-12-20 08:37:05 -08:00
huge_memory.c mm: soft-dirty bits for user memory changes tracking 2013-07-03 16:07:26 -07:00
hugetlb.c mm/hugetlb: use already existing interface huge_page_shift 2013-07-03 16:07:32 -07:00
hugetlb_cgroup.c mm/hugetlb: create hugetlb cgroup file in hugetlb_init 2012-12-18 15:02:15 -08:00
hwpoison-inject.c memcg: rename config variables 2012-07-31 18:42:43 -07:00
init-mm.c
internal.h mm: accelerate munlock() treatment of THP pages 2013-02-27 19:10:09 -08:00
interval_tree.c mm: add CONFIG_DEBUG_VM_RB build option 2012-10-09 16:22:42 +09:00
kmemcheck.c
kmemleak-test.c
kmemleak.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ksm.c ksm: fix m68k build: only NUMA needs pfn_to_nid 2013-03-08 15:05:34 -08:00
maccess.c
madvise.c mm: madvise: complete input validation before taking lock 2013-04-29 15:54:37 -07:00
memblock.c memblock: fix missing comment of memblock_insert_region() 2013-04-29 15:54:38 -07:00
memcontrol.c mm, memcg: don't take task_lock in task_in_mem_cgroup 2013-07-03 16:07:26 -07:00
memory-failure.c mm/memory-failure.c: fix memory leak in successful soft offlining 2013-07-03 16:07:31 -07:00
memory.c mm: use vma_pages() to replace (vm_end - vm_start) >> PAGE_SHIFT 2013-07-03 16:07:26 -07:00
memory_hotplug.c mm/memory_hotplug.c: change normal message to use pr_debug 2013-07-03 16:07:31 -07:00
mempolicy.c mm/mempolicy.c: fix sp_node_init() argument ordering 2013-03-08 15:05:34 -08:00
mempool.c mempool: add @gfp_mask to mempool_create_node() 2012-06-25 11:53:47 +02:00
migrate.c mm: migration: add migrate_entry_wait_huge() 2013-06-12 16:29:46 -07:00
mincore.c swap: make each swap partition have one address_space 2013-02-23 17:50:17 -08:00
mlock.c Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace programs" 2013-03-28 17:45:51 -07:00
mm_init.c mm: tune vm_committed_as percpu_counter batching size 2013-07-03 16:07:32 -07:00
mmap.c mm: use vma_pages() to replace (vm_end - vm_start) >> PAGE_SHIFT 2013-07-03 16:07:26 -07:00
mmu_context.c mm: remove old aio use_mm() comment 2013-05-07 18:38:27 -07:00
mmu_notifier.c mm: mmu_notifier: re-fix freed page still mapped in secondary MMU 2013-05-24 16:22:51 -07:00
mmzone.c mm: rename page struct field helpers 2013-02-23 17:50:18 -08:00
mprotect.c mm/mprotect.c: coding-style cleanups 2012-12-18 15:02:15 -08:00
mremap.c mm: soft-dirty bits for user memory changes tracking 2013-07-03 16:07:26 -07:00
msync.c
nobootmem.c mm, nobootmem: do memset() after memblock_reserve() 2013-04-29 15:54:39 -07:00
nommu.c mm/nommu.c: add additional check for vread() just like vwrite() has done 2013-07-03 16:07:31 -07:00
oom_kill.c memcg, oom: provide more precise dump info while memcg oom happening 2013-02-23 17:50:08 -08:00
page-writeback.c mm: make snapshotting pages for stable writes a per-bio operation 2013-04-29 15:54:33 -07:00
page_alloc.c mm/memory-hotplug: fix lowmem count overflow when offline pages 2013-07-03 16:07:32 -07:00
page_cgroup.c memcontrol: use N_MEMORY instead N_HIGH_MEMORY 2012-12-12 17:38:32 -08:00
page_io.c mm: remove compressed copy from zram in-memory 2013-07-03 16:07:26 -07:00
page_isolation.c mm: fix zone_watermark_ok_safe() accounting of isolated pages 2013-01-04 16:11:46 -08:00
pagewalk.c mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas 2013-05-24 16:22:53 -07:00
percpu-km.c
percpu-vm.c mm: fix kernel-doc warnings 2012-06-20 14:39:36 -07:00
percpu.c mm, percpu: Make sure percpu_alloc early parameter has an argument 2012-12-02 06:23:04 -08:00
pgtable-generic.c mm: Only flush the TLB when clearing an accessible pte 2012-12-11 14:28:34 +00:00
process_vm_access.c Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys 2013-03-12 11:05:45 -07:00
quicklist.c
readahead.c mm: change invalidatepage prototype to accept length 2013-05-21 23:17:23 -04:00
rmap.c mm: remove lru parameter from __lru_cache_add and lru_cache_add_lru 2013-07-03 16:07:31 -07:00
shmem.c vfs: export lseek_execute() to modules 2013-07-03 16:23:27 +04:00
slab.c Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2013-05-07 08:42:20 -07:00
slab.h slab: Common definition for kmem_cache_node 2013-02-01 12:32:09 +02:00
slab_common.c slab: prevent warnings when allocating with __GFP_NOWARN 2013-06-13 10:01:58 +03:00
slob.c mm: rename page struct field helpers 2013-02-23 17:50:18 -08:00
slub.c Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2013-05-07 08:42:20 -07:00
sparse-vmemmap.c sparse-vmemmap: specify vmemmap population range in bytes 2013-04-29 15:54:35 -07:00
sparse.c mm, hotplug: avoid compiling memory hotremove functions when disabled 2013-04-29 15:54:37 -07:00
swap.c mm: remove lru parameter from __lru_cache_add and lru_cache_add_lru 2013-07-03 16:07:31 -07:00
swap_state.c swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion 2013-06-12 16:29:45 -07:00
swapfile.c frontswap: fix incorrect zeroing and allocation size for frontswap_map 2013-06-12 16:29:46 -07:00
truncate.c mm: teach truncate_inode_pages_range() to handle non page aligned ranges 2013-05-27 23:32:35 -04:00
util.c swap: make each swap partition have one address_space 2013-02-23 17:50:17 -08:00
vmalloc.c vmalloc: introduce remap_vmalloc_range_partial 2013-07-03 16:07:30 -07:00
vmpressure.c memcg: add memory.pressure_level events 2013-04-29 15:54:38 -07:00
vmscan.c mm: remove lru parameter from __lru_cache_add and lru_cache_add_lru 2013-07-03 16:07:31 -07:00
vmstat.c mm/vmstat: add note on safety of drain_zonestat 2013-04-29 15:54:38 -07:00