MADV_FREE pages have been moved into the LRU_INACTIVE_FILE list by commit
f7ad2a6cb9 ("mm: move MADV_FREE pages into LRU_INACTIVE_FILE list").
Link: https://lkml.kernel.org/r/20221111034639.3593380-1-wenjian1@xiaomi.com
Signed-off-by: Jian Wen <wenjian1@xiaomi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Since 2d1c498072 ("mm: memcontrol: make swap tracking an integral part
of memory control"), CONFIG_MEMCG_SWAP hasn't been a user-visible config
option anymore, it just means CONFIG_MEMCG && CONFIG_SWAP.
Update the sites accordingly and drop the symbol.
[ While touching the docs, remove two references to CONFIG_MEMCG_KMEM,
which hasn't been a user-visible symbol for over half a decade. ]
Link: https://lkml.kernel.org/r/20220926135704.400818-5-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
During the integration of PREEMPT_RT support, the code flow around
memcg_check_events() resulted in `twisted code'. Moving the code around
and avoiding then would then lead to an additional local-irq-save
section within memcg_check_events(). While looking better, it adds a
local-irq-save section to code flow which is usually within an
local-irq-off block on non-PREEMPT_RT configurations.
The threshold event handler is a deprecated memcg v1 feature. Instead
of trying to get it to work under PREEMPT_RT just disable it. There
should be no users on PREEMPT_RT. From that perspective it makes even
less sense to get it to work under PREEMPT_RT while having zero users.
Make memory.soft_limit_in_bytes and cgroup.event_control return
-EOPNOTSUPP on PREEMPT_RT. Make an empty memcg_check_events() and
memcg_write_event_control() which return only -EOPNOTSUPP on PREEMPT_RT.
Document that the two knobs are disabled on PREEMPT_RT.
Link: https://lkml.kernel.org/r/20220226204144.1008339-3-bigeasy@linutronix.de
Suggested-by: Michal Hocko <mhocko@kernel.org>
Suggested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: kernel test robot <oliver.sang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The deprecation process of kmem.limit_in_bytes started with the commit
0158115f70 ("memcg, kmem: deprecate kmem.limit_in_bytes") which also
explains in detail the motivation behind the deprecation. To summarize,
it is the unexpected behavior on hitting the kmem limit. This patch
moves the deprecation process to the next stage by disallowing to set
the kmem limit. In future we might just remove the kmem.limit_in_bytes
file completely.
[akpm@linux-foundation.org: s/ENOTSUPP/EOPNOTSUPP/]
[arnd@arndb.de: mark cancel_charge() inline]
Link: https://lkml.kernel.org/r/20211022070542.679839-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20211019153408.2916808-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When debugging an oom issue, I found the oom_kill counter of memcg is
confusing. At the first glance without checking document, I thought it
just counts for memcg oom, but it turns out it counts both global and
memcg oom.
The cgroup v2 documents it, but the description is missed for cgroup v1.
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Chris Down <chris@chrisdown.name>
Link: https://lore.kernel.org/r/20210226021254.3980-1-shy828301@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Replace the lkml.org links with lore to better use a single source
that's more likely to stay available long-term.
Done by bash script:
cvt_lkml_to_lore ()
{
tmpfile=$(mktemp ./.cvt_links.XXXXXXX)
header=$(echo $1 | sed 's@/lkml/@/lkml/headers/@')
wget -qO - $header > $tmpfile
if [[ $? == 0 ]] ; then
link=$(grep -i '^Message-Id:' $tmpfile | head -1 | \
sed -r -e 's/^\s*Message-Id:\s*<\s*//' -e 's/\s*>\s*$//' -e 's@^@https://lore.kernel.org/r/@')
# echo "testlink: $link"
if [ -n "$link" ] ; then
wget -qO - $link > /dev/null
if [[ $? == 0 ]] ; then
echo $link
fi
fi
fi
rm -f $tmpfile
}
git grep -P -o "\bhttps?://(?:www.)?lkml.org/lkml[\/\w]+" $@ |
while read line ; do
echo $line
file=$(echo $line | cut -f1 -d':')
link=$(echo $line | cut -f2- -d':')
newlink=$(cvt_lkml_to_lore $link)
if [[ -n "$newlink" ]] ; then
sed -i -e "s#\b$link\b#$newlink#" $file
fi
done
Link: https://lore.kernel.org/patchwork/patch/1265849/#1462688
Signed-off-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/r/77cdb7f32cfb087955bfc3600b86c40bed5d4104.camel@perches.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Since we changed the pgdat->lru_lock to lruvec->lru_lock, it's time to fix
the incorrect comments in code. Also fixed some zone->lru_lock comment
error from ancient time. etc.
I struggled to understand the comment above move_pages_to_lru() (surely
it never calls page_referenced()), and eventually realized that most of
it had got separated from shrink_active_list(): move that comment back.
Link: https://lkml.kernel.org/r/1604566549-62481-20-git-send-email-alex.shi@linux.alibaba.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Jann Horn <jannh@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Chen, Rong A" <rong.a.chen@intel.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mika Penttilä <mika.penttila@nextfour.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update cgroup v1 docs after the deprecation of the non-hierarchical mode
of the memory controller.
Link: https://lkml.kernel.org/r/20201110220800.929549-3-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cgroup v1 memcg controller has exposed a dedicated kmem limit to users
which turned out to be really a bad idea because there are paths which
cannot shrink the kernel memory usage enough to get below the limit (e.g.
because the accounted memory is not reclaimable). There are cases when
the failure is even not allowed (e.g. __GFP_NOFAIL). This means that the
kmem limit is in excess to the hard limit without any way to shrink and
thus completely useless. OOM killer cannot be invoked to handle the
situation because that would lead to a premature oom killing.
As a result many places might see ENOMEM returning from kmalloc and result
in unexpected errors. E.g. a global OOM killer when there is a lot of
free memory because ENOMEM is translated into VM_FAULT_OOM in #PF path and
therefore pagefault_out_of_memory would result in OOM killer.
Please note that the kernel memory is still accounted to the overall limit
along with the user memory so removing the kmem specific limit should
still allow to contain kernel memory consumption. Unlike the kmem one,
though, it invokes memory reclaim and targeted memcg oom killing if
necessary.
Start the deprecation process by crying to the kernel log. Let's see
whether there are relevant usecases and simply return to EINVAL in the
second stage if nobody complains in few releases.
[akpm@linux-foundation.org: tweak documentation text]
Link: http://lkml.kernel.org/r/20190911151612.GI4023@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Thomas Lindroth <thomas.lindroth@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>