The source for the Linux kernel used in Windows Subsystem for Linux 2 (WSL2)
Перейти к файлу
Jonathan Toppins 75dddef325 mm: ratelimit PFNs busy info message
The RDMA subsystem can generate several thousand of these messages per
second eventually leading to a kernel crash.  Ratelimit these messages
to prevent this crash.

Doug said:
 "I've been carrying a version of this for several kernel versions. I
  don't remember when they started, but we have one (and only one) class
  of machines: Dell PE R730xd, that generate these errors. When it
  happens, without a rate limit, we get rcu timeouts and kernel oopses.
  With the rate limit, we just get a lot of annoying kernel messages but
  the machine continues on, recovers, and eventually the memory
  operations all succeed"

And:
 "> Well... why are all these EBUSY's occurring? It sounds inefficient
  > (at least) but if it is expected, normal and unavoidable then
  > perhaps we should just remove that message altogether?

  I don't have an answer to that question. To be honest, I haven't
  looked real hard. We never had this at all, then it started out of the
  blue, but only on our Dell 730xd machines (and it hits all of them),
  but no other classes or brands of machines. And we have our 730xd
  machines loaded up with different brands and models of cards (for
  instance one dedicated to mlx4 hardware, one for qib, one for mlx5, an
  ocrdma/cxgb4 combo, etc), so the fact that it hit all of the machines
  meant it wasn't tied to any particular brand/model of RDMA hardware.
  To me, it always smelled of a hardware oddity specific to maybe the
  CPUs or mainboard chipsets in these machines, so given that I'm not an
  mm expert anyway, I never chased it down.

  A few other relevant details: it showed up somewhere around 4.8/4.9 or
  thereabouts. It never happened before, but the prinkt has been there
  since the 3.18 days, so possibly the test to trigger this message was
  changed, or something else in the allocator changed such that the
  situation started happening on these machines?

  And, like I said, it is specific to our 730xd machines (but they are
  all identical, so that could mean it's something like their specific
  ram configuration is causing the allocator to hit this on these
  machine but not on other machines in the cluster, I don't want to say
  it's necessarily the model of chipset or CPU, there are other bits of
  identicalness between these machines)"

Link: http://lkml.kernel.org/r/499c0f6cc10d6eb829a67f2a4d75b4228a9b356e.1501695897.git.jtoppins@redhat.com
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Tested-by: Doug Ledford <dledford@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-10 15:54:06 -07:00
Documentation Pin control fixes for the v4.13 cycle: 2017-08-09 14:30:34 -07:00
arch Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 2017-08-10 09:36:06 -07:00
block blk-mq: don't leak preempt counter/q_usage_counter when allocating rq failed 2017-08-02 08:23:57 -06:00
certs
crypto
drivers Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-10 10:30:29 -07:00
firmware
fs mm: fix global NR_SLAB_.*CLAIMABLE counter reads 2017-08-10 15:54:06 -07:00
include Pin control fixes for the v4.13 cycle: 2017-08-09 14:30:34 -07:00
init
ipc ipc: add missing container_of()s for randstruct 2017-08-02 17:16:12 -07:00
kernel mm: fix global NR_SLAB_.*CLAIMABLE counter reads 2017-08-10 15:54:06 -07:00
lib Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-07-31 22:36:42 -07:00
mm mm: ratelimit PFNs busy info message 2017-08-10 15:54:06 -07:00
net packet: fix tp_reserve race in packet_set_ring 2017-08-10 09:52:12 -07:00
samples
scripts parse-maintainers: Move matching sections from MAINTAINERS 2017-08-08 11:16:14 -07:00
security
sound ASoC: Fixes for v4.13 2017-08-02 17:11:45 +02:00
tools bpf: fix selftest/bpf/test_pkt_md_access on s390x 2017-08-07 10:06:27 -07:00
usr
virt KVM/ARM Fixes for v4.13-rc4 2017-08-03 17:59:58 +02:00
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore
.mailmap
COPYING
CREDITS
Kbuild
Kconfig
MAINTAINERS Pin control fixes for the v4.13 cycle: 2017-08-09 14:30:34 -07:00
Makefile Linux 4.13-rc4 2017-08-06 18:44:49 -07:00
README

README

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.