Граф коммитов

916762 Коммитов

Автор SHA1 Сообщение Дата
Jiaxun Yang 818e915fba irqchip: Add Loongson HyperTransport Vector support
This controller appears on Loongson-3 chips for receiving interrupt
vectors from PCH's PIC and PCH's PCIe MSI interrupts.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200528152757.1028711-2-jiaxun.yang@flygoat.com
2020-05-29 09:42:18 +01:00
Anup Patel 0e375f5101 irqchip/sifive-plic: Improve boot prints for multiple PLIC instances
We improve PLIC banner to help distinguish multiple PLIC instances
in boot time prints.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Link: https://lore.kernel.org/r/20200518091441.94843-4-anup.patel@wdc.com
2020-05-25 10:38:25 +01:00
Anup Patel 2234ae846c irqchip/sifive-plic: Setup cpuhp once after boot CPU handler is present
For multiple PLIC instances, the plic_init() is called once for each
PLIC instance. Due to this we have two issues:
1. cpuhp_setup_state() is called multiple times
2. plic_starting_cpu() can crash for boot CPU if cpuhp_setup_state()
   is called before boot CPU PLIC handler is available.

Address both issues by only initializing the HP notifiers when
the boot CPU setup is complete.

Fixes: f1ad1133b1 ("irqchip/sifive-plic: Add support for multiple PLICs")
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200518091441.94843-3-anup.patel@wdc.com
2020-05-25 10:36:53 +01:00
Anup Patel 2458ed31e9 irqchip/sifive-plic: Set default irq affinity in plic_irqdomain_map()
For multiple PLIC instances, each PLIC can only target a subset of
CPUs which is represented by "lmask" in the "struct plic_priv".

Currently, the default irq affinity for each PLIC interrupt is all
online CPUs which is illegal value for default irq affinity when we
have multiple PLIC instances. To fix this, we now set "lmask" as the
default irq affinity in for each interrupt in plic_irqdomain_map().

Fixes: f1ad1133b1 ("irqchip/sifive-plic: Add support for multiple PLICs")
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200518091441.94843-2-anup.patel@wdc.com
2020-05-25 10:36:09 +01:00
Valentin Schneider cc86432aa8 irqchip/gic-v2, v3: Drop extra IRQ_NOAUTOEN setting for (E)PPIs
(E)PPIs are per-CPU interrupts, so we want each CPU to go and enable them
via enable_percpu_irq(); this also means we want IRQ_NOAUTOEN for them as
the autoenable would lead to calling irq_enable() instead of the more
appropriate irq_percpu_enable().

Calling irq_set_percpu_devid() is enough to get just that since it trickles
down to irq_set_percpu_devid_flags(), which gives us IRQ_NOAUTOEN (and a
few others). Setting IRQ_NOAUTOEN *again* right after this call is just
redundant, so don't do it.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200521223500.834-1-valentin.schneider@arm.com
2020-05-25 10:32:51 +01:00
Andy Shevchenko 9ed78b05f9 irqdomain: Allow software nodes for IRQ domain creation
In some cases we need to have an IRQ domain created out of software node.

One of such cases is DesignWare GPIO driver when it's instantiated from
half-baked ACPI table (alas, we can't fix it for devices which are few years
on market) and thus using software nodes to quirk this. But the driver
is using IRQ domains based on per GPIO port firmware nodes, which are in
the above case software ones. This brings a warning message to be printed

  [   73.957183] irq: Invalid fwnode type for irqdomain

and creates an anonymous IRQ domain without a debugfs entry.

Allowing software nodes to be valid for IRQ domains rids us of the warning
and debugs gets correctly populated.

  % ls -1 /sys/kernel/debug/irq/domains/
  ...
  intel-quark-dw-apb-gpio:portA

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[maz: refactored commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200520164927.39090-3-andriy.shevchenko@linux.intel.com
2020-05-21 10:53:17 +01:00
Andy Shevchenko 87526603c8 irqdomain: Get rid of special treatment for ACPI in __irq_domain_add()
Now that __irq_domain_add() is able to better deals with generic
fwnodes, there is no need to special-case ACPI anymore.

Get rid of the special treatment for ACPI.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200520164927.39090-2-andriy.shevchenko@linux.intel.com
2020-05-21 10:51:50 +01:00
Andy Shevchenko 181e9d4efa irqdomain: Make __irq_domain_add() less OF-dependent
__irq_domain_add() relies in some places on the fact that the fwnode
can be only of type OF. This prevents refactoring of the code to support
other types of fwnode. Make it less OF-dependent by switching it
to use the fwnode directly where it makes sense.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200520164927.39090-1-andriy.shevchenko@linux.intel.com
2020-05-21 10:50:30 +01:00
Dan Carpenter 128516e49d iio: dummy_evgen: Fix use after free on error in iio_dummy_evgen_create()
We need to preserve the "iio_evgen->irq_sim_domain" error code before
we free "iio_evgen" otherwise it leads to a use after free.

Fixes: 337cbeb2c1 ("genirq/irq_sim: Simplify the API")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-20 13:11:41 +01:00
Marc Zyngier c5d6082d35 irqchip/gic-v3-its: Balance initial LPI affinity across CPUs
When mapping a LPI, the ITS driver picks the first possible
affinity, which is in most cases CPU0, assuming that if
that's not suitable, someone will come and set the affinity
to something more interesting.

It apparently isn't the case, and people complain of poor
performance when many interrupts are glued to the same CPU.
So let's place the interrupts by finding the "least loaded"
CPU (that is, the one that has the fewer LPIs mapped to it).
So called 'managed' interrupts are an interesting case where
the affinity is actually dictated by the kernel itself, and
we should honor this.

Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1575642904-58295-1-git-send-email-john.garry@huawei.com
Link: https://lore.kernel.org/r/20200515165752.121296-3-maz@kernel.org
2020-05-20 11:00:00 +01:00
Marc Zyngier 2f13ff1d1d irqchip/gic-v3-its: Track LPI distribution on a per CPU basis
In order to improve the distribution of LPIs among CPUs, let start by
tracking the number of LPIs assigned to CPUs, both for managed and
non-managed interrupts (as separate counters).

Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20200515165752.121296-2-maz@kernel.org
2020-05-18 10:30:42 +01:00
Bartosz Golaszewski 337cbeb2c1 genirq/irq_sim: Simplify the API
The interrupt simulator API exposes a lot of custom data structures and
functions and doesn't reuse the interfaces already exposed by the irq
subsystem. This patch tries to address it.

We hide all the simulator-related data structures from users and instead
rely on the well-known irq domain. When creating the interrupt simulator
the user receives a pointer to a newly created irq_domain and can use it
to create mappings for simulated interrupts.

It is also possible to pass a handle to fwnode when creating the simulator
domain and retrieve it using irq_find_matching_fwnode().

The irq_sim_fire() function is dropped as well. Instead we implement the
irq_get/set_irqchip_state interface.

We modify the two modules that use the simulator at the same time as
adding these changes in order to reduce the intermediate bloat that would
result when trying to migrate the drivers in separate patches.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> #for IIO
Link: https://lore.kernel.org/r/20200514083901.23445-3-brgl@bgdev.pl
2020-05-18 10:30:21 +01:00
Bartosz Golaszewski 5c8f77a278 irqdomain: Make irq_domain_reset_irq_data() available to non-hierarchical users
irq_domain_reset_irq_data() doesn't modify the parent data, so it can be
made available even if irq domain hierarchy is not being built. We'll
subsequently use it in irq_sim code.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20200514083901.23445-2-brgl@bgdev.pl
2020-05-18 10:29:26 +01:00
Wesley W. Terpstra 82f2202ddc irqchip/sifive-plic: Remove incorrect requirement about number of irq contexts
A PLIC may not be connected to all the cores. In that case, nr_contexts
may be less than num_possible_cpus. This requirement is only valid a single
PLIC is the only interrupt controller for the whole system.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: "Wesley W. Terpstra" <wesley@sifive.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Link: https://lore.kernel.org/r/20200512172636.96299-1-atish.patra@wdc.com

[Atish: Modified the commit text]
2020-05-18 10:28:30 +01:00
Ingo Rohloff 8a94c1ab34 irqchip/gic-v3: Fix missing "__init" for gic_smp_init()
With an SMP configuration, gic_smp_init() calls set_smp_cross_call().
set_smp_cross_call() is marked with "__init".
So gic_smp_init() should also be marked with "__init".
gic_smp_init() is only called from gic_init_bases().
gic_init_bases() is also marked with "__init";
So marking gic_smp_init() with "__init" is fine.

Signed-off-by: Ingo Rohloff <ingo.rohloff@lauterbach.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200422112857.4300-1-ingo.rohloff@lauterbach.com
2020-05-18 10:28:30 +01:00
Shaokun Zhang ae0bb9fda4 platform-msi: Fix typos in comment
Fix up one typos @nev -> @nr_irqs.

Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1589770859-19340-1-git-send-email-zhangshaokun@hisilicon.com
2020-05-18 10:28:30 +01:00
Linus Torvalds 2ef96a5bb1 Linux 5.7-rc5 2020-05-10 15:16:58 -07:00
Linus Torvalds c14cab2688 A set of fixes for x86:
- Ensure that direct mapping alias is always flushed when changing page
    attributes. The optimization for small ranges failed to do so when
    the virtual address was in the vmalloc or module space.
 
  - Unbreak the trace event registration for syscalls without arguments
    caused by the refactoring of the SYSCALL_DEFINE0() macro.
 
  - Move the printk in the TSC deadline timer code to a place where it is
    guaranteed to only be called once during boot and cannot be rearmed by
    clearing warn_once after boot. If it's invoked post boot then lockdep
    rightfully complains about a potential deadlock as the calling context
    is different.
 
  - A series of fixes for objtool and the ORC unwinder addressing variety
    of small issues:
 
      Stack offset tracking for indirect CFAs in objtool ignored subsequent
      pushs and pops
 
      Repair the unwind hints in the register clearing entry ASM code
 
      Make the unwinding in the low level exit to usermode code stop after
      switching to the trampoline stack. The unwind hint is not longer valid
      and the ORC unwinder emits a warning as it can't find the registers
      anymore.
 
      Fix the unwind hints in switch_to_asm() and rewind_stack_do_exit()
      which caused objtool to generate bogus ORC data.
 
      Prevent unwinder warnings when dumping the stack of a non-current
      task as there is no way to be sure about the validity because the
      dumped stack can be a moving target.
 
      Make the ORC unwinder behave the same way as the frame pointer
      unwinder when dumping an inactive tasks stack and do not skip the
      first frame.
 
      Prevent ORC unwinding before ORC data has been initialized
 
      Immediately terminate unwinding when a unknown ORC entry type is
      found.
 
      Prevent premature stop of the unwinder caused by IRET frames.
 
      Fix another infinite loop in objtool caused by a negative offset which
      was not catched.
 
      Address a few build warnings in the ORC unwinder and add missing
      static/ro_after_init annotations
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl6363QTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoRJHD/4hWjzJLsUZ9xq2NrzhevoeJtxj+wVM
 66x9NM3mlFQ30BN4Aye4EnNEhR0iIvNPWWdfEmaJYfPHPwnUjjcOa426HYxP/WXA
 DWd5F20wGaaPOJ65LJpy/+pfcxAeQynt4I2cDEWHAplswfOWV/Hv8mSeKAKuq400
 lCWaTMkWcO/toexSNn8PVyWi9rHlm+76E1bHkVwuoekGBGt1VloKGlK6OPyElzL2
 w9VtrjSLlYQ0MdfCJKQeg44XQPMbf4hZRfc88x9SwDWB01q7aSvb0pWNl9AJKNXA
 7fFu5T4F4PABPgRM7eJ5yNk0De9jM1y+6eCp66f9UXoNOeSr7Boz9Xc4xWqAraIi
 9Dtx3WliO9CAxwUiD+Cj2iJO5o83AdRK/xhCth2VRnYMS6imfSidEqTC+LhEtkzw
 Yplu7sbrWQDa5JTh8vk60clDvbkU+pfdxJisY+KClRguWfQfR6MJNuQnE0NHr7cH
 H4VXFFHEE6tDdJneQ9RxA4iF20RTgSlJGK0YlsH6QsxPsRgoHVkGUao8fQhrNvRc
 MIdpm9YasWStjJ7ZXbDeStmnLFN3DCj1RC8wmvJ4i/R1sPnBvPvRUt4Lm988a951
 Vyr23VIcVrE7zykiqQZVH7bvIv6ULORqTJbIOF1rO/aIut4W8z0ojoVXC0Z7CiwF
 S5SGj+hlWciIew==
 =0rCi
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for x86:

   - Ensure that direct mapping alias is always flushed when changing
     page attributes. The optimization for small ranges failed to do so
     when the virtual address was in the vmalloc or module space.

   - Unbreak the trace event registration for syscalls without arguments
     caused by the refactoring of the SYSCALL_DEFINE0() macro.

   - Move the printk in the TSC deadline timer code to a place where it
     is guaranteed to only be called once during boot and cannot be
     rearmed by clearing warn_once after boot. If it's invoked post boot
     then lockdep rightfully complains about a potential deadlock as the
     calling context is different.

   - A series of fixes for objtool and the ORC unwinder addressing
     variety of small issues:

       - Stack offset tracking for indirect CFAs in objtool ignored
         subsequent pushs and pops

       - Repair the unwind hints in the register clearing entry ASM code

       - Make the unwinding in the low level exit to usermode code stop
         after switching to the trampoline stack. The unwind hint is no
         longer valid and the ORC unwinder emits a warning as it can't
         find the registers anymore.

       - Fix unwind hints in switch_to_asm() and rewind_stack_do_exit()
         which caused objtool to generate bogus ORC data.

       - Prevent unwinder warnings when dumping the stack of a
         non-current task as there is no way to be sure about the
         validity because the dumped stack can be a moving target.

       - Make the ORC unwinder behave the same way as the frame pointer
         unwinder when dumping an inactive tasks stack and do not skip
         the first frame.

       - Prevent ORC unwinding before ORC data has been initialized

       - Immediately terminate unwinding when a unknown ORC entry type
         is found.

       - Prevent premature stop of the unwinder caused by IRET frames.

       - Fix another infinite loop in objtool caused by a negative
         offset which was not catched.

       - Address a few build warnings in the ORC unwinder and add
         missing static/ro_after_init annotations"

* tag 'x86-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/unwind/orc: Move ORC sorting variables under !CONFIG_MODULES
  x86/apic: Move TSC deadline timer debug printk
  ftrace/x86: Fix trace event registration for syscalls without arguments
  x86/mm/cpa: Flush direct map alias during cpa
  objtool: Fix infinite loop in for_offset_range()
  x86/unwind/orc: Fix premature unwind stoppage due to IRET frames
  x86/unwind/orc: Fix error path for bad ORC entry type
  x86/unwind/orc: Prevent unwinding before ORC initialization
  x86/unwind/orc: Don't skip the first frame for inactive tasks
  x86/unwind: Prevent false warnings for non-current tasks
  x86/unwind/orc: Convert global variables to static
  x86/entry/64: Fix unwind hints in rewind_stack_do_exit()
  x86/entry/64: Fix unwind hints in __switch_to_asm()
  x86/entry/64: Fix unwind hints in kernel exit path
  x86/entry/64: Fix unwind hints in register clearing code
  objtool: Fix stack offset tracking for indirect CFAs
2020-05-10 11:59:53 -07:00
Linus Torvalds 8b00083219 A single fix for objtool to prevent an infinite loop in the jump table
search which can be triggered when building the kernel with
 -ffunction-sections.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl635X8THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYod4AEACumK3x6NTRw7o889GU+Kzj/LfpJZOr
 QoKjZgo4AzeONa/5sKoBeNhGiHkuAOhhwG/yq9QrTMTw3b6kFPnQ8Vel590JM/uj
 j/qjgbpR5v2v0ULMbONEHBN/3dRETyFomAe88hg6WM7ZuNCXPd+rbya9XxD7LdQo
 K04y+vdPACJIf1hyb91sOWfyAWDHzwencNQ0qq0CMf72JpKFcRXCqor7vvH5zNH0
 2PmTbKasOlyrgb9HQNLi6mNNoM47bc9lg8D76eDBl/Wl3yqYVwAawk4vqgwjxc0P
 MetoybQsWegi2dzmhjk61MIF8h6vw/NM6xuyiXSW7dHCN1GXzWGojuSVfzv0AEj2
 0/xbRToSRWuPAvURGqiZ8GhBgG2ybHz+sDQyh/Cg4Jj2NulOLxy0lBngh5o/Dczh
 mLpqcJUd6I76bUSk+c0vehUqOEj0yAZm5Olo7gTkyoHlFSG0b2nShC1Km4TBM04A
 h5RTp/lBp++u1h4+w2EwyP8F6+chcpsyUQG1yEpz+GYoHsjYkA1gfx1rtD7uNKUI
 NwzVsuUEUt3UL4PPnjMMPYbOZb4R3vUjGFhgG5Se3D+2mNHkrvh+HA3WvUb5CUuW
 MSQ0e0K7Su3+M0/H+PQDKDPS/a0h3HtpZwaZt4gPkxnXNc5P7+c4Vyo28nttwF2O
 Ad08s6mVY8sDtA==
 =0+2v
 -----END PGP SIGNATURE-----

Merge tag 'objtool-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fix from Thomas Gleixner:
 "A single fix for objtool to prevent an infinite loop in the
  jump table search which can be triggered when building the
  kernel with '-ffunction-sections'"

* tag 'objtool-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix infinite loop in find_jump_table()
2020-05-10 11:42:14 -07:00
Linus Torvalds bd2049f871 A single fix for the fallout of the recent futex uacess rework.
With those changes GCC9 fails to analyze arch_futex_atomic_op_inuser()
 correctly and emits a 'maybe unitialized' warning. While we usually ignore
 compiler stupidity the conditional store is pointless anyway because the
 correct case has to store. For the fault case the extra store does no harm.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl635IgTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTQkEAChNFCVLyCNihKNDar4h0LhuChYSVow
 CRpnLFKTrxWpUemHYgOQM8FBFRjvVK3o3yhmp7qyWc1LnM4iYuGleP+FhfL5F1mk
 t0ANUMFAZOomy4348XXeVR/bq7RFpKrD68tsl0u3nC+NzykN4kCt3n8qN0CendbH
 +j9ILi2eNEbSIarC4gH228UuN0YIY5nC9ftW9oHJ+c/Z23X9RXstXhiH1TB9w99E
 97G96WOdWjA+z7KzMF1REi/goJGxeZh0GQdz4iuR6vBNd4iR2V9hT3DqklUnSZPp
 +XGvaWaUH7yVa0etUdCtlBwmZ7Xq3h/N381khq9m6NfXdS8aZ7OavWyf+3urx7xz
 6GtCIlo0QnIyqx5oe1/06zxQNgNAf0JAKIi5IDLFsr8SwfoWoG1Z6RrAYugyZurm
 9RganJhVGrTXApi/9NUafhqHv7y9OE5UodRLpnKdnjei+/sE51xaIgx7Tr59Ao8n
 G3sMZkI/8GV9cQnKrg7qcN7kiJfyofoslnOigwm3hJaTMAn0fK9+Bx5YvJgVlyf2
 SmE3saw3408/hhqkVWCW5GL8J+JEh/WDi6FCZ3Fu+L1UHalzqDGKAlhfmVxxDNmt
 tDbP4AUHbucmcWl98Ms0iKtfSwz1H0kTfkaHS0cvphIfH593S4FDJEiywiKsab7v
 8nPUV2Bi6vZHxw==
 =Va5K
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Thomas Gleixner:
 "A single fix for the fallout of the recent futex uacess rework.

  With those changes GCC9 fails to analyze arch_futex_atomic_op_inuser()
  correctly and emits a 'maybe unitialized' warning. While we usually
  ignore compiler stupidity the conditional store is pointless anyway
  because the correct case has to store. For the fault case the extra
  store does no harm"

* tag 'locking-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ARM: futex: Address build warning
2020-05-10 11:39:31 -07:00
Linus Torvalds 27d2dcb1b9 IOMMU Fixes for Linux v5.7-rc4
Including:
 
 	- The race condition fixes for the AMD IOMMU driver. This are 5
 	  patches fixing two race conditions around
 	  increase_address_space(). The first race condition was around
 	  the non-atomic update of the domain page-table root pointer
 	  and the variable containing the page-table depth (called
 	  mode). This is fixed now be merging page-table root and mode
 	  into one 64-bit field which is read/written atomically.
 
 	  The second race condition was around updating the page-table
 	  root pointer and making it public before the hardware caches
 	  were flushed. This could cause addresses to be mapped and
 	  returned to drivers which are not reachable by IOMMU hardware
 	  yet, causing IO page-faults. This is fixed too by adding the
 	  necessary flushes before a new page-table root is published.
 
 	  Related to the race condition fixes these patches also add a
 	  missing domain_flush_complete() barrier to update_domain() and
 	  a fix to bail out of the loop which tries to increase the
 	  address space when the call to increase_address_space() fails.
 
 	  Qian was able to trigger the race conditions under high load
 	  and memory pressure within a few days of testing. He confirmed
 	  that he has seen no issues anymore with the fixes included
 	  here.
 
 	- Fix for a list-handling bug in the VirtIO IOMMU driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl638MYACgkQK/BELZcB
 GuOQxQ/5AYorgKuGkqVbob69YWZuSAEG08dlzDw4C8CDnPKXEPd0L4gJGLP7BpEh
 bPJo9QJtXW7zG6Hhk8sWk9/iONsThngoudaQrodJwaQRdCDGaDZlvBaezG2Vx4xb
 A2OrcM9lvQSODdgyf3x0O1cX7vkQ4J6nJR1Z8Fw4EufjH6TS9DR0tf8ZWHtIpHa6
 Josu3M+qhUXPsn7KK5o7GtNib7sI4whLldYaASGsuaFGzod3CgA0cgmL2HfD+DWP
 k1EIEZTCaOq0BamtpyXbSA6o0AxwKERr/KONi1pL0xN4r0yCjsxEQ6+Rw4caqvgA
 zrfv3kk4a+wFAxOe0hUEtKk8Oy587LPJvIX4FnjG8hRnBrEaQC9vy4eMj05utPid
 PpsNQ35P+SyrxTlIp7ybIVhUvKbxih8SSpRsjx16vX+r/h4SRvWHzjpHVq/4+gIT
 TeZGw1g7xCIyjzn5HqLs/nMG/Ly9QHQaWia8slJJgbzI/deUXAVTy6PmMrqHB+zv
 e0PelKsq5lEQBrFX+r/Sg5hBViKaMykXKbXXg3KIolzlutJc2Rrzh4EEKpP/ug2/
 upTXf+NvMobNxb3QLqn3IJApIirEGYQqI7lwjiUwTC5xb3EfYLUuRa5i4fbOAZIv
 krsVM4sNX1S32TblTMzDDOEEggPG1wPhVF5B+1emOolYHek3ShI=
 =gqwr
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - Race condition fixes for the AMD IOMMU driver.

   These are five patches fixing two race conditions around
   increase_address_space(). The first race condition was around the
   non-atomic update of the domain page-table root pointer and the
   variable containing the page-table depth (called mode). This is fixed
   now be merging page-table root and mode into one 64-bit field which
   is read/written atomically.

   The second race condition was around updating the page-table root
   pointer and making it public before the hardware caches were flushed.
   This could cause addresses to be mapped and returned to drivers which
   are not reachable by IOMMU hardware yet, causing IO page-faults. This
   is fixed too by adding the necessary flushes before a new page-table
   root is published.

   Related to the race condition fixes these patches also add a missing
   domain_flush_complete() barrier to update_domain() and a fix to bail
   out of the loop which tries to increase the address space when the
   call to increase_address_space() fails.

   Qian was able to trigger the race conditions under high load and
   memory pressure within a few days of testing. He confirmed that he
   has seen no issues anymore with the fixes included here.

 - Fix for a list-handling bug in the VirtIO IOMMU driver.

* tag 'iommu-fixes-v5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/virtio: Reverse arguments to list_add
  iommu/amd: Do not flush Device Table in iommu_map_page()
  iommu/amd: Update Device Table in increase_address_space()
  iommu/amd: Call domain_flush_complete() in update_domain()
  iommu/amd: Do not loop forever when trying to increase address space
  iommu/amd: Fix race in increase_address_space()/fetch_pte()
2020-05-10 11:26:23 -07:00
Linus Torvalds 0a85ed6e7f block-5.7-2020-05-09
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl63WVAQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpkXWD/9qJgqQpPkigCCwwPHZ+phthw6gHeAgBxPH
 Cw6P9QB4QCdacZjQA6QH3zdxaDsCCitQRioWPgxngs1326TKYNzBi7U3eTEwiK12
 cnRybLnkzei4yzYVUSJk637oOoQh3CiJLvYcJBppGFi7crpbvlQv68M2hu05vhwL
 R/91H62X/5UaUlc1cJV63OBk8euWzF6XNbCQQrR4ayDvz+BsV5Fs72vYa1gx7qIt
 as/67oTT6y4U4pd74nT4OGkxDIXbXfn2eTbh5sMNc4ilBkqMyNbf8aOHdWqXZIBd
 18RKpNl6h/fiDMJ0jsGliReONLjfRBcJla68Kn1AFONMcyxcXidjptOwLOt2fYWf
 YMguCVMhfgxVBslzLWoQ9AWSiNVh36ycORWlCOrnRaOaQCb9OaLZ2fwibfZ0JsMd
 0259Z5vA7MIUoobCc5akXOYHbpByA9FSYkKudgTYLpdjkn05kxQyA12GgJjW3sVw
 ZRjoUuDuZDDUct6JcLWdrlONT8st05g+qf6PCoD+Jac8HtbpqHfKJJUtYecUat75
 4hGKhuvTzpuVY0wNHo3sgqKfsejQODTN6UhejNI11Zs/nx6O0ze/qoDuWZHncnKl
 158le+K5rNS8SUNbDBTMWp3OX4SJm/Gsf30fOWkkt6z1iaEfKc5sCxBHvSOeBEvH
 M9pzy56Vtw==
 =73nU
 -----END PGP SIGNATURE-----

Merge tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - a small series fixing a use-after-free of bdi name (Christoph,Yufen)

 - NVMe fix for a regression with the smaller CQ update (Alexey)

 - NVMe fix for a hang at namespace scanning error recovery (Sagi)

 - fix race with blk-iocost iocg->abs_vdebt updates (Tejun)

* tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-block:
  nvme: fix possible hang when ns scanning fails during error recovery
  nvme-pci: fix "slimmer CQ head update"
  bdi: add a ->dev_name field to struct backing_dev_info
  bdi: use bdi_dev_name() to get device name
  bdi: move bdi_dev_name out of line
  vboxsf: don't use the source name in the bdi name
  iocost: protect iocg->abs_vdebt with iocg->waitq.lock
2020-05-10 11:16:07 -07:00
Linus Torvalds e99332e7b4 gcc-10: mark more functions __init to avoid section mismatch warnings
It seems that for whatever reason, gcc-10 ends up not inlining a couple
of functions that used to be inlined before.  Even if they only have one
single callsite - it looks like gcc may have decided that the code was
unlikely, and not worth inlining.

The code generation difference is harmless, but caused a few new section
mismatch errors, since the (now no longer inlined) function wasn't in
the __init section, but called other init functions:

   Section mismatch in reference from the function kexec_free_initrd() to the function .init.text:free_initrd_mem()
   Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memremap()
   Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memunmap()

So add the appropriate __init annotation to make modpost not complain.
In both cases there were trivially just a single callsite from another
__init function.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 17:50:03 -07:00
Linus Torvalds 2e28f3b13a RISC-V Fixes for 5.7-rc5
This contains a smattering of fixes and cleanups that I'd like to target for
 5.7:
 
 * Dead code removal.
 * Exporting riscv_cpuid_to_hartid_mask for modules.
 * Per-CPU tracking of ISA features.
 * Setting max_pfn correctly when probing memory.
 * Adding a note to the VDSO so glibc can check the kernel's version without a
   uname().
 * A fix to force the bootloader to initialize the boot spin tables, which still
   get used as a fallback when SBI-0.1 is enabled.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAl61prsTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYia2UD/44ILoaQySVnLZ+ZzXaMXn3WwGHe8bS
 NVPQJB21ejkfbM8cDR5A8+w45FBrHquIRwhHnVkl5JU2AtvcdWh3tztmFx6Ejsu9
 FFBzcbHcXnYthkm1xLVPQASY0Pl6VOPdx47Mip9gvoLK79VetjQWNzUpFk4CBJdw
 nObgYgxE9twCQ7JOcK0VnPL9IpJ6E/lCcIyCi11NL9xRWtUyWk4hcmAFj/+tUegm
 DroT7QzKKxFS24eLaRkJgQGwAJ1jb0/b0ztl04U8NTOqVjgFXkGTC1Kuzd06Ch2U
 U34CYRL+A2sXwWnnNsIyjD7Epdalc/xx+JMEuD8dhnr0YK8WilvvG53gGwCwFgVc
 wpFhvsIuINYTw253Rv0q1oeRcDmMCKmV7bhOKSX4x0V1iGM1ognl/6zkCY4J0dQC
 7BCoeAGlpBTNbidatZ6jl5e32jes50ZRjhf3LxXe3mgrBd+diKXyOyLT01SVwqv/
 A1Sur/KquwoqT4RSx2Cel8JswPhfErhB0otL3CYoao8V7rxYGTKWKXg5SFAgwDHZ
 rib1UpYmyh2tjmoXb99ctlBpRHsYcVzXOZS9tG7B2ue7YhEwiZdV3249uwitAQgm
 NmGCH7tDe/nu5DLBoFyTjBJ64pZyn3YmE58M/uCmbXyMRVSGp2TXK83u3mfiw+gh
 kKNSRHJDAAl7Fg==
 =bGU8
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "A smattering of fixes and cleanups:

   - Dead code removal.

   - Exporting riscv_cpuid_to_hartid_mask for modules.

   - Per-CPU tracking of ISA features.

   - Setting max_pfn correctly when probing memory.

   - Adding a note to the VDSO so glibc can check the kernel's version
     without a uname().

   - A fix to force the bootloader to initialize the boot spin tables,
     which still get used as a fallback when SBI-0.1 is enabled"

* tag 'riscv-for-linus-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Remove unused code from STRICT_KERNEL_RWX
  riscv: force __cpu_up_ variables to put in data section
  riscv: add Linux note to vdso
  riscv: set max_pfn to the PFN of the last page
  RISC-V: Remove N-extension related defines
  RISC-V: Add bitmap reprensenting ISA features common across CPUs
  RISC-V: Export riscv_cpuid_to_hartid_mask() API
2020-05-09 16:24:16 -07:00
Linus Torvalds 1a263ae60b gcc-10: avoid shadowing standard library 'free()' in crypto
gcc-10 has started warning about conflicting types for a few new
built-in functions, particularly 'free()'.

This results in warnings like:

   crypto/xts.c:325:13: warning: conflicting types for built-in function ‘free’; expected ‘void(void *)’ [-Wbuiltin-declaration-mismatch]

because the crypto layer had its local freeing functions called
'free()'.

Gcc-10 is in the wrong here, since that function is marked 'static', and
thus there is no chance of confusion with any standard library function
namespace.

But the simplest thing to do is to just use a different name here, and
avoid this gcc mis-feature.

[ Side note: gcc knowing about 'free()' is in itself not the
  mis-feature: the semantics of 'free()' are special enough that a
  compiler can validly do special things when seeing it.

  So the mis-feature here is that gcc thinks that 'free()' is some
  restricted name, and you can't shadow it as a local static function.

  Making the special 'free()' semantics be a function attribute rather
  than tied to the name would be the much better model ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 15:58:04 -07:00
Linus Torvalds adc7192096 gcc-10: disable 'restrict' warning for now
gcc-10 now warns about passing aliasing pointers to functions that take
restricted pointers.

That's actually a great warning, and if we ever start using 'restrict'
in the kernel, it might be quite useful.  But right now we don't, and it
turns out that the only thing this warns about is an idiom where we have
declared a few functions to be "printf-like" (which seems to make gcc
pick up the restricted pointer thing), and then we print to the same
buffer that we also use as an input.

And people do that as an odd concatenation pattern, with code like this:

    #define sysfs_show_gen_prop(buffer, fmt, ...) \
        snprintf(buffer, PAGE_SIZE, "%s"fmt, buffer, __VA_ARGS__)

where we have 'buffer' as both the destination of the final result, and
as the initial argument.

Yes, it's a bit questionable.  And outside of the kernel, people do have
standard declarations like

    int snprintf( char *restrict buffer, size_t bufsz,
                  const char *restrict format, ... );

where that output buffer is marked as a restrict pointer that cannot
alias with any other arguments.

But in the context of the kernel, that 'use snprintf() to concatenate to
the end result' does work, and the pattern shows up in multiple places.
And we have not marked our own version of snprintf() as taking restrict
pointers, so the warning is incorrect for now, and gcc picks it up on
its own.

If we do start using 'restrict' in the kernel (and it might be a good
idea if people find places where it matters), we'll need to figure out
how to avoid this issue for snprintf and friends.  But in the meantime,
this warning is not useful.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 15:45:21 -07:00
Linus Torvalds 5a76021c2e gcc-10: disable 'stringop-overflow' warning for now
This is the final array bounds warning removal for gcc-10 for now.

Again, the warning is good, and we should re-enable all these warnings
when we have converted all the legacy array declaration cases to
flexible arrays. But in the meantime, it's just noise.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 15:40:52 -07:00
Sagi Grimberg 59c7c3caaa nvme: fix possible hang when ns scanning fails during error recovery
When the controller is reconnecting, the host fails I/O and admin
commands as the host cannot reach the controller. ns scanning may
revalidate namespaces during that period and it is wrong to remove
namespaces due to these failures as we may hang (see 205da24343).

One command that may fail is nvme_identify_ns_descs. Since we return
success due to having ns identify descriptor list optional, we continue
to compare ns identifiers in nvme_revalidate_disk, obviously fail and
return -ENODEV to nvme_validate_ns, which will remove the namespace.

Exactly what we don't want to happen.

Fixes: 22802bf742 ("nvme: Namepace identification descriptor list is optional")
Tested-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:58 -06:00
Alexey Dobriyan a8de663916 nvme-pci: fix "slimmer CQ head update"
Pre-incrementing ->cq_head can't be done in memory because OOB value
can be observed by another context.

This devalues space savings compared to original code :-\

	$ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux
	add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-32 (-32)
	Function                                     old     new   delta
	nvme_poll_irqdisable                         464     456      -8
	nvme_poll                                    455     447      -8
	nvme_irq                                     388     380      -8
	nvme_dev_disable                             955     947      -8

But the code is minimal now: one read for head, one read for q_depth,
one increment, one comparison, single instruction phase bit update and
one write for new head.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: John Garry <john.garry@huawei.com>
Tested-by: John Garry <john.garry@huawei.com>
Fixes: e2a366a4b0 ("nvme-pci: slimmer CQ head update")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:58 -06:00
Christoph Hellwig 6bd87eec23 bdi: add a ->dev_name field to struct backing_dev_info
Cache a copy of the name for the life time of the backing_dev_info
structure so that we can reference it even after unregistering.

Fixes: 68f23b8906 ("memcg: fix a crash in wb_workfn when a device disappears")
Reported-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:57 -06:00
Yufen Yu d51cfc53ad bdi: use bdi_dev_name() to get device name
Use the common interface bdi_dev_name() to get device name.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>

Add missing <linux/backing-dev.h> include BFQ

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:39 -06:00
Linus Torvalds 44720996e2 gcc-10: disable 'array-bounds' warning for now
This is another fine warning, related to the 'zero-length-bounds' one,
but hitting the same historical code in the kernel.

Because C didn't historically support flexible array members, we have
code that instead uses a one-sized array, the same way we have cases of
zero-sized arrays.

The one-sized arrays come from either not wanting to use the gcc
zero-sized array extension, or from a slight convenience-feature, where
particularly for strings, the size of the structure now includes the
allocation for the final NUL character.

So with a "char name[1];" at the end of a structure, you can do things
like

       v = my_malloc(sizeof(struct vendor) + strlen(name));

and avoid the "+1" for the terminator.

Yes, the modern way to do that is with a flexible array, and using
'offsetof()' instead of 'sizeof()', and adding the "+1" by hand.  That
also technically gets the size "more correct" in that it avoids any
alignment (and thus padding) issues, but this is another long-term
cleanup thing that will not happen for 5.7.

So disable the warning for now, even though it's potentially quite
useful.  Having a slew of warnings that then hide more urgent new issues
is not an improvement.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 14:52:44 -07:00
Linus Torvalds 5c45de21a2 gcc-10: disable 'zero-length-bounds' warning for now
This is a fine warning, but we still have a number of zero-length arrays
in the kernel that come from the traditional gcc extension.  Yes, they
are getting converted to flexible arrays, but in the meantime the gcc-10
warning about zero-length bounds is very verbose, and is hiding other
issues.

I missed one actual build failure because it was hidden among hundreds
of lines of warning.  Thankfully I caught it on the second go before
pushing things out, but it convinced me that I really need to disable
the new warnings for now.

We'll hopefully be all done with our conversion to flexible arrays in
the not too distant future, and we can then re-enable this warning.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 14:30:29 -07:00
Linus Torvalds 78a5255ffb Stop the ad-hoc games with -Wno-maybe-initialized
We have some rather random rules about when we accept the
"maybe-initialized" warnings, and when we don't.

For example, we consider it unreliable for gcc versions < 4.9, but also
if -O3 is enabled, or if optimizing for size.  And then various kernel
config options disabled it, because they know that they trigger that
warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES).

And now gcc-10 seems to be introducing a lot of those warnings too, so
it falls under the same heading as 4.9 did.

At the same time, we have a very straightforward way to _enable_ that
warning when wanted: use "W=2" to enable more warnings.

So stop playing these ad-hoc games, and just disable that warning by
default, with the known and straight-forward "if you want to work on the
extra compiler warnings, use W=123".

Would it be great to have code that is always so obvious that it never
confuses the compiler whether a variable is used initialized or not?
Yes, it would.  In a perfect world, the compilers would be smarter, and
our source code would be simpler.

That's currently not the world we live in, though.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 13:57:10 -07:00
Linus Torvalds 1d3962ae3b io_uring-5.7-2020-05-08
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl62HvYQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgptEAEACbuLfgFok0Vw8j7KNW0WNNKlS2o6nXQlW5
 cl95JsqYdSL+toiDPQnJFtdoaxMhzL90kbWZzvPTBj+yTpLzRX0YnwFqXwFfmrga
 gd/7SOM5C97F1LCPL+luhbgp5HUq+ZVH882KjMiOVLvjjAb4SeKSexQGoxeKvtcV
 Pg3xm+zsbKKvclRDEqhnZB1X93WFAIrufuKBuV5xMZar7lkeRS9zwBUHySXa00xF
 i7lbvDqtNn3itgNQd7VGSNCF5u4JxCUm73SumY3nDMFXBfvSNk0nUpFBpTYLjb7G
 0XY71tfWrBlbk1sssqr1Dbs+pRuxJRj9FgtfNAMid7gcK0L9k6n7v08cFxkIz4Sv
 XPHisD6QCOz7pZ5JwfdAp9Ea5g9z+QsN0G1Owr18fSgWwlgvhJ9rdd4H0Of7rWVj
 mGyF5f+ZqoLD2UhaEmLgjQoSvzPlb6rsAUL9SxgpZkg/mk5l0j5tk32JS5bJL8h5
 RTj0oeyqoVGKqnRy8heV/0z6TqcEtuNn/nOsht8adCgIUVpk95bkjTGBM900IK/X
 HhdJMqPlTEDXQic+ZxVYNHDTZFhq4UOVJkoDfEwIN971LZfUaiz8XZ6uG5m4rFqj
 iRmLN5XJNVNK52hNT1dLQyeQ4j3a5OnVGsvjZ33QLy2P6rCZd7yU6jKfsoL8JDEU
 uAzkaWqLjA==
 =YeXV
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-5.7-2020-05-08' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:

 - Fix finish_wait() balancing in file cancelation (Xiaoguang)

 - Ensure early cleanup of resources in ring map failure (Xiaoguang)

 - Ensure IORING_OP_SLICE does the right file mode checks (Pavel)

 - Remove file opening from openat/openat2/statx, it's not needed and
   messes with O_PATH

* tag 'io_uring-5.7-2020-05-08' of git://git.kernel.dk/linux-block:
  io_uring: don't use 'fd' for openat/openat2/statx
  splice: move f_mode checks to do_{splice,tee}()
  io_uring: handle -EFAULT properly in io_uring_setup()
  io_uring: fix mismatched finish_wait() calls in io_uring_cancel_files()
2020-05-09 12:02:09 -07:00
Linus Torvalds d5eeab8d7e SCSI fixes on 20200508
Four minor fixes, all in drivers (qla2xxx, ibmvfc, ibmvscsi)
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXrWUpiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishaQbAQCus5pe
 D+e8jb8VzwbT+tr6HgPvaUOoSgrBJpXHy1oVFAEAwWuT9h4yHU8rsis5UR1MMh8F
 aTW9+wWDpdnTqoZN3iY=
 =fl/G
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Four minor fixes, all in drivers (qla2xxx, ibmvfc, ibmvscsi)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ibmvscsi: Fix WARN_ON during event pool release
  scsi: ibmvfc: Don't send implicit logouts prior to NPIV login
  scsi: qla2xxx: Delete all sessions before unregister local nvme port
  scsi: qla2xxx: Fix hang when issuing nvme disconnect-all in NPIV
2020-05-08 10:36:56 -07:00
Linus Torvalds eb24fdd8e6 Fixes for an endianness handling bug that prevented mounts on
big-endian arches, a spammy log message and a couple error paths.
 Also included a MAINTAINERS update.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAl61ktUTHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi3yKB/9s0kZ7fLYtGzqtuoIjualsaM0lsBBS
 rWAN4BkIVsxp3eOd5Hdb+ngIY5ykLLcUd+4gKqUNHkB7/1upDq9ZURKlyTwel5Wy
 889YEYESCVQQxPVY9KNvafaPeuR++2r9Thlp9hWyczrtvXtz80sFIrtO9TwDrj1P
 ZXPN3lxppGlxQiVNQfKIw2Cs78OxaNu9BthXZ7jN2OGaMQ0NU6sZ4LRXz8rbY+od
 AbfLEfwz4dPHQ/44k3rQg2IWNuOxRK+CNayxhuN0KWzock3MzGVYoYkPx0wNLiDx
 rntMscBqh3kppILZPEIeIA5Nv0yDAf4tf2hcUDf7GoJT/L/f9v7Q2SHa
 =75Ca
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.7-rc5' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "Fixes for an endianness handling bug that prevented mounts on
  big-endian arches, a spammy log message and a couple error paths.

  Also included a MAINTAINERS update"

* tag 'ceph-for-5.7-rc5' of git://github.com/ceph/ceph-client:
  ceph: demote quotarealm lookup warning to a debug message
  MAINTAINERS: remove myself as ceph co-maintainer
  ceph: fix double unlock in handle_cap_export()
  ceph: fix special error code in ceph_try_get_caps()
  ceph: fix endianness bug when handling MDS session feature bits
2020-05-08 10:27:00 -07:00
Luis Henriques 12ae44a40a ceph: demote quotarealm lookup warning to a debug message
A misconfigured cephx can easily result in having the kernel client
flooding the logs with:

  ceph: Can't lookup inode 1 (err: -13)

Change this message to debug level.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/44546
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-05-08 18:44:40 +02:00
Linus Torvalds 4334f30ebf Char/Misc driver fixes for 5.7-rc5
Here are some small driver fixes for 5.7-rc5 that resolve a number of
 minor reported issues:
 	- mhi bus driver fixes found as people actually use the code
 	- phy driver fixes and compat string additions
 	- most driver fix due to link order changing when the core moved
 	  out of staging
 	- mei driver fix
 	- interconnect build warning fix
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXrVocw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynQ3wCaA6rSYBcPTAHNrGo0iuzan0sbAfsAnAkfjZPk
 s369btDipPKIBv2hoVwt
 =KzSN
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small driver fixes for 5.7-rc5 that resolve a number of
  minor reported issues:

   - mhi bus driver fixes found as people actually use the code

   - phy driver fixes and compat string additions

   - most driver fix due to link order changing when the core moved out
     of staging

   - mei driver fix

   - interconnect build warning fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  bus: mhi: core: Fix channel device name conflict
  bus: mhi: core: Fix typo in comment
  bus: mhi: core: Offload register accesses to the controller
  bus: mhi: core: Remove link_status() callback
  bus: mhi: core: Make sure to powerdown if mhi_sync_power_up fails
  bus: mhi: Fix parsing of mhi_flags
  mei: me: disable mei interface on LBG servers.
  phy: qualcomm: usb-hs-28nm: Prepare clocks in init
  MAINTAINERS: Add Vinod Koul as Generic PHY co-maintainer
  interconnect: qcom: Move the static keyword to the front of declaration
  most: core: use function subsys_initcall()
  bus: mhi: core: Fix a NULL vs IS_ERR check in mhi_create_devices()
  phy: qcom-qusb2: Re add "qcom,sdm845-qusb2-phy" compat string
  phy: tegra: Select USB_COMMON for usb_get_maximum_speed()
2020-05-08 09:11:53 -07:00
Linus Torvalds c61529f6f5 Driver core fixes for 5.7-rc5
Here are a number of small driver core fixes for 5.7-rc5 to resolve a
 bunch of reported issues with the current tree.
 
 Biggest here are the reverts and patches from John Stultz to resolve a
 bunch of deferred probe regressions we have been seeing in 5.7-rc right
 now.
 
 Along with those are some other smaller fixes:
 	- coredump crash fix
 	- devlink fix for when permissive mode was enabled
 	- amba and platform device dma_parms fixes
 	- component error silenced for when deferred probe happens
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXrVnyg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylWBgCfbwjUbsDsHsrsVgWfOakIaoPUQ8IAmwetMKvS
 ny1Kq7Cia+2y2e+7fDyo
 =UKEM
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are a number of small driver core fixes for 5.7-rc5 to resolve a
  bunch of reported issues with the current tree.

  Biggest here are the reverts and patches from John Stultz to resolve a
  bunch of deferred probe regressions we have been seeing in 5.7-rc
  right now.

  Along with those are some other smaller fixes:

   - coredump crash fix

   - devlink fix for when permissive mode was enabled

   - amba and platform device dma_parms fixes

   - component error silenced for when deferred probe happens

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  regulator: Revert "Use driver_deferred_probe_timeout for regulator_init_complete_work"
  driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires
  driver core: Use dev_warn() instead of dev_WARN() for deferred_probe_timeout warnings
  driver core: Revert default driver_deferred_probe_timeout value to 0
  component: Silence bind error on -EPROBE_DEFER
  driver core: Fix handling of fw_devlink=permissive
  coredump: fix crash when umh is disabled
  amba: Initialize dma_parms for amba devices
  driver core: platform: Initialize dma_parms for platform devices
2020-05-08 09:06:34 -07:00
Linus Torvalds e7a1c733fe Staging driver fixes for 5.7-rc5
Here are 3 small driver fixes for 5.7-rc5.
 
 Two of these are documentation fixes:
 	- MAINTAINERS update due to removed driver
 	- removing Wolfram from the ks7010 driver TODO file
 The other patch is a real fix:
 	- fix gasket driver to proper check the return value of a call
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXrVm5w8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykXOACfXDClXrIByeSfMmVMbU/Kvi4Rh1YAoMV2tNh1
 VOTuOopJJvjlp2tINkgL
 =KSsk
 -----END PGP SIGNATURE-----

Merge tag 'staging-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
 "Here are three small driver fixes for 5.7-rc5.

  Two of these are documentation fixes:

   - MAINTAINERS update due to removed driver

   - removing Wolfram from the ks7010 driver TODO file

  The other patch is a real fix:

   - fix gasket driver to proper check the return value of a call

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'staging-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: gasket: Check the return value of gasket_get_bar_index()
  staging: ks7010: remove me from CC list
  MAINTAINERS: remove entry after hp100 driver removal
2020-05-08 09:03:49 -07:00
Linus Torvalds cbd0e48213 TTY/Serial fixes for 5.7-rc5
Here are 3 small TTY/Serial/VT fixes for 5.7-rc5:
 	- revert for the bcm63xx driver "fix" that was incorrect
 	- vt unicode console bugfix
 	- xilinx_uartps console driver fix
 
 All of these have been in linux next with no reported issues
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXrVmRg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymY+ACfelBeBAxlYjuvZ8QpDYSkR9fl8EIAoKeuJocX
 TaXtUFCvCSax68siL81w
 =L0Rp
 -----END PGP SIGNATURE-----

Merge tag 'tty-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are three small TTY/Serial/VT fixes for 5.7-rc5:

   - revert for the bcm63xx driver "fix" that was incorrect

   - vt unicode console bugfix

   - xilinx_uartps console driver fix

  All of these have been in linux next with no reported issues"

* tag 'tty-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: xilinx_uartps: Fix missing id assignment to the console
  vt: fix unicode console freeing with a common interface
  Revert "tty: serial: bcm63xx: fix missing clk_put() in bcm63xx_uart"
2020-05-08 08:56:16 -07:00
Linus Torvalds 0a0b96b2e2 USB fixes for 5.7-rc5
Here are some small USB fixes for 5.7-rc5 to resolve some reported
 issues:
 	- syzbot found problems fixed
 	- usbfs dma mapping fix
 	- typec bugfixs
 	- chipidea bugfix
 	- usb4/thunderbolt fix
 	- new device ids/quirks
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXrVlxQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynyPwCgtPF0qX7DbP3RwhPGoy3YCPNlsXMAoJT2T+CH
 6MuNazbkAv6GcqAW/50i
 =Viyi
 -----END PGP SIGNATURE-----

Merge tag 'usb-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes for 5.7-rc5 to resolve some reported
  issues:

   - syzbot found problems fixed

   - usbfs dma mapping fix

   - typec bugfixs

   - chipidea bugfix

   - usb4/thunderbolt fix

   - new device ids/quirks

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: chipidea: msm: Ensure proper controller reset using role switch API
  usb: typec: mux: intel: Handle alt mode HPD_HIGH
  usb: usbfs: correct kernel->user page attribute mismatch
  usb: typec: intel_pmc_mux: Fix the property names
  USB: core: Fix misleading driver bug report
  USB: serial: qcserial: Add DW5816e support
  USB: uas: add quirk for LaCie 2Big Quadra
  thunderbolt: Check return value of tb_sw_read() in usb4_switch_op()
  USB: serial: garmin_gps: add sanity checking for data length
2020-05-08 08:54:00 -07:00
Linus Torvalds 775a8e0316 drm fixes for 5.7-rc5
hdcp:
 - fix HDCP regression
 
 amdgpu:
 - Runtime PM fixes
 - DC fix for PPC
 - Misc DC fixes
 
 virtio:
 - fix context ordering issue
 
 sun4i:
 - old gcc warning fix
 
 ingenic-drm:
 - missing module support
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJetOwdAAoJEAx081l5xIa+7ewQAJTkOQL4mT0mmPBGKgMd8vDL
 vf0rdVIFcdU4cktP8aM+CDawTqazpnliQF7+JwFy/JAqlyGRxhTRO7OGAYKaqG4z
 bGG5PpZl17MvzfHSJjpPnherkBF8Afx55hcP0SH1yv6mShEtVNG3PWZijP/rMtam
 lSJNSHwPR9OgB4ikX2Ra6aOkdL7BOJjnO57yrLhsiKsOCpCzVflN2lG5LrlGrckJ
 AxbKOAKff/CBphzVpjNUPg2/6++KvixlZGKX+vVCrzyWdOgszuwJGApRh0VwlTU+
 rlr+wEjIcEEt0TqnLyTUdUJ7ddsJulARkojwkh4eD6fYeJ6n9h60PDV1mQZgR0as
 XnsmOMkfr/dX2RTGGMdEjlZ9gep5YB02VZ2JRSCMha1oN+kcceomNxo0KXMjgR2o
 erZhuIcitwC8N30pmpZPcrnGpEkhWsmxeJxigEDmTFegi6ZOKq3g45xCnR4GHash
 Hz78dZoauIbmMeKxp5VeG238GKyErnFXjsXEwaLhw3GOBDkkHtn3TFVHbiLGdRnR
 G4VQ3nU3SpH3dEGKCNrApBet/CYIk8tmR1rznqQuCnezz5YttekKpIaK481JbX7C
 dtSMy5St9SHAyW8uaWPGZoz80+mn4Best0x+2bY2rVLV092pE09ltojU2XAwTzmW
 00b8KizTlFigmV44Y6K1
 =0Axh
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2020-05-08' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Another pretty normal week. I didn't get any i915 fixes yet, so next
  week I'd expect double the usual i915, but otherwise a bunch of amdgpu
  and some scattered other fixes.

  hdcp:
   - fix HDCP regression

  amdgpu:
   - Runtime PM fixes
   - DC fix for PPC
   - Misc DC fixes

  virtio:
   - fix context ordering issue

  sun4i:
   - old gcc warning fix

  ingenic-drm:
   - missing module support"

* tag 'drm-fixes-2020-05-08' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/display: Prevent dpcd reads with passive dongles
  drm/amd/display: fix counter in wait_for_no_pipes_pending
  drm/amd/display: Update DCN2.1 DV Code Revision
  drm: Fix HDCP failures when SRM fw is missing
  sun6i: dsi: fix gcc-4.8
  drm: ingenic-drm: add MODULE_DEVICE_TABLE
  drm/virtio: create context before RESOURCE_CREATE_2D in 3D mode
  drm/amd/display: work around fp code being emitted outside of DC_FP_START/END
  drm/amdgpu/dc: Use WARN_ON_ONCE for ASSERT
  drm/amdgpu: drop redundant cg/pg ungate on runpm enter
  drm/amdgpu: move kfd suspend after ip_suspend_phase1
2020-05-08 08:49:34 -07:00
Linus Torvalds af38553c66 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "14 fixes and one selftest to verify the ipc fixes herein"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: limit boost_watermark on small zones
  ubsan: disable UBSAN_ALIGNMENT under COMPILE_TEST
  mm/vmscan: remove unnecessary argument description of isolate_lru_pages()
  epoll: atomically remove wait entry on wake up
  kselftests: introduce new epoll60 testcase for catching lost wakeups
  percpu: make pcpu_alloc() aware of current gfp context
  mm/slub: fix incorrect interpretation of s->offset
  scripts/gdb: repair rb_first() and rb_last()
  eventpoll: fix missing wakeup for ovflist in ep_poll_callback
  arch/x86/kvm/svm/sev.c: change flag passed to GUP fast in sev_pin_memory()
  scripts/decodecode: fix trapping instruction formatting
  kernel/kcov.c: fix typos in kcov_remote_start documentation
  mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
  mm, memcg: fix error return value of mem_cgroup_css_alloc()
  ipc/mqueue.c: change __do_notify() to bypass check_kill_permission()
2020-05-08 08:41:09 -07:00
Julia Lawall fb3637a113 iommu/virtio: Reverse arguments to list_add
Elsewhere in the file, there is a list_for_each_entry with
&vdev->resv_regions as the second argument, suggesting that
&vdev->resv_regions is the list head.  So exchange the
arguments on the list_add call to put the list head in the
second argument.

Fixes: 2a5a314874 ("iommu/virtio: Add probe request")
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/1588704467-13431-1-git-send-email-Julia.Lawall@inria.fr
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-08 17:31:18 +02:00
Dave Airlie a9fe6f18cd A few minor fixes for an ordering issue in virtio, an (old) gcc warning
in sun4i, a probe issue in ingenic-drm and a regression in the HDCP
 support.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXrQvBQAKCRDj7w1vZxhR
 xQuSAQCvccp3LESycSTuQU0GFlh+flhb8lBZJkfjr2RC6SUggAD/ZmHsHdYIsMNq
 PT7BmulDo9oRn1aHGzNY43K9U9W4Rgw=
 =uSqQ
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2020-05-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

A few minor fixes for an ordering issue in virtio, an (old) gcc warning
in sun4i, a probe issue in ingenic-drm and a regression in the HDCP
support.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507160130.id64niqgf5wsha4u@gilmour.lan
2020-05-08 15:04:25 +10:00
Dave Airlie c61b0b97ef Merge tag 'amd-drm-fixes-5.7-2020-05-06' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.7-2020-05-06:

amdgpu:
- Runtime PM fixes
- DC fix for PPC
- Misc DC fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200506212257.3893-1-alexander.deucher@amd.com
2020-05-08 13:31:39 +10:00
Linus Torvalds 79dede78c0 Merge branch 'for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem fix from James Morris:
 "Fix the default value of fs_context_parse_param hook"

* 'for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  security: Fix the default value of fs_context_parse_param hook
2020-05-07 19:43:13 -07:00
Henry Willard 14f69140ff mm: limit boost_watermark on small zones
Commit 1c30844d2d ("mm: reclaim small amounts of memory when an
external fragmentation event occurs") adds a boost_watermark() function
which increases the min watermark in a zone by at least
pageblock_nr_pages or the number of pages in a page block.

On Arm64, with 64K pages and 512M huge pages, this is 8192 pages or
512M.  It does this regardless of the number of managed pages managed in
the zone or the likelihood of success.

This can put the zone immediately under water in terms of allocating
pages from the zone, and can cause a small machine to fail immediately
due to OoM.  Unlike set_recommended_min_free_kbytes(), which
substantially increases min_free_kbytes and is tied to THP,
boost_watermark() can be called even if THP is not active.

The problem is most likely to appear on architectures such as Arm64
where pageblock_nr_pages is very large.

It is desirable to run the kdump capture kernel in as small a space as
possible to avoid wasting memory.  In some architectures, such as Arm64,
there are restrictions on where the capture kernel can run, and
therefore, the space available.  A capture kernel running in 768M can
fail due to OoM immediately after boost_watermark() sets the min in zone
DMA32, where most of the memory is, to 512M.  It fails even though there
is over 500M of free memory.  With boost_watermark() suppressed, the
capture kernel can run successfully in 448M.

This patch limits boost_watermark() to boosting a zone's min watermark
only when there are enough pages that the boost will produce positive
results.  In this case that is estimated to be four times as many pages
as pageblock_nr_pages.

Mel said:

: There is no harm in marking it stable.  Clearly it does not happen very
: often but it's not impossible.  32-bit x86 is a lot less common now
: which would previously have been vulnerable to triggering this easily.
: ppc64 has a larger base page size but typically only has one zone.
: arm64 is likely the most vulnerable, particularly when CMA is
: configured with a small movable zone.

Fixes: 1c30844d2d ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
Signed-off-by: Henry Willard <henry.willard@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1588294148-6586-1-git-send-email-henry.willard@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-07 19:27:21 -07:00