Commit e5a03bfd87 ("phy: Add an mdio_device structure")
introduced a spurious trailing semicolon. Remove it.
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5c0338c687 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") automatically enabled ordered attribute for unbound
workqueues w/ max_active == 1. Because ordered workqueues reject
max_active and some attribute changes, this implicit ordered mode
broke cases where the user creates an unbound workqueue w/ max_active
== 1 and later explicitly changes the related attributes.
This patch distinguishes explicit and implicit ordered setting and
overrides from attribute changes if implict.
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 5c0338c687 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
Paul Moore reported a SELinux/IP_PASSSEC regression
caused by missing skb->sp at recvmsg() time. We need to
preserve the skb head state to process the IP_CMSG_PASSSEC
cmsg.
With this commit we avoid releasing the skb head state in the
BH even if a secpath is attached to the current skb, and stores
the skb status (with/without head states) in the scratch area,
so that we can access it at skb deallocation time, without
incurring in cache-miss penalties.
This also avoids misusing the skb CB for ipv6 packets,
as introduced by the commit 0ddf3fb2c4 ("udp: preserve
skb->dst if required for IP options processing").
Clean a bit the scratch area helpers implementation, to
reduce the code differences between 32 and 64 bits build.
Reported-by: Paul Moore <paul@paul-moore.com>
Fixes: 0a463c78d2 ("udp: avoid a cache miss on dequeue")
Fixes: 0ddf3fb2c4 ("udp: preserve skb->dst if required for IP options processing")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The FC-NVME spec hasn't locked down on the format string for TRADDR.
Currently the spec is lobbying for "nn-<16hexdigits>:pn-<16hexdigits>"
where the wwn's are hex values but not prefixed by 0x.
Most implementations so far expect a string format of
"nn-0x<16hexdigits>:pn-0x<16hexdigits>" to be used. The transport
uses the match_u64 parser which requires a leading 0x prefix to set
the base properly. If it's not there, a match will either fail or return
a base 10 value.
The resolution in T11 is pushing out. Therefore, to fix things now and
to cover any eventuality and any implementations already in the field,
this patch adds support for both formats.
The change consists of replacing the token matching routine with a
routine that validates the fixed string format, and then builds
a local copy of the hex name with a 0x prefix before calling
the system parser.
Note: the same parser routine exists in both the initiator and target
transports. Given this is about the only "shared" item, we chose to
replicate rather than create an interdendency on some shared code.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fabrics commands with opcode 0x7F use the fctype field to indicate data
direction.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Sagi Grimberg <sai@grmberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fixes: eb793e2c ("nvme.h: add NVMe over Fabrics definitions")
kvm_pmu_overflow_set() is called from perf's interrupt handler,
making the call of kvm_vgic_inject_irq() from it introduced with
"KVM: arm/arm64: PMU: remove request-less vcpu kick" a really bad
idea, as it's quite easy to try and retake a lock that the
interrupted context is already holding. The fix is to use a vcpu
kick, leaving the interrupt injection to kvm_pmu_sync_hwstate(),
like it was doing before the refactoring. We don't just revert,
though, because before the kick was request-less, leaving the vcpu
exposed to the request-less vcpu kick race, and also because the
kick was used unnecessarily from register access handlers.
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
A couple of kerneldoc comments in <linux/wait.h> had incorrect names for
macro parameters, with this unsightly result:
./include/linux/wait.h:555: warning: No description found for parameter 'wq'
./include/linux/wait.h:555: warning: Excess function parameter 'wq_head' description in 'wait_event_interruptible_hrtimeout'
./include/linux/wait.h:759: warning: No description found for parameter 'wq_head'
./include/linux/wait.h:759: warning: Excess function parameter 'wq' description in 'wait_event_killable'
Correct the comments and kill the warnings.
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/20170724135800.769c4042@lwn.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The last (4th) argument of tcp_rcv_established() is redundant as it
always equals to skb->len and the skb itself is always passed as 2th
agrument. There is no reason to have it.
Signed-off-by: Ilya V. Matveychikov <matvejchikov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQIVAwUAWXJGrPSw1s6N8H32AQK/5A/5AcdMLnejfV4r4qQvqKmv/M3w6Lt9P1qY
sQJyWUhUPATptttj0rdtkHh9n5exOkBpPE/pBrwxoSXe0rm8oa1xO96UWsdDQWn1
DwILipqyTQ9HHNESoY9XaBpPy7bTNRVUXOcVTXLqVuSozkiZgINic4uq/q8pVonB
NRUULZPdcxmETUhZyBzloV2afY1pv287Rz5vRm8PUnRZmVK26lHylFi75Eywblju
nw3N+McPe846Tc5qIFyj3b9VdMtzFA/py7GkrWPeHRmVHdZOviH9rQ++KkiBCUAz
hQl/YaSKCGbTL9KU/B4E2dz3VnL48p3AVxQusCA5BExOO+HIDiCrenti3JMpEXLN
gt29rD4AEyxBYbocJHpXNRxARxzDmBAmaw4tRC1Aw57MXomV5uMm/jKH/f646sYe
S7ohnngaeWRwMa4JfxgNdf+NEenUwm/06tTSYrwYynWpjJDanI0xQDLgBYKR8SYp
YoYLAv1tduMXcX7JjSWq2lPn6WvDnSZzRWOpJPHeFJcaGEcaYer5Qw8Yr/ZgOVxm
0xz3wgZtckIfi2d6NcSybEvIPv5jI2BLhxqgpvxxfW95NdDXcsyQCycugX69Jg3o
Zar4qJSFCBtC86KDOkL008X7fv/I27yb+nm5EcerC8stO6GymqfZVo6f9wEIhb6W
P+rtLI3zYco=
=8ZR+
-----END PGP SIGNATURE-----
Merge tag 'rxrpc-rewrite-20170721' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Rearrange headers
Here's a pair of patches that rearrange some of the AF_RXRPC header files
that are outside of the net/rxrpc/ directory:
(1) The bits userspace need are moved to uapi/linux/rxrpc.h. [Should this
be af_rxrpc.h instead, I wonder - but there doesn't seem to be
precedent for that in the other net UAPI headers.]
(2) For the most part, the contents of rxrpc/packet.h are no longer used
outside of the AF_RXRPC module, so move them to net/rxrpc/protocol.h
with the exception of the standard abort codes which are exposed to
userspace when an abort occurs and the security index values which are
needed when constructing keys.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_abort_chunk_t, and
replace with struct sctp_abort_chunk in the places where it's
using this typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_heartbeat_chunk_t, and
replace with struct sctp_heartbeat_chunk in the places where it's
using this typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_heartbeathdr_t, and
replace with struct sctp_heartbeathdr in the places where it's
using this typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_sack_chunk_t, and
replace with struct sctp_sack_chunk in the places where it's
using this typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_sackhdr_t, and replace
with struct sctp_sackhdr in the places where it's using this
typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_sack_variable_t, and
replace with union sctp_sack_variable in the places where it's
using this typedef.
It is also to fix some indents in sctp_acked().
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_dup_tsn_t, and replace
with __be32 in the places where it's using this typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_gap_ack_block_t, and
replace with struct sctp_gap_ack_block in the places where it's
using this typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_unrecognized_param_t, and
replace with struct sctp_unrecognized_param in the places where it's
using this typedef.
It is also to fix some indents in sctp_sf_do_unexpected_init() and
sctp_sf_do_5_1B_init().
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_cookie_param_t, and
replace with struct sctp_cookie_param in the places where it's
using this typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_initack_chunk_t, and
replace with struct sctp_initack_chunk in the places where it's
using this typedef.
It is also to use sizeof(variable) instead of sizeof(type).
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a new NETDEV_UDP_TUNNEL_DROP_INFO event, similar to
NETDEV_UDP_TUNNEL_PUSH_INFO, to signal to un-offload ports.
This also adds udp_tunnel_drop_rx_port(), which calls
ndo_udp_tunnel_del.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a new netdevice feature, so that the offloading of RX port for
UDP tunnels can be disabled by the administrator on some netdevices,
using the "rx-udp_tunnel-port-offload" feature in ethtool.
This feature is set for all devices that provide ndo_udp_tunnel_add.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now if a file includes acpi_numa.h and they don't happen to include
linux/numa.h before it, they get the following warning:
./include/acpi/acpi_numa.h:9:5: warning: "MAX_NUMNODES" is not defined [-Wundef]
#if MAX_NUMNODES > 256
^~~~~~~~~~~~
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When setting up the Xenstore watch for the memory target size the new
watch will fire at once. Don't try to reach the configured target size
by onlining new memory in this case, as the current memory size will
be smaller in almost all cases due to e.g. BIOS reserved pages.
Onlining new memory will lead to more problems e.g. undesired conflicts
with NVMe devices meant to be operated as block devices.
Instead remember the difference between target size and current size
when the watch fires for the first time and apply it to any further
size changes, too.
In order to avoid races between balloon.c and xen-balloon.c init calls
do the xen-balloon.c initialization from balloon.c.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Here are some small tty and serial driver fixes for 4.13-rc2. Nothing
huge at all, a revert of a patch that turned out to break things, a fix
up for a new tty ioctl we added in 4.13-rc1 to get the uapi definition
correct, and a few minor serial driver fixes for reported issues.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWXMebA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yn9MgCdGsIEQlTvTeG4QikxUPB7dkEJQacAoLWOKJzQ
A65mBAeOfiJztczi8uo4
=KEXp
-----END PGP SIGNATURE-----
Merge tag 'tty-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some small tty and serial driver fixes for 4.13-rc2. Nothing
huge at all, a revert of a patch that turned out to break things, a
fix up for a new tty ioctl we added in 4.13-rc1 to get the uapi
definition correct, and a few minor serial driver fixes for reported
issues.
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: Fix TIOCGPTPEER ioctl definition
tty: hide unused pty_get_peer function
tty: serial: lpuart: Fix the logic for detecting the 32-bit type UART
serial: imx: Prevent TX buffer PIO write when a DMA has been started
Revert "serial: imx-serial - move DMA buffer configuration to DT"
serial: sh-sci: Uninitialized variables in sysfs files
serial: st-asc: Potential error pointer dereference
Here are some small USB fixes for 4.13-rc2.
The usual batch, gadget fixes for reported issues, as well as xhci
fixes, and a small random collection of other fixes for reported issues.
All have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWXMcIA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynwPwCguVnh/HQMcPyJahg0OQsWSA9lROYAoI+45qMv
bCrf83L1IWkvAcSd8BR2
=YVGJ
-----END PGP SIGNATURE-----
Merge tag 'usb-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 4.13-rc2.
The usual batch, gadget fixes for reported issues, as well as xhci
fixes, and a small random collection of other fixes for reported
issues.
All have been in linux-next with no reported issues"
* tag 'usb-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
xhci: fix memleak in xhci_run()
usb: xhci: fix spinlock recursion for USB2 test mode
xhci: fix 20000ms port resume timeout
usb: xhci: Issue stop EP command only when the EP state is running
xhci: Bad Ethernet performance plugged in ASM1042A host
xhci: Fix NULL pointer dereference when cleaning up streams for removed host
usb: renesas_usbhs: gadget: disable all eps when the driver stops
usb: renesas_usbhs: fix usbhsc_resume() for !USBHSF_RUNTIME_PWCTRL
usb: gadget: udc: renesas_usb3: protect usb3_ep->started in usb3_start_pipen()
usb: gadget: udc: renesas_usb3: fix zlp transfer by the dmac
usb: gadget: udc: renesas_usb3: fix free size in renesas_usb3_dma_free_prd()
usb: gadget: f_uac2: endianness fixes.
usb: gadget: f_uac1: endianness fixes.
include: usb: audio: specify exact endiannes of descriptors
usb: gadget: udc: start_udc() can be static
usb: dwc2: gadget: On USB RESET reset device address to zero
usb: storage: return on error to avoid a null pointer dereference
usb: typec: include linux/device.h in ucsi.h
USB: cdc-acm: add device-id for quirky printer
usb: dwc3: gadget: only unmap requests from DMA if mapped
...
Pull block fixes from Jens Axboe:
"A small set of fixes for -rc2 - two fixes for BFQ, documentation and
code, and a removal of an unused variable in nbd. Outside of that, a
small collection of fixes from the usual crew on the nvme side"
* 'for-linus' of git://git.kernel.dk/linux-block:
nvmet: don't report 0-bytes in serial number
nvmet: preserve controller serial number between reboots
nvmet: Move serial number from controller to subsystem
nvmet: prefix version configfs file with attr
nvme-pci: Fix an error handling path in 'nvme_probe()'
nvme-pci: Remove nvme_setup_prps BUG_ON
nvme-pci: add another device ID with stripe quirk
nvmet-fc: fix byte swapping in nvmet_fc_ls_create_association
nvme: fix byte swapping in the streams code
nbd: kill unused ret in recv_work
bfq: dispatch request to prevent queue stalling after the request completion
bfq: fix typos in comments about B-WF2Q+ algorithm
- i40iw fixes
- bnxt_re fixes
- Dan Carpenter bugfixes across stack
- 10 more random fixes, no more than 2 from any 1 person
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZcPDWAAoJELgmozMOVy/df2AQAJJj6rJGqzBWtqpAZyZh1hAp
hF5txSFj52JDNWfYxqqohMIjFElVxIbmC2BExiahxMS8KZ+u6DsMZbRCkilTdfMb
v9We+KC+XsKdPM8SUQ5kmOXmP+iA6gJrnXMkfdECIZk2Q2BQdrbgo8Ah4SK1phlO
9HpzQlc9a/46bs6Hao58n7kOsENpN7R9bHEHpwFbmsUVqiZ4Y3SRA7B8dLqC5YGk
375IbJd3C7UaZOEJu4o9wdWdMGErziTP8pI5cikdQp3DS7MFfnwpPvsbTrpXqvCk
Kw+33j2AwLKoq55msTfodM7eosgWOoRjRGB6Wa4udmSsC9ZFE2RwttvGeqbLPhLd
Tv4cxk+vh++5Xhaz6Y1u6OqQfAPJLSChwphKAY3telhlOid99jDrqE24hDi2XSXF
m0pFZI7T6dfI66sS16kGuSCe1nz+QqxwjC93sIPOqDGqJgB+vSBXmcl+GWFkfZ8d
UvuqqpcBwLS6i5nH+TI7MQ9rqe0c+tD2jkRFdTKffpfJ3u8RyOVXO2HYgN203kjk
dKxJ5FX+cZUHQrs9tbJ3ARtBRqEibjWF0UasEEvQv2ORUUGk7tAhL46HJZpI5O8z
uKYQGSCfBsZ6fN1+gEbM9RZLLJ6aDbJmVIOyM+ll8hM5yemZ49D5dsByO4Rhx2ay
ZkfR57LbyoVAn+R/jylR
=3zwC
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull more rdma fixes from Doug Ledford:
"As per my previous pull request, there were two drivers that each had
a rather large number of legitimate fixes still to be sent.
As it turned out, I also missed a reasonably large set of fixes from
one person across the stack that are all important fixes. All in all,
the bnxt_re, i40iw, and Dan Carpenter are 3/4 to 2/3rds of this pull
request.
There were some other random fixes that I didn't send in the last pull
request that I added to this one. This catches the rdma stack up to
the fixes from up to about the beginning of this week. Any more fixes
I'll wait and batch up later in the -rc cycle. This will give us a
good base to start with for basing a for-next branch on -rc2.
Summary:
- i40iw fixes
- bnxt_re fixes
- Dan Carpenter bugfixes across stack
- ten more random fixes, no more than two from any one person"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits)
RDMA/core: Initialize port_num in qp_attr
RDMA/uverbs: Fix the check for port number
IB/cma: Fix reference count leak when no ipv4 addresses are set
RDMA/iser: don't send an rkey if all data is written as immadiate-data
rxe: fix broken receive queue draining
RDMA/qedr: Prevent memory overrun in verbs' user responses
iw_cxgb4: don't use WR keys/addrs for 0 byte reads
IB/mlx4: Fix CM REQ retries in paravirt mode
IB/rdmavt: Setting of QP timeout can overflow jiffies computation
IB/core: Fix sparse warnings
RDMA/bnxt_re: Fix the value reported for local ack delay
RDMA/bnxt_re: Report MISSED_EVENTS in req_notify_cq
RDMA/bnxt_re: Fix return value of poll routine
RDMA/bnxt_re: Enable atomics only if host bios supports
RDMA/bnxt_re: Specify RDMA component when allocating stats context
RDMA/bnxt_re: Fixed the max_rd_atomic support for initiator and destination QP
RDMA/bnxt_re: Report supported value to IB stack in query_device
RDMA/bnxt_re: Do not free the ctx_tbl entry if delete GID fails
RDMA/bnxt_re: Fix WQE Size posted to HW to prevent it from throwing error
RDMA/bnxt_re: Free doorbell page index (DPI) during dealloc ucontext
...
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZcYMKAAoJEAx081l5xIa+J98QAIAQycvxK1NSRv2tpeME4qdG
7BfY8VYFnrgSicC16YouooMk6gEXP8I+gVWmUDv0btl3fmi6bSZcU2zVjfdDr9Rv
CjlIR3AOGwRnxOzlRt7t8Fn5OlZtFsisoh9Yz6JpPqmNRnwf/+0NaEu8BrlFy1Uy
MO/As2Tx3llIs6MEr+XUSFnJAmuqnibGg30++shSmHQSFIZVPHovYqpiHSzGnovE
6GS7lSUXzu/mKN34AF2M8JXYHvdEQWeasTGKdKOPzhEliPsnPfyf0cf4zdoxPOnM
EmLYWl0F/tvVH1c8yHxgsJ7C6kOBdltokAYLR9Sgc3NxpgGiOxdLBFIyF1odpkp3
Qb1wB6ZdHXTzoSGAkhXA2w/cKvLGNKUMHS6CwgGKc5+zIroTbrgEE23H32NeZ0zT
guS9EgCzX3wk2P5IHeiEV5EK9POs2txa/EJLbc4L4d97B08fmOdtbGoM0XS4aLQ4
7TZ34yVmn8hG5auLHv79w1cRSgjlIhNQCiY+dTLds3u5ixLqPgzxbM/p8m4DjrAU
CTh3LuzC3o8F9AIaXhJIuTwgb5VUSJAnGnv7iU3pgdYbBaLse4PgqARhDpziC3ew
PBB3RzjiEY+gv7ErIWbSHwZq1xiToPan0u45rUmJlF0p7zI2s7xOr2UOoeBlcsPF
+sN5uEq3ohsMcBmpibQH
=L28s
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-for-v4.13-rc2' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"A bunch of fixes for rc2: two imx regressions, vc4 fix, dma-buf fix,
some displayport mst fixes, and an amdkfd fix.
Nothing too crazy, I assume we just haven't see much rc1 testing yet"
* tag 'drm-fixes-for-v4.13-rc2' of git://people.freedesktop.org/~airlied/linux:
drm/mst: Avoid processing partially received up/down message transactions
drm/mst: Avoid dereferencing a NULL mstb in drm_dp_mst_handle_up_req()
drm/mst: Fix error handling during MST sideband message reception
drm/imx: parallel-display: Accept drm_of_find_panel_or_bridge failure
drm/imx: fix typo in ipu_plane_formats[]
drm/vc4: Fix VBLANK handling in crtc->enable() path
dma-buf/fence: Avoid use of uninitialised timestamp
drm/amdgpu: Remove unused field kgd2kfd_shared_resources.num_mec
drm/radeon: Remove initialization of shared_resources.num_mec
drm/amdkfd: Remove unused references to shared_resources.num_mec
drm/amdgpu: Fix KFD oversubscription by tracking queues correctly
- Use of the new GFP_RETRY_MAYFAIL to be more aggressive in allocating
memory for the ring buffer without causing OOMs
- Fix a memory leak in adding and removing instances
- Add __rcu annotation to be able to debug RCU usage of function
tracing a bit better.
-----BEGIN PGP SIGNATURE-----
iQExBAABCAAbBQJZcf52FBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
Vg4H/0DxsgqsGehOhbIu/W6JLJo+q+jNUbfFfvpIDvraZ8z7bC+6SORdgMEV7uXt
EMISWnzy9Wv9E361ZLgUaODwbimnqdUeFYzE4f4ggE1+eFhZKAY5Lo0UDcctwNoq
/kcOPr51aW8+Tzdu6UtymVsnXykuJo3mIPGFzsKQju8ykcl/dXIdiFAMvVmiNxsG
/Rv9yGhYDYm61pj3JyP9pgICYTI/7jtatKhoVZBxI/ji0hWNAnZfF89k0VeU9vpY
xsK/d9n84o+kPsuh8hIMVKUUPRoeamDuxpMa+Rf37Vm6aQyzNIXDtNdo3mdfocpg
uXLxNxYcmDmRXawR5EkF2cCGIl0=
=FjNl
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Three minor updates
- Use the new GFP_RETRY_MAYFAIL to be more aggressive in allocating
memory for the ring buffer without causing OOMs
- Fix a memory leak in adding and removing instances
- Add __rcu annotation to be able to debug RCU usage of function
tracing a bit better"
* tag 'trace-v4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
trace: fix the errors caused by incompatible type of RCU variables
tracing: Fix kmemleak in instance_rmdir
tracing/ring_buffer: Try harder to allocate
Move the protocol description header file into net/rxrpc/ and rename it to
protocol.h. It's no longer necessary to expose it as packets are no longer
exposed to kernel services (such as AFS) that use the facility.
The abort codes are transferred to the UAPI header instead as we pass these
back to userspace and also to kernel services.
Signed-off-by: David Howells <dhowells@redhat.com>
Move UAPI definitions from the internal header and place them in a UAPI
header file so that userspace can make use of them.
Signed-off-by: David Howells <dhowells@redhat.com>
Core Changes:
- fence: Introduce new fence flag to signify timestamp is populated (Chris)
- mst: Avoid processing incomplete data + fix NULL dereference (Imre)
Driver Changes:
- vc4: Avoid WARN from grabbing a ref from vblank that's not on (Boris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Imre Deak <imre.deak@intel.com>
* tag 'drm-misc-fixes-2017-07-20' of git://anongit.freedesktop.org/git/drm-misc:
drm/mst: Avoid processing partially received up/down message transactions
drm/mst: Avoid dereferencing a NULL mstb in drm_dp_mst_handle_up_req()
drm/mst: Fix error handling during MST sideband message reception
drm/vc4: Fix VBLANK handling in crtc->enable() path
dma-buf/fence: Avoid use of uninitialised timestamp
Pull networking fixes from David Miller:
1) BPF verifier signed/unsigned value tracking fix, from Daniel
Borkmann, Edward Cree, and Josef Bacik.
2) Fix memory allocation length when setting up calls to
->ndo_set_mac_address, from Cong Wang.
3) Add a new cxgb4 device ID, from Ganesh Goudar.
4) Fix FIB refcount handling, we have to set it's initial value before
the configure callback (which can bump it). From David Ahern.
5) Fix double-free in qcom/emac driver, from Timur Tabi.
6) A bunch of gcc-7 string format overflow warning fixes from Arnd
Bergmann.
7) Fix link level headroom tests in ip_do_fragment(), from Vasily
Averin.
8) Fix chunk walking in SCTP when iterating over error and parameter
headers. From Alexander Potapenko.
9) TCP BBR congestion control fixes from Neal Cardwell.
10) Fix SKB fragment handling in bcmgenet driver, from Doug Berger.
11) BPF_CGROUP_RUN_PROG_SOCK_OPS needs to check for null __sk, from Cong
Wang.
12) xmit_recursion in ppp driver needs to be per-device not per-cpu,
from Gao Feng.
13) Cannot release skb->dst in UDP if IP options processing needs it.
From Paolo Abeni.
14) Some netdev ioctl ifr_name[] NULL termination fixes. From Alexander
Levin and myself.
15) Revert some rtnetlink notification changes that are causing
regressions, from David Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits)
net: bonding: Fix transmit load balancing in balance-alb mode
rds: Make sure updates to cp_send_gen can be observed
net: ethernet: ti: cpsw: Push the request_irq function to the end of probe
ipv4: initialize fib_trie prior to register_netdev_notifier call.
rtnetlink: allocate more memory for dev_set_mac_address()
net: dsa: b53: Add missing ARL entries for BCM53125
bpf: more tests for mixed signed and unsigned bounds checks
bpf: add test for mixed signed and unsigned bounds checks
bpf: fix up test cases with mixed signed/unsigned bounds
bpf: allow to specify log level and reduce it for test_verifier
bpf: fix mixed signed/unsigned derived min/max value bounds
ipv6: avoid overflow of offset in ip6_find_1stfragopt
net: tehuti: don't process data if it has not been copied from userspace
Revert "rtnetlink: Do not generate notifications for CHANGEADDR event"
net: dsa: mv88e6xxx: Enable CMODE config support for 6390X
dt-binding: ptp: Add SoC compatibility strings for dte ptp clock
NET: dwmac: Make dwmac reset unconditional
net: Zero terminate ifr_name in dev_ifname().
wireless: wext: terminate ifr name coming from userspace
netfilter: fix netfilter_net_init() return
...
Edward reported that there's an issue in min/max value bounds
tracking when signed and unsigned compares both provide hints
on limits when having unknown variables. E.g. a program such
as the following should have been rejected:
0: (7a) *(u64 *)(r10 -8) = 0
1: (bf) r2 = r10
2: (07) r2 += -8
3: (18) r1 = 0xffff8a94cda93400
5: (85) call bpf_map_lookup_elem#1
6: (15) if r0 == 0x0 goto pc+7
R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R10=fp
7: (7a) *(u64 *)(r10 -16) = -8
8: (79) r1 = *(u64 *)(r10 -16)
9: (b7) r2 = -1
10: (2d) if r1 > r2 goto pc+3
R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=0
R2=imm-1,max_value=18446744073709551615,min_align=1 R10=fp
11: (65) if r1 s> 0x1 goto pc+2
R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=0,max_value=1
R2=imm-1,max_value=18446744073709551615,min_align=1 R10=fp
12: (0f) r0 += r1
13: (72) *(u8 *)(r0 +0) = 0
R0=map_value_adj(ks=8,vs=8,id=0),min_value=0,max_value=1 R1=inv,min_value=0,max_value=1
R2=imm-1,max_value=18446744073709551615,min_align=1 R10=fp
14: (b7) r0 = 0
15: (95) exit
What happens is that in the first part ...
8: (79) r1 = *(u64 *)(r10 -16)
9: (b7) r2 = -1
10: (2d) if r1 > r2 goto pc+3
... r1 carries an unsigned value, and is compared as unsigned
against a register carrying an immediate. Verifier deduces in
reg_set_min_max() that since the compare is unsigned and operation
is greater than (>), that in the fall-through/false case, r1's
minimum bound must be 0 and maximum bound must be r2. Latter is
larger than the bound and thus max value is reset back to being
'invalid' aka BPF_REGISTER_MAX_RANGE. Thus, r1 state is now
'R1=inv,min_value=0'. The subsequent test ...
11: (65) if r1 s> 0x1 goto pc+2
... is a signed compare of r1 with immediate value 1. Here,
verifier deduces in reg_set_min_max() that since the compare
is signed this time and operation is greater than (>), that
in the fall-through/false case, we can deduce that r1's maximum
bound must be 1, meaning with prior test, we result in r1 having
the following state: R1=inv,min_value=0,max_value=1. Given that
the actual value this holds is -8, the bounds are wrongly deduced.
When this is being added to r0 which holds the map_value(_adj)
type, then subsequent store access in above case will go through
check_mem_access() which invokes check_map_access_adj(), that
will then probe whether the map memory is in bounds based
on the min_value and max_value as well as access size since
the actual unknown value is min_value <= x <= max_value; commit
fce366a9dd ("bpf, verifier: fix alu ops against map_value{,
_adj} register types") provides some more explanation on the
semantics.
It's worth to note in this context that in the current code,
min_value and max_value tracking are used for two things, i)
dynamic map value access via check_map_access_adj() and since
commit 06c1c04972 ("bpf: allow helpers access to variable memory")
ii) also enforced at check_helper_mem_access() when passing a
memory address (pointer to packet, map value, stack) and length
pair to a helper and the length in this case is an unknown value
defining an access range through min_value/max_value in that
case. The min_value/max_value tracking is /not/ used in the
direct packet access case to track ranges. However, the issue
also affects case ii), for example, the following crafted program
based on the same principle must be rejected as well:
0: (b7) r2 = 0
1: (bf) r3 = r10
2: (07) r3 += -512
3: (7a) *(u64 *)(r10 -16) = -8
4: (79) r4 = *(u64 *)(r10 -16)
5: (b7) r6 = -1
6: (2d) if r4 > r6 goto pc+5
R1=ctx R2=imm0,min_value=0,max_value=0,min_align=2147483648 R3=fp-512
R4=inv,min_value=0 R6=imm-1,max_value=18446744073709551615,min_align=1 R10=fp
7: (65) if r4 s> 0x1 goto pc+4
R1=ctx R2=imm0,min_value=0,max_value=0,min_align=2147483648 R3=fp-512
R4=inv,min_value=0,max_value=1 R6=imm-1,max_value=18446744073709551615,min_align=1
R10=fp
8: (07) r4 += 1
9: (b7) r5 = 0
10: (6a) *(u16 *)(r10 -512) = 0
11: (85) call bpf_skb_load_bytes#26
12: (b7) r0 = 0
13: (95) exit
Meaning, while we initialize the max_value stack slot that the
verifier thinks we access in the [1,2] range, in reality we
pass -7 as length which is interpreted as u32 in the helper.
Thus, this issue is relevant also for the case of helper ranges.
Resetting both bounds in check_reg_overflow() in case only one
of them exceeds limits is also not enough as similar test can be
created that uses values which are within range, thus also here
learned min value in r1 is incorrect when mixed with later signed
test to create a range:
0: (7a) *(u64 *)(r10 -8) = 0
1: (bf) r2 = r10
2: (07) r2 += -8
3: (18) r1 = 0xffff880ad081fa00
5: (85) call bpf_map_lookup_elem#1
6: (15) if r0 == 0x0 goto pc+7
R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R10=fp
7: (7a) *(u64 *)(r10 -16) = -8
8: (79) r1 = *(u64 *)(r10 -16)
9: (b7) r2 = 2
10: (3d) if r2 >= r1 goto pc+3
R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3
R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp
11: (65) if r1 s> 0x4 goto pc+2
R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0
R1=inv,min_value=3,max_value=4 R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp
12: (0f) r0 += r1
13: (72) *(u8 *)(r0 +0) = 0
R0=map_value_adj(ks=8,vs=8,id=0),min_value=3,max_value=4
R1=inv,min_value=3,max_value=4 R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp
14: (b7) r0 = 0
15: (95) exit
This leaves us with two options for fixing this: i) to invalidate
all prior learned information once we switch signed context, ii)
to track min/max signed and unsigned boundaries separately as
done in [0]. (Given latter introduces major changes throughout
the whole verifier, it's rather net-next material, thus this
patch follows option i), meaning we can derive bounds either
from only signed tests or only unsigned tests.) There is still the
case of adjust_reg_min_max_vals(), where we adjust bounds on ALU
operations, meaning programs like the following where boundaries
on the reg get mixed in context later on when bounds are merged
on the dst reg must get rejected, too:
0: (7a) *(u64 *)(r10 -8) = 0
1: (bf) r2 = r10
2: (07) r2 += -8
3: (18) r1 = 0xffff89b2bf87ce00
5: (85) call bpf_map_lookup_elem#1
6: (15) if r0 == 0x0 goto pc+6
R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R10=fp
7: (7a) *(u64 *)(r10 -16) = -8
8: (79) r1 = *(u64 *)(r10 -16)
9: (b7) r2 = 2
10: (3d) if r2 >= r1 goto pc+2
R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3
R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp
11: (b7) r7 = 1
12: (65) if r7 s> 0x0 goto pc+2
R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3
R2=imm2,min_value=2,max_value=2,min_align=2 R7=imm1,max_value=0 R10=fp
13: (b7) r0 = 0
14: (95) exit
from 12 to 15: R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0
R1=inv,min_value=3 R2=imm2,min_value=2,max_value=2,min_align=2 R7=imm1,min_value=1 R10=fp
15: (0f) r7 += r1
16: (65) if r7 s> 0x4 goto pc+2
R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3
R2=imm2,min_value=2,max_value=2,min_align=2 R7=inv,min_value=4,max_value=4 R10=fp
17: (0f) r0 += r7
18: (72) *(u8 *)(r0 +0) = 0
R0=map_value_adj(ks=8,vs=8,id=0),min_value=4,max_value=4 R1=inv,min_value=3
R2=imm2,min_value=2,max_value=2,min_align=2 R7=inv,min_value=4,max_value=4 R10=fp
19: (b7) r0 = 0
20: (95) exit
Meaning, in adjust_reg_min_max_vals() we must also reset range
values on the dst when src/dst registers have mixed signed/
unsigned derived min/max value bounds with one unbounded value
as otherwise they can be added together deducing false boundaries.
Once both boundaries are established from either ALU ops or
compare operations w/o mixing signed/unsigned insns, then they
can safely be added to other regs also having both boundaries
established. Adding regs with one unbounded side to a map value
where the bounded side has been learned w/o mixing ops is
possible, but the resulting map value won't recover from that,
meaning such op is considered invalid on the time of actual
access. Invalid bounds are set on the dst reg in case i) src reg,
or ii) in case dst reg already had them. The only way to recover
would be to perform i) ALU ops but only 'add' is allowed on map
value types or ii) comparisons, but these are disallowed on
pointers in case they span a range. This is fine as only BPF_JEQ
and BPF_JNE may be performed on PTR_TO_MAP_VALUE_OR_NULL registers
which potentially turn them into PTR_TO_MAP_VALUE type depending
on the branch, so only here min/max value cannot be invalidated
for them.
In terms of state pruning, value_from_signed is considered
as well in states_equal() when dealing with adjusted map values.
With regards to breaking existing programs, there is a small
risk, but use-cases are rather quite narrow where this could
occur and mixing compares probably unlikely.
Joint work with Josef and Edward.
[0] https://lists.iovisor.org/pipermail/iovisor-dev/2017-June/000822.html
Fixes: 484611357c ("bpf: allow access into map value arrays")
Reported-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once in_dev_get is called to receive in_device pointer, the
in_device reference counter is increased, but if there are
no ipv4 addresses configured on the net-device the ifa_list
will be null, resulting in a flow that doesn't call in_dev_put
to decrease the ref_cnt.
This was exposed when running RoCE over ipv6 without any ipv4
addresses configured
Fixes: commit 8e3867310c90 ("IB/cma: Fix a race condition in iboe_addr_get_sgid()")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Current computation of qp->timeout_jiffies in rvt_modify_qp() will cause
overflow due to the fact that the input to the function usecs_to_jiffies
is only 32-bit ( unsigned int). Overflow will occur when attr->timeout is
equal to or greater than 30. The consequence is unnecessarily excessive
retry and thus degradation of the system performance.
This patch fixes the problem by limiting the input to 5-bit and calling
usecs_to_jiffies() before multiplying the scaling factor.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Christoph noticed [1] that default DMA pool in current form overload
the DMA coherent infrastructure. In reply, Robin suggested [2] to
split the per-device vs. global pool interfaces, so allocation/release
from default DMA pool is driven by dma ops implementation.
This patch implements Robin's idea and provide interface to
allocate/release/mmap the default (aka global) DMA pool.
To make it clear that existing *_from_coherent routines work on
per-device pool rename them to *_from_dev_coherent.
[1] https://lkml.org/lkml/2017/7/7/370
[2] https://lkml.org/lkml/2017/7/7/431
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The variables which are processed by RCU functions should be annotated
as RCU, otherwise sparse will report the errors like below:
"error: incompatible types in comparison expression (different
address spaces)"
Link: http://lkml.kernel.org/r/1496823171-7758-1-git-send-email-zhang.chunyan@linaro.org
Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
[ Updated to not be 100% 80 column strict ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
There is no useful return value from dev_close. All paths return 0.
Change dev_close and helper functions to void.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dsa_is_port_initialized helper is only used by dsa_switch_resume and
dsa_switch_suspend, if CONFIG_PM_SLEEP is enabled. Make it static to
dsa.c.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adjusts the timeout formula to schedule the TCP loss probe
(TLP). The previous formula uses 2*SRTT or 1.5*RTT + DelayACKMax if
only one packet is in flight. It keeps a lower bound of 10 msec which
is too large for short RTT connections (e.g. within a data-center).
The new formula = 2*RTT + (inflight == 1 ? 200ms : 2ticks) which
performs better for short and fast connections.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently llist_for_each_entry() and llist_for_each_entry_safe() iterate
until &pos->member != NULL. But when building the kernel with Clang,
the compiler assumes &pos->member cannot be NULL if the member's offset
is greater than 0 (which would be equivalent to the object being
non-contiguous in memory). Therefore the loop condition is always true,
and the loops become infinite.
To work around this, introduce the member_address_is_nonnull() macro,
which casts object pointer to uintptr_t, thus letting the member pointer
to be NULL.
Signed-off-by: Alexander Potapenko <glider@google.com>
Tested-by: Sodagudi Prasad <psodagud@codeaurora.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
randstruct plugin, including the task_struct.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Kees Cook <kees@outflux.net>
iQIcBAABCgAGBQJZbRgGAAoJEIly9N/cbcAmk2AQAIL60aQ+9RIcFAXriFhnd7Z2
x9Jqi9JNc8NgPFXx8GhE4J4eTZ5PwcjgXBpNRWY/laBkRyoBHn24ku09YxrJjmHz
ZSUsP+/iO9lVeEfbmU9Tnk50afkfwx6bHXBwkiVGQWHtybNVUqA19JbqkHeg8ubx
myKLGeUv5PPCodRIcBDD0+HaAANcsqtgbDpgmWU8s+IXWwvWCE2p7PuBw7v3HHgH
qzlPDHYQCRDw+LWsSqPaHj+9mbRO18P/ydMoZHGH4Hl3YYNtty8ZbxnraI3A7zBL
6mLUVcZ+/l88DqHc5I05T8MmLU1yl2VRxi8/jpMAkg9wkvZ5iNAtlEKIWU6eqsvk
vaImNOkViLKlWKF+oUD1YdG16d8Segrc6m4MGdI021tb+LoGuUbkY7Tl4ee+3dl/
9FM+jPv95HjJnyfRNGidh2TKTa9KJkh6DYM9aUnktMFy3ca1h/LuszOiN0LTDiHt
k5xoFURk98XslJJyXM8FPwXCXiRivrXMZbg5ixNoS4aYSBLv7Cn1M6cPnSOs7UPh
FqdNPXLRZ+vabSxvEg5+41Ioe0SHqACQIfaSsV5BfF2rrRRdaAxK4h7DBcI6owV2
7ziBN1nBBq2onYGbARN6ApyCqLcchsKtQfiZ0iFsvW7ZawnkVOOObDTCgPl3tdkr
403YXzphQVzJtpT5eRV6
=ngAW
-----END PGP SIGNATURE-----
Merge tag 'gcc-plugins-v4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull structure randomization updates from Kees Cook:
"Now that IPC and other changes have landed, enable manual markings for
randstruct plugin, including the task_struct.
This is the rest of what was staged in -next for the gcc-plugins, and
comes in three patches, largest first:
- mark "easy" structs with __randomize_layout
- mark task_struct with an optional anonymous struct to isolate the
__randomize_layout section
- mark structs to opt _out_ of automated marking (which will come
later)
And, FWIW, this continues to pass allmodconfig (normal and patched to
enable gcc-plugins) builds of x86_64, i386, arm64, arm, powerpc, and
s390 for me"
* tag 'gcc-plugins-v4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
randstruct: opt-out externally exposed function pointer structs
task_struct: Allow randomized layout
randstruct: Mark various structs for randomization
fix, marked for stable.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJZb3AkAAoJEEp/3jgCEfOLMAUH/RRRxbY4KL/PUhDXVPf+a+Pf
groC365undvuCmHCkT1ufrlrh56KE0XUvEKgXJp+r84WS4SC6lxaebD6QvzVtyMM
KPVnbpCNfKw5KtLB1upMteYY6MGfTk4VTPCav69aNGPrvUxJQB8obvWenPi0rWk/
knALvlJZbSiZeUDK3Id9cjntTGkClYuUHYJQ1JaZeieB/Xwnr+ZvV4on8ul7gkGX
B6zdqaM43ZomSl/rJrV/G/MOMNV5uVjBNJmVpfH7KkZQGipW7O+8aDwFaMFAAN7r
4TQcLf+d3SDjcjVspikCMYr0r0VnbL8hLPGkd7Cus/3jei9GWQHGaQqbZZmcKl8=
=TPyV
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-4.13-rc2' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"A number of small fixes for -rc1 Luminous changes plus a readdir race
fix, marked for stable"
* tag 'ceph-for-4.13-rc2' of git://github.com/ceph/ceph-client:
libceph: potential NULL dereference in ceph_msg_data_create()
ceph: fix race in concurrent readdir
libceph: don't call encode_request_finish() on MOSDBackoff messages
libceph: use alloc_pg_mapping() in __decode_pg_upmap_items()
libceph: set -EINVAL in one place in crush_decode()
libceph: NULL deref on osdmap_apply_incremental() error path
libceph: fix old style declaration warnings
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Missing netlink message sanity check in nfnetlink, patch from
Mateusz Jurczyk.
2) We now have netfilter per-netns hooks, so let's kill global hook
infrastructure, this infrastructure is known to be racy with netns.
We don't care about out of tree modules. Patch from Florian Westphal.
3) find_appropriate_src() is buggy when colissions happens after the
conversion of the nat bysource to rhashtable. Also from Florian.
4) Remove forward chain in nf_tables arp family, it's useless and it is
causing quite a bit of confusion, from Florian Westphal.
5) nf_ct_remove_expect() is called with the wrong parameter, causing
kernel oops, patch from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
retain last used xfrm_dst in a pcpu cache.
On next request, reuse this dst if the policies are the same.
The cache will not help with strict RR workloads as there is no hit.
The cache packet-path part is reasonably small, the notifier part is
needed so we do not add long hangs when a device is dismantled but some
pcpu xdst still holds a reference, there are also calls to the flush
operation when userspace deletes SAs so modules can be removed
(there is no hit.
We need to run the dst_release on the correct cpu to avoid races with
packet path. This is done by adding a work_struct for each cpu and then
doing the actual test/release on each affected cpu via schedule_work_on().
Test results using 4 network namespaces and null encryption:
ns1 ns2 -> ns3 -> ns4
netperf -> xfrm/null enc -> xfrm/null dec -> netserver
what TCP_STREAM UDP_STREAM UDP_RR
Flow cache: 14644.61 294.35 327231.64
No flow cache: 14349.81 242.64 202301.72
Pcpu cache: 14629.70 292.21 205595.22
UDP tests used 64byte packets, tests ran for one minute each,
value is average over ten iterations.
'Flow cache' is 'net-next', 'No flow cache' is net-next plus this
series but without this patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
After rcu conversions performance degradation in forward tests isn't that
noticeable anymore.
See next patch for some numbers.
A followup patcg could then also remove genid from the policies
as we do not cache bundles anymore.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
USB spec says that multiple byte fields are stored in
little-endian order (see chapter 8.1 of USB2.0 spec and
chapter 7.1 of USB3.0 spec), thus mark such fields as LE
for UAC1 and UAC2 headers
Signed-off-by: Ruslan Bilovol <ruslan.bilovol@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Initial patches missed case with CONFIG_BPF_SYSCALL not set.
Fixes: 11393cc9b9 ("xdp: Add batching support to redirect map")
Fixes: 97f91a7cf0 ("bpf: add bpf_redirect_map helper routine")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are no users for IB_QP_CREATE_USE_GFP_NOIO flag,
so let's remove it.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The caller to the driver marks GFP_NOIO allocations with help
of memalloc_noio-* calls now. This makes redundant to pass down
to the driver gfp flags, which can be GFP_KERNEL only.
The patch removes the gfp flags argument and updates all driver paths.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The caller to the driver marks GFP_NOIO allocations with help
of memalloc_noio-* calls now. This makes redundant to pass down
to the driver gfp flags, which can be GFP_KERNEL only.
The patch removes the gfp flags argument and updates all driver paths.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch adds new function ib_modify_qp_with_udata so that
uverbs layer can avoid handling L2 mac address at verbs layer
and depend on the core layer to resolve the mac address consistently
for all required QPs.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When req->rsk_listener is NULL, sk_to_full_sk() returns
NULL too, so we have to check its return value against
NULL here.
Fixes: 40304b2a15 ("bpf: BPF support for sock_ops")
Reported-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: Lawrence Brakmo <brakmo@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
It was added for netlink mmap tx, there are no callers in the tree.
The commit also added a check for skb->head != NULL in kfree_skb path,
remove that too -- all skbs ought to have skb->head set.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The BPF map devmap holds a refcnt on the net_device structure when
it is in the map. We need to do this to ensure on driver unload we
don't lose a dev reference.
However, its not very convenient to have to manually unload the map
when destroying a net device so add notifier handlers to do the cleanup
automatically. But this creates a race between update/destroy BPF
syscall and programs and the unregister netdev hook.
Unfortunately, the best I could come up with is either to live with
requiring manual removal of net devices from the map before removing
the net device OR to add a mutex in devmap to ensure the map is not
modified while we are removing a device. The fallout also requires
that BPF programs no longer update/delete the map from the BPF program
side because the mutex may sleep and this can not be done from inside
an rcu critical section. This is not a real problem though because I
have not come up with any use cases where this is actually useful in
practice. If/when we come up with a compelling user for this we may
need to revisit this.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For performance reasons we want to avoid updating the tail pointer in
the driver tx ring as much as possible. To accomplish this we add
batching support to the redirect path in XDP.
This adds another ndo op "xdp_flush" that is used to inform the driver
that it should bump the tail pointer on the TX ring.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
BPF programs can use the devmap with a bpf_redirect_map() helper
routine to forward packets to netdevice in map.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device map (devmap) is a BPF map, primarily useful for networking
applications, that uses a key to lookup a reference to a netdevice.
The map provides a clean way for BPF programs to build virtual port
to physical port maps. Additionally, it provides a scoping function
for the redirect action itself allowing multiple optimizations. Future
patches will leverage the map to provide batching at the XDP layer.
Another optimization/feature, that is not yet implemented, would be
to support multiple netdevices per key to support efficient multicast
and broadcast support.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a trace event for xdp redirect which may help when debugging
XDP programs that use redirect bpf commands.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for redirect to xdp generic creating a fall back for
devices that do not yet have support and allowing test infrastructure
using veth pairs to be built.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Tested-by: Andy Gospodarek <andy@greyhouse.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for a bpf_redirect helper function to the XDP
infrastructure. For now this only supports redirecting to the egress
path of a port.
In order to support drivers handling a xdp_buff natively this patches
uses a new ndo operation ndo_xdp_xmit() that takes pushes a xdp_buff
to the specified device.
If the program specifies either (a) an unknown device or (b) a device
that does not support the operation a BPF warning is thrown and the
XDP_ABORTED error code is returned.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GCC 7 added a new -Wimplicit-fallthrough warning. It's only enabled
with W=1, but since linux/jhash.h is included in over hundred places
(including other global headers) it seems worthwhile fixing this
warning.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As discussed in Faro during Netfilter Workshop 2017, RB trees can be
used with RCU, using a seqlock.
Note that net/rxrpc/conn_service.c is already using this.
This patch converts inetpeer from AVL tree to RB tree, since it allows
to remove private AVL implementation in favor of shared RB code.
$ size net/ipv4/inetpeer.before net/ipv4/inetpeer.after
text data bss dec hex filename
3195 40 128 3363 d23 net/ipv4/inetpeer.before
1562 24 0 1586 632 net/ipv4/inetpeer.after
The same technique can be used to speed up
net/netfilter/nft_set_rbtree.c (removing rwlock contention in fast path)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All unix sockets now account inflight FDs to the respective sender.
This was introduced in:
commit 712f4aad40
Author: willy tarreau <w@1wt.eu>
Date: Sun Jan 10 07:54:56 2016 +0100
unix: properly account for FDs passed over unix sockets
and further refined in:
commit 415e3d3e90
Author: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Wed Feb 3 02:11:03 2016 +0100
unix: correctly track in-flight fds in sending process user_struct
Hence, regardless of the stacking depth of FDs, the total number of
inflight FDs is limited, and accounted. There is no known way for a
local user to exceed those limits or exploit the accounting.
Furthermore, the GC logic is independent of the recursion/stacking depth
as well. It solely depends on the total number of inflight FDs,
regardless of their layout.
Lastly, the current `recursion_level' suffers a TOCTOU race, since it
checks and inherits depths only at queue time. If we consider `A<-B' to
mean `queue-B-on-A', the following sequence circumvents the recursion
level easily:
A<-B
B<-C
C<-D
...
Y<-Z
resulting in:
A<-B<-C<-...<-Z
With all of this in mind, lets drop the recursion limit. It has no
additional security value, anymore. On the contrary, it randomly
confuses message brokers that try to forward file-descriptors, since
any sendmsg(2) call can fail spuriously with ETOOMANYREFS if a client
maliciously modifies the FD while inflight.
Cc: Alban Crequy <alban.crequy@collabora.co.uk>
Cc: Simon McVittie <simon.mcvittie@collabora.co.uk>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pick up
1ed134e652 drm/vc4: Fix VBLANK handling in crtc->enable() path
From drm-misc-next-fixes, it was applied after the last pull request
was sent from that branch. We'll send it through drm-fixes instead.
This ioctl does nothing to justify an _IOC_READ or _IOC_WRITE flag
because it doesn't copy anything from/to userspace to access the
argument.
Fixes: 54ebbfb160 ("tty: add TIOCGPTPEER ioctl")
Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Acked-by: Aleksa Sarai <asarai@suse.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
no more users in the tree, remove this.
The old api is racy wrt. module removal, all users have been converted
to the netns-aware api.
The old api pretended we still have global hooks but that has not been
true for a long time.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This reverts commit 99b04f4c40 ("ASoC: add Component level
pcm_new/pcm_free"), which started calling the pcm_new callback for every
component in a *card* when creating a new pcm, something which does not
seem to make any sense.
This specifically led to memory leaks in systems with more than one
platform component and where DMA memory is allocated in the
platform-driver callback. For example, when both mcasp devices are being
used on an am335x board, DMA memory would be allocated twice for every
DAI link during probe.
When CONFIG_SND_VERBOSE_PROCFS was set this fortunately also led to
warnings such as:
WARNING: CPU: 0 PID: 565 at ../fs/proc/generic.c:346 proc_register+0x110/0x154
proc_dir_entry 'sub0/prealloc' already registered
Since there seems to be no users of the new component callbacks, and the
current implementation introduced a regression, let's revert the
offending commit for now.
Fixes: 99b04f4c40 ("ASoC: add Component level pcm_new/pcm_free")
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable <stable@vger.kernel.org> # 4.10
Remove unused callbacks in the omap_hsmmc_platform_data structure
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The new macros don't follow the usual style for declarations,
which we get a warning for with 'make W=1':
In file included from fs/ceph/mds_client.c:16:0:
include/linux/ceph/ceph_features.h:74:1: error: 'static' is not at beginning of declaration [-Werror=old-style-declaration]
This moves the 'static' keyword to the front of the
declaration.
Fixes: f179d3ba8c ("libceph: new features macros")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
This patch is to remove the typedef sctp_hmac_algo_param_t, and
replace with struct sctp_hmac_algo_param in the places where it's
using this typedef.
It is also to use sizeof(variable) instead of sizeof(type).
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_chunks_param_t, and
replace with struct sctp_chunks_param in the places where it's
using this typedef.
It is also to use sizeof(variable) instead of sizeof(type).
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_random_param_t, and
replace with struct sctp_random_param in the places where it's
using this typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_supported_ext_param_t, and
replace with struct sctp_supported_ext_param in the places where it's
using this typedef.
It is also to use sizeof(variable) instead of sizeof(type).
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_adaptation_ind_param_t, and
replace with struct sctp_adaptation_ind_param in the places where it's
using this typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_supported_addrs_param_t, and
replace with struct sctp_supported_addrs_param in the places where it's
using this typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove this typedef, there is even no places using it.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_cookie_preserve_param_t, and
replace with struct sctp_cookie_preserve_param in the places where it's
using this typedef.
It is also to fix some indents in sctp_sf_do_5_2_6_stale().
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_ipv6addr_param_t, and replace
with struct sctp_ipv6addr_param in the places where it's using this
typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to remove the typedef sctp_ipv4addr_param_t, and replace
with struct sctp_ipv4addr_param in the places where it's using this
typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The definition of an "anycast destination address" has been tweaked as a
side-effect of commit 2647a9b070 ("ipv6: Remove external dependency on
rt6i_gateway and RTF_ANYCAST"). The first address of a point-to-point
/127 subnet is now considered as an anycast address. This prevents
ICMPv6 errors to be returned to a sender of such a subnet and breaks
PMTU discovery.
This can be reproduced with:
ip link add name out6 type veth peer name in6
ip link add name out7 type veth peer name in7
ip link set mtu 1400 dev out7
ip link set mtu 1400 dev in7
ip netns add next-hop
ip netns add next-next-hop
ip link set netns next-hop dev in6
ip link set netns next-hop dev out7
ip link set netns next-next-hop dev in7
ip link set up dev out6
ip addr add 2001:db8:1::12/127 dev out6
ip netns exec next-hop ip link set up dev in6
ip netns exec next-hop ip link set up dev out7
ip netns exec next-hop ip addr add 2001:db8:1::13/127 dev in6
ip netns exec next-hop ip addr add 2001:db8:1::14/127 dev out7
ip netns exec next-hop ip route add default via 2001:db8:1::15
ip netns exec next-hop sysctl -qw net.ipv6.conf.all.forwarding=1
ip netns exec next-next-hop ip link set up dev in7
ip netns exec next-next-hop ip addr add 2001:db8:1::15/127 dev in7
ip netns exec next-next-hop ip addr add 2001:db8:1::50/128 dev in7
ip netns exec next-next-hop ip route add default via 2001:db8:1::14
ip netns exec next-next-hop sysctl -qw net.ipv6.conf.all.forwarding=1
ip route add 2001:db8:1::48/123 via 2001:db8:1::13
sleep 4
ping -M do -s 1452 -c 3 2001:db8:1::50 || true
ip route get 2001:db8:1::50
Before the patch, we get:
2001:db8:1::50 from :: via 2001:db8:1::13 dev out6 src 2001:db8:1::12 metric 1024 pref medium
After the patch, we get:
2001:db8:1::50 via 2001:db8:1::13 dev out6 src 2001:db8:1::12 metric 0
cache expires 578sec mtu 1400 pref medium
Fixes: 2647a9b070 ("ipv6: Remove external dependency on rt6i_gateway and RTF_ANYCAST")
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: David S. Miller <davem@davemloft.net>
callers can more safely get random bytes if they can block until the
CRNG is initialized.
Also print a warning if get_random_*() is called before the CRNG is
initialized. By default, only one single-line warning will be printed
per boot. If CONFIG_WARN_ALL_UNSEEDED_RANDOM is defined, then a
warning will be printed for each function which tries to get random
bytes before the CRNG is initialized. This can get spammy for certain
architecture types, so it is not enabled by default.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAllqXNUACgkQ8vlZVpUN
gaPtAgf/aUbXZuWYsDQzslHsbzEWi+qz4QgL885/w4L00pEImTTp91Q06SDxWhtB
KPvGnZHS3IofxBh2DC+6AwN6dPMoWDCfYhhO6po3FSz0DiPRIQCTuvOb8fhKY1X7
rTdDq2xtDxPGxJ25bMJtlrgzH2XlXPpVyPUeoc9uh87zUK5aesXpUn9kBniRexoz
ume+M/cDzPKkwNQpbLq8vzhNjoWMVv0FeW2akVvrjkkWko8nZLZ0R/kIyKQlRPdG
LZDXcz0oTHpDS6+ufEo292ZuWm2IGer2YtwHsKyCAsyEWsUqBz2yurtkSj3mAVyC
hHafyS+5WNaGdgBmg0zJxxwn5qxxLg==
=ua7p
-----END PGP SIGNATURE-----
Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull random updates from Ted Ts'o:
"Add wait_for_random_bytes() and get_random_*_wait() functions so that
callers can more safely get random bytes if they can block until the
CRNG is initialized.
Also print a warning if get_random_*() is called before the CRNG is
initialized. By default, only one single-line warning will be printed
per boot. If CONFIG_WARN_ALL_UNSEEDED_RANDOM is defined, then a
warning will be printed for each function which tries to get random
bytes before the CRNG is initialized. This can get spammy for certain
architecture types, so it is not enabled by default"
* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: reorder READ_ONCE() in get_random_uXX
random: suppress spammy warnings about unseeded randomness
random: warn when kernel uses unseeded randomness
net/route: use get_random_int for random counter
net/neighbor: use get_random_u32 for 32-bit hash random
rhashtable: use get_random_u32 for hash_rnd
ceph: ensure RNG is seeded before using
iscsi: ensure RNG is seeded before use
cifs: use get_random_u32 for 32-bit lock random
random: add get_random_{bytes,u32,u64,int,long,once}_wait family
random: add wait_for_random_bytes() API
Pull ->s_options removal from Al Viro:
"Preparations for fsmount/fsopen stuff (coming next cycle). Everything
gets moved to explicit ->show_options(), killing ->s_options off +
some cosmetic bits around fs/namespace.c and friends. Basically, the
stuff needed to work with fsmount series with minimum of conflicts
with other work.
It's not strictly required for this merge window, but it would reduce
the PITA during the coming cycle, so it would be nice to have those
bits and pieces out of the way"
* 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
isofs: Fix isofs_show_options()
VFS: Kill off s_options and helpers
orangefs: Implement show_options
9p: Implement show_options
isofs: Implement show_options
afs: Implement show_options
affs: Implement show_options
befs: Implement show_options
spufs: Implement show_options
bpf: Implement show_options
ramfs: Implement show_options
pstore: Implement show_options
omfs: Implement show_options
hugetlbfs: Implement show_options
VFS: Don't use save/replace_mount_options if not using generic_show_options
VFS: Provide empty name qstr
VFS: Make get_filesystem() return the affected filesystem
VFS: Clean up whitespace in fs/namespace.c and fs/super.c
Provide a function to create a NUL-terminated string from unterminated data
Pull uacess-unaligned removal from Al Viro:
"That stuff had just one user, and an exotic one, at that - binfmt_flat
on arm and m68k"
* 'work.uaccess-unaligned' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
kill {__,}{get,put}_user_unaligned()
binfmt_flat: flat_{get,put}_addr_from_rp() should be able to fail
Pull MIPS updates from Ralf Baechle:
"Boston platform support:
- Document DT bindings
- Add CLK driver for board clocks
CM:
- Avoid per-core locking with CM3 & higher
- WARN on attempt to lock invalid VP, not BUG
CPS:
- Select CONFIG_SYS_SUPPORTS_SCHED_SMT for MIPSr6
- Prevent multi-core with dcache aliasing
- Handle cores not powering down more gracefully
- Handle spurious VP starts more gracefully
DSP:
- Add lwx & lhx missaligned access support
eBPF:
- Add MIPS support along with many supporting change to add the
required infrastructure
Generic arch code:
- Misc sysmips MIPS_ATOMIC_SET fixes
- Drop duplicate HAVE_SYSCALL_TRACEPOINTS
- Negate error syscall return in trace
- Correct forced syscall errors
- Traced negative syscalls should return -ENOSYS
- Allow samples/bpf/tracex5 to access syscall arguments for sane
traces
- Cleanup from old Kconfig options in defconfigs
- Fix PREF instruction usage by memcpy for MIPS R6
- Fix various special cases in the FPU eulation
- Fix some special cases in MIPS16e2 support
- Fix MIPS I ISA /proc/cpuinfo reporting
- Sort MIPS Kconfig alphabetically
- Fix minimum alignment requirement of IRQ stack as required by
ABI / GCC
- Fix special cases in the module loader
- Perform post-DMA cache flushes on systems with MAARs
- Probe the I6500 CPU
- Cleanup cmpxchg and add support for 1 and 2 byte operations
- Use queued read/write locks (qrwlock)
- Use queued spinlocks (qspinlock)
- Add CPU shared FTLB feature detection
- Handle tlbex-tlbp race condition
- Allow storing pgd in C0_CONTEXT for MIPSr6
- Use current_cpu_type() in m4kc_tlbp_war()
- Support Boston in the generic kernel
Generic platform:
- yamon-dt: Pull YAMON DT shim code out of SEAD-3 board
- yamon-dt: Support > 256MB of RAM
- yamon-dt: Use serial* rather than uart* aliases
- Abstract FDT fixup application
- Set RTC_ALWAYS_BCD to 0
- Add a MAINTAINERS entry
core kernel:
- qspinlock.c: include linux/prefetch.h
Loongson 3:
- Add support
Perf:
- Add I6500 support
SEAD-3:
- Remove GIC timer from DT
- Set interrupt-parent per-device, not at root node
- Fix GIC interrupt specifiers
SMP:
- Skip IPI setup if we only have a single CPU
VDSO:
- Make comment match reality
- Improvements to time code in VDSO"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (86 commits)
locking/qspinlock: Include linux/prefetch.h
MIPS: Fix MIPS I ISA /proc/cpuinfo reporting
MIPS: Fix minimum alignment requirement of IRQ stack
MIPS: generic: Support MIPS Boston development boards
MIPS: DTS: img: Don't attempt to build-in all .dtb files
clk: boston: Add a driver for MIPS Boston board clocks
dt-bindings: Document img,boston-clock binding
MIPS: Traced negative syscalls should return -ENOSYS
MIPS: Correct forced syscall errors
MIPS: Negate error syscall return in trace
MIPS: Drop duplicate HAVE_SYSCALL_TRACEPOINTS select
MIPS16e2: Provide feature overrides for non-MIPS16 systems
MIPS: MIPS16e2: Report ASE presence in /proc/cpuinfo
MIPS: MIPS16e2: Subdecode extended LWSP/SWSP instructions
MIPS: MIPS16e2: Identify ASE presence
MIPS: VDSO: Fix a mismatch between comment and preprocessor constant
MIPS: VDSO: Add implementation of gettimeofday() fallback
MIPS: VDSO: Add implementation of clock_gettime() fallback
MIPS: VDSO: Fix conversions in do_monotonic()/do_monotonic_coarse()
MIPS: Use current_cpu_type() in m4kc_tlbp_war()
...
Common:
- add uevents for VM creation/destruction
- annotate and properly access RCU-protected objects
s390:
- rename IOCTL added in the first v4.13 merge
x86:
- emulate VMLOAD VMSAVE feature in SVM
- support paravirtual asynchronous page fault while nested
- add Hyper-V userspace interfaces for better migration
- improve master clock corner cases
- extend internal error reporting after EPT misconfig
- correct single-stepping of emulated instructions in SVM
- handle MCE during VM entry
- fix nVMX VM entry checks and nVMX VMCS shadowing
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJZaOm6AAoJEED/6hsPKofoqO8H/3breVIyVv9mwg7A5+o+6LTq
GzV/YXHSC8NtfxZn8ViS/TCziYiBSFv7XiPSodkXbOgYSz8Yya5x9D+dbEH+xgG7
l+LsZEqdSFbHCkvKrMiwSsoXtsT5WygA56+KZiBmu8cvlwqSyXWHFn3ZJ1wKzGq/
zivlkfCoh2m6bGdNmrG9pHUSgxvDh94pXesaVBKy4hgeovY1qjzby3Lo+HuIUzai
exuEU1EKRlUIfLK1B2Anp5IIv5Q1lFnMSvD6YSiWYywZb95dN/adsX1bv+MKeOdt
TIAgotsWjaAuT9JolAJjfVPHG0+uMBMsWg4Zh9Ra/gPPaSh3KEC2h1++zEYKjvw=
=1zII
-----END PGP SIGNATURE-----
Merge tag 'kvm-4.13-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Radim Krčmář:
"Second batch of KVM updates for v4.13
Common:
- add uevents for VM creation/destruction
- annotate and properly access RCU-protected objects
s390:
- rename IOCTL added in the first v4.13 merge
x86:
- emulate VMLOAD VMSAVE feature in SVM
- support paravirtual asynchronous page fault while nested
- add Hyper-V userspace interfaces for better migration
- improve master clock corner cases
- extend internal error reporting after EPT misconfig
- correct single-stepping of emulated instructions in SVM
- handle MCE during VM entry
- fix nVMX VM entry checks and nVMX VMCS shadowing"
* tag 'kvm-4.13-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
kvm: x86: hyperv: make VP_INDEX managed by userspace
KVM: async_pf: Let guest support delivery of async_pf from guest mode
KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf
KVM: async_pf: Add L1 guest async_pf #PF vmexit handler
KVM: x86: Simplify kvm_x86_ops->queue_exception parameter list
kvm: x86: hyperv: add KVM_CAP_HYPERV_SYNIC2
KVM: x86: make backwards_tsc_observed a per-VM variable
KVM: trigger uevents when creating or destroying a VM
KVM: SVM: Enable Virtual VMLOAD VMSAVE feature
KVM: SVM: Add Virtual VMLOAD VMSAVE feature definition
KVM: SVM: Rename lbr_ctl field in the vmcb control area
KVM: SVM: Prepare for new bit definition in lbr_ctl
KVM: SVM: handle singlestep exception when skipping emulated instructions
KVM: x86: take slots_lock in kvm_free_pit
KVM: s390: Fix KVM_S390_GET_CMMA_BITS ioctl definition
kvm: vmx: Properly handle machine check during VM-entry
KVM: x86: update master clock before computing kvmclock_offset
kvm: nVMX: Shadow "high" parts of shadowed 64-bit VMCS fields
kvm: nVMX: Fix nested_vmx_check_msr_bitmap_controls
kvm: nVMX: Validate the I/O bitmaps on nested VM-entry
...
This fixes a problem with bool properties that could be seen as
"true" when the property was not present at all by adding a special
helper for bool properties with checks for all of the requisute
conditions (Sakari Ailus).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZaLjVAAoJEILEb/54YlRxh/oP/j0LBZ1+vVMp6onKcQZfHkEi
xA3QyRBz+G48u9dmPmKzpMcXzXmelwuw/6bJix098xFkADVlI2qQkmUYCb5sb06t
blogDZ3wQokZrVIkFT+kFFUQxcfvX4FgMv6yTerQt+k0xbLXWGighS+46wWvTuPd
hD3+M+tBw85NMI4pU4rwWIK8UUDjry5lFYsn96cqTe5VC+wUAEy33aFCY8mw55J8
FGN6DyOmlHbzCdytzi3o4ejfXQkAwjUVAbH/JHVWfEDyxhebo/N6AVAwajtgN8bU
yHY6nHMz8xL1XtNvc7q6M0i9UySZdk2YrbLh5JtRvUrUWw5OBpE/A1iaUgaGVclC
5bmeWZww1jejHLnK4U7xUWA2SuQtPE3Kz/rUt3uroJzaCfYuTAxl+7IYftadRW9y
s+j3mjO3IKjLmCmPQwgPpMmNIore14bFKMnIfStSqyIxtbw59Gt3CnEYpd11UxIh
noCnnacTGHqSyHQSelf2gdYcWiJo+IUvgFAB4ze33yRPtcfkky4med3MG4I7yXP2
Tzi+xVGnAmgb9betNINH6sPUw9JImgY19hrQhVyxs8I8sYXV18Wdrkc6JOAZ9bna
REUXseHkF3ofy7iP1L/yrjOQ31kKwYxJ5kEasg/gwkRV6kzWNJ0CLK/qD7Xp0A3O
sPm7B8FVfM4xoPOaMngw
=KEbK
-----END PGP SIGNATURE-----
Merge tag 'devprop-fix-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties framework fix from Rafael Wysocki:
"This fixes a problem with bool properties that could be seen as "true"
when the property was not present at all by adding a special helper
for bool properties with checks for all of the requisute conditions
(Sakari Ailus)"
* tag 'devprop-fix-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
device property: Introduce fwnode_call_bool_op() for ops that return bool
Merge even more updates from Andrew Morton:
- a few leftovers
- fault-injector rework
- add a module loader test driver
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
kmod: throttle kmod thread limit
kmod: add test driver to stress test the module loader
MAINTAINERS: give kmod some maintainer love
xtensa: use generic fb.h
fault-inject: add /proc/<pid>/fail-nth
fault-inject: simplify access check for fail-nth
fault-inject: make fail-nth read/write interface symmetric
fault-inject: parse as natural 1-based value for fail-nth write interface
fault-inject: automatically detect the number base for fail-nth write interface
kernel/watchdog.c: use better pr_fmt prefix
MAINTAINERS: move the befs tree to kernel.org
lib/atomic64_test.c: add a test that atomic64_inc_not_zero() returns an int
mm: fix overflow check in expand_upwards()
Using strscpy was wrong because FORTIFY_SOURCE is passing the maximum
possible size of the outermost object, but strscpy defines the count
parameter as the exact buffer size, so this could copy past the end of
the source. This would still be wrong with the planned usage of
__builtin_object_size(p, 1) for intra-object overflow checks since it's
the maximum possible size of the specified object with no guarantee of
it being that large.
Reuse of the fortified functions like this currently makes the runtime
error reporting less precise but that can be improved later on.
Noticed by Dave Jones and KASAN.
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The value written to fail-nth file is parsed as 0-based. Parsing as
one-based is more natural to understand and it enables to cancel the
previous setup by simply writing '0'.
This change also converts task->fail_nth from signed to unsigned int.
Link: http://lkml.kernel.org/r/1491490561-10485-3-git-send-email-akinobu.mita@gmail.com
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
this different kind of NTB HW, some style fixes (per Greg KH
recommendation), and some ntb_test tweaks.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZZowhAAoJEG5mS6x6i9IjNx8P/0DTYNqBRXg4PSET7+6/Z6yq
Gz09DRw42dwHXN70ytEwtdRP7jcp3agrvBpRty3EgBygpLlMfOA0iQ6crVdBaza0
XzRuSl1iEd8oSX3CmJN8dIIWueNiPiqvG6as+j1kjZRZst21oqiMLTO3uO37oBh+
7/ltCkQtQdco74fiaqnN1iDeKK2lj3qHPJxb2AbeGIxj3lOr6NttPifd0zovimtb
keBCbaYyQaTxDoNzRwQmZtQNB2bwC/2V1sAM5L1T/xkVIBWXH8AUqJO1iVC4ptjI
VURvxitDMo28JDc4hgO63AVHsvz8zLbgKsCNK0ia+rtalSLJkDBBN8Qaf7e10hbh
6YUg2eXzIGusdvFowKnOdXhHYAqrQXMBcWgM2ZVe1YBj8QqZKQ0LL4h9ooVuP9HV
2hZc3GfMT+iUGmQKdk+rtDVLxYNnsng1NT/5ZxCHBQptrb+/Rqey/wDYdPV2TpnD
JfRpIFny29NOu74sa4W5GEPWI3AL7WAPAA1WH7RHy5rVDPfX20DcUQdvKvpo25y8
IdH7caSiBrjpc2GQPvkaziHlLLdZoRNwqJnmHDyUatukdmrElV/oTIRhaYQmLMK0
/xTFS7YXsmiOGqlfILW9VXdZIlLXSKuNSPX1HUcJ91l2Ik7m0PiK382jKOJVXqcZ
Frt1CgnCclCDYHBD4yfm
=SG8F
-----END PGP SIGNATURE-----
Merge tag 'ntb-4.13' of git://github.com/jonmason/ntb
Pull NTB updates from Jon Mason:
"The major change in the series is a rework of the NTB infrastructure
to all for IDT hardware to be supported (and resulting fallout from
that). There are also a few clean-ups, etc.
New IDT NTB driver and changes to the NTB infrastructure to allow for
this different kind of NTB HW, some style fixes (per Greg KH
recommendation), and some ntb_test tweaks"
* tag 'ntb-4.13' of git://github.com/jonmason/ntb:
ntb_netdev: set the net_device's parent
ntb: Add error path/handling to Debug FS entry creation
ntb: Add more debugfs support for ntb_perf testing options
ntb: Remove debug-fs variables from the context structure
ntb: Add a module option to control affinity of DMA channels
NTB: Add IDT 89HPESxNTx PCIe-switches support
ntb_hw_intel: Style fixes: open code macros that just obfuscate code
ntb_hw_amd: Style fixes: open code macros that just obfuscate code
NTB: Add ntb.h comments
NTB: Add PCIe Gen4 link speed
NTB: Add new Memory Windows API documentation
NTB: Add Messaging NTB API
NTB: Alter Scratchpads API to support multi-ports devices
NTB: Alter MW API to support multi-ports devices
NTB: Alter link-state API to support multi-port devices
NTB: Add indexed ports NTB API
NTB: Make link-state API being declared first
NTB: ntb_test: add parameter for doorbell bitmask
NTB: ntb_test: modprobe on remote host
Pull thermal management updates from Zhang Rui:
- Improve thermal cpu_cooling interaction with cpufreq core.
The cpu_cooling driver is designed to use CPU frequency scaling to
avoid high thermal states for a platform. But it wasn't glued really
well with cpufreq core.
For example clipped-cpus is copied from the policy structure and its
much better to use the policy->cpus (or related_cpus) fields directly
as they may have got updated. Not that things were broken before this
series, but they can be optimized a bit more.
This series tries to improve interactions between cpufreq core and
cpu_cooling driver and does some fixes/cleanups to the cpu_cooling
driver. (Viresh Kumar)
- A couple of fixes and cleanups in thermal core and imx, hisilicon,
bcm_2835, int340x thermal drivers. (Arvind Yadav, Dan Carpenter,
Sumeet Pawnikar, Srinivas Pandruvada, Willy WOLFF)
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (24 commits)
thermal: bcm2835: fix an error code in probe()
thermal: hisilicon: Handle return value of clk_prepare_enable
thermal: imx: Handle return value of clk_prepare_enable
thermal: int340x: check for sensor when PTYP is missing
Thermal/int340x: Fix few typos and kernel-doc style
thermal: fix source code documentation for parameters
thermal: cpu_cooling: Replace kmalloc with kmalloc_array
thermal: cpu_cooling: Rearrange struct cpufreq_cooling_device
thermal: cpu_cooling: 'freq' can't be zero in cpufreq_state2power()
thermal: cpu_cooling: don't store cpu_dev in cpufreq_cdev
thermal: cpu_cooling: get_level() can't fail
thermal: cpu_cooling: create structure for idle time stats
thermal: cpu_cooling: merge frequency and power tables
thermal: cpu_cooling: get rid of 'allowed_cpus'
thermal: cpu_cooling: OPPs are registered for all CPUs
thermal: cpu_cooling: store cpufreq policy
cpufreq: create cpufreq_table_count_valid_entries()
thermal: cpu_cooling: use cpufreq_policy to register cooling device
thermal: cpu_cooling: get rid of a variable in cpufreq_set_cur_state()
thermal: cpu_cooling: remove cpufreq_cooling_get_level()
...
clk_bulk_disable_unprepare() API to the newly introduced
clk bulk APIs.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJZZ+vFAAoJEK0CiJfG5JUl6hQP/A38eFGbN/OiQV1dgxsb7A5i
YgCouZKEy1vczQmagEQ/1OOxcNljKV5OwKK3lxfRummhvCuaV8nxgfKTCVjX6WZX
HvgF3pEtGOK54q4F6ugHgDudfGJ2YGtYl+4Yml782lcKf5BBCvOSicMipzc+o5mY
qVTIXbmuNvHRE7OZvSTAQNp9PEGGkEZMuayPSqRekt5PJ0D7tZ7yJ+XrQFmPG5+O
13BzWcm/1XUIXoVnfwhUsu/LHih1g0ahbnswSKDOtUtJhhQUKmXoK0XrObpqqK+2
YH2s0nVeICxAGF7pr4NWxf2lerEYkfaTQLMieNvCqnhORJteyWnM0Ih+NDQkL7cY
LM6k87gzQvgZ0/GTrRpN/5EroeirEifCPn6NnSEDmlekn9JrmH/noXxZLsxJCYHf
AihvysOv2YiZlj2wg540p7n39z7xKv472ppZ6dK7EFQdQblDjGA1o39yy8qlwrXL
70n0/S04xoKHclzZb7zgjuLbYmG9E6Lwz0iWSrmt/PXnxkJndf1Sy1zOISfFn6qw
Sw5zNYKDpSJHX8OQ/2klIyMYuNtB2bZS25GB9o0JhblDznjqM9af9TliCUj6yDUY
eIq1msnr0mPNaUPALTa7v6o3s23ljHxkqa4CUr8dXxDdOLM0xZ+2ro275akJysbP
27abema8QBYKwKFWO8B2
=H+KT
-----END PGP SIGNATURE-----
Merge tag 'clk-bulk-get-prep-enable' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk api update from Stephen Boyd:
"Small patch to add the clk_bulk_prepare_enable() and
clk_bulk_disable_unprepare() API to the newly introduced
clk bulk APIs.
It would be good to get this into the -rc1 so that other
driver trees can use it for code targeted for the next
merge window"
* tag 'clk-bulk-get-prep-enable' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: Provide bulk prepare_enable disable_unprepare variants
[ 236.821534] WARNING: kmemcheck: Caught 64-bit read from uninitialized memory (ffff8802538683d0)
[ 236.828642] 420000001e7f0000000000000000000000080000000000000000000000000000
[ 236.839543] i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
[ 236.850420] ^
[ 236.854123] RIP: 0010:[<ffffffff81396f07>] [<ffffffff81396f07>] fence_signal+0x17/0xd0
[ 236.861313] RSP: 0018:ffff88024acd7ba0 EFLAGS: 00010282
[ 236.865027] RAX: ffffffff812f6a90 RBX: ffff8802527ca800 RCX: ffff880252cb30e0
[ 236.868801] RDX: ffff88024ac5d918 RSI: ffff880252f780e0 RDI: ffff880253868380
[ 236.872579] RBP: ffff88024acd7bc0 R08: ffff88024acd7be0 R09: 0000000000000000
[ 236.876407] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880253868380
[ 236.880185] R13: ffff8802538684d0 R14: ffff880253868380 R15: ffff88024cd48e00
[ 236.883983] FS: 00007f1646d1a740(0000) GS:ffff88025d000000(0000) knlGS:0000000000000000
[ 236.890959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 236.894702] CR2: ffff880251360318 CR3: 000000024ad21000 CR4: 00000000001406f0
[ 236.898481] [<ffffffff8130d1ad>] i915_gem_request_retire+0x1cd/0x230
[ 236.902439] [<ffffffff8130e2b3>] i915_gem_request_alloc+0xa3/0x2f0
[ 236.906435] [<ffffffff812fb1bd>] i915_gem_do_execbuffer.isra.41+0xb6d/0x18b0
[ 236.910434] [<ffffffff812fc265>] i915_gem_execbuffer2+0x95/0x1e0
[ 236.914390] [<ffffffff812ad625>] drm_ioctl+0x1e5/0x460
[ 236.918275] [<ffffffff8110d4cf>] do_vfs_ioctl+0x8f/0x5c0
[ 236.922168] [<ffffffff8110da3c>] SyS_ioctl+0x3c/0x70
[ 236.926090] [<ffffffff814b7a5f>] entry_SYSCALL_64_fastpath+0x17/0x93
[ 236.930045] [<ffffffffffffffff>] 0xffffffffffffffff
We only set the timestamp before we mark the fence as signaled. It is
done before to avoid observers having a window in which they may see the
fence as complete but no timestamp. Having it does incur a potential for
the timestamp to be written twice, and even for it to be corrupted if
the u64 write is not atomic. Instead use a new bit to record the
presence of the timestamp, and teach the readers to wait until it is set
if the fence is complete. There still remains a race where the timestamp
for the signaled fence may be shown before the fence is reported as
signaled, but that's a pre-existing error.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Reported-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170214124001.1930-1-chris@chris-wilson.co.uk
Some firmwares in Huawei E3372H devices have been observed to switch back
to NTB 32-bit format after altsetting switch.
This patch implements a driver flag to check for the device settings and
set NTB format to 16-bit again if needed.
The flag has been activated for devices controlled by the huawei_cdc_ncm.c
driver.
V1->V2:
- fixed broken error checks
- some corrections to the commit message
V2->V3:
- variable name changes, to clarify what's happening
- check (and possibly set) the NTB format later in the common bind code path
Signed-off-by: Enrico Mioso <mrkiko.rs@gmail.com>
Reported-and-tested-by: Christian Panton <christian@panton.org>
Reviewed-by: Bjørn Mork <bjorn@mork.no>
CC: Bjørn Mork <bjorn@mork.no>
CC: Christian Panton <christian@panton.org>
CC: linux-usb@vger.kernel.org
CC: netdev@vger.kernel.org
CC: Oliver Neukum <oliver@neukum.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hyper-V identifies vCPUs by Virtual Processor Index, which can be
queried via HV_X64_MSR_VP_INDEX msr. It is defined by the spec as a
sequential number which can't exceed the maximum number of vCPUs per VM.
APIC ids can be sparse and thus aren't a valid replacement for VP
indices.
Current KVM uses its internal vcpu index as VP_INDEX. However, to make
it predictable and persistent across VM migrations, the userspace has to
control the value of VP_INDEX.
This patch achieves that, by storing vp_index explicitly on vcpu, and
allowing HV_X64_MSR_VP_INDEX to be set from the host side. For
compatibility it's initialized to KVM vcpu index. Also a few variables
are renamed to make clear distinction betweed this Hyper-V vp_index and
KVM vcpu_id (== APIC id). Besides, a new capability,
KVM_CAP_HYPERV_VP_INDEX, is added to allow the userspace to skip
attempting msr writes where unsupported, to avoid spamming error logs.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Stable bugfixes:
- Fix -EACCESS on commit to DS handling
- Fix initialization of nfs_page_array->npages
- Only invalidate dentries that are actually invalid
Features:
- Enable NFSoRDMA transparent state migration
- Add support for lookup-by-filehandle
- Add support for nfs re-exporting
Other bugfixes and cleanups:
- Christoph cleaned up the way we declare NFS operations
- Clean up various internal structures
- Various cleanups to commits
- Various improvements to error handling
- Set the dt_type of . and .. entries in NFS v4
- Make slot allocation more reliable
- Fix fscache stat printing
- Fix uninitialized variable warnings
- Fix potential list overrun in nfs_atomic_open()
- Fix a race in NFSoRDMA RPC reply handler
- Fix return size for nfs42_proc_copy()
- Fix against MAC forgery timing attacks
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAlln4jEACgkQ18tUv7Cl
QOv2ZxAAwbQN9Dtx4rOZmPe0Xszua23sNN0ja891PodkCjIiZrRelZhLIBAf1rfP
uSR+jTD8EsBHGt3bzTXg2DHz+o8cGDZuH+uuZX+wRWJPQcKA2pC7zElqnse8nmn5
4Z1UUdzf42vE4NZ/G1ucqpEiAmOqGJ3s7pCRLLXPvOSSQXqOhiomNDAcGxX05FIv
Ly4Kr6RIfg/O4oNOZBuuL/tZHodeyOj1vbyjt/4bDQ5MEXlUQfcjJZEsz/2EcNh6
rAgbquxr1pGCD072pPBwYNH2vLGbgNN41KDDMGI0clp+8p6EhV6BOlgcEoGtZM86
c0yro2oBOB2vPCv9nGr6JgTOHPKG6ksJ7vWVXrtQEjBGP82AbFfAawLgqZ6Ae8dP
Sqpx55j4xdm4nyNglCuhq5PlPAogARq/eibR+RbY973Lhzr5bZb3XqlairCkNNEv
4RbTlxbWjhgrKJ56jVf+KpUDJAVG5viKMD7YDx/bOfLtvPwALbozD7ONrunz5v43
PgQEvWvVtnQAKp27pqHemTsLFhU6M6eGUEctRnAfB/0ogWZh1X8QXgulpDlqG3kb
g12kr5hfA0pSfcB0aGXVzJNnHKfW3IY3WBWtxq4xaMY22YkHtuB+78+9/yk3jCAi
dvimjT2Ko9fE9MnltJ/hC5BU+T+xUxg+1vfwWnKMvMH8SIqjyu4=
=OpLj
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.13-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client updates from Anna Schumaker:
"Stable bugfixes:
- Fix -EACCESS on commit to DS handling
- Fix initialization of nfs_page_array->npages
- Only invalidate dentries that are actually invalid
Features:
- Enable NFSoRDMA transparent state migration
- Add support for lookup-by-filehandle
- Add support for nfs re-exporting
Other bugfixes and cleanups:
- Christoph cleaned up the way we declare NFS operations
- Clean up various internal structures
- Various cleanups to commits
- Various improvements to error handling
- Set the dt_type of . and .. entries in NFS v4
- Make slot allocation more reliable
- Fix fscache stat printing
- Fix uninitialized variable warnings
- Fix potential list overrun in nfs_atomic_open()
- Fix a race in NFSoRDMA RPC reply handler
- Fix return size for nfs42_proc_copy()
- Fix against MAC forgery timing attacks"
* tag 'nfs-for-4.13-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (68 commits)
NFS: Don't run wake_up_bit() when nobody is waiting...
nfs: add export operations
nfs4: add NFSv4 LOOKUPP handlers
nfs: add a nfs_ilookup helper
nfs: replace d_add with d_splice_alias in atomic_open
sunrpc: use constant time memory comparison for mac
NFSv4.2 fix size storage for nfs42_proc_copy
xprtrdma: Fix documenting comments in frwr_ops.c
xprtrdma: Replace PAGE_MASK with offset_in_page()
xprtrdma: FMR does not need list_del_init()
xprtrdma: Demote "connect" log messages
NFSv4.1: Use seqid returned by EXCHANGE_ID after state migration
NFSv4.1: Handle EXCHGID4_FLAG_CONFIRMED_R during NFSv4.1 migration
xprtrdma: Don't defer MR recovery if ro_map fails
xprtrdma: Fix FRWR invalidation error recovery
xprtrdma: Fix client lock-up after application signal fires
xprtrdma: Rename rpcrdma_req::rl_free
xprtrdma: Pass only the list of registered MRs to ro_unmap_sync
xprtrdma: Pre-mark remotely invalidated MRs
xprtrdma: On invalidation failure, remove MWs from rl_registered
...
Pull SCSI target updates from Nicholas Bellinger:
"It's been usually busy for summer, with most of the efforts centered
around TCMU developments and various target-core + fabric driver bug
fixing activities. Not particularly large in terms of LoC, but lots of
smaller patches from many different folks.
The highlights include:
- ibmvscsis logical partition manager support (Michael Cyr + Bryant
Ly)
- Convert target/iblock WRITE_SAME to blkdev_issue_zeroout (hch +
nab)
- Add support for TMR percpu LUN reference counting (nab)
- Fix a potential deadlock between EXTENDED_COPY and iscsi shutdown
(Bart)
- Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce (Jiang Yi)
- Fix TMCU module removal (Xiubo Li)
- Fix iser-target OOPs during login failure (Andrea Righi + Sagi)
- Breakup target-core free_device backend driver callback (mnc)
- Perform TCMU add/delete/reconfig synchronously (mnc)
- Fix TCMU multiple UIO open/close sequences (mnc)
- Fix TCMU CHECK_CONDITION sense handling (mnc)
- Fix target-core SAM_STAT_BUSY + TASK_SET_FULL handling (mnc + nab)
- Introduce TYPE_ZBC support in PSCSI (Damien Le Moal)
- Fix possible TCMU memory leak + OOPs when recalculating cmd base
size (Xiubo Li + Bryant Ly + Damien Le Moal + mnc)
- Add login_keys_workaround attribute for non RFC initiators (Robert
LeBlanc + Arun Easi + nab)"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (68 commits)
iscsi-target: Add login_keys_workaround attribute for non RFC initiators
Revert "qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT"
tcmu: clean up the code and with one small fix
tcmu: Fix possbile memory leak / OOPs when recalculating cmd base size
target: export lio pgr/alua support as device attr
target: Fix return sense reason in target_scsi3_emulate_pr_out
target: Fix cmd size for PR-OUT in passthrough_parse_cdb
tcmu: Fix dev_config_store
target: pscsi: Introduce TYPE_ZBC support
target: Use macro for WRITE_VERIFY_32 operation codes
target: fix SAM_STAT_BUSY/TASK_SET_FULL handling
target: remove transport_complete
pscsi: finish cmd processing from pscsi_req_done
tcmu: fix sense handling during completion
target: add helper to copy sense to se_cmd buffer
target: do not require a transport_complete for SCF_TRANSPORT_TASK_SENSE
target: make device_mutex and device_list static
tcmu: Fix flushing cmd entry dcache page
tcmu: fix multiple uio open/close sequences
tcmu: drop configured check in destroy
...
"perf lock" shows fairly heavy contention for the bit waitqueue locks
when doing an I/O heavy workload.
Use a bit to tell whether or not there has been contention for a lock
so that we can optimise away the bit waitqueue options in those cases.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
"perf lock" shows fairly heavy contention for the bit waitqueue locks
when doing an I/O heavy workload.
Use a bit to tell whether or not there has been contention for a lock
so that we can optimise away the bit waitqueue options in those cases.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
RPC-over-RDMA transport to use the new rdma_rw API.
Christoph cleaned the way nfs operations are declared, removing a bunch
of function-pointer casts and declaring the operation vectors as const.
Christoph's changes touch both client and server, and both client and
server pulls this time around should be based on the same commits from
Christoph.
(Note: Anna and I initially didn't coordinate this well and we realized
our pull requests were going to leave you with Christoph's 33 patches
duplicated between our two trees. We decided a last-minute rebase was
the lesser of two evils, so her pull request will show that last-minute
rebase. Yell if that was the wrong choice, and we'll know better for
next time....)
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZZ80JAAoJECebzXlCjuG+PiMP/jmw4IbzY4qt/X8aldVTMPZ8
TkEXuZSrc7FbmroqAR0XN/qJjzENKUcrnlYm7HKVe6iItTZUvJuVThtHQVGzZUZD
wP2VRzgkky59aDs9cphfTPGKPKL1MtoC3qQdFmKd/8ZhBDHIq89A2pQJwl7PI4rA
IHzvLmZtTKL+xWoypqZQxepONhEY2ZPrffGWL+5OVF/dPmWfJ6m/M6jRTb7zV/YD
PZyRqWQ8UY/HwZTwRrxZDCCxUsmRUPZz195iFjM8wvBl7auWNetC22gyyITlvfzf
1m0zJqw3qn09+v2xnAWs/ZVxypg6rsEiIcL2mf0JC/tQh+iIzabc4e/TwDEWqSq+
ocQrvXJuZCjsrMqg4oaIuDFogaZCsGR5wxDAEyfYDS/8fMdiKq8xJzT7v31/2U37
Bsr1hvgAmD4eZWaTrJg11V5RnTzDgns+EtNfISR8t4/k+wehDfyzav8A+j72sqvR
JT+7iUEd0QcBwo+MCC7AOnLLsIX45QUjZKKrvZNAC1fmr8RyAF1zo5HHO+NNjLuP
J2PUG2GbNxsQkm/JAFKDvyklLpEXZc6uyYAcEefirxYbh1x0GfuetzqtH58DtrQL
/1e80MRG9Qgq5S8PvYyvp1bIQPDRaQ188chEvzZy+3QeNXydq2LzDh0bjlM+4A9I
DZhP2pNGLh0ImaPtX0q+
=mR/a
-----END PGP SIGNATURE-----
Merge tag 'nfsd-4.13' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"Chuck's RDMA update overhauls the "call receive" side of the
RPC-over-RDMA transport to use the new rdma_rw API.
Christoph cleaned the way nfs operations are declared, removing a
bunch of function-pointer casts and declaring the operation vectors as
const.
Christoph's changes touch both client and server, and both client and
server pulls this time around should be based on the same commits from
Christoph"
* tag 'nfsd-4.13' of git://linux-nfs.org/~bfields/linux: (53 commits)
svcrdma: fix an incorrect check on -E2BIG and -EINVAL
nfsd4: factor ctime into change attribute
svcrdma: Remove svc_rdma_chunk_ctxt::cc_dir field
svcrdma: use offset_in_page() macro
svcrdma: Clean up after converting svc_rdma_recvfrom to rdma_rw API
svcrdma: Clean-up svc_rdma_unmap_dma
svcrdma: Remove frmr cache
svcrdma: Remove unused Read completion handlers
svcrdma: Properly compute .len and .buflen for received RPC Calls
svcrdma: Use generic RDMA R/W API in RPC Call path
svcrdma: Add recvfrom helpers to svc_rdma_rw.c
sunrpc: Allocate up to RPCSVC_MAXPAGES per svc_rqst
svcrdma: Don't account for Receive queue "starvation"
svcrdma: Improve Reply chunk sanity checking
svcrdma: Improve Write chunk sanity checking
svcrdma: Improve Read chunk sanity checking
svcrdma: Remove svc_rdma_marshal.c
svcrdma: Avoid Send Queue overflow
svcrdma: Squelch disconnection messages
sunrpc: Disable splice for krb5i
...
This will be needed in order to implement the get_parent export op
for nfsd.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
This helper will allow to find an existing NFS inode by the file handle
and fattr.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Transparent State Migration copies a client's lease state from the
server where a filesystem used to reside to the server where it now
resides. When an NFSv4.1 client first contacts that destination
server, it uses EXCHANGE_ID to detect trunking relationships.
The lease that was copied there is returned to that client, but the
destination server sets EXCHGID4_FLAG_CONFIRMED_R when replying to
the client. This is because the lease was confirmed on the source
server (before it was copied).
Normally, when CONFIRMED_R is set, a client purges the lease and
creates a new one. However, that throws away the entire benefit of
Transparent State Migration.
Therefore, the client must not purge that lease when it is possible
that Transparent State Migration has occurred.
Reported-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
If the page cache is being flushed, then we want to ensure that we
do start a commit once the pages are done being flushed.
If we just wait until all I/O is done to that file, we can end up
livelocking until the balance_dirty_pages() mechanism puts its
foot down and forces I/O to stop.
So instead we do more or less the same thing that O_DIRECT does,
and set up a counter to tell us when the flush is done,
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Remove the 'layout_private' fields that were only used by the pNFS OSD
layout driver.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
An interrupted rename will leave the old dentry behind if the rename
succeeds. Fix this by forcing a lookup the next time through
->d_revalidate.
A previous attempt at solving this problem took the approach to complete
the work of the rename asynchronously, however that approach was wrong
since it would allow the d_move() to occur after the directory's i_mutex
had been dropped by the original process.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NFS uses some int, and unsigned int :1, and bool as flags in structs and
args. Assert the preference for uniformly replacing these with the bool
type.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
struct svc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.
Signed-off-by: Christoph Hellwig <hch@lst.de>
pc_count is the only writeable memeber of struct svc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.
This patch moves it into out out struct svc_procinfo, and into a
separate writable array that is pointed to by struct svc_version.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Drop the resp argument as it can trivially be derived from the rqstp
argument. With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
Drop the argp argument as it can trivially be derived from the rqstp
argument. With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Drop the p and resp arguments as they are always NULL or can trivially
be derived from the rqstp argument. With that all functions now have the
same prototype, and we can remove the unsafe casting to kxdrproc_t.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Drop the argp and resp arguments as they can trivially be derived from
the rqstp argument. With that all functions now have the same prototype,
and we can remove the unsafe casting to svc_procfunc as well as the
svc_procfunc typedef itself.
Signed-off-by: Christoph Hellwig <hch@lst.de>
struct rpc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
p_count is the only writeable memeber of struct rpc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.
This patch moves it into out out struct rpc_procinfo, and into a
separate writable array that is pointed to by struct rpc_version and
indexed by p_statidx.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Pass struct rpc_request as the first argument instead of an untyped blob.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
Pass struct rpc_request as the first argument instead of an untyped blob,
and mark the data object as const.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Merge yet more updates from Andrew Morton:
- various misc things
- kexec updates
- sysctl core updates
- scripts/gdb udpates
- checkpoint-restart updates
- ipc updates
- kernel/watchdog updates
- Kees's "rough equivalent to the glibc _FORTIFY_SOURCE=1 feature"
- "stackprotector: ascii armor the stack canary"
- more MM bits
- checkpatch updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (96 commits)
writeback: rework wb_[dec|inc]_stat family of functions
ARM: samsung: usb-ohci: move inline before return type
video: fbdev: omap: move inline before return type
video: fbdev: intelfb: move inline before return type
USB: serial: safe_serial: move __inline__ before return type
drivers: tty: serial: move inline before return type
drivers: s390: move static and inline before return type
x86/efi: move asmlinkage before return type
sh: move inline before return type
MIPS: SMP: move asmlinkage before return type
m68k: coldfire: move inline before return type
ia64: sn: pci: move inline before type
ia64: move inline before return type
FRV: tlbflush: move asmlinkage before return type
CRIS: gpio: move inline before return type
ARM: HP Jornada 7XX: move inline before return type
ARM: KVM: move asmlinkage before type
checkpatch: improve the STORAGE_CLASS test
mm, migration: do not trigger OOM killer when migrating memory
drm/i915: use __GFP_RETRY_MAYFAIL
...
- Include Intel XXV710 in INTx workaround (Alex Williamson)
- Make use of ERR_CAST() for error return (Dan Carpenter)
- Fix vfio_group release deadlock from iommu notifier (Alex Williamson)
- Unset KVM-VFIO attributes only on group match (Alex Williamson)
- Fix release path group/file matching with KVM-VFIO (Alex Williamson)
- Remove unnecessary lock uses triggering lockdep splat (Alex Williamson)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJZZ71TAAoJECObm247sIsi1mUP/1/5kubhHRE/Y11SnX4Bpie0
X6YVbV3WLGuV9jaFMI7EgYJLZ6pBvCRX0CLUzEdrbS4LAlTLQu2GSY5tqhcLJumE
0mV1Wj1jOO9BAur13/sJ+IohbZeK10dtbTkWv+YhrVSpRaLP+ituaKsajEV12WRH
6dxho2bqfvNcTC8yjE+vqe9mU3jXYPGuCx8oYGaDXEsCjzrdbOFnAG0/s/2cWpb+
D1ADHd3020VAHZRnHRBLFfMczza1jqllhSAUfdMw1gRGCQDq3k1XenzVNLLTbtYy
VmEWHa+R/OCfbKVxaDqPsgOTK7x7DKn+Pzb3lWCdQ8v5X+2ubHpVIYjxTDSSTbt3
YJ7a+hNk8AHkFgwS7x8BdOT8mmNGb1NZldjS4dv2VWkfcTnMQnubnYSCzGztto9h
P2THKBil6djPb9S3pCvtKUiHSIZedYZlKofUldrOdGDAZmzLLlf8lTzijGjDYKFM
pQeZC+xeEhZXURipgkH+a+paYgDtKEfwSlABODghjCcJf7S/GbyVPLOKLXzoVb2y
Ml8eGlo4O/cNniQK5faH447ilM7hzS1aG83uGnHTfe8VgKjI7Z5ZSxKOtoEq5bDz
bb91E6GVLKHqT0LVS1YZfrnqK0hX/QAd/sK1REM5nN8JNmPLyLpjv8FaJEhpk1vC
z4At5+pfKM8DYrW3EGmc
=3A4K
-----END PGP SIGNATURE-----
Merge tag 'vfio-v4.13-rc1' of git://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:
- Include Intel XXV710 in INTx workaround (Alex Williamson)
- Make use of ERR_CAST() for error return (Dan Carpenter)
- Fix vfio_group release deadlock from iommu notifier (Alex Williamson)
- Unset KVM-VFIO attributes only on group match (Alex Williamson)
- Fix release path group/file matching with KVM-VFIO (Alex Williamson)
- Remove unnecessary lock uses triggering lockdep splat (Alex Williamson)
* tag 'vfio-v4.13-rc1' of git://github.com/awilliam/linux-vfio:
vfio: Remove unnecessary uses of vfio_container.group_lock
vfio: New external user group/file match
kvm-vfio: Decouple only when we match a group
vfio: Fix group release deadlock
vfio: Use ERR_CAST() instead of open coding it
vfio/pci: Add Intel XXV710 to hidden INTx devices
Subsystem:
- expose non volatile RAM using nvmem instead of open coding in many
drivers. Unfortunately, this option has to be enabled by default to not
break existing users.
- rtctest can now test for cutoff dates, showing when an RTC will start
failing to properly save time and date.
- new RTC registration functions to remove race conditions in drivers
Newly supported RTCs:
- Broadcom STB wake-timer
- Epson RX8130CE
- Maxim IC DS1308
- STMicroelectronics STM32H7
Drivers:
- ds1307: use regmap, use nvmem, more cleanups
- ds3232: temperature reading support
- gemini: renamed to ftrtc010
- m41t80: use CCF to expose the clock
- rv8803: use nvmem
- s3c: many cleanups
- st-lpc: fix y2106 bug
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEXx9Viay1+e7J/aM4AyWl4gNJNJIFAllnMUIACgkQAyWl4gNJ
NJJGNxAAqMTrggkF6KTvFCVAoMHdkeAxuoyigwCH8BCm2gOm5Qj8ZodZxndcl3Gb
dWG+c1pHf4KXrz59h6ZGI4qFgIpCyJpjGpyJs0Pvt6gY7YIqHrEa1nvcrPO7DaWw
fPPcszyiymDOsb6d+wJzriA2ISJUHy7Kf6FUb0fjQLoYNl7ezgzdV6+dvePOPcW1
kaAfRX8XqrkECrDFFHlX1Szb78qGhcUB1TmWFW+hadICTguBLX/fro0DKWRw2POQ
y3cHKqMzFhTD1+jkp26o535x/D9CWDXzLmLvRF5tBQ0X7V2UIGchj4aNEHT0Ruwx
YlGzB3WDwfj/Jl+VALmY27mplf71z5ppJRhaFn84OWrJmvjS/2EF9TCCBc4XvzzX
dH/5nvPyNrUYnayTTCXiPhN3p4ivywHXqA9gkHcWb3BagNIpuvwNVnJT/Sxz3Y5R
Gt2zGl07NKQ1EtEThQEIBOMXy9nJ2PVJdQFmLehj1PfxX+Gbs42tWBILzl4n1rgT
yUFLMGw1Y0/h39jw7t+uKM7v0aXPHOXLrwaDKIj+c4ffVXD8IALhgG7BL4dOQPSF
rRPKi5QNYJMnuBeKHJrFlq7xWqHRVUfTFh16eyYvwGLGWiUuGe9akhlabl6bE8jG
fm3TlHPNieGMObXijwEVePkY6z7E0CLE+d1iQsDK6ZgO/z3pdOo=
=QDxE
-----END PGP SIGNATURE-----
Merge tag 'rtc-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"Here is the pull-request for the RTC subsystem for 4.13.
Subsystem:
- expose non volatile RAM using nvmem instead of open coding in many
drivers. Unfortunately, this option has to be enabled by default to
not break existing users.
- rtctest can now test for cutoff dates, showing when an RTC will
start failing to properly save time and date.
- new RTC registration functions to remove race conditions in drivers
Newly supported RTCs:
- Broadcom STB wake-timer
- Epson RX8130CE
- Maxim IC DS1308
- STMicroelectronics STM32H7
Drivers:
- ds1307: use regmap, use nvmem, more cleanups
- ds3232: temperature reading support
- gemini: renamed to ftrtc010
- m41t80: use CCF to expose the clock
- rv8803: use nvmem
- s3c: many cleanups
- st-lpc: fix y2106 bug"
* tag 'rtc-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (51 commits)
rtc: Remove wrong deprecation comment
nvmem: include linux/err.h from header
rtc: st-lpc: make it robust against y2038/2106 bug
rtc: rtctest: add check for problematic dates
tools: timer: add rtctest_setdate
rtc: ds1307: remove ds1307_remove
rtc: ds1307: use generic nvmem
rtc: ds1307: switch to rtc_register_device
rtc: rv8803: remove rv8803_remove
rtc: rv8803: use generic nvmem support
rtc: rv8803: switch to rtc_register_device
rtc: add generic nvmem support
rtc: at91rm9200: remove race condition
rtc: introduce new registration method
rtc: class separate id allocation from registration
rtc: class separate device allocation from registration
rtc: stm32: add STM32H7 RTC support
dt-bindings: rtc: stm32: add support for STM32H7
rtc: ds1307: add ds1308 variant
rtc: ds3232: add temperature support
...
General updates
* Cleanups and additional flash support for "dataflash" driver
* new driver for mchp23k256 SPI SRAM device
* improve handling of MTDs without eraseblocks (i.e., MTD_NO_ERASE)
* refactor and improve "sub-partition" handling with TRX partition
parser; partitions can now be created as sub-partitions of another
partition
SPI NOR updates, from Cyrille Pitchen and Marek Vasut:
* introduce support to the SPI 1-2-2 and 1-4-4 protocols.
* introduce support to the Double Data Rate (DDR) mode.
* introduce support to the Octo SPI protocols.
* add support to new memory parts for Spansion, Macronix and Winbond.
* add fixes for the Aspeed, STM32 and Cadence QSPI controler drivers.
* clean up the st_spi_fsm driver.
NAND updates, from Boris Brezillon:
* addition of on-die ECC support to Micron driver
* addition of helpers to help drivers choose most appropriate ECC
settings
* deletion of dead-code (cached programming and ->errstat() hook)
* make sure drivers that do not support the SET/GET FEATURES command
return ENOTSUPP use a dummy ->set/get_features implementation
returning -ENOTSUPP (required for Micron on-die ECC)
* change the semantic of ecc->write_page() for drivers setting the
NAND_ECC_CUSTOM_PAGE_ACCESS flag
* support exiting 'GET STATUS' command in default ->cmdfunc()
implementations
* change the prototype of ->setup_data_interface()
A bunch of driver related changes:
* various cleanup, fixes and improvements of the MTK driver
* OMAP DT bindings fixes
* support for ->setup_data_interface() in the fsmc driver
* support for imx7 in the gpmi driver
* finalization of the denali driver rework (thanks to Masahiro for the
work he's done on this driver)
* fix "bitflips in erased pages" handling in the ifc driver
* addition of PM ops and dynamic timing configuration to the atmel
driver
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZZ7bdAAoJEFySrpd9RFgtZvIP+wfo25Lkv2gFRIFhnoDnxDfu
1pLVL8HrgTYBcD3dmr9ghONq+bxh2SSz3gU20i/eWmOmKy1OwaGegSj88hYpGOpS
2bwWWlczMqkX+upHw0une3ZrTb6pjoyHKHr5I5GYoJPgG2Dw2D3joehRkvMDispD
9cEik9HkyliHXy/1mqFsToe5RwdqauLbKR/a2XZQo89gt8n8Rnlt91Q5QOZytC6r
GLkuQzRAf4qVi4sgDb7zvFZW7KeyGTXTLDxKZGG9JETNjzcEJZMykAWxR9SwBCHa
tL7HjyaU5d2rXo4ukZ4IplKn9Y+BneDeGomy44DcGP6RAyNDqVC/R5eFW+MtlbwY
rm6SDxs9vCeUBrgIaJlVqDJxca/OR3ruHKILGbEfvIy/MmRQ4keBf357Dew8o4x/
wQw2dgznn3/vs5aqSz/E+erY22gdnaHtDApaefB/D0Kqi9fs2yVaAh3gGcXmloO9
yfRfzPugMRwI29gztMkgRWKWTCfHe2JN4hLDMVwO7Rt3ucQIbz642N/4JVMEpDcX
gJcaSgXn/u6xRJnEX/2u+B6ERNqVvLZ8fbnfD0fkPkjLOISvfg38xti1qgoxs8z8
tm5lMI7VR9/MKIxCXT/6Z+actDV21j/oo0QInV3YMxHDPl5KBj+migsRtDzpGhna
dmztYIMYqF9I29skWgXR
=ReBr
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20170713' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"General updates:
- Cleanups and additional flash support for "dataflash" driver
- new driver for mchp23k256 SPI SRAM device
- improve handling of MTDs without eraseblocks (i.e., MTD_NO_ERASE)
- refactor and improve "sub-partition" handling with TRX partition
parser; partitions can now be created as sub-partitions of another
partition
SPINOR updates, from Cyrille Pitchen and Marek Vasut:
- introduce support to the SPI 1-2-2 and 1-4-4 protocols.
- introduce support to the Double Data Rate (DDR) mode.
- introduce support to the Octo SPI protocols.
- add support to new memory parts for Spansion, Macronix and Winbond.
- add fixes for the Aspeed, STM32 and Cadence QSPI controler drivers.
- clean up the st_spi_fsm driver.
NAND updates, from Boris Brezillon:
- addition of on-die ECC support to Micron driver
- addition of helpers to help drivers choose most appropriate ECC
settings
- deletion of dead-code (cached programming and ->errstat() hook)
- make sure drivers that do not support the SET/GET FEATURES command
return ENOTSUPP use a dummy ->set/get_features implementation
returning -ENOTSUPP (required for Micron on-die ECC)
- change the semantic of ecc->write_page() for drivers setting the
NAND_ECC_CUSTOM_PAGE_ACCESS flag
- support exiting 'GET STATUS' command in default ->cmdfunc()
implementations
- change the prototype of ->setup_data_interface()
A bunch of driver related changes:
- various cleanup, fixes and improvements of the MTK driver
- OMAP DT bindings fixes
- support for ->setup_data_interface() in the fsmc driver
- support for imx7 in the gpmi driver
- finalization of the denali driver rework (thanks to Masahiro for
the work he's done on this driver)
- fix "bitflips in erased pages" handling in the ifc driver
- addition of PM ops and dynamic timing configuration to the atmel
driver"
* tag 'for-linus-20170713' of git://git.infradead.org/linux-mtd: (118 commits)
Documentation: ABI: mtd: describe "offset" more precisely
mtd: Fix check in mtd_unpoint()
mtd: nand: mtk: release lock on error path
mtd: st_spi_fsm: remove SPINOR_OP_RDSR2 and use SPINOR_OP_RDCR instead
mtd: spi-nor: cqspi: remove duplicate const
mtd: spi-nor: Add support for Spansion S25FL064L
mtd: spi-nor: Add support for mx66u51235f
mtd: nand: mtk: add ->setup_data_interface() hook
mtd: nand: mtk: remove unneeded mtk_ecc_hw_init from mtk_ecc_resume
mtd: nand: mtk: remove unneeded mtk_nfc_hw_init from mtk_nfc_resume
mtd: nand: mtk: disable ecc irq when writing page with hwecc
mtd: nand: mtk: fix incorrect register setting order about ecc irq
mtd: partitions: fixup some allocate_partition() whitespace
mtd: parsers: trx: fix pr_err format for printing offset
MAINTAINERS: Update SPI NOR subsystem git repositories
mtd: extract TRX parser out of bcm47xxpart into a separated module
mtd: partitions: add support for partition parsers
mtd: partitions: add support for subpartitions
mtd: partitions: rename "master" to the "parent" where appropriate
mtd: partitions: remove sysfs files when deleting all master's partitions
...
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZZvUdAAoJEAx081l5xIa+ZHQP/iFMSmu1H78hZ3amygQnCkyr
NSeeym+z2Aj4MADXjMM6mEmLxSAR3xzMeqD4yJjZTYcOiLoK2FI2t1yo216LVNm0
U4Ws5u1tLTBgr/tYSwCbAOsr1s4e7YvS4FdpK3echcTKp10TBSu3wd4utTidQAit
Q7h2+mnoj92WbBdbIcydIHpg931xbmEyRQIuOJuIAuYdG3iGlflQ1EsUAuozV8Hn
z9S93IwtTeDZMYEjQcCJFpGrnUzHP/oi2WatW6c8GXwykuI7e9px9KmxqO/kWibV
JfrVuDZFJKVyBga7cc3xHf8zmvnzsv0QhZBPb9hOGuH093RkUlF2Q6fswnvy9MPU
Ym1t/sb8kgF0tgPBxTt3F6m5mFSi/Rz9WTof8y/y4bN3z2LCCsv65VWaqpZUvnGD
x6my5wgCgXRNHxK031lMNOLItv2rg4rbkJcL+VE1No2TjbnjZxcFgMdihwCyrkYk
B1oeFUZk+oDS4ho0dL0LZr9OJkvF/X4yF0Ubx/CjYf5lTRr1Gtaha+qyaSqTbcEO
qm+aprNTm8QPKx9ZAVOZObEj7bt3R0r4HzxUTTy5VyYJqNx6NySINIoMgEi/e/wc
ScJVzI2EKmQh8xVtEle5wGz1gMGZMHj5Q/v/r1fkjXf07R931im+w3lxif4I5K1e
TOvgMJ2rBgYXQ55lssN/
=TGAC
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-for-v4.13-rc1' of git://people.freedesktop.org/~airlied/linux
Pull more drm updates from Dave Airlie:
"i915, amd and some core fixes + mediatek color support.
Some fixes tree came in since the main pull request for rc1, primarily
i915 and drm-misc and one amd fix. The drm core vblank regression fix
is probably the most important thing.
I've also added the mediatek feature pull, it wasn't that big and
didn't look like it would have any impact outside of mediatek, in fact
it looks to just be a single feature, and some cleanups"
* tag 'drm-fixes-for-v4.13-rc1' of git://people.freedesktop.org/~airlied/linux: (31 commits)
drm/i915: Make DP-MST connector info work
drm/i915/gvt: Use fence error from GVT request for workload status
drm/i915/gvt: remove scheduler_mutex in per-engine workload_thread
drm/i915/gvt: Revert "drm/i915/gvt: Fix possible recursive locking issue"
drm/i915/gvt: Audit the command buffer address
drm/i915/gvt: Fix a memory leak in intel_gvt_init_gtt()
drm/rockchip: fix NULL check on devm_kzalloc() return value
drm/i915/fbdev: Check for existence of ifbdev->vma before operations
drm/radeon: Fix eDP for single-display iMac10,1 (v2)
drm/i915: Hold RPM wakelock while initializing OA buffer
drm/i915/cnl: Fix the CURSOR_COEFF_MASK used in DDI Vswing Programming
drm/i915/cfl: Fix Workarounds.
drm/i915: Avoid undefined behaviour of "u32 >> 32"
drm/i915: reintroduce VLV/CHV PFI programming power domain workaround
drm/i915: Fix an error checking test
drm/i915: Disable MSI for all pre-gen5
drm/atomic: Add missing drm_atomic_state_clear to atomic_remove_fb
drm: vblank: Fix vblank timestamp update
drm/i915/gvt: Make function dpy_reg_mmio_readx safe
drm/mediatek: separate color module to fixup error memory reallocation
...
There is a flaw in the Hyper-V SynIC implementation in KVM: when message
page or event flags page is enabled by setting the corresponding msr,
KVM zeroes it out. This is problematic because on migration the
corresponding MSRs are loaded on the destination, so the content of
those pages is lost.
This went unnoticed so far because the only user of those pages was
in-KVM hyperv synic timers, which could continue working despite that
zeroing.
Newer QEMU uses those pages for Hyper-V VMBus implementation, and
zeroing them breaks the migration.
Besides, in newer QEMU the content of those pages is fully managed by
QEMU, so zeroing them is undesirable even when writing the MSRs from the
guest side.
To support this new scheme, introduce a new capability,
KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic
pages aren't zeroed out in KVM.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
This extends the existing set of bulk helpers with prepare_enable and
disable_unprepare variants.
Cc: Russell King <linux@armlinux.org.uk>,
Cc: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Pull sysctl fix from Eric Biederman:
"A rather embarassing and hard to hit bug was merged into 4.11-rc1.
Andrei Vagin tracked this bug now and after some staring at the code
I came up with a fix"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc: Fix proc_sys_prune_dcache to hold a sb reference
Pull networking fixes from David Miller:
1) Fix 64-bit division in mlx5 IPSEC offload support, from Ilan Tayari
and Arnd Bergmann.
2) Fix race in statistics gathering in bnxt_en driver, from Michael
Chan.
3) Can't use a mutex in RCU reader protected section on tap driver, from
Cong WANG.
4) Fix mdb leak in bridging code, from Eduardo Valentin.
5) Fix free of wrong pointer variable in nfp driver, from Dan Carpenter.
6) Buffer overflow in brcmfmac driver, from Arend van SPriel.
7) ioremap_nocache() return value needs to be checked in smsc911x
driver, from Alexey Khoroshilov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
net: stmmac: revert "support future possible different internal phy mode"
sfc: don't read beyond unicast address list
datagram: fix kernel-doc comments
socket: add documentation for missing elements
smsc911x: Add check for ioremap_nocache() return code
brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx()
net: hns: Bugfix for Tx timeout handling in hns driver
net: ipmr: ipmr_get_table() returns NULL
nfp: freeing the wrong variable
mlxsw: spectrum_switchdev: Check status of memory allocation
mlxsw: spectrum_switchdev: Remove unused variable
mlxsw: spectrum_router: Fix use-after-free in route replace
mlxsw: spectrum_router: Add missing rollback
samples/bpf: fix a build issue
bridge: mdb: fix leak on complete_info ptr on fail path
tap: convert a mutex to a spinlock
cxgb4: fix BUG() on interrupt deallocating path of ULD
qed: Fix printk option passed when printing ipv6 addresses
net: Fix minor code bug in timestamping.txt
net: stmmac: Make 'alloc_dma_[rt]x_desc_resources()' look even closer
...
Currently the writeback statistics code uses a percpu counters to hold
various statistics. Furthermore we have 2 families of functions - those
which disable local irq and those which doesn't and whose names begin
with double underscore. However, they both end up calling
__add_wb_stats which in turn calls percpu_counter_add_batch which is
already irq-safe.
Exploiting this fact allows to eliminated the __wb_* functions since
they don't add any further protection than we already have.
Furthermore, refactor the wb_* function to call __add_wb_stat directly
without the irq-disabling dance. This will likely result in better
runtime of code which deals with modifying the stat counters.
While at it also document why percpu_counter_add_batch is in fact
preempt and irq-safe since at least 3 people got confused.
Link: http://lkml.kernel.org/r/1498029937-27293-1-git-send-email-nborisov@suse.com
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Page migration (for memory hotplug, soft_offline_page or mbind) needs to
allocate a new memory. This can trigger an oom killer if the target
memory is depleated. Although quite unlikely, still possible,
especially for the memory hotplug (offlining of memoery).
Up to now we didn't really have reasonable means to back off.
__GFP_NORETRY can fail just too easily and __GFP_THISNODE sticks to a
single node and that is not suitable for all callers.
But now that we have __GFP_RETRY_MAYFAIL we should use it. It is
preferable to fail the migration than disrupt the system by killing some
processes.
Link: http://lkml.kernel.org/r/20170623085345.11304-7-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alex Belits <alex.belits@cavium.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: NeilBrown <neilb@suse.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__GFP_REPEAT was designed to allow retry-but-eventually-fail semantic to
the page allocator. This has been true but only for allocations
requests larger than PAGE_ALLOC_COSTLY_ORDER. It has been always
ignored for smaller sizes. This is a bit unfortunate because there is
no way to express the same semantic for those requests and they are
considered too important to fail so they might end up looping in the
page allocator for ever, similarly to GFP_NOFAIL requests.
Now that the whole tree has been cleaned up and accidental or misled
usage of __GFP_REPEAT flag has been removed for !costly requests we can
give the original flag a better name and more importantly a more useful
semantic. Let's rename it to __GFP_RETRY_MAYFAIL which tells the user
that the allocator would try really hard but there is no promise of a
success. This will work independent of the order and overrides the
default allocator behavior. Page allocator users have several levels of
guarantee vs. cost options (take GFP_KERNEL as an example)
- GFP_KERNEL & ~__GFP_RECLAIM - optimistic allocation without _any_
attempt to free memory at all. The most light weight mode which even
doesn't kick the background reclaim. Should be used carefully because
it might deplete the memory and the next user might hit the more
aggressive reclaim
- GFP_KERNEL & ~__GFP_DIRECT_RECLAIM (or GFP_NOWAIT)- optimistic
allocation without any attempt to free memory from the current
context but can wake kswapd to reclaim memory if the zone is below
the low watermark. Can be used from either atomic contexts or when
the request is a performance optimization and there is another
fallback for a slow path.
- (GFP_KERNEL|__GFP_HIGH) & ~__GFP_DIRECT_RECLAIM (aka GFP_ATOMIC) -
non sleeping allocation with an expensive fallback so it can access
some portion of memory reserves. Usually used from interrupt/bh
context with an expensive slow path fallback.
- GFP_KERNEL - both background and direct reclaim are allowed and the
_default_ page allocator behavior is used. That means that !costly
allocation requests are basically nofail but there is no guarantee of
that behavior so failures have to be checked properly by callers
(e.g. OOM killer victim is allowed to fail currently).
- GFP_KERNEL | __GFP_NORETRY - overrides the default allocator behavior
and all allocation requests fail early rather than cause disruptive
reclaim (one round of reclaim in this implementation). The OOM killer
is not invoked.
- GFP_KERNEL | __GFP_RETRY_MAYFAIL - overrides the default allocator
behavior and all allocation requests try really hard. The request
will fail if the reclaim cannot make any progress. The OOM killer
won't be triggered.
- GFP_KERNEL | __GFP_NOFAIL - overrides the default allocator behavior
and all allocation requests will loop endlessly until they succeed.
This might be really dangerous especially for larger orders.
Existing users of __GFP_REPEAT are changed to __GFP_RETRY_MAYFAIL
because they already had their semantic. No new users are added.
__alloc_pages_slowpath is changed to bail out for __GFP_RETRY_MAYFAIL if
there is no progress and we have already passed the OOM point.
This means that all the reclaim opportunities have been exhausted except
the most disruptive one (the OOM killer) and a user defined fallback
behavior is more sensible than keep retrying in the page allocator.
[akpm@linux-foundation.org: fix arch/sparc/kernel/mdesc.c]
[mhocko@suse.com: semantic fix]
Link: http://lkml.kernel.org/r/20170626123847.GM11534@dhcp22.suse.cz
[mhocko@kernel.org: address other thing spotted by Vlastimil]
Link: http://lkml.kernel.org/r/20170626124233.GN11534@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20170623085345.11304-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alex Belits <alex.belits@cavium.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: NeilBrown <neilb@suse.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "stackprotector: ascii armor the stack canary", v2.
Zero out the first byte of the stack canary value on 64 bit systems, in
order to mitigate unterminated C string overflows.
The null byte both prevents C string functions from reading the canary,
and from writing it if the canary value were guessed or obtained through
some other means.
Reducing the entropy by 8 bits is acceptable on 64-bit systems, which
will still have 56 bits of entropy left, but not on 32 bit systems, so
the "ascii armor" canary is only implemented on 64-bit systems.
Inspired by the "ascii armor" code in execshield and Daniel Micay's
linux-hardened tree.
Also see https://github.com/thestinger/linux-hardened/
This patch (of 5):
Introduce get_random_canary(), which provides a random unsigned long
canary value with the first byte zeroed out on 64 bit architectures, in
order to mitigate non-terminated C string overflows.
The null byte both prevents C string functions from reading the canary,
and from writing it if the canary value were guessed or obtained through
some other means.
Reducing the entropy by 8 bits is acceptable on 64-bit systems, which
will still have 56 bits of entropy left, but not on 32 bit systems, so
the "ascii armor" canary is only implemented on 64-bit systems.
Inspired by the "ascii armor" code in the old execshield patches, and
Daniel Micay's linux-hardened tree.
Link: http://lkml.kernel.org/r/20170524155751.424-2-riel@redhat.com
Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds support for compiling with a rough equivalent to the glibc
_FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer
overflow checks for string.h functions when the compiler determines the
size of the source or destination buffer at compile-time. Unlike glibc,
it covers buffer reads in addition to writes.
GNU C __builtin_*_chk intrinsics are avoided because they would force a
much more complex implementation. They aren't designed to detect read
overflows and offer no real benefit when using an implementation based
on inline checks. Inline checks don't add up to much code size and
allow full use of the regular string intrinsics while avoiding the need
for a bunch of _chk functions and per-arch assembly to avoid wrapper
overhead.
This detects various overflows at compile-time in various drivers and
some non-x86 core kernel code. There will likely be issues caught in
regular use at runtime too.
Future improvements left out of initial implementation for simplicity,
as it's all quite optional and can be done incrementally:
* Some of the fortified string functions (strncpy, strcat), don't yet
place a limit on reads from the source based on __builtin_object_size of
the source buffer.
* Extending coverage to more string functions like strlcat.
* It should be possible to optionally use __builtin_object_size(x, 1) for
some functions (C strings) to detect intra-object overflows (like
glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative
approach to avoid likely compatibility issues.
* The compile-time checks should be made available via a separate config
option which can be enabled by default (or always enabled) once enough
time has passed to get the issues it catches fixed.
Kees said:
"This is great to have. While it was out-of-tree code, it would have
blocked at least CVE-2016-3858 from being exploitable (improper size
argument to strlcpy()). I've sent a number of fixes for
out-of-bounds-reads that this detected upstream already"
[arnd@arndb.de: x86: fix fortified memcpy]
Link: http://lkml.kernel.org/r/20170627150047.660360-1-arnd@arndb.de
[keescook@chromium.org: avoid panic() in favor of BUG()]
Link: http://lkml.kernel.org/r/20170626235122.GA25261@beast
[keescook@chromium.org: move from -mm, add ARCH_HAS_FORTIFY_SOURCE, tweak Kconfig help]
Link: http://lkml.kernel.org/r/20170526095404.20439-1-danielmicay@gmail.com
Link: http://lkml.kernel.org/r/1497903987-21002-8-git-send-email-keescook@chromium.org
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Split SOFTLOCKUP_DETECTOR from LOCKUP_DETECTOR, and split
HARDLOCKUP_DETECTOR_PERF from HARDLOCKUP_DETECTOR.
LOCKUP_DETECTOR implies the general boot, sysctl, and programming
interfaces for the lockup detectors.
An architecture that wants to use a hard lockup detector must define
HAVE_HARDLOCKUP_DETECTOR_PERF or HAVE_HARDLOCKUP_DETECTOR_ARCH.
Alternatively an arch can define HAVE_NMI_WATCHDOG, which provides the
minimum arch_touch_nmi_watchdog, and it otherwise does its own thing and
does not implement the LOCKUP_DETECTOR interfaces.
sparc is unusual in that it has started to implement some of the
interfaces, but not fully yet. It should probably be converted to a full
HAVE_HARDLOCKUP_DETECTOR_ARCH.
[npiggin@gmail.com: fix]
Link: http://lkml.kernel.org/r/20170617223522.66c0ad88@roar.ozlabs.ibm.com
Link: http://lkml.kernel.org/r/20170616065715.18390-4-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Tested-by: Babu Moger <babu.moger@oracle.com> [sparc]
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For architectures that define HAVE_NMI_WATCHDOG, instead of having them
provide the complete touch_nmi_watchdog() function, just have them
provide arch_touch_nmi_watchdog().
This gives the generic code more flexibility in implementing this
function, and arch implementations don't miss out on touching the
softlockup watchdog or other generic details.
Link: http://lkml.kernel.org/r/20170616065715.18390-3-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Tested-by: Babu Moger <babu.moger@oracle.com> [sparc]
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "Improve watchdog config for arch watchdogs", v4.
A series to make the hardlockup watchdog more easily replaceable by arch
code. The last patch provides some justification for why we want to do
this (existing sparc watchdog is another that could benefit).
This patch (of 5):
Remove unused declaration.
Link: http://lkml.kernel.org/r/20170616065715.18390-2-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Tested-by: Babu Moger <babu.moger@oracle.com> [sparc]
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sem_ctime is initialized to the semget() time and then updated at every
semctl() that changes the array.
Thus it does not represent the time of the last change.
Especially, semop() calls are only stored in sem_otime, not in
sem_ctime.
This is already described in ipc/sem.c, I just overlooked that there is
a comment in include/linux/sem.h and man semctl(2) as well.
So: Correct wrong comments.
Link: http://lkml.kernel.org/r/20170515171912.6298-4-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: <1vier1@web.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ipc has two management structures that exist for every id:
- struct kern_ipc_perm, it contains e.g. the permissions.
- struct ipc_rcu, it contains the rcu head for rcu handling and the
refcount.
The patch merges both structures.
As a bonus, we may save one cacheline, because both structures are
cacheline aligned. In addition, it reduces the number of casts, instead
most codepaths can use container_of.
To simplify code, the ipc_rcu_alloc initializes the allocation to 0.
[manfred@colorfullife.com: really include the memset() into ipc_alloc_rcu()]
Link: http://lkml.kernel.org/r/564f8612-0601-b267-514f-a9f650ec9b32@colorfullife.com
Link: http://lkml.kernel.org/r/20170525185107.12869-3-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sma->sem_base is initialized with
sma->sem_base = (struct sem *) &sma[1];
The current code has four problems:
- There is an unnecessary pointer dereference - sem_base is not needed.
- Alignment for struct sem only works by chance.
- The current code causes false positive for static code analysis.
- This is a cast between different non-void types, which the future
randstruct GCC plugin warns on.
And, as bonus, the code size gets smaller:
Before:
0 .text 00003770
After:
0 .text 0000374e
[manfred@colorfullife.com: s/[0]/[]/, per hch]
Link: http://lkml.kernel.org/r/20170525185107.12869-2-manfred@colorfullife.com
Link: http://lkml.kernel.org/r/20170515171912.6298-2-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: <1vier1@web.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add /proc/self/task/<current-tid>/fail-nth file that allows failing
0-th, 1-st, 2-nd and so on calls systematically.
Excerpt from the added documentation:
"Write to this file of integer N makes N-th call in the current task
fail (N is 0-based). Read from this file returns a single char 'Y' or
'N' that says if the fault setup with a previous write to this file
was injected or not, and disables the fault if it wasn't yet injected.
Note that this file enables all types of faults (slab, futex, etc).
This setting takes precedence over all other generic settings like
probability, interval, times, etc. But per-capability settings (e.g.
fail_futex/ignore-private) take precedence over it. This feature is
intended for systematic testing of faults in a single system call. See
an example below"
Why add a new setting:
1. Existing settings are global rather than per-task.
So parallel testing is not possible.
2. attr->interval is close but it depends on attr->count
which is non reset to 0, so interval does not work as expected.
3. Trying to model this with existing settings requires manipulations
of all of probability, interval, times, space, task-filter and
unexposed count and per-task make-it-fail files.
4. Existing settings are per-failure-type, and the set of failure
types is potentially expanding.
5. make-it-fail can't be changed by unprivileged user and aggressive
stress testing better be done from an unprivileged user.
Similarly, this would require opening the debugfs files to the
unprivileged user, as he would need to reopen at least times file
(not possible to pre-open before dropping privs).
The proposed interface solves all of the above (see the example).
We want to integrate this into syzkaller fuzzer. A prototype has found
10 bugs in kernel in first day of usage:
https://groups.google.com/forum/#!searchin/syzkaller/%22FAULT_INJECTION%22%7Csort:relevance
I've made the current interface work with all types of our sandboxes.
For setuid the secret sauce was prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) to
make /proc entries non-root owned. So I am fine with the current
version of the code.
[akpm@linux-foundation.org: fix build]
Link: http://lkml.kernel.org/r/20170328130128.101773-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kcmp syscall is build iif CONFIG_CHECKPOINT_RESTORE is selected, so wrap
appropriate helpers in epoll code with the config to build it
conditionally.
Link: http://lkml.kernel.org/r/20170513083456.GG1881@uranus.lan
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reported-by: Andrew Morton <akpm@linuxfoundation.org>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With current epoll architecture target files are addressed with
file_struct and file descriptor number, where the last is not unique.
Moreover files can be transferred from another process via unix socket,
added into queue and closed then so we won't find this descriptor in the
task fdinfo list.
Thus to checkpoint and restore such processes CRIU needs to find out
where exactly the target file is present to add it into epoll queue.
For this sake one can use kcmp call where some particular target file
from the queue is compared with arbitrary file passed as an argument.
Because epoll target files can have same file descriptor number but
different file_struct a caller should explicitly specify the offset
within.
To test if some particular file is matching entry inside epoll one have
to
- fill kcmp_epoll_slot structure with epoll file descriptor,
target file number and target file offset (in case if only
one target is present then it should be 0)
- call kcmp as kcmp(pid1, pid2, KCMP_EPOLL_TFD, fd, &kcmp_epoll_slot)
- the kernel fetch file pointer matching file descriptor @fd of pid1
- lookups for file struct in epoll queue of pid2 and returns traditional
0,1,2 result for sorting purpose
Link: http://lkml.kernel.org/r/20170424154423.511592110@gmail.com
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To keep parity with regular int interfaces provide the an unsigned int
proc_douintvec_minmax() which allows you to specify a range of allowed
valid numbers.
Adding proc_douintvec_minmax_sysadmin() is easy but we can wait for an
actual user for that.
Link: http://lkml.kernel.org/r/20170519033554.18592-6-mcgrof@kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently vmcoreinfo data is updated at boot time subsys_initcall(), it
has the risk of being modified by some wrong code during system is
running.
As a result, vmcore dumped may contain the wrong vmcoreinfo. Later on,
when using "crash", "makedumpfile", etc utility to parse this vmcore, we
probably will get "Segmentation fault" or other unexpected errors.
E.g. 1) wrong code overwrites vmcoreinfo_data; 2) further crashes the
system; 3) trigger kdump, then we obviously will fail to recognize the
crash context correctly due to the corrupted vmcoreinfo.
Now except for vmcoreinfo, all the crash data is well
protected(including the cpu note which is fully updated in the crash
path, thus its correctness is guaranteed). Given that vmcoreinfo data
is a large chunk prepared for kdump, we better protect it as well.
To solve this, we relocate and copy vmcoreinfo_data to the crash memory
when kdump is loading via kexec syscalls. Because the whole crash
memory will be protected by existing arch_kexec_protect_crashkres()
mechanism, we naturally protect vmcoreinfo_data from write(even read)
access under kernel direct mapping after kdump is loaded.
Since kdump is usually loaded at the very early stage after boot, we can
trust the correctness of the vmcoreinfo data copied.
On the other hand, we still need to operate the vmcoreinfo safe copy
when crash happens to generate vmcoreinfo_note again, we rely on vmap()
to map out a new kernel virtual address and update to use this new one
instead in the following crash_save_vmcoreinfo().
BTW, we do not touch vmcoreinfo_note, because it will be fully updated
using the protected vmcoreinfo_data after crash which is surely correct
just like the cpu crash note.
Link: http://lkml.kernel.org/r/1493281021-20737-3-git-send-email-xlpang@redhat.com
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
vmcoreinfo_max_size stands for the vmcoreinfo_data, the correct one we
should use is vmcoreinfo_note whose total size is VMCOREINFO_NOTE_SIZE.
Like explained in commit 77019967f0 ("kdump: fix exported size of
vmcoreinfo note"), it should not affect the actual function, but we
better fix it, also this change should be safe and backward compatible.
After this, we can get rid of variable vmcoreinfo_max_size, let's use
the corresponding macros directly, fewer variables means more safety for
vmcoreinfo operation.
[xlpang@redhat.com: fix build warning]
Link: http://lkml.kernel.org/r/1494830606-27736-1-git-send-email-xlpang@redhat.com
Link: http://lkml.kernel.org/r/1493281021-20737-2-git-send-email-xlpang@redhat.com
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reviewed-by: Dave Young <dyoung@redhat.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As Eric said,
"what we need to do is move the variable vmcoreinfo_note out of the
kernel's .bss section. And modify the code to regenerate and keep this
information in something like the control page.
Definitely something like this needs a page all to itself, and ideally
far away from any other kernel data structures. I clearly was not
watching closely the data someone decided to keep this silly thing in
the kernel's .bss section."
This patch allocates extra pages for these vmcoreinfo_XXX variables, one
advantage is that it enhances some safety of vmcoreinfo, because
vmcoreinfo now is kept far away from other kernel data structures.
Link: http://lkml.kernel.org/r/1493281021-20737-1-git-send-email-xlpang@redhat.com
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Suggested-by: Eric Biederman <ebiederm@xmission.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the first parameter of container_of() is a pointer to a
non-const-qualified array type (and the third parameter names a
non-const-qualified array member), the local variable __mptr will be
defined with a const-qualified array type. In ISO C, these types are
incompatible. They work as expected in GNU C, but some versions will
issue warnings. For example, GCC 4.9 produces the warning
"initialization from incompatible pointer type".
Here is an example of where the problem occurs:
-------------------------------------------------------
#include <linux/kernel.h>
#include <linux/module.h>
MODULE_LICENSE("GPL");
struct st {
int a;
char b[16];
};
static int __init example_init(void) {
struct st t = { .a = 101, .b = "hello" };
char (*p)[16] = &t.b;
struct st *x = container_of(p, struct st, b);
printk(KERN_DEBUG "%p %p\n", (void *)&t, (void *)x);
return 0;
}
static void __exit example_exit(void) {
}
module_init(example_init);
module_exit(example_exit);
-------------------------------------------------------
Building the module with gcc-4.9 results in these warnings (where '{m}'
is the module source and '{k}' is the kernel source):
-------------------------------------------------------
In file included from {m}/example.c:1:0:
{m}/example.c: In function `example_init':
{k}/include/linux/kernel.h:854:48: warning: initialization from incompatible pointer type
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
^
{m}/example.c:14:17: note: in expansion of macro `container_of'
struct st *x = container_of(p, struct st, b);
^
{k}/include/linux/kernel.h:854:48: warning: (near initialization for `x')
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
^
{m}/example.c:14:17: note: in expansion of macro `container_of'
struct st *x = container_of(p, struct st, b);
^
-------------------------------------------------------
Replace the type checking performed by the macro to avoid these
warnings. Make sure `*(ptr)` either has type compatible with the
member, or has type compatible with `void`, ignoring qualifiers. Raise
compiler errors if this is not true. This is stronger than the previous
behaviour, which only resulted in compiler warnings for a type mismatch.
[arnd@arndb.de: fix new warnings for container_of()]
Link: http://lkml.kernel.org/r/20170620200940.90557-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/20170525120316.24473-7-abbotti@mev.co.uk
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
"kernel.h: handle pointers to arrays better in container_of()" triggers:
In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from include/linux/syscalls.h:71,
from fs/dcache.c:17:
fs/dcache.c: In function 'release_dentry_name_snapshot':
include/linux/compiler.h:542:38: error: call to '__compiletime_assert_305' declared with attribute error: pointer type mismatch in container_of()
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
^
include/linux/compiler.h:525:4: note: in definition of macro '__compiletime_assert'
prefix ## suffix(); \
^
include/linux/compiler.h:542:2: note: in expansion of macro '_compiletime_assert'
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
^
include/linux/build_bug.h:46:37: note: in expansion of macro 'compiletime_assert'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^
include/linux/kernel.h:860:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
^
fs/dcache.c:305:7: note: in expansion of macro 'container_of'
p = container_of(name->name, struct external_name, name[0]);
Switch name_snapshot to use unsigned chars, matching struct qstr and
struct external_name.
Link: http://lkml.kernel.org/r/20170710152134.0f78c1e6@canb.auug.org.au
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fill in missing kernel-doc for missing elements in struct sock.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rtc_time_to_tm and rtc_tm_to_time are not deprecated and make perfect sense
for RTCs that are simple 32bit counters.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
In case of KVM_S390_GET_CMMA_BITS, the kernel does not only read struct
kvm_s390_cmma_log passed from userspace (which constitutes _IOC_WRITE),
it also writes back a return value (which constitutes _IOC_READ) making
this an _IOWR ioctl instead of _IOW.
Fixes: 4036e387 ("KVM: s390: ioctls to get and set guest storage attributes")
Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Clean up: Registration mode details are now handled by the rdma_rw
API, and thus can be removed from svcrdma.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Clean up: Now that the svc_rdma_recvfrom path uses the rdma_rw API,
the details of Read sink buffer registration are dealt with by the
kernel's RDMA core. This cache is no longer used, and can be
removed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Clean up:
The generic RDMA R/W API conversion of svc_rdma_recvfrom replaced
the Register, Read, and Invalidate completion handlers. Remove the
old ones, which are no longer used.
These handlers shared some helper code with svc_rdma_wc_send. Fold
the wc_common helper back into the one remaining completion handler.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The current svcrdma recvfrom code path has a lot of detail about
registration mode and the type of port (iWARP, IB, etc).
Instead, use the RDMA core's generic R/W API. This shares code with
other RDMA-enabled ULPs that manages the gory details of buffer
registration and the posting of RDMA Read Work Requests.
Since the Read list marshaling code is being replaced, I took the
opportunity to replace C structure-based XDR encoding code with more
portable code that uses pointer arithmetic.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
svc_rdma_rw.c already contains helpers for the sendto path.
Introduce helpers for the recvfrom path.
The plan is to replace the local NFSD bespoke code that constructs
and posts RDMA Read Work Requests with calls to the rdma_rw API.
This shares code with other RDMA-enabled ULPs that manages the gory
details of buffer registration and posting Work Requests.
This new code also puts all RDMA_NOMSG-specific logic in one place.
Lastly, the use of rqstp->rq_arg.pages is deprecated in favor of
using rqstp->rq_pages directly, for clarity.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
svcrdma needs 259 pages allocated to receive 1MB NFSv4.0 WRITE requests:
- 1 page for the transport header and head iovec
- 256 pages for the data payload
- 1 page for the trailing GETATTR request (since NFSD XDR decoding
does not look for a tail iovec, the GETATTR is stuck at the end
of the rqstp->rq_arg.pages list)
- 1 page for building the reply xdr_buf
But RPCSVC_MAXPAGES is already 259 (on x86_64). The problem is that
svc_alloc_arg never allocates that many pages. To address this:
1. The final element of rq_pages always points to NULL. To
accommodate up to 259 pages in rq_pages, add an extra element
to rq_pages for the array termination sentinel.
2. Adjust the calculation of "pages" to match how RPCSVC_MAXPAGES
is calculated, so it can go up to 259. Bruce noted that the
calculation assumes sv_max_mesg is a multiple of PAGE_SIZE,
which might not always be true. I didn't change this assumption.
3. Change the loop boundaries to allow 259 pages to be allocated.
Additional clean-up: WARN_ON_ONCE adds an extra conditional branch,
which is basically never taken. And there's no need to dump the
stack here because svc_alloc_arg has only one caller.
Keeping that NULL "array termination sentinel"; there doesn't appear to
be any code that depends on it, only code in nfsd_splice_actor() which
needs the 259th element to be initialized to *something*. So it's
possible we could just keep the array at 259 elements and drop that
final NULL, but we're being conservative for now.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull i2c updates from Wolfram Sang:
"This pull request contains:
- i2c core reorganization. One source file became too monolithic. It
is now split up, yet we still have the same named object as the
final output. This should ease maintenance.
- new drivers: ZTE ZX2967 family, ASPEED 24XX/25XX
- designware driver gained slave mode support
- xgene-slimpro driver gained ACPI support
- bigger overhaul for pca-platform driver
- the algo-bit module now supports messages with enforced STOP
- slightly bigger than usual set of driver updates and improvements
and with much appreciated quality assurance from Andy Shevchenko"
* 'i2c/for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (51 commits)
i2c: Provide a stub for i2c_detect_slave_mode()
i2c: designware: Let slave adapter support be optional
i2c: designware: Make HW init functions static
i2c: designware: fix spelling mistakes
i2c: pca-platform: propagate error from i2c_pca_add_numbered_bus
i2c: pca-platform: correctly set algo_data.reset_chip
i2c: acpi: Do not create i2c-clients for LNXVIDEO ACPI devices
i2c: designware: enable SLAVE in platform module
i2c: designware: add SLAVE mode functions
i2c: zx2967: drop COMPILE_TEST dependency
i2c: zx2967: always use the same device when printing errors
i2c: pca-platform: use dev_warn/dev_info instead of printk
i2c: pca-platform: use device managed allocations
i2c: pca-platform: add devicetree awareness
i2c: pca-platform: switch to struct gpio_desc
dt-bindings: add bindings for i2c-pca-platform
i2c: cadance: fix ctrl/addr reg write order
i2c: zx2967: add i2c controller driver for ZTE's zx2967 family
dt: bindings: add documentation for zx2967 family i2c controller
i2c: algo-bit: add support for I2C_M_STOP
...
This update comes with:
* Support for lockless operation in the ARM io-pgtable code.
This is an important step to solve the scalability problems in
the common dma-iommu code for ARM
* Some Errata workarounds for ARM SMMU implemenations
* Rewrite of the deferred IO/TLB flush code in the AMD IOMMU
driver. The code suffered from very high flush rates, with the
new implementation the flush rate is down to ~1% of what it
was before
* Support for amd_iommu=off when booting with kexec. Problem
here was that the IOMMU driver bailed out early without
disabling the iommu hardware, if it was enabled in the old
kernel
* The Rockchip IOMMU driver is now available on ARM64
* Align the return value of the iommu_ops->device_group
call-backs to not miss error values
* Preempt-disable optimizations in the Intel VT-d and common
IOVA code to help Linux-RT
* Various other small cleanups and fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJZZgddAAoJECvwRC2XARrjurgQANO338GIBr2ZkA0oectidDpZ
Y4yu7W9RH6NyhupJG/Xooya7daBWFjbaA1AVJ3ZZNlMERh69AmehVfRUfVMzF2w+
buma58HQgiJWN1zFD8xdeMzYKms9P77whA88C/9QvrK/klB3LipWP2SC0yvvvyxJ
mMCDpgt+D+CGnIDqbRuyLDQoRu3yjAkAvYb6OzL8DPJVP1Y5oLffGwGnHzJbJnOf
eWJwYHM5ai0uF/Qqy6RNNekacObjVaOLihjugGvokH6ipXfOrSSNriXW9pZiWR5m
S91898YTP3KuWWsJM+N93UAjvc6pL9PqL/OvbB9zdYpzu+5PtUpFXHYcOebKyEEO
4j9CaRzubsWFTFjbYItJnR4WgXQRf4NKOGfTfHMHA+dY8aODYnlXNVdQDAA2aFgn
TUBvHq5xb0zZ3nbPwtTDyW06oDMVfBBarLx2yFI1aQSSh+eg/GtIi5KP28gyFZNz
4gWj0q3g/e3y7WEwNbYV7L3TS0d/p8VUYFtUp7PUCddnWoY+4cJzgidub5xIViZD
Ql0nZzga9pXXIE/kE5Pf74WqrG7JJzZsvK2ABy4+XGrMq6RclJf+0pXbSqiXDpXL
quw8t0oXw0ZEeavQ31Za8mjXBvo5ocM5iintl1wrl2BujHEO3oKqbGsIOaRcLnlN
Ukehbl4OEKzZpD3oLPPk
=pmBf
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"This update comes with:
- Support for lockless operation in the ARM io-pgtable code.
This is an important step to solve the scalability problems in the
common dma-iommu code for ARM
- Some Errata workarounds for ARM SMMU implemenations
- Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver.
The code suffered from very high flush rates, with the new
implementation the flush rate is down to ~1% of what it was before
- Support for amd_iommu=off when booting with kexec.
The problem here was that the IOMMU driver bailed out early without
disabling the iommu hardware, if it was enabled in the old kernel
- The Rockchip IOMMU driver is now available on ARM64
- Align the return value of the iommu_ops->device_group call-backs to
not miss error values
- Preempt-disable optimizations in the Intel VT-d and common IOVA
code to help Linux-RT
- Various other small cleanups and fixes"
* tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (60 commits)
iommu/vt-d: Constify intel_dma_ops
iommu: Warn once when device_group callback returns NULL
iommu/omap: Return ERR_PTR in device_group call-back
iommu: Return ERR_PTR() values from device_group call-backs
iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device()
iommu/vt-d: Don't disable preemption while accessing deferred_flush()
iommu/iova: Don't disable preempt around this_cpu_ptr()
iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126
iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701)
iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74
ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model
iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions
iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table
iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE
iommu/arm-smmu-v3: Remove io-pgtable spinlock
iommu/arm-smmu: Remove io-pgtable spinlock
iommu/io-pgtable-arm-v7s: Support lockless operation
iommu/io-pgtable-arm: Support lockless operation
iommu/io-pgtable: Introduce explicit coherency
iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap
...
Pull overlayfs updates from Miklos Szeredi:
"This work from Amir introduces the inodes index feature, which
provides:
- hardlinks are not broken on copy up
- infrastructure for overlayfs NFS export
This also fixes constant st_ino for samefs case for lower hardlinks"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (33 commits)
ovl: mark parent impure and restore timestamp on ovl_link_up()
ovl: document copying layers restrictions with inodes index
ovl: cleanup orphan index entries
ovl: persistent overlay inode nlink for indexed inodes
ovl: implement index dir copy up
ovl: move copy up lock out
ovl: rearrange copy up
ovl: add flag for upper in ovl_entry
ovl: use struct copy_up_ctx as function argument
ovl: base tmpfile in workdir too
ovl: factor out ovl_copy_up_inode() helper
ovl: extract helper to get temp file in copy up
ovl: defer upper dir lock to tempfile link
ovl: hash overlay non-dir inodes by copy up origin
ovl: cleanup bad and stale index entries on mount
ovl: lookup index entry for copy up origin
ovl: verify index dir matches upper dir
ovl: verify upper root dir matches lower root dir
ovl: introduce the inodes index dir feature
ovl: generalize ovl_create_workdir()
...
fwnode_call_int_op() isn't suitable for calling ops that return bool
since it effectively causes the result returned to the user to be
true when an op hasn't been defined or the fwnode is NULL.
Address this by introducing fwnode_call_bool_op() for calling ops
that return bool.
Fixes: 3708184afc "device property: Move FW type specific functionality to FW specific files"
Fixes: 2294b3af05 "device property: Introduce fwnode_device_is_available()"
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull more block updates from Jens Axboe:
"This is a followup for block changes, that didn't make the initial
pull request. It's a bit of a mixed bag, this contains:
- A followup pull request from Sagi for NVMe. Outside of fixups for
NVMe, it also includes a series for ensuring that we properly
quiesce hardware queues when browsing live tags.
- Set of integrity fixes from Dmitry (mostly), fixing various issues
for folks using DIF/DIX.
- Fix for a bug introduced in cciss, with the req init changes. From
Christoph.
- Fix for a bug in BFQ, from Paolo.
- Two followup fixes for lightnvm/pblk from Javier.
- Depth fix from Ming for blk-mq-sched.
- Also from Ming, performance fix for mtip32xx that was introduced
with the dynamic initialization of commands"
* 'for-linus' of git://git.kernel.dk/linux-block: (44 commits)
block: call bio_uninit in bio_endio
nvmet: avoid unneeded assignment of submit_bio return value
nvme-pci: add module parameter for io queue depth
nvme-pci: compile warnings in nvme_alloc_host_mem()
nvmet_fc: Accept variable pad lengths on Create Association LS
nvme_fc/nvmet_fc: revise Create Association descriptor length
lightnvm: pblk: remove unnecessary checks
lightnvm: pblk: control I/O flow also on tear down
cciss: initialize struct scsi_req
null_blk: fix error flow for shared tags during module_init
block: Fix __blkdev_issue_zeroout loop
nvme-rdma: unconditionally recycle the request mr
nvme: split nvme_uninit_ctrl into stop and uninit
virtio_blk: quiesce/unquiesce live IO when entering PM states
mtip32xx: quiesce request queues to make sure no submissions are inflight
nbd: quiesce request queues to make sure no submissions are inflight
nvme: kick requeue list when requeueing a request instead of when starting the queues
nvme-pci: quiesce/unquiesce admin_q instead of start/stop its hw queues
nvme-loop: quiesce/unquiesce admin_q instead of start/stop its hw queues
nvme-fc: quiesce/unquiesce admin_q instead of start/stop its hw queues
...
RESEND_ON_SPLIT, RADOS_BACKOFF, OSDMAP_PG_UPMAP and CRUSH_CHOOSE_ARGS
feature bits, and various other changes in the RADOS client protocol.
On top of that we have a new fsc mount option to allow supplying
fscache uniquifier (similar to NFS) and the usual pile of filesystem
fixes from Zheng.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJZZQT+AAoJEEp/3jgCEfOLSsMH/i8ZdSzp7ocX00oLMlIxzFEk
5BUXZ086mEPAE4fjJFPO7+qYk6y26MzAhJL+bj8r5E0GvBEpQkoAoSQZ19Mj5ApC
nZnllzQ2C8kYvM4hp4Z2pLrF/OYACj/WJJgbTxubBET1zRq1iPj4EgbzBEraPvma
K76W9ILKNUjIoSDlNR5qvykXXfvi2dxRpi/8nvfMCOcjlw/7orjXVLa05fKmmOoX
OvpOjicWOrc8NlacGK+j1j1aaKlmLvZb9Ff+45hfC/L5PPQblM0dypFCVfq3MFFq
nUxKgTCAQDPrndzCdURCtdovjFKbskRGKmhnd0EZkdDCcnUmg6nLxqta6g2Dbs0=
=ioKM
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-4.13-rc1' of git://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"The main item here is support for v12.y.z ("Luminous") clusters:
RESEND_ON_SPLIT, RADOS_BACKOFF, OSDMAP_PG_UPMAP and CRUSH_CHOOSE_ARGS
feature bits, and various other changes in the RADOS client protocol.
On top of that we have a new fsc mount option to allow supplying
fscache uniquifier (similar to NFS) and the usual pile of filesystem
fixes from Zheng"
* tag 'ceph-for-4.13-rc1' of git://github.com/ceph/ceph-client: (44 commits)
libceph: advertise support for NEW_OSDOP_ENCODING and SERVER_LUMINOUS
libceph: osd_state is 32 bits wide in luminous
crush: remove an obsolete comment
crush: crush_init_workspace starts with struct crush_work
libceph, crush: per-pool crush_choose_arg_map for crush_do_rule()
crush: implement weight and id overrides for straw2
libceph: apply_upmap()
libceph: compute actual pgid in ceph_pg_to_up_acting_osds()
libceph: pg_upmap[_items] infrastructure
libceph: ceph_decode_skip_* helpers
libceph: kill __{insert,lookup,remove}_pg_mapping()
libceph: introduce and switch to decode_pg_mapping()
libceph: don't pass pgid by value
libceph: respect RADOS_BACKOFF backoffs
libceph: make DEFINE_RB_* helpers more general
libceph: avoid unnecessary pi lookups in calc_target()
libceph: use target pi for calc_target() calculations
libceph: always populate t->target_{oid,oloc} in calc_target()
libceph: make sure need_resend targets reflect latest map
libceph: delete from need_resend_linger before check_linger_pool_dne()
...
This patch re-introduces part of a long standing login workaround that
was recently dropped by:
commit 1c99de981f
Author: Nicholas Bellinger <nab@linux-iscsi.org>
Date: Sun Apr 2 13:36:44 2017 -0700
iscsi-target: Drop work-around for legacy GlobalSAN initiator
Namely, the workaround for FirstBurstLength ended up being required by
Mellanox Flexboot PXE boot ROMs as reported by Robert.
So this patch re-adds the work-around for FirstBurstLength within
iscsi_check_proposer_for_optional_reply(), and makes the key optional
to respond when the initiator does not propose, nor respond to it.
Also as requested by Arun, this patch introduces a new TPG attribute
named 'login_keys_workaround' that controls the use of both the
FirstBurstLength workaround, as well as the two other existing
workarounds for gPXE iSCSI boot client.
By default, the workaround is enabled with login_keys_workaround=1,
since Mellanox FlexBoot requires it, and Arun has verified the Qlogic
MSFT initiator already proposes FirstBurstLength, so it's uneffected
by this re-adding this part of the original work-around.
Reported-by: Robert LeBlanc <robert@leblancnet.us>
Cc: Robert LeBlanc <robert@leblancnet.us>
Reviewed-by: Arun Easi <arun.easi@cavium.com>
Cc: <stable@vger.kernel.org> # 3.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Changes in this pull request are around catching up
cros_ec with the internal chromeos-kernel versions of
cros_ec, cros_ec_lpc, and cros_ec_lightbar.
Also, switching maintainership from olof to bleung.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCgAGBQJZZBB8AAoJEB8J9XsKL+ZYcf4P/iRXb23r6pJgaqE3jO1mLJjQ
aJH8sMVk3q0tIA/Wo3blVZmUD87RkDPqQNRhUx4AKuTtkq+zi+YIdltBk9nyK2tZ
oRKtAFe1RL1a7Bxvh2im51mFE91q05nItPee+zylAKHL2PudKsAtvsjqEP/qmIBm
h3XkkOMzSB3cqAjzaLm6bE531pFoRx6yKWUMGr0aTbOjXewC2uhP/U9rJYqtiaYl
1oRfg1759cUxH1QXmsKIA5Ua2gKDZ+32aszxxgxSWmZ5671SB0psuyLW4Aar7XS0
MNKGIYgKWBAUHX8iBTLwz/Z4VBB8X9DS2BfDvCZwDJtjCjYcJPzLKjqyGeJ3wr0G
jW/kfjJL0G1FPxmS7WnsiUcDJemn+p/ia2/9HipLMM61fy7clezmBaxV8I4aWMh0
zxW8Bk7+qOOv9D72ErKKHJ1oaZ3EWXgWWfiUEmr+99n6GOfFu0vF5+gcdV4HVLKB
g2Gmt89OE+oMBAlWtDhX/RdhY2Xxf4POsCriBrqrealYXe9NIxjrleKRr6ysEj37
71/X6TFaqGTYoyyDAVjFmIu6upGVoCLLdx9b/BodV1hyq97AIKHOdzOXpCKk2nvx
IuA+JOWeoSGBD28CBhuvitJFDwTJv973Z+N9VrvZj91MKI89zI3Y0+sPAm69fbQ4
mqkTtiLPIfCsvZE/7lWN
=QtSr
-----END PGP SIGNATURE-----
Merge tag 'chrome-platform-for-linus-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform
Pull chrome platform updates from Benson Leung:
"Changes in this pull request are around catching up cros_ec with the
internal chromeos-kernel versions of cros_ec, cros_ec_lpc, and
cros_ec_lightbar.
Also, switching maintainership from olof to bleung"
* tag 'chrome-platform-for-linus-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
platform/chrome : Add myself as Maintainer
platform/chrome: cros_ec_lightbar - hide unused PM functions
cros_ec: Don't signal wake event for non-wake host events
cros_ec: Fix deadlock when EC is not responsive at probe
cros_ec: Don't return error when checking command version
platform/chrome: cros_ec_lightbar - Avoid I2C xfer to EC during suspend
platform/chrome: cros_ec_lightbar - Add userspace lightbar control bit to EC
platform/chrome: cros_ec_lightbar - Control of suspend/resume lightbar sequence
platform/chrome: cros_ec_lightbar - Add lightbar program feature to sysfs
platform/chrome: cros_ec_lpc: Add MKBP events support over ACPI
platform/chrome: cros_ec_lpc: Add power management ops
platform/chrome: cros_ec_lpc: Add support for GOOG004 ACPI device
platform/chrome: cros_ec_lpc: Add support for mec1322 EC
platform/chrome: cros_ec_lpc: Add R/W helpers to LPC protocol variants
mfd: cros_ec: Add support for dumping panic information
cros_ec_debugfs: Pass proper struct sizes to cros_ec_cmd_xfer()
mfd: cros_ec: add debugfs, console log file
mfd: cros_ec: Add EC console read structures definitions
mfd: cros_ec: Add helper for event notifier.
Andrei Vagin writes:
FYI: This bug has been reproduced on 4.11.7
> BUG: Dentry ffff895a3dd01240{i=4e7c09a,n=lo} still in use (1) [unmount of proc proc]
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 13588 at fs/dcache.c:1445 umount_check+0x6e/0x80
> CPU: 1 PID: 13588 Comm: kworker/1:1 Not tainted 4.11.7-200.fc25.x86_64 #1
> Hardware name: CompuLab sbc-flt1/fitlet, BIOS SBCFLT_0.08.04 06/27/2015
> Workqueue: events proc_cleanup_work
> Call Trace:
> dump_stack+0x63/0x86
> __warn+0xcb/0xf0
> warn_slowpath_null+0x1d/0x20
> umount_check+0x6e/0x80
> d_walk+0xc6/0x270
> ? dentry_free+0x80/0x80
> do_one_tree+0x26/0x40
> shrink_dcache_for_umount+0x2d/0x90
> generic_shutdown_super+0x1f/0xf0
> kill_anon_super+0x12/0x20
> proc_kill_sb+0x40/0x50
> deactivate_locked_super+0x43/0x70
> deactivate_super+0x5a/0x60
> cleanup_mnt+0x3f/0x90
> mntput_no_expire+0x13b/0x190
> kern_unmount+0x3e/0x50
> pid_ns_release_proc+0x15/0x20
> proc_cleanup_work+0x15/0x20
> process_one_work+0x197/0x450
> worker_thread+0x4e/0x4a0
> kthread+0x109/0x140
> ? process_one_work+0x450/0x450
> ? kthread_park+0x90/0x90
> ret_from_fork+0x2c/0x40
> ---[ end trace e1c109611e5d0b41 ]---
> VFS: Busy inodes after unmount of proc. Self-destruct in 5 seconds. Have a nice day...
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: _raw_spin_lock+0xc/0x30
> PGD 0
Fix this by taking a reference to the super block in proc_sys_prune_dcache.
The superblock reference is the core of the fix however the sysctl_inodes
list is converted to a hlist so that hlist_del_init_rcu may be used. This
allows proc_sys_prune_dache to remove inodes the sysctl_inodes list, while
not causing problems for proc_sys_evict_inode when if it later choses to
remove the inode from the sysctl_inodes list. Removing inodes from the
sysctl_inodes list allows proc_sys_prune_dcache to have a progress
guarantee, while still being able to drop all locks. The fact that
head->unregistering is set in start_unregistering ensures that no more
inodes will be added to the the sysctl_inodes list.
Previously the code did a dance where it delayed calling iput until the
next entry in the list was being considered to ensure the inode remained on
the sysctl_inodes list until the next entry was walked to. The structure
of the loop in this patch does not need that so is much easier to
understand and maintain.
Cc: stable@vger.kernel.org
Reported-by: Andrei Vagin <avagin@gmail.com>
Tested-by: Andrei Vagin <avagin@openvz.org>
Fixes: ace0c791e6 ("proc/sysctl: Don't grab i_lock under sysctl_lock.")
Fixes: d6cffbbe9a ("proc/sysctl: prune stale dentries during unregistering")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Add device tree binding documentation for the clocks provided by the
MIPS Boston development board from Imagination Technologies, and a
header file describing the available clocks for use by device trees &
driver.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Michael Turquette <mturquette@baylibre.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Cc: linux-clk@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16482/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Kill off s_options, save/replace_mount_options() and generic_show_options()
as all filesystems now implement ->show_options() for themselves. This
should make it easier to implement a context-based mount where the mount
options can be passed individually over a file descriptor.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Implement the show_options superblock op for 9p as part of a bid to get
rid of s_options and generic_show_options() to make it easier to implement
a context-based mount where the mount options can be passed individually
over a file descriptor.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Van Hensbergen <ericvh@gmail.com>
cc: Ron Minnich <rminnich@sandia.gov>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: v9fs-developer@lists.sourceforge.net
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Merge more updates from Andrew Morton:
- most of the rest of MM
- KASAN updates
- lib/ updates
- checkpatch updates
- some binfmt_elf changes
- various misc bits
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (115 commits)
kernel/exit.c: avoid undefined behaviour when calling wait4()
kernel/signal.c: avoid undefined behaviour in kill_something_info
binfmt_elf: safely increment argv pointers
s390: reduce ELF_ET_DYN_BASE
powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB
arm64: move ELF_ET_DYN_BASE to 4GB / 4MB
arm: move ELF_ET_DYN_BASE to 4MB
binfmt_elf: use ELF_ET_DYN_BASE only for PIE
fs, epoll: short circuit fetching events if thread has been killed
checkpatch: improve multi-line alignment test
checkpatch: improve macro reuse test
checkpatch: change format of --color argument to --color[=WHEN]
checkpatch: silence perl 5.26.0 unescaped left brace warnings
checkpatch: improve tests for multiple line function definitions
checkpatch: remove false warning for commit reference
checkpatch: fix stepping through statements with $stat and ctx_statement_block
checkpatch: [HLP]LIST_HEAD is also declaration
checkpatch: warn when a MAINTAINERS entry isn't [A-Z]:\t
checkpatch: improve the unnecessary OOM message test
lib/bsearch.c: micro-optimize pivot position calculation
...
Commit 7dd968163f ("bitmap: bitmap_equal memcmp optimization") was
rather more restrictive than necessary; we can use memcmp() to implement
bitmap_equal() as long as the number of bits can be proved to be a
multiple of 8. And architectures other than s390 may be able to make
good use of this optimisation.
[arnd@arndb.de: fix build: add a memcmp() declaration]
Link: http://lkml.kernel.org/r/20170630153908.3439707-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/20170628153221.11322-5-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Several callers have constant 'start' and an 'nbits' that is a multiple
of 8, so we can turn them into calls to memset. We don't need the
entirety of 'start' and 'nbits' to be constant, we just need to know
whether they're divisible by 8.
Link: http://lkml.kernel.org/r/20170628153221.11322-4-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have eight users calling bitmap_clear for a single bit and seventeen
calling bitmap_set for a single bit. Rather than fix all of them to
call __clear_bit or __set_bit, turn bitmap_clear and bitmap_set into
inline functions and make this special case efficient.
Link: http://lkml.kernel.org/r/20170628153221.11322-3-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The global variable 'rd_size' is declared as 'int' in source file
arch/arm/kernel/atags_parse.c and as 'unsigned long' in
drivers/block/brd.c. Fix this inconsistency.
Additionally, remove the declarations of rd_image_start, rd_prompt and
rd_doload from parse_tag_ramdisk() since these duplicate existing
declarations in <linux/initrd.h>.
Link: http://lkml.kernel.org/r/20170627065024.12347-1-bart.vanassche@wdc.com
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Yan <yanaijie@huawei.com>
Cc: Zhaohongjiang <zhaohongjiang@huawei.com>
Cc: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Including <linux/bug.h> pulls in a lot of bloat from <asm/bug.h> and
<asm-generic/bug.h> that is not needed to call the BUILD_BUG() family of
macros. Split them out into their own header, <linux/build_bug.h>.
Also correct some checkpatch.pl errors for the BUILD_BUG_ON_ZERO() and
BUILD_BUG_ON_NULL() macros by adding parentheses around the bitfield
widths that begin with a minus sign.
Link: http://lkml.kernel.org/r/20170525120316.24473-6-abbotti@mev.co.uk
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Correct these checkpatch.pl errors:
|ERROR: space required before that '-' (ctx:OxO)
|#37: FILE: include/linux/bug.h:37:
|+#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
|ERROR: space required before that '-' (ctx:OxO)
|#38: FILE: include/linux/bug.h:38:
|+#define BUILD_BUG_ON_NULL(e) ((void *)sizeof(struct { int:-!!(e); }))
I decided to wrap the bitfield expressions that begin with minus signs
in parentheses rather than insert spaces before the minus signs.
Link: http://lkml.kernel.org/r/20170525120316.24473-5-abbotti@mev.co.uk
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Correct these checkpatch.pl warnings:
|WARNING: Block comments use * on subsequent lines
|#34: FILE: include/linux/bug.h:34:
|+/* Force a compilation error if condition is true, but also produce a
|+ result (of value 0 and type size_t), so the expression can be used
|WARNING: Block comments use a trailing */ on a separate line
|#36: FILE: include/linux/bug.h:36:
|+ aren't permitted). */
Link: http://lkml.kernel.org/r/20170525120316.24473-3-abbotti@mev.co.uk
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This series of patches splits BUILD_BUG related macros out of
"include/linux/bug.h" into new file "include/linux/build_bug.h" (patch
5), and changes the pointer type checking in the `container_of()` macro
to deal with pointers of array type better (patch 6). Patches 1 to 4
are prerequisites.
Patches 2, 3, 4, and 5 have been inserted since the previous version of
this patch series. Patch 6 here corresponds to v3 and v4's patch 2.
Patch 1 was a prerequisite in v3 of this series to avoid a lot of
warnings when <linux/bug.h> was included by <linux/kernel.h>. That is
no longer relevant for v5 of the series, but I left it in because it was
acked by a Arnd Bergmann and Michal Nazarewicz.
Patches 2, 3, and 4 are some checkpatch clean-ups on
"include/linux/bug.h" before splitting out the BUILD_BUG stuff in patch
5.
Patch 5 splits the BUILD_BUG related macros out of "include/linux/bug.h"
into new file "include/linux/build_bug.h" because including
<linux/bug.h> in "include/linux/kernel.h" would result in build failures
due to circular dependencies.
Patch 6 changes the pointer type checking by `container_of()` to avoid
some incompatible pointer warnings when the dereferenced pointer has
array type.
1) asm-generic/bug.h: declare struct pt_regs; before function prototype
2) linux/bug.h: correct formatting of block comment
3) linux/bug.h: correct "(foo*)" should be "(foo *)"
4) linux/bug.h: correct "space required before that '-'"
5) bug: split BUILD_BUG stuff out into <linux/build_bug.h>
6) kernel.h: handle pointers to arrays better in container_of()
This patch (of 6):
The declaration of `__warn()` has `struct pt_regs *regs` as one of its
parameters. This can result in compiler warnings if `struct regs` is not
already declared. Add an empty declaration of `struct pt_regs` to avoid
the warnings.
Link: http://lkml.kernel.org/r/20170525120316.24473-2-abbotti@mev.co.uk
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
early_pfn_to_nid will return node 0 if both HAVE_ARCH_EARLY_PFN_TO_NID
and HAVE_MEMBLOCK_NODE_MAP are disabled. It seems we are safe now
because all architectures which support NUMA define one of them (with an
exception of alpha which however has CONFIG_NUMA marked as broken) so
this works as expected. It can get silently and subtly broken too
easily, though. Make sure we fail the compilation if NUMA is enabled
and there is no proper implementation for this function. If that ever
happens we know that either the specific configuration is invalid and
the fix should either disable NUMA or enable one of the above configs.
Link: http://lkml.kernel.org/r/20170704075803.15979-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Yang Shi <yang.shi@linaro.org>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The rework of the cpu hotplug locking unearthed potential deadlocks with
the memory hotplug locking code.
The solution for these is to rework the memory hotplug locking code as
well and take the cpu hotplug lock before the memory hotplug lock in
mem_hotplug_begin(), but this will cause a recursive locking of the cpu
hotplug lock when the memory hotplug code calls lru_add_drain_all().
Split out the inner workings of lru_add_drain_all() into
lru_add_drain_all_cpuslocked() so this function can be invoked from the
memory hotplug code with the cpu hotplug lock held.
Link: http://lkml.kernel.org/r/20170704093421.419329357@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
list_lru_count_node() iterates over all memcgs to get the total number of
entries on the node but it can race with memcg_drain_all_list_lrus(),
which migrates the entries from a dead cgroup to another. This can return
incorrect number of entries from list_lru_count_node().
Fix this by keeping track of entries per node and simply return it in
list_lru_count_node().
Link: http://lkml.kernel.org/r/1498707555-30525-1-git-send-email-stummala@codeaurora.org
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Alexander Polakov <apolyakov@beget.ru>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
wb_stat_sum() disables interrupts and calls __wb_stat_sum() which
eventually calls __percpu_counter_sum(). However, the percpu routine is
already irq-safe. Simplify the code a bit by making wb_stat_sum()
directly call percpu_counter_sum_positive() and not disable interrupts.
Also remove the now-uneeded __wb_stat_sum() which was just a wrapper
over percpu_counter_sum_positive().
Link: http://lkml.kernel.org/r/1498230681-29103-1-git-send-email-nborisov@suse.com
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently pg_data_t is just a struct which describes a NUMA node memory
layout. Let's keep the comment simple and remove ambiguity.
Link: http://lkml.kernel.org/r/1498220534-22717-1-git-send-email-nborisov@suse.com
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>