The prefix suggests the number should be printed in hex, so use
the %x specifier to do that.
Found by using regex suggested by Joe Perches.
Signed-off-by: Hans Wennborg <hans@hanshq.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
See commit c6e9d6f388
random: introduce getrandom(2) system call
for all the details (and even a manual page!)
Signed-off-by: Tony Luck <tony.luck@intel.com>
Pull sparc updates from David Miller:
1) Add sparc RAM output to /proc/iomem, from Bob Picco.
2) Allow seeks on /dev/mdesc, from Khalid Aziz.
3) Cleanup sparc64 I/O accessors, from Sam Ravnborg.
4) If update_mmu_cache{,_pmd}() is called with an not-valid mapping, do
not insert it into the TLB miss hash tables otherwise we'll
livelock. Based upon work by Christopher Alexander Tobias Schulze.
5) Fix BREAK detection in sunsab driver when no actual characters are
pending, from Christopher Alexander Tobias Schulze.
6) Because we have modules --> openfirmware --> vmalloc ordering of
virtual memory, the lazy VMAP TLB flusher can cons up an invocation
of flush_tlb_kernel_range() that covers the openfirmware address
range. Unfortunately this will flush out the firmware's locked TLB
mapping which causes all kinds of trouble. Just split up the flush
request if this happens, but in the long term the lazy VMAP flusher
should probably be made a little bit smarter.
Based upon work by Christopher Alexander Tobias Schulze.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next:
sparc64: Fix up merge thinko.
sparc: Add "install" target
arch/sparc/math-emu/math_32.c: drop stray break operator
sparc64: ldc_connect() should not return EINVAL when handshake is in progress.
sparc64: Guard against flushing openfirmware mappings.
sunsab: Fix detection of BREAK on sunsab serial console
bbc-i2c: Fix BBC I2C envctrl on SunBlade 2000
sparc64: Do not insert non-valid PTEs into the TSB hash table.
sparc64: avoid code duplication in io_64.h
sparc64: reorder functions in io_64.h
sparc64: drop unused SLOW_DOWN_IO definitions
sparc64: remove macro indirection in io_64.h
sparc64: update IO access functions in PeeCeeI
sparcspkr: use sbus_*() primitives for IO
sparc: Add support for seek and shorter read to /dev/mdesc
sparc: use %s for unaligned panic
drivers/sbus/char: Micro-optimization in display7seg.c
display7seg: Introduce the use of the managed version of kzalloc
sparc64 - add mem to iomem resource
Pull networking updates from David Miller:
"Highlights:
1) Steady transitioning of the BPF instructure to a generic spot so
all kernel subsystems can make use of it, from Alexei Starovoitov.
2) SFC driver supports busy polling, from Alexandre Rames.
3) Take advantage of hash table in UDP multicast delivery, from David
Held.
4) Lighten locking, in particular by getting rid of the LRU lists, in
inet frag handling. From Florian Westphal.
5) Add support for various RFC6458 control messages in SCTP, from
Geir Ola Vaagland.
6) Allow to filter bridge forwarding database dumps by device, from
Jamal Hadi Salim.
7) virtio-net also now supports busy polling, from Jason Wang.
8) Some low level optimization tweaks in pktgen from Jesper Dangaard
Brouer.
9) Add support for ipv6 address generation modes, so that userland
can have some input into the process. From Jiri Pirko.
10) Consolidate common TCP connection request code in ipv4 and ipv6,
from Octavian Purdila.
11) New ARP packet logger in netfilter, from Pablo Neira Ayuso.
12) Generic resizable RCU hash table, with intial users in netlink and
nftables. From Thomas Graf.
13) Maintain a name assignment type so that userspace can see where a
network device name came from (enumerated by kernel, assigned
explicitly by userspace, etc.) From Tom Gundersen.
14) Automatic flow label generation on transmit in ipv6, from Tom
Herbert.
15) New packet timestamping facilities from Willem de Bruijn, meant to
assist in measuring latencies going into/out-of the packet
scheduler, latency from TCP data transmission to ACK, etc"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1536 commits)
cxgb4 : Disable recursive mailbox commands when enabling vi
net: reduce USB network driver config options.
tg3: Modify tg3_tso_bug() to handle multiple TX rings
amd-xgbe: Perform phy connect/disconnect at dev open/stop
amd-xgbe: Use dma_set_mask_and_coherent to set DMA mask
net: sun4i-emac: fix memory leak on bad packet
sctp: fix possible seqlock seadlock in sctp_packet_transmit()
Revert "net: phy: Set the driver when registering an MDIO bus device"
cxgb4vf: Turn off SGE RX/TX Callback Timers and interrupts in PCI shutdown routine
team: Simplify return path of team_newlink
bridge: Update outdated comment on promiscuous mode
net-timestamp: ACK timestamp for bytestreams
net-timestamp: TCP timestamping
net-timestamp: SCHED timestamp on entering packet scheduler
net-timestamp: add key to disambiguate concurrent datagrams
net-timestamp: move timestamp flags out of sk_flags
net-timestamp: extend SCM_TIMESTAMPING ancillary data struct
cxgb4i : Move stray CPL definitions to cxgb4 driver
tcp: reduce spurious retransmits due to transient SACK reneging
qlcnic: Initialize dcbnl_ops before register_netdev
...
call, which is a superset of OpenBSD's getentropy(2) call, for use
with userspace crypto libraries such as LibreSSL. Also add the
ability to have a kernel thread to pull entropy from hardware rng
devices into /dev/random.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJT4VkhAAoJENNvdpvBGATwGMwP/0DvcJnk8Xg2pE67GrBlkL4V
ltDYZBUNI3Z9YqPFMbN02kt8jBJ4o8NVrD9XXSAmk0NbNV6pc4SdGUU7BBcms4BF
DX4CasmQS1EMKOxsszlvEbj9Q25u9ODJhUKsr1ZQKe3wfjx1gKRQ1QHHcrqgbGc0
tjkBU/TW+8daza6dGYrUrO34BPeN5Y4xbBG5WmVOLGgbDH7J3ZKGzkG21R5zHraI
tPJzZ3KGj+Cf1TtamBOpyF+SLqM7qi43JY/1l8LfDzJgJhB3NxOR1ig/Pk6z1qLi
2xYm1hb+EQqJGaToMXEl5fLLcYfnJmLYD/dWNq/pOVXFqC5cGxYIH1h+Nwzywvy3
hVqh4yDU5HXgu8mOMPPc23azicJflZwCNq0vTTDE+orYnb8n9Sbg0l+rUQ45BZua
tVfGKT1LZuYtM0axYQ4fIfqS9bxsyRJcF6HNNaEMQJsm0V0prwlz0hXkaod1uOJd
CwOn9+CpZUGCgj5paRS+zTOtcl39+X1tIhcWTHEDMpMzIqnk8KpkLGqCDisBZNBF
UbjEaTA8w6tBxRX5FZ9qdmRFvsxCJH7nOxmmsaIOZ/7QXQHQNrxI2+v6yd4HWJAw
yZnaVR5o6sojKc8zp9nOXQ219G1zvt4l6XyTqIP+gKWJGDKGCsMXXzEg1OchO+rI
Oo8s5+ytZB9qei7QwLAf
=wLqJ
-----END PGP SIGNATURE-----
Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull randomness updates from Ted Ts'o:
"Cleanups and bug fixes to /dev/random, add a new getrandom(2) system
call, which is a superset of OpenBSD's getentropy(2) call, for use
with userspace crypto libraries such as LibreSSL.
Also add the ability to have a kernel thread to pull entropy from
hardware rng devices into /dev/random"
* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
hwrng: Pass entropy to add_hwgenerator_randomness() in bits, not bytes
random: limit the contribution of the hw rng to at most half
random: introduce getrandom(2) system call
hw_random: fix sparse warning (NULL vs 0 for pointer)
random: use registers from interrupted code for CPU's w/o a cycle counter
hwrng: add per-device entropy derating
hwrng: create filler thread
random: add_hwgenerator_randomness() for feeding entropy from devices
random: use an improved fast_mix() function
random: clean up interrupt entropy accounting for archs w/o cycle counters
random: only update the last_pulled time if we actually transferred entropy
random: remove unneeded hash of a portion of the entropy pool
random: always update the entropy pool under the spinlock
Pull security subsystem updates from James Morris:
"In this release:
- PKCS#7 parser for the key management subsystem from David Howells
- appoint Kees Cook as seccomp maintainer
- bugfixes and general maintenance across the subsystem"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (94 commits)
X.509: Need to export x509_request_asymmetric_key()
netlabel: shorter names for the NetLabel catmap funcs/structs
netlabel: fix the catmap walking functions
netlabel: fix the horribly broken catmap functions
netlabel: fix a problem when setting bits below the previously lowest bit
PKCS#7: X.509 certificate issuer and subject are mandatory fields in the ASN.1
tpm: simplify code by using %*phN specifier
tpm: Provide a generic means to override the chip returned timeouts
tpm: missing tpm_chip_put in tpm_get_random()
tpm: Properly clean sysfs entries in error path
tpm: Add missing tpm_do_selftest to ST33 I2C driver
PKCS#7: Use x509_request_asymmetric_key()
Revert "selinux: fix the default socket labeling in sock_graft()"
X.509: x509_request_asymmetric_keys() doesn't need string length arguments
PKCS#7: fix sparse non static symbol warning
KEYS: revert encrypted key change
ima: add support for measuring and appraising firmware
firmware_class: perform new LSM checks
security: introduce kernel_fw_from_file hook
PKCS#7: Missing inclusion of linux/err.h
...
Use sigsp() instead of the open coded variant.
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Acked-by: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
This inverts also the return codes of setup_*frame() to follow the
kernel convention.
Signed-off-by: Richard Weinberger <richard@nod.at>
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Helge Deller <deller@gmx.de>
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
This inverts also the return codes of force_sigsegv_info()
and setup_frame() to follow the kernel convention.
Signed-off-by: Richard Weinberger <richard@nod.at>
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Acked-by: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Tested-by: Mark Salter <msalter@redhat.com>
Acked-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Steven Miao <realmz6@gmail.com>
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
commit 4badad352a (locking/mutex: Disable
optimistic spinning on some architectures) fenced spinning for
architectures without proper cmpxchg.
There is no need to disable mutex spinning on s390, though:
The instructions CS,CSG and friends provide the proper guarantees.
(We dont implement cmpxchg with locks).
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull timer and time updates from Thomas Gleixner:
"A rather large update of timers, timekeeping & co
- Core timekeeping code is year-2038 safe now for 32bit machines.
Now we just need to fix all in kernel users and the gazillion of
user space interfaces which rely on timespec/timeval :)
- Better cache layout for the timekeeping internal data structures.
- Proper nanosecond based interfaces for in kernel users.
- Tree wide cleanup of code which wants nanoseconds but does hoops
and loops to convert back and forth from timespecs. Some of it
definitely belongs into the ugly code museum.
- Consolidation of the timekeeping interface zoo.
- A fast NMI safe accessor to clock monotonic for tracing. This is a
long standing request to support correlated user/kernel space
traces. With proper NTP frequency correction it's also suitable
for correlation of traces accross separate machines.
- Checkpoint/restart support for timerfd.
- A few NOHZ[_FULL] improvements in the [hr]timer code.
- Code move from kernel to kernel/time of all time* related code.
- New clocksource/event drivers from the ARM universe. I'm really
impressed that despite an architected timer in the newer chips SoC
manufacturers insist on inventing new and differently broken SoC
specific timers.
[ Ed. "Impressed"? I don't think that word means what you think it means ]
- Another round of code move from arch to drivers. Looks like most
of the legacy mess in ARM regarding timers is sorted out except for
a few obnoxious strongholds.
- The usual updates and fixlets all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
timekeeping: Fixup typo in update_vsyscall_old definition
clocksource: document some basic timekeeping concepts
timekeeping: Use cached ntp_tick_length when accumulating error
timekeeping: Rework frequency adjustments to work better w/ nohz
timekeeping: Minor fixup for timespec64->timespec assignment
ftrace: Provide trace clocks monotonic
timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC
seqcount: Add raw_write_seqcount_latch()
seqcount: Provide raw_read_seqcount()
timekeeping: Use tk_read_base as argument for timekeeping_get_ns()
timekeeping: Create struct tk_read_base and use it in struct timekeeper
timekeeping: Restructure the timekeeper some more
clocksource: Get rid of cycle_last
clocksource: Move cycle_last validation to core code
clocksource: Make delta calculation a function
wireless: ath9k: Get rid of timespec conversions
drm: vmwgfx: Use nsec based interfaces
drm: i915: Use nsec based interfaces
timekeeping: Provide ktime_get_raw()
hangcheck-timer: Use ktime_get_ns()
...
Pull irq updates from Thomas Gleixner:
"Nothing spectacular from the irq department this time:
- overhaul of the crossbar chip driver
- overhaul of the spear shirq chip driver
- support for the atmel-aic chip
- code move from arch to drivers
- the usual tiny fixlets
- two reverts worth to mention which undo the too simple attempt of
supporting wakeup interrupts on shared interrupt lines"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
Revert "irq: Warn when shared interrupts do not match on NO_SUSPEND"
Revert "PM / sleep / irq: Do not suspend wakeup interrupts"
irq: Warn when shared interrupts do not match on NO_SUSPEND
irqchip: atmel-aic: Define irq fixups for atmel SoCs
irqchip: atmel-aic: Implement RTC irq fixup
irqchip: atmel-aic: Add irq fixup infrastructure
irqchip: atmel-aic: Add atmel AIC/AIC5 drivers
irqchip: atmel-aic: Move binding doc to interrupt-controller directory
genirq: generic chip: Export irq_map_generic_chip function
PM / sleep / irq: Do not suspend wakeup interrupts
irqchip: or1k-pic: Migrate from arch/openrisc/
irqchip: crossbar: Allow for quirky hardware with direct hardwiring of GIC
documentation: dt: omap: crossbar: Add description for interrupt consumer
irqchip: crossbar: Introduce centralized check for crossbar write
irqchip: crossbar: Introduce ti, max-crossbar-sources to identify valid crossbar mapping
irqchip: crossbar: Add kerneldoc for crossbar_domain_unmap callback
irqchip: crossbar: Set cb pointer to null in case of error
irqchip: crossbar: Change the goto naming
irqchip: crossbar: Return proper error value
irqchip: crossbar: Fix kerneldoc warning
...
Commit ea431643d6 ("x86/mce: Fix CMCI preemption bugs") breaks RT by
the completely unrelated conversion of the cmci_discover_lock to a
regular (non raw) spinlock. This lock was annotated in commit
59d958d2c7 ("locking, x86: mce: Annotate cmci_discover_lock as raw")
with a proper explanation why.
The argument for converting the lock back to a regular spinlock was:
- it does percpu ops without disabling preemption. Preemption is not
disabled due to the mistaken use of a raw spinlock.
Which is complete nonsense. The raw_spinlock is disabling preemption in
the same way as a regular spinlock. In mainline spinlock maps to
raw_spinlock, in RT spinlock becomes a "sleeping" lock.
raw_spinlock has on RT exactly the same semantics as in mainline. And
because this lock is taken in non preemptible context it must be raw on
RT.
Undo the locking brainfart.
Reported-by: Clark Williams <williams@redhat.com>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A quiet release, more bug fixes than anything else. A few things do
stand out though:
- Updates to several drivers to move towards the standard GPIO chip
select handling in the core.
- DMA support for the SH MSIOF driver.
- Support for Rockchip SPI controllers (their first mainline
submission).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJT4RsmAAoJELSic+t+oim9Kd0P/A3bTbf7UiK9t3NgbpIDO+yJ
JHe8O5OtxeAWGiuv9RF47Gutr8061Rww0yzX2+iiRBkaOYE7TZfdUVSBT8LnTrw6
WFye5pmxy25mDX97OJnhlsddPEoCxb/a4MlcqcCxULsHcyU9jIM+uId1v6LxMC3d
QtuB2Fuxzhhqmdfg9NLdsVsMWiwVwZn20Cmxt7Fc9EzwK6BBs1U50/X/wJHzBQ4K
fbl3hwxKODBd7aMG9DRHt4cW04WG5wQYkJS54ThUAROebqjEx8YWbNIszKA1fQcW
jBcd8Oieo724/jGZq1/U4RJUpRKmwx/ug31nrYx/Mcp+Za+yIZ1dwxAcK5AkdJNa
1lw5LGMLcP04EN0pdKKyrVwwkzV60fwrV9ELcZcnbpKhcvR0G4g7pbKufNIcGu64
0RGTnq1Y0HD1/0Zcomdt1oSSA4gv1B2Va7ZBM/SaphA+MW6EN0KfGMmcopJA5gAD
Dv66ijnIUjkKqJb4HsWa4gcq6EnqiK/GUzr9Pjng4ogl8/OF+OYOa+mYnj4DP98p
aXy/IUKSNDRwY6tV6Z4eEXnhsHWmkzfqYwZoEZVZIR7Dytl1oxdK4kW5BC0hJRc1
DrgnVMgxsIMfVD3RVbqLN4zPyFBmKHwXrcTMHQuv4ndfhiDck0u4uiF5CNCap1I1
ThAHObRbw+X/9D0ywFK/
=u37g
-----END PGP SIGNATURE-----
Merge tag 'spi-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
"A quiet release, more bug fixes than anything else. A few things do
stand out though:
- updates to several drivers to move towards the standard GPIO chip
select handling in the core.
- DMA support for the SH MSIOF driver.
- support for Rockchip SPI controllers (their first mainline
submission)"
* tag 'spi-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (64 commits)
spi: davinci: use spi_device.cs_gpio to store gpio cs per spi device
spi: davinci: add support to configure gpio cs through dt
spi/pl022: Explicitly truncate large bitmask
spi/atmel: Fix pointer to int conversion warnings on 64 bit builds
spi: davinci: fix to support more than 2 chip selects
spi: topcliff-pch: don't hardcode PCI slot to get DMA device
spi: orion: fix incorrect handling of cell-index DT property
spi: orion: Fix error return code in orion_spi_probe()
spi/rockchip: fix error return code in rockchip_spi_probe()
spi/rockchip: remove redundant dev_err call in rockchip_spi_probe()
spi/rockchip: remove duplicated include from spi-rockchip.c
ARM: dts: fix the chip select gpios definition in the SPI nodes
spi: s3c64xx: Update binding documentation
spi: s3c64xx: use the generic SPI "cs-gpios" property
spi: s3c64xx: Revert "spi: s3c64xx: Added provision for dedicated cs pin"
spi: atmel: Use dmaengine_prep_slave_sg() API
spi: topcliff-pch: Update error messages for dmaengine_prep_slave_sg() API
spi: sh-msiof: Use correct device for DMA mapping with IOMMU
spi: sh-msiof: Handle dmaengine_prep_slave_single() failures gracefully
spi: rspi: Handle dmaengine_prep_slave_sg() failures gracefully
...
This time with:
* Support for the generic PCI device alias code in x86 IOMMU
drivers
* A new sysfs interface for IOMMUs
* Preparations for hotplug support in the Intel IOMMU driver
* Change the AMD IOMMUv2 driver to not hold references to core
data structures like mm_struct or task_struct. Rely on
mmu_notifers instead.
* Removal of the OMAP IOVMM interface, all users of it are
converted to DMA-API now
* Make the struct iommu_ops const everywhere
* Initial PCI support for the ARM SMMU driver
* There is now a generic device tree binding documented for
ARM IOMMUs
* Various fixes and cleanups all over the place
Also included are some changes to the OMAP code, which are acked by the
maintainer.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJT4OVjAAoJECvwRC2XARrjHmsP/23svgRbCyajL4Aov1Tk0YLE
FkUhhXvgw6fex+dubsHYL24ALkja8MkucI4g7nbCvtb0hwcaaDYHR3NniCWzIEB6
/83B54v1OPxRGycyjaXxCpLTZOb7PV+9ALATGwpdxgVh8M8RXqSyxjEOq/sKQd+i
9hbLd/XFAyrucjJuiG1V8MRdymuBIGwHqX5jwi2cl0IaQ6+WUCayU6F+0qYmmdDo
xNYJHvGz6stUTtHWTlQwMgCUamgm8ZhHr02KHWqeqZreggqAucJcNKqaaLJd3A8g
ZoNCqZwbfFYsfzXtjkIwTEej85hnmw1mx+GtNWm9WOemTrTJB91O3xOAPFqr+eJm
qqUpXd+vpPOiXuNSttj/pi6ou6xFf8IcAQnr7A5oWDK3Sy+N11BEhPCA8sfTWa47
Hqu1B1lj6ayTe6iWQENosbnpsT4zVpaJiDRF+jkzws0nYM3Fb3s6spRHb+OTPkmR
JG85VceoPGNSZIZVS6QlUmUirM4ThLNybcspoLY31Yrs/eYapWHzfEZ6B2c0Ku+Y
BL11RlC0LHjcOedmBeSSqzp6TuXS2uMrGV4Hlc3DhXJplAE/hg1z3AR7iEPXHVcQ
CQMdr+TS+NcRreS8TeiCTnovD6Vd1nwV8qqOf+3bHOQbf4wEjXr2xxVCNSAmLM53
ZQn5PxrLzzUKFWLTrqHd
=ChW/
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
"This time with:
- support for the generic PCI device alias code in x86 IOMMU drivers
- a new sysfs interface for IOMMUs
- preparations for hotplug support in the Intel IOMMU driver
- change the AMD IOMMUv2 driver to not hold references to core data
structures like mm_struct or task_struct. Rely on mmu_notifers
instead.
- removal of the OMAP IOVMM interface, all users of it are converted
to DMA-API now
- make the struct iommu_ops const everywhere
- initial PCI support for the ARM SMMU driver
- there is now a generic device tree binding documented for ARM
IOMMUs
- various fixes and cleanups all over the place
Also included are some changes to the OMAP code, which are acked by
the maintainer"
* tag 'iommu-updates-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (67 commits)
devicetree: Add generic IOMMU device tree bindings
iommu/vt-d: Fix race setting IRQ CPU affinity while freeing IRQ
iommu/amd: Fix 2 typos in comments
iommu/amd: Fix device_state reference counting
iommu/amd: Remove change_pte mmu_notifier call-back
iommu/amd: Don't set pasid_state->mm to NULL in unbind_pasid
iommu/exynos: Select ARM_DMA_USE_IOMMU
iommu/vt-d: Exclude devices using RMRRs from IOMMU API domains
iommu/omap: Remove platform data da_start and da_end fields
ARM: omap: Don't set iommu pdata da_start and da_end fields
iommu/omap: Remove virtual memory manager
iommu/vt-d: Fix issue in computing domain's iommu_snooping flag
iommu/vt-d: Introduce helper function iova_size() to improve code readability
iommu/vt-d: Introduce helper domain_pfn_within_range() to simplify code
iommu/vt-d: Simplify intel_unmap_sg() and kill duplicated code
iommu/vt-d: Change iommu_enable/disable_translation to return void
iommu/vt-d: Simplify include/linux/dmar.h
iommu/vt-d: Avoid freeing virtual machine domain in free_dmar_iommu()
iommu/vt-d: Fix possible invalid memory access caused by free_dmar_iommu()
iommu/vt-d: Allocate dynamic domain id for virtual domains only
...
Without CONFIG_RELOCATABLE the early boot code will decompress the
kernel to LOAD_PHYSICAL_ADDR. While this may have been fine in the BIOS
days, that isn't going to fly with UEFI since parts of the firmware
code/data may be located at LOAD_PHYSICAL_ADDR.
Straying outside of the bounds of the regions we've explicitly requested
from the firmware will cause all sorts of trouble. Bruno reports that
his machine resets while trying to decompress the kernel image.
We already go to great pains to ensure the kernel is loaded into a
suitably aligned buffer, it's just that the address isn't necessarily
LOAD_PHYSICAL_ADDR, because we can't guarantee that address isn't in-use
by the firmware.
Explicitly enforce CONFIG_RELOCATABLE for the EFI boot stub, so that we
can load the kernel at any address with the correct alignment.
Reported-by: Bruno Prémont <bonbons@linux-vserver.org>
Tested-by: Bruno Prémont <bonbons@linux-vserver.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
The getrandom(2) system call was requested by the LibreSSL Portable
developers. It is analoguous to the getentropy(2) system call in
OpenBSD.
The rationale of this system call is to provide resiliance against
file descriptor exhaustion attacks, where the attacker consumes all
available file descriptors, forcing the use of the fallback code where
/dev/[u]random is not available. Since the fallback code is often not
well-tested, it is better to eliminate this potential failure mode
entirely.
The other feature provided by this new system call is the ability to
request randomness from the /dev/urandom entropy pool, but to block
until at least 128 bits of entropy has been accumulated in the
/dev/urandom entropy pool. Historically, the emphasis in the
/dev/urandom development has been to ensure that urandom pool is
initialized as quickly as possible after system boot, and preferably
before the init scripts start execution.
This is because changing /dev/urandom reads to block represents an
interface change that could potentially break userspace which is not
acceptable. In practice, on most x86 desktop and server systems, in
general the entropy pool can be initialized before it is needed (and
in modern kernels, we will printk a warning message if not). However,
on an embedded system, this may not be the case. And so with this new
interface, we can provide the functionality of blocking until the
urandom pool has been initialized. Any userspace program which uses
this new functionality must take care to assure that if it is used
during the boot process, that it will not cause the init scripts or
other portions of the system startup to hang indefinitely.
SYNOPSIS
#include <linux/random.h>
int getrandom(void *buf, size_t buflen, unsigned int flags);
DESCRIPTION
The system call getrandom() fills the buffer pointed to by buf
with up to buflen random bytes which can be used to seed user
space random number generators (i.e., DRBG's) or for other
cryptographic uses. It should not be used for Monte Carlo
simulations or other programs/algorithms which are doing
probabilistic sampling.
If the GRND_RANDOM flags bit is set, then draw from the
/dev/random pool instead of the /dev/urandom pool. The
/dev/random pool is limited based on the entropy that can be
obtained from environmental noise, so if there is insufficient
entropy, the requested number of bytes may not be returned.
If there is no entropy available at all, getrandom(2) will
either block, or return an error with errno set to EAGAIN if
the GRND_NONBLOCK bit is set in flags.
If the GRND_RANDOM bit is not set, then the /dev/urandom pool
will be used. Unlike using read(2) to fetch data from
/dev/urandom, if the urandom pool has not been sufficiently
initialized, getrandom(2) will block (or return -1 with the
errno set to EAGAIN if the GRND_NONBLOCK bit is set in flags).
The getentropy(2) system call in OpenBSD can be emulated using
the following function:
int getentropy(void *buf, size_t buflen)
{
int ret;
if (buflen > 256)
goto failure;
ret = getrandom(buf, buflen, 0);
if (ret < 0)
return ret;
if (ret == buflen)
return 0;
failure:
errno = EIO;
return -1;
}
RETURN VALUE
On success, the number of bytes that was filled in the buf is
returned. This may not be all the bytes requested by the
caller via buflen if insufficient entropy was present in the
/dev/random pool, or if the system call was interrupted by a
signal.
On error, -1 is returned, and errno is set appropriately.
ERRORS
EINVAL An invalid flag was passed to getrandom(2)
EFAULT buf is outside the accessible address space.
EAGAIN The requested entropy was not available, and
getentropy(2) would have blocked if the
GRND_NONBLOCK flag was not set.
EINTR While blocked waiting for entropy, the call was
interrupted by a signal handler; see the description
of how interrupted read(2) calls on "slow" devices
are handled with and without the SA_RESTART flag
in the signal(7) man page.
NOTES
For small requests (buflen <= 256) getrandom(2) will not
return EINTR when reading from the urandom pool once the
entropy pool has been initialized, and it will return all of
the bytes that have been requested. This is the recommended
way to use getrandom(2), and is designed for compatibility
with OpenBSD's getentropy() system call.
However, if you are using GRND_RANDOM, then getrandom(2) may
block until the entropy accounting determines that sufficient
environmental noise has been gathered such that getrandom(2)
will be operating as a NRBG instead of a DRBG for those people
who are working in the NIST SP 800-90 regime. Since it may
block for a long time, these guarantees do *not* apply. The
user may want to interrupt a hanging process using a signal,
so blocking until all of the requested bytes are returned
would be unfriendly.
For this reason, the user of getrandom(2) MUST always check
the return value, in case it returns some error, or if fewer
bytes than requested was returned. In the case of
!GRND_RANDOM and small request, the latter should never
happen, but the careful userspace code (and all crypto code
should be careful) should check for this anyway!
Finally, unless you are doing long-term key generation (and
perhaps not even then), you probably shouldn't be using
GRND_RANDOM. The cryptographic algorithms used for
/dev/urandom are quite conservative, and so should be
sufficient for all purposes. The disadvantage of GRND_RANDOM
is that it can block, and the increased complexity required to
deal with partially fulfilled getrandom(2) requests.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Zach Brown <zab@zabbo.net>
Pull ARM updates from Russell King:
"Included in this update:
- perf updates from Will Deacon:
The main changes are callchain stability fixes from Jean Pihet and
event mapping and PMU name rework from Mark Rutland
The latter is preparatory work for enabling some code re-use with
arm64 in the future.
- updates for nommu from Uwe Kleine-König:
Two different fixes for the same problem making some ARM nommu
configurations not boot since 3.6-rc1. The problem is that
user_addr_max returned the biggest available RAM address which
makes some copy_from_user variants fail to read from XIP memory.
- deprecate legacy OMAP DMA API, in preparation for it's removal.
The popular drivers have been converted over, leaving a very small
number of rarely used drivers, which hopefully can be converted
during the next cycle with a bit more visibility (and hopefully
people popping out of the woodwork to help test)
- more tweaks for BE systems, particularly with the kernel image
format. In connection with this, I've cleaned up the way we
generate the linker script for the decompressor.
- removal of hard-coded assumptions of the kernel stack size, making
everywhere depend on the value of THREAD_SIZE_ORDER.
- MCPM updates from Nicolas Pitre.
- Make it easier for proper CPU part number checks (which should
always include the vendor field).
- Assembly code optimisation - use the "bx" instruction when
returning from a function on ARMv6+ rather than "mov pc, reg".
- Save the last kernel misaligned fault location and report it via
the procfs alignment file.
- Clean up the way we create the initial stack frame, which is a
repeated pattern in several different locations.
- Support for 8-byte get_user(), needed for some DRM implementations.
- mcs locking from Will Deacon.
- Save and restore a few more Cortex-A9 registers (for errata
workarounds)
- Fix various aspects of the SWP emulation, and the ELF hwcap for the
SWP instruction.
- Update LPAE logic for pte_write and pmd_write to make it more
correct.
- Support for Broadcom Brahma15 CPU cores.
- ARM assembly crypto updates from Ard Biesheuvel"
* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (53 commits)
ARM: add comments to the early page table remap code
ARM: 8122/1: smp_scu: enable SCU standby support
ARM: 8121/1: smp_scu: use macro for SCU enable bit
ARM: 8120/1: crypto: sha512: add ARM NEON implementation
ARM: 8119/1: crypto: sha1: add ARM NEON implementation
ARM: 8118/1: crypto: sha1/make use of common SHA-1 structures
ARM: 8113/1: remove remaining definitions of PLAT_PHYS_OFFSET from <mach/memory.h>
ARM: 8111/1: Enable erratum 798181 for Broadcom Brahma-B15
ARM: 8110/1: do CPU-specific init for Broadcom Brahma15 cores
ARM: 8109/1: mm: Modify pte_write and pmd_write logic for LPAE
ARM: 8108/1: mm: Introduce {pte,pmd}_isset and {pte,pmd}_isclear
ARM: hwcap: disable HWCAP_SWP if the CPU advertises it has exclusives
ARM: SWP emulation: only initialise on ARMv7 CPUs
ARM: SWP emulation: always enable when SMP is enabled
ARM: 8103/1: save/restore Cortex-A9 CP15 registers on suspend/resume
ARM: 8098/1: mcs lock: implement wfe-based polling for MCS locking
ARM: 8091/2: add get_user() support for 8 byte types
ARM: 8097/1: unistd.h: relocate comments back to place
ARM: 8096/1: Describe required sort order for textofs-y (TEXT_OFFSET)
ARM: 8090/1: add revision info for PL310 errata 588369 and 727915
...
Pull m68knommu fixes from Greg Ungerer:
"Just a couple of small fixes. Fix definition of page_to_phys() and
remove unecesary prototype of kobjsize()"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68knommu: Remove unnecessary prototype for kobjsize()
m68knommu: Correct page_to_phys when PAGE_OFFSET is non-zero.
After commit 77b0f5d (KVM: nVMX: Ack and write vector info to intr_info
if L1 asks us to), "Acknowledge interrupt on exit" behavior can be
emulated. To do so, KVM will ask the APIC for the interrupt vector if
during a nested vmexit if VM_EXIT_ACK_INTR_ON_EXIT is set. With APICv,
kvm_get_apic_interrupt would return -1 and give the following WARNING:
Call Trace:
[<ffffffff81493563>] dump_stack+0x49/0x5e
[<ffffffff8103f0eb>] warn_slowpath_common+0x7c/0x96
[<ffffffffa059709a>] ? nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
[<ffffffff8103f11a>] warn_slowpath_null+0x15/0x17
[<ffffffffa059709a>] nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
[<ffffffffa0594295>] ? nested_vmx_exit_handled+0x6a/0x39e [kvm_intel]
[<ffffffffa0537931>] ? kvm_apic_has_interrupt+0x80/0xd5 [kvm]
[<ffffffffa05972ec>] vmx_check_nested_events+0xc3/0xd3 [kvm_intel]
[<ffffffffa051ebe9>] inject_pending_event+0xd0/0x16e [kvm]
[<ffffffffa051efa0>] vcpu_enter_guest+0x319/0x704 [kvm]
To fix this, we cannot rely on the processor's virtual interrupt delivery,
because "acknowledge interrupt on exit" must only update the virtual
ISR/PPR/IRR registers (and SVI, which is just a cache of the virtual ISR)
but it should not deliver the interrupt through the IDT. Thus, KVM has
to deliver the interrupt "by hand", similar to the treatment of EOI in
commit fc57ac2c9c (KVM: lapic: sync highest ISR to hardware apic on
EOI, 2014-05-14).
The patch modifies kvm_cpu_get_interrupt to always acknowledge an
interrupt; there are only two callers, and the other is not affected
because it is never reached with kvm_apic_vid_enabled() == true. Then it
modifies apic_set_isr and apic_clear_irr to update SVI and RVI in addition
to the registers.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Tested-by: Liu, RongrongX <rongrongx.liu@intel.com>
Tested-by: Felipe Reyes <freyes@suse.com>
Fixes: 77b0f5d67f
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
An external interrupt will cause a vmexit with reason "external interrupt"
when L2 is running. L1 will pick up the interrupt through vmcs12 if
L1 set the ack interrupt bit. Commit 77b0f5d (KVM: nVMX: Ack and write
vector info to intr_info if L1 asks us to) retrieves the interrupt that
belongs to L1 before vmcs01 is loaded.
This will lead to problems in the next patch, which would write to SVI
of vmcs02 instead of vmcs01 (SVI of vmcs02 doesn't make sense because
L2 runs without APICv).
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Liu, RongrongX <rongrongx.liu@intel.com>
Tested-by: Felipe Reyes <freyes@suse.com>
Fixes: 77b0f5d67f
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
[Move tracepoint as well. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This makes it possible to use IRQFDs on platforms that use the XICS
interrupt controller. To do this we implement kvm_irq_map_gsi() and
kvm_irq_map_chip_pin() in book3s_xics.c, so as to provide a 1-1 mapping
between global interrupt numbers and XICS interrupt source numbers.
For now, all interrupts are mapped as "IRQCHIP" interrupts, and no
MSI support is provided.
This means that kvm_set_irq can now get called with level == 0 or 1
as well as the powerpc-specific values KVM_INTERRUPT_SET,
KVM_INTERRUPT_UNSET and KVM_INTERRUPT_SET_LEVEL. We change
ics_deliver_irq() to accept all those values, and remove its
report_status argument, as it is always false, given that we don't
support KVM_IRQ_LINE_STATUS.
This also adds support for interrupt ack notifiers to the XICS code
so that the IRQFD resampler functionality can be supported.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, the IRQFD code is conditional on CONFIG_HAVE_KVM_IRQ_ROUTING.
So that we can have the IRQFD code compiled in without having the
IRQ routing code, this creates a new CONFIG_HAVE_KVM_IRQFD, makes
the IRQFD code conditional on it instead of CONFIG_HAVE_KVM_IRQ_ROUTING,
and makes all the platforms that currently select HAVE_KVM_IRQ_ROUTING
also select HAVE_KVM_IRQFD.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This provides accessor functions for the KVM interrupt mappings, in
order to reduce the amount of code that accesses the fields of the
kvm_irq_routing_table struct, and restrict that code to one file,
virt/kvm/irqchip.c. The new functions are kvm_irq_map_gsi(), which
maps from a global interrupt number to a set of IRQ routing entries,
and kvm_irq_map_chip_pin, which maps from IRQ chip and pin numbers to
a global interrupt number.
This also moves the update of kvm_irq_routing_table::chip[][]
into irqchip.c, out of the various kvm_set_routing_entry
implementations. That means that none of the kvm_set_routing_entry
implementations need the kvm_irq_routing_table argument anymore,
so this removes it.
This does not change any locking or data lifetime rules.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 29577fc00b ("KVM: PPC: HV: Remove generic instruction emulation")
caused a build failure with allyesconfig:
arch/powerpc/kvm/kvm-pr.o:(__tracepoints+0xa8): multiple definition of `__tracepoint_kvm_ppc_instr'
arch/powerpc/kvm/kvm.o:(__tracepoints+0x1c0): first defined here
due to a duplicate definition of the tracepoint in trace.h and
trace_pr.h. Because the tracepoint is still used by Book3S HV
code, and because the PR code does include trace.h, just remove
the duplicate definition from trace_pr.h, and export it from
kvm.o.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some new functions are exposed for use by the IOMMU code but
won't build when CONFIG_IOMMU_API isn't set, so shield them
appropriately.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Highlights in this release include:
- BookE: Rework instruction fetch, not racy anymore now
- BookE HV: Fix ONE_REG accessors for some in-hardware registers
- Book3S: Good number of LE host fixes, enable HV on LE
- Book3S: Some misc bug fixes
- Book3S HV: Add in-guest debug support
- Book3S HV: Preload cache lines on context switch
- Remove 440 support
Alexander Graf (31):
KVM: PPC: Book3s PR: Disable AIL mode with OPAL
KVM: PPC: Book3s HV: Fix tlbie compile error
KVM: PPC: Book3S PR: Handle hyp doorbell exits
KVM: PPC: Book3S PR: Fix ABIv2 on LE
KVM: PPC: Book3S PR: Fix sparse endian checks
PPC: Add asm helpers for BE 32bit load/store
KVM: PPC: Book3S HV: Make HTAB code LE host aware
KVM: PPC: Book3S HV: Access guest VPA in BE
KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE
KVM: PPC: Book3S HV: Access XICS in BE
KVM: PPC: Book3S HV: Fix ABIv2 on LE
KVM: PPC: Book3S HV: Enable for little endian hosts
KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct
KVM: PPC: Deflect page write faults properly in kvmppc_st
KVM: PPC: Book3S: Stop PTE lookup on write errors
KVM: PPC: Book3S: Add hack for split real mode
KVM: PPC: Book3S: Make magic page properly 4k mappable
KVM: PPC: Remove 440 support
KVM: Rename and add argument to check_extension
KVM: Allow KVM_CHECK_EXTENSION on the vm fd
KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode
KVM: PPC: Implement kvmppc_xlate for all targets
KVM: PPC: Move kvmppc_ld/st to common code
KVM: PPC: Remove kvmppc_bad_hva()
KVM: PPC: Use kvm_read_guest in kvmppc_ld
KVM: PPC: Handle magic page in kvmppc_ld/st
KVM: PPC: Separate loadstore emulation from priv emulation
KVM: PPC: Expose helper functions for data/inst faults
KVM: PPC: Remove DCR handling
KVM: PPC: HV: Remove generic instruction emulation
KVM: PPC: PR: Handle FSCR feature deselects
Alexey Kardashevskiy (1):
KVM: PPC: Book3S: Fix LPCR one_reg interface
Aneesh Kumar K.V (4):
KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation
KVM: PPC: BOOK3S: PR: Emulate virtual timebase register
KVM: PPC: BOOK3S: PR: Emulate instruction counter
KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page
Anton Blanchard (2):
KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue
KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()
Bharat Bhushan (10):
kvm: ppc: bookehv: Added wrapper macros for shadow registers
kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1
kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR
kvm: ppc: booke: Add shared struct helpers of SPRN_ESR
kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7
kvm: ppc: Add SPRN_EPR get helper function
kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit
KVM: PPC: Booke-hv: Add one reg interface for SPRG9
KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer
KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr
Michael Neuling (1):
KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling
Mihai Caraman (8):
KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
KVM: PPC: e500: Fix default tlb for victim hint
KVM: PPC: e500: Emulate power management control SPR
KVM: PPC: e500mc: Revert "add load inst fixup"
KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1
KVM: PPC: Book3s: Remove kvmppc_read_inst() function
KVM: PPC: Allow kvmppc_get_last_inst() to fail
KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
Paul Mackerras (4):
KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling
KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled
KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call
KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication
Stewart Smith (2):
Split out struct kvmppc_vcore creation to separate function
Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJT21skAAoJECszeR4D/txgeFEP/AzJopN7s//W33CfyBqURHXp
XALCyAw+S67gtcaTZbxomcG1xuT8Lj9WEw28iz3rCtAnJwIxsY63xrI1nXMzTaI2
p1rC0ai5Qy+nlEbd6L78spZy/Nzh8DFYGWx78iUSO1mYD8xywJwtoiBA539pwp8j
8N+mgn61Hwhv31bKtsZlmzXymVr/jbTp5LVuxsBLJwD2lgT49g+4uBnX2cG/iXkg
Rzbh7LxoNNXrSPI8sYmTWu/81aeXteeX70ja6DHuV5dWLNTuAXJrh5EUfeAZqBrV
aYcLWUYmIyB87txNmt6ZGVar2p3jr2Xhb9mKx+EN4dbehblanLc1PUqlHd0q3dKc
Nt60ByqpZn+qDAK86dShSZLEe+GT3lovvE76CqVXD4Er+OUEkc9JoxhN1cof/Gb0
o6uwZ2isXHRdGoZx5vb4s3UTOlwZGtoL/CyY/HD/ujYDSURkCGbxLj3kkecSY8ut
QdDAWsC15BwsHtKLr5Zwjp2w+0eGq2QJgfvO0zqWFiz9k33SCBCUpwluFeqh27Hi
aR5Wir3j+MIw9G8XlYlDJWYfi0h/SZ4G7hh7jSu26NBNBzQsDa8ow/cLzdMhdUwH
OYSaeqVk5wiRb9to1uq1NQWPA0uRAx3BSjjvr9MCGRqmvn+FV5nj637YWUT+53Hi
aSvg/U2npghLPPG2cihu
=JuLr
-----END PGP SIGNATURE-----
Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into kvm
Patch queue for ppc - 2014-08-01
Highlights in this release include:
- BookE: Rework instruction fetch, not racy anymore now
- BookE HV: Fix ONE_REG accessors for some in-hardware registers
- Book3S: Good number of LE host fixes, enable HV on LE
- Book3S: Some misc bug fixes
- Book3S HV: Add in-guest debug support
- Book3S HV: Preload cache lines on context switch
- Remove 440 support
Alexander Graf (31):
KVM: PPC: Book3s PR: Disable AIL mode with OPAL
KVM: PPC: Book3s HV: Fix tlbie compile error
KVM: PPC: Book3S PR: Handle hyp doorbell exits
KVM: PPC: Book3S PR: Fix ABIv2 on LE
KVM: PPC: Book3S PR: Fix sparse endian checks
PPC: Add asm helpers for BE 32bit load/store
KVM: PPC: Book3S HV: Make HTAB code LE host aware
KVM: PPC: Book3S HV: Access guest VPA in BE
KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE
KVM: PPC: Book3S HV: Access XICS in BE
KVM: PPC: Book3S HV: Fix ABIv2 on LE
KVM: PPC: Book3S HV: Enable for little endian hosts
KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct
KVM: PPC: Deflect page write faults properly in kvmppc_st
KVM: PPC: Book3S: Stop PTE lookup on write errors
KVM: PPC: Book3S: Add hack for split real mode
KVM: PPC: Book3S: Make magic page properly 4k mappable
KVM: PPC: Remove 440 support
KVM: Rename and add argument to check_extension
KVM: Allow KVM_CHECK_EXTENSION on the vm fd
KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode
KVM: PPC: Implement kvmppc_xlate for all targets
KVM: PPC: Move kvmppc_ld/st to common code
KVM: PPC: Remove kvmppc_bad_hva()
KVM: PPC: Use kvm_read_guest in kvmppc_ld
KVM: PPC: Handle magic page in kvmppc_ld/st
KVM: PPC: Separate loadstore emulation from priv emulation
KVM: PPC: Expose helper functions for data/inst faults
KVM: PPC: Remove DCR handling
KVM: PPC: HV: Remove generic instruction emulation
KVM: PPC: PR: Handle FSCR feature deselects
Alexey Kardashevskiy (1):
KVM: PPC: Book3S: Fix LPCR one_reg interface
Aneesh Kumar K.V (4):
KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation
KVM: PPC: BOOK3S: PR: Emulate virtual timebase register
KVM: PPC: BOOK3S: PR: Emulate instruction counter
KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page
Anton Blanchard (2):
KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue
KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()
Bharat Bhushan (10):
kvm: ppc: bookehv: Added wrapper macros for shadow registers
kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1
kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR
kvm: ppc: booke: Add shared struct helpers of SPRN_ESR
kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7
kvm: ppc: Add SPRN_EPR get helper function
kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit
KVM: PPC: Booke-hv: Add one reg interface for SPRG9
KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer
KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr
Michael Neuling (1):
KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling
Mihai Caraman (8):
KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
KVM: PPC: e500: Fix default tlb for victim hint
KVM: PPC: e500: Emulate power management control SPR
KVM: PPC: e500mc: Revert "add load inst fixup"
KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1
KVM: PPC: Book3s: Remove kvmppc_read_inst() function
KVM: PPC: Allow kvmppc_get_last_inst() to fail
KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
Paul Mackerras (4):
KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling
KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled
KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call
KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication
Stewart Smith (2):
Split out struct kvmppc_vcore creation to separate function
Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8
Conflicts:
Documentation/virtual/kvm/api.txt
- Fixes and code refactoring for stage2 kvm MMU unmap_range
- Support unmapping IPAs on deleting memslots for arm and arm64
- Support MMIO mappings in stage2 faults
- KVM VGIC v2 emulation on GICv3 hardware
- Big-Endian support for arm/arm64 (guest and host)
- Debug Architecture support for arm64 (arm32 is on Christoffer's todo list)
- Detect non page-aligned GICV regions and bail out (plugs guest-can-crash host bug)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJT3oZTAAoJEEtpOizt6ddyKQIH/1Bj/cZYSkkSf3IJfQhHRbWN
jS37IBsvcHwjHkRJxCNmQuKP/Ho5XEusluPGrVY25PAgBMl+ouPqAuKzUk+GEab6
snjJjDFqw0zs0x0h3tg6UwfZdF+eyyIkmFGn8/IATD5P3PPd8kWBVtYnSnZmYK+R
KJNVcp6RPDrt9kvUDY8Ln9fW99Jl+7CdgQAnc3QkHcXUlanLyrfq+fE1lSzyrbhZ
ETzyMFAX4kCdc8tflgyyBS4A7+RvfQ6ZIQummxoAMFHIoSk90dtK7ovX68rd9U3e
yL+mpe130+dTIFpUMbxCnIdE7C0eud3vcgXC6MuWtFjUrxQoaEgsVE+ffGC5tX0=
=axkp
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm
KVM/ARM New features for 3.17 include:
- Fixes and code refactoring for stage2 kvm MMU unmap_range
- Support unmapping IPAs on deleting memslots for arm and arm64
- Support MMIO mappings in stage2 faults
- KVM VGIC v2 emulation on GICv3 hardware
- Big-Endian support for arm/arm64 (guest and host)
- Debug Architecture support for arm64 (arm32 is on Christoffer's todo list)
Conflicts:
virt/kvm/arm/vgic.c [last minute cherry-pick from 3.17 to 3.16]
Some people see things like "Exception: 501" in stack traces in dmesg
and assume that means that something has gone badly wrong, when in
fact "Exception: 501" just means a device interrupt was taken.
This changes "Exception" to "interrupt" to make it clearer that we
are just recording the fact of a change in control flow rather than
some error condition.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
vmemmap_populated() checks whether the [start, start + page_size) has valid
pfn numbers, to know whether a vmemmap mapping has been created that includes
this range.
Some range before end might not be checked by this loop:
sec11start......start11..sec11end/sec12start..end....start12..sec12end
as the above, for start11(section 11), it checks [sec11start, sec11end), and
loop ends as the next start(start12) is bigger than end. However,
[sec11end/sec12start, end) is not checked here.
So before the loop, adjust the start to be the start of the section, so we don't miss ranges like the above.
After we adjust start to be the start of the section, it also means it's
aligned with vmemmap as of the sizeof struct page, so we could use
page_to_pfn directly in the loop.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
vmemmap_free() does the opposite of vmemap_populate().
This patch also puts vmemmap_free() and vmemmap_list_free() into
CONFIG_MEMMORY_HOTPLUG.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is to be called in vmemmap_free(), leave the implementation on BOOK3E
empty as before.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch implements vmemmap_list_free() for vmemmap_free().
The freed entries will be removed from vmemmap_list, and form a freed list,
with next as the header. The next position in the last allocated page is kept
at the list tail.
When allocation, if there are freed entries left, get it from the freed list;
if no freed entries left, get it like before from the last allocated pages.
With this change, realmode_pfn_to_page() also needs to be changed to walk
all the entries in the vmemmap_list, as the virt_addr of the entries might not
be stored in order anymore.
It helps to reuse the memory when continuous doing memory hot-plug/remove
operations, but didn't reclaim the pages already allocated, so the memory usage
will only increase, but won't exceed the value for the largest memory
configuration.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
remap_4k_pfn() silently truncates upper bits of input 4K PFN
if it cannot be contained in PTE. This leads invalid memory mapping and could
result in a system crash when the memory is accessed. This patch fails
remap_4k_pfn() and returns -EINVAL if the input 4K PFN cannot be contained in
PTE.
V3 : Added parentheses to protect 'pfn' and entire macro as suggested by Brian.
V2 : Rewritten to avoid helper function as suggested by Stephen Rothwell.
Signed-off-by: Madhusudanan Kandasamy <kmadhu@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
(NOTE: This patch depends on upstream HMI handling patchset at
https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-July/119731.html)
The current HMI handling on napping cpus does not take care of endianess
issue. On LE host kernel when we wake up from nap due to HMI interrupt we
would checkstop while jumping into opal call. There is a similar issue in
case of fast sleep wakeup where the code invokes opal_resync_tb opal call
without handling LE issue. This patch fixes that as well.
With this patch applied, HMIs handling on LE host kernel works fine.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
HMIs are thread specific and can come while thread is in sleep/nap mode.
Hence with SMT=off mode we can receive HMIs on sleeping threads. For
interrupt received in nap mode, cpu wakes up at system reset vector, clears
the interrupt and go back to nap mode again. But HMIs are sticky and they
keep happening until we clear reason bits from HMER. Hence add a special
check for HMI in reset vector (through power7_wakeup_* functions) and
invoke opal call to handle HMI.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When we hit the HMI in Linux, invoke opal call to handle/recover from HMI
errors in real mode and then in virtual mode during check_irq_replay()
invoke opal_poll_events()/opal_do_notifier() to retrieve HMI event from
OPAL and act accordingly.
Now that we are ready to handle HMI interrupt directly in linux, remove
the HMI interrupt registration with firmware.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Handle Hypervisor Maintenance Interrupt (HMI) in Linux. This patch implements
basic infrastructure to handle HMI in Linux host. The design is to invoke
opal handle hmi in real mode for recovery and set irq_pending when we hit HMI.
During check_irq_replay pull opal hmi event and print hmi info on console.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There is a couple of commented debug prints which still use
IOMMU_PAGE_SHIFT() which is not defined for POWERPC anymore, replace
them with it_page_shift.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The PCI config accessors check for PE frozen state and clear it if
EEH isn't functional. The patch handles compound PE in config accessors
if PHB supports it. For consistency, all PEs will be put into frozen
state if any one in compound group gets frozen by hardware.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch handles compound PE for EEH backend. If one specific
PE in compound group has been frozen, we enforces to freeze
all PEs in the group. If we're enable DMA or MMIO for one PE
in compound group, DMA or MMIO of all PEs in the group will be
enabled.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch introduces 3 PHB callbacks: compound PE state retrieval,
force freezing and unfreezing compound PE. The PCI config accessors
and PowerNV EEH backend can use them in subsequent patches.
We don't export the capability of compound PE to EEH core, which
helps avoiding more complexity to EEH core.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Function ioda_eeh_get_state() is used to fetch EEH state for PHB
or PE. We're going to support compound PE and the function becomes
more complicated with that. The patch splits the function into two
functions for PHB and PE cases separately to improve readability.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch synchronizes header file with firmware to have new OPAL
API opal_pci_eeh_freeze_set(), which is used to freeze the specified
PE in order to support "compound" PE.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch enables M64 aperatus for PHB3.
We already had platform hook (ppc_md.pcibios_window_alignment) to affect
the PCI resource assignment done in PCI core so that each PE's M32 resource
was built on basis of M32 segment size. Similarly, we're using that for
M64 assignment on basis of M64 segment size.
* We're using last M64 BAR to cover M64 aperatus, and it's shared by all
256 PEs.
* We don't support P7IOC yet. However, some function callbacks are added
to (struct pnv_phb) so that we can reuse them on P7IOC in future.
* PE, corresponding to PCI bus with large M64 BAR device attached, might
span multiple M64 segments. We introduce "compound" PE to cover the case.
The compound PE is a list of PEs and the master PE is used as before.
The slave PEs are just for MMIO isolation.
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch allows PE (struct eeh_pe) instance to have auxillary data,
whose size is configurable on basis of platform. For PowerNV, the
auxillary data will be used to cache PHB diag-data for that PE
(frozen PE or fenced PHB). In turn, we can retrieve the diag-data
at any later points.
It's useful for the case of VFIO PCI devices where the error log
should be cached, and then be retrieved by the guest at later point.
Also, it can avoid PHB diag-data overwritting if another frozen PE
reported and the previous diag-data isn't fetched by guest.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It's followup of commit ddf0322a ("powerpc/powernv: Fix endianness
problems in EEH"). The patch helps to get non-endian-dependent
diag-data.
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
pr_warn() is equal to pr_warning(), but the former is a bit more
formal according to commit fc62f2f ("kernel.h: add pr_warn for
symmetry to dev_warn, netdev_warn").
The patch replaces pr_warning() with pr_warn().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch prints 4 PCIE or AER config registers each line, which
is part of the EEH log so that it looks a bit more compact.
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
According to the experiment I did, PCI config access is blocked
on P7IOC frozen PE by hardware, but PHB3 doesn't do that. That
means we always get 0xFF's while dumping PCI config space of the
frozen PE on P7IOC. We don't have the problem on PHB3. So we have
to enable I/O prioir to collecting error log. Otherwise, meaningless
0xFF's are always returned.
The patch fixes it by EEH flag (EEH_ENABLE_IO_FOR_LOG), which is
selectively set to indicate the case for: P7IOC on PowerNV platform,
pSeries platform.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There are multiple global EEH flags. Almost each flag has its own
accessor, which doesn't make sense. The patch refactors EEH flag
accessors so that they look unified:
eeh_add_flag(): Add EEH flag
eeh_clear_flag(): Clear EEH flag
eeh_has_flag(): Check if one specific flag has been set
eeh_enabled(): Check if EEH functionality has been enabled
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Function eeh_iommu_group_to_pe() iterates each PCI device to check
the binding IOMMU group with get_iommu_table_base(), which possibly
fetches pdev->dev.archdata.dma_data.dma_offset. It's (0x1 << 59)
for "bypass" cases.
The patch fixes the issue by iterating devices hooked to the IOMMU
group and fetch IOMMU table there.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On PHB3, PCI devices can bypass IOMMU for DMA access. If we pass
through one PCI device, whose hose driver ever enable the bypass
mode, pdev->dev.archdata.dma_data.iommu_table_base isn't IOMMU
table. However, EEH needs access the IOMMU table when the device
is owned by guest.
The patch fixes pdev->dev.archdata.dma_data.iommu_table when
passing through the device to guest in pnv_pci_ioda2_set_bypass().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
pci_get_slot() is called with hold of PCI bus semaphore and it's not
safe to be called in interrupt context. However, we possibly checks
EEH error and calls the function in interrupt context. To avoid using
pci_get_slot(), we turn into device tree for fetching location code.
Otherwise, we might run into WARN_ON() as following messages indicate:
WARNING: at drivers/pci/search.c:223
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-rc3+ #72
task: c000000001367af0 ti: c000000001444000 task.ti: c000000001444000
NIP: c000000000497b70 LR: c000000000037530 CTR: 000000003003d114
REGS: c000000001446fa0 TRAP: 0700 Not tainted (3.16.0-rc3+)
MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 48002422 XER: 20000000
CFAR: c00000000003752c SOFTE: 0
:
NIP [c000000000497b70] .pci_get_slot+0x40/0x110
LR [c000000000037530] .eeh_pe_loc_get+0x150/0x190
Call Trace:
.of_get_property+0x30/0x60 (unreliable)
.eeh_pe_loc_get+0x150/0x190
.eeh_dev_check_failure+0x1b4/0x550
.eeh_check_failure+0x90/0xf0
.lpfc_sli_check_eratt+0x504/0x7c0 [lpfc]
.lpfc_poll_eratt+0x64/0x100 [lpfc]
.call_timer_fn+0x64/0x190
.run_timer_softirq+0x2cc/0x3e0
Cc: stable@vger.kernel.org
Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix wrong __IO_H definition in boot/io.h
Reported-by: Fernando Silveira <fsilveira@gmail.com>
Signed-off-by: Lucas Tanure <tanure@linux.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit bcdde7e made __sysfs_remove_dir() recursive and introduced a BUG_ON
during PHB removal while attempting to delete the power managment attribute
group of the bus. This is a result of tearing the bridge and bus devices down
out of order in remove_phb_dynamic. Since, the the bus resides below the bridge
in the sysfs device tree it should be torn down first.
This patch simply moves the device_unregister call for the PHB bridge device
after the device_unregister call for the PHB bus.
Fixes: bcdde7e221 ("sysfs: make __sysfs_remove_dir() recursive")
Cc: stable@vger.kernel.org
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc defines various machine-specific routines for handling
pci_set_dma_mask(). The routines for machine "PowerNV" may neglect
to set dev->dma_mask. This could confuse anyone (e.g. drivers) that
consult dev->dma_mask to find the current mask. Set the dma_mask in
the PowerNV leaf routine.
Signed-off-by: Brian W. Hart <hartb@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This silences a section mismatch warning. early_alloc_pgtable() is
called from map_kernel_page() which cannot be __init, but only when
slab_is_available() returns false which can only happen during early
boot.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The sysfs entries are lost because of commit 2213fb1 ("powerpc/eeh:
Skip eeh sysfs when eeh is disabled"). That commit added condition
to create sysfs entries with EEH_ENABLED, which isn't populated
when trying to create sysfs entries on PowerNV platform during system
boot time. The patch fixes the issue by:
* Reoder EEH initialization functions so that they're same on
PowerNV/pSeries.
* Cache PE's primary bus by PowerNV platform instead of EEH core
to avoid kernel crash caused by the function reorder. Another
benefit with this is to avoid one eeh_probe_mode_dev() in EEH
core.
Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch exports functions to be used by new VFIO ioctl command,
which will be introduced in subsequent patch, to support EEH
functinality for VFIO PCI devices.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We must not handle EEH error on devices which are passed to somebody
else. Instead, we expect that the frozen device owner detects an EEH
error and recovers from it.
This avoids EEH error handling on passed through devices so the device
owner gets a chance to handle them.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Scott writes:
Highlights include e6500 hardware threading support, an e6500 TLB erratum
workaround, corenet error reporting, support for a new board, and some
minor fixes.
This patches adds an "install" target to install kernel builds for SPARC,
modeled after the i386 script.
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is a guesswork, but it seems to make sense to drop this
break, as otherwise the following line is never executed and becomes
dead code. And that following line actually saves the result of
local calculation by the pointer given in function argument. So the
proposed change makes sense if this code in the whole makes sense (but I
am unable to analyze it in the whole).
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81641
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Andrey Utkin <andrey.krieger.utkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The LDC handshake could have been asynchronously triggered
after ldc_bind() enables the ldc_rx() receive interrupt-handler
(and thus intercepts incoming control packets)
and before vio_port_up() calls ldc_connect(). If that is the case,
ldc_connect() should return 0 and let the state-machine
progress.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Karl Volz <karl.volz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based almost entirely upon a patch by Christopher Alexander Tobias
Schulze.
In commit db64fe0225 ("mm: rewrite vmap
layer") lazy VMAP tlb flushing was added to the vmalloc layer. This
causes problems on sparc64.
Sparc64 has two VMAP mapped regions and they are not contiguous with
eachother. First we have the malloc mapping area, then another
unrelated region, then the vmalloc region.
This "another unrelated region" is where the firmware is mapped.
If the lazy TLB flushing logic in the vmalloc code triggers after
we've had both a module unload and a vfree or similar, it will pass an
address range that goes from somewhere inside the malloc region to
somewhere inside the vmalloc region, and thus covering the
openfirmware area entirely.
The sparc64 kernel learns about openfirmware's dynamic mappings in
this region early in the boot, and then services TLB misses in this
area. But openfirmware has some locked TLB entries which are not
mentioned in those dynamic mappings and we should thus not disturb
them.
These huge lazy TLB flush ranges causes those openfirmware locked TLB
entries to be removed, resulting in all kinds of problems including
hard hangs and crashes during reboot/reset.
Besides causing problems like this, such huge TLB flush ranges are
also incredibly inefficient. A plea has been made with the author of
the VMAP lazy TLB flushing code, but for now we'll put a safety guard
into our flush_tlb_kernel_range() implementation.
Since the implementation has become non-trivial, stop defining it as a
macro and instead make it a function in a C source file.
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is the big USB driver update for 3.17-rc1.
Loads of gadget driver changes in here, including some big file
movements to make things easier to manage over time. There's also the
usual xhci and uas driver updates, and a handful of other changes in
here. The changelog has the full details.
All of these have been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlPf2ZIACgkQMUfUDdst+ylvQwCfQKPKcwhtq4vJ2imUzJROEZwN
ygYAnRpFpH/19X59uGSdE7DAdhbitqKg
=uPh1
-----END PGP SIGNATURE-----
Merge tag 'usb-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB updates from Greg KH:
"Here is the big USB driver update for 3.17-rc1.
Loads of gadget driver changes in here, including some big file
movements to make things easier to manage over time. There's also the
usual xhci and uas driver updates, and a handful of other changes in
here. The changelog has the full details.
All of these have been in linux-next for a while"
* tag 'usb-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (211 commits)
USB: devio: fix issue with log flooding
uas: Log a warning when we cannot use uas because the hcd lacks streams
uas: Only complain about missing sg if all other checks succeed
xhci: Add missing checks for xhci_alloc_command failure
xhci: Rename Asrock P67 pci product-id to EJ168
xhci: Blacklist using streams on the Etron EJ168 controller
uas: Limit qdepth to 32 when connected over usb-2
uwb/whci: use correct structure type name in sizeof
usb-core bInterval quirk
USB: serial: ftdi_sio: Add support for new Xsens devices
USB: serial: ftdi_sio: Annotate the current Xsens PID assignments
usb: chipidea: debug: fix sparse non static symbol warnings
usb: ci_hdrc_imx doc: fsl,usbphy is required
usb: ci_hdrc_imx: Return -EINVAL for missing USB PHY
usb: core: allow zero packet flag for interrupt urbs
usb: lvstest: Fix sparse warnings generated by kbuild test bot
USB: core: hcd-pci: free IRQ before disabling PCI device when shutting down
phy: miphy365x: Represent each PHY channel as a DT subnode
phy: miphy365x: Provide support for the MiPHY356x Generic PHY
phy: miphy365x: Add Device Tree bindings for the MiPHY365x
...
Here's the big tty / serial driver update for 3.17-rc1.
Nothing major, just a number of fixes and new features for different
serial drivers, and some more tty core fixes and documentation of the
tty locks.
All of these have been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlPf2C4ACgkQMUfUDdst+yllVgCgtZl/Mcr/LlxPgjsg2C1AE7nX
YJ4An3o4N112bkdGqhZ7RjAE6K/8YILx
=rPhE
-----END PGP SIGNATURE-----
Merge tag 'tty-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial driver update from Greg KH:
"Here's the big tty / serial driver update for 3.17-rc1.
Nothing major, just a number of fixes and new features for different
serial drivers, and some more tty core fixes and documentation of the
tty locks.
All of these have been in linux-next for a while"
* tag 'tty-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (82 commits)
tty/n_gsm.c: fix a memory leak in gsmld_open
pch_uart: don't hardcode PCI slot to get DMA device
tty: n_gsm, use setup_timer
Revert "ARC: [arcfpga] stdout-path now suffices for earlycon/console"
serial: sc16is7xx: Correct initialization of s->clk
serial: 8250_dw: Add support for deferred probing
serial: 8250_dw: Add optional reset control support
serial: st-asc: Fix overflow in baudrate calculation
serial: st-asc: Don't call BUG in asc_console_setup()
tty: serial: msm: Make of_device_id array const
tty/n_gsm.c: get gsm->num after gsm_activate_mux
serial/core: Fix too big allocation for attribute member
drivers/tty/serial: use correct type for dma_map/unmap
serial: altera_jtaguart: Fix putchar function passed to uart_console_write()
serial/uart/8250: Add tunable RX interrupt trigger I/F of FIFO buffers
Serial: allow port drivers to have a default attribute group
tty: kgdb_nmi: Automatically manage tty enable
serial: altera_jtaguart: Adpot uart_console_write()
serial: samsung: improve code clarity by defining a variable
serial: samsung: correct the case and default order in switch
...
Here's the big pull request for the staging driver tree for 3.17-rc1.
Lots of things in here, over 2000 patches, but the best part is this:
1480 files changed, 39070 insertions(+), 254659 deletions(-)
Thanks to the great work of Kristina Martšenko, 14 different staging
drivers have been removed from the tree as they were obsolete and no one
was willing to work on cleaning them up. Other than the driver
removals, loads of cleanups are in here (comedi, lustre, etc.) as well
as the usual IIO driver updates and additions.
All of this has been in the linux-next tree for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlPf1wYACgkQMUfUDdst+ykrNwCgswPkRSAPQ3C8WvLhzUYRZZ/L
AqEAoJP0Q8Fz8unXjlSMcx7pgcqUaJ8G
=mrTQ
-----END PGP SIGNATURE-----
Merge tag 'staging-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver updates from Greg KH:
"Here's the big pull request for the staging driver tree for 3.17-rc1.
Lots of things in here, over 2000 patches, but the best part is this:
1480 files changed, 39070 insertions(+), 254659 deletions(-)
Thanks to the great work of Kristina Martšenko, 14 different staging
drivers have been removed from the tree as they were obsolete and no
one was willing to work on cleaning them up. Other than the driver
removals, loads of cleanups are in here (comedi, lustre, etc.) as well
as the usual IIO driver updates and additions.
All of this has been in the linux-next tree for a while"
* tag 'staging-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (2199 commits)
staging: comedi: addi_apci_1564: remove diagnostic interrupt support code
staging: comedi: addi_apci_1564: add subdevice to check diagnostic status
staging: wlan-ng: coding style problem fix
staging: wlan-ng: fixing coding style problems
staging: comedi: ii_pci20kc: request and ioremap memory
staging: lustre: bitwise vs logical typo
staging: dgnc: Remove unneeded dgnc_trace.c and dgnc_trace.h
staging: dgnc: rephrase comment
staging: comedi: ni_tio: remove some dead code
staging: rtl8723au: Fix static symbol sparse warning
staging: rtl8723au: usb_dvobj_init(): Remove unused variable 'pdev_desc'
staging: rtl8723au: Do not duplicate kernel provided USB macros
staging: rtl8723au: Remove never set struct pwrctrl_priv.bHWPowerdown
staging: rtl8723au: Remove two never set variables
staging: rtl8723au: RSSI_test is never set
staging:r8190: coding style: Fixed checkpatch reported Error
staging:r8180: coding style: Fixed too long lines
staging:r8180: coding style: Fixed commenting style
staging: lustre: ptlrpc: lproc_ptlrpc.c - fix dereferenceing user space buffer
staging: lustre: ldlm: ldlm_resource.c - fix dereferenceing user space buffer
...
Here's the big driver-core pull request for 3.17-rc1.
Largest thing in here is the dma-buf rework and fence code, that touched
many different subsystems so it was agreed it should go through this
tree to handle merge issues. There's also some firmware loading
updates, as well as tests added, and a few other tiny changes, the
changelog has the details.
All have been in linux-next for a long time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlPf1XcACgkQMUfUDdst+ylREACdHLXBa02yLrRzbrONJ+nARuFv
JuQAoMN49PD8K9iMQpXqKBvZBsu+iCIY
=w8OJ
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here's the big driver-core pull request for 3.17-rc1.
Largest thing in here is the dma-buf rework and fence code, that
touched many different subsystems so it was agreed it should go
through this tree to handle merge issues. There's also some firmware
loading updates, as well as tests added, and a few other tiny changes,
the changelog has the details.
All have been in linux-next for a long time"
* tag 'driver-core-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits)
ARM: imx: Remove references to platform_bus in mxc code
firmware loader: Fix _request_firmware_load() return val for fw load abort
platform: Remove most references to platform_bus device
test: add firmware_class loader test
doc: fix minor typos in firmware_class README
staging: android: Cleanup style issues
Documentation: devres: Sort managed interfaces
Documentation: devres: Add devm_kmalloc() et al
fs: debugfs: remove trailing whitespace
kernfs: kernel-doc warning fix
debugfs: Fix corrupted loop in debugfs_remove_recursive
stable_kernel_rules: Add pointer to netdev-FAQ for network patches
driver core: platform: add device binding path 'driver_override'
driver core/platform: remove unused implicit padding in platform_object
firmware loader: inform direct failure when udev loader is disabled
firmware: replace ALIGN(PAGE_SIZE) by PAGE_ALIGN
firmware: read firmware size using i_size_read()
firmware loader: allow disabling of udev as firmware loader
reservation: add suppport for read-only access using rcu
reservation: update api and add some helpers
...
Conflicts:
drivers/base/platform.c
Pull x86 vdso updates from Ingo Molnar:
"Further simplifications and improvements to the VDSO code, by Andy
Lutomirski"
* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86_64/vsyscall: Fix warn_bad_vsyscall log output
x86/vdso: Set VM_MAYREAD for the vvar vma
x86, vdso: Get rid of the fake section mechanism
x86, vdso: Move the vvar area before the vdso text
Pull x86 UV TLB update from Ingo Molnar:
"UV TLB shootdown logic updates for version of the UV architecture"
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/uv: Update the UV3 TLB shootdown logic
Pull RAS updates from Ingo Molnar:
"The main changes in this cycle are:
- RAS tracing/events infrastructure, by Gong Chen.
- Various generalizations of the APEI code to make it available to
non-x86 architectures, by Tomasz Nowicki"
* 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ras: Fix build warnings in <linux/aer.h>
acpi, apei, ghes: Factor out ioremap virtual memory for IRQ and NMI context.
acpi, apei, ghes: Make NMI error notification to be GHES architecture extension.
apei, mce: Factor out APEI architecture specific MCE calls.
RAS, extlog: Adjust init flow
trace, eMCA: Add a knob to adjust where to save event log
trace, RAS: Add eMCA trace event interface
RAS, debugfs: Add debugfs interface for RAS subsystem
CPER: Adjust code flow of some functions
x86, MCE: Robustify mcheck_init_device
trace, AER: Move trace into unified interface
trace, RAS: Add basic RAS trace event
x86, MCE: Kill CPU_POST_DEAD
Pull x86 platform updates from Ingo Molnar:
"The main changes in this cycle are:
- Intel SOC driver updates, by Aubrey Li.
- TS5500 platform updates, by Vivien Didelot"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pmc_atom: Silence shift wrapping warnings in pmc_sleep_tmr_show()
x86/pmc_atom: Expose PMC device state and platform sleep state
x86/pmc_atom: Eisable a few S0ix wake up events for S0ix residency
x86/platform: New Intel Atom SOC power management controller driver
x86/platform/ts5500: Add support for TS-5400 boards
x86/platform/ts5500: Add a 'name' sysfs attribute
x86/platform/ts5500: Use the DEVICE_ATTR_RO() macro
Pull x86 mm changes from Ingo Molnar:
"The main change in this cycle is the rework of the TLB range flushing
code, to simplify, fix and consolidate the code. By Dave Hansen"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Set TLB flush tunable to sane value (33)
x86/mm: New tunable for single vs full TLB flush
x86/mm: Add tracepoints for TLB flushes
x86/mm: Unify remote INVLPG code
x86/mm: Fix missed global TLB flush stat
x86/mm: Rip out complicated, out-of-date, buggy TLB flushing
x86/mm: Clean up the TLB flushing code
x86/smep: Be more informative when signalling an SMEP fault
Pull EFI changes from Ingo Molnar:
"Main changes in this cycle are:
- arm64 efi stub fixes, preservation of FP/SIMD registers across
firmware calls, and conversion of the EFI stub code into a static
library - Ard Biesheuvel
- Xen EFI support - Daniel Kiper
- Support for autoloading the efivars driver - Lee, Chun-Yi
- Use the PE/COFF headers in the x86 EFI boot stub to request that
the stub be loaded with CONFIG_PHYSICAL_ALIGN alignment - Michael
Brown
- Consolidate all the x86 EFI quirks into one file - Saurabh Tangri
- Additional error logging in x86 EFI boot stub - Ulf Winkelvos
- Support loading initrd above 4G in EFI boot stub - Yinghai Lu
- EFI reboot patches for ACPI hardware reduced platforms"
* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
efi/arm64: Handle missing virtual mapping for UEFI System Table
arch/x86/xen: Silence compiler warnings
xen: Silence compiler warnings
x86/efi: Request desired alignment via the PE/COFF headers
x86/efi: Add better error logging to EFI boot stub
efi: Autoload efivars
efi: Update stale locking comment for struct efivars
arch/x86: Remove efi_set_rtc_mmss()
arch/x86: Replace plain strings with constants
xen: Put EFI machinery in place
xen: Define EFI related stuff
arch/x86: Remove redundant set_bit(EFI_MEMMAP) call
arch/x86: Remove redundant set_bit(EFI_SYSTEM_TABLES) call
efi: Introduce EFI_PARAVIRT flag
arch/x86: Do not access EFI memory map if it is not available
efi: Use early_mem*() instead of early_io*()
arch/ia64: Define early_memunmap()
x86/reboot: Add EFI reboot quirk for ACPI Hardware Reduced flag
efi/reboot: Allow powering off machines using EFI
efi/reboot: Add generic wrapper around EfiResetSystem()
...
Pull x86 cpufeature updates from Ingo Molnar:
"The main changes in this cycle were:
- Continued cleanups of CPU bugs mis-marked as 'missing features', by
Borislav Petkov.
- Detect the xsaves/xrstors feature and releated cleanup, by Fenghua
Yu"
* 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, cpu: Kill cpu_has_mp
x86, amd: Cleanup init_amd
x86/cpufeature: Add bug flags to /proc/cpuinfo
x86, cpufeature: Convert more "features" to bugs
x86/xsaves: Detect xsaves/xrstors feature
x86/cpufeature.h: Reformat x86 feature macros
Pull x86 build/cleanup/debug updates from Ingo Molnar:
"Robustify the build process with a quirk to avoid GCC reordering
related bugs.
Two code cleanups.
Simplify entry_64.S CFI annotations, by Jan Beulich"
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, build: Change code16gcc.h from a C header to an assembly header
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Simplify __HAVE_ARCH_CMPXCHG tests
x86/tsc: Get rid of custom DIV_ROUND() macro
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/debug: Drop several unnecessary CFI annotations
The assumption was that update_mmu_cache() (and the equivalent for PMDs) would
only be called when the PTE being installed will be accessible by the user.
This is not true for code paths originating from remove_migration_pte().
There are dire consequences for placing a non-valid PTE into the TSB. The TLB
miss frramework assumes thatwhen a TSB entry matches we can just load it into
the TLB and return from the TLB miss trap.
So if a non-valid PTE is in there, we will deadlock taking the TLB miss over
and over, never satisfying the miss.
Just exit early from update_mmu_cache() and friends in this situation.
Based upon a report and patch from Christopher Alexander Tobias Schulze.
Signed-off-by: David S. Miller <davem@davemloft.net>
Building a 32-bit kernel with CONFIG_X86_USE_3DNOW and CONFIG_EFI_STUB
leads to the following build error,
drivers/firmware/efi/libstub/lib.a(efi-stub-helper.o): In function `efi_relocate_kernel':
efi-stub-helper.c:(.text+0xda5): undefined reference to `_mmx_memcpy'
This is due to the fact that the EFI boot stub pulls in the 3DNow
optimized versions of the memcpy() prototype from
arch/x86/include/asm/string_32.h, even though the _mmx_memcpy()
implementation isn't available in the EFI stub.
For now, predicate CONFIG_EFI_STUB on !CONFIG_X86_USE_3DNOW. This is
most definitely a temporary fix. A complete solution will involve
selectively including kernel headers/symbols into the early-boot
execution environment of the EFI boot stub, i.e. something analogous to
the way that the _SETUP symbol is used.
Previous attempts have been made to fix this kind of problem, though
none seem to have ever been merged,
http://lkml.kernel.org/r/20120329104822.GA17233@x1.osrc.amd.com
Clearly, this problem has been around for a long time.
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/1407193939-27813-1-git-send-email-matt@console-pimps.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Pull perf changes from Ingo Molnar:
"Kernel side changes:
- Consolidate the PMU interrupt-disabled code amongst architectures
(Vince Weaver)
- misc fixes
Tooling changes (new features, user visible changes):
- Add support for pagefault tracing in 'trace', please see multiple
examples in the changeset messages (Stanislav Fomichev).
- Add pagefault statistics in 'trace' (Stanislav Fomichev)
- Add header for columns in 'top' and 'report' TUI browsers (Jiri
Olsa)
- Add pagefault statistics in 'trace' (Stanislav Fomichev)
- Add IO mode into timechart command (Stanislav Fomichev)
- Fallback to syscalls:* when raw_syscalls:* is not available in the
perl and python perf scripts. (Daniel Bristot de Oliveira)
- Add --repeat global option to 'perf bench' to be used in benchmarks
such as the existing 'futex' one, that was modified to use it
instead of a local option. (Davidlohr Bueso)
- Fix fd -> pathname resolution in 'trace', be it using /proc or a
vfs_getname probe point. (Arnaldo Carvalho de Melo)
- Add suggestion of how to set perf_event_paranoid sysctl, to help
non-root users trying tools like 'trace' to get a working
environment. (Arnaldo Carvalho de Melo)
- Updates from trace-cmd for traceevent plugin_kvm plus args cleanup
(Steven Rostedt, Jan Kiszka)
- Support S/390 in 'perf kvm stat' (Alexander Yarygin)
Tooling infrastructure changes:
- Allow reserving a row for header purposes in the hists browser
(Arnaldo Carvalho de Melo)
- Various fixes and prep work related to supporting Intel PT (Adrian
Hunter)
- Introduce multiple debug variables control (Jiri Olsa)
- Add callchain and additional sample information for python scripts
(Joseph Schuchart)
- More prep work to support Intel PT: (Adrian Hunter)
- Polishing 'script' BTS output
- 'inject' can specify --kallsym
- VDSO is per machine, not a global var
- Expose data addr lookup functions previously private to 'script'
- Large mmap fixes in events processing
- Include standard stringify macros in power pc code (Sukadev
Bhattiprolu)
Tooling cleanups:
- Convert open coded equivalents to asprintf() (Andy Shevchenko)
- Remove needless reassignments in 'trace' (Arnaldo Carvalho de Melo)
- Cache the is_exit syscall test in 'trace) (Arnaldo Carvalho de
Melo)
- No need to reimplement err() in 'perf bench sched-messaging', drop
barf(). (Davidlohr Bueso).
- Remove ev_name argument from perf_evsel__hists_browse, can be
obtained from the other parameters. (Jiri Olsa)
Tooling fixes:
- Fix memory leak in the 'sched-messaging' perf bench test.
(Davidlohr Bueso)
- The -o and -n 'perf bench mem' options are mutually exclusive, emit
error when both are specified. (Davidlohr Bueso)
- Fix scrollbar refresh row index in the ui browser, problem exposed
now that headers will be added and will be allowed to be switched
on/off. (Jiri Olsa)
- Handle the num array type in python properly (Sebastian Andrzej
Siewior)
- Fix wrong condition for allocation failure (Jiri Olsa)
- Adjust callchain based on DWARF debug info on powerpc (Sukadev
Bhattiprolu)
- Fix a risk for doing free on uninitialized pointer in traceevent
lib (Rickard Strandqvist)
- Update attr test with PERF_FLAG_FD_CLOEXEC flag (Jiri Olsa)
- Enable close-on-exec flag on perf file descriptor (Yann Droneaud)
- Fix build on gcc 4.4.7 (Arnaldo Carvalho de Melo)
- Event ordering fixes (Jiri Olsa)"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (123 commits)
Revert "perf tools: Fix jump label always changing during tracing"
perf tools: Fix perf usage string leftover
perf: Check permission only for parent tracepoint event
perf record: Store PERF_RECORD_FINISHED_ROUND only for nonempty rounds
perf record: Always force PERF_RECORD_FINISHED_ROUND event
perf inject: Add --kallsyms parameter
perf tools: Expose 'addr' functions so they can be reused
perf session: Fix accounting of ordered samples queue
perf powerpc: Include util/util.h and remove stringify macros
perf tools: Fix build on gcc 4.4.7
perf tools: Add thread parameter to vdso__dso_findnew()
perf tools: Add dso__type()
perf tools: Separate the VDSO map name from the VDSO dso name
perf tools: Add vdso__new()
perf machine: Fix the lifetime of the VDSO temporary file
perf tools: Group VDSO global variables into a structure
perf session: Add ability to skip 4GiB or more
perf session: Add ability to 'skip' a non-piped event stream
perf tools: Pass machine to vdso__dso_findnew()
perf tools: Add dso__data_size()
...
Pull locking updates from Ingo Molnar:
"The main changes in this cycle are:
- big rtmutex and futex cleanup and robustification from Thomas
Gleixner
- mutex optimizations and refinements from Jason Low
- arch_mutex_cpu_relax() removal and related cleanups
- smaller lockdep tweaks"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
arch, locking: Ciao arch_mutex_cpu_relax()
locking/lockdep: Only ask for /proc/lock_stat output when available
locking/mutexes: Optimize mutex trylock slowpath
locking/mutexes: Try to acquire mutex only if it is unlocked
locking/mutexes: Delete the MUTEX_SHOW_NO_WAITER macro
locking/mutexes: Correct documentation on mutex optimistic spinning
rtmutex: Make the rtmutex tester depend on BROKEN
futex: Simplify futex_lock_pi_atomic() and make it more robust
futex: Split out the first waiter attachment from lookup_pi_state()
futex: Split out the waiter check from lookup_pi_state()
futex: Use futex_top_waiter() in lookup_pi_state()
futex: Make unlock_pi more robust
rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
rtmutex: Cleanup deadlock detector debug logic
rtmutex: Confine deadlock logic to futex
rtmutex: Simplify remove_waiter()
rtmutex: Document pi chain walk
rtmutex: Clarify the boost/deboost part
rtmutex: No need to keep task ref for lock owner check
rtmutex: Simplify and document try_to_take_rtmutex()
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJT38QhAAoJEKurIx+X31iBXgwP/RzqeqGULY4XXqEsvzuPT1y6
Os1PnYoHHQBfQTkjZYrZDSfWGtG0vbc69nI/ntIhAqPvkIHwWiDorcI8kttMFHdd
sWOXUWJ8VxtP3/Hgnv1X8fKKB18dlMowFyD4ymyGj7o0lnStZRlh8/1z677W9vIf
rasZ4f+qcynKtAAAuoU2GCh9n2dt+hT6edq605G98by463NyG49RxzWVxQxxgrfb
KkBwGtRfGUcx1NmLIBulsFnhSLE01ElSQL++QRJ81CE9rTLPzdWbchAp1ARyc82L
cIRulsJ9U/Z78DFVmH46CKCYJn+N2EgsVcvjloybCW1ErmqjjmIxAwHdduCjyQrj
D/4aE8PDm8IkLjQrvazogY8yUiSX7t6TEq80v/tMSklRZfiFS4ZHfvDpd7QnVAEL
tPBlPNPq9MK+/k5S7vOjUCdqp1fcS14CdKWgdzGIHiDRKUvA0PpPzniqPAOB8tq1
yvSFNXasvM3Pb6ycbQackxi4svOCvS/rLXD9nwZO1QN64Uf6+CG5OyxQ/5/+n4d8
MTWfVKdOd593xEC8+cFlf0p/+ZnY7844qeP2J2IHE5n8sCHF9tHdI/1LlXG7ZOKn
u/q0LuHjy79gwsunj4JxtedxXSLgaW1EpAxOKbzSQue5m9Ra4hiAFLQZwvOxSOOP
jP0gdCCuWMSdiva8iaAx
=hP9S
-----END PGP SIGNATURE-----
Merge tag 'please-pull-misc-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
Pull ia64 cleanups from Tony Luck:
"Miscellaneous ia64 specific cleanup"
* tag 'please-pull-misc-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
[IA64] sn: Do not needlessly convert between pointers and integers
[IA64] sn: Fix zeroing of PDAs
Changes include:
- Context tracking support (NO_HZ_FULL) which narrowly missed 3.16
- vDSO layout rework following Andy's work on x86
- TEXT_OFFSET fuzzing for bootloader testing
- /proc/cpuinfo tidy-up
- Preliminary work to support 48-bit virtual addresses, but this is
currently disabled until KVM has been ported to use it (the patches
do, however, bring some nice clean-up)
- Boot-time CPU sanity checks (especially useful on heterogenous
systems)
- Support for syscall auditing
- Support for CC_STACKPROTECTOR
- defconfig updates
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCgAGBQJT3qkzAAoJEC379FI+VC/ZxwEP/3uYs9glDLTd1hmVFr1cRutg
j4m1Kc7RCO+zpbYCXJLAQLPjwjOaUWPZUeZPQZib6bO+4sTqFYe9vsaqRyvn/bxM
BaQhytpyxymfG8m3rmXaI97TzBwnRB2oQ0k36rsjMwG/VQMLf9kVuEwURoAHF07l
RyMK2sAwE0/8XIJZQFNo5SAbkO52EiHlehdlTzCXGWWOWdHDyVfks/k6YhIS991r
0W9Y0ghHaMz+mAumTSq7jzPQa3aF3GjTp0W7gJjk/PRBDHfPisphEO36zsA0yHtE
3uvEH0kUQK/ve4ZUQiNvuEZCSqalPFag6j5Z8BnFtafa66J5h414CGPAfER6Kz7+
KGpoEve+7Rpvvb1S4T0tTMg7HoGrvqc5wKS3uFxfoGooGUcUOchSkYiVTBMDJSKn
QlJbb1QSvuNFGhcKntTOe1QMT+x0w9urq/e+QfnQrZ/m5Er7J3qCZzeOfA2JFTjQ
sB24yjzAz5a5VwbKbuB2b4gDILY9oYNe94HFP08o/rJfANnL0dpP1Oyl0b12ILsI
a69EMdpaeEQo8703KLIlzfW6u92PqYs6UkYvya8o27FAvmNvDfB/PffjgVsOAHFi
Qc+dpYbnzNfwJgG9w0qhJ+MR8g5fiBYHqNpfGOY+g5M50j0hZUX9comoWw1xkl0X
HlvG7xzrTF7/VbWEtZ2o
=6XMc
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"Once again, Catalin's off on holiday and I'm looking after the arm64
tree. Please can you pull the following arm64 updates for 3.17?
Note that this branch also includes the new GICv3 driver (merged via a
stable tag from Jason's irqchip tree), since there is a fix for older
binutils on top.
Changes include:
- context tracking support (NO_HZ_FULL) which narrowly missed 3.16
- vDSO layout rework following Andy's work on x86
- TEXT_OFFSET fuzzing for bootloader testing
- /proc/cpuinfo tidy-up
- preliminary work to support 48-bit virtual addresses, but this is
currently disabled until KVM has been ported to use it (the patches
do, however, bring some nice clean-up)
- boot-time CPU sanity checks (especially useful on heterogenous
systems)
- support for syscall auditing
- support for CC_STACKPROTECTOR
- defconfig updates"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (55 commits)
arm64: add newline to I-cache policy string
Revert "arm64: dmi: Add SMBIOS/DMI support"
arm64: fpsimd: fix a typo in fpsimd_save_partial_state ENDPROC
arm64: don't call break hooks for BRK exceptions from EL0
arm64: defconfig: enable devtmpfs mount option
arm64: vdso: fix build error when switching from LE to BE
arm64: defconfig: add virtio support for running as a kvm guest
arm64: gicv3: Allow GICv3 compilation with older binutils
arm64: fix soft lockup due to large tlb flush range
arm64/crypto: fix makefile rule for aes-glue-%.o
arm64: Do not invoke audit_syscall_* functions if !CONFIG_AUDIT_SYSCALL
arm64: Fix barriers used for page table modifications
arm64: Add support for 48-bit VA space with 64KB page configuration
arm64: asm/pgtable.h pmd/pud definitions clean-up
arm64: Determine the vmalloc/vmemmap space at build time based on VA_BITS
arm64: Clean up the initial page table creation in head.S
arm64: Remove asm/pgtable-*level-types.h files
arm64: Remove asm/pgtable-*level-hwdef.h files
arm64: Convert bool ARM64_x_LEVELS to int ARM64_PGTABLE_LEVELS
arm64: mm: Implement 4 levels of translation tables
...
few days.
MIPS and s390 have little going on this release; just bugfixes, some
small, some larger.
The highlights for x86 are nested VMX improvements (Jan Kiszka), optimizations
for old processor (up to Nehalem, by me and Bandan Das), and a lot of x86
emulator bugfixes (Nadav Amit).
Stephen Rothwell reported a trivial conflict with the tracing branch.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJT300XAAoJEBvWZb6bTYby3V8QAJz+XyajnhJ8wH55Vxczz22L
i2gtUGmBLhEXsBcaVKO4BBfek88lLzg0SGLjfW5wCMQmKtxVlrwTCXNkBoPGjapd
NwHtWkMKym44PDhRovn7zkSumkxC43uFIBR/ebrhP6Bvhh9s+MnkQUxfw9ILB+YV
EeKyEG8sSgxFCciuHbp3mIXpDcO6r/ldy6I7009OdyhLoMY+Kvmk7kRe9wtAivdg
CGJi60QvGOn2RGRPOCEtF6UWr8Ae8fe1t84o0hkXPv/j3jtabzAatXKJa4dYNbIs
7Mp4NQpxaGV6rq3WCYVeZRxGs+UReGDAS3Il4Z8C9eTOTooSfxdVr8acpM8PY6I8
UmLT6ECLGycc4ELXrETtR+QLmiXACyJqyVxz4aiLV3kWSWfamKD3hBeQK9NizNcE
VoPDl+PyISvR1tW4KstBuzfUWAEXi+gO78cqqFr/VW6cl7HKpA1DFQaPfGkYKDae
2CPwcLwI5/M6RtSgkyXTkEqNZLc2BjldqSeM1lmWjhZVW56X2iqePUL46Vab3Yvt
U+sELtwEE560NLN3hbaHUsLR1tcUix5w8vTzcXPxgoHQBszHCcAZTWd1XHulr64F
rp/cangqtkPKcu5j1mNhQs38oLjHI1MUsbQrqFoD4tmHjQ75iXHRFzYGoIVKXyHG
AnGbQzJzBcdAANhm3LW0
=UXxV
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM changes from Paolo Bonzini:
"These are the x86, MIPS and s390 changes; PPC and ARM will come in a
few days.
MIPS and s390 have little going on this release; just bugfixes, some
small, some larger.
The highlights for x86 are nested VMX improvements (Jan Kiszka),
optimizations for old processor (up to Nehalem, by me and Bandan Das),
and a lot of x86 emulator bugfixes (Nadav Amit).
Stephen Rothwell reported a trivial conflict with the tracing branch"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (104 commits)
x86/kvm: Resolve shadow warnings in macro expansion
KVM: s390: rework broken SIGP STOP interrupt handling
KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table
KVM: vmx: remove duplicate vmx_mpx_supported() prototype
KVM: s390: Fix memory leak on busy SIGP stop
x86/kvm: Resolve shadow warning from min macro
kvm: Resolve missing-field-initializers warnings
Replace NR_VMX_MSR with its definition
KVM: x86: Assertions to check no overrun in MSR lists
KVM: x86: set rflags.rf during fault injection
KVM: x86: Setting rflags.rf during rep-string emulation
KVM: x86: DR6/7.RTM cannot be written
KVM: nVMX: clean up nested_release_vmcs12 and code around it
KVM: nVMX: fix lifetime issues for vmcs02
KVM: x86: Defining missing x86 vectors
KVM: x86: emulator injects #DB when RFLAGS.RF is set
KVM: x86: Cleanup of rflags.rf cleaning
KVM: x86: Clear rflags.rf on emulated instructions
KVM: x86: popf emulation should not change RF
KVM: x86: Clearing rflags.rf upon skipped emulated instruction
...
to the ftrace function callback infrastructure. It's introducing a
way to allow different functions to call directly different trampolines
instead of all calling the same "mcount" one.
The only user of this for now is the function graph tracer, which always
had a different trampoline, but the function tracer trampoline was called
and did basically nothing, and then the function graph tracer trampoline
was called. The difference now, is that the function graph tracer
trampoline can be called directly if a function is only being traced by
the function graph trampoline. If function tracing is also happening on
the same function, the old way is still done.
The accounting for this takes up more memory when function graph tracing
is activated, as it needs to keep track of which functions it uses.
I have a new way that wont take as much memory, but it's not ready yet
for this merge window, and will have to wait for the next one.
Another big change was the removal of the ftrace_start/stop() calls that
were used by the suspend/resume code that stopped function tracing when
entering into suspend and resume paths. The stop of ftrace was done
because there was some function that would crash the system if one called
smp_processor_id()! The stop/start was a big hammer to solve the issue
at the time, which was when ftrace was first introduced into Linux.
Now ftrace has better infrastructure to debug such issues, and I found
the problem function and labeled it with "notrace" and function tracing
can now safely be activated all the way down into the guts of suspend
and resume.
Other changes include clean ups of uprobe code.
Clean up of the trace_seq() code.
And other various small fixes and clean ups to ftrace and tracing.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJT35zXAAoJEKQekfcNnQGuOz0H/38zqM0nLFhrgvz3EPk2UOjn
xqpX8qyb2V7TJZL+IqeXU2a5cQZl5ba0D4WtBGpxbTae3CJYiuQ87iKUNFoH0om5
FDpn80igb368k8V3qRdRsziKVCCf0XBd/NkHJXc0ZkfXGyzB2Ga4bBxALxp2gj9y
bnO+vKo6+tWYKG4hyQb4P3LRXUrK8/LWEsPr39cH2QH1Rdj69Lx9CgrCdUVJmwcb
Bj8hEiLXL/RYCFNn79A3wNTUvW0rG/AOIf4SLqXtasSRZ0ToaU0ZyDnrNv+0Ol47
rX8tSk+LfXchL9hpIvjCf1vlAYq3pO02favteR/jip3lx/dTjEDE4RJ9qtJzZ4Q=
=fwQY
-----END PGP SIGNATURE-----
Merge tag 'trace-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"This pull request has a lot of work done. The main thing is the
changes to the ftrace function callback infrastructure. It's
introducing a way to allow different functions to call directly
different trampolines instead of all calling the same "mcount" one.
The only user of this for now is the function graph tracer, which
always had a different trampoline, but the function tracer trampoline
was called and did basically nothing, and then the function graph
tracer trampoline was called. The difference now, is that the
function graph tracer trampoline can be called directly if a function
is only being traced by the function graph trampoline. If function
tracing is also happening on the same function, the old way is still
done.
The accounting for this takes up more memory when function graph
tracing is activated, as it needs to keep track of which functions it
uses. I have a new way that wont take as much memory, but it's not
ready yet for this merge window, and will have to wait for the next
one.
Another big change was the removal of the ftrace_start/stop() calls
that were used by the suspend/resume code that stopped function
tracing when entering into suspend and resume paths. The stop of
ftrace was done because there was some function that would crash the
system if one called smp_processor_id()! The stop/start was a big
hammer to solve the issue at the time, which was when ftrace was first
introduced into Linux. Now ftrace has better infrastructure to debug
such issues, and I found the problem function and labeled it with
"notrace" and function tracing can now safely be activated all the way
down into the guts of suspend and resume
Other changes include clean ups of uprobe code, clean up of the
trace_seq() code, and other various small fixes and clean ups to
ftrace and tracing"
* tag 'trace-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (57 commits)
ftrace: Add warning if tramp hash does not match nr_trampolines
ftrace: Fix trampoline hash update check on rec->flags
ring-buffer: Use rb_page_size() instead of open coded head_page size
ftrace: Rename ftrace_ops field from trampolines to nr_trampolines
tracing: Convert local function_graph functions to static
ftrace: Do not copy old hash when resetting
tracing: let user specify tracing_thresh after selecting function_graph
ring-buffer: Always run per-cpu ring buffer resize with schedule_work_on()
tracing: Remove function_trace_stop and HAVE_FUNCTION_TRACE_MCOUNT_TEST
s390/ftrace: remove check of obsolete variable function_trace_stop
arm64, ftrace: Remove check of obsolete variable function_trace_stop
Blackfin: ftrace: Remove check of obsolete variable function_trace_stop
metag: ftrace: Remove check of obsolete variable function_trace_stop
microblaze: ftrace: Remove check of obsolete variable function_trace_stop
MIPS: ftrace: Remove check of obsolete variable function_trace_stop
parisc: ftrace: Remove check of obsolete variable function_trace_stop
sh: ftrace: Remove check of obsolete variable function_trace_stop
sparc64,ftrace: Remove check of obsolete variable function_trace_stop
tile: ftrace: Remove check of obsolete variable function_trace_stop
ftrace: x86: Remove check of obsolete variable function_trace_stop
...
drivers and fixes/enhancements to existing clock drivers. There are also
some non-critical fixes and improvements to the framework core.
Changes to the clock framework core include:
* improvements to printks on errors
* flattening the previously hierarchal structure of per-clock entries
in debugfs
* allow per-clock debugfs entries that are specific to a particular
clock driver
* configure initial clock parent and/or initial clock rate from Device
Tree
* several feature enhancements to the composite clock type
* misc fixes
New clock drivers added include:
* TI Palmas PMIC
* Allwinner A23 SoC
* Qualcomm APQ8084 and IPQ8064 SoCs
* Rockchip rk3188, rk3066 and rk3288 SoCs
* STMicroelectronics STiH407 SoC
* Cirrus Logic CLPS711X SoC
Many fixes, feature enhancements and further clock tree support for
existing clock drivers also were merged, such as Samsung's "ARMCLK down"
power saving feature for their Exynos4 & Exynos5 SoCs.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJT38lmAAoJEDqPOy9afJhJh9YQAKROq+lrKaf+YAk22E0GCF30
Q+KZ9EcePdxWvcDPKsMIf/wAIYdtGDoI6wgyw1tcSWeLKwwyHMfVdOCExWig2gwl
/4LU2tACKe+Xa0HJnsbNwQGj2n4qMGOUsDeRRmK4rcbuHZhTP15IscmFCbL+sUia
z3uaYf7ty3a1tSXBl3NY4EpYAXGiE+MMVBoU08ATYOOjvGcxNNfu50JSltGXarqv
BFBjpv0oikN3RvbVyuUUvEF8m6AeNYhbqxI0IuNmoE+mAkgB2n221CK4Qv6a3oDb
QJebzRdeprcak8HrK76Ik6Dd9itcs03u6G1qwLc30JH5wUHYcgqA5bvqDIx+2W0J
Z7NPi3tFTry1aeXnZPk7DbWruzXLQkXkgRM4xHXsmezRnO7zDvuoDgUT0pIrS9+v
+BRIyfPiBL9Lp1J17R0I1K76O7YnvyQhX+0CdZx0SOJNGPl+SIwTI4q+gQoDIZqP
0ubpuaH4v6gZiEol2HXKYN9ASWyRtX7PfiexQgmts1aewlPopWfuc7LdxhHQIv3B
3O/7jbhdhXsf7VaTvx7xkFEMxjY7IwEF4pN0F+ulwWj/rLK3vLCnTwxgv8IrNHit
Dkzt4kVzLW/GSWa3irTnISvsg+bHkRc7aPuW/i0km7RYUuL2dcaJLEBPYuka/AdH
1xIMaGNpkA3HrS+8CQYf
=48y9
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus-3.17' of git://git.linaro.org/people/mike.turquette/linux
Pull clock framework updates from Mike Turquette:
"The clock framework changes for 3.17 are mostly additions of new clock
drivers and fixes/enhancements to existing clock drivers. There are
also some non-critical fixes and improvements to the framework core.
Changes to the clock framework core include:
- improvements to printks on errors
- flattening the previously hierarchal structure of per-clock entries
in debugfs
- allow per-clock debugfs entries that are specific to a particular
clock driver
- configure initial clock parent and/or initial clock rate from
Device Tree
- several feature enhancements to the composite clock type
- misc fixes
New clock drivers added include:
- TI Palmas PMIC
- Allwinner A23 SoC
- Qualcomm APQ8084 and IPQ8064 SoCs
- Rockchip rk3188, rk3066 and rk3288 SoCs
- STMicroelectronics STiH407 SoC
- Cirrus Logic CLPS711X SoC
Many fixes, feature enhancements and further clock tree support for
existing clock drivers also were merged, such as Samsung's "ARMCLK
down" power saving feature for their Exynos4 & Exynos5 SoCs"
* tag 'clk-for-linus-3.17' of git://git.linaro.org/people/mike.turquette/linux: (86 commits)
clk: Add missing of_clk_set_defaults export
clk: checking wrong variable in __set_clk_parents()
clk: Propagate any error return from debug_init()
clk: clps711x: Add DT bindings documentation
clk: Add CLPS711X clk driver
clk: st: Use round to closest divider flag
clk: st: Update frequency tables for fs660c32 and fs432c65
clk: st: STiH407: Support for clockgenA9
clk: st: STiH407: Support for clockgenD0/D2/D3
clk: st: STiH407: Support for clockgenC0
clk: st: Add quadfs reset handling
clk: st: Add polarity bit indication
clk: st: STiH407: Support for clockgenA0
clk: st: STiH407: Support for A9 MUX Clocks
clk: st: STiH407: Support for Flexgen Clocks
clk: st: Adds Flexgen clock binding
clk: st: Remove uncessary (void *) cast
clk: st: use static const for clkgen_pll_data tables
clk: st: use static const for stm_fs tables
clk: st: Update ST clock binding documentation
...
Pull percpu updates from Tejun Heo:
- Major reorganization of percpu header files which I think makes
things a lot more readable and logical than before.
- percpu-refcount is updated so that it requires explicit destruction
and can be reinitialized if necessary. This was pulled into the
block tree to replace the custom percpu refcnting implemented in
blk-mq.
- In the process, percpu and percpu-refcount got cleaned up a bit
* 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (21 commits)
percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero()
percpu-refcount: require percpu_ref to be exited explicitly
percpu-refcount: use unsigned long for pcpu_count pointer
percpu-refcount: add helpers for ->percpu_count accesses
percpu-refcount: one bit is enough for REF_STATUS
percpu-refcount, aio: use percpu_ref_cancel_init() in ioctx_alloc()
workqueue: stronger test in process_one_work()
workqueue: clear POOL_DISASSOCIATED in rebind_workers()
percpu: Use ALIGN macro instead of hand coding alignment calculation
percpu: invoke __verify_pcpu_ptr() from the generic part of accessors and operations
percpu: preffity percpu header files
percpu: use raw_cpu_*() to define __this_cpu_*()
percpu: reorder macros in percpu header files
percpu: move {raw|this}_cpu_*() definitions to include/linux/percpu-defs.h
percpu: move generic {raw|this}_cpu_*_N() definitions to include/asm-generic/percpu.h
percpu: only allow sized arch overrides for {raw|this}_cpu_*() ops
percpu: reorganize include/linux/percpu-defs.h
percpu: move accessors from include/linux/percpu.h to percpu-defs.h
percpu: include/asm-generic/percpu.h should contain only arch-overridable parts
percpu: introduce arch_raw_cpu_ptr()
...
Pull crypto update from Herbert Xu:
- CTR(AES) optimisation on x86_64 using "by8" AVX.
- arm64 support to ccp
- Intel QAT crypto driver
- Qualcomm crypto engine driver
- x86-64 assembly optimisation for 3DES
- CTR(3DES) speed test
- move FIPS panic from module.c so that it only triggers on crypto
modules
- SP800-90A Deterministic Random Bit Generator (drbg).
- more test vectors for ghash.
- tweak self tests to catch partial block bugs.
- misc fixes.
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (94 commits)
crypto: drbg - fix failure of generating multiple of 2**16 bytes
crypto: ccp - Do not sign extend input data to CCP
crypto: testmgr - add missing spaces to drbg error strings
crypto: atmel-tdes - Switch to managed version of kzalloc
crypto: atmel-sha - Switch to managed version of kzalloc
crypto: testmgr - use chunks smaller than algo block size in chunk tests
crypto: qat - Fixed SKU1 dev issue
crypto: qat - Use hweight for bit counting
crypto: qat - Updated print outputs
crypto: qat - change ae_num to ae_id
crypto: qat - change slice->regions to slice->region
crypto: qat - use min_t macro
crypto: qat - remove unnecessary parentheses
crypto: qat - remove unneeded header
crypto: qat - checkpatch blank lines
crypto: qat - remove unnecessary return codes
crypto: Resolve shadow warnings
crypto: ccp - Remove "select OF" from Kconfig
crypto: caam - fix DECO RSR polling
crypto: qce - Let 'DEV_QCE' depend on both HAS_DMA and HAS_IOMEM
...
This patch adds common part of dsi node.
Signed-off-by: YoungJun Cho <yj44.cho@samsung.com>
Acked-by: Inki Dae <inki.dae@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
This patch adds sysreg property to fimd device node
which is required to use I80 interface.
Signed-off-by: YoungJun Cho <yj44.cho@samsung.com>
Acked-by: Inki Dae <inki.dae@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
This patch adds sysreg property to fimd device node
which is required to use I80 interface.
Signed-off-by: YoungJun Cho <yj44.cho@samsung.com>
Acked-by: Inki Dae <inki.dae@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
This driver is needed by SATA, PCIe and USB modules on TI SoCs.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
I don't know if we really need 64 bits here but these variables are
declared as u64 and it can't hurt to cast this so we prevent any shift
wrapping.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Aubrey Li <aubrey.li@linux.intel.com>
Link: http://lkml.kernel.org/r/20140801082715.GE28869@mwanda
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
clean up names related to socket filtering and bpf in the following way:
- everything that deals with sockets keeps 'sk_*' prefix
- everything that is pure BPF is changed to 'bpf_*' prefix
split 'struct sk_filter' into
struct sk_filter {
atomic_t refcnt;
struct rcu_head rcu;
struct bpf_prog *prog;
};
and
struct bpf_prog {
u32 jited:1,
len:31;
struct sock_fprog_kern *orig_prog;
unsigned int (*bpf_func)(const struct sk_buff *skb,
const struct bpf_insn *filter);
union {
struct sock_filter insns[0];
struct bpf_insn insnsi[0];
struct work_struct work;
};
};
so that 'struct bpf_prog' can be used independent of sockets and cleans up
'unattached' bpf use cases
split SK_RUN_FILTER macro into:
SK_RUN_FILTER to be used with 'struct sk_filter *' and
BPF_PROG_RUN to be used with 'struct bpf_prog *'
__sk_filter_release(struct sk_filter *) gains
__bpf_prog_release(struct bpf_prog *) helper function
also perform related renames for the functions that work
with 'struct bpf_prog *', since they're on the same lines:
sk_filter_size -> bpf_prog_size
sk_filter_select_runtime -> bpf_prog_select_runtime
sk_filter_free -> bpf_prog_free
sk_unattached_filter_create -> bpf_prog_create
sk_unattached_filter_destroy -> bpf_prog_destroy
sk_store_orig_filter -> bpf_prog_store_orig_filter
sk_release_orig_filter -> bpf_release_orig_filter
__sk_migrate_filter -> bpf_migrate_filter
__sk_prepare_filter -> bpf_prepare_filter
API for attaching classic BPF to a socket stays the same:
sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *)
and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program
which is used by sockets, tun, af_packet
API for 'unattached' BPF programs becomes:
bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *)
and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program
which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
to indicate that this function is converting classic BPF into eBPF
and not related to sockets
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ARM fixes from Russell King:
"A few fixes for ARM. Some of these are correctness issues:
- TLBs must be flushed after the old mappings are removed by the DMA
mapping code, but before the new mappings are established.
- An off-by-one entry error in the Keystone LPAE setup code.
Fixes include:
- ensuring that the identity mapping for LPAE does not remove the
kernel image from the identity map.
- preventing userspace from trapping into kgdb.
- fixing a preemption issue in the Intel iwmmxt code.
- fixing a build error with nommu.
Other changes include:
- Adding a note about which areas of memory are expected to be
accessible while the identity mapping tables are in place"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8124/1: don't enter kgdb when userspace executes a kgdb break instruction
ARM: idmap: add identity mapping usage note
ARM: 8115/1: LPAE: reduce damage caused by idmap to virtual memory layout
ARM: fix alignment of keystone page table fixup
ARM: 8112/1: only select ARM_PATCH_PHYS_VIRT if MMU is enabled
ARM: 8100/1: Fix preemption disable in iwmmxt_task_enable()
ARM: DMA: ensure that old section mappings are flushed from the TLB
The kgdb breakpoint hooks (kgdb_brk_fn and kgdb_compiled_brk_fn)
should only be entered when a kgdb break instruction is executed
from the kernel. Otherwise, if kgdb is enabled, a userspace program
can cause the kernel to drop into the debugger by executing either
KGDB_BREAKINST or KGDB_COMPILED_BREAK.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add a note about the usage of the identity mapping; we do not support
accesses outside of the identity map region and kernel image while a
CPU is using the identity map. This is because the identity mapping
may overwrite vmalloc space, IO mappings, the vectors pages, etc.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add further comments to the early page table remap code to explain what
the code is doing, why it is doing it, but more importantly to explain
that the code is not architecturally compliant and is squarely in
"UNPREDICTABLE" behaviour territory.
Add a warning and tainting of the kernel too.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
With SCU standby enabled, SCU CLK will be turned off when all processors
are in WFI mode. And the clock will be turned on when any processor
leaves WFI mode.
This behavior should be preferable in terms of power efficiency of
system idle. So let's set the SCU standby bit to enable the support in
function scu_enable().
Cortex-A9 earlier than r2p0 has no standby bit in SCU, so we need to
skip setting the bit for those.
Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Acked-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Use macro instead of magic number for SCU enable bit.
Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Acked-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Common SHA-1 structures are defined in <crypto/sha.h> for code sharing.
This patch changes SHA-1/ARM glue code to use these structures.
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull x86 fix from Peter Anvin:
"A single fix to not invoke the espfix code on Xen PV, as it turns out
to oops the guest when invoked after all. This patch leaves some
amount of dead code, in particular unnecessary initialization of the
espfix stacks when they won't be used, but in the interest of keeping
the patch minimal that cleanup can wait for the next cycle"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86_64/entry/xen: Do not invoke espfix64 on Xen
The bus devices created to be parents for other peripherals
were using platform_bus as a parent, not being platform
devices themselves. Remove the references, making them
virtual devices instead.
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since checkin
411cf9ee29 x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
... is_vsmp_box() is only used in vsmp_64.c and does not have any
header file declaring it, so make it explicitly static.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Shai Fultheim <shai@scalemp.com>
Cc: Oren Twaig <oren@scalemp.com>
Link: http://lkml.kernel.org/r/1404036068-11674-1-git-send-email-oren@scalemp.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This patch implements the stack protector code in MIPS compressed boot
phase based on the same code added to arm in commit
8779657d29 "stackprotector: Introduce
CONFIG_CC_STACKPROTECTOR_STRONG" by Kees Cook <keescook@chromium.org>
Signed-off-by: Ben Chan <benchan@chromium.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Olof Johansson <olofj@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7175/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The MIPS_CONF4_FTLBSETS_SHIFT define is cut and pasted twice so we can
remove the second define.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: John Crispin <blogic@openwrt.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: kernel-janitors@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7063/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Initialise the MAARs such that speculation is enabled for all physical
addresses outside of the I/O region.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7333/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add initialisation for Memory Accessibility Attribute Registers. Generic
code cannot know the platform-specific requirements with regards to
speculative accesses, so it simply calls a platform_maar_init function
which platforms with MAARs are expected to implement by calling the
provided write_maar_pair function & returning the number of MAAR pairs
used. A weak default implementation will simply use no MAAR pairs. Any
present but unused MAAR pairs are then marked invalid, effectively
disabling them.
The end result of this patch is that MAARs are all marked invalid, until
platforms implement the platform_maar_init function.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7331/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Detect the presence of MAAR using the MRP bit in Config5, and record
that presence using a CPU option bit. A cpu_has_maar macro will then
allow code to conditionalise upon the presence of MAARs.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7330/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add accessor macros for the Memory Accessibility Attribute Registers
(MAARs), the bits contained within the MAARs & the Config5.MRP bit
indicating their presence. The only current use of the MAARs is to
enable speculative accesses to regions of memory. Besides the potential
performance benefits of speculative accesses, they are a requirement
for the P5600 core to handle non-128b-aligned MSA vector loads & stores
rather than generating an address error.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7329/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In light of the commit 16f77de82f (Revert "MIPS: Save/restore MSA
context around signals") the MSA support in the kernel is incomplete.
Until the replacement for the former sigcontext changes is agreed upon
and in tree, mark MSA experimental & disable it by default.
MSA is only implemented by one CPU supported by the kernel, the P5600.
The P5600 is a 32 bit core, and thus MSA can only be used when the
experimental CONFIG_MIPS_O32_FP64_SUPPORT option is enabled. Therefore
MSA is only being used in experimental settings anyway and this change
doesn't actually make any difference beyond clarifying the state of
MSA support.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7311/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MSA requires that Status.FR == 1, so for MIPS32 tasks MSA can only be
used if CONFIG_MIPS_O32_FP64_SUPPORT is enabled. If it is not & the
kernel is 32bit, there's no point including support for MSA.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7310/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The TIF_MSA_CTX_LIVE flag (indicating that a task has MSA context which
needs to be preserved) was being cleared in start_thread, but the
TIF_USEDMSA flag (indicating that a task has used MSA in this timeslice)
was not. In copy_thread neither flag was cleared, but both need to be.
Without clearing these flags the kernel will proceed to attempt to save
MSA context when the task is context switched out, and if the task had
not used MSA in the meantime then it will fail because MSA or the FPU
are disabled. The end result is typically:
do_cpu invoked from kernel context![#1]:
CPU: 0 PID: 99 Comm: sh Not tainted 3.16.0-rc4-00025-g6dc9476-dirty #88
task: 8f23dc60 ti: 8f1d8000 task.ti: 8f1d8000
...
Call Trace:
[<8010edbc>] resume+0x5c/0x280
[<80481e0c>] __schedule+0x370/0x800
[<80104838>] work_resched+0x8/0x2c
Fix by consistently clearing both flags in both functions.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7309/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The MSA specification upon first read appears to suggest that it is safe
to perform vector loads & stores with arbitrary alignment. However it
leaves provision for "address-dependent exceptions"... Align the vector
context to a 16 byte boundary to ensure that the kernel cannot cause any
such exceptions.
Note that the fpu field of struct thread_struct was already at a 16 byte
boundary within the struct, the introduction of FPU_ALIGN simply makes
the requirement explicit. The only part of this impacting the generated
kernel binary is ARCH_MIN_TASKALIGN.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7308/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Preemption must be disabled throughout the process of enabling the FPU,
enabling MSA & initialising the vector registers. Without doing so it
is possible to lose the FPU or MSA whilst initialising them causing
that initialisation to fail.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7307/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The kernel relies upon MSA being disabled when a task begins running,
so that it can initialise or restore context in response to the
resulting MSA disabled exception. Previously the state of MSA following
boot was left as it was before the kernel ran, where MSA could
potentially have been enabled. Explicitly disable it during boot to
prevent any problems.
As a nice side effect the code reads a little better too.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7306/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit d96cc3d1ec "MIPS: Add microMIPS MSA support." attempted to use
the value of a macro within an inline asm statement but instead emitted
a comment leading to the cfcmsa & ctcmsa instructions being omitted. Fix
that by passing CFC_MSA_INSN & CTC_MSA_INSN as arguments to the asm
statements.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7305/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
If a task does not execute scalar FP instructions prior to using MSA
then the flags indicating that the task has live MSA context were not
being set. The upper 64b of each vector register would then be lost
upon the tasks first context switch after using MSA.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7500/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When a task first makes use of MSA we need to ensure that the upper
64b of the vector registers are set to some value such that no
information can be leaked to it from the previous task to use MSA
context on the CPU. The architecture formerly specified that these
bits would be cleared to 0 when a scalar FP instructions wrote to the
aliased FP registers, which would have implicitly handled this as the
kernel restored scalar FP context. However more recent versions of the
specification now state that the value of the bits in such cases is
unpredictable. Initialise them explictly to be sure, and set all the
bits to 1 rather than 0 for consistency with the least significant
64b.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7497/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The kernel depends upon MSA never being enabled when the FPU is not, a
condition which is currently violated in a few places (whilst saving
sigcontext, following mips_cpu_save). Catch all the problem cases by
disabling MSA in lose_fpu, after saving context if necessary.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7302/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Switching the vector context implicitly saves & restores the state of
the aliased scalar FP data registers, however the scalar FP control
& status register is distinct from the MSA control & status register.
In order to allow scalar FP to function correctly in programs using
MSA, the scalar CSR needs to be saved & restored along with the MSA
vector context.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7301/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
I added a field for the MSACSR register in struct mips_fpu_struct, but
never actually made use of it... This is a clear bug. Save and restore
the MSACSR register along with the vector registers.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7300/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Just #ifdef away the C functions when included from an assembly file,
as will be done in a following commit.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7299/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add a definition for D-Link DSR-1000N router. The bootloader on this board
supplies 20006 in the bootinfo; the enum CVMX_BOARD_TYPE_CUST_DSR1000N
comes from the GPL sources of the board.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: David Daney <ddaney.cavm@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/7217/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Disable HOTPLUG_CPU functionality if the bootloader version is incorrect.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: linux-mips@linux-mips.org
Cc: David Daney <ddaney.cavm@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/7200/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
If nosmp kernel option given, we can assume HOTPLUG_CPU is disabled.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7202/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The same check is already done earlier in octeon_smp_hotplug_setup().
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Patchwork: https://patchwork.linux-mips.org/patch/7199/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The GICBIS macro could update the GIC registers incorrectly, depending
on the data value passed in:
* Bits were only OR'd into the register data, so register fields could
not be cleared.
* Bits were OR'd into the register data without masking the data to the
correct field width, corrupting adjacent bits.
Signed-off-by: Jeffrey Deans <jeffrey.deans@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7378/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The Malta malta_ipi_irqdispatch() routine now checks only IPI interrupts
when handling IPIs. It could previously call do_IRQ() for non-IPIs, and
also call do_IRQ() with an invalid IRQ number if there were no pending
GIC interrupts when gic_get_int() was called.
Signed-off-by: Jeffrey Deans <jeffrey.deans@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7377/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Move most of the functionality of gic_get_int() into a new function
gic_get_int_mask() which takes a bitmask of interrupts in which the
caller is interested, and returns the subset which are pending for the
current CPU.
This allows CP0 IRQ dispatch routines to check only the GIC interrupts
which are routed to a particular CPU interrupt input.
gic_get_int() is reimplemented using gic_get_int_mask() and is retained
for use by any platforms for which gic_get_int() is sufficient.
Signed-off-by: Jeffrey Deans <jeffrey.deans@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7376/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
A GIC interrupt which is declared as having a GIC_MAP_TO_NMI_MSK
mapping causes the cpu parameter to gic_setup_intr() to be increased
to 32, causing memory corruption when pcpu_masks[] is written to again
later in the function.
Signed-off-by: Jeffrey Deans <jeffrey.deans@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: stable@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7375/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
irq-gic.c:gic_get_int() masks out interrupts from the pending set which
aren’t in the pcpu_mask. Only interrupts marked with GIC_FLAG_IPI were
set in pcpu_mask, meaning that peripheral interrupts also had to be
marked as IPIs. Remove the use of GIC_FLAG_IPI and allow the flags
member of struct gic_intr_map to be zero.
Signed-off-by: Jeffrey Deans <jeffrey.deans@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7374/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The value of GIC_NUM_INTRS is platform-specific. Using a default value
from gic.h will result in incorrect behaviour on some systems, so
require a suitable definition to be present in the platform's irq.h.
Signed-off-by: Jeffrey Deans <jeffrey.deans@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7373/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Several bitmaps are declared in arch/mips/include/asm/gic.h, but the
scope of their use is limited to arch/mips/kernel/irq-gic.c. Move the
declarations from the header file to the C file.
Signed-off-by: Jeffrey Deans <jeffrey.deans@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7372/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Detect if the core supports unique exception codes for the
Read-Inhibit and Execute-Inhibit exceptions and set the
option accordingly. The RI/XI exception support is detected
by setting the 27th bit (IEC) of the PageGrain C0 register
and reading back the value of that register to verify the
bit is enabled.
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7340/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Use the regular tlb_do_page_fault_0 (no write) handler to handle
the RI and XI exceptions. Also skip the RI/XI validation check
on TLB load handler since it's redundant when the CPU has
unique RI/XI exceptions.
Singed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7339/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPSr5 added support for unique exception codes for the Read-Inhibit
and Execute-Inhibit exceptions.
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7338/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The Hardware Page Table Walker aims to speed up TLB refill exceptions
by handling them in the hardware level instead of having a software
TLB refill handler. However, a TLB refill exception can still be
thrown in certain cases such as, synchronus exceptions, or address
translation or memory errors during the HTW operation. As a result of
which, HTW must not be considered a complete replacement for the TLB
refill software handler, but rather a fast-path for it.
For HTW to work, the PWBase register must contain the task's page
global directory address so the HTW will kick in on TLB refill
exceptions.
Due to HTW being a separate engine embedded deep in the CPU pipeline,
we need to restart the HTW everytime a PTE changes to avoid HTW
fetching a old entry from the page tables. It's also necessary to
restart the HTW on context switches to prevent it from fetching a
page from the previous process. Finally, since HTW is using the
entryhi register to write the translations to the TLB, it's necessary
to stop the HTW whenever the entryhi changes (eg for tlb probe
perations) and enable it back afterwards.
== Performance ==
The following trivial test was used to measure the performance of the
HTW. Using the same root filesystem, the following command was used
to measure the number of tlb refill handler executions with and
without (using 'nohtw' kernel parameter) HTW support. The kernel was
modified to use a scratch register as a counter for the TLB refill
exceptions.
find /usr -type f -exec ls -lh {} \;
HTW Enabled:
TLB refill exceptions: 12306
HTW Disabled:
TLB refill exceptions: 17805
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Markos Chandras <markos.chandras@imgtec.com>
Patchwork: https://patchwork.linux-mips.org/patch/7336/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Detect if the core implements the HTW and set the option accordingly.
Also, add a new kernel parameter called 'nohtw' allowing
the user to disable the htw support and fallback to the software
refill handler.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7335/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Moreover, report hardware page table walker support as 'htw' in the ASE
list of /proc/cpuinfo, if the core implements this feature.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7334/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Long integers which are 4 bytes in MIPS32 can't hold new CPU
options anymore, so the type of the 'options' variable is changed
to unsigned long long which allows 32 more cpu options to be defined
for MIPS32
Also, re-arrange the 'options' struct member to avoid potential 4-byte
alignment gap in the middle of the struct.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7324/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add cases in perf_event_mipsxx.c for CPU_P5600. All the event numbers
listed for proAptiv also apply to P5600, so we use mipsxxcore_event_map2
and mipsxxcore_cache_map2 too, but the P5600 has 8-bit event numbers so
bit 8 (256) of the user ABI config is used for the parity bit (to
specify odd/even counter events).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7242/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In mipsxx_pmu_map_raw_event(), set event_id to base_id after the cpu
type conditional code to allow that code to override the base_id to use
more bits from the config and a higher bit for parity.
This will allow cores with up to 512 events between all even/odd
counters (an 8-bit event id) such as P5600 to use bit 8 for parity.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7243/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This header defines an exported interface (the register layout used in
core dumps and the GP regset accessible with PTRACE_{GET,SET}REGSET),
therefore belongs in uapi.
Signed-off-by: Alex Smith <alex@alex-smith.me.uk>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7458/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The struct user definition in this file is not used anywhere (the ELF
core dumper does not use that format). Therefore, remove the header and
instead enable the asm-generic user.h which is an empty header to
satisfy a few generic headers which still try to include user.h.
Signed-off-by: Alex Smith <alex@alex-smith.me.uk>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7459/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Since the core dumper now uses regsets, the old core dump functions are
now unused. Remove them.
Signed-off-by: Alex Smith <alex@alex-smith.me.uk>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7456/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In uapi/asm/ptrace.h, a user version of pt_regs is defined wrapped in
ifndef __KERNEL__. This structure definition does not match anything
used by any kernel API, in particular it does not match the format used
by PTRACE_{GET,SET}REGS.
Therefore, replace the structure definition with one matching what is
used by PTRACE_{GET,SET}REGS. The format used by these is the same for
both 32-bit and 64-bit.
Also, change the implementation of PTRACE_{GET,SET}REGS to use this new
structure definition. The structure is renamed to user_pt_regs when
__KERNEL__ is defined to avoid conflicts with the kernel's own pt_regs.
Signed-off-by: Alex Smith <alex@alex-smith.me.uk>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7457/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
A comment in the O32/32-bit system call code is incorrect since commit
46e12c07b3 ("MIPS: O32 / 32-bit: Always copy 4 stack arguments.").
Remove it.
Signed-off-by: Alex Smith <alex@alex-smith.me.uk>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7455/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
On 32-bit/O32, pt_regs has a padding area at the beginning into which the
syscall arguments passed via the user stack are copied. 4 arguments
totalling 16 bytes are copied to offset 16 bytes into this area, however
the area is only 24 bytes long. This means the last 2 arguments overwrite
pt_regs->regs[{0,1}].
If a syscall function returns an error, handle_sys stores the original
syscall number in pt_regs->regs[0] for syscall restart. signal.c checks
whether regs[0] is non-zero, if it is it will check whether the syscall
return value is one of the ERESTART* codes to see if it must be
restarted.
Should a syscall be made that results in a non-zero value being copied
off the user stack into regs[0], and then returns a positive (non-error)
value that matches one of the ERESTART* error codes, this can be mistaken
for requiring a syscall restart.
While the possibility for this to occur has always existed, it is made
much more likely to occur by commit 46e12c07b3 ("MIPS: O32 / 32-bit:
Always copy 4 stack arguments."), since now every syscall will copy 4
arguments and overwrite regs[0], rather than just those with 7 or 8
arguments.
Since that commit, booting Debian under a 32-bit MIPS kernel almost
always results in a hang early in boot, due to a wait4 syscall returning
a PID that matches one of the ERESTART* codes, which then causes an
incorrect restart of the syscall.
The problem is fixed by increasing the size of the padding area so that
arguments copied off the stack will not overwrite pt_regs->regs[{0,1}].
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Cc: <stable@vger.kernel.org> # v3.13+
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7454/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
A DT bugfix for Nomadik that had an ambigouos double-inversion of a gpio
line, and one MAINTAINER URL update that might as well go in now.
We could hold off until the merge window, but then we'll just have to mark
the DT fix for stable and it just seems like in total causing more work.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJT2dyEAAoJEIwa5zzehBx3rZMQAKwoGghFh2t/4USXOWEDT1ns
B0cPMvg3wjlOxnJsC4Nvi9BclIBBlUImHoHYG8Un7/aJXkuRpOC36Zg2wWDJUkVC
NZOwHAj/wBwPTKQnHKSQ2Nyw/h1t1FidlWt07GcFxmYvl00pfE707MHKiRTd/pUY
T4/VELaUXmcJ2/9WqscNLi4TzuQBd5eBb1n48GRzVsTfjJo7jXExCsuKWNOpbPar
eDqZjw7KAU73T8d0hJmZtJuI8iZ0I3mMYFrA7Sp1srXEpw/ZKfvn6+SBH3CzZdxz
oDXZYYBl4wYuqqW7I42esZkcsyGmCE58KS0tYWZgvQUJzrHeyV6myuEn2qt2ROzQ
Ii5huDBSbxNOk2D+v5JevBG8SuDFMS6Op1Gj9fUB7F/+P+EDzccwSkejPZYAnHDd
JHWiks6CU4xRzlSrPb4AtQ3UkX0GO1QPlhhT/TQD0wOAn9XMkC9hfYBBn+pnusXm
l8F+DSGx2UUziwF2f4tTGcpKeiTN71g8xizHFs4LHhctnVeb/G68a5s1Cxsfq7Bd
tW3Y/b1sy5q/TllKTL6qTSqTS+GPldV/w4IwBm4tJs3Q5lMoJErvFQqTEPpIwcRP
xDVBT0BpEStvxMNYVhIEE4C50k02zLhqS2CH9pqzHVBFYhhNlp2g23QDsbU2yWgG
sPg0G+Ts8yxeS1DZotnq
=sz9W
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM straggler SoC fix from Olof Johansson:
"A DT bugfix for Nomadik that had an ambigouos double-inversion of a
gpio line, and one MAINTAINER URL update that might as well go in now.
We could hold off until the merge window, but then we'll just have to
mark the DT fix for stable and it just seems like in total causing
more work"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
MAINTAINERS: Update Tegra Git URL
ARM: nomadik: fix up double inversion in DT
not boot since 3.6-rc1. The problem is that user_addr_max returned the biggest
available RAM address which makes some copy_from_user variants fail to read
from XIP memory.
Even in the presence of one of the two fixes the other still makes sense, so
both patches are included here.
This problem was the last one preventing efm32 boot to a prompt with mainline.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iEYEABECAAYFAlOyfhsACgkQ6suMTIUe0hZWggCePaoe/S+aDki9B2ASCY0zVkRq
XE8AoM5G4yRgnL3zitI2ftvvlp4xx1mS
=4Vjn
-----END PGP SIGNATURE-----
Merge tag 'nommu-for-rmk' of git://git.pengutronix.de/git/ukl/linux into devel-stable
Two different fixes for the same problem making some ARM nommu configurations
not boot since 3.6-rc1. The problem is that user_addr_max returned the biggest
available RAM address which makes some copy_from_user variants fail to read
from XIP memory.
Even in the presence of one of the two fixes the other still makes sense, so
both patches are included here.
This problem was the last one preventing efm32 boot to a prompt with mainline.
Commit
Commit 1d40cfcd34
Author: Ralf Baechle <ralf@linux-mips.org>
Date: Fri Jul 15 15:23:23 2005 +0000
Avoid SMP cacheflushes. This is a minor optimization of startup but
will also avoid smp_call_function from doing stupid things when called
from a CPU that is not yet marked online.
missed an appropriate cache flush of TLB refill handler because that time it was
at fixed location CAC_BASE. After years the refill handler in EBASE vector
is not at that location and can be allocated in some another memory and needs
I-cache sync as other TLB exception vectors.
Besides that, the new function - local_flash_icache_range() was introduced
to avoid SMP cacheflushes.
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: paul.gortmaker@windriver.com
Cc: jchandra@broadcom.com
Cc: linux-kernel@vger.kernel.org
Cc: david.daney@cavium.com
Patchwork: https://patchwork.linux-mips.org/patch/7312/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Rename usb_nop_xceiv to usb_phy_generic in platform data to match the
name change of the nop transceiver driver in commit 4525bee (usb: phy:
rename usb_nop_xceiv to usb_phy_generic).
The name change induced a kernel panic due to an unhandled kernel
unaligned access while trying to dereference musb->xceiv->io_ops in
musb_init_controller().
Signed-off-by: Apelete Seketeli <apelete@seketeli.net>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: John Crispin <blogic@openwrt.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7263/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit d6d3c9afaa (MIPS: MT: proc: Add support for printing VPE and TC
ids) causes a link error when CONFIG_PROC_FS=n:
arch/mips/built-in.o: In function `proc_cpuinfo_notifier_init':
smp-mt.c: undefined reference to `register_proc_cpuinfo_notifier'
This is fixed by adding an ifdef around the procfs handling code
in smp-mt.c.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reported-by: Markos Chandras <markos.chandras@imgtec.com>
Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # >= 3.15
Patchwork: https://patchwork.linux-mips.org/patch/7244/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The IP_PKTOPTIONS sockopt puts control messages in option_values, these
need to be handled differently in the compat case. This is already done
through the MSG_CMSG_COMPAT flag, we just need to use
compat_sys_getsockopt which sets that flag.
Signed-off-by: Sorin Dumitru <sdumitru@ixiacom.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7115/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
If a call to device_create() fails for a channel during the initialize
loop, we need to clean the devices entries already created before
leaving.
Signed-off-by: Sebastien Bourdelin <sebastien.bourdelin@savoirfairelinux.com>
Cc: linux-mips@linux-mips.org
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Cc: John Crispin <blogic@openwrt.org>
Cc: Qais Yousef <Qais.Yousef@imgtec.com>
Cc: linux-kernel@vger.kernel.org
Cc: Jerome Oufella <jerome.oufella@savoirfairelinux.com>
Patchwork: https://patchwork.linux-mips.org/patch/7111/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Due to a missing newline in the I-cache policy detection log output,
it's possible to get some ratehr unfortunate output at boot time:
CPU1: Booted secondary processor
Detected VIPT I-cache on CPU1CPU2: Booted secondary processor
Detected VIPT I-cache on CPU2CPU3: Booted secondary processor
Detected VIPT I-cache on CPU3CPU4: Booted secondary processor
Detected PIPT I-cache on CPU4CPU5: Booted secondary processor
Detected PIPT I-cache on CPU5Brought up 6 CPUs
SMP: Total of 6 processors activated.
This patch adds the missing newline to the format string, cleaning up
the output.
Fixes: 59ccc0d41b ("arm64: cachetype: report weakest cache policy")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The large segment table entry format has block of bits for the
ACC/F values for the large page. These bits are valid only if
another bit (AV bit 0x10000) of the segment table entry is set.
The ACC/F bits do not have a meaning if the AV bit is off.
This allows to put the THP splitting bit, the segment young bit
and the new segment dirty bit into the ACC/F bits as long as
the AV bit stays off. The dirty and young information is only
available if the pmd is large.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Commit f0a3eaff71 (ARM64: KVM: fix big endian issue in
access_vm_reg for 32bit guest) changed the way we handle CP15
VM accesses, so that all 64bit accesses are done via vcpu_sys_reg.
This looks like a good idea as it solves indianness issues in an
elegant way, except for one small detail: the register index is
doesn't refer to the same array! We end up corrupting some random
data structure instead.
Fix this by reverting to the original code, except for the introduction
of a vcpu_cp15_64_high macro that deals with the endianness thing.
Tested on Juno with 32bit SMP guests.
Cc: Victor Kamensky <victor.kamensky@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
commit ec66ad66a0 (s390/mm: enable
split page table lock for PMD level) activated the split pmd lock
for s390. Turns out that we missed one place: We also have to take
the pmd lock instead of the page table lock when we reallocate the
page tables (==> changing entries in the PMD) during sie enablement.
Cc: stable@vger.kernel.org # 3.15+
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix coding style problems in arch/arm/mach-omap2/control.c.
Signed-off-by: Jeremy Vial <jvial@adeneo-embedded.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
According to the comment “restore_es3: applies to 34xx >= ES3.0" in
"arch/arm/mach-omap2/sleep34xx.S”, omap3_restore_es3 should be used
if the revision of an OMAP34xx is ES3.1.2.
Signed-off-by: Jeremy Vial <jvial@adeneo-embedded.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tony Lindgren <tony@atomide.com>
This is based on tags/exynos-power because this DT changes
are depending PMU cleanup.
Fixes boot for exynos5260 and exynos5410,
- Since exynos cannot boot without obtaining PMU address via
DT from now on, add PMU node for exynos5260 and exynos5410
For preparing exynos5250-spring,
- move max77686 and cypress,cyapa trackpad from exynos5250-
cros-common to exynos5250-snow DT file
(Note exynos5250-spring is not included in this branch yet)
For exynos3250,
- add TMU node and remove duplicated interrupt-parent
- add missing pinctrl property for uart0 and uart1
For exynos5250-smdk5250 board
- add max77686 pmic interrupt property which is connected to
gpx3
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJT2of/AAoJEA0Cl+kVi2xqNtkP/RzgKbqZfBVFVj7j9gC/c8qs
0Cnh3l6JHXRQXMLvL7vI9MXS41kH0jaLWxSDgZlf87pARINT2zDP6KNAYleJJz3j
+VX++YP6e4Yhw72Jzxuzqjjdu0IMNoo0JL7T5w/aHK5C186YKM7mEuG98+/qbwBh
nF2pEPlQ9x3D78XawyC8IdGBkB6YlpqWWYThPDVGN2wvm14Jf9CRJFVyDjcjO281
y4K7cXnxKYzc51Oih5ktzT0K4oHaHag5biLLOwBCCQFJVr6ZWbXG/Nd849v8chG+
Cfg9ckkch+CGre8NsdW4qfu1UGIK3qubHw9fKZzm8j9/0PANgLMfrX4/JKFhPbsQ
2Hluc3u4UFkNheDJ4xfY127MQCvIlXwUZLAhDS3PGvm7nP/BiPjpX4HsLsBq11we
wRCEtiD1aWb5XcVUyJQfUIETEfQDoc7bU9YjSEpz30ilTc/F5rLs2/zAgrz4/yLE
rhFyGl4qWW8LbveD6X7QjZ2mLEG5kYOd0job2SfJeXN0uQDJV3JpZbhJ7pvHDQTk
3OiR7xtrpfOsMGjxW3kpCs/kz9h7MLwbIdlUa5cFl3kpTTLmexo+0TagzcGIus+z
Yzt02gnM5zLUYmy0YjDI2YqvL17urTnLvrQnUeM/WlXOHPJkSnHvh9ywbsm9SrIe
iGQG4cuxEtVk+s6ECVUf
=aXEe
-----END PGP SIGNATURE-----
Merge tag 'samsung-dt-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into next/dt
Merge "Samsung DT 2nd updates for v3.17" from Kukjin Kim:
This is based on tags/exynos-power because this DT changes
are depending PMU cleanup.
Fixes boot for exynos5260 and exynos5410,
- Since exynos cannot boot without obtaining PMU address via
DT from now on, add PMU node for exynos5260 and exynos5410
For preparing exynos5250-spring,
- move max77686 and cypress,cyapa trackpad from exynos5250-
cros-common to exynos5250-snow DT file
(Note exynos5250-spring is not included in this branch yet)
For exynos3250,
- add TMU node and remove duplicated interrupt-parent
- add missing pinctrl property for uart0 and uart1
For exynos5250-smdk5250 board
- add max77686 pmic interrupt property which is connected to
gpx3
* tag 'samsung-dt-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung: (28 commits)
ARM: dts: Add missing pinctrl for uart0/1 for exynos3250
ARM: dts: Remove duplicate 'interrput-parent' property for exynos3250
ARM: dts: Add TMU dt node to monitor the temperature for exynos3250
ARM: dts: Specify MAX77686 pmic interrupt for exynos5250-smdk5250
ARM: dts: cypress,cyapa trackpad is exynos5250-Snow only
ARM: dts: max77686 is exynos5250-snow only
ARM: EXYNOS: Add exynos5260 PMU compatible string to DT match table
ARM: dts: Add PMU DT node for exynos5260 SoC
ARM: EXYNOS: Add support for Exynos5410 PMU
ARM: dts: Add PMU to exynos5410
ARM: dts: Document exynos5410 PMU
ARM: EXYNOS: Move cpufreq and cpuidle device registration to init_machine
ARM: EXYNOS: Refactored code for using PMU address via DT
ARM: EXYNOS: Support cluster power off on exynos5420/5800
ARM: EXYNOS: populate suspend and powered_up callbacks for mcpm
ARM: EXYNOS: do not allow cpuidle registration for exynos5420
cpuidle: big.LITTLE: init driver for exynos5420
cpuidle: big.LITTLE: Add ARCH_EXYNOS entry in config
ARM: EXYNOS: add generic function to calculate cpu number
cpuidle: big.LITTLE: add of_device_id structure
...
Signed-off-by: Olof Johansson <olof@lixom.net>
This patch adds socfpga Ethernet filter attributes for multicast
and unicast filters per Synopsys Ethernet IP configuration chosen
by Altera for the Cyclone 5 and Arria SOC FPGAs.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removing a debug message for setting the identity map since it becomes
rather noisy after rework of the identity map code.
Signed-off-by: Matthew Rushton <mrushton@amazon.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
This has been run through Intel's LKP tests across a wide range
of modern sytems and workloads and it wasn't shown to make a
measurable performance difference positive or negative.
Now that we have some shiny new tracepoints, we can actually
figure out what the heck is going on.
During a kernel compile, 60% of the flush_tlb_mm_range() calls
are for a single page. It breaks down like this:
size percent percent<=
V V V
GLOBAL: 2.20% 2.20% avg cycles: 2283
1: 56.92% 59.12% avg cycles: 1276
2: 13.78% 72.90% avg cycles: 1505
3: 8.26% 81.16% avg cycles: 1880
4: 7.41% 88.58% avg cycles: 2447
5: 1.73% 90.31% avg cycles: 2358
6: 1.32% 91.63% avg cycles: 2563
7: 1.14% 92.77% avg cycles: 2862
8: 0.62% 93.39% avg cycles: 3542
9: 0.08% 93.47% avg cycles: 3289
10: 0.43% 93.90% avg cycles: 3570
11: 0.20% 94.10% avg cycles: 3767
12: 0.08% 94.18% avg cycles: 3996
13: 0.03% 94.20% avg cycles: 4077
14: 0.02% 94.23% avg cycles: 4836
15: 0.04% 94.26% avg cycles: 5699
16: 0.06% 94.32% avg cycles: 5041
17: 0.57% 94.89% avg cycles: 5473
18: 0.02% 94.91% avg cycles: 5396
19: 0.03% 94.95% avg cycles: 5296
20: 0.02% 94.96% avg cycles: 6749
21: 0.18% 95.14% avg cycles: 6225
22: 0.01% 95.15% avg cycles: 6393
23: 0.01% 95.16% avg cycles: 6861
24: 0.12% 95.28% avg cycles: 6912
25: 0.05% 95.32% avg cycles: 7190
26: 0.01% 95.33% avg cycles: 7793
27: 0.01% 95.34% avg cycles: 7833
28: 0.01% 95.35% avg cycles: 8253
29: 0.08% 95.42% avg cycles: 8024
30: 0.03% 95.45% avg cycles: 9670
31: 0.01% 95.46% avg cycles: 8949
32: 0.01% 95.46% avg cycles: 9350
33: 3.11% 98.57% avg cycles: 8534
34: 0.02% 98.60% avg cycles: 10977
35: 0.02% 98.62% avg cycles: 11400
We get in to dimishing returns pretty quickly. On pre-IvyBridge
CPUs, we used to set the limit at 8 pages, and it was set at 128
on IvyBrige. That 128 number looks pretty silly considering that
less than 0.5% of the flushes are that large.
The previous code tried to size this number based on the size of
the TLB. Good idea, but it's error-prone, needs maintenance
(which it didn't get up to now), and probably would not matter in
practice much.
Settting it to 33 means that we cover the mallopt
M_TRIM_THRESHOLD, which is the most universally common size to do
flushes.
That's the short version. Here's the long one for why I chose 33:
1. These numbers have a constant bias in the timestamps from the
tracing. Probably counts for a couple hundred cycles in each of
these tests, but it should be fairly _even_ across all of them.
The smallest delta between the tracepoints I have ever seen is
335 cycles. This is one reason the cycles/page cost goes down in
general as the flushes get larger. The true cost is nearer to
100 cycles.
2. A full flush is more expensive than a single invlpg, but not
by much (single percentages).
3. A dtlb miss is 17.1ns (~45 cycles) and a itlb miss is 13.0ns
(~34 cycles). At those rates, refilling the 512-entry dTLB takes
22,000 cycles.
4. 22,000 cycles is approximately the equivalent of doing 85
invlpg operations. But, the odds are that the TLB can
actually be filled up faster than that because TLB misses that
are close in time also tend to leverage the same caches.
6. ~98% of flushes are <=33 pages. There are a lot of flushes of
33 pages, probably because libc's M_TRIM_THRESHOLD is set to
128k (32 pages)
7. I've found no consistent data to support changing the IvyBridge
vs. SandyBridge tunable by a factor of 16
I used the performance counters on this hardware (IvyBridge i5-3320M)
to figure out the tlb miss costs:
ocperf.py stat -e dtlb_load_misses.walk_duration,dtlb_load_misses.walk_completed,dtlb_store_misses.walk_duration,dtlb_store_misses.walk_completed,itlb_misses.walk_duration,itlb_misses.walk_completed,itlb.itlb_flush
7,720,030,970 dtlb_load_misses_walk_duration [57.13%]
169,856,353 dtlb_load_misses_walk_completed [57.15%]
708,832,859 dtlb_store_misses_walk_duration [57.17%]
19,346,823 dtlb_store_misses_walk_completed [57.17%]
2,779,687,402 itlb_misses_walk_duration [57.15%]
82,241,148 itlb_misses_walk_completed [57.13%]
770,717 itlb_itlb_flush [57.11%]
Show that a dtlb miss is 17.1ns (~45 cycles) and a itlb miss is 13.0ns
(~34 cycles). At those rates, refilling the 512-entry dTLB takes
22,000 cycles. On a SandyBridge system with more cores and larger
caches, those are dtlb=13.4ns and itlb=9.5ns.
cat perf.stat.txt | perl -pe 's/,//g'
| awk '/itlb_misses_walk_duration/ { icyc+=$1 }
/itlb_misses_walk_completed/ { imiss+=$1 }
/dtlb_.*_walk_duration/ { dcyc+=$1 }
/dtlb_.*.*completed/ { dmiss+=$1 }
END {print "itlb cyc/miss: ", icyc/imiss, " dtlb cyc/miss: ", dcyc/dmiss, " ----- ", icyc,imiss, dcyc,dmiss }
On Westmere CPUs, the counters to use are: itlb_flush,itlb_misses.walk_cycles,itlb_misses.any,dtlb_misses.walk_cycles,dtlb_misses.any
The assumptions that this code went in under:
https://lkml.org/lkml/2012/6/12/119 say that a flush and a refill are
about 100ns. Being generous, that is over by a factor of 6 on the
refill side, although it is fairly close on the cost of an invlpg.
An increase of a single invlpg operation seems to lengthen the flush
range operation by about 200 cycles. Here is one example of the data
collected for flushing 10 and 11 pages (full data are below):
10: 0.43% 93.90% avg cycles: 3570 cycles/page: 357 samples: 4714
11: 0.20% 94.10% avg cycles: 3767 cycles/page: 342 samples: 2145
How to generate this table:
echo 10000 > /sys/kernel/debug/tracing/buffer_size_kb
echo x86-tsc > /sys/kernel/debug/tracing/trace_clock
echo 'reason != 0' > /sys/kernel/debug/tracing/events/tlb/tlb_flush/filter
echo 1 > /sys/kernel/debug/tracing/events/tlb/tlb_flush/enable
Pipe the trace output in to this script:
http://sr71.net/~dave/intel/201402-tlb/trace-time-diff-process.pl.txt
Note that these data were gathered with the invlpg threshold set to
150 pages. Only data points with >=50 of samples were printed:
Flush % of %<=
in flush this
pages es size
------------------------------------------------------------------------------
-1: 2.20% 2.20% avg cycles: 2283 cycles/page: xxxx samples: 23960
1: 56.92% 59.12% avg cycles: 1276 cycles/page: 1276 samples: 620895
2: 13.78% 72.90% avg cycles: 1505 cycles/page: 752 samples: 150335
3: 8.26% 81.16% avg cycles: 1880 cycles/page: 626 samples: 90131
4: 7.41% 88.58% avg cycles: 2447 cycles/page: 611 samples: 80877
5: 1.73% 90.31% avg cycles: 2358 cycles/page: 471 samples: 18885
6: 1.32% 91.63% avg cycles: 2563 cycles/page: 427 samples: 14397
7: 1.14% 92.77% avg cycles: 2862 cycles/page: 408 samples: 12441
8: 0.62% 93.39% avg cycles: 3542 cycles/page: 442 samples: 6721
9: 0.08% 93.47% avg cycles: 3289 cycles/page: 365 samples: 917
10: 0.43% 93.90% avg cycles: 3570 cycles/page: 357 samples: 4714
11: 0.20% 94.10% avg cycles: 3767 cycles/page: 342 samples: 2145
12: 0.08% 94.18% avg cycles: 3996 cycles/page: 333 samples: 864
13: 0.03% 94.20% avg cycles: 4077 cycles/page: 313 samples: 289
14: 0.02% 94.23% avg cycles: 4836 cycles/page: 345 samples: 236
15: 0.04% 94.26% avg cycles: 5699 cycles/page: 379 samples: 390
16: 0.06% 94.32% avg cycles: 5041 cycles/page: 315 samples: 643
17: 0.57% 94.89% avg cycles: 5473 cycles/page: 321 samples: 6229
18: 0.02% 94.91% avg cycles: 5396 cycles/page: 299 samples: 224
19: 0.03% 94.95% avg cycles: 5296 cycles/page: 278 samples: 367
20: 0.02% 94.96% avg cycles: 6749 cycles/page: 337 samples: 185
21: 0.18% 95.14% avg cycles: 6225 cycles/page: 296 samples: 1964
22: 0.01% 95.15% avg cycles: 6393 cycles/page: 290 samples: 83
23: 0.01% 95.16% avg cycles: 6861 cycles/page: 298 samples: 61
24: 0.12% 95.28% avg cycles: 6912 cycles/page: 288 samples: 1307
25: 0.05% 95.32% avg cycles: 7190 cycles/page: 287 samples: 533
26: 0.01% 95.33% avg cycles: 7793 cycles/page: 299 samples: 94
27: 0.01% 95.34% avg cycles: 7833 cycles/page: 290 samples: 66
28: 0.01% 95.35% avg cycles: 8253 cycles/page: 294 samples: 73
29: 0.08% 95.42% avg cycles: 8024 cycles/page: 276 samples: 846
30: 0.03% 95.45% avg cycles: 9670 cycles/page: 322 samples: 296
31: 0.01% 95.46% avg cycles: 8949 cycles/page: 288 samples: 79
32: 0.01% 95.46% avg cycles: 9350 cycles/page: 292 samples: 60
33: 3.11% 98.57% avg cycles: 8534 cycles/page: 258 samples: 33936
34: 0.02% 98.60% avg cycles: 10977 cycles/page: 322 samples: 268
35: 0.02% 98.62% avg cycles: 11400 cycles/page: 325 samples: 177
36: 0.01% 98.63% avg cycles: 11504 cycles/page: 319 samples: 161
37: 0.02% 98.65% avg cycles: 11596 cycles/page: 313 samples: 182
38: 0.02% 98.66% avg cycles: 11850 cycles/page: 311 samples: 195
39: 0.01% 98.68% avg cycles: 12158 cycles/page: 311 samples: 128
40: 0.01% 98.68% avg cycles: 11626 cycles/page: 290 samples: 78
41: 0.04% 98.73% avg cycles: 11435 cycles/page: 278 samples: 477
42: 0.01% 98.73% avg cycles: 12571 cycles/page: 299 samples: 74
43: 0.01% 98.74% avg cycles: 12562 cycles/page: 292 samples: 78
44: 0.01% 98.75% avg cycles: 12991 cycles/page: 295 samples: 108
45: 0.01% 98.76% avg cycles: 13169 cycles/page: 292 samples: 78
46: 0.02% 98.78% avg cycles: 12891 cycles/page: 280 samples: 261
47: 0.01% 98.79% avg cycles: 13099 cycles/page: 278 samples: 67
48: 0.01% 98.80% avg cycles: 13851 cycles/page: 288 samples: 77
49: 0.01% 98.80% avg cycles: 13749 cycles/page: 280 samples: 66
50: 0.01% 98.81% avg cycles: 13949 cycles/page: 278 samples: 73
52: 0.00% 98.82% avg cycles: 14243 cycles/page: 273 samples: 52
54: 0.01% 98.83% avg cycles: 15312 cycles/page: 283 samples: 87
55: 0.01% 98.84% avg cycles: 15197 cycles/page: 276 samples: 109
56: 0.02% 98.86% avg cycles: 15234 cycles/page: 272 samples: 208
57: 0.00% 98.86% avg cycles: 14888 cycles/page: 261 samples: 53
58: 0.01% 98.87% avg cycles: 15037 cycles/page: 259 samples: 59
59: 0.01% 98.87% avg cycles: 15752 cycles/page: 266 samples: 63
62: 0.00% 98.89% avg cycles: 16222 cycles/page: 261 samples: 54
64: 0.02% 98.91% avg cycles: 17179 cycles/page: 268 samples: 248
65: 0.12% 99.03% avg cycles: 18762 cycles/page: 288 samples: 1324
85: 0.00% 99.10% avg cycles: 21649 cycles/page: 254 samples: 50
127: 0.01% 99.18% avg cycles: 32397 cycles/page: 255 samples: 75
128: 0.13% 99.31% avg cycles: 31711 cycles/page: 247 samples: 1466
129: 0.18% 99.49% avg cycles: 33017 cycles/page: 255 samples: 1927
181: 0.33% 99.84% avg cycles: 2489 cycles/page: 13 samples: 3547
256: 0.05% 99.91% avg cycles: 2305 cycles/page: 9 samples: 550
512: 0.03% 99.95% avg cycles: 2133 cycles/page: 4 samples: 304
1512: 0.01% 99.99% avg cycles: 3038 cycles/page: 2 samples: 65
Here are the tlb counters during a 10-second slice of a kernel compile
for a SandyBridge system. It's better than IvyBridge, but probably
due to the larger caches since this was one of the 'X' extreme parts.
10,873,007,282 dtlb_load_misses_walk_duration
250,711,333 dtlb_load_misses_walk_completed
1,212,395,865 dtlb_store_misses_walk_duration
31,615,772 dtlb_store_misses_walk_completed
5,091,010,274 itlb_misses_walk_duration
163,193,511 itlb_misses_walk_completed
1,321,980 itlb_itlb_flush
10.008045158 seconds time elapsed
# cat perf.stat.1392743721.txt | perl -pe 's/,//g' | awk '/itlb_misses_walk_duration/ { icyc+=$1 } /itlb_misses_walk_completed/ { imiss+=$1 } /dtlb_.*_walk_duration/ { dcyc+=$1 } /dtlb_.*.*completed/ { dmiss+=$1 } END {print "itlb cyc/miss: ", icyc/imiss/3.3, " dtlb cyc/miss: ", dcyc/dmiss/3.3, " ----- ", icyc,imiss, dcyc,dmiss }'
itlb ns/miss: 9.45338 dtlb ns/miss: 12.9716
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154103.10C1115E@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Most of the logic here is in the documentation file. Please take
a look at it.
I know we've come full-circle here back to a tunable, but this
new one is *WAY* simpler. I challenge anyone to describe in one
sentence how the old one worked. Here's the way the new one
works:
If we are flushing more pages than the ceiling, we use
the full flush, otherwise we use per-page flushes.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154101.12B52CAF@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
We don't have any good way to figure out what kinds of flushes
are being attempted. Right now, we can try to use the vm
counters, but those only tell us what we actually did with the
hardware (one-by-one vs full) and don't tell us what was actually
_requested_.
This allows us to select out "interesting" TLB flushes that we
might want to optimize (like the ranged ones) and ignore the ones
that we have very little control over (the ones at context
switch).
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154059.4C96CBA5@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
There are currently three paths through the remote flush code:
1. full invalidation
2. single page invalidation using invlpg
3. ranged invalidation using invlpg
This takes 2 and 3 and combines them in to a single path by
making the single-page one just be the start and end be start
plus a single page. This makes placement of our tracepoint easier.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154058.E0F90408@viggo.jf.intel.com
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
If we take the
if (end == TLB_FLUSH_ALL || vmflag & VM_HUGETLB) {
local_flush_tlb();
goto out;
}
path out of flush_tlb_mm_range(), we will have flushed the tlb,
but not incremented NR_TLB_LOCAL_FLUSH_ALL. This unifies the
way out of the function so that we always take a single path when
doing a full tlb flush.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154056.FF763B76@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
I think the flush_tlb_mm_range() code that tries to tune the
flush sizes based on the CPU needs to get ripped out for
several reasons:
1. It is obviously buggy. It uses mm->total_vm to judge the
task's footprint in the TLB. It should certainly be using
some measure of RSS, *NOT* ->total_vm since only resident
memory can populate the TLB.
2. Haswell, and several other CPUs are missing from the
intel_tlb_flushall_shift_set() function. Thus, it has been
demonstrated to bitrot quickly in practice.
3. It is plain wrong in my vm:
[ 0.037444] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] tlb_flushall_shift: 6
Which leads to it to never use invlpg.
4. The assumptions about TLB refill costs are wrong:
http://lkml.kernel.org/r/1337782555-8088-3-git-send-email-alex.shi@intel.com
(more on this in later patches)
5. I can not reproduce the original data: https://lkml.org/lkml/2012/5/17/59
I believe the sample times were too short. Running the
benchmark in a loop yields times that vary quite a bit.
Note that this leaves us with a static ceiling of 1 page. This
is a conservative, dumb setting, and will be revised in a later
patch.
This also removes the code which attempts to predict whether we
are flushing data or instructions. We expect instruction flushes
to be relatively rare and not worth tuning for explicitly.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154055.ABC88E89@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The
if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
line of code is not exactly the easiest to audit, especially when
it ends up at two different indentation levels. This eliminates
one of the the copy-n-paste versions. It also gives us a unified
exit point for each path through this function. We need this in
a minute for our tracepoint.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154054.44F1CDDC@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Since commit b5660ba76b ("x86, platforms: Remove NUMAQ") removed NUMAQ,
the mps_oem_check() apic callback has been obsolete. Remove it.
This allows generic_mps_oem_check() to be removed as well.
Signed-off-by: David Rientjes <rientjes@google.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1407302349390.17503@chino.kir.corp.google.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The trampoline_phys_{high,low} members of struct apic are always
initialized to DEFAULT_TRAMPOLINE_PHYS_HIGH and TRAMPOLINE_PHYS_LOW,
respectively. Hardwire the constants and remove the unneeded members.
Signed-off-by: David Rientjes <rientjes@google.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1407302348330.17503@chino.kir.corp.google.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Resolve shadow warnings that appear in W=2 builds. Instead of
using ret to hold the return pointer, save the length in a new
variable saved_len and compute the pointer on exit. This also
resolves a very technical error, in that ret was declared as
a const char *, when it really was a char * const.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- a memory leak on busy SIGP
- pontentially lost SIGP stop in rare situations (shutdown loops)
The first issue is not part of a released kernel. The 2nd issue is
present in all KVM versions, but did not trigger before commit
7dfc63cf97 (KVM: s390: allow only one SIGP STOP
(AND STORE STATUS) at a time) with Linux as a guest.
So no need for cc stable
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJT2fIAAAoJEBF7vIC1phx8IU4P/A83LP0jpEu3tdvIjU4g9Ev8
yan96/6d/QPOe4mlR7TkalbnkC0mT09ZFnUKnfdVIYD89HWa0gnoXKXhXYge8ZoK
uB90uTV/O/ySJuSYUkITNHGqvZPkqGc75GrdBsQ1KP9Kcs475KREkStEj4nXhSgb
IoIL/yuBkYpqroEcZbgV6r2u2UYXGdle131OKHLO4xJ96tniVWsSM32E1/sh29zj
Sr9ho0dNHdKO4ILKc03osCS3Y49gXC1nmODPLvTs1pBQtw7RSJ5dM9aqCR1XQkaD
q5PJFQsjBQEhrHOTOZUYAxdk7j2kT2qxZqwYahL+WelbKVZgwdNQMZ78Xrq1RXVp
hQC29KCYrNbzZMVim0ZQCSWVrlQtdoAnjXejeRQShVVYCjo0JckPaSNB+tpPCtaV
+e4wPyhLFDmKZIX1Pme716TYxuLL4pCwU8wd6b+DGJ7hq9rzILs8guQua1eMYiCD
6CjSkEbpDhSthXJ3qdDpIWId13ppY0sVvaFoC3Jnwrj167gRXl0tfK2nX4EpexaG
rrqWXhwLFuRWm4ZDJaE6tg/bQSsDumXSUX1zGUoFe522DppuKBJCeYzXTrqtokYd
vyl4CzHlkov1K7EmJfUu69BMyVDWPX7JMOnM2k4Mde2io6IEIqYQvLK5fcAhAE7r
ElmgACkwwrEL62qryq5l
=+xZE
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-20140730' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next
Two fixes for recently introduced regressions
- a memory leak on busy SIGP
- pontentially lost SIGP stop in rare situations (shutdown loops)
The first issue is not part of a released kernel. The 2nd issue is
present in all KVM versions, but did not trigger before commit
7dfc63cf97 (KVM: s390: allow only one SIGP STOP
(AND STORE STATUS) at a time) with Linux as a guest.
So no need for cc stable
Commit 72c5839515 (arm64: gicv3: Allow GICv3 compilation with
older binutils) changed the way we express the GICv3 system registers,
but couldn't change the occurences used by KVM as the code wasn't
merged yet.
Just fix the accessors.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
This reverts commit a28e3f4b90.
Ard and Yi Li report that this patch is broken by design, so revert it
and let them sort it out for 3.18 instead.
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
A GIC interrupt which is declared as having a GIC_MAP_TO_NMI_MSK
mapping causes the cpu parameter to gic_setup_intr() to be increased
to 32, causing memory corruption when pcpu_masks[] is written to again
later in the function.
Signed-off-by: Jeffrey Deans <jeffrey.deans@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: stable@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7375/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 190f1ca85d ("arm64: add support for kernel mode NEON in interrupt
context") introduced a typing error in fpsimd_save_partial_state ENDPROC.
This patch fixes the typing error.
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: byungchul.park <byungchul.park@lge.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Our break hooks are used to handle brk exceptions from kgdb (and potentially
kprobes if that code ever resurfaces), so don't bother calling them if
the BRK exception comes from userspace.
This prevents userspace from trapping to a kdb shell on systems where
kgdb is enabled and active.
Cc: <stable@vger.kernel.org>
Reported-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>