Adding sysfs group 'format' attribute for pmu device that
contains a syntax description on how to construct raw events.
The event configuration is described in following
struct pefr_event_attr attributes:
config
config1
config2
Each sysfs attribute within the format attribute group,
describes mapping of name and bitfield definition within
one of above attributes.
eg:
"/sys/...<dev>/format/event" contains "config:0-7"
"/sys/...<dev>/format/umask" contains "config:8-15"
"/sys/...<dev>/format/usr" contains "config:16"
the attribute value syntax is:
line: config ':' bits
config: 'config' | 'config1' | 'config2"
bits: bits ',' bit_term | bit_term
bit_term: VALUE '-' VALUE | VALUE
Adding format attribute definitions for x86 cpu pmus.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-vhdk5y2hyype9j63prymty36@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I got somewhat tired of having to decode hex numbers..
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Robert Richter <robert.richter@amd.com>
Link: http://lkml.kernel.org/n/tip-0vsy1sgywc4uar3mu1szm0rg@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit f0fbf0abc0 ("x86: integrate delay functions") converted
delay_tsc() into a random delay generator for 64 bit. The reason is
that it merged the mostly identical versions of delay_32.c and
delay_64.c. Though the subtle difference of the result was:
static void delay_tsc(unsigned long loops)
{
- unsigned bclock, now;
+ unsigned long bclock, now;
Now the function uses rdtscl() which returns the lower 32bit of the
TSC. On 32bit that's not problematic as unsigned long is 32bit. On 64
bit this fails when the lower 32bit are close to wrap around when
bclock is read, because the following check
if ((now - bclock) >= loops)
break;
evaluated to true on 64bit for e.g. bclock = 0xffffffff and now = 0
because the unsigned long (now - bclock) of these values results in
0xffffffff00000001 which is definitely larger than the loops
value. That explains Tvortkos observation:
"Because I am seeing udelay(500) (_occasionally_) being short, and
that by delaying for some duration between 0us (yep) and 491us."
Make those variables explicitely u32 again, so this works for both 32
and 64 bit.
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # >= 2.6.27
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Two fixes are queued up. The first is an additional fix for the OMAP
initialization order issue and the second patch fixes a possible section
mismatch which can lead to a kernel crash in the AMD IOMMU driver when
suspend/resume is used and the compiler has not inlined the
iommu_set_device_table function.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPWgA/AAoJECvwRC2XARrj7voQAN6evWicqjRDiXQgEC3muQFU
OrA/Jz/i7+pHEYXcsTt0xHLb8juZNJAdkJjToB+oz7i/7D+TYRmJe+QRNkVmw7Ld
d3DbUSUi9B3agvGblosKV3DYM8By1vTn9Gy2GNatW1yPuo5o4FHK2ePC5sn8Z/8z
qwTZZnmvqluz7frNiw6Y3bNOqLd46z+9thUOoKmRn/fo3vKCOOVvb85yu1m/uqy6
Dmpn6ep0w53jK29ZTKWcL8PW0YrLTEfszhMcVshFT+Y7GVSGnSxwgSh1fnZm/WL6
z11L57dI0+7RS/z+cw+ko7ymIloV2v4ABRArMPIoLgbIQT0lidDNSqOQnPvWaBek
MwdLL8W64lt2h4T7bLhDNRSDggWCX+EJYlk87O4hJYt4n57c3yT55z2+BGdoFivZ
tzPshNWN4KVDMZCU6sTvzvz6eErwvro5wlVM2WxDVfXTxn6UTblP5uIQDGb2zVA9
G95kK/OlK/s+giwOSOKxtR62livKEAuJ2Croa5LsdJnLdCo6ipvIz9cuAG6eij2W
tulJQUN3FZr288iUAOPQ9xj6hWYM9RXqoYBxAHAvgYgGuirj9E6hjk/fL0y6XGQk
jcg0bCdJjh2RzsLZR0eeslybtlrvUsBWYCPYCAAmRjUYHwZ6s822aO7qh0VnrFuQ
csH09+3B0twvJ+ZRJibN
=PZDd
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull two IOMMU fixes from Joerg Roedel:
"The first is an additional fix for the OMAP initialization order issue
and the second patch fixes a possible section mismatch which can lead
to a kernel crash in the AMD IOMMU driver when suspend/resume is used
and the compiler has not inlined the iommu_set_device_table function."
* tag 'iommu-fixes-v3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
x86/amd: iommu_set_device_table() must not be __init
ARM: OMAP: fix iommu, not mailbox
One samsung build fix due to a mis-applied patch, and a small set of OMAP fixes.
This should be the last from arm-soc for 3.3, hopefully.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPWQktAAoJEIwa5zzehBx3OK4P/i3+nTB2WW8i8ll8pTHybSAj
YuQGOtsvJDCkZQ6tIgMahNH/Pr2Q2gFe6cjP/akbWWCfUvOl36ZAl9NYY8eexoco
ZGkMEXfhgtAsVepzEF1/64lwOgrh+AikzlL0q0tA2FmClcAejwFE7ht6zPcwFgJj
9lmR201/6OLc+qkmvrKRbmpO+eYZUVuQ4cOEMOxebXZN/tn5S/R2QWC3+PkUMbWz
7bngLHjO97legI0SGzfaKOx/02y/4hIDSybIPDOKDuXi1jUJlkrmnP09sQQLYdKN
uxu56mn+vUPCIGNwmE1/ufSneM8pZEAHaRkUjFOrxtsdGKVsxFz1vyJ3OAQHd8FU
WU+fKOew/Jf+jRpApydKWJ0yOdyTZIVqncvWpFs1RlmAVKPE96M8IL9oUZaV8rgJ
c9XUCvDFF0OuZQyMoacSzbVfcAy9SE/f7fNXLUpI9rEriYfY5waCXcQrYGwjRtmD
j2iY0pyaU72i7mtTbcsInPZ/bZ7gO+D1MmFl9WzduLWDJoGHJVcUyC1+YrJjoGF3
ggplns12qTtdgfiqTKV2I5qf2nHHbk8kmhII1fkYPZwdG2enKhV+r5Oth3zKPVXF
oOD8YnjACuad05YdkpzQL2DSaiGtPVAXHP4xMN9ktNPSJHwx19HNDiTKex6oxcZf
Dki+FinbRiFYUP/O91Xe
=PLTM
-----END PGP SIGNATURE-----
Merge tag 'fixes-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull last minute fixes from Olof Johansson:
"One samsung build fix due to a mis-applied patch, and a small set of
OMAP fixes. This should be the last from arm-soc for 3.3, hopefully."
* tag 'fixes-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: S3C2440: Fixed build error for s3c244x
ARM: OMAP2+: Fix module build errors with CONFIG_OMAP4_ERRATA_I688
ARM: OMAP: id: Add missing break statement in omap3xxx_check_revision
ARM: OMAP2+: Remove apply_uV constraints for fixed regulator
ARM: OMAP: irqs: Fix NR_IRQS value to handle PRCM interrupts
Fixes up a duplicate #include, adds an empty implementation of
of_find_compatible_node() and make git ignore .dtb files. And fix
up bus name on OF described PHYs. Nothing exciting here.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPWPvsAAoJEEFnBt12D9kB8QsP/2e6jv7CQ/wUhihzw126Y8yi
4HLG3ByJ382/c3LXg3juC+WfMvlpqEa0yqhikFKgIsm7cNbv7PkPS583pujdSXo0
IRykiT3Q9+XbO7L+9C88miYIQT8+IT/AjIjfuRwlP4gLPNUJhknpYhYOc09YzwOp
zhheeGVyeyTay+beQXip8gOEq2dYEl4IKsItagCBZfkDvj3Y8yDwTlP7f2j0FYZi
uODa3NFKE4uc0U2chtro0Vt87TRfbnIQ93SbvEWyQBWMEbPqT3l1Q94AeFGubCLj
RX910WvurCE4evzo2ZJzXPt9gPEyGIgtMGbjh+cENfT5/wNB2HWk1ftKnss5yEjp
WmcbwWKwXPwRPc5EpRdgabBCRgouCZghtcnJpkoXKSwbEt/nj3no8GOZXQnv+rSL
Ga0BSk1jJYC9DRpaIrLvdxXZF7vy1nug8fgU9ALi2R0gTmN+k6tHlLh60EWMSCBF
YCXv3fUd9IVJfOYsu4qMk0Pu40pW9rDc698Nxiep7C0iNhlddq8fiHTwB3DdoinC
iYmz+wEdDDp/68qY/mguXXtr5GAFSz1PUOwSJkh2zW6LnBDEv1djfhoA+5gwZX2v
I1IwfNHMn/XAEuhOFWMo9CejB0rDoBhigwucH8EilbIdBYVCCEtYaubO/Gq6HmId
h7hx9sWjrwF9HY0N9EHX
=/fxH
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull minor devicetree bug fixes and documentation updates from Grant Likely:
"Fixes up a duplicate #include, adds an empty implementation of
of_find_compatible_node() and make git ignore .dtb files. And fix up
bus name on OF described PHYs. Nothing exciting here."
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6:
doc: dt: Fix broken reference in gpio-leds documentation
of/mdio: fix fixed link bus name
of/fdt.c: asm/setup.h included twice
of: add picochip vendor prefix
dt: add empty of_find_compatible_node function
ARM: devicetree: Add .dtb files to arch/arm/boot/.gitignore
Fixed following:
arch/arm/mach-s3c2440/s3c244x.c: In function 's3c244x_restart':
arch/arm/mach-s3c2440/s3c244x.c:209: error: expected declaration or statement at end of input
make[1]: *** [arch/arm/mach-s3c24xx/s3c244x.o] Error 1
make: *** [arch/arm/mach-s3c24xx] Error 2
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull ARM updates from Russell King.
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7358/1: perf: add PMU hotplug notifier
ARM: 7357/1: perf: fix overflow handling for xscale2 PMUs
ARM: 7356/1: perf: check that we have an event in the PMU IRQ handlers
ARM: 7355/1: perf: clear overflow flag when disabling counter on ARMv7 PMU
ARM: 7354/1: perf: limit sample_period to half max_period in non-sampling mode
ARM: ecard: ensure fake vma vm_flags is setup
ARM: 7346/1: errata: fix PL310 erratum #753970 workaround selection
ARM: 7345/1: errata: update workaround for A9 erratum #743622
ARM: 7348/1: arm/spear600: fix one-shot timer
ARM: 7339/1: amba/serial.h: Include types.h for resolving dependency of type bool
There was a latent typo in the C6X KSTK_EIP and KSTK_ESP macros which
caused a problem with a new patch which used them. The broken definitions
were of the form:
#define KSTK_FOO(tsk) (task_pt_regs(task)->foo)
Note the use of task vs tsk. This actually worked before because the
only place in the kernel which used these macros passed in a local
pointer named task.
Signed-off-by: Mark Salter <msalter@redhat.com>
When a CPU is taken out of reset, either cold booted or hotplugged in,
some of its PMU registers can contain UNKNOWN values.
This patch adds a hotplug notifier to ARM core perf code so that upon
CPU restart the PMU unit is reset and becomes ready to use again.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
xscale2 PMUs indicate overflow not via the PMU control register, but by
a separate overflow FLAG register instead.
This patch fixes the xscale2 PMU code to use this register to detect
to overflow and ensures that we clear any pending overflow when
disabling a counter.
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The PMU IRQ handlers in perf assume that if a counter has overflowed
then perf must be responsible. In the paranoid world of crazy hardware,
this could be false, so check that we do have a valid event before
attempting to dereference NULL in the interrupt path.
Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When disabling a counter on an ARMv7 PMU, we should also clear the
overflow flag in case an overflow occurred whilst stopping the counter.
This prevents a spurious overflow being picked up later and leading to
either false accounting or a NULL dereference.
Cc: <stable@vger.kernel.org>
Reported-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On ARM, the PMU does not stop counting after an overflow and therefore
IRQ latency affects the new counter value read by the kernel. This is
significant for non-sampling runs where it is possible for the new value
to overtake the previous one, causing the delta to be out by up to
max_period events.
Commit a737823d ("ARM: 6835/1: perf: ensure overflows aren't missed due
to IRQ latency") attempted to fix this problem by allowing interrupt
handlers to pass an overflow flag to the event update function, causing
the overflow calculation to assume that the counter passed through zero
when going from prev to new. Unfortunately, this doesn't work when
overflow occurs on the perf_task_tick path because we have the flag
cleared and end up computing a large negative delta.
This patch removes the overflow flag from armpmu_event_update and
instead limits the sample_period to half of the max_period for
non-sampling profiling runs.
Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It turns out that test-compiling this file on x86-64 doesn't really
help, because much of it is x86-32-specific. And so I hadn't noticed
the slightly over-eager removal of the 'r' from 'addr' variable despite
thinking I had tested it.
Signed-off-by: Linus "oopsie" Torvalds <torvalds@linux-foundation.org>
Several users of "find_vma_prev()" were not in fact interested in the
previous vma if there was no primary vma to be found either. And in
those cases, we're much better off just using the regular "find_vma()",
and then "prev" can be looked up by just checking vma->vm_prev.
The find_vma_prev() semantics are fairly subtle (see Mikulas' recent
commit 83cd904d271b: "mm: fix find_vma_prev"), and the whole "return
prev by reference" means that it generates worse code too.
Thus this "let's avoid using this inconvenient and clearly too subtle
interface when we don't really have to" patch.
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'fixes' of git://github.com/hzhuang1/linux: (3 commits)
ARM: pxa: fix invalid mfp pin issue
ARM: pxa: remove duplicated registeration on pxa-gpio
ARM: pxa: add dummy clock for pxa25x and pxa27x
Includes an update to v3.3-rc6
As done for the other ep93xx machines in:
commit 9a6879bd90
ARM: ep93xx: convert to MULTI_IRQ_HANDLER
Now that there is a generic IRQ handler for multiple VIC devices use it
for vision_ep9307 to help building multi platform kernels.
Signed-off-by: Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Ryan Mallon <rmallon@gmail.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fix a bug in kprobes which can modify kernel code
permanently at run-time. In the result, kernel can
crash when it executes the modified code.
This bug can happen when we put two probes enough near
and the first probe is optimized. When the second probe
is set up, it copies a byte which is already modified
by the first probe, and executes it when the probe is hit.
Even worse, the first probe and the second probe are removed
respectively, the second probe writes back the copied
(modified) instruction.
To fix this bug, kprobes always recovers the original
code and copies the first byte from recovered instruction.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: systemtap@sourceware.org
Cc: anderson@redhat.com
Link: http://lkml.kernel.org/r/20120305133215.5982.31991.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Current probed-instruction recovery expects that only breakpoint
instruction modifies instruction. However, since kprobes jump
optimization can replace original instructions with a jump,
that expectation is not enough. And it may cause instruction
decoding failure on the function where an optimized probe
already exists.
This bug can reproduce easily as below:
1) find a target function address (any kprobe-able function is OK)
$ grep __secure_computing /proc/kallsyms
ffffffff810c19d0 T __secure_computing
2) decode the function
$ objdump -d vmlinux --start-address=0xffffffff810c19d0 --stop-address=0xffffffff810c19eb
vmlinux: file format elf64-x86-64
Disassembly of section .text:
ffffffff810c19d0 <__secure_computing>:
ffffffff810c19d0: 55 push %rbp
ffffffff810c19d1: 48 89 e5 mov %rsp,%rbp
ffffffff810c19d4: e8 67 8f 72 00 callq
ffffffff817ea940 <mcount>
ffffffff810c19d9: 65 48 8b 04 25 40 b8 mov %gs:0xb840,%rax
ffffffff810c19e0: 00 00
ffffffff810c19e2: 83 b8 88 05 00 00 01 cmpl $0x1,0x588(%rax)
ffffffff810c19e9: 74 05 je ffffffff810c19f0 <__secure_computing+0x20>
3) put a kprobe-event at an optimize-able place, where no
call/jump places within the 5 bytes.
$ su -
# cd /sys/kernel/debug/tracing
# echo p __secure_computing+0x9 > kprobe_events
4) enable it and check it is optimized.
# echo 1 > events/kprobes/p___secure_computing_9/enable
# cat ../kprobes/list
ffffffff810c19d9 k __secure_computing+0x9 [OPTIMIZED]
5) put another kprobe on an instruction after previous probe in
the same function.
# echo p __secure_computing+0x12 >> kprobe_events
bash: echo: write error: Invalid argument
# dmesg | tail -n 1
[ 1666.500016] Probing address(0xffffffff810c19e2) is not an instruction boundary.
6) however, if the kprobes optimization is disabled, it works.
# echo 0 > /proc/sys/debug/kprobes-optimization
# cat ../kprobes/list
ffffffff810c19d9 k __secure_computing+0x9
# echo p __secure_computing+0x12 >> kprobe_events
(no error)
This is because kprobes doesn't recover the instruction
which is overwritten with a relative jump by another kprobe
when finding instruction boundary.
It only recovers the breakpoint instruction.
This patch fixes kprobes to recover such instructions.
With this fix:
# echo p __secure_computing+0x9 > kprobe_events
# echo 1 > events/kprobes/p___secure_computing_9/enable
# cat ../kprobes/list
ffffffff810c1aa9 k __secure_computing+0x9 [OPTIMIZED]
# echo p __secure_computing+0x12 >> kprobe_events
# cat ../kprobes/list
ffffffff810c1aa9 k __secure_computing+0x9 [OPTIMIZED]
ffffffff810c1ab2 k __secure_computing+0x12 [DISABLED]
Changes in v4:
- Fix a bug to ensure optimized probe is really optimized
by jump.
- Remove kprobe_optready() dependency.
- Cleanup code for preparing optprobe separation.
Changes in v3:
- Fix a build error when CONFIG_OPTPROBE=n. (Thanks, Ingo!)
To fix the error, split optprobe instruction recovering
path from kprobes path.
- Cleanup comments/styles.
Changes in v2:
- Fix a bug to recover original instruction address in
RIP-relative instruction fixup.
- Moved on tip/master.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: systemtap@sourceware.org
Cc: anderson@redhat.com
Link: http://lkml.kernel.org/r/20120305133209.5982.36568.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Failure is reported on hx4700 with kernel v3.3-rc1.
__mfp_validate: GPIO20 is invalid pin
__mfp_validate: GPIO21 is invalid pin
__mfp_validate: GPIO15 is invalid pin
__mfp_validate: GPIO78 is invalid pin
__mfp_validate: GPIO79 is invalid pin
__mfp_validate: GPIO80 is invalid pin
__mfp_validate: GPIO33 is invalid pin
__mfp_validate: GPIO48 is invalid pin
__mfp_validate: GPIO49 is invalid pin
__mfp_validate: GPIO50 is invalid pin
Since pxa_last_gpio is used in mfp-pxa2xx driver. But it's only
updated in pxa-gpio driver that run after mfp-pxa2xx driver.
So update the pxa_last_gpio first in mfp-pxa2xx driver.
Reported-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Both reboot (via reboot(RB_AUTOBOOT)) and suspend freeze on hx4700.
Registration of pxa_gpio_syscore_ops is moved into pxa-gpio driver,
but it still exists in arch-pxa directory. It resulsts failure on
reboot and suspend.
Now remove the registration code in arch-pxa.
Reported-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Haojian Zhuang <haojian.zhuang@gmail.com>
gpio-pxa driver is shared among arch-pxa and arch-mmp. Clock is the
essential component on pxa3xx/pxa95x and arch-mmp. So we need to
define dummy clock in pxa25x/pxa27x instead.
This regression was introduced by the commit "ARM: pxa: add dummy
clock for sa1100-rtc", id a55b5adaf4.
Reported-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Paul Parsons <lost.distance@yahoo.com>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Merge the emailed seties of 19 patches from Andrew Morton
* akpm:
rapidio/tsi721: fix queue wrapping bug in inbound doorbell handler
memcg: fix mapcount check in move charge code for anonymous page
mm: thp: fix BUG on mm->nr_ptes
alpha: fix 32/64-bit bug in futex support
memcg: fix GPF when cgroup removal races with last exit
debugobjects: Fix selftest for static warnings
floppy/scsi: fix setting of BIO flags
memcg: fix deadlock by inverting lrucare nesting
drivers/rtc/rtc-r9701.c: fix crash in r9701_remove()
c2port: class_create() returns an ERR_PTR
pps: class_create() returns an ERR_PTR, not NULL
hung_task: fix the broken rcu_lock_break() logic
vfork: kill PF_STARTING
coredump_wait: don't call complete_vfork_done()
vfork: make it killable
vfork: introduce complete_vfork_done()
aio: wake up waiters when freeing unused kiocbs
kprobes: return proper error code from register_kprobe()
kmsg_dump: don't run on non-error paths by default
Michael Cree said:
: : I have noticed some user space problems (pulseaudio crashes in pthread
: : code, glibc/nptl test suite failures, java compiler freezes on SMP alpha
: : systems) that arise when using a 2.6.39 or later kernel on Alpha.
: : Bisecting between 2.6.38 and 2.6.39 (using glibc/nptl test suite as
: : criterion for good/bad kernel) eventually leads to:
: :
: : 8d7718aa08 is the first bad commit
: : commit 8d7718aa08
: : Author: Michel Lespinasse <walken@google.com>
: : Date: Thu Mar 10 18:50:58 2011 -0800
: :
: : futex: Sanitize futex ops argument types
: :
: : Change futex_atomic_op_inuser and futex_atomic_cmpxchg_inatomic
: : prototypes to use u32 types for the futex as this is the data type the
: : futex core code uses all over the place.
: :
: : Looking at the commit I see there is a change of the uaddr argument in
: : the Alpha architecture specific code for futexes from int to u32, but I
: : don't see why this should cause a problem.
Richard Henderson said:
: futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
: u32 oldval, u32 newval)
: ...
: : "r"(uaddr), "r"((long)oldval), "r"(newval)
:
:
: There is no 32-bit compare instruction. These are implemented by
: consistently extending the values to a 64-bit type. Since the
: load instruction sign-extends, we want to sign-extend the other
: quantity as well (despite the fact it's logically unsigned).
:
: So:
:
: - : "r"(uaddr), "r"((long)oldval), "r"(newval)
: + : "r"(uaddr), "r"((long)(int)oldval), "r"(newval)
:
: should do the trick.
Michael said:
: This fixes the glibc test suite failures and the pulseaudio related
: crashes, but it does not fix the java compiiler lockups that I was (and
: are still) observing. That is some other problem.
Reported-by: Michael Cree <mcree@orcon.net.nz>
Tested-by: Michael Cree <mcree@orcon.net.nz>
Acked-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Our TLB ops want to check the vma vm_flags to find out whether the
mapping is executable. However, we leave this uninitialized in
ecard.c. Initialize it with an appropriate value.
Reported-by: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull PCI fixes from Jesse Barnes:
"A couple of fixes for booting specific machines, and one for a minor
memory leak on pre-_CRS platforms."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
x86/PCI: do not tie MSI MS-7253 use_crs quirk to BIOS version
x86/PCI: use host bridge _CRS info on MSI MS-7253
PCI: fix memleak when ACPI _CRS is not used.
Pull MIPS fixes from Ralf Baechle:
"What's in there: a number of MIPS fixes and touchups. The most
important change in this pull request is Kautuk Consul's port of
changes to do_page_fault which fix a hang that affects some
configurations. Still not quite ready for a release, there are
problems with 64-bit platforms."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: traps.c: Fix typo
MIPS: PowerTV: Fix defconfigs for coverage builds
MIPS: Netlogic: Fix defconfigs for coverage builds
MIPS: ATH79: Avoid a kernel bug on AR913X
MIPS: PCI: use list_for_each_entry() for bus->devices traversal
MIPS: fault.c: Port OOM changes to do_page_fault
MIPS: vmlinux.lds.S: remove duplicate _sdata symbol
MIPS: Alchemy: Increase minimum timeout for 32kHz timer.
MIPS: txx9 7segled fix struct device has no member
MIPS: Alchemy: Update Au1300 inlined GPIO macros
MIPS: Remove temporary kludge from <asm/page.h>
MIPS: BMIPS: smp-bmips.c does not need to include version.h
With branch stack sampling, it is possible to filter by priv levels.
In system-wide mode, that means it is possible to capture only user
level branches. The builtin SW LBR filter needs to disassemble code
based on LBR captured addresses. For that, it needs to know the task
the addresses are associated with. Because of context switches, the
content of the branch stack buffer may contain addresses from
different tasks.
We need a callback on context switch to either flush the branch stack
or save it. This patch adds a new callback in struct pmu which is called
during context switches. The callback is called only when necessary.
That is when a system-wide context has, at least, one event which
uses PERF_SAMPLE_BRANCH_STACK. The callback is never called for
per-thread context.
In this version, the Intel x86 code simply flushes (resets) the LBR
on context switches (fills it with zeroes). Those zeroed branches are
then filtered out by the SW filter.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-11-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds an internal sofware filter to complement
the (optional) LBR hardware filter.
The software filter is necessary:
- as a substitute when there is no HW LBR filter (e.g., Atom, Core)
- to complement HW LBR filter in case of errata (e.g., Nehalem/Westmere)
- to provide finer grain filtering (e.g., all processors)
Sometimes the LBR HW filter cannot distinguish between two types
of branches. For instance, to capture syscall as CALLS, it is necessary
to enable the LBR_FAR filter which will also capture JMP instructions.
Thus, a second pass is necessary to filter those out, this is what the
SW filter can do.
The SW filter is built on top of the internal x86 disassembler. It
is a best effort filter especially for user level code. It is subject
to the availability of the text page of the program.
The SW filter is enabled on all Intel processors. It is bypassed
when the user is capturing all branches at all priv levels.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-9-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch implements PERF_SAMPLE_BRANCH support for Intel
x86processors. It connects PERF_SAMPLE_BRANCH to the actual LBR.
The patch adds the hooks in the PMU irq handler to save the LBR
on counter overflow for both regular and PEBS modes.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-8-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The patch adds a restriction for Intel Atom LBR support. Only
steppings 10 (PineView) and more recent are supported. Older models
do not have a functional LBR. Their LBR does not freeze on PMU
interrupt which makes LBR unusable in the context of perf_events.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-7-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the mappings from the generic PERF_SAMPLE_BRANCH_*
filters to the actual Intel x86LBR filters, whenever they exist.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-6-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If precise sampling is enabled on Intel x86 then perf_event uses PEBS.
To correct for the off-by-one error of PEBS, perf_event uses LBR when
precise_sample > 1.
On Intel x86 PERF_SAMPLE_BRANCH_STACK is implemented using LBR,
therefore both features must be coordinated as they may not
configure LBR the same way.
For PEBS, LBR needs to capture all branches at the priv level of
the associated event.
This patch checks that the branch type and priv level of BRANCH_STACK
is compatible with that of the PEBS LBR requirement, thereby allowing:
$ perf record -b any,u -e instructions:upp ....
But:
$ perf record -b any_call,u -e instructions:upp
Is not possible.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The Intel LBR on some recent processor is capable
of filtering branches by type. The filter is configurable
via the LBR_SELECT MSR register.
There are limitation on how this register can be used.
On Nehalem/Westmere, the LBR_SELECT is shared by the two HT threads
when HT is on. It is private to each core when HT is off.
On SandyBridge, the LBR_SELECT register is private to each thread
when HT is on. It is private to each core when HT is off.
The kernel must manage the sharing of LBR_SELECT. It allows
multiple users on the same logical CPU to use LBR_SELECT as
long as they program it with the same value. Across sibling
CPUs (HT threads), the same restriction applies on NHM/WSM.
This patch implements this sharing logic by leveraging the
mechanism put in place for managing the offcore_response
shared MSR.
We modify __intel_shared_reg_get_constraints() to cause
x86_get_event_constraint() to be called because LBR may
be associated with events that may be counter constrained.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the LBR definitions for NHM/WSM/SNB and Core.
It also adds the definitions for the architected LBR MSR:
LBR_SELECT, LBRT_TOS.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the ability to sample taken branches to the
perf_event interface.
The ability to capture taken branches is very useful for all
sorts of analysis. For instance, basic block profiling, call
counts, statistical call graph.
This new capability requires hardware assist and as such may
not be available on all HW platforms. On Intel x86 it is
implemented on top of the Last Branch Record (LBR) facility.
To enable taken branches sampling, the PERF_SAMPLE_BRANCH_STACK
bit must be set in attr->sample_type.
Sampled taken branches may be filtered by type and/or priv
levels.
The patch adds a new field, called branch_sample_type, to the
perf_event_attr structure. It contains a bitmask of filters
to apply to the sampled taken branches.
Filters may be implemented in HW. If the HW filter does not exist
or is not good enough, some arch may also implement a SW filter.
The following generic filters are currently defined:
- PERF_SAMPLE_USER
only branches whose targets are at the user level
- PERF_SAMPLE_KERNEL
only branches whose targets are at the kernel level
- PERF_SAMPLE_HV
only branches whose targets are at the hypervisor level
- PERF_SAMPLE_ANY
any type of branches (subject to priv levels filters)
- PERF_SAMPLE_ANY_CALL
any call branches (may incl. syscall on some arch)
- PERF_SAMPLE_ANY_RET
any return branches (may incl. syscall returns on some arch)
- PERF_SAMPLE_IND_CALL
indirect call branches
Obviously filter may be combined. The priv level bits are optional.
If not provided, the priv level of the associated event are used. It
is possible to collect branches at a priv level different from the
associated event. Use of kernel, hv priv levels is subject to permissions
and availability (hv).
The number of taken branch records present in each sample may vary based
on HW, the type of sampled branches, the executed code. Therefore
each sample contains the number of taken branches it contains.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>