Граф коммитов

95036 Коммитов

Автор SHA1 Сообщение Дата
Gavin Shan 7b401850a1 powerpc/eeh: EEH_PE_ISOLATED not reflect HW state
When doing PE reset, EEH_PE_ISOLATED is cleared unconditionally.
However, We should remove that if the PE reset has cleared the
frozen state successfully. Otherwise, the flag should be kept.
The patch fixes the issue.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:33:57 +10:00
Gavin Shan b34497d184 powerpc/powernv: Remove fields in PHB diag-data dump
For some fields (e.g. LEM, MMIO, DMA) in PHB diag-data dump, it's
meaningless to print them if they have non-zero value in the
corresponding mask registers because we always have non-zero values
in the mask registers. The patch only prints those fieds if we
have non-zero values in the primary registers (e.g. LEM, MMIO, DMA
status) so that we can save couple of lines. The patch also removes
unnecessary spare line before "brdgCtl:" and two leading spaces as
prefix in each line as Ben suggested.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:33:52 +10:00
Gavin Shan f5bc6b70d2 powerpc/powernv: Move PNV_EEH_STATE_ENABLED around
The flag PNV_EEH_STATE_ENABLED is put into pnv_phb::eeh_state,
which is protected by CONFIG_EEH. We needn't that. Instead, we
can have pnv_phb::flags and maintain all flags there, which is
the purpose of the patch. The patch also renames PNV_EEH_STATE_ENABLED
to PNV_PHB_FLAG_EEH.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:33:48 +10:00
Gavin Shan 467f79a956 powerpc/powernv: Remove PNV_EEH_STATE_REMOVED
The PHB state PNV_EEH_STATE_REMOVED maintained in pnv_phb isn't
so useful any more and it's duplicated to EEH_PE_ISOLATED. The
patch replaces PNV_EEH_STATE_REMOVED with EEH_PE_ISOLATED.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:33:44 +10:00
Gavin Shan 9e04937560 powerpc/eeh: Remove EEH_PE_PHB_DEAD
The PE state (for eeh_pe instance) EEH_PE_PHB_DEAD is duplicate to
EEH_PE_ISOLATED. Originally, those PHBs (PHB PE) with EEH_PE_PHB_DEAD
would be removed from the system. However, it's safe to replace
that with EEH_PE_ISOLATED.

The patch also clear EEH_PE_RECOVERING after fenced PHB has been handled,
either failure or success. It makes the PHB PE state consistent with:

	PHB functions normally		  NONE
	PHB has been removed		  EEH_PE_ISOLATED
	PHB fenced, recovery in progress  EEH_PE_ISOLATED | RECOVERING

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:33:39 +10:00
Alistair Popple e4565362c7 powerpc/4xx: Fix section mismatch in ppc4xx_pci.c
This patch fixes this section mismatch:

WARNING: vmlinux.o(.text+0x1efc4): Section mismatch in reference from
the function apm821xx_pciex_init_port_hw() to the function
.init.text:ppc4xx_pciex_wait_on_sdr.isra.9()

The function apm821xx_pciex_init_port_hw() references the function
__init ppc4xx_pciex_wait_on_sdr.isra.9().  This is often because
apm821xx_pciex_init_port_hw lacks a __init annotation or the
annotation of ppc4xx_pciex_wait_on_sdr.isra.9 is wrong.

apm821xx_pciex_init_port_hw is only referenced by a struct in
__initdata, so it should be safe to add __init to
apm821xx_pciex_init_port_hw.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 16:32:53 +10:00
Preeti U Murthy 582b910eda ppc/kvm: Clear the runlatch bit of a vcpu before napping
When the guest cedes the vcpu or the vcpu has no guest to
run it naps. Clear the runlatch bit of the vcpu before
napping to indicate an idle cpu.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 16:32:49 +10:00
Preeti U Murthy fd17dc7b9a ppc/kvm: Set the runlatch bit of a CPU just before starting guest
The secondary threads in the core are kept offline before launching guests
in kvm on powerpc: "371fefd6f2dc4666:KVM: PPC: Allow book3s_hv guests to use
SMT processor modes."

Hence their runlatch bits are cleared. When the secondary threads are called
in to start a guest, their runlatch bits need to be set to indicate that they
are busy. The primary thread has its runlatch bit set though, but there is no
harm in setting this bit once again. Hence set the runlatch bit for all
threads before they start guest.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 16:32:45 +10:00
Preeti U Murthy f203891117 ppc/powernv: Set the runlatch bits correctly for offline cpus
Up until now we have been setting the runlatch bits for a busy CPU and
clearing it when a CPU enters idle state. The runlatch bit has thus
been consistent with the utilization of a CPU as long as the CPU is online.

However when a CPU is hotplugged out the runlatch bit is not cleared. It
needs to be cleared to indicate an unused CPU. Hence this patch has the
runlatch bit cleared for an offline CPU just before entering an idle state
and sets it immediately after it exits the idle state.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 16:32:40 +10:00
Li Zhong 42dbfc8649 powerpc/pseries: Protect remove_memory() with device hotplug lock
While testing memory hot-remove, I found following dead lock:

Process #1141 is drmgr, trying to remove some memory, i.e. memory499.
It holds the memory_hotplug_mutex, and blocks when trying to remove file
"online" under dir memory499, in kernfs_drain(), at
        wait_event(root->deactivate_waitq,
                   atomic_read(&kn->active) == KN_DEACTIVATED_BIAS);

Process #1120 is trying to online memory499 by
   echo 1 > memory499/online

In .kernfs_fop_write, it uses kernfs_get_active() to increase
&kn->active, thus blocking process #1141. While itself is blocked later
when trying to acquire memory_hotplug_mutex, which is held by process

The backtrace of both processes are shown below:

[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c000000000263ca4>] .online_pages+0x74/0x7b0
[<c00000000055b40c>] .memory_subsys_online+0x9c/0x150
[<c00000000053cbe8>] .device_online+0xb8/0x120
[<c00000000053cd04>] .online_store+0xb4/0xc0
[<c000000000538ce4>] .dev_attr_store+0x64/0xa0
[<c00000000030f4ec>] .sysfs_kf_write+0x7c/0xb0
[<c00000000030e574>] .kernfs_fop_write+0x154/0x1e0
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c

[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c00000000030be14>] .__kernfs_remove+0x204/0x300
[<c00000000030d428>] .kernfs_remove_by_name_ns+0x68/0xf0
[<c00000000030fb38>] .sysfs_remove_file_ns+0x38/0x60
[<c000000000539354>] .device_remove_attrs+0x54/0xc0
[<c000000000539fd8>] .device_del+0x158/0x250
[<c00000000053a104>] .device_unregister+0x34/0xa0
[<c00000000055bc14>] .unregister_memory_section+0x164/0x170
[<c00000000024ee18>] .__remove_pages+0x108/0x4c0
[<c00000000004b590>] .arch_remove_memory+0x60/0xc0
[<c00000000026446c>] .remove_memory+0x8c/0xe0
[<c00000000007f9f4>] .pseries_remove_memblock+0xd4/0x160
[<c00000000007fcfc>] .pseries_memory_notifier+0x27c/0x290
[<c0000000008ae6cc>] .notifier_call_chain+0x8c/0x100
[<c0000000000d858c>] .__blocking_notifier_call_chain+0x6c/0xe0
[<c00000000071ddec>] .of_property_notify+0x7c/0xc0
[<c00000000071ed3c>] .of_update_property+0x3c/0x1b0
[<c0000000000756cc>] .ofdt_write+0x3dc/0x740
[<c0000000002f60fc>] .proc_reg_write+0xac/0x110
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c

This patch uses lock_device_hotplug() to protect remove_memory() called
in pseries_remove_memblock(), which is also stated before function
remove_memory():

 * NOTE: The caller must call lock_device_hotplug() to serialize hotplug
 * and online/offline operations before this call, as required by
 * try_offline_node().
 */
void __ref remove_memory(int nid, u64 start, u64 size)

With this lock held, the other process(#1120 above) trying to online the
memory block will retry the system call when calling
lock_device_hotplug_sysfs(), and finally find No such device error.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 16:32:14 +10:00
Anton Blanchard 0c93069210 powerpc: Fix error return in rtas_flash module init
module_init should return 0 or a negative errno.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 16:32:07 +10:00
Anton Blanchard 579a53cafd powerpc: Bump BOOT_COMMAND_LINE_SIZE to 2048
Bump the boot wrapper BOOT_COMMAND_LINE_SIZE to match the
kernel.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 16:32:02 +10:00
Anton Blanchard a5980d064f powerpc: Bump COMMAND_LINE_SIZE to 2048
I've had a report that the current limit is too small for
an automated network based installer. Bump it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 16:31:58 +10:00
Anton Blanchard a2dd5da77f powerpc: Rename duplicate COMMAND_LINE_SIZE define
We have two definitions of COMMAND_LINE_SIZE, one for the kernel
and one for the boot wrapper. I assume this is so the boot
wrapper can be self sufficient and not rely on kernel headers.

Having two defines with the same name is confusing, I just
updated the wrong one when trying to bump it.

Make the boot wrapper define unique by calling it
BOOT_COMMAND_LINE_SIZE.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 16:31:54 +10:00
Cody P Schafer bbad3e50e8 powerpc/perf/hv-24x7: Catalog version number is be64, not be32
The catalog version number was changed from a be32 (with proceeding
32bits of padding) to a be64, update the code to treat it as a be64

Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 16:31:50 +10:00
Cody P Schafer 1ee9fcc1a0 powerpc/perf/hv-24x7: Remove [static 4096], sparse chokes on it
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:27 +10:00
Cody P Schafer 78d13166b1 powerpc/perf/hv-24x7: Use (unsigned long) not (u32) values when calling plpar_hcall_norets()
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:26 +10:00
Cody P Schafer 58a685c2d8 powerpc/perf/hv-gpci: Make device attr static
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:26 +10:00
Cody P Schafer 0a8cf9e28c powerpc/perf/hv_gpci: Probe failures use pr_debug(), and padding reduced
fixup for "powerpc/perf: Add support for the hv gpci (get performance
counter info) interface".

Makes the "not enabled" message less awful (and hidden unless
debugging).

Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:25 +10:00
Cody P Schafer e98bf005d5 powerpc/perf/hv_24x7: Probe errors changed to pr_debug(), padding fixed
fixup for "powerpc/perf: Add support for the hv 24x7 interface"

Makes the "not enabled" message less awful (and hides it in most cases).

Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:25 +10:00
Aneesh Kumar K.V 29ef7a3e26 powerpc/mm: Fix tlbie to add AVAL fields for 64K pages
The if condition check was based on a draft ISA doc. Remove the same.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:24 +10:00
Anton Blanchard 2d6b63bbdd powerpc/powernv: Fix little endian issues in OPAL dump code
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:24 +10:00
Anton Blanchard 3441f04b4b powerpc/powernv: Create OPAL sglist helper functions and fix endian issues
We have two copies of code that creates an OPAL sg list. Consolidate
these into a common set of helpers and fix the endian issues.

The flash interface embedded a version number in the num_entries
field, whereas the dump interface did did not. Since versioning
wasn't added to the flash interface and it is impossible to add
this in a backwards compatible way, just remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:23 +10:00
Anton Blanchard 14ad0c58d5 powerpc/powernv: Fix little endian issues in OPAL error log code
Fix little endian issues with the OPAL error log code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:23 +10:00
Anton Blanchard 56b4c99312 powerpc/powernv: Fix little endian issues with opal_do_notifier calls
The bitmap in opal_poll_events and opal_handle_interrupt is
big endian, so we need to byteswap it on little endian builds.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:22 +10:00
Anton Blanchard e2c8b93e65 powerpc/powernv: Remove some OPAL function declaration duplication
We had some duplication of the internal OPAL functions.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:21 +10:00
Anton Blanchard 2bad742388 powerpc/powernv: Use uint64_t instead of size_t in OPAL APIs
Using size_t in our APIs is asking for trouble, especially
when some OPAL calls use size_t pointers.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:21 +10:00
Wei Yang 4966bfa1b3 powerpc/powernv: Release the refcount for pci_dev
On PowerNV platform, we are holding an unnecessary refcount on a pci_dev, which
leads to the pci_dev is not destroyed when hotplugging a pci device.

This patch release the unnecessary refcount.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:20 +10:00
Wei Yang 3f28c5af39 powerpc/powernv: Reduce multi-hit of iommu_add_device()
During the EEH hotplug event, iommu_add_device() will be invoked three times
and two of them will trigger warning or error.

The three times to invoke the iommu_add_device() are:

    pci_device_add
       ...
       set_iommu_table_base_and_group   <- 1st time, fail
    device_add
       ...
       tce_iommu_bus_notifier           <- 2nd time, succees
    pcibios_add_pci_devices
       ...
       pcibios_setup_bus_devices        <- 3rd time, re-attach

The first time fails, since the dev->kobj->sd is not initialized. The
dev->kobj->sd is initialized in device_add().
The third time's warning is triggered by the re-attach of the iommu_group.

After applying this patch, the error

    iommu_tce: 0003:05:00.0 has not been added, ret=-14

and the warning

    [  204.123609] ------------[ cut here ]------------
    [  204.123645] WARNING: at arch/powerpc/kernel/iommu.c:1125
    [  204.123680] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT bnep bluetooth 6lowpan_iphc rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc mlx4_ib ib_sa ib_mad ib_core ib_addr ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnx2x tg3 mlx4_core nfsd ptp mdio ses libcrc32c nfs_acl enclosure be2net pps_core shpchp lockd kvm uinput sunrpc binfmt_misc lpfc scsi_transport_fc ipr scsi_tgt
    [  204.124356] CPU: 18 PID: 650 Comm: eehd Not tainted 3.14.0-rc5yw+ #102
    [  204.124400] task: c0000027ed485670 ti: c0000027ed50c000 task.ti: c0000027ed50c000
    [  204.124453] NIP: c00000000003cf80 LR: c00000000006c648 CTR: c00000000006c5c0
    [  204.124506] REGS: c0000027ed50f440 TRAP: 0700   Not tainted  (3.14.0-rc5yw+)
    [  204.124558] MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI>  CR: 88008084  XER: 20000000
    [  204.124682] CFAR: c00000000006c644 SOFTE: 1
    GPR00: c00000000006c648 c0000027ed50f6c0 c000000001398380 c0000027ec260300
    GPR04: c0000027ea92c000 c00000000006ad00 c0000000016e41b0 0000000000000110
    GPR08: c0000000012cd4c0 0000000000000001 c0000027ec2602ff 0000000000000062
    GPR12: 0000000028008084 c00000000fdca200 c0000000000d1d90 c0000027ec281a80
    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
    GPR24: 000000005342697b 0000000000002906 c000001fe6ac9800 c000001fe6ac9800
    GPR28: 0000000000000000 c0000000016e3a80 c0000027ea92c090 c0000027ea92c000
    [  204.125353] NIP [c00000000003cf80] .iommu_add_device+0x30/0x1f0
    [  204.125399] LR [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
    [  204.125443] Call Trace:
    [  204.125464] [c0000027ed50f6c0] [c0000027ed50f750] 0xc0000027ed50f750 (unreliable)
    [  204.125526] [c0000027ed50f750] [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
    [  204.125588] [c0000027ed50f7d0] [c000000000069cc8] .pnv_pci_dma_dev_setup+0x78/0x340
    [  204.125650] [c0000027ed50f870] [c000000000044408] .pcibios_setup_device+0x88/0x2f0
    [  204.125712] [c0000027ed50f940] [c000000000046040] .pcibios_setup_bus_devices+0x60/0xd0
    [  204.125774] [c0000027ed50f9c0] [c000000000043acc] .pcibios_add_pci_devices+0xdc/0x1c0
    [  204.125837] [c0000027ed50fa50] [c00000000086f970] .eeh_reset_device+0x36c/0x4f0
    [  204.125939] [c0000027ed50fb20] [c00000000003a2d8] .eeh_handle_normal_event+0x448/0x480
    [  204.126068] [c0000027ed50fbc0] [c00000000003a35c] .eeh_handle_event+0x4c/0x340
    [  204.126192] [c0000027ed50fc80] [c00000000003a74c] .eeh_event_handler+0xfc/0x1b0
    [  204.126319] [c0000027ed50fd30] [c0000000000d1ea0] .kthread+0x110/0x130
    [  204.126430] [c0000027ed50fe30] [c00000000000a460] .ret_from_kernel_thread+0x5c/0x7c
    [  204.126556] Instruction dump:
    [  204.126610] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff71 7c7e1b78 60000000
    [  204.126787] 60000000 e87e0298 3143ffff 7d2a1910 <0b090000> 2fa90000 40de00c8 ebfe0218
    [  204.126966] ---[ end trace 6e7aefd80add2973 ]---

are cleared.

This patch removes iommu_add_device() in pnv_pci_ioda_dma_dev_setup(), which
revert part of the change in commit d905c5df(PPC: POWERNV: move
iommu_add_device earlier).

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:11:20 +10:00
Anton Blanchard cc146d1db0 powerpc/powernv: Fix little endian issues in OPAL flash code
With this patch I was able to update firmware on an LE kernel.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:08:50 +10:00
Benjamin Herrenschmidt 298b34d7d5 powerpc/powernv: Fix kexec races going back to OPAL
We have a subtle race when sending CPUs back to OPAL on kexec.

We mark them as "in real mode" right before we send them down. Once
we've booted the new kernel, it might try to call opal_reinit_cpus()
to change endianness, and that requires all CPUs to be spinning inside
OPAL.

However there is no synchronization here and we've observed cases
where the returning CPUs hadn't established their new state inside
OPAL before opal_reinit_cpus() is called, causing it to fail.

The proper fix is to actually wait for them to go down all the way
from the kexec'ing kernel.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:08:50 +10:00
Joel Stanley 63aecfb20a powerpc/powernv: Check sysparam size before creation
The size of the sysparam sysfs files is determined from the device tree
at boot. However the buffer is hard coded to 64 bytes. If we encounter a
parameter that is larger than 64, or miss-parse the device tree, the
buffer will overflow when reading or writing to the parameter.

Check it at discovery time, and if the parameter is too large, do not
create a sysfs entry for it.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:08:49 +10:00
Joel Stanley 16003d235b powerpc/powernv: Fix typos in sysparam code
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:08:49 +10:00
Joel Stanley 85390378f0 powerpc/powernv: Check sysfs size before copying
The sysparam code currently uses the userspace supplied number of
bytes when memcpy()ing in to a local 64-byte buffer.

Limit the maximum number of bytes by the size of the buffer.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:08:48 +10:00
Joel Stanley b8569d2304 powerpc/powernv: Use ssize_t for sysparam return values
The OPAL calls are returning int64_t values, which the sysparam code
stores in an int, and the sysfs callback returns ssize_t. Make code a
easier to read by consistently using ssize_t.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:08:48 +10:00
Joel Stanley ba9a32b176 powerpc/powernv: Fix sysparam sysfs error handling
When a sysparam query in OPAL returned a negative value (error code),
sysfs would spew out a decent chunk of memory; almost 64K more than
expected. This was traced to a sign/unsigned mix up in the OPAL sysparam
sysfs code at sys_param_show.

The return value of sys_param_show is a ssize_t, calculated using

  return ret ? ret : attr->param_size;

Alan Modra explains:

  "attr->param_size" is an unsigned int, "ret" an int, so the overall
  expression has type unsigned int.  Result is that ret is cast to
  unsigned int before being cast to ssize_t.

Instead of using the ternary operator, set ret to the param_size if an
error is not detected. The same bug exists in the sysfs write callback;
this patch fixes it in the same way.

A note on debugging this next time: on my system gcc will warn about
this if compiled with -Wsign-compare, which is not enabled by -Wall,
only -Wextra.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:08:47 +10:00
Li Zhong 4fb8d027dc powerpc: Fix Oops in rtas_stop_self()
commit 41dd03a9 may cause Oops in rtas_stop_self().

The reason is that the rtas_args was moved into stack space. For a box
with more that 4GB RAM, the stack could easily be outside 32bit range,
but RTAS is 32bit.

So the patch moves rtas_args away from stack by adding static before
it.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:08:47 +10:00
Jeff Mahoney 28ea3c7529 powerpc: Export flush_icache_range
Commit aac416fc38 (lkdtm: flush icache and report actions) calls
flush_icache_range from a module. It's exported on most architectures
that implement it, but not on powerpc. This patch exports it to fix
the module link failure.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 13:08:46 +10:00
Linus Torvalds 6d4596905b Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
 "This fixes the preemption-count imbalance crash reported by Owen
  Kibel"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Fix CMCI preemption bugs
2014-04-19 10:41:43 -07:00
Linus Torvalds 8de3f7a705 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Two kernel side fixes:

   - an Intel uncore PMU driver potential crash fix
   - a kprobes/perf-call-graph interaction fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Use rdmsrl_safe() when initializing RAPL PMU
  kprobes/x86: Fix page-fault handling logic
2014-04-19 10:40:11 -07:00
Linus Torvalds ea2388f281 Merge branch 'akpm' (incoming from Andrew)
Merge misc fixes from Andrew Morton:
 "13 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  thp: close race between split and zap huge pages
  mm: fix new kernel-doc warning in filemap.c
  mm: fix CONFIG_DEBUG_VM_RB description
  mm: use paravirt friendly ops for NUMA hinting ptes
  mips: export flush_icache_range
  mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()
  wait: explain the shadowing and type inconsistencies
  Shiraz has moved
  Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt
  powerpc/mm: fix ".__node_distance" undefined
  kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write()
  init/Kconfig: move the trusted keyring config option to general setup
  vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state()
2014-04-18 16:40:31 -07:00
Kees Cook 8229f1a044 mips: export flush_icache_range
The lkdtm module performs tests against executable memory ranges, so it
needs to flush the icache for proper behaviors.  Other architectures
already export this, so do the same for MIPS.

[akpm@linux-foundation.org: relocate export sites]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: John Crispin <blogic@openwrt.org>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18 16:40:09 -07:00
Viresh Kumar 9cc236827f Shiraz has moved
shiraz.hashim@st.com email-id doesn't exist anymore as he has left the
company.  Replace ST's id with shiraz.linux.kernel@gmail.com.

It also updates .mailmap file to fix address for 'git shortlog'.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Shiraz Hashim <shiraz.linux.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18 16:40:08 -07:00
Mike Qiu 12c743eb22 powerpc/mm: fix ".__node_distance" undefined
CHK     include/config/kernel.release
  CHK     include/generated/uapi/linux/version.h
  CHK     include/generated/utsrelease.h
  ...
  Building modules, stage 2.
WARNING: 1 bad relocations
c0000000013d6a30 R_PPC64_ADDR64    uprobes_fetch_type_table
  WRAP    arch/powerpc/boot/zImage.pseries
  WRAP    arch/powerpc/boot/zImage.epapr
  MODPOST 1849 modules
ERROR: ".__node_distance" [drivers/block/nvme.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
make: *** Waiting for unfinished jobs....

The reason is symbol "__node_distance" not been exported in powerpc.

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Alistair Popple <alistair@popple.id.au>
Cc: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18 16:40:08 -07:00
Vineet Gupta 64ee9f32c3 ARC: Delete stale barrier.h
Commit 93ea02bb84 ("arch: Clean up asm/barrier.h implementations")
wired generic barrier.h for ARC, but failed to delete the existing file.

In 3.15, due to rcupdate.h updates, this causes a build breakage on ARC:

      CC      arch/arc/kernel/asm-offsets.s
    In file included from include/linux/sched.h:45:0,
                     from arch/arc/kernel/asm-offsets.c:9:
    include/linux/rculist.h: In function __list_add_rcu:
    include/linux/rculist.h:54:2: error: implicit declaration of function smp_store_release [-Werror=implicit-function-declaration]
      rcu_assign_pointer(list_next_rcu(prev), new);
      ^

Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18 13:49:15 -07:00
Linus Torvalds 674366e90e PCI updates for v3.15:
Host bridge drivers
     - Fix OF interrupt mapping for DesignWare, R-Car, Tegra (Lucas Stach)
     - Fix DesignWare iATU programming (Mohit Kumar)
 
   Miscellaneous
     - Fix powerpc NULL dereference from list_for_each_entry() update (Mike Qiu)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJTUUMAAAoJEFmIoMA60/r8OrUQAKXhI6kXjJ36sSHiioQBFxD9
 fgJXDHS8gA4UD7ZV0BKQeA6kDM2v9A80UWLA2yLt+BWr0+DEgljsnRNWZjsrzoEn
 JZantkukqKfdo6LseIwfFJM7LTWqP/q6enTMXIp9UjLxnBK+j0K7qU0+u/VjH0Ab
 O8wouFu1B8REJDL5TvSjiEXO6uDtFPAuHQ1oAs5EeT35ZV54f8gSGyyUxLh1Fk18
 qrD+ERN5VrVI5K2ENJwIOljAoaHMiseSZPFA8nQu/XMZ5N0dZuU2gXGFMt3yuZID
 jTzq88umyEEMcbo+QqAW+rcoljq3GExBnOu2SF+0dvDWlXewsZz0WNdnKkUh5HCi
 vnnwvkSB4bpnv+rtu0QQgOFoFDkQV+EYSy6dUEijtVTOijZ2ZxnzOO8/y4pocpnB
 +hKehz0qNVl/RE9WMbCQ+TAb4K7+uECcilIUqq581BnmYFrH3FarfcDaHdWySI4O
 AIs2RtgwtlZv1MBpejM4OS7REav2IUFj8CWsvRIIrsgfvKvl/1Rxg2eQE8KPFSdc
 pi72CNKNvThzgAvpWjb0F/Wz7+0aWO6Nm0flYCBvcIB472x9dAzhVpoiKMf2BAII
 S/uebuA7hCZtoGEWOSwtEBYm+i3UR07YCL2UFv8Pj523DaOeTFgSYp7YFFJV+WqO
 AosR0V0L8mr2HMBK0KhN
 =VgYj
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "These are fixes for a powerpc NULL pointer dereference, an OF
  interrupt mapping issue on some of the new host bridges, and a
  DesignWare iATU issue.

  Host bridge drivers
   - Fix OF interrupt mapping for DesignWare, R-Car, Tegra (Lucas Stach)
   - Fix DesignWare iATU programming (Mohit Kumar)

  Miscellaneous
    - Fix powerpc NULL dereference from list_for_each_entry() update (Mike Qiu)"

* tag 'pci-v3.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: tegra: Use new OF interrupt mapping when possible
  PCI: rcar: Use new OF interrupt mapping when possible
  PCI: designware: Use new OF interrupt mapping when possible
  PCI: designware: Fix iATU programming for cfg1, io and mem viewport
  PCI: designware: Fix comment for setting number of lanes
  powerpc/PCI: Fix NULL dereference in sys_pciconfig_iobase() list traversal
2014-04-18 10:56:27 -07:00
Venkatesh Srinivas 2422365780 perf/x86/intel: Use rdmsrl_safe() when initializing RAPL PMU
CPUs which should support the RAPL counters according to
Family/Model/Stepping may still issue #GP when attempting to access
the RAPL MSRs. This may happen when Linux is running under KVM and
we are passing-through host F/M/S data, for example. Use rdmsrl_safe
to first access the RAPL_POWER_UNIT MSR; if this fails, do not
attempt to use this PMU.

Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1394739386-22260-1-git-send-email-venkateshs@google.com
Cc: zheng.z.yan@intel.com
Cc: eranian@google.com
Cc: ak@linux.intel.com
Cc: linux-kernel@vger.kernel.org
[ The patch also silently fixes another bug: rapl_pmu_init() didn't handle the memory alloc failure case previously. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-18 12:14:26 +02:00
Linus Torvalds 81cef0fe19 Merge branch 'parisc-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "There are two major changes in this patchset:

  The major fix is that the epoll_pwait() syscall for 32bit userspace
  was not using the compat wrapper on a 64bit kernel.

  Secondly we changed the value of SHMLBA from 4MB to PAGE_SIZE to
  reflect that we can actually mmap to any multiple of PAGE_SIZE.  The
  only thing which needs care is that shared mmaps need to be mapped at
  the same offset inside the 4MB cache window"

* 'parisc-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: fix epoll_pwait syscall on compat kernel
  parisc: change value of SHMLBA from 0x00400000 to PAGE_SIZE
  parisc: Replace __get_cpu_var uses for address calculation
2014-04-17 13:21:35 -07:00
Linus Torvalds 88764e0a3e Xen regression and bug fixes for 3.15-rc1.
- Fix completely broken 32-bit PV guests caused by x86 refactoring
   32-bit thread_info.
 - Only enable ticketlock slow path on Xen (not bare metal).
 - Fix two bugs with PV guests not shutting down when requested.
 - Fix a minor memory leak in xen-pciback error path.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQEcBAABAgAGBQJTT/hQAAoJEFxbo/MsZsTR6sMIAJs7mJXSqDQn3Z8O+TemRa53
 p92ZomTNYALjUMglXcxJ2Zua6IsZMWdu7jcV1GoXC70V4YLmUs8KaBgZmI5ayUQy
 bBpK+6WIAJyBkJdNH5fK3wggJ2UZjw0/twPNgd9gACwjUiYhx8iHN/hTGvu4qPBJ
 MGAIlg6wdnGwRydi72uk9Am/xpebEdQy4DRD20vjwA/qUkT4uHVv/AA4hc4AK29w
 ToK8qFSisgAlahcmq8/T4+OBFEKz78b9dQcdsGWyAk0ofWILfwD1l53xhzUin25s
 JUVevWhhLCKRZBOq4Ykc5qyqnLff4m56rm/THQ6f0oRdJn/OR+SWOImda2Qqmvs=
 =Gxpq
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull Xen fixes from David Vrabel:
 "Xen regression and bug fixes for 3.15-rc1:

   - fix completely broken 32-bit PV guests caused by x86 refactoring
     32-bit thread_info.
   - only enable ticketlock slow path on Xen (not bare metal)
   - fix two bugs with PV guests not shutting down when requested
   - fix a minor memory leak in xen-pciback error path"

* tag 'stable/for-linus-3.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/manage: Poweroff forcefully if user-space is not yet up.
  xen/xenbus: Avoid synchronous wait on XenBus stalling shutdown/restart.
  xen/spinlock: Don't enable them unconditionally.
  xen-pciback: silence an unwanted debug printk
  xen: fix memory leak in __xen_pcibk_add_pci_dev()
  x86/xen: Fix 32-bit PV guests's usage of kernel_stack
2014-04-17 10:54:07 -07:00
Masami Hiramatsu 6381c24cd6 kprobes/x86: Fix page-fault handling logic
Current kprobes in-kernel page fault handler doesn't
expect that its single-stepping can be interrupted by
an NMI handler which may cause a page fault(e.g. perf
with callback tracing).

In that case, the page-fault handled by kprobes and it
misunderstands the page-fault has been caused by the
single-stepping code and tries to recover IP address
to probed address.

But the truth is the page-fault has been caused by the
NMI handler, and do_page_fault failes to handle real
page fault because the IP address is modified and
causes Kernel BUGs like below.

 ----
 [ 2264.726905] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
 [ 2264.727190] IP: [<ffffffff813c46e0>] copy_user_generic_string+0x0/0x40

To handle this correctly, I fixed the kprobes fault
handler to ensure the faulted ip address is its own
single-step buffer instead of checking current kprobe
state.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Sandeepa Prabhu <sandeepa.prabhu@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: fche@redhat.com
Cc: systemtap@sourceware.org
Link: http://lkml.kernel.org/r/20140417081644.26341.52351.stgit@ltc230.yrl.intra.hitachi.co.jp
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-17 10:57:02 +02:00