Граф коммитов

454916 Коммитов

Автор SHA1 Сообщение Дата
Mahesh Salgaonkar e75ad93afc powerpc/book3s: Add stack overflow check in machine check handler.
Currently machine check handler does not check for stack overflow for
nested machine check. If we hit another MCE while inside the machine check
handler repeatedly from same address then we get into risk of stack
overflow which can cause huge memory corruption. This patch limits the
nested MCE level to 4 and panic when we cross level 4.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 19:15:13 +10:00
Mahesh Salgaonkar 2749a2f26a powerpc/book3s: Fix machine check handling for unhandled errors
Current code does not check for unhandled/unrecovered errors and return from
interrupt if it is recoverable exception which in-turn triggers same machine
check exception in a loop causing hypervisor to be unresponsive.

This patch fixes this situation and forces hypervisor to panic for
unhandled/unrecovered errors.

This patch also fixes another issue where unrecoverable_exception routine
was called in real mode in case of unrecoverable exception (MSR_RI = 0).
This causes another exception vector 0x300 (data access) during system crash
leading to confusion while debugging cause of the system crash.

Also turn ME bit off while going down, so that when another MCE is hit during
panic path, system will checkstop and hypervisor will get restarted cleanly
by SP.

With the above fixes we now throw correct console messages (see below) while
crashing the system in case of unhandled/unrecoverable machine checks.

--------------
Severe Machine check interrupt [[Not recovered]
  Initiator: CPU
  Error type: UE [Instruction fetch]
    Effective address: 0000000030002864
Oops: Machine check, sig: 7 [#1]
SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: bork(O) bridge stp llc kvm [last unloaded: bork]
CPU: 36 PID: 55162 Comm: bash Tainted: G           O 3.14.0mce #1
task: c000002d72d022d0 ti: c000000007ec0000 task.ti: c000002d72de4000
NIP: 0000000030002864 LR: 00000000300151a4 CTR: 000000003001518c
REGS: c000000007ec3d80 TRAP: 0200   Tainted: G           O  (3.14.0mce)
MSR: 9000000000041002 <SF,HV,ME,RI>  CR: 28222848  XER: 20000000
CFAR: 0000000030002838 DAR: d0000000004d0000 DSISR: 00000000 SOFTE: 1
GPR00: 000000003001512c 0000000031f92cb0 0000000030078af0 0000000030002864
GPR04: d0000000004d0000 0000000000000000 0000000030002864 ffffffffffffffc9
GPR08: 0000000000000024 0000000030008af0 000000000000002c c00000000150e728
GPR12: 9000000000041002 0000000031f90000 0000000010142550 0000000040000000
GPR16: 0000000010143cdc 0000000000000000 00000000101306fc 00000000101424dc
GPR20: 00000000101424e0 000000001013c6f0 0000000000000000 0000000000000000
GPR24: 0000000010143ce0 00000000100f6440 c000002d72de7e00 c000002d72860250
GPR28: c000002d72860240 c000002d72ac0038 0000000000000008 0000000000040000
NIP [0000000030002864] 0x30002864
LR [00000000300151a4] 0x300151a4
Call Trace:
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
---[ end trace 7285f0beac1e29d3 ]---

Sending IPI to other CPUs
IPI complete
OPAL V3 detected !
--------------

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 19:15:13 +10:00
Gavin Shan 357b2f3dd9 powerpc/eeh: Dump PE location code
As Ben suggested, it's meaningful to dump PE's location code
for site engineers when hitting EEH errors. The patch introduces
function eeh_pe_loc_get() to retireve the location code from
dev-tree so that we can output it when hitting EEH errors.

If primary PE bus is root bus, the PHB's dev-node would be tried
prior to root port's dev-node. Otherwise, the upstream bridge's
dev-node of the primary PE bus will be check for the location code
directly.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 19:12:23 +10:00
Boris BREZILLON 5c89a8b657 clk: sunxi: document PRCM clock compatible strings
Document new compatible strings for clock provided by the PRCM
(Power/Reset/Clock Management) unit.

Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Signed-off-by: Emilio López <emilio@elopez.com.ar>
2014-06-11 10:25:03 +02:00
Boris BREZILLON c8a76cac19 clk: sunxi: add PRCM (Power/Reset/Clock Management) clks support
The PRCM (Power/Reset/Clock Management) unit provides several clock
devices:
- AR100 clk: used to clock the Power Management co-processor
- AHB0 clk: used to clock the AHB0 bus
- APB0 clk and gates: used to clk peripherals connected to the APB0 bus

Add support for these clks in a separate driver so that they can be probed
as platform devices instead of registered during early init.
This is needed to be able to probe PRCM MFD subdevices.

Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Emilio López <emilio@elopez.com.ar>
2014-06-11 10:25:02 +02:00
Maxime Ripard efb3184c08 clk: sun6i: Protect SDRAM gating bit
Prevent the SDRAM controller from being gated by force-enabling it in the
machine code.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Emilio López <emilio@elopez.com.ar>
2014-06-11 10:25:02 +02:00
Maxime Ripard 2df73f40dc clk: sun6i: Protect CPU clock
Right now, AHB is an indirect child clock of the CPU clock. If that
happens to change, since the CPU clock has no other consumers declared
in Linux, it would be shut down, which is not really a good idea.

Prevent this by forcing it enabled.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Emilio López <emilio@elopez.com.ar>
2014-06-11 10:25:01 +02:00
Maxime Ripard 134a6690a3 clk: sunxi: Rework clock protection code
Since we start to have a lot of clocks to protect, some of them in a
few SoCs only, it becomes difficult to handle the clock protection
without having to add per machine exceptions.

Add per-SoC data to tell which clock to leave enabled.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Emilio López <emilio@elopez.com.ar>
2014-06-11 10:25:01 +02:00
Maxime Ripard 59cb10e32a clk: sunxi: Move the GMAC clock to a file of its own
Since we have a folder of our own, we can actually make use of it by
splitting the huge clock file into several sub drivers.

The gmac clock is pretty easy to deal with, since it's pretty much
isolated and doesn't have any dependency on the other clocks.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Emilio López <emilio@elopez.com.ar>
2014-06-11 09:58:44 +02:00
Maxime Ripard ff01df28e5 clk: sunxi: Move the 24M oscillator to a file of its own
Since we have a folder of our own, we can actually make use of it by
splitting the huge clock file into several sub drivers.

The main oscillator is pretty easy to deal with, since it's pretty much
isolated.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Emilio López <emilio@elopez.com.ar>
2014-06-11 09:58:44 +02:00
Maxime Ripard 2c6fba1038 clk: sunxi: Remove calls to clk_put
Callers of clk_put must disable the clock first. This also means that
as long as the clock is enabled the driver should hold a reference to
that clock. Hence, the call to clk_put here are bogus and should be
removed.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Emilio López <emilio@elopez.com.ar>
2014-06-11 09:58:44 +02:00
Emilio López 6d1d14d5ce clk: sunxi: document new A31 USB clock compatible
Support for the USB gates and resets on A31 has been recently added
using a new compatible, so let's document it here.

Signed-off-by: Emilio López <emilio@elopez.com.ar>
2014-06-11 09:58:43 +02:00
Maxime Ripard e0e7943c55 clk: sunxi: Implement A31 USB clock
The A31 USB clock slightly differ from its older counterparts, mostly
because it has a different gate for each PHY, while the older one had
a single gate for all the phy.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Emilio López <emilio@elopez.com.ar>
2014-06-11 09:58:43 +02:00
Lendacky, Thomas d5c4858237 amd-xgbe: Rename MAX_DMA_CHANNELS to avoid powerpc conflict
MAX_DMA_CHANNELS is defined in asm/scatterlist.h of the powerpc
architecture.  Rename this #define in xgbe.h to avoid the
redefined warning issued during compilation.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:56:41 -07:00
Wei-Chun Chao 5882a07c72 net: fix UDP tunnel GSO of frag_list GRO packets
This patch fixes a kernel BUG_ON in skb_segment. It is hit when
testing two VMs on openvswitch with one VM acting as VXLAN gateway.

During VXLAN packet GSO, skb_segment is called with skb->data
pointing to inner TCP payload. skb_segment calls skb_network_protocol
to retrieve the inner protocol. skb_network_protocol actually expects
skb->data to point to MAC and it calls pskb_may_pull with ETH_HLEN.
This ends up pulling in ETH_HLEN data from header tail. As a result,
pskb_trim logic is skipped and BUG_ON is hit later.

Move skb_push in front of skb_network_protocol so that skb->data
lines up properly.

kernel BUG at net/core/skbuff.c:2999!
Call Trace:
[<ffffffff816ac412>] tcp_gso_segment+0x122/0x410
[<ffffffff816bc74c>] inet_gso_segment+0x13c/0x390
[<ffffffff8164b39b>] skb_mac_gso_segment+0x9b/0x170
[<ffffffff816b3658>] skb_udp_tunnel_segment+0xd8/0x390
[<ffffffff816b3c00>] udp4_ufo_fragment+0x120/0x140
[<ffffffff816bc74c>] inet_gso_segment+0x13c/0x390
[<ffffffff8109d742>] ? default_wake_function+0x12/0x20
[<ffffffff8164b39b>] skb_mac_gso_segment+0x9b/0x170
[<ffffffff8164b4d0>] __skb_gso_segment+0x60/0xc0
[<ffffffff8164b6b3>] dev_hard_start_xmit+0x183/0x550
[<ffffffff8166c91e>] sch_direct_xmit+0xfe/0x1d0
[<ffffffff8164bc94>] __dev_queue_xmit+0x214/0x4f0
[<ffffffff8164bf90>] dev_queue_xmit+0x10/0x20
[<ffffffff81687edb>] ip_finish_output+0x66b/0x890
[<ffffffff81688a58>] ip_output+0x58/0x90
[<ffffffff816c628f>] ? fib_table_lookup+0x29f/0x350
[<ffffffff816881c9>] ip_local_out_sk+0x39/0x50
[<ffffffff816cbfad>] iptunnel_xmit+0x10d/0x130
[<ffffffffa0212200>] vxlan_xmit_skb+0x1d0/0x330 [vxlan]
[<ffffffffa02a3919>] vxlan_tnl_send+0x129/0x1a0 [openvswitch]
[<ffffffffa02a2cd6>] ovs_vport_send+0x26/0xa0 [openvswitch]
[<ffffffffa029931e>] do_output+0x2e/0x50 [openvswitch]

Signed-off-by: Wei-Chun Chao <weichunc@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:48:47 -07:00
huizhang f6c20c596f net: ipv6: Fixed up ipsec packet be re-routing issue
Bug report on https://bugzilla.kernel.org/show_bug.cgi?id=75781

When a local output ipsec packet match the mangle table rule,
and be set mark value, the packet will be route again in
route_me_harder -> _session_decoder6

In this case, the nhoff in CB of skb was still the default
value 0. So the protocal match can't success and the packet can't match
correct SA rule,and then the packet be send out in plaintext.

To fixed up the issue. The CB->nhoff must be set.

Signed-off-by: Hui Zhang <huizhang@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:47:31 -07:00
Ben Hutchings 581d9baa21 farsync: Fix confusion about DMA address and buffer offset types
Use dma_addr_t for DMA address parameters and u32 for shared memory
offset parameters.

Do not assume that dma_addr_t is the same as unsigned long; it will
not be in PAE configurations.  Truncate DMA addresses to 32 bits when
printing them.  This is OK because the DMA mask for this device is
32-bit (per default).

Also rename the DMA address parameters from 'skb' to 'dma'.

Compile-tested only.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:45:29 -07:00
Dmitry Popov 5ce54af1fc ip_tunnel: fix i_key matching in ip_tunnel_find
Some tunnels (though only vti as for now) can use i_key just for internal use:
for example vti uses it for fwmark'ing incoming packets. So raw i_key value
shouldn't be treated as a distinguisher for them. ip_tunnel_key_match exists for
cases when we want to compare two ip_tunnel_parms' i_keys.

Example bug:
ip link add type vti ikey 1 local 1.0.0.1 remote 2.0.0.2
ip link add type vti ikey 2 local 1.0.0.1 remote 2.0.0.2
spawned two tunnels, although it doesn't make sense.

Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:43:37 -07:00
David S. Miller 7e1a61b6c1 Merge branch 'mlx4'
Or Gerlitz says:

====================
mlx4 SRIOV fixes

The patch from Wei Yang is a designed fix to a regression introduced by earlier commit
of him. Jack added a fix to the resource management which we got from IBM.

Let's get that into 3.16-rc1 1st and later see to what stable version/s this should go.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:32:53 -07:00
Wei Yang da1de8dfff net/mlx4_core: Keep only one driver entry release mlx4_priv
Following commit befdf89 "net/mlx4_core: Preserve pci_dev_data after
__mlx4_remove_one()", there are two mlx4 pci callbacks which will
attempt to release the mlx4_priv object -- .shutdown and .remove.

This leads to a use-after-free access to the already freed mlx4_priv
instance and trigger a "Kernel access of bad area" crash when both
.shutdown and .remove are called.

During reboot or kexec, .shutdown is called, with the VFs probed to
the host going through shutdown first and then the PF. Later, the PF
will trigger VFs' .remove since VFs still have driver attached.

Fix that by keeping only one driver entry which releases mlx4_priv.

Fixes: befdf89 ('net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()')
CC: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:32:46 -07:00
Jack Morgenstein 95646373c9 net/mlx4_core: Fix SRIOV free-pool management when enforcing resource quotas
The Hypervisor driver tracks free slots and reserved slots at the global level
and tracks allocated slots and guaranteed slots per VF.

Guaranteed slots are treated as reserved by the driver, so the total
reserved slots is the sum of all guaranteed slots over all the VFs.

As VFs allocate resources, free (global) is decremented and allocated (per VF)
is incremented for those resources. However, reserved (global) is never changed.

This means that effectively, when a VF allocates a resource from its
guaranteed pool, it is actually reducing that resource's free pool (since
the global reserved count was not also reduced).

The fix for this problem is the following: For each resource, as long as a
VF's allocated count is <= its guaranteed number, when allocating for that
VF, the reserved count (global) should be reduced by the allocation as well.

When the global reserved count reaches zero, the remaining global free count
is still accessible as the free pool for that resource.

When the VF frees resources, the reverse happens: the global reserved count
for a resource is incremented only once the VFs allocated number falls below
its guaranteed number.

This fix was developed by Rick Kready <kready@us.ibm.com>

Reported-by: Rick Kready <kready@us.ibm.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:32:46 -07:00
Dmitry Popov 7c8e6b9c28 ip_vti: Fix 'ip tunnel add' with 'key' parameters
ip tunnel add remote 10.2.2.1 local 10.2.2.2 mode vti ikey 1 okey 2
translates to p->iflags = VTI_ISVTI|GRE_KEY and p->i_key = 1, but GRE_KEY !=
TUNNEL_KEY, so ip_tunnel_ioctl would set i_key to 0 (same story with o_key)
making us unable to create vti tunnels with [io]key via ip tunnel.

We cannot simply translate GRE_KEY to TUNNEL_KEY (as GRE module does) because
vti_tunnels with same local/remote addresses but different ikeys will be treated
as different then. So, imo the best option here is to move p->i_flags & *_KEY
check for vti tunnels from ip_tunnel.c to ip_vti.c and to think about [io]_mark
field for ip_tunnel_parm in the future.

Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:30:52 -07:00
Rickard Strandqvist f05d435bde net: wimax: i2400m: control.c: Cleaning up conjunction always evaluates to false
Logical conjunction always evaluates to false:  minor < 2 && minor > 1
I guess what you wanted is rather: minor > 2 || minor < 1

This was partly found using a static code analysis program called cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:13:16 -07:00
Rickard Strandqvist 655aa39306 net: ethernet: toshiba: ps3_gelic_net.c: Cleaning up a check on a memory allocation
A check on a memory allocation is checked incorrectly.

This was partly found using a static code analysis program called cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:13:16 -07:00
françois romieu a25aafaa5b amd-xgbe: fix unused variable compilation warning in phylib driver
Fix following compilation warning:
[...]
  CC      drivers/net/phy/amd-xgbe-phy.o
drivers/net/phy/amd-xgbe-phy.c:1353:30: warning:
‘amd_xgbe_phy_ids’ defined but not used [-Wunused-variable]
 static struct mdio_device_id amd_xgbe_phy_ids[] = {
                              ^
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:13:16 -07:00
Alexei Starovoitov df6d0f983a net: filter: fix nlattr and nlattr_nest BPF tests
- 'struct nlattr' must be 2 byte aligned
- provide big-endian input data for nlattr/nlattr_nest tests

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:13:16 -07:00
Alexei Starovoitov e430f34ee5 net: filter: cleanup A/X name usage
The macro 'A' used in internal BPF interpreter:
 #define A regs[insn->a_reg]
was easily confused with the name of classic BPF register 'A', since
'A' would mean two different things depending on context.

This patch is trying to clean up the naming and clarify its usage in the
following way:

- A and X are names of two classic BPF registers

- BPF_REG_A denotes internal BPF register R0 used to map classic register A
  in internal BPF programs generated from classic

- BPF_REG_X denotes internal BPF register R7 used to map classic register X
  in internal BPF programs generated from classic

- internal BPF instruction format:
struct sock_filter_int {
        __u8    code;           /* opcode */
        __u8    dst_reg:4;      /* dest register */
        __u8    src_reg:4;      /* source register */
        __s16   off;            /* signed offset */
        __s32   imm;            /* signed immediate constant */
};

- BPF_X/BPF_K is 1 bit used to encode source operand of instruction
In classic:
  BPF_X - means use register X as source operand
  BPF_K - means use 32-bit immediate as source operand
In internal:
  BPF_X - means use 'src_reg' register as source operand
  BPF_K - means use 32-bit immediate as source operand

Suggested-by: Chema Gonzalez <chema@google.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Chema Gonzalez <chema@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:13:16 -07:00
Manuel Schölling 84a7c0b1db dns_resolver: assure that dns_query() result is null-terminated
dns_query() credulously assumes that keys are null-terminated and
returns a copy of a memory block that is off by one.

Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:12:04 -07:00
Michael Neuling d4e58e5928 powerpc/powernv: Enable POWER8 doorbell IPIs
This patch enables POWER8 doorbell IPIs on powernv.

Since doorbells can only IPI within a core, we test to see when we can use
doorbells and if not we fall back to XICS.  This also enables hypervisor
doorbells to wakeup us up from nap/sleep via the LPCR PECEDH bit.

Based on tests by Anton, the best case IPI latency between two threads dropped
from 894ns to 512ns.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:05:12 +10:00
Michael Neuling 9b6a68d943 powerpc/cpuidle: Only clear LPCR decrementer wakeup bit on fast sleep entry
Currently when entering fastsleep we clear all LPCR PECE bits.

This patch changes it to only clear the decrementer bit (ie. PECE1), which is
the only bit we really need to clear here.  This is needed if we want to set
other wakeup causes like the PECEDH bit so we can use hypervisor doorbells on
powernv.  Also we no longer clear the MER bit as it should never be set in the
host anyway.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:05:12 +10:00
Gavin Shan 5c7a35e3e2 powerpc/powernv: Fix killed EEH event
On PowerNV platform, EEH errors are reported by IO accessors or poller
driven by interrupt. After the PE is isolated, we won't produce EEH
event for the PE. The current implementation has possibility of EEH
event lost in this way:

The interrupt handler queues one "special" event, which drives the poller.
EEH thread doesn't pick the special event yet. IO accessors kicks in, the
frozen PE is marked as "isolated" and EEH event is queued to the list.
EEH thread runs because of special event and purge all existing EEH events.
However, we never produce an other EEH event for the frozen PE. Eventually,
the PE is marked as "isolated" and we don't have EEH event to recover it.

The patch fixes the issue to keep EEH events for PEs that have been
marked as "isolated" with the help of additional "force" help to
eeh_remove_event().

Reported-by: Rolf Brudeseth <rolfb@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:04:33 +10:00
Paul Bolle 6e0fdf9af2 powerpc: fix typo 'CONFIG_PMAC'
Commit b0d278b7d3 ("powerpc/perf_event: Reduce latency of calling
perf_event_do_pending") added a check for CONFIG_PMAC were a check for
CONFIG_PPC_PMAC was clearly intended.

Fixes: b0d278b7d3 ("powerpc/perf_event: Reduce latency of calling perf_event_do_pending")
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:04:29 +10:00
Paul Bolle b69a1da94f powerpc: fix typo 'CONFIG_PPC_CPU'
Commit cd64d1697c ("powerpc: mtmsrd not defined") added a check for
CONFIG_PPC_CPU were a check for CONFIG_PPC_FPU was clearly intended.

Fixes: cd64d1697c ("powerpc: mtmsrd not defined")
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:04:25 +10:00
Gavin Shan 71b540adff powerpc/powernv: Don't escalate non-existing frozen PE
Commit cb5b242c ("powerpc/eeh: Escalate error on non-existing PE")
escalates the frozen state on non-existing PE to fenced PHB. It
was to improve kdump reliability. After that, commit 361f2a2a
("powrpc/powernv: Reset PHB in kdump kernel") was introduced to
issue complete reset on all PHBs to increase the reliability of
kdump kernel.

Commit cb5b242c becomes unuseful and it would be reverted.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:04:20 +10:00
Gavin Shan 1ad7a72c5e powerpc/eeh: Report frozen parent PE prior to child PE
When we have the corner case of frozen parent and child PE at the
same time, we have to handle the frozen parent PE prior to the
child. Without clearning the frozen state on parent PE, the child
PE can't be recovered successfully.

The patch searches the EEH PE hierarchy tree and returns the toppest
frozen PE to be handled. It ensures the frozen parent PE will be
handled prior to child PE.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:04:16 +10:00
Gavin Shan 2c66599206 powerpc/eeh: Clear frozen state for child PE
Since commit cb523e09 ("powerpc/eeh: Avoid I/O access during PE
reset"), the PE is kept as frozen state on hardware level until
the PE reset is done completely. After that, we explicitly clear
the frozen state of the affected PE. However, there might have
frozen child PEs of the affected PE and we also need clear their
frozen state as well. Otherwise, the recovery is going to fail.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:04:12 +10:00
Anton Blanchard 4817fc323d powerpc/powernv: Reduce panic timeout from 180s to 10s
We've already dropped the default pseries timeout to 10s, do
the same for powernv.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:04:07 +10:00
Kees Cook 50b66dbf87 powerpc/xmon: avoid format string leaking to printk
This makes sure format strings cannot leak into printk (the string has
already been correctly processed for format arguments).

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:04:03 +10:00
Michael Ellerman 3752e453f6 selftests/powerpc: Add tests of PMU EBBs
The Power8 Performance Monitor Unit (PMU) has a new feature called Event
Based Branches (EBB). This commit adds tests of the kernel API for using
EBBs.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:58 +10:00
Michael Ellerman 33b4819f3b selftests/powerpc: Add support for skipping tests
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:54 +10:00
Michael Ellerman de506f73dd selftests/powerpc: Put the test in a separate process group
Allows us to kill the test and any children it has spawned.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:49 +10:00
Michael Ellerman 0a6121cf33 selftests/powerpc: Fix instruction loop for ABIv2 (LE)
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:45 +10:00
Michael Ellerman 3df48c981d powerpc/perf: Ensure all EBB register state is cleared on fork()
In commit 330a1eb "Core EBB support for 64-bit book3s" I messed up
clear_task_ebb(). It clears some but not all of the task's Event Based
Branch (EBB) registers when we duplicate a task struct.

That allows a child task to observe the EBBHR & EBBRR of its parent,
which it should not be able to do.

Fix it by clearing EBBHR & EBBRR.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: stable@vger.kernel.org [v3.11+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:41 +10:00
Joel Stanley caf69ba627 powerpc/powernv: Fix reading of OPAL msglog
memory_return_from_buffer returns a signed value, so ret should be
ssize_t.

Fixes the following issue reported by David Binderman:

  [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:65]: (style)
  Checking if unsigned variable 'ret' is less than zero.
  [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:82]: (style)
  Checking if unsigned variable 'ret' is less than zero.

  Local variable "ret" is of type size_t. This is always unsigned,
  so it is pointless to check if it is less than zero.

  https://bugzilla.kernel.org/show_bug.cgi?id=77551

Fixing this exposes a real bug for the case where the entire count
bytes is successfully read from the POS_WRAP case. The second
memory_read_from_buffer will return EINVAL, causing the entire read to
return EINVAL to userspace, despite the data being copied correctly. The
fix is to test for the case where the data has been read and return
early.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:36 +10:00
Dan Carpenter be8f9642a0 powerpc/spufs: Remove duplicate SPUFS_CNTL_MAP_SIZE define
The SPUFS_CNTL_MAP_SIZE define is cut and pasted twice so we can delete
the second instance.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:32 +10:00
Dan Carpenter d9d8212319 powerpc/cpm: Remove duplicate FCC_GFMR_TTX define
The FCC_GFMR_TTX define is cut and pasted twice so we can remove the
second instance.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:28 +10:00
Guo Chao ddf0322a3f powerpc/powernv: Fix endianness problems in EEH
EEH information fetched from OPAL need fix before using in LE environment.
To be included in sparse's endian check, declare them as __beXX and
access them by accessors.

Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:23 +10:00
Anton Blanchard 8b9f9269bc crypto/nx: disable NX on little endian builds
The NX driver has endian issues so disable it for now.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:19 +10:00
Anton Blanchard 1bd098903f powernv: Fix permissions on sysparam sysfs entries
Everyone can write to these files, which is not what we want.

Cc: stable@vger.kernel.org # 3.15
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:15 +10:00
Shreyas B. Prabhu ad41733062 powerpc/powernv : Disable subcore for UP configs
Build throws following errors when CONFIG_SMP=n
arch/powerpc/platforms/powernv/subcore.c: In function ‘cpu_update_split_mode’:
arch/powerpc/platforms/powernv/subcore.c:274:15: error: ‘setup_max_cpus’ undeclared (first use in this function)
arch/powerpc/platforms/powernv/subcore.c:285:5: error: lvalue required as left operand of assignment

'setup_max_cpus' variable is relevant only on SMP, so there is no point
working around it for UP. Furthermore, subcore itself is relevant only
on SMP and hence the better solution is to exclude subcore.o and
subcore-asm.o for UP builds.

Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 17:03:10 +10:00