This rearranges head_64.S so that we have all the first-level exception
prologs together starting at 0x100, followed by all the second-level
handlers that are invoked from the first-level prologs, followed by
other code. This doesn't make any functional change but will make
following changes for relocatable kernel support easier.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Kdump kernel needs to use only those memory regions that it is allowed
to use (crashkernel, rtas, tce, etc.). Each of these regions have
their own sizes and are currently added under 'linux,usable-memory'
property under each memory@xxx node of the device tree.
The ibm,dynamic-memory property of ibm,dynamic-reconfiguration-memory
node (on POWER6) now stores in it the representation for most of the
logical memory blocks with the size of each memory block being a
constant (lmb_size). If one or more or part of the above mentioned
regions lie under one of the lmb from ibm,dynamic-memory property,
there is a need to identify those regions within the given lmb.
This makes the kernel recognize a new 'linux,drconf-usable-memory'
property added by kexec-tools. Each entry in this property is of the
form of a count followed by that many (base, size) pairs for the above
mentioned regions. The number of cells in the count value is given by
the #size-cells property of the root node.
Signed-off-by: Chandru Siddalingappa <chandru@in.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The return code from invocation of the notifier for
pSeries_reconfig_chain during update of the device tree is not
checked. This causes writes to /proc/ppc64/ofdt to update memory
properties (i.e. ibm,dyamic-reconfiguration-memory) to always
return success, instead of the result of the notifier chain.
This happens specifically when we remove/add memory from the
device tree on machines using memory specified in the
ibm,dynamic-reconfiguration-memory property of the device tree.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This new copy_4K_page() function was originally tuned for the best
performance on the Cell processor, but after testing on more 64bit
powerpc chips it was found that with a small modification it either
matched the performance offered by the current mainline version or
bettered it by a small amount.
It was found that on a Cell-based QS22 blade the amount of system
time measured when compiling a 2.6.26 pseries_defconfig decreased
by 4%. Using the same test, a 4-way 970MP machine saw a decrease of
2% in system time. No noticeable change was seen on Power4, Power5
or Power6.
The 4096 byte page is copied in thirty-two 128 byte strides. An
initial setup loop executes dcbt instructions for the whole source
page and dcbz instructions for the whole destination page. To do
this, the cache line size is retrieved from ppc64_caches.
A new CPU feature bit, CPU_FTR_CP_USE_DCBTZ, (introduced in the
previous patch) is used to make the modification to this new copy
routine - on Power4, 970 and Cell the feature bit is set so the
setup loop is executed, but on all other 64bit chips the setup
loop is nop'ed out.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add a new CPU feature bit, CPU_FTR_CP_USE_DCBTZ, to be added to the
64bit powerpc chips that benefit from having dcbt and dcbz
instructions used in their memory copy routines.
This will be used in a subsequent patch that updates copy_4K_page().
The new bit is added to Cell, PPC970 and Power4 because they show
better performance with the new copy_4K_page() when dcbt and dcbz
instructions are used.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6:
UBIFS: make minimum fanout 3
UBIFS: fix division by zero
UBIFS: amend f_fsid
UBIFS: fill f_fsid
UBIFS: improve statfs reporting even more
UBIFS: introduce LEB overhead
UBIFS: add forgotten gc_idx_lebs component
UBIFS: fix assertion
UBIFS: improve statfs reporting
UBIFS: remove incorrect index space check
UBIFS: push empty flash hack down
UBIFS: do not update min_idx_lebs in stafs
UBIFS: allow for racing between GC and TNC
UBIFS: always read hashed-key nodes under TNC mutex
UBIFS: fix zero-length truncations
It was introduced by "vsprintf: add support for '%pS' and '%pF' pointer
formats" in commit 0fe1ef24f7. However,
the current way its coded doesn't work on parisc64. For two reasons: 1)
parisc isn't in the #ifdef and 2) parisc has a different format for
function descriptors
Make dereference_function_descriptor() more accommodating by allowing
architecture overrides. I put the three overrides (for parisc64, ppc64
and ia64) in arch/kernel/module.c because that's where the kernel
internal linker which knows how to deal with function descriptors sits.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jie Yang at Atheros is getting more directly involved with upstream work
on the atl* drivers. This patch changes the ATL1 entry to ATLX (atl2
support posted to netdev today) and adds him as a maintainer.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the 2.6.27 circle ->fasync lost the BKL, and the last remaining
->open variant that takes the BKL is also gone. ->get_sb and ->kill_sb
didn't have BKL forever, so updated the entries while we're at that.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When disconnected ccw devices are removed, the device has to be set
offline, otherwise there will be side effects including a reference
count imbalance. This patch modifies ccw_device_offline to work for
devices in disconnecte/not operational state. ccw_device_offline is
called by cio for devices which are online during device removal.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
ssch() has two classes of return codes:
- condition codes (0-3) which need to be translated to Linux
error codes
- Linux error codes (-EIO on exceptions) which should be passed
to the caller (instead of erronously being handled like
condition code 3)
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix cleanup on error in chp_new() and init_channel_subsystem()
(must not call kfree() on structures that had been registered).
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When running a 31-bit ptrace, on either an s390 or s390x kernel,
reads and writes into a padding area in struct user_regs_struct32
will result in a kernel panic.
This is also known as CVE-2008-1514.
Test case available here:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/user-area-padding.c?cvsroot=systemtap
Steps to reproduce:
1) wget the above
2) gcc -o user-area-padding-31bit user-area-padding.c -Wall -ggdb2 -D_GNU_SOURCE -m31
3) ./user-area-padding-31bit
<panic>
Test status
-----------
Without patch, both s390 and s390x kernels panic. With patch, the test case,
as well as the gdb testsuite, pass without incident, padding area reads
returning zero, writes ignored.
Nb: original version returned -EINVAL on write attempts, which broke the
gdb test and made the test case slightly unhappy, Jan Kratochvil suggested
the change to return 0 on write attempts.
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Tested-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
avr32: pm_standby low-power ram bug fix
avr32: Fix lockup after Java stack underflow in user mode
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: Fix rare boot build breakage
powerpc/spufs: Fix possible scheduling of a context to multiple SPEs
powerpc/spufs: Fix race for a free SPU
powerpc/spufs: Fix multiple get_spu_context()
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: arch_reinit_sched_domains() must destroy domains to force rebuild
sched, cpuset: rework sched domains and CPU hotplug handling (v4)
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
ahci: RAID mode SATA patch for Intel Ibex Peak DeviceIDs
pata_sil680: remove duplicate pcim_enable_device
libata-sff: kill spurious WARN_ON() in ata_hsm_move()
sata_nv: disable hardreset for generic
ahci: disable PMP for marvell ahcis
sata_mv: add RocketRaid 1720 PCI ID to driver
ahci, pata_marvell: play nicely together
... one entry lacked a colon which broke one of my scripts.
Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
bridge: don't allow setting hello time to zero
netns : fix kernel panic in timewait socket destruction
pkt_sched: Fix qdisc state in net_tx_action()
netfilter: nf_conntrack_irc: make sure string is terminated before calling simple_strtoul
netfilter: nf_conntrack_gre: nf_ct_gre_keymap_flush() fixlet
netfilter: nf_conntrack_gre: more locking around keymap list
netfilter: nf_conntrack_sip: de-static helper pointers
The hw interface drivers for the usb serial devices deference the tty
structure to set up the parameters for the initial console. The tty
structure should be passed as a parameter to the set_termios() call.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Automounter maps can contain mount options valid for other NFS
implementations but not for Linux. The Linux automounter uses the
mount command's "-s" command line option ("s" for "sloppy") so that
mount requests containing such options are not rejected.
Commit f45663ce5f attempted to address a
known regression with text-based NFS mount option parsing. Unrecognized
mount options would cause mount requests to fail, even if the "-s"
option was used on the mount command line.
Unfortunately, this commit was not complete as submitted. It adds a
new mount option, "sloppy". But it is missing a hunk, so it now allows
NFS mounts with unrecognized mount options, even if the "sloppy" option
is not present. This could be a problem if a required critical mount
option such as "sync" is misspelled, for example, and is considered a
regression from 2.6.26.
This patch restores the missing hunk. Now, the default behavior of
text-based NFS mount options is as before: any unrecognized mount option
will cause the mount to fail.
Please include this in 2.6.27-rc.
Thanks to Neil Brown for reporting this.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dushan Tcholich reports that on his system ksoftirqd can consume
between %6 to %10 of cpu time, and cause ~200 context switches per
second.
He then correlated this with a report by bdupree@techfinesse.com:
http://marc.info/?l=linux-kernel&m=119613299024398&w=2
and the culprit cause seems to be starting the bridge interface.
In particular, when starting the bridge interface, his scripts
are specifying a hello timer interval of "0".
The bridge hello time can't be safely set to values less than 1
second, otherwise it is possible to end up with a runaway timer.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
How to reproduce ?
- create a network namespace
- use tcp protocol and get timewait socket
- exit the network namespace
- after a moment (when the timewait socket is destroyed), the kernel
panics.
# BUG: unable to handle kernel NULL pointer dereference at
0000000000000007
IP: [<ffffffff821e394d>] inet_twdr_do_twkill_work+0x6e/0xb8
PGD 119985067 PUD 11c5c0067 PMD 0
Oops: 0000 [1] SMP
CPU 1
Modules linked in: ipv6 button battery ac loop dm_mod tg3 libphy ext3 jbd
edd fan thermal processor thermal_sys sg sata_svw libata dock serverworks
sd_mod scsi_mod ide_disk ide_core [last unloaded: freq_table]
Pid: 0, comm: swapper Not tainted 2.6.27-rc2 #3
RIP: 0010:[<ffffffff821e394d>] [<ffffffff821e394d>]
inet_twdr_do_twkill_work+0x6e/0xb8
RSP: 0018:ffff88011ff7fed0 EFLAGS: 00010246
RAX: ffffffffffffffff RBX: ffffffff82339420 RCX: ffff88011ff7ff30
RDX: 0000000000000001 RSI: ffff88011a4d03c0 RDI: ffff88011ac2fc00
RBP: ffffffff823392e0 R08: 0000000000000000 R09: ffff88002802a200
R10: ffff8800a5c4b000 R11: ffffffff823e4080 R12: ffff88011ac2fc00
R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
FS: 0000000041cbd940(0000) GS:ffff8800bff839c0(0000)
knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000007 CR3: 00000000bd87c000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff8800bff9e000, task
ffff88011ff76690)
Stack: ffffffff823392e0 0000000000000100 ffffffff821e3a3a
0000000000000008
0000000000000000 ffffffff821e3a61 ffff8800bff7c000 ffffffff8203c7e7
ffff88011ff7ff10 ffff88011ff7ff10 0000000000000021 ffffffff82351108
Call Trace:
<IRQ> [<ffffffff821e3a3a>] ? inet_twdr_hangman+0x0/0x9e
[<ffffffff821e3a61>] ? inet_twdr_hangman+0x27/0x9e
[<ffffffff8203c7e7>] ? run_timer_softirq+0x12c/0x193
[<ffffffff820390d1>] ? __do_softirq+0x5e/0xcd
[<ffffffff8200d08c>] ? call_softirq+0x1c/0x28
[<ffffffff8200e611>] ? do_softirq+0x2c/0x68
[<ffffffff8201a055>] ? smp_apic_timer_interrupt+0x8e/0xa9
[<ffffffff8200cad6>] ? apic_timer_interrupt+0x66/0x70
<EOI> [<ffffffff82011f4c>] ? default_idle+0x27/0x3b
[<ffffffff8200abbd>] ? cpu_idle+0x5f/0x7d
Code: e8 01 00 00 4c 89 e7 41 ff c5 e8 8d fd ff ff 49 8b 44 24 38 4c 89 e7
65 8b 14 25 24 00 00 00 89 d2 48 8b 80 e8 00 00 00 48 f7 d0 <48> 8b 04 d0
48 ff 40 58 e8 fc fc ff ff 48 89 df e8 c0 5f 04 00
RIP [<ffffffff821e394d>] inet_twdr_do_twkill_work+0x6e/0xb8
RSP <ffff88011ff7fed0>
CR2: 0000000000000007
This patch provides a function to purge all timewait sockets related
to a network namespace. The timewait sockets life cycle is not tied with
the network namespace, that means the timewait sockets stay alive while
the network namespace dies. The timewait sockets are for avoiding to
receive a duplicate packet from the network, if the network namespace is
freed, the network stack is removed, so no chance to receive any packets
from the outside world. Furthermore, having a pending destruction timer
on these sockets with a network namespace freed is not safe and will lead
to an oops if the timer callback which try to access data belonging to
the namespace like for example in:
inet_twdr_do_twkill_work
-> NET_INC_STATS_BH(twsk_net(tw), LINUX_MIB_TIMEWAITED);
Purging the timewait sockets at the network namespace destruction will:
1) speed up memory freeing for the namespace
2) fix kernel panic on asynchronous timewait destruction
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Denis V. Lunev <den@openvz.org>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On 32-bit, at least the generic nops are fairly reasonable, but the
default nops for 64-bit really look pretty sad, and the P6 nops really do
look better.
So I would suggest perhaps moving the static P6 nop selection into the
CONFIG_X86_64 thing.
The alternative is to just get rid of that static nop selection, and just
have two cases: 32-bit and 64-bit, and just pick obviously safe cases for
them.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Set the class so it doesn't clash with the normal memory class.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
===================================================================
The second HPC3 could be found only on Guiness systems (Challenge-S),
but not on fullhouse (Indigo2) systems.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Remove duplicate call to pcim_enable_device in sil680_init_one.
Signed-off-by: David Milburn <dmilburn@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
On HSM_ST_ERR, ata_hsm_move() triggers WARN_ON() if AC_ERR_DEV or
AC_ERR_HSM is not set. PHY events may trigger HSM_ST_ERR with other
error codes and, with or without it, there just isn't much reason to
do WARN_ON() on it. Even if error code is not set there, core EH
logic won't have any problem dealing with the error condition.
OSDL bz#11065 reports this problem.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
of them being unifying probing, hotplug and EH reset paths uniform.
Previously, broken hardreset could go unnoticed as it wasn't used
during probing but when something goes wrong or after hotplug the
problem will surface and bite hard.
OSDL bug 11195 reports that sata_nv generic flavor falls into this
category. Hardreset itself succeeds but PHY stays offline after
hardreset. I tried longer debounce timing but the result was the
same.
http://bugzilla.kernel.org/show_bug.cgi?id=11195
So, it seems we'll have to drop hardreset from the generic flavor.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Peer Chen <pchen@nvidia.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Marvell ahcis don't play nicely with PMPs. Disable it.
Reported by KueiHuan Chen in the following thread.
http://thread.gmane.org/gmane.linux.ide/33296
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: KueiHuan Chen <kueihuan.chen@gmail.com>
Cc: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
I've been chasing Jeff about this for months. Jeff added the Marvell
device identifiers to the ahci driver without making the AHCI driver
handle the PATA port. This means a lot of users can't use current
kernels and in most distro cases can't even install.
This has been going on since March 2008 for the 6121 Marvell, and late 2007
for the 6145!!!
This was all pointed out at the time and repeatedly ignored. Bugs assigned
to Jeff about this are ignored also.
To quote Jeff in email
> "Just switch the order of 'ahci' and 'pata_marvell' in
> /etc/modprobe.conf, then use Fedora's tools regenerate the initrd.
> See? It's not rocket science, and the current configuration can be
> easily made to work for Fedora users."
(Which isn't trivial, isn't end user, shouldn't be needed, and as it usually
breaks at install time is in fact impossible)
To quote Jeff in August 2007
> " mv-ahci-pata
> Marvell 6121/6141 PATA support. Needs fixing in the 'PATA controller
> command' area before it is usable, and can go upstream."
Only he add the ids anyway later and caused regressions, adding a further
id in March causing more regresions.
The actual fix for the moment is very simple. If the user has included
the pata_marvell driver let it drive the ports. If they've only selected
for SATA support give them the AHCI driver which will run the port a fraction
faster. Allow the user to control this decision via ahci.marvell_enable as
a module parameter so that distributions can ship 'it works' defaults and
smarter users (or config tools) can then flip it over it desired.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
A make -j20 powerpc kernel build broke a couple of months ago saying:
In file included from arch/powerpc/boot/gunzip_util.h:13,
from arch/powerpc/boot/prpmc2800.c:21:
arch/powerpc/boot/zlib.h:85: error: expected ‘:’, ‘,’, ‘;’, ‘}’ or ‘__attribute__’ before ‘*’ token
arch/powerpc/boot/zlib.h:630: warning: type defaults to ‘int’ in declaration of ‘Byte’
arch/powerpc/boot/zlib.h:630: error: expected ‘;’, ‘,’ or ‘)’ before ‘*’ token
It happened again yesterday: too rare for me to confirm the fix, but
it looks like the list of dependants on gunzip_util.h was incomplete.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This reverts commit bd699f2df6,
which causes camellia to fail the included self-test vectors.
It has also been confirmed that it breaks existing encrypted
disks using camellia.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
net_tx_action() can skip __QDISC_STATE_SCHED bit clearing while qdisc
is neither ran nor rescheduled, which may cause endless loop in
dev_deactivate().
Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
Tested-by: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan points out:
1. simple_strtoul() silently accepts all characters for given base even
if result won't fit into unsigned long. This is amazing stupidity in
itself, but
2. nf_conntrack_irc helper use simple_strtoul() for DCC request parsing.
Data first copied into 64KB buffer, so theoretically nothing prevents
reading past the end of it, since data comes from network given 1).
This is not actually a problem currently since we're guaranteed to have
a 0 byte in skb_shared_info or in the buffer the data is copied to, but
to make this more robust, make sure the string is actually terminated.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
It does "kfree(list_head)" which looks wrong because entity that was
allocated is definitely not list_head.
However, this all works because list_head is first item in
struct nf_ct_gre_keymap.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
gre_keymap_list should be protected in all places.
(unless I'm misreading something)
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Helper's ->help hook can run concurrently with itself, so iterating over
SIP helpers with static pointer won't work reliably.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently have a race when scheduling a context to a SPE -
after we have found a runnable context in spusched_tick, the same
context may have been scheduled by spu_activate().
This may result in a panic if we try to unschedule a context that has
been freed in the meantime.
This change exits spu_schedule() if the context has already been
scheduled, so we don't end up scheduling it twice.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>