* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Don't clobber the attribute type in nfs_update_inode()
NFS: Fix a umount race
NFS: Fix an Oops when truncating a file
NFS: Ensure that we handle NFS4ERR_STALE_STATEID correctly
NFSv4.1: Don't call nfs4_schedule_state_recovery() unnecessarily
NFSv4: Don't allow posix locking against servers that don't support it
NFSv4: Ensure that the NFSv4 locking can recover from stateid errors
NFS: Avoid warnings when CONFIG_NFS_V4=n
NFS: Make nfs_commitdata_release static
NFS: Try to commit unstable writes in nfs_release_page()
NFS: Fix a reference leak in nfs_wb_cancel_page()
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futex: Handle futex value corruption gracefully
futex: Handle user space corruption gracefully
futex_lock_pi() key refcnt fix
softlockup: Add sched_clock_tick() to avoid kernel warning on kgdb resume
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
GFS2: Extend umount wait coverage to full glock lifetime
GFS2: Wait for unlock completion on umount
Commit 859ddf0974 tried to fix
misallocation bug but broke full bit marking by not clearing
pa[idp->layers] and also is causing X failures due to lookup failure
in drm code. The cause of the latter hasn't been found yet. Revert
the fix for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After hours of debugging, I finally found the reason why some source
and runtime combination does not work. The PTP (page table pages)
address must be aligned. I am not sure how much, but alignment to
PAGE_SIZE is sufficient. Also, use ALSA's page allocation routines
to ensure proper virtual -> physical address translation.
Cc: <stable@kernel.org>
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Following a gpu hang, we would leak the relocation buffer. So simply
earrange the error path to always free the relocation buffer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
acpi_lid_open() could take up to 10ms on my computer. Some component is
calling the drm GETCONNECTOR ioctl many times in a row. This results in
flickering (for example, when starting a video). Fix it by assuming an
always connected lid status.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Eric Anholt <eric@anholt.net>
Self Refresh should be disabled on dual plane configs. Otherwise, as
the SR watermark is not calculated for such configs, switching to non
VGA mode causes FIFO underrun and display flicker.
This fixes Korg Bug #14897.
Signed-off-by: David John <davidjon@xenontk.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: stable@kernel.org
Signed-off-by: Eric Anholt <eric@anholt.net>
This version of the i_size fix for fallocate makes sure we only update
the i_size when the current fallocate is really operating outside of
i_size.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
When running the following fio job
[torrent]
filename=torrent-test
rw=randwrite
size=4g
filesize=4g
bs=4k
ioengine=sync
you would see long stalls where no work was being done. That is because we were
doing all this extra work to read in the file extent outside of the transaction,
however in the random io case this ends up hurting us because the file extents
are not there to begin with. So axe this logic, since we end up reading in the
file extent when we go to update it anyway. This took the fio job from 11 mb/s
with several ~10 second stalls to 24 mb/s to a couple of 1-2 second stalls.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
When dropping a empty tree, walk_down_tree() skips checking
extent information for the tree root. This will triggers a
BUG_ON in walk_up_proc().
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Mounting a bad filesystem caused a BUG_ON(). The following is steps to
reproduce it.
# mkfs.btrfs /dev/sda2
# mount /dev/sda2 /mnt
# mkfs.btrfs /dev/sda1 /dev/sda2
(the program says that /dev/sda2 was mounted, and then exits. )
# umount /mnt
# mount /dev/sda1 /mnt
At the third step, mkfs.btrfs exited in the way of make filesystem. So the
initialization of the filesystem didn't finish. So the filesystem was bad, and
it caused BUG_ON() when mounting it. But BUG_ON() should be called by the wrong
code, not user's operation, so I think it is a bug of btrfs.
This patch fixes it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Increase extent buffer's reference count while holding the lock.
Otherwise it can race with try_release_extent_buffer.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
flush_dcache_page() must be called after (!ATA_TFLAG_WRITE) the
data copying to avoid D-cache aliasing with user space or I-D cache
coherency issues (when reading data from an ATA device using PIO,
the kernel dirties the D-cache but there is no flush_dcache_page()
required on Harvard architectures).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acer G725 shares the same suspend problem with the HP laptops which
lose ATA devices on resume. New firmware which fixes the problem is
already available. Add G725 with old firmwares to the broken suspend
list.
This problem has been reported in bko#15104.
http://bugzilla.kernel.org/show_bug.cgi?id=15104
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jani-Matti Hätinen <jani-matti.hatinen@iki.fi>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The value we get from the low byte of the ATA_ID_SECTOR_SIZE word is not not
a plain multiple, but the log of it, so fix the helper to give the correct
answer. Without this we'll get an incorrect minimal I/O size in the block
limits VPD page for 4k sector drives.
Also change the return value of ata_id_logical_per_physical_sectors to u16
for the unlikely case of very large logical sectors.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Fix assignment which overwrote SAT ATA PASS-THROUGH command EXTEND
bit setting (ATA_TFLAG_LBA48)
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
During recovery, the dlm frees the locks for the dead node. If it finds a
lock in a resource for the dead node, it expects that node to also have a
ref in that lock resource. If not, it BUGs.
ossbz#1175 was filed with the above BUG. Now, while it is correct that we
should be expecting the ref, I see no reason why we have to BUG. After all,
we are freeing up the lock and clearing the ref.
This patch replaces the BUG_ON with a printk(). Hopefully, that will give
us more clues next time this happens.
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1175
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
This patch plugs a race between the downconvert thread and an unlock ast message.
Specifically, after the downconvert worker has done its task, the dc thread needs
to check whether an unlock ast made the downconvert moot.
Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@sus.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Currently the omap serial clocks are autoidled after 5 seconds.
However, this causes lost characters on the serial ports. As this
is considered non-standard behaviour for Linux, disable the timeout.
Note that this will also cause blocking of any deeper omap sleep
states.
To enable the autoidling of the serial ports, do something like
this for each serial port:
# echo 5 > /sys/devices/platform/serial8250.0/sleep_timeout
# echo 5 > /sys/devices/platform/serial8250.1/sleep_timeout
...
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
I have found an access to already released memory in
clk_debugfs_register_one() function.
Signed-off-by: Marek Skuczynski <mareksk7@gmail.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
David Binderman ran the sourceforge tool cppcheck over the source code of the
new Linux kernel 2.6.33-rc6:
[./arm/mach-omap2/mux.c:492]: (error) Buffer access out-of-bounds
13 characters + 1 digit + 1 zero byte is more than 14 characters.
Also add a comment on mode0 name length in case new omaps
start using longer names.
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
3630 has more mux signals than 34xx. The additional pins
exist in omap36xx_cbp_subset, but are not initialized
as the superset is missing these offsets. This causes
the following errors during the boot:
mux: Unknown entry offset 0x236
mux: Unknown entry offset 0x22e
mux: Unknown entry offset 0x1ec
mux: Unknown entry offset 0x1ee
mux: Unknown entry offset 0x1f4
mux: Unknown entry offset 0x1f6
mux: Unknown entry offset 0x1f8
mux: Unknown entry offset 0x1fa
mux: Unknown entry offset 0x1fc
mux: Unknown entry offset 0x22a
mux: Unknown entry offset 0x226
mux: Unknown entry offset 0x230
mux: Unknown entry offset 0x22c
mux: Unknown entry offset 0x228
Fix this by adding the missing offsets to omap3 superset.
Note that additionally the uninitialized pins need to be
skipped on 34xx.
Based on an earlier patch by Allen Pais <allen.pais@ti.com>.
Reported-by: Allen Pais <allen.pais@ti.com>
Signed-off-by: Allen Pais <allen.pais@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Ensure valid clock pointer during GPMC init. Fixes compiler
warning about potential use of uninitialized variable.
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Ensure valid base address during IRQ init. Fixes compiler warning
about potential use of uninitialized variable.
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
OMAP platforms(like OMAP3530) include DSP or other co-processors
for media acceleration. when carving out memory for the
accelerators we can end up creating a hole in the memory map
of sort:
<kernel memory><hole(memory for accelerator)><kernel memory>
To handle such a memory configuration ARCH_HAS_HOLES_MEMORYMODEL
has to be enabled. For further information refer discussion at:
http://www.mail-archive.com/linux-omap@vger.kernel.org/msg15262.html.
Signed-off-by: Sriramakrishnan <srk@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The WARN_ON in lookup_pi_state which complains about a mismatch
between pi_state->owner->pid and the pid which we retrieved from the
user space futex is completely bogus.
The code just emits the warning and then continues despite the fact
that it detected an inconsistent state of the futex. A conveniant way
for user space to spam the syslog.
Replace the WARN_ON by a consistency check. If the values do not match
return -EINVAL and let user space deal with the mess it created.
This also fixes the missing task_pid_vnr() when we compare the
pi_state->owner pid with the futex value.
Reported-by: Jermome Marchand <jmarchan@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
If the owner of a PI futex dies we fix up the pi_state and set
pi_state->owner to NULL. When a malicious or just sloppy programmed
user space application sets the futex value to 0 e.g. by calling
pthread_mutex_init(), then the futex can be acquired again. A new
waiter manages to enqueue itself on the pi_state w/o damage, but on
unlock the kernel dereferences pi_state->owner and oopses.
Prevent this by checking pi_state->owner in the unlock path. If
pi_state->owner is not current we know that user space manipulated the
futex value. Ignore the mess and return -EINVAL.
This catches the above case and also the case where a task hijacks the
futex by setting the tid value and then tries to unlock it.
Reported-by: Jermome Marchand <jmarchan@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
This fixes a futex key reference count bug in futex_lock_pi(),
where a key's reference count is incremented twice but decremented
only once, causing the backing object to not be released.
If the futex is created in a temporary file in an ext3 file system,
this bug causes the file's inode to become an "undead" orphan,
which causes an oops from a BUG_ON() in ext3_put_super() when the
file system is unmounted. glibc's test suite is known to trigger this,
see <http://bugzilla.kernel.org/show_bug.cgi?id=14256>.
The bug is a regression from 2.6.28-git3, namely Peter Zijlstra's
38d47c1b70 "[PATCH] futex: rely on
get_user_pages() for shared futexes". That commit made get_futex_key()
also increment the reference count of the futex key, and updated its
callers to decrement the key's reference count before returning.
Unfortunately the normal exit path in futex_lock_pi() wasn't corrected:
the reference count is incremented by get_futex_key() and queue_lock(),
but the normal exit path only decrements once, via unqueue_me_pi().
The fix is to put_futex_key() after unqueue_me_pi(), since 2.6.31
this is easily done by 'goto out_put_key' rather than 'goto out'.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Darren Hart <dvhltc@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@kernel.org>
If the NFS_ATTR_FATTR_TYPE field isn't set in fattr->valid, then we should
not set the S_IFMT part of inode->i_mode.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Ensure that we unregister the bdi before kill_anon_super() calls
ida_remove() on our device name.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
The VM/VFS does not allow mapping->a_ops->invalidatepage() to fail.
Unfortunately, nfs_wb_page_cancel() may fail if a fatal signal occurs.
Since the NFS code assumes that the page stays mapped for as long as the
writeback is active, we can end up Oopsing (among other things).
The only safe fix here is to convert nfs_wait_on_request(), so as to make
it uninterruptible (as is already the case with wait_on_page_writeback()).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Interrupts must be disabled while an interrupt state restore
(prep for interrupt return) is in progress.
Code to do this was lost in the port to the mainline kernel.
Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
Although all glocks are, by the time of the umount glock wait,
scheduled for demotion, some of them haven't made it far
enough through the process for the original set of waiting
code to wait for them.
This extends the ref count to the whole glock lifetime in order
to ensure that the waiting does catch all glocks. It does make
it a bit more invasive, but it seems the only sensible solution
at the moment.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch adds a wait on umount between the point at which we
dispose of all glocks and the point at which we unmount the
lock protocol. This ensures that we've received all the replies
to our unlock requests before we stop the locking.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: Fabio M. Di Nitto <fdinitto@redhat.com>
During blocked lock processing, we should consider the possibility that the
lock is no longer blocking.
Joel Becker <joel.becker@oracle.com> assisted in fixing this issue.
Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
During upconvert, if the master were to send a BAST, dlmglue will detect the
upconversion in process and send a cancel convert to the master. Upon receiving
the AST for the cancel convert, it will re-process the lock resource to determine
whether it needs downconverting. Say, the up was from PR to EX and the BAST was
for EX. After the cancel convert, it will need to downconvert to NL.
However, if the node was originally upconverting from NL to EX, then there would
be no reason to downconvert (assuming the same message sequence).
This patch makes dlmglue consider the possibility that the current lock level
is already compatible and that downconverting is not required.
Joel Becker <joel.becker@oracle.com> assisted in fixing this issue.
Fixes ossbz#1178
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1178
Reported-by: Coly Li <coly.li@suse.de>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
There is possibility of a livelock in __ocfs2_cluster_lock(). If a node were
to get an ast for an upconvert request, followed immediately by a bast,
there is a small window where the fs may downconvert the lock before the
process requesting the upconvert is able to take the lock.
This patch adds a new flag to indicate that the upconvert is still in
progress and that the dc thread should not downconvert it right now.
Wengang Wang <wen.gang.wang@oracle.com> and Joel Becker
<joel.becker@oracle.com> contributed heavily to this patch.
Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
During bast, set the OCFS2_LOCK_BLOCKED flag only if the lock needs to
downconverted.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Acked-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Although we use u64 to pass userspace pointers to the kernel
to avoid compat_ioctl, it doesn't work in some ppc platform.
So wrap them with compat_ptr and add compat_ioctl.
The detailed discussion about compat_ptr can be found in thread
http://lkml.org/lkml/2009/10/27/423.
We indeed met with a bug when testing on ppc(-EFAULT is returned
when using old_path). This patch try to fix this.
I have tested in ppc64(with 32 bit reflink) and x86_64(with i686
reflink), both works.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Mainline commit aad1b15310 made the
dlm_begin_reco_handler() return -EAGAIN instead of EAGAIN.
As this error is transmitted over the wire, we want the receiver,
dlm_send_begin_reco_message(), to understand both the older EAGAIN and
the newer -EAGAIN, to allow rolling upgrade of the cluster nodes.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
In CoW, we have to make sure that the page is already written
out to the disk. So we have a BUG_ON(PageDirty(page)).
In ppc platform we have pagesize=64K, so if the cs=4K, if the
file have fragmented clusters, we will map the page many times.
See this file as an example.
Tree Depth: 0 Count: 19 Next Free Rec: 14
## Offset Clusters Block# Flags
0 0 4 2164864 0x2 Refcounted
1 4 2 9302792 0x2 Refcounted
...
We have to replace the extent recs one by one, so the page with index 0
will be mapped and dirtied twice.
I'd like to leave the BUG_ON there while adding a check so that in
case we meet with an error in other platforms, we can find it easily.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>