WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Eric W. Biederman	5e85d4abe3	[PATCH] task: Make task list manipulations RCU safe While we can currently walk through thread groups, process groups, and sessions with just the rcu_read_lock, this opens the door to walking the entire task list. We already have all of the other RCU guarantees so there is no cost in doing this, this should be enough so that proc can stop taking the tasklist lock during readdir. prev_task was killed because it has no users, and using it will miss new tasks when doing an rcu traversal. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-19 09:13:49 -07:00
Jens Axboe	9e0267c26e	[PATCH] splice: fixup writeout path after ->map changes Since ->map() no longer locks the page, we need to adjust the handling of those pages (and stealing) a little. This now passes full regressions again. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-19 15:57:31 +02:00
Jens Axboe	a4514ebd8e	[PATCH] splice: offset fixes - We need to adjust *ppos for writes as well. - Copy back modified offset value if one was passed in, similar to what sendfile does. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-19 15:57:05 +02:00
Jens Axboe	2a27250e6c	[PATCH] tee: link_pipe() must be careful when dropping one of the pipe locks We need to ensure that we only drop a lock that is ordered last, to avoid ABBA deadlocks with competing processes. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-19 15:56:40 +02:00
Jens Axboe	c4f895cbe1	[PATCH] splice: cleanup the SPLICE_F_NONBLOCK handling - generic_file_splice_read() more readable and correct - Don't bail on page allocation with NONBLOCK set, just don't allow direct blocking on IO (eg lock_page). Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-19 15:56:12 +02:00
Jens Axboe	91ad66ef44	[PATCH] splice: close i_size truncate races on read We need to check i_size after doing a blocking readpage. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-19 15:55:10 +02:00
Linus Torvalds	385910f2b2	x86: be careful about tailcall breakage for sys_open[at] too Came up through a quick grep for other cases similar to the ftruncate() one in commit `0a489cb3b6`. Also, add a comment, so that people who read the code understand why we do what looks like a no-op. (Again, this won't actually matter to any sane user, since libc will save and restore the register gcc stomps on, but it's still wrong to stomp on it) Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-18 13:22:59 -07:00
Linus Torvalds	0a489cb3b6	x86: don't allow tail-calls in sys_ftruncate[64]() Gcc thinks it owns the incoming argument stack, but that's not true for "asmlinkage" functions, and it corrupts the caller-set-up argument stack when it pushes the third argument onto the stack. Which can result in %ebx getting corrupted in user space. Now, normally nobody sane would ever notice, since libc will save and restore %ebx anyway over the system call, but it's still wrong. I'd much rather have "asmlinkage" tell gcc directly that it doesn't own the stack, but no such attribute exists, so we're stuck with our hacky manual "prevent_tail_call()" macro once more (we've had the same issue before with sys_waitpid() and sys_wait4()). Thanks to Hans-Werner Hilse <hilse@sub.uni-goettingen.de> for reporting the issue and testing the fix. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-18 13:02:48 -07:00
Ananiev, Leonid I	75616cf985	[PATCH] ext3: Fix missed mutex unlock Missed unlock_super()call is added in error condition code path. Signed-off-by: Leonid Ananiev <leonid.i.ananiev@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-04-17 14:24:57 -07:00
Stephen Rothwell	2436f039d2	[PATCH] Fix block device symlink name As noted further on the this file, some block devices have a / in their name, so fix the "block:..." symlink name the same as the /sys/block name. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-04-17 14:24:57 -07:00
Kay Sievers	d4d7e5dffc	[PATCH] BLOCK: delay all uevents until partition table is scanned [BLOCK] delay all uevents until partition table is scanned Here we delay the annoucement of all block device events until the disk's partition table is scanned and all partition devices are already created and sysfs is populated. We have a bunch of old bugs for removable storage handling where we probe successfully for a filesystem on the raw disk, but at the same time the kernel recognizes a partition table and creates partition devices. Currently there is no sane way to tell if partitions will show up or not at the time the disk device is announced to userspace. With the delayed events we can simply skip any probe for a filesystem on the raw disk when we find already present partitions. Signed-off-by: Kay Sievers <kay.sievers@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-04-14 11:41:24 -07:00
NeilBrown	4508a7a734	[PATCH] sysfs: Allow sysfs attribute files to be pollable It works like this: Open the file Read all the contents. Call poll requesting POLLERR or POLLPRI (so select/exceptfds works) When poll returns, close the file and go to top of loop. or lseek to start of file and go back to the 'read'. Events are signaled by an object manager calling sysfs_notify(kobj, dir, attr); If the dir is non-NULL, it is used to find a subdirectory which contains the attribute (presumably created by sysfs_create_group). This has a cost of one int per attribute, one wait_queuehead per kobject, one int per open file. The name "sysfs_notify" may be confused with the inotify functionality. Maybe it would be nice to support inotify for sysfs attributes as well? This patch also uses sysfs_notify to allow /sys/block/md*/md/sync_action to be pollable Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-04-14 11:41:24 -07:00
Linus Torvalds	9a7e9f1c60	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/mszeredi/fuse: [fuse] Direct I/O should not use fuse_reset_request [fuse] Don't init request twice [fuse] Fix accounting the number of waiting requests [fuse] fix deadlock between fuse_put_super() and request_end()	2006-04-14 09:11:34 -07:00
Linus Torvalds	9ca686626c	Merge branch 'tee' of git://brick.kernel.dk/data/git/linux-2.6-block * 'tee' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] splice: add support for sys_tee() [PATCH] splice: pass offset around for ->splice_read() and ->splice_write()	2006-04-14 09:02:07 -07:00
Eric W. Biederman	c06511d12d	[PATCH] de_thread: Don't change our parents and ptrace flags. This is two distinct changes. - Not changing our real parents. - Not changing our ptrace parents. Not changing our real parents is trivially correct because both tasks have the same real parents as they are part of a thread group. Now that we demote the leader to a thread there is no longer any reason to change it's parentage. Not changing our ptrace parents is a user visible change if someone looks hard enough. I don't think user space applications will care or even notice. In the practical and I think common case a debugger will have attached to all of the threads using the same ptrace flags. From my quick skim of strace and gdb that appears to be the case. Which if true means debuggers will not notice a change. Before this point we have already generated a ptrace event in do_exit that reports the leaders pid has died so de_thread is visible to a debugger. Which means attempting to hide this case by copying flags around appears excessive. By not doing anything it avoids all of the weird locking issues between de_thread and ptrace attach, and removes one case from consideration for fixing the ptrace locking. This only addresses Oleg's first concern with ptrace_attach, that of the problems caused by reparenting. Oleg's second concern is essentially a race between ptrace_attach and release_task that causes an oops when we get to force_sig_specific. There is nothing special about de_thread with respect to that race. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-14 08:49:19 -07:00
Miklos Szeredi	56cf34ff07	[fuse] Direct I/O should not use fuse_reset_request It's cleaner to allocate a new request, otherwise the uid/gid/pid fields of the request won't be filled in. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:16:51 +02:00
Miklos Szeredi	4858cae4f0	[fuse] Don't init request twice Request is already initialized in fuse_request_alloc() so no need to do it again in fuse_get_req(). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:16:38 +02:00
Miklos Szeredi	9bc5dddad1	[fuse] Fix accounting the number of waiting requests Properly accounting the number of waiting requests was forgotten in "clean up request accounting" patch. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:16:09 +02:00
Miklos Szeredi	73ce8355c2	[fuse] fix deadlock between fuse_put_super() and request_end() A deadlock was possible, when the last reference to the superblock was held due to a background request containing a file reference. Releasing the file would release the vfsmount which in turn would release the superblock. Since sbput_sem is held during the fput() and fuse_put_super() tries to acquire this same semaphore, a deadlock results. The chosen soltuion is to get rid of sbput_sem, and instead use the spinlock to ensure the referenced inodes/file are released only once. Since the actual release may sleep, defer these outside the locked region, but using local variables instead of the structure members. This is a much more rubust solution. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:14:26 +02:00
Jens Axboe	70524490ee	[PATCH] splice: add support for sys_tee() Basically an in-kernel implementation of tee, which uses splice and the pipe buffers as an intelligent way to pass data around by reference. Where the user space tee consumes the input and produces a stdout and file output, this syscall merely duplicates the data inside a pipe to another pipe. No data is copied, the output just grabs a reference to the input pipe data. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 15:51:17 +02:00
Jens Axboe	cbb7e577e7	[PATCH] splice: pass offset around for ->splice_read() and ->splice_write() We need not use ->f_pos as the offset for the file input/output. If the user passed an offset pointer in through sys_splice(), just use that and leave ->f_pos alone. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 15:47:07 +02:00
Linus Torvalds	88dd9c16ce	Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block * 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] vfs: add splice_write and splice_read to documentation [PATCH] Remove sys_ prefix of new syscalls from __NR_sys_* [PATCH] splice: warning fix [PATCH] another round of fs/pipe.c cleanups [PATCH] splice: comment styles [PATCH] splice: add Ingo as addition copyright holder [PATCH] splice: unlikely() optimizations [PATCH] splice: speedups and optimizations [PATCH] pipe.c/fifo.c code cleanups [PATCH] get rid of the PIPE_*() macros [PATCH] splice: speedup __generic_file_splice_read [PATCH] splice: add direct fd <-> fd splicing support [PATCH] splice: add optional input and output offsets [PATCH] introduce a "kernel-internal pipe object" abstraction [PATCH] splice: be smarter about calling do_page_cache_readahead() [PATCH] splice: optimize the splice buffer mapping [PATCH] splice: cleanup __generic_file_splice_read() [PATCH] splice: only call wake_up_interruptible() when we really have to [PATCH] splice: potential !page dereference [PATCH] splice: mark the io page as accessed	2006-04-11 06:34:02 -07:00
NeilBrown	358dd55aa3	[PATCH] knfsd: nfsd4: grant delegations more frequently Keep unused openowners around for at least one lease period, to avoid the need for as many open confirmations and to allow handing out more delegations. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:53 -07:00
NeilBrown	ef0f3390eb	[PATCH] knfsd: nfsd4: limit number of delegations handed out. It's very easy for the server to DOS itself by just giving out too many delegations. For now we just solve the problem with a dumb hard limit. Eventually we'll want a smarter policy. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:53 -07:00
NeilBrown	4e2fd495b5	[PATCH] knfsd: nfsd4: add missing rpciod_down() We should be shutting down rpciod for the callback channel when we shut down the server. Also note that we do rpciod_up() and create the callback client before setting cb_set--the cb_set only determines whether the initial null was succesful. So cb_set is not a reliable determiner of whether we need to clean up, only cb_client is. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:53 -07:00
NeilBrown	541e0e0981	[PATCH] knfsd: nfsd4: nfsd4_probe_callback cleanup Some obvious cleanup. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:53 -07:00
NeilBrown	5e8d5c2948	[PATCH] knfsd: nfsd4: fix laundromat shutdown race We need to make sure the laundromat work doesn't reschedule itself just when we try to cancel it. Also, we shouldn't be waiting for it to finish running while holding the state lock, as that's a potential deadlock. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:52 -07:00
NeilBrown	bb6e8a9f40	[PATCH] knfsd: nfsd4: fix corruption on readdir encoding with 64k pages Fix corruption on readdir encoding with 64k pages. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:52 -07:00
NeilBrown	6ed6decccf	[PATCH] knfsd: nfsd4: fix corruption of returned data when using 64k pages In v4 we grab an extra page just for the padding of returned data. The formula that the rpc server uses to allocate pages for the response doesn't take into account this extra page. Instead of adjusting those formulae, we adopt the same solution as v2 and v3, and put the "tail" data in the same page as the "head" data. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:52 -07:00
NeilBrown	f0e2993e9e	[PATCH] knfsd: nfsd4: remove nfsd_setuser from putrootfh Since nfsd_setuser() is already called from any operation that uses the current filehandle (because it's called from fh_verify), there's no reason to call it from putrootfh. Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:52 -07:00
NeilBrown	54cceebb67	[PATCH] knfsd: nfsd: nfsd_setuser doesn't really need to modify rqstp->rq_cred. In addition to setting the processes filesystem id's, nfsd_setuser also modifies the value of the rq_cred which stores the id's that originally came from the rpc call, for example to reflect root squashing. There's no real reason to do that--the only case where rqstp->rq_cred is actually used later on is in the NFSv4 SETCLIENTID/SETCLIENTID_CONFIRM operations, and there the results are the opposite of what we want--those two operations don't deal with the filesystem at all, they only record the credentials used with the rpc call for later reference (so that we may require the same credentials be used on later operations), and the credentials shouldn't vary just because there was or wasn't a previous operation in the compound that referred to some export This fixes a bug which caused mounts from Solaris clients to fail. Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:52 -07:00
NeilBrown	cd15654963	[PATCH] knfsd: nfsd: oops exporting nonexistent directory Export a directory that does not exist: exportfs -orw,fsid=0,insecure,no_subtree_check client:/home/NFS4 Try to mount from client with nfs4. Mount hangs (I'm not sure why - that's another issue). While client is hung, back on server mkdir /home/NFS4 The server panics in dput. I traced the problem back to svc_export_parse() calling path_release() even though path_lookup() failed (it happens to fill in the nameidata structure with a negative dentry - so the test after out: succeeds). After patching, an recreating the problem, the client mount still takes some time before finally exiting with a message "couldn't read superblock". Here is a simple patch to resolve this issue: Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:52 -07:00
NeilBrown	b5872b0dcc	[PATCH] knfsd: nfsd4: fix acl xattr length return We should be using the length from the second vfs_getxattr, in case it changed. (Note: there's still a small race here; we could end up returning -ENOMEM if the length increased between the first and second call. I don't know whether it's worth spending a lot of effort to fix that.) This makes XFS ACLs usable on NFS exports, which they currently aren't, since XFS appears to be returning a too-large value for vfs_getxattr() when it's passed a NULL buffer. So there's probably an XFS bug here too, though since getxattr with a NULL buffer is usually used to decide how much memory to allocate, it may be a fairly harmless bug in most cases. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:51 -07:00
NeilBrown	b905b7b0a0	[PATCH] knfsd: nfsd4: better nfs4acl errors We're returning -1 in a few places in the NFSv4<->POSIX acl translation code where we could return a reasonable error. Also allows some minor simplification elsewhere. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:51 -07:00
NeilBrown	249920527f	[PATCH] knfsd: nfsd4: Wrong error handling in nfs4acl this fixes coverity id #3. Coverity detected dead code, since the == -1 comparison only returns 0 or 1 to error. Therefore the if ( error < 0 ) statement was always false. Seems that this was an if( error = nfs4... ) statement some time ago, which got broken during cleanup. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:51 -07:00
Adrian Bunk	e465a77f94	[PATCH] fs/nfsd/nfs4state.c: make a struct static Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Marc Eshel <eshel@almaden.ibm.com> Cc: Andy Adamson <andros@citi.umich.edu> Cc: J. Bruce Fields <bfields@citi.umich.edu> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:51 -07:00
NeilBrown	d5b9026a67	[PATCH] knfsd: locks: flag NFSv4-owned locks Use the fl_lmops field to identify which locks are ours, instead of trying to look them up in our private hash. This is safer and more efficient. Earlier versions of this patch used a lock flag instead, but Trond pointed out that adding a new flag for each lock manager wasn't going to scale well, and suggested this approach instead; a separate patch converts lockd to using fl_lmops in the same way. In the NFSv4 case this looks like a bit of a hack, since the NFSv4 server isn't currently actually defining a lock_manager_operations struct, so we end up defining one just to serve as a cookie to identify our locks. But it works, and we actually do expect to start using the lock_manager_operations at some point anyway. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:51 -07:00
NeilBrown	7775f4c85d	[PATCH] knfsd: Correct reserved reply space for read requests. NFSd makes sure there is enough space to hold the maximum possible reply before accepting a request. The units for this maximum is (4byte) words. However in three places, particularly for read request, the number given is a number of bytes. This means too much space is reserved which is slightly wasteful. This is the sort of patch that could uncover a deeper bug, and it is not critical, so it would be best for it to spend a while in -mm before going in to mainline. (akpm: target 2.6.17-rc2, 2.6.16.3 (approx)) Discovered-by: "Eivind Sarto" <ivan@kasenna.com> Signed-off-by: Neil Brown <neilb@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:51 -07:00
Miklos Szeredi	08a53cdce6	[PATCH] fuse: account background requests The previous patch removed limiting the number of outstanding requests. This patch adds a much simpler limiting, that is also compatible with file locking operations. A task may have at most one synchronous request allocated. So these requests need not be otherwise limited. However the number of background requests (release, forget, asynchronous reads, interrupted requests) can grow indefinitely. This can be used by a malicous user to cause FUSE to allocate arbitrary amounts of unswappable kernel memory, denying service. For this reason add a limit for the number of background requests, and block allocations of new requests until the number goes bellow the limit. Also use this mechanism to block all requests until the INIT reply is received. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:49 -07:00
Miklos Szeredi	ce1d5a491f	[PATCH] fuse: clean up request accounting FUSE allocated most requests from a fixed size pool filled at mount time. However in some cases (release/forget) non-pool requests were used. File locking operations aren't well served by the request pool, since they may block indefinetly thus exhausting the pool. This patch removes the request pool and always allocates requests on demand. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:49 -07:00
Miklos Szeredi	a87046d822	[PATCH] fuse: consolidate device errors Return consistent error values for the case when the opened device file has no mount associated yet. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Miklos Szeredi	d713311464	[PATCH] fuse: use a per-mount spinlock Remove the global spinlock in favor of a per-mount one. This patch is basically find & replace. The difficult part has already been done by the previous patch. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Miklos Szeredi	0720b31597	[PATCH] fuse: simplify locking This is in preparation for removing the global spinlock in favor of a per-mount one. The only critical part is the interaction between fuse_dev_release() and fuse_fill_super(): fuse_dev_release() must see the assignment to file->private_data, otherwise it will leak the reference to fuse_conn. This is ensured by the fput() operation, which will synchronize the assignment with other CPU's that may do a final fput() soon after this. Also redundant locking is removed from fuse_fill_super(), where exclusion is already ensured by the BKL held for this function by the VFS. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Jeff Dike	e5ac1d1e70	[PATCH] fuse: add O_NONBLOCK support to FUSE device I don't like duplicating the connected and list_empty tests in fuse_dev_readv, but this seemed cleaner than adding the f_flags test to request_wait. Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Jeff Dike	385a17bfc3	[PATCH] fuse: add O_ASYNC support to FUSE device This adds asynchronous notification to FUSE - a FUSE server can request O_ASYNC on a /dev/fuse file descriptor and receive SIGIO when there is input available. One subtlety - fuse_dev_fasync, which is called when O_ASYNC is requested, does no locking, unlink the other methods. I think it's unnecessary, as the fuse_conn.fasync list is manipulated only by fasync_helper and kill_fasync, which provide their own locking. It would also be wrong to use the fuse_lock, as it's a spin lock and fasync_helper can sleep. My one concern with this is the fuse_conn going away underneath fuse_dev_fasync - sys_fcntl takes a reference on the file struct, so this seems not to be a problem. Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Miklos Szeredi	7025d9ad10	[PATCH] fuse: fix fuse_dev_poll() return value fuse_dev_poll() returned an error value instead of a poll mask. Luckily (or unluckily) -ENODEV does contain the POLLERR bit. There's also a race if filesystem is unmounted between fuse_get_conn() and spin_lock(), in which case this event will be missed by poll(). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:47 -07:00
Miklos Szeredi	d3406ffa4a	[PATCH] fuse: fix oops in fuse_send_readpages() During heavy parallel filesystem activity it was possible to Oops the kernel. The reason is that read_cache_pages() could skip pages which have already been inserted into the cache by another task. Occasionally this may result in zero pages actually being sent, while fuse_send_readpages() relies on at least one page being in the request. So check this corner case and just free the request instead of trying to send it. Reported and tested by Konstantin Isakov. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:47 -07:00
Ananiev, Leonid I	389ed39b97	[PATCH] ext3: Fix missed mutex unlock Missed unlock_super()call is added in error condition code path. Signed-off-by: Leonid Ananiev <leonid.i.ananiev@intel.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:46 -07:00
Arnd Bergmann	091e881d0e	[PATCH] inotify: check for NULL inode in inotify_d_instantiate The spufs file system creates files in a directory before instantiating the directory itself, which causes a NULL pointer access in inotify_d_instantiate since `c32ccd87bf`. I'd like to keep this behavior since it means that the user will not have access to files in the directory before I know that I succeed in creating everything in it. This patch adds a simple check for the inode to keep that working. Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:45 -07:00
Vivek Goyal	68250ba5df	[PATCH] kdump: enable CONFIG_PROC_VMCORE by default Everybody seems to be using /proc/vmcore as a method to access the kernel crash dump. Hence probably it makes sense to enable CONFIG_PROC_VMCORE by default if CONFIG_CRASH_DUMP is selected. This makes kdump configuration further easier for a user. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:45 -07:00
Roland McGrath	f5e902817f	[PATCH] process accounting: take original leader's start_time in non-leader exec The only record we have of the real-time age of a process, regardless of execs it's done, is start_time. When a non-leader thread exec, the original start_time of the process is lost. Things looking at the real-time age of the process are fooled, for example the process accounting record when the process finally dies. This change makes the oldest start_time stick around with the process after a non-leader exec. This way the association between PID and start_time is kept constant, which seems correct to me. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:42 -07:00
Davide Libenzi	2395140ee2	[PATCH] uniform POLLRDHUP handling between epoll and poll/select As reported by Michael Kerrisk, POLLRDHUP handling was not consistent between epoll and poll/select, since in epoll it was unmaskeable. This patch brings uniformity in POLLRDHUP handling. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:42 -07:00
Vivek Goyal	80e8ff6341	[PATCH] kdump proc vmcore size oveflow fix A couple of /proc/vmcore data structures overflow with 32bit systems having memory more than 4G. This patch fixes those. Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:42 -07:00
Mitchell Blank Jr	b04eb6aa08	[PATCH] select: don't overflow if (SELECT_STACK_ALLOC % sizeof(long) != 0) If SELECT_STACK_ALLOC is not a multiple of sizeof(long) then stack_fds[] would be shorter than SELECT_STACK_ALLOC bytes and could overflow later in the function. Fixed by simply rearranging the test later to work on sizeof(stack_fds) Currently SELECT_STACK_ALLOC is 256 so this doesn't happen, but it's nasty to have things like this hidden in the code. What if later someone decides to change SELECT_STACK_ALLOC to 300? Signed-off-by: Mitchell Blank Jr <mitch@sfgoth.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:41 -07:00
Eric Van Hensbergen	00fbc6dfe7	[PATCH] 9p: handle sget() failure Handle a failing sget() in v9fs_get_sb(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:41 -07:00
Herbert Poetzl	f6422f17d3	[PATCH] vfs: propagate mnt_flags into do_loopback/vfsmount The mnt_flags are propagated into do_loopback(), so that they can be stored with the vfsmount Signed-off-by: Herbert Poetzl <herbert@13thfloor.at> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:41 -07:00
Andrew Morton	5246d05031	[PATCH] sync_file_range(): use unsigned for flags Ulrich suggested that the `flags' arg to sync_file_range() become unsigned. Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:40 -07:00
Jeff Dike	7b04d7170e	[PATCH] Add GFP_NOWAIT Introduce GFP_NOWAIT, as an alias for GFP_ATOMIC & ~__GFP_HIGH. This also changes XFS, which is the only in-tree user of this idiom that I could find. The XFS piece is compile-tested only. Signed-off-by: Jeff Dike <jdike@addtoit.com> Acked-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:35 -07:00
Andrew Morton	29ff2db551	[PATCH] select() warning fixes fs/select.c: In function `core_sys_select': fs/select.c:339: warning: assignment from incompatible pointer type fs/select.c:376: warning: comparison of distinct pointer types lacks a cast By using a void* we can remove lots of casts rather than adding more. Cc: Jes Sorensen <jes@trained-monkey.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:30 -07:00
Ingo Molnar	341b446bc5	[PATCH] another round of fs/pipe.c cleanups make pipe.c a bit more readable and hackable. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 13:57:45 +02:00
Ingo Molnar	73d62d83ec	[PATCH] splice: comment styles - capitalize consistently - end sentences in one way or another - update comment text to match the implementation Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 13:57:21 +02:00
Jens Axboe	c2058e0611	[PATCH] splice: add Ingo as addition copyright holder The comment is also somewhat out of date, correct that as well. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 13:56:34 +02:00
Jens Axboe	49570e9b29	[PATCH] splice: unlikely() optimizations Also corrects a few comments. Patch mainly from Ingo, changes by me. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 13:56:09 +02:00
Jens Axboe	6f767b0425	[PATCH] splice: speedups and optimizations - Kill the local variables that cache ->nrbufs, they just take up space. - Only set do_wakeup for a real pipe. This is a big win for direct splicing. - Kill i_mutex lock around ->f_pos update, regular io paths don't do this either. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 13:53:56 +02:00
Ingo Molnar	923f4f2394	[PATCH] pipe.c/fifo.c code cleanups more code cleanups after the macro conversion: - standardize on 'struct pipe_inode_info *pipe' variable names - introduce 'pipe' temporaries to reduce mass inode->i_pipe dereferencing Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 13:53:33 +02:00
Ingo Molnar	9aeedfc471	[PATCH] get rid of the PIPE_() macros get rid of the PIPE_() macros. Scripted transformation. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 13:53:10 +02:00
Jens Axboe	7480a90435	[PATCH] splice: speedup __generic_file_splice_read Using find_get_page() is a lot faster than find_or_create_page(). This gets splice a lot closer to sendfile() for fd -> socket transfers. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 13:52:47 +02:00
Jens Axboe	b92ce55893	[PATCH] splice: add direct fd <-> fd splicing support It's more efficient for sendfile() emulation. Basically we cache an internal private pipe and just use that as the intermediate area for pages. Direct splicing is not available from sys_splice(), it is only meant to be used for sendfile() emulation. Additional patch from Ingo Molnar to avoid the PIPE_BUFFERS loop at exit for the normal fast path. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 13:52:07 +02:00
Nathan Scott	019ff2d57b	[XFS] Fix a problem in aligning inode allocations to stripe unit boundaries. SGI-PV: 951862 SGI-Modid: xfs-linux-melb:xfs-kern:25726a Signed-off-by: Nathan Scott <nathans@sgi.com>	2006-04-11 15:45:05 +10:00
Nathan Scott	8c0b5113a5	[XFS] Fix utime(2) in the case that no times parameter was passed in. SGI-PV: 949858 SGI-Modid: xfs-linux-melb:xfs-kern:25717a Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Nathan Scott <nathans@sgi.com>	2006-04-11 15:12:45 +10:00
David Chinner	58829e490e	[XFS] Fix an inode use-after-free durin an unpin. When reclaiming inodes that have been unlinked, we may need to execute transactions during reclaim. By the time the transaction has hit the disk, the linux inode and xfs vnode may already have been freed so we can't reference them safely. Use the known xfs inode state to determine if it is safe to reference the vnode and linux inode during the unpin operation. SGI-PV: 946321 SGI-Modid: xfs-linux-melb:xfs-kern:25687a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Nathan Scott <nathans@sgi.com>	2006-04-11 15:11:20 +10:00
David Chinner	1fc5d959d8	[XFS] Fix inode reclaim scalability regression. When a filesystem has millions of inodes cached and has sparse cluster population, removing inodes from the cluster hash consumes excessive amounts of CPU time. Reduce the CPU cost by making removal O(1) via use of a double linked list for the hash chains. SGI-PV: 951551 SGI-Modid: xfs-linux-melb:xfs-kern:25683a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Nathan Scott <nathans@sgi.com>	2006-04-11 15:11:12 +10:00
Nathan Scott	8272145c05	[XFS] Fix a writepage regression where we accidentally stopped honouring nonblock mode with the new IO path code (since 2.6.16). SGI-PV: 951662 SGI-Modid: xfs-linux-melb:xfs-kern:25676a Signed-off-by: Nathan Scott <nathans@sgi.com>	2006-04-11 15:10:55 +10:00
Nathan Scott	e50bd16fe4	[XFS] Fix superblock validation regression for the zero imaxpct case. Thanks to kjamieson for noticing. SGI-PV: 951661 SGI-Modid: xfs-linux-melb:xfs-kern:25675a Signed-off-by: Nathan Scott <nathans@sgi.com>	2006-04-11 15:10:45 +10:00
Linus Torvalds	e38d557896	Merge branch 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2 * 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2: [PATCH] CONFIGFS_FS must depend on SYSFS [PATCH] Bogus NULL pointer check in fs/configfs/dir.c ocfs2: Better I/O error handling in heartbeat ocfs2: test and set teardown flag early in user_dlm_destroy_lock() ocfs2: Handle the DLM_CANCELGRANT case in user_unlock_ast() ocfs2: catch an invalid ast case in dlmfs ocfs2: remove an overly aggressive BUG() in dlmfs ocfs2: multi node truncate fix	2006-04-10 16:44:09 -07:00
Eric W. Biederman	de12a7878c	[PATCH] de_thread: Don't confuse users do_each_thread. Oleg Nesterov spotted two interesting bugs with the current de_thread code. The simplest is a long standing double decrement of __get_cpu_var(process_counts) in __unhash_process. Caused by two processes exiting when only one was created. The other is that since we no longer detach from the thread_group list it is possible for do_each_thread when run under the tasklist_lock to see the same task_struct twice. Once on the task list as a thread_group_leader, and once on the thread list of another thread. The double appearance in do_each_thread can cause a double increment of mm_core_waiters in zap_threads resulting in problems later on in coredump_wait. To remedy those two problems this patch takes the simple approach of changing the old thread group leader into a child thread. The only routine in release_task that cares is __unhash_process, and it can be trivially seen that we handle cleaning up a thread group leader properly. Since de_thread doesn't change the pid of the exiting leader process and instead shares it with the new leader process. I change thread_group_leader to recognize group leadership based on the group_leader field and not based on pids. This should also be slightly cheaper then the existing thread_group_leader macro. I performed a quick audit and I couldn't see any user of thread_group_leader that cared about the difference. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-10 16:36:50 -07:00
Adrian Bunk	65714b9184	[PATCH] CONFIGFS_FS must depend on SYSFS This patch fixes the a compile error with CONFIG_SYSFS=n Configfs is creating, as a matter of policy, the /sys/kernel/config mountpoint. This means it requires CONFIG_SYSFS. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2006-04-10 11:17:21 -07:00
Eric Sesterhenn	cbca692c24	[PATCH] Bogus NULL pointer check in fs/configfs/dir.c We check the "group" pointer after we dereference it. This check is bogus, as it cannot be NULL coming in. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2006-04-10 11:16:17 -07:00
Ingo Molnar	529565dcb1	[PATCH] splice: add optional input and output offsets add optional input and output offsets to sys_splice(), for seekable file descriptors: asmlinkage long sys_splice(int fd_in, loff_t __user off_in, int fd_out, loff_t __user off_out, size_t len, unsigned int flags); semantics are straightforward: f_pos will be updated with the offset provided by user-space, before the splice transfer is about to begin. Providing a NULL offset pointer means the existing f_pos will be used (and updated in situ). Providing an offset for a pipe results in -ESPIPE. Providing an invalid offset pointer results in -EFAULT. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-10 15:18:58 +02:00
Ingo Molnar	3a326a2ce8	[PATCH] introduce a "kernel-internal pipe object" abstraction separate out the 'internal pipe object' abstraction, and make it usable to splice. This cleans up and fixes several aspects of the internal splice APIs and the pipe code: - pipes: the allocation and freeing of pipe_inode_info is now more symmetric and more streamlined with existing kernel practices. - splice: small micro-optimization: less pointer dereferencing in splice methods Signed-off-by: Ingo Molnar <mingo@elte.hu> Update XFS for the ->splice_read/->splice_write changes. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-10 15:18:35 +02:00
Jens Axboe	0b749ce380	[PATCH] splice: be smarter about calling do_page_cache_readahead() We don't want to call into the read-ahead logic unless we are at the start of a page, _or_ we have multiple pages to read. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-10 09:05:04 +02:00
Jens Axboe	49d0b21be2	[PATCH] splice: optimize the splice buffer mapping We don't really need to lock down the pages, just make sure they are uptodate. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-10 09:04:41 +02:00
Jens Axboe	16c523ddab	[PATCH] splice: cleanup __generic_file_splice_read() The whole shadow/pages logic got overly complex, and this simpler approach is actually faster in testing. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-10 09:03:58 +02:00
Jens Axboe	c0bd1f650b	[PATCH] splice: only call wake_up_interruptible() when we really have to __wake_up_common() is pretty heavy in the kernel profiles, this brings it down to a more acceptable level. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-10 09:03:32 +02:00
Dave Jones	9aefe431f5	[PATCH] splice: potential !page dereference We can get to out: with a NULL page, which we probably don't want to be calling page_cache_release() on. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-10 09:02:40 +02:00
Jens Axboe	c7f21e4f5a	[PATCH] splice: mark the io page as accessed We should do that, since we do the LRU manipulation ourselves now. Suggested by Nick Piggin. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-10 09:01:01 +02:00
Mark Fasheh	a9e2ae3917	ocfs2: Better I/O error handling in heartbeat Propagate errors received in o2hb_bio_end_io() back to the heartbeat thread so it can skip re-arming the timer. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2006-04-07 18:03:09 -07:00
Mark Fasheh	2cd9888590	ocfs2: test and set teardown flag early in user_dlm_destroy_lock() Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2006-04-07 17:39:43 -07:00
Mark Fasheh	f43e6918c0	ocfs2: Handle the DLM_CANCELGRANT case in user_unlock_ast() Remove the code which attempted to catch it via dlmunlock() return status - this never happens there. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2006-04-07 17:37:52 -07:00
Mark Fasheh	cc6eb72595	ocfs2: catch an invalid ast case in dlmfs Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2006-04-07 17:36:16 -07:00
Mark Fasheh	1f7bc828e3	ocfs2: remove an overly aggressive BUG() in dlmfs Don't BUG() user_dlm_unblock_lock() on the absence of the USER_LOCK_BLOCKED flag - this turns out to be a valid case. Make some of the related BUG() statements print more useful information. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2006-04-07 17:27:43 -07:00
Mark Fasheh	ab0920ce7e	ocfs2: multi node truncate fix Fix ocfs2_truncate_file() so that it forces a truncate_inode_pages() on all interested nodes in all cases of a truncate(), not just allocation change. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2006-04-07 16:47:24 -07:00
Linus Torvalds	d69636157a	Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block * 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] splice: fix page stealing LRU handling. [PATCH] splice: page stealing needs to wait_on_page_writeback() [PATCH] splice: export generic_splice_sendpage [PATCH] splice: add a SPLICE_F_MORE flag [PATCH] splice: add comments documenting more of the code [PATCH] splice: improve writeback and clean up page stealing [PATCH] splice: fix shadow[] filling logic	2006-04-02 14:22:06 -07:00
Jens Axboe	3e7ee3e7b3	[PATCH] splice: fix page stealing LRU handling. Originally from Nick Piggin, just adapted to the newer branch. You can't check PageLRU without holding zone->lru_lock. The page release code can get away with it only because the page refcount is 0 at that point. Also, you can't reliably remove pages from the LRU unless the refcount is 0. Ever. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-02 23:11:04 +02:00
Jens Axboe	ad8d6f0a78	[PATCH] splice: page stealing needs to wait_on_page_writeback() Thanks to Andrew for the good explanation of why this is so. akpm writes: If a page is under writeback and we remove it from pagecache, it's still going to get written to disk. But the VFS no longer knows about that page, nor that this page is about to modify disk blocks. So there might be scenarios in which those blocks-which-are-about-to-be-written-to get reused for something else. When writeback completes, it'll scribble on those blocks. This won't happen in ext2/ext3-style filesystems in normal mode because the page has buffers and try_to_release_page() will fail. But ext2 in nobh mode doesn't attach buffers at all - it just sticks the page in a BIO, finds some new blocks, points the BIO at those blocks and lets it rip. While that write IO's in flight, someone could truncate the file. Truncate won't block on the writeout because the page isn't in pagecache any more. So truncate will the free the blocks from the file under the page's feet. Then something else can reallocate those blocks. Then write data to them. Now, the original write completes, corrupting the filesystem. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-02 23:10:32 +02:00
Jens Axboe	059a8f3734	[PATCH] splice: export generic_splice_sendpage Forgot that one, thanks Jeff. Also move the other EXPORT_SYMBOL to right below the functions. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-02 23:06:05 +02:00
Jens Axboe	b2b39fa478	[PATCH] splice: add a SPLICE_F_MORE flag This lets userspace indicate whether more data will be coming in a subsequent splice call. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-02 23:05:41 +02:00
Jens Axboe	83f9135bdd	[PATCH] splice: add comments documenting more of the code Hopefully this will make Andrew a little more happy. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-02 23:05:09 +02:00
Jens Axboe	4f6f0bd2ff	[PATCH] splice: improve writeback and clean up page stealing By cleaning up the writeback logic (killing write_one_page() and the manual set_page_dirty()), we can get rid of ->stolen inside the pipe_buffer and just keep it local in pipe_to_file(). This also adds dirty page balancing logic and O_SYNC handling. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-02 23:04:46 +02:00
Jens Axboe	53cd9ae886	[PATCH] splice: fix shadow[] filling logic Clear the entire range, and don't increment pidx or we keep filling the same position again and again. Thanks to KAMEZAWA Hiroyuki. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-02 23:04:21 +02:00

1 2 3 4 5 ...

2535 Коммитов