WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Arnd Bergmann	6038f373a3	llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode i, struct file f) { <+... ( nonseekable_open(...) \| nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable / }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, / open uses nonseekable / }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, / we have seq_read / }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, / readdir is present / }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, / read accesses f_pos / }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, / write accesses f_pos / }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, / read and write both use no f_pos / }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, / write uses no f_pos / }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, / read uses no f_pos / }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, / no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>	2010-10-15 15:53:27 +02:00
Geert Uytterhoeven	0157443c56	fuse: Initialize total_len in fuse_retrieve() fs/fuse/dev.c:1357: warning: ‘total_len’ may be used uninitialized in this function Initialize total_len to zero, else its value will be undefined. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-10-04 10:45:32 +02:00
Miklos Szeredi	b9ca67b2dd	fuse: fix lock annotations Sparse doesn't understand lock annotations of the form __releases(&foo->lock). Change them to __releases(foo->lock). Same for __acquires(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-09-07 13:42:41 +02:00
Miklos Szeredi	595afaf9e6	fuse: flush background queue on connection close David Bartly reported that fuse can hang in fuse_get_req_nofail() when the connection to the filesystem server is no longer active. If bg_queue is not empty then flush_bg_queue() called from request_end() can put more requests on to the pending queue. If this happens while ending requests on the processing queue then those background requests will be queued to the pending list and never ended. Another problem is that fuse_dev_release() didn't wake up processes sleeping on blocked_waitq. Solve this by: a) flushing the background queue before calling end_requests() on the pending and processing queues b) setting blocked = 0 and waking up processes waiting on blocked_waitq() Thanks to David for an excellent bug report. Reported-by: David Bartley <andareed@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2010-09-07 13:42:41 +02:00
Linus Torvalds	5f248c9c25	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits) no need for list_for_each_entry_safe()/resetting with superblock list Fix sget() race with failing mount vfs: don't hold s_umount over close_bdev_exclusive() call sysv: do not mark superblock dirty on remount sysv: do not mark superblock dirty on mount btrfs: remove junk sb_dirt change BFS: clean up the superblock usage AFFS: wait for sb synchronization when needed AFFS: clean up dirty flag usage cifs: truncate fallout mbcache: fix shrinker function return value mbcache: Remove unused features add f_flags to struct statfs(64) pass a struct path to vfs_statfs update VFS documentation for method changes. All filesystems that need invalidate_inode_buffers() are doing that explicitly convert remaining ->clear_inode() to ->evict_inode() Make ->drop_inode() just return whether inode needs to be dropped fs/inode.c:clear_inode() is gone fs/inode.c:evict() doesn't care about delete vs. non-delete paths now ... Fix up trivial conflicts in fs/nilfs2/super.c	2010-08-10 11:26:52 -07:00
Al Viro	b57922d97f	convert remaining ->clear_inode() to ->evict_inode() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:48:37 -04:00
Christoph Hellwig	2c27c65ed0	check ATTR_SIZE contraints in inode_change_ok Make sure we check the truncate constraints early on in ->setattr by adding those checks to inode_change_ok. Also clean up and document inode_change_ok to make this obvious. As a fallout we don't have to call inode_newsize_ok from simple_setsize and simplify it down to a truncate_setsize which doesn't return an error. This simplifies a lot of setattr implementations and means we use truncate_setsize almost everywhere. Get rid of fat_setsize now that it's trivial and mark ext2_setsize static to make the calling convention obvious. Keep the inode_newsize_ok in vmtruncate for now as all callers need an audit for its removal anyway. Note: setattr code in ecryptfs doesn't call inode_change_ok at all and needs a deeper audit, but that is left for later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:47:39 -04:00
Christoph Hellwig	db78b877f7	always call inode_change_ok early in ->setattr Make sure we call inode_change_ok before doing any changes in ->setattr, and make sure to call it even if our fs wants to ignore normal UNIX permissions, but use the ATTR_FORCE to skip those. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:47:38 -04:00
Linus Torvalds	fe21ea18c7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: add retrieve request fuse: add store request fuse: don't use atomic kmap	2010-08-07 13:18:36 -07:00
Eric Paris	9cfcac810e	vfs: re-introduce MAY_CHDIR Currently MAY_ACCESS means that filesystems must check the permissions right then and not rely on cached results or the results of future operations on the object. This can be because of a call to sys_access() or because of a call to chdir() which needs to check search without relying on any future operations inside that dir. I plan to use MAY_ACCESS for other purposes in the security system, so I split the MAY_ACCESS and the MAY_CHDIR cases. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>	2010-08-02 15:35:06 +10:00
Miklos Szeredi	2d45ba381a	fuse: add retrieve request Userspace filesystem can request data to be retrieved from the inode's mapping. This request is synchronous and the retrieved data is queued as a new request. If the write to the fuse device returns an error then the retrieve request was not completed and a reply will not be sent. Only present pages are returned in the retrieve reply. Retrieving stops when it finds a non-present page and only data prior to that is returned. This request doesn't change the dirty state of pages. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-07-12 14:41:40 +02:00
Miklos Szeredi	a1d75f2582	fuse: add store request Userspace filesystem can request data to be stored in the inode's mapping. This request is synchronous and has no reply. If the write to the fuse device returns an error then the store request was not fully completed (but may have updated some pages). If the stored data overflows the current file size, then the size is extended, similarly to a write(2) on the filesystem. Pages which have been completely stored are marked uptodate. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-07-12 14:41:40 +02:00
Miklos Szeredi	7909b1c640	fuse: don't use atomic kmap Don't use atomic kmap for mapping userspace buffers in device read/write/splice. This is necessary because the next patch (adding store notify) requires that caller of fuse_copy_page() may sleep between invocations. The simplest way to ensure this is to change the atomic kmaps to non-atomic ones. Thankfully architectures where kmap() is not a no-op are going out of fashion, so we can ignore the (probably negligible) performance impact of this change. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-07-12 14:41:40 +02:00
Linus Torvalds	003386fff3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: mm: export generic_pipe_buf_() to modules fuse: support splice() reading from fuse device fuse: allow splice to move pages mm: export remove_from_page_cache() to modules mm: export lru_cache_add_() to modules fuse: support splice() writing to fuse device fuse: get page reference for readpages fuse: use get_user_pages_fast() fuse: remove unneeded variable	2010-05-30 09:16:14 -07:00
Christoph Hellwig	7ea8085910	drop unused dentry argument to ->fsync Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-27 22:05:02 -04:00
Kay Sievers	578454ff7e	driver core: add devname module aliases to allow module on-demand auto-loading This adds: alias: devname:<name> to some common kernel modules, which will allow the on-demand loading of the kernel module when the device node is accessed. Ideally all these modules would be compiled-in, but distros seems too much in love with their modularization that we need to cover the common cases with this new facility. It will allow us to remove a bunch of pretty useless init scripts and modprobes from init scripts. The static device node aliases will be carried in the module itself. The program depmod will extract this information to a file in the module directory: $ cat /lib/modules/2.6.34-00650-g537b60d-dirty/modules.devname # Device nodes to trigger on-demand module loading. microcode cpu/microcode c10:184 fuse fuse c10:229 ppp_generic ppp c108:0 tun net/tun c10:200 dm_mod mapper/control c10:235 Udev will pick up the depmod created file on startup and create all the static device nodes which the kernel modules specify, so that these modules get automatically loaded when the device node is accessed: $ /sbin/udevd --debug ... static_dev_create_from_modules: mknod '/dev/cpu/microcode' c10:184 static_dev_create_from_modules: mknod '/dev/fuse' c10:229 static_dev_create_from_modules: mknod '/dev/ppp' c108:0 static_dev_create_from_modules: mknod '/dev/net/tun' c10:200 static_dev_create_from_modules: mknod '/dev/mapper/control' c10:235 udev_rules_apply_static_dev_perms: chmod '/dev/net/tun' 0666 udev_rules_apply_static_dev_perms: chmod '/dev/fuse' 0666 A few device nodes are switched to statically allocated numbers, to allow the static nodes to work. This might also useful for systems which still run a plain static /dev, which is completely unsafe to use with any dynamic minor numbers. Note: The devname aliases must be limited to the common and singleinstance* device nodes, like the misc devices, and never be used for conceptually limited systems like the loop devices, which should rather get fixed properly and get a control node for losetup to talk to, instead of creating a random number of device nodes in advance, regardless if they are ever used. This facility is to hide the mess distros are creating with too modualized kernels, and just to hide that these modules are not compiled-in, and not to paper-over broken concepts. Thanks! :) Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: David S. Miller <davem@davemloft.net> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Chris Mason <chris.mason@oracle.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Ian Kent <raven@themaw.net> Signed-Off-By: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-05-25 15:08:26 -07:00
Miklos Szeredi	c3021629a0	fuse: support splice() reading from fuse device Allow userspace filesystem implementation to use splice() to read from the fuse device. The userspace filesystem can now transfer data coming from a WRITE request to an arbitrary file descriptor (regular file, block device or socket) without having to go through a userspace buffer. The semantics of using splice() to read messages are: 1) with a single splice() call move the whole message from the fuse device to a temporary pipe 2) read the header from the pipe and determine the message type 3a) if message is a WRITE then splice data from pipe to destination 3b) else read rest of message to userspace buffer Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:07 +02:00
Miklos Szeredi	ce534fb052	fuse: allow splice to move pages When splicing buffers to the fuse device with SPLICE_F_MOVE, try to move pages from the pipe buffer into the page cache. This allows populating the fuse filesystem's cache without ever touching the page contents, i.e. zero copy read capability. The following steps are performed when trying to move a page into the page cache: - buf->ops->confirm() to make sure the new page is uptodate - buf->ops->steal() to try to remove the new page from it's previous place - remove_from_page_cache() on the old page - add_to_page_cache_locked() on the new page If any of the above steps fail (non fatally) then the code falls back to copying the page. In particular ->steal() will fail if there are external references (other than the page cache and the pipe buffer) to the page. Also since the remove_from_page_cache() + add_to_page_cache_locked() are non-atomic it is possible that the page cache is repopulated in between the two and add_to_page_cache_locked() will fail. This could be fixed by creating a new atomic replace_page_cache_page() function. fuse_readpages_end() needed to be reworked so it works even if page->mapping is NULL for some or all pages which can happen if the add_to_page_cache_locked() failed. A number of sanity checks were added to make sure the stolen pages don't have weird flags set, etc... These could be moved into generic splice/steal code. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:07 +02:00
Miklos Szeredi	dd3bb14f44	fuse: support splice() writing to fuse device Allow userspace filesystem implementation to use splice() to write to the fuse device. The semantics of using splice() are: 1) buffer the message header and data in a temporary pipe 2) with a single splice() call move the message from the temporary pipe to the fuse device The READ reply message has the most interesting use for this, since now the data from an arbitrary file descriptor (which could be a regular file, a block device or a socket) can be tranferred into the fuse device without having to go through a userspace buffer. It will also allow zero copy moving of pages. One caveat is that the protocol on the fuse device requires the length of the whole message to be written into the header. But the length of the data transferred into the temporary pipe may not be known in advance. The current library implementation works around this by using vmplice to write the header and modifying the header after splicing the data into the pipe (error handling omitted): struct fuse_out_header out; iov.iov_base = &out; iov.iov_len = sizeof(struct fuse_out_header); vmsplice(pip[1], &iov, 1, 0); len = splice(input_fd, input_offset, pip[1], NULL, len, 0); /* retrospectively modify the header: */ out.len = len + sizeof(struct fuse_out_header); splice(pip[0], NULL, fuse_chan_fd(req->ch), NULL, out.len, flags); This works since vmsplice only saves a pointer to the data, it does not copy the data itself. Since pipes are currently limited to 16 pages and messages need to be spliced atomically, the length of the data is limited to 15 pages (or 60kB for 4k pages). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:06 +02:00
Miklos Szeredi	b5dd328537	fuse: get page reference for readpages Acquire a page ref on pages in ->readpages() and release them when the read has finished. Not acquiring a reference didn't seem to cause any trouble since the page is locked and will not be kicked out of the page cache during the read. However the following patches will want to remove the page from the cache so a separate ref is needed. Making the reference in req->pages explicit also makes the code easier to understand. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:06 +02:00
Miklos Szeredi	1bf94ca73e	fuse: use get_user_pages_fast() Replace uses of get_user_pages() with get_user_pages_fast(). It looks nicer and should be faster in most cases. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:06 +02:00
Dan Carpenter	4aa0edd294	fuse: remove unneeded variable "map" isn't needed any more after: `0bd87182d3` "fuse: fix kunmap in fuse_ioctl_copy_user" Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:05 +02:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Jiri Kosina	318ae2edc3	Merge branch 'for-next' into for-linus Conflicts: Documentation/filesystems/proc.txt arch/arm/mach-u300/include/mach/debug-macro.S drivers/net/qlge/qlge_ethtool.c drivers/net/qlge/qlge_main.c drivers/net/typhoon.c	2010-03-08 16:55:37 +01:00
Linus Torvalds	60f8a8d4c6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix large stack use fuse: cleanup in fuse_notify_inval_...()	2010-03-03 08:08:21 -08:00
Daniel Mack	3ad2f3fbb9	tree-wide: Assorted spelling fixes In particular, several occurances of funny versions of 'success', 'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address', 'beginning', 'desirable', 'separate' and 'necessary' are fixed. Signed-off-by: Daniel Mack <daniel@caiaq.de> Cc: Joe Perches <joe@perches.com> Cc: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-02-09 11:13:56 +01:00
Fang Wenqi	b2d82ee3c8	fuse: fix large stack use gcc 4.4 warns about: fs/fuse/dev.c: In function ‘fuse_notify_inval_entry’: fs/fuse/dev.c:925: warning: the frame size of 1060 bytes is larger than 1024 bytes The problem is we declare two structures and a large array on the stack, I move the array alway from the stack and allocate memory for it dynamically. Signed-off-by: Fang Wenqi <antonf@turbolinux.com.cn> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-02-05 12:08:31 +01:00
Miklos Szeredi	b21dda438b	fuse: cleanup in fuse_notify_inval_...() Small cleanup in fuse_notify_inval_inode() and fuse_notify_inval_entry(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-02-05 12:08:31 +01:00
anfei zhou	931e80e4b3	mm: flush dcache before writing into page to avoid alias The cache alias problem will happen if the changes of user shared mapping is not flushed before copying, then user and kernel mapping may be mapped into two different cache line, it is impossible to guarantee the coherence after iov_iter_copy_from_user_atomic. So the right steps should be: flush_dcache_page(page); kmap_atomic(page); write to page; kunmap_atomic(page); flush_dcache_page(page); More precisely, we might create two new APIs flush_dcache_user_page and flush_dcache_kern_page to replace the two flush_dcache_page accordingly. Here is a snippet tested on omap2430 with VIPT cache, and I think it is not ARM-specific: int val = 0x11111111; fd = open("abc", O_RDWR); addr = mmap(NULL, 4096, PROT_READ\|PROT_WRITE, MAP_SHARED, fd, 0); (addr+0) = 0x44444444; tmp = (addr+0); *(addr+1) = 0x77777777; write(fd, &val, sizeof(int)); close(fd); The results are not always 0x11111111 0x77777777 at the beginning as expected. Sometimes we see 0x44444444 0x77777777. Signed-off-by: Anfei <anfei.zhou@gmail.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: <linux-arch@vger.kernel.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-02-02 18:11:21 -08:00
Csaba Henk	1b7323965a	fuse: reject O_DIRECT flag also in fuse_create The comment in fuse_open about O_DIRECT: "VFS checks this, but only _after_ ->open()" also holds for fuse_create, however, the same kind of check was missing there. As an impact of this bug, open(newfile, O_RDWR\|O_CREAT\|O_DIRECT) fails, but a stub newfile will remain if the fuse server handled the implied FUSE_CREATE request appropriately. Other impact: in the above situation ima_file_free() will complain to open/free imbalance if CONFIG_IMA is set. Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Harshavardhana <harsha@gluster.com> Cc: stable@kernel.org	2009-11-27 16:37:13 +01:00
Miklos Szeredi	5219f346b0	fuse: invalidate target of rename Invalidate the target's attributes, which may have changed (such as nlink, change time) so that they are refreshed on the next getattr(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-11-04 10:24:52 +01:00
Jens Axboe	0bd87182d3	fuse: fix kunmap in fuse_ioctl_copy_user Looks like another victim of the confusing kmap() vs kmap_atomic() API differences. Reported-by: Todor Gyumyushev <yodor1@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: stable@kernel.org	2009-11-04 10:24:51 +01:00
Anand V. Avati	f60311d5f7	fuse: prevent fuse_put_request on invalid pointer fuse_direct_io() has a loop where requests are allocated in each iteration. if allocation fails, the loop is broken out and follows into an unconditional fuse_put_request() on that invalid pointer. Signed-off-by: Anand V. Avati <avati@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@kernel.org	2009-11-04 10:24:50 +01:00
Alexey Dobriyan	f0f37e2f77	const: mark struct vm_struct_operations * mark struct vm_area_struct::vm_ops as const * mark vm_ops in AGP code But leave TTM code alone, something is fishy there with global vm_ops being used. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-27 11:39:25 -07:00
npiggin@suse.de	c08d3b0e33	truncate: use new helpers Update some fs code to make use of new helper functions introduced in the previous patch. Should be no significant change in behaviour (except CIFS now calls send_sig under i_lock, via inode_newsize_ok). Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Miklos Szeredi <miklos@szeredi.hu> Cc: linux-nfs@vger.kernel.org Cc: Trond.Myklebust@netapp.com Cc: linux-cifs-client@lists.samba.org Cc: sfrench@samba.org Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 08:41:47 -04:00
Linus Torvalds	9eead2a811	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: add fusectl interface to max_background fuse: limit user-specified values of max background requests fuse: use drop_nlink() instead of direct nlink manipulation fuse: document protocol version negotiation fuse: make the number of max background requests and congestion threshold tunable	2009-09-18 09:23:03 -07:00
Jens Axboe	32a88aa1b6	fs: Assign bdi in super_block We do this automatically in get_sb_bdev() from the set_bdev_super() callback. Filesystems that have their own private backing_dev_info must assign that in ->fill_super(). Note that ->s_bdi assignment is required for proper writeback! Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-16 15:18:51 +02:00
Csaba Henk	79a9d99434	fuse: add fusectl interface to max_background Make the max_background and congestion_threshold parameters of a FUSE mount tunable at runtime by adding the respective knobs to its directory within the fusectl filesystem. Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-09-16 14:15:29 +02:00
Csaba Henk	487ea5af63	fuse: limit user-specified values of max background requests An untrusted user could DoS the system if s/he were allowed to accumulate an arbitrary number of pending background requests by setting the above limits to extremely high values in INIT. This patch excludes this possibility by imposing global upper limits on the possible values of per-mount "max background requests" and "congestion threshold" parameters for unprivileged FUSE filesystems. These global limits are implemented as module parameters. Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-09-16 14:15:29 +02:00
Csaba Henk	d6db07ded5	fuse: use drop_nlink() instead of direct nlink manipulation drop_nlink() is the API function to decrease the link count of an inode. However, at a place the control filesystem used the decrement operator on i_nlink directly. Fix this. Cc: Anand Avati <avati@gluster.com> Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-09-16 14:15:28 +02:00
Jens Axboe	d993831fa7	writeback: add name to backing_dev_info This enables us to track who does what and print info. Its main use is catching dirty inodes on the default_backing_dev_info, so we can fix that up. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 09:20:26 +02:00
Linus Torvalds	81e4e1ba7e	Revert "fuse: Fix build error" as unnecessary This reverts commit `097041e576`. Trond had a better fix, which is the parent of this one ("Fix compile error due to congestion_wait() changes") Requested-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-11 11:22:34 -07:00
Larry Finger	097041e576	fuse: Fix build error When building v2.6.31-rc2-344-g69ca06c, the following build errors are found due to missing includes: CC [M] fs/fuse/dev.o fs/fuse/dev.c: In function ‘request_end’: fs/fuse/dev.c:289: error: ‘BLK_RW_SYNC’ undeclared (first use in this function) ... fs/nfs/write.c: In function ‘nfs_set_page_writeback’: fs/nfs/write.c:207: error: ‘BLK_RW_ASYNC’ undeclared (first use in this function) Signed-off-by: Larry Finger@lwfinger.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-10 19:09:46 -07:00
Jens Axboe	8aa7e847d8	Fix congestion_wait() sync/async vs read/write confusion Commit `1faa16d228` accidentally broke the bdi congestion wait queue logic, causing us to wait on congestion for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-10 20:31:53 +02:00
Csaba Henk	7a6d3c8b30	fuse: make the number of max background requests and congestion threshold tunable The practical values for these limits depend on the design of the filesystem server so let userspace set them at initialization time. Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-07-07 17:28:52 +02:00
John Muir	3b463ae0c6	fuse: invalidation reverse calls Add notification messages that allow the filesystem to invalidate VFS caches. Two notifications are added: 1) inode invalidation - invalidate cached attributes - invalidate a range of pages in the page cache (this is optional) 2) dentry invalidation - try to invalidate a subtree in the dentry cache Care must be taken while accessing the 'struct super_block' for the mount, as it can go away while an invalidation is in progress. To prevent this, introduce a rw-semaphore, that is taken for read during the invalidation and taken for write in the ->kill_sb callback. Cc: Csaba Henk <csaba@gluster.com> Cc: Anand Avati <avati@zresearch.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-06-30 20:12:24 +02:00
Miklos Szeredi	e0a43ddcc0	fuse: allow umask processing in userspace This patch lets filesystems handle masking the file mode on creation. This is needed if filesystem is using ACLs. - The CREATE, MKDIR and MKNOD requests are extended with a "umask" parameter. - A new FUSE_DONT_MASK flag is added to the INIT request/reply. With this the filesystem may request that the create mode is not masked. CC: Jean-Pierre André <jean-pierre.andre@wanadoo.fr> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-06-30 20:12:23 +02:00
Miklos Szeredi	201fa69a28	fuse: fix bad return value in fuse_file_poll() Fix fuse_file_poll() which returned a -errno value instead of a poll mask. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-06-30 20:06:24 +02:00
Csaba Henk	b4c458b3a2	fuse: fix return value of fuse_dev_write() On 64 bit systems -- where sizeof(ssize_t) > sizeof(int) -- the following test exposes a bug due to a non-careful return of an int or unsigned value: implement a FUSE filesystem which sends an unsolicited notification to the kernel with invalid opcode. The respective write to /dev/fuse will return (1 << 32) - EINVAL with errno == 0 instead of -1 with errno == EINVAL. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-06-30 20:06:23 +02:00
Al Viro	66c6af2e8b	fuse doesn't need BKL in ->umount_begin() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-17 00:36:36 -04:00
Linus Torvalds	c34752bc8b	Merge branch 'cuse' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'cuse' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: CUSE: implement CUSE - Character device in Userspace fuse: export symbols to be used by CUSE fuse: update fuse_conn_init() and separate out fuse_conn_kill() fuse: don't use inode in fuse_file_poll fuse: don't use inode in fuse_do_ioctl() helper fuse: don't use inode in fuse_sync_release() fuse: create fuse_do_open() helper for CUSE fuse: clean up args in fuse_finish_open() and fuse_release_fill() fuse: don't use inode in helpers called by fuse_direct_io() fuse: add members to struct fuse_file fuse: prepare fuse_direct_io() for CUSE fuse: clean up fuse_write_fill() fuse: use struct path in release structure fuse: misc cleanups	2009-06-12 09:31:20 -07:00
Tejun Heo	151060ac13	CUSE: implement CUSE - Character device in Userspace CUSE enables implementing character devices in userspace. With recent additions of ioctl and poll support, FUSE already has most of what's necessary to implement character devices. All CUSE has to do is bonding all those components - FUSE, chardev and the driver model - nicely. When client opens /dev/cuse, kernel starts conversation with CUSE_INIT. The client tells CUSE which device it wants to create. As the previous patch made fuse_file usable without associated fuse_inode, CUSE doesn't create super block or inodes. It attaches fuse_file to cdev file->private_data during open and set ff->fi to NULL. The rest of the operation is almost identical to FUSE direct IO case. Each CUSE device has a corresponding directory /sys/class/cuse/DEVNAME (which is symlink to /sys/devices/virtual/class/DEVNAME if SYSFS_DEPRECATED is turned off) which hosts "waiting" and "abort" among other things. Those two files have the same meaning as the FUSE control files. The only notable lacking feature compared to in-kernel implementation is mmap support. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-06-09 11:24:11 +02:00
Linus Torvalds	a6aeeebf51	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: destroy bdi on error	2009-05-13 16:32:57 -07:00
Alessio Igor Bogani	67e55205ec	vfs: umount_begin BKL pushdown Push BKL down into ->umount_begin() Signed-off-by: Alessio Igor Bogani <abogani@texware.it> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-05-09 10:49:38 -04:00
Tejun Heo	08cbf542bf	fuse: export symbols to be used by CUSE Export the following symbols for CUSE. fuse_conn_put() fuse_conn_get() fuse_conn_kill() fuse_send_init() fuse_do_open() fuse_sync_release() fuse_direct_io() fuse_do_ioctl() fuse_file_poll() fuse_request_alloc() fuse_get_req() fuse_put_request() fuse_request_send() fuse_abort_conn() fuse_dev_release() fuse_dev_operations Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:42 +02:00
Tejun Heo	a325f9b922	fuse: update fuse_conn_init() and separate out fuse_conn_kill() Update fuse_conn_init() such that it doesn't take @sb and move bdi registration into a separate function. Also separate out fuse_conn_kill() from fuse_put_super(). These will be used to implement cuse. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:41 +02:00
Miklos Szeredi	797759aaf3	fuse: don't use inode in fuse_file_poll Use ff->fc and ff->nodeid instead of file->f_dentry->d_inode in the fuse_file_poll() implementation. This prepares this function for use by CUSE, where the inode is not owned by a fuse filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:41 +02:00
Miklos Szeredi	d36f248710	fuse: don't use inode in fuse_do_ioctl() helper Create a helper for sending an IOCTL request that doesn't use a struct inode. This prepares this function for use by CUSE, where the inode is not owned by a fuse filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:39 +02:00
Miklos Szeredi	8b0797a498	fuse: don't use inode in fuse_sync_release() Make fuse_sync_release() a generic helper function that doesn't need a struct inode pointer. This makes it suitable for use by CUSE. Change return value of fuse_release_common() from int to void. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:39 +02:00
Miklos Szeredi	91fe96b403	fuse: create fuse_do_open() helper for CUSE Create a helper for sending an OPEN request that doesn't need a struct inode pointer. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:37 +02:00
Miklos Szeredi	c7b7143c63	fuse: clean up args in fuse_finish_open() and fuse_release_fill() Move setting ff->fh, ff->nodeid and file->private_data outside fuse_finish_open(). Add ->open_flags member to struct fuse_file. This simplifies the argument passing to fuse_finish_open() and fuse_release_fill(), and paves the way for creating an open helper that doesn't need an inode pointer. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:37 +02:00
Miklos Szeredi	2106cb1893	fuse: don't use inode in helpers called by fuse_direct_io() Use ff->fc and ff->nodeid instead of passing down the inode. This prepares this function for use by CUSE, where the inode is not owned by a fuse filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:37 +02:00
Miklos Szeredi	da5e471457	fuse: add members to struct fuse_file Add new members ->fc and ->nodeid to struct fuse_file. This will aid in converting functions for use by CUSE, where the inode is not owned by a fuse filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:36 +02:00
Miklos Szeredi	d09cb9d7f6	fuse: prepare fuse_direct_io() for CUSE Move code operating on the inode out from fuse_direct_io(). This prepares this function for use by CUSE, where the inode is not owned by a fuse filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:36 +02:00
Miklos Szeredi	2d698b0703	fuse: clean up fuse_write_fill() Move out code from fuse_write_fill() which is not common to all callers. Remove two function arguments which become unnecessary. Also remove unnecessary memset(), the request is already initialized to zero. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:36 +02:00
Miklos Szeredi	b0be46ebf7	fuse: use struct path in release structure Use struct path instead of separate dentry and vfsmount in req->misc.release. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:36 +02:00
Miklos Szeredi	fd9db72977	fuse: destroy bdi on error Destroy bdi on error in fuse_fill_super(). This was an omission from commit `26c3679101` "fuse: destroy bdi on umount", which moved the bdi_destroy() call from fuse_conn_put() to fuse_put_super(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-04-28 16:56:35 +02:00
Tejun Heo	6b2db28a7a	fuse: misc cleanups * fuse_file_alloc() was structured in weird way. The success path was split between else block and code following the block. Restructure the code such that it's easier to read and modify. * Unindent success path of fuse_release_common() to ease future changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:35 +02:00
Miklos Szeredi	3121bfe763	fuse: fix "direct_io" private mmap MAP_PRIVATE mmap could return stale data from the cache for "direct_io" files. Fix this by flushing the cache on mmap. Found with a slightly modified fsx-linux. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-09 17:37:53 +02:00
Miklos Szeredi	ce60a2f157	fuse: fix argument type in fuse_get_user_pages() Fix the following warning: fs/fuse/file.c: In function 'fuse_direct_io': fs/fuse/file.c:1002: warning: passing argument 3 of 'fuse_get_user_pages' from incompatible pointer type This was introduced by commit `f4975c67` "fuse: allow kernel to access "direct_io" files". Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-09 17:37:52 +02:00
Miklos Szeredi	fc280c9692	fuse: allow private mappings of "direct_io" files Allow MAP_PRIVATE mmaps of "direct_io" files. This is necessary for execute support. MAP_SHARED mappings require some sort of coherency between the underlying file and the mapping. With "direct_io" it is difficult to provide this, so for the moment just disallow shared (read-write and read-only) mappings altogether. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-02 14:25:35 +02:00
Miklos Szeredi	f4975c67dd	fuse: allow kernel to access "direct_io" files Allow the kernel read and write on "direct_io" files. This is necessary for nfs export and execute support. The implementation is simple: if an access from the kernel is detected, don't perform get_user_pages(), just use the kernel address provided by the requester to copy from/to the userspace filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-02 14:25:34 +02:00
Nick Piggin	c2ec175c39	mm: page_mkwrite change prototype to match fault Change the page_mkwrite prototype to take a struct vm_fault, and return VM_FAULT_xxx flags. There should be no functional change. This makes it possible to return much more detailed error information to the VM (and also can provide more information eg. virtual_address to the driver, which might be important in some special cases). This is required for a subsequent fix. And will also make it easier to merge page_mkwrite() with fault() in future. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Artem Bityutskiy <dedekind@infradead.org> Cc: Felix Blyakher <felixb@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-01 08:59:14 -07:00
Dan Carpenter	5291658d87	fuse: fix fuse_file_lseek returning with lock held This bug was found with smatch (http://repo.or.cz/w/smatch.git/). If we return directly the inode->i_mutex lock doesn't get released. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-03-30 17:26:24 +02:00
Al Viro	4269590a72	constify dentry_operations: FUSE Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-03-27 14:44:01 -04:00
Linus Torvalds	a1c70a756f	Merge branch 'Kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/misc * 'Kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/misc: (36 commits) fs/Kconfig: move 9p out fs/Kconfig: move afs out fs/Kconfig: move coda out fs/Kconfig: move the rest of ncpfs out fs/Kconfig: move smbfs out fs/Kconfig: move sunrpc out fs/Kconfig: move nfsd out fs/Kconfig: move nfs out fs/Kconfig: move ufs out fs/Kconfig: move sysv out fs/Kconfig: move romfs out fs/Kconfig: move qnx4 out fs/Kconfig: move hpfs out fs/Kconfig: move omfs out fs/Kconfig: move minix out fs/Kconfig: move vxfs out fs/Kconfig: move squashfs out fs/Kconfig: move cramfs out fs/Kconfig: move efs out fs/Kconfig: move bfs out ...	2009-01-26 10:08:50 -08:00
Miklos Szeredi	f6d47a1761	fuse: fix poll notify Move fuse_copy_finish() to before calling fuse_notify_poll_wakeup(). This is not a big issue because fuse_notify_poll_wakeup() should be atomic, but it's cleaner this way, and later uses of notification will need to be able to finish the copying before performing some actions. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-01-26 15:00:59 +01:00
Miklos Szeredi	26c3679101	fuse: destroy bdi on umount If a fuse filesystem is unmounted but the device file descriptor remains open and a new mount reuses the old device number, then the mount fails with EEXIST and the following warning is printed in the kernel log: WARNING: at fs/sysfs/dir.c:462 sysfs_add_one+0x35/0x3d() sysfs: duplicate filename '0:15' can not be created The cause is that the bdi belonging to the fuse filesystem was destoryed only after the device file was released. Fix this by calling bdi_destroy() from fuse_put_super() instead. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-01-26 15:00:59 +01:00
Miklos Szeredi	c2b8f00690	fuse: fuse_fill_super error handling cleanup Clean up error handling for the whole of fuse_fill_super() function. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-01-26 15:00:58 +01:00
Miklos Szeredi	3ddf1e7f57	fuse: fix missing fput on error Fix the leaking file reference if allocation or initialization of fuse_conn failed. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-01-26 15:00:58 +01:00
Dan Carpenter	bb875b38dc	fuse: fix NULL deref in fuse_file_alloc() ff is set to NULL and then dereferenced on line 65. Compile tested only. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-01-26 15:00:58 +01:00
Alexey Dobriyan	3ef7784e47	fs/Kconfig: move fuse out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:15:55 +03:00
Linus Torvalds	5fec8bdbf9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: clean up annotations of fc->lock fuse: fix sparse warning in ioctl fuse: update interface version fuse: add fuse_conn->release() fuse: separate out fuse_conn_init() from new_conn() fuse: add fuse_ prefix to several functions fuse: implement poll support fuse: implement unsolicited notification fuse: add file kernel handle fuse: implement ioctl support fuse: don't let fuse_req->end() put the base reference fuse: move FUSE_MINOR to miscdevice.h fuse: style fixes	2009-01-06 17:01:20 -08:00
Nick Piggin	54566b2c15	fs: symlink write_begin allocation context fix With the write_begin/write_end aops, page_symlink was broken because it could no longer pass a GFP_NOFS type mask into the point where the allocations happened. They are done in write_begin, which would always assume that the filesystem can be entered from reclaim. This bug could cause filesystem deadlocks. The funny thing with having a gfp_t mask there is that it doesn't really allow the caller to arbitrarily tinker with the context in which it can be called. It couldn't ever be GFP_ATOMIC, for example, because it needs to take the page lock. The only thing any callers care about is __GFP_FS anyway, so turn that into a single flag. Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on this flag in their write_begin function. Change __grab_cache_page to accept a nofs argument as well, to honour that flag (while we're there, change the name to grab_cache_page_write_begin which is more instructive and does away with random leading underscores). This is really a more flexible way to go in the end anyway -- if a filesystem happens to want any extra allocations aside from the pagecache ones in ints write_begin function, it may now use GFP_KERNEL (rather than GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a random example). [kosaki.motohiro@jp.fujitsu.com: fix ubifs] [kosaki.motohiro@jp.fujitsu.com: fix fuse] Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@kernel.org> [2.6.28.x] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Cleaned up the calling convention: just pass in the AOP flags untouched to the grab_cache_page_write_begin() function. That just simplifies everybody, and may even allow future expansion of the logic. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-04 13:33:20 -08:00
Harvey Harrison	5d9ec854bf	fuse: clean up annotations of fc->lock Makes the existing annotations match the more common one per line style and adds a few missing annotations. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-12-02 14:49:42 +01:00
Miklos Szeredi	c9f0523d88	fuse: fix sparse warning in ioctl Fix sparse warning: CHECK fs/fuse/file.c fs/fuse/file.c:1615:17: warning: incorrect type in assignment (different address spaces) fs/fuse/file.c:1615:17: expected void [noderef] <asn:1>iov_base fs/fuse/file.c:1615:17: got void <noident> This was introduced by "fuse: implement ioctl support". Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-12-02 14:49:42 +01:00
Tejun Heo	43901aabd7	fuse: add fuse_conn->release() Add fuse_conn->release() so that fuse_conn can be embedded in other structures. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:56 +01:00
Tejun Heo	0d179aa592	fuse: separate out fuse_conn_init() from new_conn() Separate out fuse_conn_init() from new_conn() and while at it initialize fuse_conn->entry during conn initialization. This will be used by CUSE. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	b93f858ab2	fuse: add fuse_ prefix to several functions Add fuse_ prefix to request_send*() and get_root_inode() as some of those functions will be exported for CUSE. With or without CUSE export, having the function names scoped is a good idea for debuggability. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	95668a69a4	fuse: implement poll support Implement poll support. Polled files are indexed using kh in a RB tree rooted at fuse_conn->polled_files. Client should send FUSE_NOTIFY_POLL notification once after processing FUSE_POLL which has FUSE_POLL_SCHEDULE_NOTIFY set. Sending notification unconditionally after the latest poll or everytime file content might have changed is inefficient but won't cause malfunction. fuse_file_poll() can sleep and requires patches from the following thread which allows f_op->poll() to sleep. http://thread.gmane.org/gmane.linux.kernel/726176 Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	8599396b50	fuse: implement unsolicited notification Clients always used to write only in response to read requests. To implement poll efficiently, clients should be able to issue unsolicited notifications. This patch implements basic notification support. Zero fuse_out_header.unique is now accepted and considered unsolicited notification and the error field contains notification code. This patch doesn't implement any actual notification. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	acf99433d9	fuse: add file kernel handle The file handle, fuse_file->fh, is opaque value supplied by userland FUSE server and uniqueness is not guaranteed. Add file kernel handle, fuse_file->kh, which is allocated by the kernel on file allocation and guaranteed to be unique. This will be used by poll to match notification to the respective file but can be used for other purposes where unique file handle is necessary. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	59efec7b90	fuse: implement ioctl support Generic ioctl support is tricky to implement because only the ioctl implementation itself knows which memory regions need to be read and/or written. To support this, fuse client can request retry of ioctl specifying memory regions to read and write. Deep copying (nested pointers) can be implemented by retrying multiple times resolving one depth of dereference at a time. For security and cleanliness considerations, ioctl implementation has restricted mode where the kernel determines data transfer directions and sizes using the _IOC_*() macros on the ioctl command. In this mode, retry is not allowed. For all FUSE servers, restricted mode is enforced. Unrestricted ioctl will be used by CUSE. Plese read the comment on top of fs/fuse/file.c::fuse_file_do_ioctl() for more information. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	e9bb09dd6c	fuse: don't let fuse_req->end() put the base reference fuse_req->end() was supposed to be put the base reference but there's no reason why it should. It only makes things more complex. Move it out of ->end() and make it the responsibility of request_end(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:54 +01:00
Miklos Szeredi	1729a16c2c	fuse: style fixes Fix coding style errors reported by checkpatch and others. Uptdate copyright date to 2008. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:54 +01:00
David Howells	c69e8d9c01	CRED: Use RCU to access another task's creds and to release a task's own creds Use RCU to access another task's creds and to release a task's own creds. This means that it will be possible for the credentials of a task to be replaced without another task (a) requiring a full lock to read them, and (b) seeing deallocated memory. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:19 +11:00
David Howells	b6dff3ec5e	CRED: Separate task security context from task_struct Separate the task security context from task_struct. At this point, the security data is temporarily embedded in the task_struct with two pointers pointing to it. Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in entry.S via asm-offsets. With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:16 +11:00
David Howells	2186a71cbc	CRED: Wrap task credential accesses in the FUSE filesystem Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(\|e\|s\|fs)[ug]id to current_(\|e\|s\|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: fuse-devel@lists.sourceforge.net Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:38:53 +11:00
Al Viro	233e70f422	saner FASYNC handling on file close As it is, all instances of ->release() for files that have ->fasync() need to remember to evict file from fasync lists; forgetting that creates a hole and we actually have a bunch that does forget. So let's keep our lives simple - let __fput() check FASYNC in file->f_flags and call ->fasync() there if it's been set. And lose that crap in ->release() instances - leaving it there is still valid, but we don't have to bother anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-01 09:49:46 -07:00
Christoph Hellwig	440037287c	[PATCH] switch all filesystems over to d_obtain_alias Switch all users of d_alloc_anon to d_obtain_alias. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-23 05:13:01 -04:00
Tejun Heo	a7c1b990f7	fuse: implement nonseekable open Let the client request nonseekable open using FOPEN_NONSEEKABLE and call nonseekable_open() on the file if requested. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-10-16 16:08:57 +02:00
Tejun Heo	29d434b39c	fuse: add include protectors Add include protectors to include/linux/fuse.h and fs/fuse/fuse_i.h. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-10-16 16:08:57 +02:00
Julia Lawall	17e18ab6ff	fuse: add missing fuse_request_free The error handling code for the second call to fuse_request_alloc should include freeing the result of the first one. This bug was found by the Coccinelle project: http://www.emn.fr/x-info/coccinelle/ Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-10-16 16:08:56 +02:00
Miklos Szeredi	769415c611	fuse: fix SEEK_END incorrectness Update file size before using it in lseek(..., SEEK_END). Reported-by: Amnon Shiloh <u3557@miso.sublimeip.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-10-16 16:08:56 +02:00
Steven Whitehouse	a447c09324	vfs: Use const for kernel parser table This is a much better version of a previous patch to make the parser tables constant. Rather than changing the typedef, we put the "const" in all the various places where its required, allowing the __initconst exception for nfsroot which was the cause of the previous trouble. This was posted for review some time ago and I believe its been in -mm since then. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Alexander Viro <aviro@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-13 10:10:37 -07:00
Al Viro	a110343f0d	[PATCH] fix MAY_CHDIR/MAY_ACCESS/LOOKUP_ACCESS mess * MAY_CHDIR is redundant - it's an equivalent of MAY_ACCESS * MAY_ACCESS on fuse should affect only the last step of pathname resolution * fchdir() and chroot() should pass MAY_ACCESS, for the same reason why chdir() needs that. * now that we pass MAY_ACCESS explicitly in all cases, LOOKUP_ACCESS can be removed; it has no business being in nameidata. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:21 -04:00
Miklos Szeredi	2f1936b877	[patch 3/5] vfs: change remove_suid() to file_remove_suid() All calls to remove_suid() are made with a file pointer, because (similarly to file_update_time) it is called when the file is written. Clean up callers by passing in a file instead of a dentry. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-07-26 20:53:16 -04:00
Al Viro	e6305c43ed	[PATCH] sanitize ->permission() prototype * kill nameidata * argument; map the 3 bits in ->flags anybody cares about to new MAY_... ones and pass with the mask. * kill redundant gfs2_iop_permission() * sanitize ecryptfs_permission() * fix remaining places where ->permission() instances might barf on new MAY_... found in mask. The obvious next target in that direction is permission(9) folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:14 -04:00
Alexey Dobriyan	51cc50685a	SL*B: drop kmem cache argument from constructor Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:07 -07:00
Miklos Szeredi	48e90761b5	fuse: lockd support If fuse filesystem doesn't define it's own lock operations, then allow the lock manager to work with fuse. Adding lockd support for remote locking is also possible, but more rarely used, so leave it till later. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Matthew Wilcox <matthew@wil.cx> Cc: David Teigland <teigland@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	33670fa296	fuse: nfs export special lookups Implement the get_parent export operation by sending a LOOKUP request with ".." as the name. Implement looking up an inode by node ID after it has been evicted from the cache. This is done by seding a LOOKUP request with "." as the name (for all file types, not just directories). The filesystem can set the FUSE_EXPORT_SUPPORT flag in the INIT reply, to indicate that it supports these special lookups. Thanks to John Muir for the original implementation of this feature. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Matthew Wilcox <matthew@wil.cx> Cc: David Teigland <teigland@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	c180eebe13	fuse: add fuse_lookup_name() helper Add a new helper function which sends a LOOKUP request with the supplied name. This will be used by the next patch to send special LOOKUP requests with "." and ".." as the name. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	dbd561d236	fuse: add export operations Implement export_operations, to allow fuse filesystems to be exported to NFS. This feature has been in the out-of-tree fuse module, and is widely used and tested. It has not been originally merged into mainline, because doing the NFS export in userspace was thought to be a cleaner and more efficient way of doing it, than through the kernel. While that is true, it would also have involved a lot of duplicated effort at reimplementing NFS exporting (all the different versions of the protocol). This effort was unfortunately not undertaken by anyone, so we are left with doing it the easy but less efficient way. If this feature goes in, the out-of-tree fuse module can go away, which would have several advantages: - not having to maintain two versions - less confusion for users - no bugs due to kernel API changes Comment from hch: - Use the same fh_type values as XFS, since we use the same fh encoding. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	0de6256daa	fuse: prepare lookup for nfs export Use d_splice_alias() instead of d_add() in fuse lookup code, to allow NFS exporting. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	f948d56435	fuse: fix thinko in max I/O size calucation Use max not min to enforce a lower limit on the max I/O size. This bug was introduced by "fuse: fix max i/o size calculation" (commit `e5d9a0df07`). Thanks to Brian Wang for noticing. Reported-by: Brian Wang <ywang221@hotmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Szabolcs Szakacsits <szaka@ntfs-3g.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-06-17 18:08:10 -07:00
Miklos Szeredi	03fb0bce01	fuse: fix bdi naming conflict Fuse allocates a separate bdi for each filesystem, and registers them in sysfs with "MAJOR:MINOR" of sb->s_dev (st_dev). This works fine for anon devices normally used by fuse, but can conflict with an already registered BDI for "fuseblk" filesystems, where sb->s_dev represents a real block device. In particularl this happens if a non-partitioned device is being mounted. Fix by registering with a different name for "fuseblk" filesystems. Thanks to Ioan Ionita for the bug report. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reported-by: Ioan Ionita <opslynx@gmail.com> Tested-by: Ioan Ionita <opslynx@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-24 09:56:07 -07:00
Miklos Szeredi	78bb6cb9a8	fuse: add flag to turn on big writes Prior to 2.6.26 fuse only supported single page write requests. In theory all fuse filesystem should be able support bigger than 4k writes, as there's nothing in the API to prevent it. Unfortunately there's a known case in NTFS-3G where big writes cause filesystem corruption. There could also be other filesystems, where the lack of testing with big write requests would result in bugs. To prevent such problems on a kernel upgrade, disable big writes by default, but let filesystems set a flag to turn it on. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Szabolcs Szakacsits <szaka@ntfs-3g.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:26 -07:00
Harvey Harrison	bd7309677c	fuse: use clamp() rather than nested min/max clamp() exists for this use. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-01 08:04:02 -07:00
Miklos Szeredi	4dbf930ed6	fuse: fix sparse warnings fs/fuse/dev.c:306:2: warning: context imbalance in 'wait_answer_interruptible' - unexpected unlock fs/fuse/dev.c:361:2: warning: context imbalance in 'request_wait_answer' - unexpected unlock fs/fuse/dev.c:1002:4: warning: context imbalance in 'end_io_requests' - unexpected unlock Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	5559b8f4d1	fuse: fix race in llseek Fuse doesn't use i_mutex to protect setting i_size, and so generic_file_llseek() can be racy: it doesn't use i_size_read(). So do a fuse specific llseek method, which does use i_size_read(). [akpm@linux-foundation.org: make `retval' loff_t] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	b48badf013	fuse: fix node ID type Node ID is 64bit but it is passed as unsigned long to some functions. This breakage wasn't noticed, because libfuse uses unsigned long too. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	e5d9a0df07	fuse: fix max i/o size calculation Fix a bug that Werner Baumann reported: fuse can send a bigger write request than the maximum specified. This only affected direct_io operation. In addition set a sane minimum for the max_read and max_write tunables, so I/O always makes some progress. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	5c5c5e51b2	fuse: update file size on short read If the READ request returned a short count, then either - cached size is incorrect - filesystem is buggy, as short reads are only allowed on EOF So assume that the size is wrong and refresh it, so that cached read() doesn't zero fill the missing chunk. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Nick Piggin	ea9b9907b8	fuse: implement perform_write Introduce fuse_perform_write. With fusexmp (a passthrough filesystem), large (1MB) writes into a backing tmpfs filesystem are sped up by almost 4 times (256MB/s vs 71MB/s). [mszeredi@suse.cz]: - split into smaller functions - testing - duplicate generic_file_aio_write(), so that there's no need to add a new ->perform_write() a_op. Comment from hch. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Miklos Szeredi	854512ec35	fuse: clean up setting i_size in write Extract common code for setting i_size in write functions into a common helper. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Miklos Szeredi	3be5a52b30	fuse: support writable mmap Quoting Linus (3 years ago, FUSE inclusion discussions): "User-space filesystems are hard to get right. I'd claim that they are almost impossible, unless you limit them somehow (shared writable mappings are the nastiest part - if you don't have those, you can reasonably limit your problems by limiting the number of dirty pages you accept through normal "write()" calls)." Instead of attempting the impossible, I've just waited for the dirty page accounting infrastructure to materialize (thanks to Peter Zijlstra and others). This nicely solved the biggest problem: limiting the number of pages used for write caching. Some small details remained, however, which this largish patch attempts to address. It provides a page writeback implementation for fuse, which is completely safe against VM related deadlocks. Performance may not be very good for certain usage patterns, but generally it should be acceptable. It has been tested extensively with fsx-linux and bash-shared-mapping. Fuse page writeback design -------------------------- fuse_writepage() allocates a new temporary page with GFP_NOFS\|__GFP_HIGHMEM. It copies the contents of the original page, and queues a WRITE request to the userspace filesystem using this temp page. The writeback is finished instantly from the MM's point of view: the page is removed from the radix trees, and the PageDirty and PageWriteback flags are cleared. For the duration of the actual write, the NR_WRITEBACK_TEMP counter is incremented. The per-bdi writeback count is not decremented until the actual write completes. On dirtying the page, fuse waits for a previous write to finish before proceeding. This makes sure, there can only be one temporary page used at a time for one cached page. This approach is wasteful in both memory and CPU bandwidth, so why is this complication needed? The basic problem is that there can be no guarantee about the time in which the userspace filesystem will complete a write. It may be buggy or even malicious, and fail to complete WRITE requests. We don't want unrelated parts of the system to grind to a halt in such cases. Also a filesystem may need additional resources (particularly memory) to complete a WRITE request. There's a great danger of a deadlock if that allocation may wait for the writepage to finish. Currently there are several cases where the kernel can block on page writeback: - allocation order is larger than PAGE_ALLOC_COSTLY_ORDER - page migration - throttle_vm_writeout (through NR_WRITEBACK) - sync(2) Of course in some cases (fsync, msync) we explicitly want to allow blocking. So for these cases new code has to be added to fuse, since the VM is not tracking writeback pages for us any more. As an extra safetly measure, the maximum dirty ratio allocated to a single fuse filesystem is set to 1% by default. This way one (or several) buggy or malicious fuse filesystems cannot slow down the rest of the system by hogging dirty memory. With appropriate privileges, this limit can be raised through '/sys/class/bdi/<bdi>/max_ratio'. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Miklos Szeredi	b6f2fcbcfc	mm: bdi: expose the BDI object in sysfs for FUSE Register FUSE's backing_dev_info under sysfs with the name "fuse-MAJOR:MINOR" Make the fuse control filesystem use s_dev instead of a fuse specific ID. This makes it easier to match directories under /sys/fs/fuse/connections/ with directories under /sys/class/bdi, and with actual mounts. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:49 -07:00
Al Viro	42faad9965	[PATCH] restore sane ->umount_begin() API Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-25 09:23:25 -04:00
Miklos Szeredi	1a823ac9ff	fuse: fix permission checking I added a nasty local variable shadowing bug to fuse in 2.6.24, with the result, that the 'default_permissions' mount option is basically ignored. How did this happen? - old err declaration in inner scope - new err getting declared in outer scope - 'return err' from inner scope getting removed - old declaration not being noticed -Wshadow would have saved us, but it doesn't seem practical for the kernel :( More testing would have also saved us :(( Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-23 17:12:13 -08:00
Miklos Szeredi	d1875dbaa5	mount options: fix fuse Add blksize= option to /proc/mounts for fuseblk filesystems. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:40 -08:00
David Howells	fa300b1914	iget: stop FUSE from using iget() and read_inode() Stop the FUSE filesystem from using read_inode(), which it doesn't use anyway. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:28 -08:00
David Howells	e231c2ee64	Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) Convert instances of ERR_PTR(PTR_ERR(p)) to ERR_CAST(p) using: perl -spi -e 's/ERR_PTR[(]PTR_ERR[(](.)[)][)]/ERR_CAST(\1)/' `grep -rl 'ERR_PTR[(]PTR_ERR' fs crypto net security` Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:26 -08:00
Miklos Szeredi	d12def1bcb	fuse: limit queued background requests Libfuse basically creates a new thread for each new request. This is fine for synchronous requests, which are naturally limited. However background requests (especially writepage) can cause a thread creation storm. To avoid this, limit the number of background requests available to userspace. This is done by introducing another queue for background requests, and a counter for the number of "active" requests, which are currently available for userspace. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-06 10:41:13 -08:00
Miklos Szeredi	b57d426445	fuse: save space in struct fuse_req Move the fields 'dentry' and 'vfsmount' into the request specific union, since these are only used for the RELEASE request. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-06 10:41:13 -08:00
Miklos Szeredi	0952b2a4a8	fuse: fix attribute caching after create Invalidate attributes on create, since st_ctime is updated. Reported by Szabolcs Szakacsits. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-06 10:41:13 -08:00
Greg Kroah-Hartman	197b12d679	Kobject: convert fs/* from kobject_unregister() to kobject_put() There is no need for kobject_unregister() anymore, thanks to Kay's kobject cleanup changes, so replace all instances of it with kobject_put(). Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman	00d2666623	kobject: convert main fs kobject to use kobject_create This also renames fs_subsys to fs_kobj to catch all current users with a build error instead of a build warning which can easily be missed. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:13 -08:00
Greg Kroah-Hartman	5c89e17e9c	kobject: convert fuse to use kobject_create We don't need a kset here, a simple kobject will do just fine, so dynamically create the kobject and use it. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:11 -08:00
Greg Kroah-Hartman	3514faca19	kobject: remove struct kobj_type from struct kset We don't need a "default" ktype for a kset. We should set this explicitly every time for each kset. This change is needed so that we can make ksets dynamic, and cleans up one of the odd, undocumented assumption that the kset/kobject/ktype model has. This patch is based on a lot of help from Kay Sievers. Nasty bug in the block code was found by Dave Young <hidave.darkstar@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Miklos Szeredi	08b633070a	fuse: fix attribute caching after rename Invalidate attributes on rename, since some filesystems may update st_ctime. Reported by Szabolcs Szakacsits Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
John Muir	fbee36b92a	fuse: fix uninitialized field in fuse_inode I found problems accessing (executing) previously existing files, until I did chmod on them (or setattr). If the fi->attr_version is not initialized, then it could be larger than fc->attr_version until a setattr is executed, and as a result the inode attributes would never be set. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	d0186b25e6	fuse: fix FUSE_FILE_OPS sending FUSE_FILE_OPS is meant to signal that the kernel will send the open file to to the userspace filesystem for operations on open files, so that sillyrenaming unlinked files becomes unnecessary. However this needs VFS changes, which won't make it into 2.6.24. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	a6643094e7	fuse: pass open flags to read and write Some open flags (O_APPEND, O_DIRECT) can be changed with fcntl(F_SETFL, ...) after open, but fuse currently only sends the flags to userspace in open. To make it possible to correcly handle changing flags, send the current value to userspace in each read and write. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	7dca9fd39f	fuse: cleanup: add fuse_get_attr_version() Extract repeated code into helper function, as suggested by Akpm. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	bcb4be809d	fuse: fix reading past EOF Currently reading a fuse file will stop at cached i_size and return EOF, even though the file might have grown since the attributes were last updated. So detect if trying to read past EOF, and refresh the attributes before continuing with the read. Thanks to mpb for the report. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Adrian Bunk	8744969a81	fuse_file_alloc(): fix NULL dereferences Fix obvious NULL dereferences spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:42 -08:00
Miklos Szeredi	0e9663ee45	fuse: add blksize field to fuse_attr There are cases when the filesystem will be passed the buffer from a single read or write call, namely: 1) in 'direct-io' mode (not O_DIRECT), read/write requests don't go through the page cache, but go directly to the userspace fs 2) currently buffered writes are done with single page requests, but if Nick's ->perform_write() patch goes it, it will be possible to do larger write requests. But only if the original write() was also bigger than a page. In these cases the filesystem might want to give a hint to the app about the optimal I/O size. Allow the userspace filesystem to supply a blksize value to be returned by stat() and friends. If the field is zero, it defaults to the old PAGE_CACHE_SIZE value. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	f33321141b	fuse: add support for mandatory locking For mandatory locking the userspace filesystem needs to know the lock ownership for read, write and truncate operations. This patch adds the necessary fields to the protocol. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	b25e82e567	fuse: add helper for asynchronous writes This patch adds a new helper function fuse_write_fill() which makes it possible to send WRITE requests asynchronously. A new flag for WRITE requests is also added which indicates that this a write from the page cache, and not a "normal" file write. This patch is in preparation for writable mmap support. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	93a8c3cd9e	fuse: add list of writable files to fuse_inode Each WRITE request must carry a valid file descriptor. When a page is written back from a memory mapping, the file through which the page was dirtied is not available, so a new mechananism is needed to find a suitable file in ->writepage(s). A list of fuse_files is added to fuse_inode. The file is removed from the list in fuse_release(). This patch is in preparation for writable mmap support. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	a9ff4f8705	fuse: support BSD locking semantics It is trivial to add support for flock(2) semantics to the existing protocol, by setting the lock owner field to the file pointer, and passing a new FUSE_LK_FLOCK flag with the locking request. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	6ff958edbf	fuse: add atomic open+truncate support This patch allows fuse filesystems to implement open(..., O_TRUNC) as a single request, instead of separate truncate and open requests. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	17637cbaba	fuse: improve utimes support Add two new flags for setattr: FATTR_ATIME_NOW and FATTR_MTIME_NOW. These mean, that atime or mtime should be changed to the current time. Also it is now possible to update atime or mtime individually, not just together. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	49d4914fd7	fuse: clean up open file passing in setattr Clean up supplying open file to the setattr operation. In addition to being a cleanup it prepares for the changes in the way the open file is passed to the setattr method. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	c79e322f63	fuse: add file handle to getattr operation Add necessary protocol changes for supplying a file handle with the getattr operation. Step the API version to 7.9. This patch doesn't actually supply the file handle, because that needs some kind of VFS support, which we haven't yet been able to agree upon. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	1fb69e7817	fuse: fix race between getattr and write Getattr and lookup operations can be running in parallel to attribute changing operations, such as write and setattr. This means, that if for example getattr was slower than a write, the cached size attribute could be set to a stale value. To prevent this race, introduce a per-filesystem attribute version counter. This counter is incremented whenever cached attributes are modified, and the incremented value stored in the inode. Before storing new attributes in the cache, getattr and lookup check, using the version number, whether the attributes have been modified during the request's lifetime. If so, the returned attributes are not cached, because they might be stale. Thanks to Jakub Bogusz for the bug report and test program. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Jakub Bogusz <jakub.bogusz@gemius.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	e57ac68378	fuse: fix allowing operations The following operation didn't check if sending the request was allowed: setattr listxattr statfs Some other operations don't explicitly do the check, but VFS calls ->permission() which checks this. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Miklos Szeredi	e8e961574b	fuse: clean up execute permission checking Define a new function fuse_refresh_attributes() that conditionally refreshes the attributes based on the validity timeout. In fuse_permission() only refresh the attributes for checking the execute bits if necessary. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	c9c9d7df5f	fuse: no ENOENT from fuse device read Don't return -ENOENT for a read() on the fuse device when the request was aborted. Instead return -ENODEV, meaning the filesystem has been force-umounted or aborted. Previously ENOENT meant that the request was interrupted, but now the 'aborted' flag is not set in case of interrupts. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	a131de0a48	fuse: no abort on interrupt Don't set 'aborted' flag on a request if it's interrupted. We have to wait for the answer anyway, and this would only a very little time while copying the reply. This means, that write() on the fuse device will not return -ENOENT during normal operation, only if the filesystem is aborted by a forced umount or through the fusectl interface. This could simplify userspace code somewhat when backward compatibility with earlier kernel versions is not required. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	819c4b3b40	fuse: cleanup in release Move dput/mntput pair from request_end() to fuse_release_end(), because there's no other place they are used. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	ebc14c4dbe	fuse: fix permission checking on sticky directories The VFS checks sticky bits on the parent directory even if the filesystem defines it's own ->permission(). In some situations (sshfs, mountlo, etc) the user does have permission to delete a file even if the attribute based checking would not allow it. So work around this by storing the permission bits separately and returning them in stat(), but cutting the permission bits off from inode->i_mode. This is slightly hackish, but it's probably not worth it to add new infrastructure in VFS and a slight performance penalty for all filesystems, just for the sake of fuse. [Jan Engelhardt] cosmetic fixes Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	244f6385c2	fuse: refresh stale attributes in fuse_permission() fuse_permission() didn't refresh inode attributes before using them, even if the validity has already expired. Thanks to Junjiro Okajima for spotting this. Also remove some old code to unconditionally refresh the attributes on the root inode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	074406fa63	fuse: set i_nlink to sane value after mount Aufs seems to depend on a positive i_nlink value. So fill in a dummy but sane value for the root inode at mount time. The inode attributes are refreshed with the correct values at the first opportunity. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	b10099792b	fuse: fix page invalidation Other than truncate, there are two cases, when fuse tries to get rid of cached pages: a) in open, if KEEP_CACHE flag is not set b) in getattr, if file size changed spontaneously Until now invalidate_mapping_pages() were used, which didn't get rid of mapped pages. This is wrong, and becomes more wrong as dirty pages are introduced. So instead properly invalidate all pages with invalidate_inode_pages2(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Miklos Szeredi	e00d2c2d4a	fuse: truncate on spontaneous size change Memory mappings were only truncated on an explicit truncate, but not when the file size was changed externally. Fix this by moving the truncation code from fuse_setattr to fuse_change_attributes. Yes, there are races between write and and external truncation, but we can't really do anything about them. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Miklos Szeredi	c756e0a4d7	fuse: add reference counting to fuse_file Make lifetime of 'struct fuse_file' independent from 'struct file' by adding a reference counter and destructor. This will enable asynchronous page writeback, where it cannot be guaranteed, that the file is not released while a request with this file handle is being served. The actual RELEASE request is only sent when there are no more references to the fuse_file. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Miklos Szeredi	de5e3dec42	fuse: fix reserved request wake up Use wake_up_all instead of wake_up in put_reserved_req(), otherwise it is possible that the right task is not woken up. Also create a separate reserved_req_waitq in addition to the blocked_waitq, since they fulfill totally separate functions. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Miklos Szeredi	f92b99b9dc	fuse: update backing_dev_info congestion state Set the read and write congestion state if the request queue is close to blocking, and clear it when it's not. This prevents unnecessary blocking in readahead and (when writable mmaps are allowed) writeback. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Christoph Lameter	4ba9b9d0ba	Slab API: remove useless ctor parameter and reorder parameters Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void object, struct kmem_cache s, unsigned long flags) to ctor(struct kmem_cache s, void object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Peter Zijlstra	e0bf68ddec	mm: bdi init hooks provide BDI constructor/destructor hooks [akpm@linux-foundation.org: compile fix] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Nick Piggin	5e6f58a1d7	fuse: convert to new aops [mszeredi] - don't send zero length write requests - it is not legal for the filesystem to return with zero written bytes Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:42:57 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Jens Axboe	5ffc4ef45b	sendfile: remove .sendfile from filesystems that use generic_file_sendfile() They can use generic_file_splice_read() instead. Since sys_sendfile() now prefers that, there should be no change in behaviour. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:13 +02:00
Alexey Dobriyan	edad01e2a1	fuse: ->fs_flags fixlet fs/fuse/inode.c:658:3: error: Initializer entry defined twice fs/fuse/inode.c:661:3: also defined here Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-06-16 13:16:15 -07:00
Miklos Szeredi	ead5f0b5fa	fuse: delete inode on drop When inode is dropped (no more references) delete it from cache. There's not much point in keeping it cached, when a new lookup will refresh the attributes anyway. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-23 20:14:13 -07:00
Miklos Szeredi	889f784831	fuse: generic_write_checks() for direct_io This fixes O_APPEND in direct IO mode. Also checks writes against file size limits, notably rlimits. Reported by Greg Bruno. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-23 20:14:13 -07:00
Miklos Szeredi	b9ba347f27	fuse: fix mknod of regular file The wrong lookup flag was tested in ->create() causing havoc (error or Oops) when a regular file was created with mknod() in a fuse filesystem. Thanks to J. Cameijo Cerdeira for the report. Kernels 2.6.18 onward are affected. Please apply to -stable as well. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-23 20:14:11 -07:00
Alexey Dobriyan	e8edc6e03a	Detach sched.h from mm.h First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-21 09:18:19 -07:00
Christoph Lameter	a35afb830f	Remove SLAB_CTOR_CONSTRUCTOR SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@ucw.cz> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-17 05:23:04 -07:00
Miklos Szeredi	79c0b2df79	add filesystem subtype support There's a slight problem with filesystem type representation in fuse based filesystems. From the kernel's view, there are just two filesystem types: fuse and fuseblk. From the user's view there are lots of different filesystem types. The user is not even much concerned if the filesystem is fuse based or not. So there's a conflict of interest in how this should be represented in fstab, mtab and /proc/mounts. The current scheme is to encode the real filesystem type in the mount source. So an sshfs mount looks like this: sshfs#user@server:/ /mnt/server fuse rw,nosuid,nodev,... This url-ish syntax works OK for sshfs and similar filesystems. However for block device based filesystems (ntfs-3g, zfs) it doesn't work, since the kernel expects the mount source to be a real device name. A possibly better scheme would be to encode the real type in the type field as "type.subtype". So fuse mounts would look like this: /dev/hda1 /mnt/windows fuseblk.ntfs-3g rw,... user@server:/ /mnt/server fuse.sshfs rw,nosuid,nodev,... This patch adds the necessary code to the kernel so that this can be correctly displayed in /proc/mounts. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:01 -07:00
Linus Torvalds	2d56d3c43c	Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux * 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux: gfs2: nfs lock support for gfs2 lockd: add code to handle deferred lock requests lockd: always preallocate block in nlmsvc_lock() lockd: handle test_lock deferrals lockd: pass cookie in nlmsvc_testlock lockd: handle fl_grant callbacks lockd: save lock state on deferral locks: add fl_grant callback for asynchronous lock return nfsd4: Convert NFSv4 to new lock interface locks: add lock cancel command locks: allow {vfs,posix}_lock_file to return conflicting lock locks: factor out generic/filesystem switch from setlock code locks: factor out generic/filesystem switch from test_lock locks: give posix_test_lock same interface as ->lock locks: make ->lock release private data before returning in GETLK case locks: create posix-to-flock helper functions locks: trivial removal of unnecessary parentheses	2007-05-07 12:34:24 -07:00
Christoph Lameter	50953fe9e0	slab allocators: Remove SLAB_DEBUG_INITIAL flag I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by SLAB. I think its purpose was to have a callback after an object has been freed to verify that the state is the constructor state again? The callback is performed before each freeing of an object. I would think that it is much easier to check the object state manually before the free. That also places the check near the code object manipulation of the object. Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was compiled with SLAB debugging on. If there would be code in a constructor handling SLAB_DEBUG_INITIAL then it would have to be conditional on SLAB_DEBUG otherwise it would just be dead code. But there is no such code in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real use of, difficult to understand and there are easier ways to accomplish the same effect (i.e. add debug code before kfree). There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be clear in fs inode caches. Remove the pointless checks (they would even be pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors. This is the last slab flag that SLUB did not support. Remove the check for unimplemented flags from SLUB. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:57 -07:00
Marc Eshel	9d6a8c5c21	locks: give posix_test_lock same interface as ->lock posix_test_lock() and ->lock() do the same job but have gratuitously different interfaces. Modify posix_test_lock() so the two agree, simplifying some code in the process. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>	2007-05-06 17:39:00 -04:00
Greg Kroah-Hartman	823bccfc40	remove "struct subsystem" as it is no longer needed We need to work on cleaning up the relationship between kobjects, ksets and ktypes. The removal of 'struct subsystem' is the first step of this, especially as it is not really needed at all. Thanks to Kay for fixing the bugs in this patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-05-02 18:57:59 -07:00
Timo Savola	a5bfffac64	[PATCH] fuse: validate rootmode mount option If rootmode isn't valid, we hit the BUG() in fuse_init_inode. Now EINVAL is returned. Signed-off-by: Timo Savola <tsavola@movial.fi> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-04-08 19:47:55 -07:00
Josef 'Jeff' Sipek	ee9b6d61a2	[PATCH] Mark struct super_operations const This patch is inspired by Arjan's "Patch series to mark struct file_operations and struct inode_operations const". Compile tested with gcc & sparse. Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:47 -08:00
Arjan van de Ven	754661f143	[PATCH] mark struct inode_operations const 1 Many struct inode_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:46 -08:00
Andrew Morton	fc0ecff698	[PATCH] remove invalidate_inode_pages() Convert all calls to invalidate_inode_pages() into open-coded calls to invalidate_mapping_pages(). Leave the invalidate_inode_pages() wrapper in place for now, marked as deprecated. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:31 -08:00
Miklos Szeredi	ff79544754	[PATCH] fuse: fix bug in control filesystem mount The BUG in fuse_ctl_add_dentry() could be triggered if the control filesystem was unmounted and mounted again while one or more fuse filesystems were present. The fix is to reset the dentry counter in fuse_ctl_kill_sb(). Bug reported by Florent Mertens. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 08:26:45 -08:00
Miklos Szeredi	9280f6822c	[PATCH] fuse: remove clear_page_dirty() call The use by FUSE was just a remnant of an optimization from the time when writable mappings were supported. Now FUSE never actually allows the creation of dirty pages, so this invocation of clear_page_dirty() is effectively a no-op. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-21 09:25:08 -08:00
Josef Sipek	7706a9d618	[PATCH] struct path: convert fuse Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:45 -08:00
Miklos Szeredi	875d95ec9e	[PATCH] fuse: fix compile without CONFIG_BLOCK Randy Dunlap wote: > Should FUSE depend on BLOCK? Without that and with BLOCK=n, I get: > > inode.c:(.text+0x3acc5): undefined reference to `sb_set_blocksize' > inode.c:(.text+0x3a393): undefined reference to `get_sb_bdev' > fs/built-in.o:(.data+0xd718): undefined reference to `kill_block_super Most fuse filesystems work fine without block device support, so I think a better solution is to disable the 'fuseblk' filesystem type if BLOCK=n. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:32 -08:00
Miklos Szeredi	0ec7ca41f6	[PATCH] fuse: add DESTROY operation Add a DESTROY operation for block device based filesystems. With the help of this operation, such a filesystem can flush dirty data to the device synchronously before the umount returns. This is needed in situations where the filesystem is assumed to be clean immediately after unmount (e.g. ejecting removable media). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:32 -08:00
Miklos Szeredi	b2d2272fae	[PATCH] fuse: add bmap support Add support for the BMAP operation for block device based filesystems. This is needed to support swap-files and lilo. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:32 -08:00
Miklos Szeredi	d809161402	[PATCH] fuse: add blksize option Add 'blksize' option for block device based filesystems. During initialization this is used to set the block size on the device and the super block. The default block size is 512bytes. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:31 -08:00
Miklos Szeredi	d6392f873f	[PATCH] fuse: add support for block device based filesystems I never intended this, but people started using fuse to implement block device based "real" filesystems (ntfs-3g, zfs). The following four patches add better support for these kinds of filesystems. Unlike "normal" fuse filesystems, using this feature should require superuser privileges (enforced by the fusermount utility). Thanks to Szabolcs Szakacsits for the input and testing. This patch adds a 'fuseblk' filesystem type, which is only different from the 'fuse' filesystem type in how the 'dev_name' mount argument is interpreted. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:31 -08:00
Miklos Szeredi	bdcf250804	[PATCH] fuse: minor cleanup in fuse_dentry_revalidate Remove unneeded code from fuse_dentry_revalidate(). This made some sense while the validity time could wrap around, but now it's a very obvious no-op. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:31 -08:00
Christoph Lameter	e18b890bb0	[PATCH] slab: remove kmem_cache_t Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name ".c" -o -name ".h"\|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:25 -08:00
Christoph Lameter	e94b176609	[PATCH] slab: remove SLAB_KERNEL SLAB_KERNEL is an alias of GFP_KERNEL. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:24 -08:00
Miklos Szeredi	2d51013ed2	[PATCH] fuse: fix Oops in lookup Fix bug in certain error paths of lookup routines. The request object was reused for sending FORGET, which is illegal. This bug could cause an Oops in 2.6.18. In earlier versions it might silently corrupt memory, but this is very unlikely. These error paths are never triggered by libfuse, so this wasn't noticed even with the 2.6.18 kernel, only with a filesystem using the raw kernel interface. Thanks to Russ Cox for the bug report and test filesystem. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-11-25 13:28:33 -08:00
OGAWA Hirofumi	2e990021bf	[PATCH] fuse: ->readpages() cleanup This just ignore the remaining pages. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Steven French <sfrench@us.ibm.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-11-03 12:27:57 -08:00
Miklos Szeredi	e956edd052	[PATCH] fuse: fix dereferencing dentry parent There's no locking for ->d_revalidate, so fuse_dentry_revalidate() should use dget_parent() instead of simply dereferencing ->d_parent. Due to topology changes in the directory tree the parent could become negative or be destroyed while being used. There hasn't been any reports about this yet. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-17 08:18:45 -07:00
Miklos Szeredi	d2a85164aa	[PATCH] fuse: fix handling of moved directory Fuse considered it an error (EIO) if lookup returned a directory inode, to which a dentry already refered. This is because directory aliases are not allowed. But in a network filesystem this could happen legitimately, if a directory is moved on a remote client. This patch attempts to relax the restriction by trying to first evict the offending alias from the cache. If this fails, it still returns an error (EBUSY). A rarer situation is if an mkdir races with an indenpendent lookup, which finds the newly created directory already moved. In this situation the mkdir should return success, but that would be incorrect, since the dentry cannot be instantiated, so return EBUSY. Previously checking for a directory alias and instantiation of the dentry weren't done atomically in lookup/mkdir, hence two such calls racing with each other could create aliased directories. To prevent this introduce a new per-connection mutex: fuse_conn->inst_mutex, which is taken for instantiations with a directory inode. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-17 08:18:45 -07:00
Miklos Szeredi	265126ba9e	[PATCH] fuse: fix spurious BUG Fix a spurious BUG in an unlikely race, where at least three parallel lookups return the same inode, but with different file type. This has not yet been observed in real life. Allowing unlimited retries could delay fuse_iget() indefinitely, but this is really for the broken userspace filesystem to worry about. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-17 08:18:45 -07:00
Miklos Szeredi	8da5ff23ce	[PATCH] fuse: locking fix for nlookup An inode could be returned by independent parallel lookups, in this case an update of the lookup counter could be lost resulting in a memory leak in userspace. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-17 08:18:45 -07:00
Miklos Szeredi	9ffbb91623	[PATCH] fuse: fix hang on SMP Fuse didn't always call i_size_write() with i_mutex held which caused rare hangs on SMP/32bit. This bug has been present since fuse-2.2, well before being merged into mainline. The simplest solution is to protect i_size_write() with the per-connection spinlock. Using i_mutex for this purpose would require some restructuring of the code and I'm not even sure it's always safe to acquire i_mutex in all places i_size needs to be set. Since most of vmtruncate is already duplicated for other reasons, duplicate the remaining part as well, making all i_size_write() calls internal to fuse. Using i_size_write() was unnecessary in fuse_init_inode(), since this function is only called on a newly created locked inode. Reported by a few people over the years, but special thanks to Dana Henriksen who was persistent enough in helping me debug it. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-17 08:18:45 -07:00
Dave Hansen	ce71ec3684	[PATCH] r/o bind mounts: monitor zeroing of i_nlink Some filesystems, instead of simply decrementing i_nlink, simply zero it during an unlink operation. We need to catch these in addition to the decrement operations. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-01 00:39:30 -07:00
Dave Hansen	d8c76e6f45	[PATCH] r/o bind mount prepwork: inc_nlink() helper This is mostly included for parity with dec_nlink(), where we will have some more hooks. This one should stay pretty darn straightforward for now. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-01 00:39:30 -07:00
Badari Pulavarty	543ade1fc9	[PATCH] Streamline generic_file_* interfaces and filemap cleanups This patch cleans up generic_file__read/write() interfaces. Christoph Hellwig gave me the idea for this clean ups. In a nutshell, all filesystems should set .aio_read/.aio_write methods and use do_sync_read/ do_sync_write() as their .read/.write methods. This allows us to cleanup all variants of generic_file_ routines. Final available interfaces: generic_file_aio_read() - read handler generic_file_aio_write() - write handler generic_file_aio_write_nolock() - no lock write handler __generic_file_aio_write_nolock() - internal worker routine Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-01 00:39:28 -07:00
Badari Pulavarty	ee0b3e671b	[PATCH] Remove readv/writev methods and use aio_read/aio_write instead This patch removes readv() and writev() methods and replaces them with aio_read()/aio_write() methods. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-01 00:39:28 -07:00
Miklos Szeredi	650a898342	[PATCH] vfs: define new lookup flag for chdir In the "operation does permission checking" model used by fuse, chdir permission is not checked, since there's no chdir method. For this case set a lookup flag, which will be passed to ->permission(), so fuse can distinguish it from permission checks for other operations. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-29 09:18:08 -07:00
Miklos Szeredi	5b35e8e58a	[PATCH] fuse: use dentry in statfs Some filesystems may want to report different values depending on the path within the filesystem, i.e. one mount is actually several filesystems. This can be the case for a network filesystem exported by an unprivileged server (e.g. sshfs). This is now possible, thanks to David Howells "VFS: Permit filesystem to perform statfs with a known root dentry" patch. This change is backward compatible, so no need to change interface version. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-29 09:18:08 -07:00
Josh Triplett	105f4d7a81	[PATCH] fuse: add lock annotations to request_end and fuse_read_interrupt request_end and fuse_read_interrupt release fc->lock. Add lock annotations to these two functions so that sparse can check callers for lock pairing, and so that sparse will not complain about these functions since they intentionally use locks in this manner. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-29 09:18:08 -07:00
Theodore Ts'o	ba52de123d	[PATCH] inode-diet: Eliminate i_blksize from the inode structure This eliminates the i_blksize field from struct inode. Filesystems that want to provide a per-inode st_blksize can do so by providing their own getattr routine instead of using the generic_fillattr() function. Note that some filesystems were providing pretty much random (and incorrect) values for i_blksize. [bunk@stusta.de: cleanup] [akpm@osdl.org: generic_fillattr() fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-27 08:26:18 -07:00
Theodore Ts'o	8e18e2941c	[PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_private The following patches reduce the size of the VFS inode structure by 28 bytes on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction in the inode size on a UP kernel that is configured in a production mode (i.e., with no spinlock or other debugging functions enabled; if you want to save memory taken up by in-core inodes, the first thing you should do is disable the debugging options; they are responsible for a huge amount of bloat in the VFS inode structure). This patch: The filesystem or device-specific pointer in the inode is inside a union, which is pretty pointless given that all 30+ users of this field have been using the void pointer. Get rid of the union and rename it to i_private, with a comment to explain who is allowed to use the void pointer. This is just a cleanup, but it allows us to reuse the union 'u' for something something where the union will actually be used. [judith@osdl.org: powerpc build fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Judith Lebzelter <judith@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-27 08:26:17 -07:00
Alexander Zarochentsev	1d7ea7324a	[PATCH] fuse: fix error case in fuse_readpages Don't let fuse_readpages leave the @pages list not empty when exiting on error. [akpm@osdl.org: kernel-doc fixes] Signed-off-by: Alexander Zarochentsev <zam@namesys.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-08-14 12:54:29 -07:00
Miklos Szeredi	873302c71c	[PATCH] fuse: fix typo Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:43 -07:00
Miklos Szeredi	0a0898cf41	[PATCH] fuse: use jiffies_64 It is entirely possible (though rare) that jiffies half-wraps around, while a dentry/inode remains in the cache. This could mean that the dentry/inode is not invalidated for another half wraparound-time. To get around this problem, use 64-bit jiffies. The only problem with this is that dentry->d_time is 32 bits on 32-bit archs. So use d_fsdata as the high 32 bits. This is an ugly hack, but far simpler, than having to allocate private data just for this purpose. Since 64-bit jiffies can be assumed never to wrap around, simple comparison can be used, and a zero time value can represent "invalid". Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:43 -07:00
Miklos Szeredi	685d16ddb0	[PATCH] fuse: fix zero timeout An attribute and entry timeout of zero should mean, that the entity is invalidated immediately after the operation. Previously invalidation only happened at the next clock tick. Reported and tested by Craig Davies. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:43 -07:00
Christoph Hellwig	f5e54d6e53	[PATCH] mark address_space_operations const Same as with already do with the file operations: keep them in .rodata and prevents people from doing runtime patching. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-28 14:59:04 -07:00
Linus Torvalds	1d77062b14	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: (51 commits) nfs: remove nfs_put_link() nfs-build-fix-99 git-nfs-build-fixes Merge branch 'odirect' NFS: alloc nfs_read/write_data as direct I/O is scheduled NFS: Eliminate nfs_get_user_pages() NFS: refactor nfs_direct_free_user_pages NFS: remove user_addr, user_count, and pos from nfs_direct_req NFS: "open code" the NFS direct write rescheduler NFS: Separate functions for counting outstanding NFS direct I/Os NLM: Fix reclaim races NLM: sem to mutex conversion locks.c: add the fl_owner to nlm_compare_locks NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts NFS: Split fs/nfs/inode.c NFS: Fix typo in nfs_do_clone_mount() NFS: Fix compile errors introduced by referrals patches NFSv4: Ensure that referral mounts bind to a reserved port NFSv4: A root pathname is sent as a zero component4 NFSv4: Follow a referral ...	2006-06-25 10:54:14 -07:00
Miklos Szeredi	9c8ef5614d	[PATCH] fuse: scramble lock owner ID VFS uses current->files pointer as lock owner ID, and it wouldn't be prudent to expose this value to userspace. So scramble it with XTEA using a per connection random key, known only to the kernel. Only one direction needs to be implemented, since the ID is never sent in the reverse direction. The XTEA algorithm is implemented inline since it's simple enough to do so, and this adds less complexity than if the crypto API were used. Thanks to Jesper Juhl for the idea. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:20 -07:00
Miklos Szeredi	a4d27e75ff	[PATCH] fuse: add request interruption Add synchronous request interruption. This is needed for file locking operations which have to be interruptible. However filesystem may implement interruptibility of other operations (e.g. like NFS 'intr' mount option). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Miklos Szeredi	f9a2842e56	[PATCH] fuse: rename the interrupted flag Rename the 'interrupted' flag to 'aborted', since it indicates exactly that, and next patch will introduce an 'interrupted' flag for a Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Miklos Szeredi	33649c91a3	[PATCH] fuse: ensure FLUSH reaches userspace All POSIX locks owned by the current task are removed on close(). If the FLUSH request resulting initiated by close() fails to reach userspace, there might be locks remaining, which cannot be removed. The only reason it could fail, is if allocating the request fails. In this case use the request reserved for RELEASE, or if that is currently used by another FLUSH, wait for it to become available. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Miklos Szeredi	7142125937	[PATCH] fuse: add POSIX file locking support This patch adds POSIX file locking support to the fuse interface. This implementation doesn't keep any locking state in kernel. Unlocking on close() is handled by the FLUSH message, which now contains the lock owner id. Mandatory locking is not supported. The filesystem may enfoce mandatory locking in userspace if needed. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Miklos Szeredi	bafa96541b	[PATCH] fuse: add control filesystem Add a control filesystem to fuse, replacing the attributes currently exported through sysfs. An empty directory '/sys/fs/fuse/connections' is still created in sysfs, and mounting the control filesystem here provides backward compatibility. Advantages of the control filesystem over the previous solution: - allows the object directory and the attributes to be owned by the filesystem owner, hence letting unpriviled users abort the filesystem connection - does not suffer from module unload race [akpm@osdl.org: fix this fs for recent dhowells depredations] [akpm@osdl.org: fix 64-bit printk warnings] Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Miklos Szeredi	51eb01e735	[PATCH] fuse: no backgrounding on interrupt Don't put requests into the background when a fatal interrupt occurs while the request is in userspace. This removes a major wart from the implementation. Backgrounding of requests was introduced to allow breaking of deadlocks. However now the same can be achieved by aborting the filesystem through the 'abort' sysfs attribute. This is a change in the interface, but should not cause problems, since these kinds of deadlocks never happen during normal operation. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Trond Myklebust	816724e65c	Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ Conflicts: fs/nfs/inode.c fs/super.c Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch 'VFS: Permit filesystem to override root dentry on mount'	2006-06-24 13:07:53 -04:00
Miklos Szeredi	75e1fcc0b1	[PATCH] vfs: add lock owner argument to flush operation Pass the POSIX lock owner ID to the flush operation. This is useful for filesystems which don't want to store any locking state in inode->i_flock but want to handle locking/unlocking POSIX locks internally. FUSE is one such filesystem but I think it possible that some network filesystems would need this also. Also add a flag to indicate that a POSIX locking request was generated by close(), so filesystems using the above feature won't send an extra locking request in this case. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:43:02 -07:00
David Howells	726c334223	[PATCH] VFS: Permit filesystem to perform statfs with a known root dentry Give the statfs superblock operation a dentry pointer rather than a superblock pointer. This complements the get_sb() patch. That reduced the significance of sb->s_root, allowing NFS to place a fake root there. However, NFS does require a dentry to use as a target for the statfs operation. This permits the root in the vfsmount to be used instead. linux/mount.h has been added where necessary to make allyesconfig build successfully. Interest has also been expressed for use with the FUSE and XFS filesystems. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:42:45 -07:00
David Howells	454e2398be	[PATCH] VFS: Permit filesystem to override root dentry on mount Extend the get_sb() filesystem operation to take an extra argument that permits the VFS to pass in the target vfsmount that defines the mountpoint. The filesystem is then required to manually set the superblock and root dentry pointers. For most filesystems, this should be done with simple_set_mnt() which will set the superblock pointer and then set the root dentry to the superblock's s_root (as per the old default behaviour). The get_sb() op now returns an integer as there's now no need to return the superblock pointer. This patch permits a superblock to be implicitly shared amongst several mount points, such as can be done with NFS to avoid potential inode aliasing. In such a case, simple_set_mnt() would not be called, and instead the mnt_root and mnt_sb would be set directly. The patch also makes the following changes: () the get_sb_() convenience functions in the core kernel now take a vfsmount pointer argument and return an integer, so most filesystems have to change very little. () If one of the convenience function is not used, then get_sb() should normally call simple_set_mnt() to instantiate the vfsmount. This will always return 0, and so can be tail-called from get_sb(). () generic_shutdown_super() now calls shrink_dcache_sb() to clean up the dcache upon superblock destruction rather than shrink_dcache_anon(). This is required because the superblock may now have multiple trees that aren't actually bound to s_root, but that still need to be cleaned up. The currently called functions assume that the whole tree is rooted at s_root, and that anonymous dentries are not the roots of trees which results in dentries being left unculled. However, with the way NFS superblock sharing are currently set to be implemented, these assumptions are violated: the root of the filesystem is simply a dummy dentry and inode (the real inode for '/' may well be inaccessible), and all the vfsmounts are rooted on anonymous[] dentries with child trees. [] Anonymous until discovered from another tree. () The documentation has been adjusted, including the additional bit of changing ext2_ into foo_* in the documentation. [akpm@osdl.org: convert ipath_fs, do other stuff] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:42:45 -07:00
Trond Myklebust	8b512d9a88	VFS: Remove dependency of ->umount_begin() call on MNT_FORCE Allow filesystems to decide to perform pre-umount processing whether or not MNT_FORCE is set. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:18 -04:00
Miklos Szeredi	8aa09a50b5	[fuse] fix race between checking and setting file->private_data BKL does not protect against races if the task may sleep between checking and setting a value. So move checking of file->private_data near to setting it in fuse_fill_super(). Found by Al Viro. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-26 10:49:16 +02:00
Miklos Szeredi	6dbbcb1205	[fuse] fix deadlock between fuse_put_super() and request_end(), try #2 A deadlock was possible, when the last reference to the superblock was held due to a background request containing a file reference. Releasing the file would release the vfsmount which in turn would release the superblock. Since sbput_sem is held during the fput() and fuse_put_super() tries to acquire this same semaphore, a deadlock results. The solution is to move the fput() outside the region protected by sbput_sem. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-26 10:49:06 +02:00
Miklos Szeredi	5a5fb1ea74	Revert "[fuse] fix deadlock between fuse_put_super() and request_end()" This reverts `73ce8355c2` commit. It was wrong, because it didn't take into account the requirement, that iput() for background requests must be performed synchronously with ->put_super(), otherwise active inodes may remain after unmount. The right solution is to keep the sbput_sem and perform iput() within the locked region, but move fput() outside sbput_sem. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-26 10:48:55 +02:00
Miklos Szeredi	56cf34ff07	[fuse] Direct I/O should not use fuse_reset_request It's cleaner to allocate a new request, otherwise the uid/gid/pid fields of the request won't be filled in. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:16:51 +02:00
Miklos Szeredi	4858cae4f0	[fuse] Don't init request twice Request is already initialized in fuse_request_alloc() so no need to do it again in fuse_get_req(). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:16:38 +02:00
Miklos Szeredi	9bc5dddad1	[fuse] Fix accounting the number of waiting requests Properly accounting the number of waiting requests was forgotten in "clean up request accounting" patch. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:16:09 +02:00
Miklos Szeredi	73ce8355c2	[fuse] fix deadlock between fuse_put_super() and request_end() A deadlock was possible, when the last reference to the superblock was held due to a background request containing a file reference. Releasing the file would release the vfsmount which in turn would release the superblock. Since sbput_sem is held during the fput() and fuse_put_super() tries to acquire this same semaphore, a deadlock results. The chosen soltuion is to get rid of sbput_sem, and instead use the spinlock to ensure the referenced inodes/file are released only once. Since the actual release may sleep, defer these outside the locked region, but using local variables instead of the structure members. This is a much more rubust solution. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:14:26 +02:00
Miklos Szeredi	08a53cdce6	[PATCH] fuse: account background requests The previous patch removed limiting the number of outstanding requests. This patch adds a much simpler limiting, that is also compatible with file locking operations. A task may have at most one synchronous request allocated. So these requests need not be otherwise limited. However the number of background requests (release, forget, asynchronous reads, interrupted requests) can grow indefinitely. This can be used by a malicous user to cause FUSE to allocate arbitrary amounts of unswappable kernel memory, denying service. For this reason add a limit for the number of background requests, and block allocations of new requests until the number goes bellow the limit. Also use this mechanism to block all requests until the INIT reply is received. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:49 -07:00
Miklos Szeredi	ce1d5a491f	[PATCH] fuse: clean up request accounting FUSE allocated most requests from a fixed size pool filled at mount time. However in some cases (release/forget) non-pool requests were used. File locking operations aren't well served by the request pool, since they may block indefinetly thus exhausting the pool. This patch removes the request pool and always allocates requests on demand. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:49 -07:00
Miklos Szeredi	a87046d822	[PATCH] fuse: consolidate device errors Return consistent error values for the case when the opened device file has no mount associated yet. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Miklos Szeredi	d713311464	[PATCH] fuse: use a per-mount spinlock Remove the global spinlock in favor of a per-mount one. This patch is basically find & replace. The difficult part has already been done by the previous patch. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Miklos Szeredi	0720b31597	[PATCH] fuse: simplify locking This is in preparation for removing the global spinlock in favor of a per-mount one. The only critical part is the interaction between fuse_dev_release() and fuse_fill_super(): fuse_dev_release() must see the assignment to file->private_data, otherwise it will leak the reference to fuse_conn. This is ensured by the fput() operation, which will synchronize the assignment with other CPU's that may do a final fput() soon after this. Also redundant locking is removed from fuse_fill_super(), where exclusion is already ensured by the BKL held for this function by the VFS. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Jeff Dike	e5ac1d1e70	[PATCH] fuse: add O_NONBLOCK support to FUSE device I don't like duplicating the connected and list_empty tests in fuse_dev_readv, but this seemed cleaner than adding the f_flags test to request_wait. Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Jeff Dike	385a17bfc3	[PATCH] fuse: add O_ASYNC support to FUSE device This adds asynchronous notification to FUSE - a FUSE server can request O_ASYNC on a /dev/fuse file descriptor and receive SIGIO when there is input available. One subtlety - fuse_dev_fasync, which is called when O_ASYNC is requested, does no locking, unlink the other methods. I think it's unnecessary, as the fuse_conn.fasync list is manipulated only by fasync_helper and kill_fasync, which provide their own locking. It would also be wrong to use the fuse_lock, as it's a spin lock and fasync_helper can sleep. My one concern with this is the fuse_conn going away underneath fuse_dev_fasync - sys_fcntl takes a reference on the file struct, so this seems not to be a problem. Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Miklos Szeredi	7025d9ad10	[PATCH] fuse: fix fuse_dev_poll() return value fuse_dev_poll() returned an error value instead of a poll mask. Luckily (or unluckily) -ENODEV does contain the POLLERR bit. There's also a race if filesystem is unmounted between fuse_get_conn() and spin_lock(), in which case this event will be missed by poll(). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:47 -07:00
Miklos Szeredi	d3406ffa4a	[PATCH] fuse: fix oops in fuse_send_readpages() During heavy parallel filesystem activity it was possible to Oops the kernel. The reason is that read_cache_pages() could skip pages which have already been inserted into the cache by another task. Occasionally this may result in zero pages actually being sent, while fuse_send_readpages() relies on at least one page being in the request. So check this corner case and just free the request instead of trying to send it. Reported and tested by Konstantin Isakov. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:47 -07:00

... 3 4 5 6 7 ...

509 Коммитов