Граф коммитов

434 Коммитов

Автор SHA1 Сообщение Дата
Miklos Szeredi 33649c91a3 [PATCH] fuse: ensure FLUSH reaches userspace
All POSIX locks owned by the current task are removed on close().  If the
FLUSH request resulting initiated by close() fails to reach userspace, there
might be locks remaining, which cannot be removed.

The only reason it could fail, is if allocating the request fails.  In this
case use the request reserved for RELEASE, or if that is currently used by
another FLUSH, wait for it to become available.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Miklos Szeredi 7142125937 [PATCH] fuse: add POSIX file locking support
This patch adds POSIX file locking support to the fuse interface.

This implementation doesn't keep any locking state in kernel.  Unlocking on
close() is handled by the FLUSH message, which now contains the lock owner id.

Mandatory locking is not supported.  The filesystem may enfoce mandatory
locking in userspace if needed.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Miklos Szeredi bafa96541b [PATCH] fuse: add control filesystem
Add a control filesystem to fuse, replacing the attributes currently exported
through sysfs.  An empty directory '/sys/fs/fuse/connections' is still created
in sysfs, and mounting the control filesystem here provides backward
compatibility.

Advantages of the control filesystem over the previous solution:

  - allows the object directory and the attributes to be owned by the
    filesystem owner, hence letting unpriviled users abort the
    filesystem connection

  - does not suffer from module unload race

[akpm@osdl.org: fix this fs for recent dhowells depredations]
[akpm@osdl.org: fix 64-bit printk warnings]
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Miklos Szeredi 51eb01e735 [PATCH] fuse: no backgrounding on interrupt
Don't put requests into the background when a fatal interrupt occurs while the
request is in userspace.  This removes a major wart from the implementation.

Backgrounding of requests was introduced to allow breaking of deadlocks.
However now the same can be achieved by aborting the filesystem through the
'abort' sysfs attribute.

This is a change in the interface, but should not cause problems, since these
kinds of deadlocks never happen during normal operation.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Trond Myklebust 816724e65c Merge branch 'master' of /home/trondmy/kernel/linux-2.6/
Conflicts:

	fs/nfs/inode.c
	fs/super.c

Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch
'VFS: Permit filesystem to override root dentry on mount'
2006-06-24 13:07:53 -04:00
Miklos Szeredi 75e1fcc0b1 [PATCH] vfs: add lock owner argument to flush operation
Pass the POSIX lock owner ID to the flush operation.

This is useful for filesystems which don't want to store any locking state
in inode->i_flock but want to handle locking/unlocking POSIX locks
internally.  FUSE is one such filesystem but I think it possible that some
network filesystems would need this also.

Also add a flag to indicate that a POSIX locking request was generated by
close(), so filesystems using the above feature won't send an extra locking
request in this case.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:02 -07:00
David Howells 726c334223 [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry
Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.

This complements the get_sb() patch.  That reduced the significance of
sb->s_root, allowing NFS to place a fake root there.  However, NFS does
require a dentry to use as a target for the statfs operation.  This permits
the root in the vfsmount to be used instead.

linux/mount.h has been added where necessary to make allyesconfig build
successfully.

Interest has also been expressed for use with the FUSE and XFS filesystems.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:45 -07:00
David Howells 454e2398be [PATCH] VFS: Permit filesystem to override root dentry on mount
Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.

The filesystem is then required to manually set the superblock and root dentry
pointers.  For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).

The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.

This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing.  In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.

The patch also makes the following changes:

 (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
     pointer argument and return an integer, so most filesystems have to change
     very little.

 (*) If one of the convenience function is not used, then get_sb() should
     normally call simple_set_mnt() to instantiate the vfsmount. This will
     always return 0, and so can be tail-called from get_sb().

 (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
     dcache upon superblock destruction rather than shrink_dcache_anon().

     This is required because the superblock may now have multiple trees that
     aren't actually bound to s_root, but that still need to be cleaned up. The
     currently called functions assume that the whole tree is rooted at s_root,
     and that anonymous dentries are not the roots of trees which results in
     dentries being left unculled.

     However, with the way NFS superblock sharing are currently set to be
     implemented, these assumptions are violated: the root of the filesystem is
     simply a dummy dentry and inode (the real inode for '/' may well be
     inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
     with child trees.

     [*] Anonymous until discovered from another tree.

 (*) The documentation has been adjusted, including the additional bit of
     changing ext2_* into foo_* in the documentation.

[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:45 -07:00
Trond Myklebust 8b512d9a88 VFS: Remove dependency of ->umount_begin() call on MNT_FORCE
Allow filesystems to decide to perform pre-umount processing whether or not
MNT_FORCE is set.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:18 -04:00
Miklos Szeredi 8aa09a50b5 [fuse] fix race between checking and setting file->private_data
BKL does not protect against races if the task may sleep between
checking and setting a value.  So move checking of file->private_data
near to setting it in fuse_fill_super().

Found by Al Viro.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-26 10:49:16 +02:00
Miklos Szeredi 6dbbcb1205 [fuse] fix deadlock between fuse_put_super() and request_end(), try #2
A deadlock was possible, when the last reference to the superblock was
held due to a background request containing a file reference.

Releasing the file would release the vfsmount which in turn would
release the superblock.  Since sbput_sem is held during the fput() and
fuse_put_super() tries to acquire this same semaphore, a deadlock
results.

The solution is to move the fput() outside the region protected by
sbput_sem.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-26 10:49:06 +02:00
Miklos Szeredi 5a5fb1ea74 Revert "[fuse] fix deadlock between fuse_put_super() and request_end()"
This reverts 73ce8355c2 commit.

It was wrong, because it didn't take into account the requirement,
that iput() for background requests must be performed synchronously
with ->put_super(), otherwise active inodes may remain after unmount.

The right solution is to keep the sbput_sem and perform iput() within
the locked region, but move fput() outside sbput_sem.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-26 10:48:55 +02:00
Miklos Szeredi 56cf34ff07 [fuse] Direct I/O should not use fuse_reset_request
It's cleaner to allocate a new request, otherwise the uid/gid/pid
fields of the request won't be filled in.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:16:51 +02:00
Miklos Szeredi 4858cae4f0 [fuse] Don't init request twice
Request is already initialized in fuse_request_alloc() so no need to
do it again in fuse_get_req().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:16:38 +02:00
Miklos Szeredi 9bc5dddad1 [fuse] Fix accounting the number of waiting requests
Properly accounting the number of waiting requests was forgotten in
"clean up request accounting" patch.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:16:09 +02:00
Miklos Szeredi 73ce8355c2 [fuse] fix deadlock between fuse_put_super() and request_end()
A deadlock was possible, when the last reference to the superblock was
held due to a background request containing a file reference.

Releasing the file would release the vfsmount which in turn would
release the superblock.  Since sbput_sem is held during the fput() and
fuse_put_super() tries to acquire this same semaphore, a deadlock
results.

The chosen soltuion is to get rid of sbput_sem, and instead use the
spinlock to ensure the referenced inodes/file are released only once.
Since the actual release may sleep, defer these outside the locked
region, but using local variables instead of the structure members.

This is a much more rubust solution.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:14:26 +02:00
Miklos Szeredi 08a53cdce6 [PATCH] fuse: account background requests
The previous patch removed limiting the number of outstanding requests.  This
patch adds a much simpler limiting, that is also compatible with file locking
operations.

A task may have at most one synchronous request allocated.  So these requests
need not be otherwise limited.

However the number of background requests (release, forget, asynchronous
reads, interrupted requests) can grow indefinitely.  This can be used by a
malicous user to cause FUSE to allocate arbitrary amounts of unswappable
kernel memory, denying service.

For this reason add a limit for the number of background requests, and block
allocations of new requests until the number goes bellow the limit.

Also use this mechanism to block all requests until the INIT reply is
received.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:49 -07:00
Miklos Szeredi ce1d5a491f [PATCH] fuse: clean up request accounting
FUSE allocated most requests from a fixed size pool filled at mount time.
However in some cases (release/forget) non-pool requests were used.  File
locking operations aren't well served by the request pool, since they may
block indefinetly thus exhausting the pool.

This patch removes the request pool and always allocates requests on demand.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:49 -07:00
Miklos Szeredi a87046d822 [PATCH] fuse: consolidate device errors
Return consistent error values for the case when the opened device file has no
mount associated yet.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:48 -07:00
Miklos Szeredi d713311464 [PATCH] fuse: use a per-mount spinlock
Remove the global spinlock in favor of a per-mount one.

This patch is basically find & replace.  The difficult part has already been
done by the previous patch.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:48 -07:00
Miklos Szeredi 0720b31597 [PATCH] fuse: simplify locking
This is in preparation for removing the global spinlock in favor of a
per-mount one.

The only critical part is the interaction between fuse_dev_release() and
fuse_fill_super(): fuse_dev_release() must see the assignment to
file->private_data, otherwise it will leak the reference to fuse_conn.

This is ensured by the fput() operation, which will synchronize the assignment
with other CPU's that may do a final fput() soon after this.

Also redundant locking is removed from fuse_fill_super(), where exclusion is
already ensured by the BKL held for this function by the VFS.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:48 -07:00
Jeff Dike e5ac1d1e70 [PATCH] fuse: add O_NONBLOCK support to FUSE device
I don't like duplicating the connected and list_empty tests in fuse_dev_readv,
but this seemed cleaner than adding the f_flags test to request_wait.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:48 -07:00
Jeff Dike 385a17bfc3 [PATCH] fuse: add O_ASYNC support to FUSE device
This adds asynchronous notification to FUSE - a FUSE server can request
O_ASYNC on a /dev/fuse file descriptor and receive SIGIO when there is input
available.

One subtlety - fuse_dev_fasync, which is called when O_ASYNC is requested,
does no locking, unlink the other methods.  I think it's unnecessary, as the
fuse_conn.fasync list is manipulated only by fasync_helper and kill_fasync,
which provide their own locking.  It would also be wrong to use the fuse_lock,
as it's a spin lock and fasync_helper can sleep.  My one concern with this is
the fuse_conn going away underneath fuse_dev_fasync - sys_fcntl takes a
reference on the file struct, so this seems not to be a problem.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:48 -07:00
Miklos Szeredi 7025d9ad10 [PATCH] fuse: fix fuse_dev_poll() return value
fuse_dev_poll() returned an error value instead of a poll mask.  Luckily (or
unluckily) -ENODEV does contain the POLLERR bit.

There's also a race if filesystem is unmounted between fuse_get_conn() and
spin_lock(), in which case this event will be missed by poll().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:47 -07:00
Miklos Szeredi d3406ffa4a [PATCH] fuse: fix oops in fuse_send_readpages()
During heavy parallel filesystem activity it was possible to Oops the kernel.
The reason is that read_cache_pages() could skip pages which have already been
inserted into the cache by another task.  Occasionally this may result in zero
pages actually being sent, while fuse_send_readpages() relies on at least one
page being in the request.

So check this corner case and just free the request instead of trying to send
it.

Reported and tested by Konstantin Isakov.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:47 -07:00
Arjan van de Ven 4b6f5d20b0 [PATCH] Make most file operations structs in fs/ const
This is a conversion to make the various file_operations structs in fs/
const.  Basically a regexp job, with a few manual fixups

The goal is both to increase correctness (harder to accidentally write to
shared datastructures) and reducing the false sharing of cachelines with
things that get dirty in .data (while .rodata is nicely read only and thus
cache clean)

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:06 -08:00
Miklos Szeredi 50322fe7d4 [PATCH] fuse: fix bug in negative lookup
If negative entries (nodeid == 0) were sent in reply to LOOKUP requests,
two bugs could be triggered:

- looking up a negative entry would return -EIO,

- revaildate on an entry which turned negative would send a FORGET
  request with zero nodeid, which would cause an abort() in the
  library.

The above would only happen if the 'negative_timeout=N' option was used,
otherwise lookups reply -ENOENT, which worked correctly.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-28 20:53:43 -08:00
Miklos Szeredi 77e7f250f8 [PATCH] fuse: fix bug in aborted fuse_release_end()
There's a rather theoretical case of the BUG triggering in
fuse_reset_request():

  - iget() fails because of OOM after a successful CREATE_OPEN request
  - during IO on the resulting RELEASE request the connection is aborted

Fix and add warning to fuse_reset_request().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 13:59:27 -08:00
Miklos Szeredi 7128ec2a74 [PATCH] fuse: fix request_end() vs fuse_reset_request() race
The last fix for this function in fact opened up a much more often
triggering race.

It was uncommented tricky code, that was buggy.  Add comment, make it less
tricky and fix bug.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:51 -08:00
Miklos Szeredi 9cd6845511 [PATCH] fuse: fix async read for legacy filesystems
While asynchronous reads mean a performance improvement in most cases, if
the filesystem assumed that reads are synchronous, then async reads may
degrade performance (filesystem may receive reads out of order, which can
confuse it's own readahead logic).

With sshfs a 1.5 to 4 times slowdown can be measured.

There's also a need for userspace filesystems to know whether asynchronous
reads are supported by the kernel or not.

To achive these, negotiate in the INIT request whether async reads will be
used and the maximum readahead value.  Update interface version to 7.6

If userspace uses a version earlier than 7.6, then disable async reads, and
set maximum readahead value to the maximum read size, as done in previous
versions.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:09 -08:00
Miklos Szeredi 095da6cbb6 [PATCH] fuse: fix bitfield race
Fix race in setting bitfields of fuse_conn.  Spotted by Andrew Morton.

The two fields ->connected and ->mounted were always changed with the
fuse_lock held.  But other bitfields in the same structure were changed
without the lock.  In theory this could lead to losing the assignment of
even the ones under lock.  The chosen solution is to change these two
fields to be a full unsigned type.  The other bitfields aren't "important"
enough to warrant the extra complexity of full locking or changing them to
bitops.

For all bitfields document why they are safe wrt. concurrent
assignments.

Also make the initialization of the 'num_waiting' atomic counter explicit.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:31 -08:00
Miklos Szeredi c1aa96a52e [PATCH] fuse: use asynchronous READ requests for readpages
This patch changes fuse_readpages() to send READ requests asynchronously.

This makes it possible for userspace filesystems to utilize the kernel
readahead logic instead of having to implement their own (resulting in double
caching).

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:31 -08:00
Miklos Szeredi 361b1eb55e [PATCH] fuse: READ request initialization
Add a separate function for filling in the READ request.  This will make it
possible to send asynchronous READ requests as well as synchronous ones.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:31 -08:00
Miklos Szeredi 9b9a04693f [PATCH] fuse: move INIT handling to inode.c
Now the INIT requests can be completely handled in inode.c and the
fuse_send_init() function need not be global any more.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:31 -08:00
Miklos Szeredi 64c6d8ed4c [PATCH] fuse: add asynchronous request support
Add possibility for requests to run asynchronously and call an 'end' callback
when finished.

With this, the special handling of the INIT and RELEASE requests can be
cleaned up too.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:31 -08:00
Miklos Szeredi 69a53bf267 [PATCH] fuse: add connection aborting
Add ability to abort a filesystem connection.

With the introduction of asynchronous reads, the ability to interrupt any
request is not enough to dissolve deadlocks, since now waiting for the request
completion (page unlocked) is independent of the actual request, so in a
deadlock all threads will be uninterruptible.

The solution is to make it possible to abort all requests, even those
currently undergoing I/O to/from userspace.  The natural interface for this is
'mount -f mountpoint', but that only works as long as the filesystem is
attached.  So also add an 'abort' attribute to the sysfs view of the
connection.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:30 -08:00
Miklos Szeredi 0cd5b88553 [PATCH] fuse: add number of waiting requests attribute
This patch adds the 'waiting' attribute which indicates how many filesystem
requests are currently waiting to be completed.  A non-zero value without any
filesystem activity indicates a hung or deadlocked filesystem.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:30 -08:00
Miklos Szeredi f543f253f3 [PATCH] fuse: make fuse connection a kobject
Kobjectify fuse_conn, and make it visible under /sys/fs/fuse/connections.

Lacking any natural naming, connections are numbered.

This patch doesn't add any attributes, just the infrastructure.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:30 -08:00
Miklos Szeredi 9ba7cbba10 [PATCH] fuse: extend semantics of connected flag
The ->connected flag for a fuse_conn object previously only indicated whether
the device file for this connection is currently open or not.

Change it's meaning so that it indicates whether the connection is active or
not: now either umount or device release will clear the flag.

The separate ->mounted flag is still needed for handling background requests.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:30 -08:00
Miklos Szeredi d77a1d5b61 [PATCH] fuse: introduce list for requests under I/O
Create a new list for requests in the process of being transfered to/from
userspace.  This will be needed to be able to abort all requests even those
currently under I/O

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:30 -08:00
Miklos Szeredi 83cfd49351 [PATCH] fuse: introduce unified request state
The state of request was made up of 2 bitfields (->sent and ->finished) and of
the fact that the request was on a list or not.

Unify this into a single state field.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:30 -08:00
Miklos Szeredi 6383bdaa2e [PATCH] fuse: miscellaneous cleanup
- remove some unneeded assignments

 - use kzalloc instead of kmalloc + memset

 - simplify setting sb->s_fs_info

 - in fuse_send_init() use fuse_get_request() instead of
   do_get_request() helper

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:30 -08:00
Miklos Szeredi 8bfc016d2e [PATCH] fuse: uninline some functions
Inline keyword is unnecessary in most cases.  Clean them up.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:29 -08:00
Miklos Szeredi b3bebd94bb [PATCH] fuse: handle error INIT reply
Handle the case when the INIT request is answered with an error.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:29 -08:00
Miklos Szeredi f43b155a5a [PATCH] fuse: fix request_end()
This function used the request object after decrementing its reference count
and releasing the lock.  This could in theory lead to all sorts of problems.

Fix and simplify at the same time.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:29 -08:00
Miklos Szeredi 222f1d6918 [PATCH] fuse: fuse_copy_finish() order fix
fuse_copy_finish() must be called before request_end(), since the later might
sleep, and no sleeping is allowed between fuse_copy_one() and
fuse_copy_finish() because of kmap_atomic()/kunmap_atomic() used in them.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16 23:15:29 -08:00
Jes Sorensen 1b1dcc1b57 [PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem
This patch converts the inode semaphore to a mutex. I have tested it on
XFS and compiled as much as one can consider on an ia64. Anyway your
luck with it might be different.

Modified-by: Ingo Molnar <mingo@elte.hu>

(finished the conversion)

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2006-01-09 15:59:24 -08:00
Miklos Szeredi 39ee059aff [PATCH] fuse: check file type in lookup
Previously invalid types were quietly changed to regular files, but at
revalidation the inode was changed to bad.  This was rather inconsistent
behavior.

Now check if the type is valid on initial lookup, and return -EIO if not.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:56 -08:00
Miklos Szeredi 6ad84acab9 [PATCH] fuse: ensure progress in read and write
In direct_io mode, send at least one page per reqest.  Previously it was
possible that reqests with zero data were sent, and hence the read/write
didn't make any progress, resulting in an infinite (though interruptible)
loop.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:56 -08:00
Miklos Szeredi 3ec870d524 [PATCH] fuse: make maximum write data configurable
Make the maximum size of write data configurable by the filesystem.  The
previous fixed 4096 limit only worked on architectures where the page size is
less or equal to this.  This change make writing work on other architectures
too, and also lets the filesystem receive bigger write requests in direct_io
mode.

Normal writes which go through the page cache are still limited to a page
sized chunk per request.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:56 -08:00
Miklos Szeredi 1d3d752b47 [PATCH] fuse: clean up request size limit checking
Change the way a too large request is handled.  Until now in this case the
device read returned -EINVAL and the operation returned -EIO.

Make it more flexibible by not returning -EINVAL from the read, but restarting
it instead.

Also remove the fixed limit on setxattr data and let the filesystem provide as
large a read buffer as it needs to handle the extended attribute data.

The symbolic link length is already checked by VFS to be less than PATH_MAX,
so the extra check against FUSE_SYMLINK_MAX is not needed.

The check in fuse_create_open() against FUSE_NAME_MAX is not needed, since the
dentry has already been looked up, and hence the name already checked.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:56 -08:00
Miklos Szeredi 248d86e87d [PATCH] fuse: fail file operations on bad inode
Make file operations on a bad inode fail.  This just makes things a
bit more consistent.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:55 -08:00
Miklos Szeredi 6f9f11806a [PATCH] fuse: add code documentation
Document some not-so-trivial functions.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:55 -08:00
Miklos Szeredi 8cbdf1e6f6 [PATCH] fuse: support caching negative dentries
Add support for caching negative dentries.

Up till now, ->d_revalidate() always forced a new lookup on these.  Now let
the lookup method return a zero node ID (not used for anything else) meaning a
negative entry, but with a positive cache timeout.  The old way of signaling
negative entry (replying ENOENT) still works.

Userspace should check the ABI minor version to see whether sending a zero ID
is allowed by the kernel or not.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:55 -08:00
Miklos Szeredi de5f120255 [PATCH] fuse: add frsize to statfs reply
Add 'frsize' member to the statfs reply.

I'm not sure if sending f_fsid will ever be needed, but just in case leave
some space at the end of the structure, so less compatibility mess would be
required.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:55 -08:00
Miklos Szeredi 45714d6561 [PATCH] fuse: bump interface version
Change interface version to 7.4.

Following changes will need backward compatibility support, so store the minor
version returned by userspace.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:55 -08:00
Miklos Szeredi 4633a22e7a [PATCH] fuse: clean up page offset calculation
Use page_offset() instead of doing page offset calculation by hand.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:55 -08:00
Miklos Szeredi 0aa7c6990e [PATCH] fuse: clean up fuse_lookup()
Simplify fuse_lookup() and related functions.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:54 -08:00
Miklos Szeredi 2827d0b23b [PATCH] fuse: check for invalid node ID in fuse_create_open()
Check for invalid node ID values in the new atomic create+open method.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-28 14:42:26 -08:00
Miklos Szeredi f007d5c961 [PATCH] fuse: check directory aliasing in mkdir
Check the created directory inode for aliases in the mkdir() method.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-28 14:42:26 -08:00
Miklos Szeredi befc649c22 [PATCH] FUSE: pass file handle in setattr
This patch passes the file handle supplied in iattr to userspace, in case the
->setattr() was invoked from sys_ftruncate().  This solves the permission
checking (or lack thereof) in ftruncate() for the class of filesystems served
by an unprivileged userspace process.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:42 -08:00
Miklos Szeredi fd72faac95 [PATCH] FUSE: atomic create+open
This patch adds an atomic create+open operation.  This does not yet work if
the file type changes between lookup and create+open, but solves the
permission checking problems for the separte create and open methods.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:42 -08:00
Miklos Szeredi 31d40d74b4 [PATCH] FUSE: add access call
Add a new access call, which will only be called if ->permission is invoked
from sys_access().  In all other cases permission checking is delayed until
the actual filesystem operation.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:42 -08:00
Christoph Hellwig d8ba3b7310 [PATCH] fuse: remove dead code from fuse_permission
The -EROFS check has moved up to permission() in the VFS a while ago.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:37 -08:00
Miklos Szeredi 1779381dea [PATCH] fuse: spelling fixes
Correct some typos and inconsistent use of "initialise" vs "initialize" in
comments.  Reported by Ioannis Barkas.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:24 -08:00
Miklos Szeredi f12ec44070 [PATCH] fuse: clean up dead code related to nfs exporting
Remove last remains of NFS exportability support.

The code is actually buggy (as reported by Akshat Aranya), since 'alias'
will be leaked if it's non-null and alias->d_flags has DCACHE_DISCONNECTED.

This is not an active bug, since there will never be any disconnected
dentries.  But it's better to get rid of the unnecessary complexity anyway.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:21 -08:00
Miklos Szeredi dd190d066b [PATCH] fuse: check O_DIRECT
Check O_DIRECT and return -EINVAL error in open.  dentry_open() also checks
this but only after the open method is called.  This patch optimizes away
the unnecessary upcalls in this case.

It could be a correctness issue too: if filesystem has open() with side
effect, then it should fail before doing the open, not after.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30 12:41:18 -07:00
Miklos Szeredi ee4e52719c [PATCH] fuse: check reserved node ID values
This patch checks reserved node ID values returned by lookup and creation
operations.  In case one of the reserved values is sent, return -EIO.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28 07:46:40 -07:00
Miklos Szeredi 7c352bdf04 [PATCH] FUSE: don't allow restarting of system calls
This patch removes ability to interrupt and restart operations while there
hasn't been any side-effect.

The reason: applications.  There are some apps it seems that generate
signals at a fast rate.  This means, that if the operation cannot make
enough progress between two signals, it will be restarted for ever.  This
bug actually manifested itself with 'krusader' trying to open a file for
writing under sshfs.  Thanks to Eduard Czimbalmos for the report.

The problem can be solved just by making open() uninterruptible, because in
this case it was the truncate operation that slowed down the progress.  But
it's better to solve this by simply not allowing interrupts at all (except
SIGKILL), because applications don't expect file operations to be
interruptible anyway.  As an added bonus the code is simplified somewhat.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:48 -07:00
Miklos Szeredi 8254798199 [PATCH] FUSE: add fsync operation for directories
This patch adds a new FSYNCDIR request, which is sent when fsync is called
on directories.  This operation is available in libfuse 2.3-pre1 or
greater.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:47 -07:00
Miklos Szeredi b36c31ba95 [PATCH] fuse: don't update file times
Don't change mtime/ctime/atime to local time on read/write.  Rather invalidate
file attributes, so next stat() will force a GETATTR call.  Bug reported by
Ben Grimm.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:47 -07:00
Miklos Szeredi 45323fb764 [PATCH] fuse: more flexible caching
Make data caching behavior selectable on a per-open basis instead of
per-mount.  Compatibility for the old mount options 'kernel_cache' and
'direct_io' is retained in the userspace library (version 2.4.0-pre1 or
later).

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:47 -07:00
Miklos Szeredi 04730fef1f [PATCH] fuse: transfer readdir data through device
This patch removes a long lasting "hack" in FUSE, which used a separate
channel (a file descriptor refering to a disk-file) to transfer directory
contents from userspace to the kernel.

The patch adds three new operations (OPENDIR, READDIR, RELEASEDIR), which
have semantics and implementation exactly maching the respective file
operations (OPEN, READ, RELEASE).

This simplifies the directory reading code.  Also disk space is not
necessary, which can be important in embedded systems.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:47 -07:00
Miklos Szeredi 413ef8cb30 [PATCH] FUSE - direct I/O
This patch adds support for the "direct_io" mount option of FUSE.

When this mount option is specified, the page cache is bypassed for
read and write operations.  This is useful for example, if the
filesystem doesn't know the size of files before reading them, or when
any kind of caching is harmful.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:46 -07:00
Miklos Szeredi 5a53368277 [PATCH] fuse: stricter mount option checking
Check for the presence of all mandatory mount options.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:46 -07:00
Miklos Szeredi 87729a5514 [PATCH] FUSE: tighten check for processes allowed access
This patch tightens the check for allowing processes to access non-privileged
mounts.  The rational is that the filesystem implementation can control the
behavior or get otherwise unavailable information of the filesystem user.  If
the filesystem user process has the same uid, gid, and is not suid or sgid
application, then access is safe.  Otherwise access is not allowed unless the
"allow_other" mount option is given (for which policy is controlled by the
userspace mount utility).

Thanks to everyone linux-fsdevel, especially Martin Mares who helped uncover
problems with the previous approach.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:46 -07:00
Miklos Szeredi db50b96c0f [PATCH] FUSE - readpages operation
This patch adds readpages support to FUSE.

With the help of the readpages() operation multiple reads are bundled
together and sent as a single request to userspace.  This can improve
reading performace.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:46 -07:00
Miklos Szeredi 92a8780e11 [PATCH] FUSE - extended attribute operations
This patch adds the extended attribute operations to FUSE.

The following operations are added:

 o getxattr
 o setxattr
 o listxattr
 o removexattr

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:45 -07:00
Miklos Szeredi 1e9a4ed939 [PATCH] FUSE - mount options
This patch adds miscellaneous mount options to the FUSE filesystem.

The following mount options are added:

 o default_permissions:  check permissions with generic_permission()
 o allow_other:          allow other users to access files
 o allow_root:           allow root to access files
 o kernel_cache:         don't invalidate page cache on open

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:45 -07:00
Miklos Szeredi b6aeadeda2 [PATCH] FUSE - file operations
This patch adds the file operations of FUSE.

The following operations are added:

 o open
 o flush
 o release
 o fsync
 o readpage
 o commit_write

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:45 -07:00
Miklos Szeredi 9e6268db49 [PATCH] FUSE - read-write operations
This patch adds the write filesystem operations of FUSE.

The following operations are added:

 o setattr
 o symlink
 o mknod
 o mkdir
 o create
 o unlink
 o rmdir
 o rename
 o link

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:45 -07:00
Miklos Szeredi e5e5558e92 [PATCH] FUSE - read-only operations
This patch adds the read-only filesystem operations of FUSE.

This contains the following files:

 o dir.c
    - directory, symlink and file-inode operations

The following operations are added:

 o lookup
 o getattr
 o readlink
 o follow_link
 o directory open
 o readdir
 o directory release
 o permission
 o dentry revalidate
 o statfs

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:45 -07:00
Miklos Szeredi 334f485df8 [PATCH] FUSE - device functions
This adds the FUSE device handling functions.

This contains the following files:

 o dev.c
    - fuse device operations (read, write, release, poll)
    - registers misc device
    - support for sending requests to userspace

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:44 -07:00
Miklos Szeredi d8a5ba4545 [PATCH] FUSE - core
This patch adds FUSE core.

This contains the following files:

 o inode.c
    - superblock operations (alloc_inode, destroy_inode, read_inode,
      clear_inode, put_super, show_options)
    - registers FUSE filesystem

 o fuse_i.h
    - private header file

Requirements
============

 The most important difference between orinary filesystems and FUSE is
 the fact, that the filesystem data/metadata is provided by a userspace
 process run with the privileges of the mount "owner" instead of the
 kernel, or some remote entity usually running with elevated
 privileges.

 The security implication of this is that a non-privileged user must
 not be able to use this capability to compromise the system.  Obvious
 requirements arising from this are:

  - mount owner should not be able to get elevated privileges with the
    help of the mounted filesystem

  - mount owner should not be able to induce undesired behavior in
    other users' or the super user's processes

  - mount owner should not get illegitimate access to information from
    other users' and the super user's processes

 These are currently ensured with the following constraints:

  1) mount is only allowed to directory or file which the mount owner
    can modify without limitation (write access + no sticky bit for
    directories)

  2) nosuid,nodev mount options are forced

  3) any process running with fsuid different from the owner is denied
     all access to the filesystem

 1) and 2) are ensured by the "fusermount" mount utility which is a
    setuid root application doing the actual mount operation.

 3) is ensured by a check in the permission() method in kernel

 I started thinking about doing 3) in a different way because Christoph
 H. made a big deal out of it, saying that FUSE is unacceptable into
 mainline in this form.

 The suggested use of private namespaces would be OK, but in their
 current form have many limitations that make their use impractical (as
 discussed in this thread).

 Suggested improvements that would address these limitations:

   - implement shared subtrees

   - allow a process to join an existing namespace (make namespaces
     first-class objects)

   - implement the namespace creation/joining in a PAM module

 With all that in place the check of owner against current->fsuid may
 be removed from the FUSE kernel module, without compromising the
 security requirements.

 Suid programs still interesting questions, since they get access even
 to the private namespace causing some information leak (exact
 order/timing of filesystem operations performed), giving some
 ptrace-like capabilities to unprivileged users.  BTW this problem is
 not strictly limited to the namespace approach, since suid programs
 setting fsuid and accessing users' files will succeed with the current
 approach too.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:44 -07:00