WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Christoph Hellwig	85fe4025c6	fs: do not assign default i_ino in new_inode Instead of always assigning an increasing inode number in new_inode move the call to assign it into those callers that actually need it. For now callers that need it is estimated conservatively, that is the call is added to all filesystems that do not assign an i_ino by themselves. For a few more filesystems we can avoid assigning any inode number given that they aren't user visible, and for others it could be done lazily when an inode number is actually needed, but that's left for later patches. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-25 21:26:11 -04:00
Nick Piggin	75de46b98d	fix setattr error handling in sysfs, configfs sysfs and configfs setattr functions have error cases after the generic inode's attributes have been changed. Fix consistency by changing the generic inode attributes only when it is guaranteed to succeed. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-06-04 13:27:53 -07:00
Nick Piggin	3322e79a38	fs: convert simple fs to new truncate Convert simple filesystems: ramfs, configfs, sysfs, block_dev to new truncate sequence. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-27 22:15:47 -04:00
Al Viro	d83c49f3e3	Fix the regression created by "set S_DEAD on unlink()..." commit 1) i_flags simply doesn't work for mount/unlink race prevention; we may have many links to file and rm on one of those obviously shouldn't prevent bind on top of another later on. To fix it right way we need to mark _dentry_ as unsuitable for mounting upon; new flag (DCACHE_CANT_MOUNT) is protected by d_flags and i_mutex on the inode in question. Set it (with dont_mount(dentry)) in unlink/rmdir/etc., check (with cant_mount(dentry)) in places in namespace.c that used to check for S_DEAD. Setting S_DEAD is still needed in places where we used to set it (for directories getting killed), since we rely on it for readdir/rmdir race prevention. 2) rename()/mount() protection has another bogosity - we unhash the target before we'd checked that it's not a mountpoint. Fixed. 3) ancient bogosity in pivot_root() - we locked i_mutex on the right directory, but checked S_DEAD on the different (and wrong) one. Noticed and fixed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-15 07:16:33 -04:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Al Viro	9b6e310211	Fix configfs leak Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-01-14 09:05:42 -05:00
Jens Axboe	d993831fa7	writeback: add name to backing_dev_info This enables us to track who does what and print info. Its main use is catching dirty inodes on the default_backing_dev_info, so we can fix that up. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 09:20:26 +02:00
Louis Rilling	420118caa3	configfs: Rework configfs_depend_item() locking and make lockdep happy configfs_depend_item() recursively locks all inodes mutex from configfs root to the target item, which makes lockdep unhappy. The purpose of this recursive locking is to ensure that the item tree can be safely parsed and that the target item, if found, is not about to leave. This patch reworks configfs_depend_item() locking using configfs_dirent_lock. Since configfs_dirent_lock protects all changes to the configfs_dirent tree, and protects tagging of items to be removed, this lock can be used instead of the inodes mutex lock chain. This needs that the check for dependents be done atomically with CONFIGFS_USET_DROPPING tagging. Now lockdep looks happy with configfs. [ Lifted the setting of s_type into configfs_new_dirent() to satisfy the atomic setting of CONFIGFS_USET_CREATING -- Joel ] Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-04-30 10:48:26 -07:00
Louis Rilling	e74cc06df3	configfs: Silence lockdep on mkdir() and rmdir() When attaching default groups (subdirs) of a new group (in mkdir() or in configfs_register()), configfs recursively takes inode's mutexes along the path from the parent of the new group to the default subdirs. This is needed to ensure that the VFS will not race with operations on these sub-dirs. This is safe for the following reasons: - the VFS allows one to lock first an inode and second one of its children (The lock subclasses for this pattern are respectively I_MUTEX_PARENT and I_MUTEX_CHILD); - from this rule any inode path can be recursively locked in descending order as long as it stays under a single mountpoint and does not follow symlinks. Unfortunately lockdep does not know (yet?) how to handle such recursion. I've tried to use Peter Zijlstra's lock_set_subclass() helper to upgrade i_mutexes from I_MUTEX_CHILD to I_MUTEX_PARENT when we know that we might recursively lock some of their descendant, but this usage does not seem to fit the purpose of lock_set_subclass() because it leads to several i_mutex locked with subclass I_MUTEX_PARENT by the same task. >From inside configfs it is not possible to serialize those recursive locking with a top-level one, because mkdir() and rmdir() are already called with inodes locked by the VFS. So using some mutex_lock_nest_lock() is not an option. I am proposing two solutions: 1) one that wraps recursive mutex_lock()s with lockdep_off()/lockdep_on(). 2) (as suggested earlier by Peter Zijlstra) one that puts the i_mutexes recursively locked in different classes based on their depth from the top-level config_group created. This induces an arbitrary limit (MAX_LOCK_DEPTH - 2 == 46) on the nesting of configfs default groups whenever lockdep is activated but this limit looks reasonably high. Unfortunately, this also isolates VFS operations on configfs default groups from the others and thus lowers the chances to detect locking issues. Nobody likes solution 1), which I can understand. This patch implements solution 2). However lockdep is still not happy with configfs_depend_item(). Next patch reworks the locking of configfs_depend_item() and finally makes lockdep happy. [ Note: This hides a few locking interactions with the VFS from lockdep. That was my big concern, because we like lockdep's protection. However, the current state always dumps a spurious warning. The locking is correct, so I tell people to ignore the warning and that we'll keep our eyes on the locking to make sure it stays correct. With this patch, we eliminate the warning. We do lose some of the lockdep protections, but this only means that we still have to keep our eyes on the locking. We're going to do that anyway. -- Joel ] Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-04-30 10:48:23 -07:00
Subrata Modak	3c48f23ada	configfs: Fix Trivial Warning in fs/configfs/symlink.c I observed the following build warning with fs/configfs/symlink.c: fs/configfs/symlink.c: In function 'configfs_symlink': fs/configfs/symlink.c:138: warning: 'target_item' may be used uninitialized in this function Here is a small fix for this. Cc: Patrick Mochel <mochel@osdl.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Sachin P Sant <sachinp@linux.vnet.ibm.com> Signed-Off-By: Subrata Modak <subrata@linux.vnet.ibm.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-04-21 12:59:21 -07:00
Al Viro	296c2d8663	constify dentry_operations: configfs Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-03-27 14:44:03 -04:00
Mark Fasheh	436443f0f7	Revert "configfs: Silence lockdep on mkdir(), rmdir() and configfs_depend_item()" This reverts commit `0e0333429a`. I committed this by accident - Joel and Louis are working with the lockdep maintainer to provide a better solution than just turning lockdep off. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: <Joel Becker <joel.becker@oracle.com>	2009-02-04 09:46:25 -08:00
Joel Becker	0e0333429a	configfs: Silence lockdep on mkdir(), rmdir() and configfs_depend_item() When attaching default groups (subdirs) of a new group (in mkdir() or in configfs_register()), configfs recursively takes inode's mutexes along the path from the parent of the new group to the default subdirs. This is needed to ensure that the VFS will not race with operations on these sub-dirs. This is safe for the following reasons: - the VFS allows one to lock first an inode and second one of its children (The lock subclasses for this pattern are respectively I_MUTEX_PARENT and I_MUTEX_CHILD); - from this rule any inode path can be recursively locked in descending order as long as it stays under a single mountpoint and does not follow symlinks. Unfortunately lockdep does not know (yet?) how to handle such recursion. I've tried to use Peter Zijlstra's lock_set_subclass() helper to upgrade i_mutexes from I_MUTEX_CHILD to I_MUTEX_PARENT when we know that we might recursively lock some of their descendant, but this usage does not seem to fit the purpose of lock_set_subclass() because it leads to several i_mutex locked with subclass I_MUTEX_PARENT by the same task. >From inside configfs it is not possible to serialize those recursive locking with a top-level one, because mkdir() and rmdir() are already called with inodes locked by the VFS. So using some mutex_lock_nest_lock() is not an option. I am proposing two solutions: 1) one that wraps recursive mutex_lock()s with lockdep_off()/lockdep_on(). 2) (as suggested earlier by Peter Zijlstra) one that puts the i_mutexes recursively locked in different classes based on their depth from the top-level config_group created. This induces an arbitrary limit (MAX_LOCK_DEPTH - 2 == 46) on the nesting of configfs default groups whenever lockdep is activated but this limit looks reasonably high. Unfortunately, this alos isolates VFS operations on configfs default groups from the others and thus lowers the chances to detect locking issues. This patch implements solution 1). Solution 2) looks better from lockdep's point of view, but fails with configfs_depend_item(). This needs to rework the locking scheme of configfs_depend_item() by removing the variable lock recursion depth, and I think that it's doable thanks to the configfs_dirent_lock. For now, let's stick to solution 1). Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-02-02 14:20:18 -08:00
Alexey Dobriyan	4591dabe27	fs/Kconfig: move configfs out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:15:56 +03:00
Al Viro	56ff5efad9	zero i_uid/i_gid on inode allocation ... and don't bother in callers. Don't bother with zeroing i_blocks, while we are at it - it's already been zeroed. i_mode is not worth the effort; it has no common default value. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-05 11:54:28 -05:00
Al Viro	421748ecde	[PATCH] assorted path_lookup() -> kern_path() conversions more nameidata eviction Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-23 05:12:52 -04:00
Louis Rilling	de6bf18e9c	[PATCH] configfs: Consolidate locking around configfs_detach_prep() in configfs_rmdir() It appears that configfs_rmdir() can protect configfs_detach_prep() retries with less calls to {spin,mutex}_{lock,unlock}, and a cleaner code. This patch does not change any behavior, except that it removes two useless lock/unlock pairs having nothing inside to protect and providing a useless barrier. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <Joel.Becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-08-22 11:09:02 -07:00
Joel Becker	70526b6744	[PATCH] configfs: Pin configfs subsystems separately from new config_items. configfs_mkdir() creates a new item by calling its parent's ->make_item/group() functions. Once that object is created, configfs_mkdir() calls try_module_get() on the new item's module. If it succeeds, the module owning the new item cannot be unloaded, and configfs is safe to reference the item. If the item and the subsystem it belongs to are part of the same module, the subsystem is also pinned. This is the common case. However, if the subsystem is made up of multiple modules, this may not pin the subsystem. Thus, it would be possible to unload the toplevel subsystem module while there is still a child item. Thus, we now try_module_get() the subsystem's module. This only really affects children of the toplevel subsystem group. Deeper children already have their parents pinned. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Louis Rilling	99cefda42a	[PATCH] configfs: Fix open directory making rmdir() fail When checking for user-created elements under an item to be removed by rmdir(), configfs_detach_prep() counts fake configfs_dirents created by dir_open() as user-created and fails when finding one. It is however perfectly valid to remove a directory that is open. Simply make configfs_detach_prep() skip fake configfs_dirent, like it already does for attributes, and like detach_groups() does. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Louis Rilling	2e2ce171c3	[PATCH] configfs: Lock new directory inodes before removing on cleanup after failure Once a new configfs directory is created by configfs_attach_item() or configfs_attach_group(), a failure in the remaining initialization steps leads to removing a directory which inode the VFS may have already accessed. This commit adds the necessary inode locking to safely remove configfs directories while cleaning up after a failure. As an advantage, the locking rules of populate_groups() and detach_groups() become the same: the caller must have the group's inode mutex locked. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Louis Rilling	2a109f2a41	[PATCH] configfs: Prevent userspace from creating new entries under attaching directories process 1: process 2: configfs_mkdir("A") attach_group("A") attach_item("A") d_instantiate("A") populate_groups("A") mutex_lock("A") attach_group("A/B") attach_item("A") d_instantiate("A/B") mkdir("A/B/C") do_path_lookup("A/B/C", LOOKUP_PARENT) ok lookup_create("A/B/C") mutex_lock("A/B") ok configfs_mkdir("A/B/C") ok attach_group("A/C") attach_item("A/C") d_instantiate("A/C") populate_groups("A/C") mutex_lock("A/C") attach_group("A/C/D") attach_item("A/C/D") failure mutex_unlock("A/C") detach_groups("A/C") nothing to do mkdir("A/C/E") do_path_lookup("A/C/E", LOOKUP_PARENT) ok lookup_create("A/C/E") mutex_lock("A/C") ok configfs_mkdir("A/C/E") ok detach_item("A/C") d_delete("A/C") mutex_unlock("A") detach_groups("A") mutex_lock("A/B") detach_group("A/B") detach_groups("A/B") nothing since no _default_ group detach_item("A/B") mutex_unlock("A/B") d_delete("A/B") detach_item("A") d_delete("A") Two bugs: 1/ "A/B/C" and "A/C/E" are created, but never removed while their parent are removed in the end. The same could happen with symlink() instead of mkdir(). 2/ "A" and "A/C" inodes are not locked while detach_item() is called on them, which may probably confuse VFS. This commit fixes 1/, tagging new directories with CONFIGFS_USET_CREATING before building the inode and instantiating the dentry, and validating the whole group+default groups hierarchy in a second pass by clearing CONFIGFS_USET_CREATING. mkdir(), symlink(), lookup(), and dir_open() simply return -ENOENT if called in (or linking to) a directory tagged with CONFIGFS_USET_CREATING. This does not prevent userspace from calling stat() successfuly on such directories, but this prevents userspace from adding (children to \| symlinking from/to \| read/write attributes of \| listing the contents of) not validated items. In other words, userspace will not interact with the subsystem on a new item until the new item creation completes correctly. It was first proposed to re-use CONFIGFS_USET_IN_MKDIR instead of a new flag CONFIGFS_USET_CREATING, but this generated conflicts when checking the target of a new symlink: a valid target directory in the middle of attaching a new user-created child item could be wrongly detected as being attached. 2/ is fixed by next commit. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Louis Rilling	9a73d78cda	[PATCH] configfs: Fix failing symlink() making rmdir() fail On a similar pattern as mkdir() vs rmdir(), a failing symlink() may make rmdir() fail for the symlink's parent and the symlink's target as well. failing symlink() making target's rmdir() fail: process 1: process 2: symlink("A/S" -> "B") allow_link() create_link() attach to "B" links list rmdir("B") detach_prep("B") error because of new link configfs_create_link("A", "S") error (eg -ENOMEM) failing symlink() making parent's rmdir() fail: process 1: process 2: symlink("A/D/S" -> "B") allow_link() create_link() attach to "B" links list configfs_create_link("A/D", "S") make_dirent("A/D", "S") rmdir("A") detach_prep("A") detach_prep("A/D") error because of "S" create("S") error (eg -ENOMEM) We cannot use the same solution as for mkdir() vs rmdir(), since rmdir() on the target cannot wait on the i_mutex of the new symlink's parent without risking a deadlock (with other symlink() or sys_rename()). Instead we define a global mutex protecting all configfs symlinks attachment, so that rmdir() can avoid the races above. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Louis Rilling	4768e9b18d	[PATCH] configfs: Fix symlink() to a removing item The rule for configfs symlinks is that symlinks always point to valid config_items, and prevent the target from being removed. However, configfs_symlink() only checks that it can grab a reference on the target item, without ensuring that it remains alive until the symlink is correctly attached. This patch makes configfs_symlink() fail whenever the target is being removed, using the CONFIGFS_USET_DROPPING flag set by configfs_detach_prep() and protected by configfs_dirent_lock. This patch introduces a similar (weird?) behavior as with mkdir failures making rmdir fail: if symlink() races with rmdir() of the parent directory (or its youngest user-created ancestor if parent is a default group) or rmdir() of the target directory, and then fails in configfs_create(), this can make the racing rmdir() fail despite the concerned directory having no user-created entry (resp. no symlink pointing to it or one of its default groups) in the end. This behavior is fixed in later patches. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:12 -07:00
Joel Becker	dacdd0e047	[PATCH] configfs: Include linux/err.h in linux/configfs.h We now use PTR_ERR() in the ->make_item() and ->make_group() operations. Folks including configfs.h need err.h. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:12 -07:00
Joel Becker	a6795e9ebb	configfs: Allow ->make_item() and ->make_group() to return detailed errors. The configfs operations ->make_item() and ->make_group() currently return a new item/group. A return of NULL signifies an error. Because of this, -ENOMEM is the only return code bubbled up the stack. Multiple folks have requested the ability to return specific error codes when these operations fail. This patch adds that ability by changing the ->make_item/group() ops to return ERR_PTR() values. These errors are bubbled up appropriately. NULL returns are changed to -ENOMEM for compatibility. Also updated are the in-kernel users of configfs. This is a rework of reverted commit `11c3b79218`. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-17 15:21:29 -07:00
Joel Becker	f89ab8619e	Revert "configfs: Allow ->make_item() and ->make_group() to return detailed errors." This reverts commit `11c3b79218`. The code will move to PTR_ERR(). Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-17 14:53:48 -07:00
Louis Rilling	e752065175	configfs: call drop_link() to cleanup after create_link() failure When allow_link() succeeds but create_link() fails, the subsystem is not informed of the failure. This patch fixes this by calling drop_link() on create_link() failures. Signed-off-by: Louis Rilling <Louis.Rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-14 13:57:16 -07:00
Joel Becker	11c3b79218	configfs: Allow ->make_item() and ->make_group() to return detailed errors. The configfs operations ->make_item() and ->make_group() currently return a new item/group. A return of NULL signifies an error. Because of this, -ENOMEM is the only return code bubbled up the stack. Multiple folks have requested the ability to return specific error codes when these operations fail. This patch adds that ability by changing the ->make_item/group() ops to return an int. Also updated are the in-kernel users of configfs. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-14 13:57:16 -07:00
Louis Rilling	6d8344baee	configfs: Fix failing mkdir() making racing rmdir() fail When fixing the rename() vs rmdir() deadlock, we stopped locking default groups' inodes in configfs_detach_prep(), letting racing mkdir() in default groups proceed concurrently. This enables races like below happen, which leads to a failing mkdir() making rmdir() fail, despite the group to remove having no user-created directory under it in the end. process A: process B: /* PWD=A/B */ mkdir("C") make_item("C") attach_group("C") rmdir("A") detach_prep("A") detach_prep("B") error because of "C" return -ENOTEMPTY attach_group("C/D") error (eg -ENOMEM) return -ENOMEM This patch prevents such scenarii by making rmdir() wait as long as detach_prep() fails because a racing mkdir() is in the middle of attach_group(). To achieve this, mkdir() sets a flag CONFIGFS_USET_IN_MKDIR in parent's configfs_dirent before calling attach_group(), and clears the flag once attach_group() is done. detach_prep() fails with -EAGAIN whenever the flag is hit and returns the guilty inode's mutex so that rmdir() can wait on it. Signed-off-by: Louis Rilling <Louis.Rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-14 13:57:16 -07:00
Louis Rilling	b3e76af874	configfs: Fix deadlock with racing rmdir() and rename() This patch fixes the deadlock between racing sys_rename() and configfs_rmdir(). The idea is to avoid locking i_mutexes of default groups in configfs_detach_prep(), and rely instead on the new configfs_dirent_lock to protect against configfs_dirent's linkage mutations. To ensure that an mkdir() racing with rmdir() will not create new items in a to-be-removed default group, we make configfs_new_dirent() check for the CONFIGFS_USET_DROPPING flag right before linking the new dirent, and return error if the flag is set. This makes racing mkdir()/symlink()/dir_open() fail in places where errors could already happen, resp. in (attach_item()\|attach_group())/create_link()/new_dirent(). configfs_depend() remains safe since it locks all the path from configfs root, and is thus mutually exclusive with rmdir(). An advantage of this is that now detach_groups() unconditionnaly takes the default groups i_mutex, which makes it more consistent with populate_groups(). Signed-off-by: Louis Rilling <Louis.Rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-14 13:57:16 -07:00
Louis Rilling	107ed40bd0	configfs: Make configfs_new_dirent() return error code instead of NULL This patch makes configfs_new_dirent return negative error code instead of NULL, which will be useful in the next patch to differentiate ENOMEM from ENOENT. Signed-off-by: Louis Rilling <Louis.Rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-14 13:57:16 -07:00
Louis Rilling	5301a77da2	configfs: Protect configfs_dirent s_links list mutations Symlinks to a config_item are listed under its configfs_dirent s_links, but the list mutations are not protected by any common lock. This patch uses the configfs_dirent_lock spinlock to add the necessary protection. Note: we should also protect the list_empty() test in configfs_detach_prep() but 1/ the lock should not be released immediately because nothing would prevent the list from being filled after a successful list_empty() test, making the problem tricky, 2/ this will be solved by the rmdir() vs rename() deadlock bugfix. Signed-off-by: Louis Rilling <Louis.Rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-14 13:57:16 -07:00
Louis Rilling	6f61076406	configfs: Introduce configfs_dirent_lock This patch introduces configfs_dirent_lock spinlock to protect configfs_dirent traversals against linkage mutations (add/del/move). This will allow configfs_detach_prep() to avoid locking i_mutexes. Locking rules for configfs_dirent linkage mutations are the same plus the requirement of taking configfs_dirent_lock. For configfs_dirent walking, one can either take appropriate i_mutex as before, or take configfs_dirent_lock. The spinlock could actually be a mutex, but the critical sections are either O(1) or should not be too long (default groups walking in last patch). ChangeLog: - Clarify the comment on configfs_dirent_lock usage - Move sd->s_element init before linking the new dirent - In lseek(), do not release configfs_dirent_lock before the dirent is relinked. Signed-off-by: Louis Rilling <Louis.Rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-14 13:57:15 -07:00
Harvey Harrison	8e24eea728	fs: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:54 -07:00
Miklos Szeredi	e4ad08fe64	mm: bdi: add separate writeback accounting capability Add a new BDI capability flag: BDI_CAP_NO_ACCT_WB. If this flag is set, then don't update the per-bdi writeback stats from test_set_page_writeback() and test_clear_page_writeback(). Misc cleanups: - convert bdi_cap_writeback_dirty() and friends to static inline functions - create a flag that includes all three dirty/writeback related flags, since almst all users will want to have them toghether Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Jan Blunck	1d957f9bf8	Introduce path_put() * Add path_put() functions for releasing a reference to the dentry and vfsmount of a struct path in the right order * Switch from path_release(nd) to path_put(&nd->path) * Rename dput_path() to path_put_conditional() [akpm@linux-foundation.org: fix cifs] Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Christoph Hellwig <hch@lst.de> Cc: <linux-fsdevel@vger.kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-14 21:13:33 -08:00
Jan Blunck	4ac9137858	Embed a struct path into struct nameidata instead of nd->{dentry,mnt} This is the central patch of a cleanup series. In most cases there is no good reason why someone would want to use a dentry for itself. This series reflects that fact and embeds a struct path into nameidata. Together with the other patches of this series - it enforced the correct order of getting/releasing the reference count on <dentry,vfsmount> pairs - it prepares the VFS for stacking support since it is essential to have a struct path in every place where the stack can be traversed - it reduces the overall code size: without patch series: text data bss dec hex filename 5321639 858418 715768 6895825 6938d1 vmlinux with patch series: text data bss dec hex filename 5320026 858418 715768 `6894212` 693284 vmlinux This patch: Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix cifs] [akpm@linux-foundation.org: fix smack] Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-14 21:13:33 -08:00
Joonwoo Park	116ba5d5ea	configfs: file.c fix possible recursive locking configfs_register_subsystem() with default_groups triggers recursive locking. it seems that mutex_lock_nested is needed. ============================================= [ INFO: possible recursive locking detected ] 2.6.24-rc6 #145 --------------------------------------------- swapper/1 is trying to acquire lock: (&sb->s_type->i_mutex_key#3){--..}, at: [<c40c9a9e>] configfs_add_file+0x2e/0x70 but task is already holding lock: (&sb->s_type->i_mutex_key#3){--..}, at: [<c40ca985>] configfs_register_subsystem+0x55/0x130 other info that might help us debug this: 1 lock held by swapper/1: #0: (&sb->s_type->i_mutex_key#3){--..}, at: [<c40ca985>] configfs_register_subsystem+0x55/0x130 stack backtrace: Pid: 1, comm: swapper Not tainted 2.6.24-rc6 #145 [<c40053ba>] show_trace_log_lvl+0x1a/0x30 [<c4005e82>] show_trace+0x12/0x20 [<c400687e>] dump_stack+0x6e/0x80 [<c404ec72>] __lock_acquire+0xe62/0x1120 [<c404efb2>] lock_acquire+0x82/0xa0 [<c43fda88>] mutex_lock_nested+0x98/0x2e0 [<c40c9a9e>] configfs_add_file+0x2e/0x70 [<c40c9b0c>] configfs_create_file+0x2c/0x40 [<c40ca639>] configfs_attach_item+0x139/0x220 [<c40ca734>] configfs_attach_group+0x14/0x140 [<c40ca7e9>] configfs_attach_group+0xc9/0x140 [<c40ca9f6>] configfs_register_subsystem+0xc6/0x130 [<c45c8186>] init_netconsole+0x2b6/0x300 [<c45a75f2>] kernel_init+0x142/0x320 [<c4004fb3>] kernel_thread_helper+0x7/0x14 ======================= Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2008-01-25 15:05:47 -08:00
Joonwoo Park	ba611edfe4	configfs: dir.c fix possible recursive locking configfs_register_subsystem() with default_groups triggers recursive locking. it seems that mutex_lock_nested is needed. ============================================= [ INFO: possible recursive locking detected ] 2.6.24-rc6 #141 --------------------------------------------- swapper/1 is trying to acquire lock: (&sb->s_type->i_mutex_key#3){--..}, at: [<c40ca76f>] configfs_attach_group+0x4f/0x190 but task is already holding lock: (&sb->s_type->i_mutex_key#3){--..}, at: [<c40ca9d5>] configfs_register_subsystem+0x55/0x130 other info that might help us debug this: 1 lock held by swapper/1: #0: (&sb->s_type->i_mutex_key#3){--..}, at: [<c40ca9d5>] configfs_register_subsystem+0x55/0x130 stack backtrace: Pid: 1, comm: swapper Not tainted 2.6.24-rc6 #141 [<c40053ba>] show_trace_log_lvl+0x1a/0x30 [<c4005e82>] show_trace+0x12/0x20 [<c400687e>] dump_stack+0x6e/0x80 [<c404ec72>] __lock_acquire+0xe62/0x1120 [<c404efb2>] lock_acquire+0x82/0xa0 [<c43fdad8>] mutex_lock_nested+0x98/0x2e0 [<c40ca76f>] configfs_attach_group+0x4f/0x190 [<c40caa46>] configfs_register_subsystem+0xc6/0x130 [<c45c8186>] init_netconsole+0x2b6/0x300 [<c45a75f2>] kernel_init+0x142/0x320 [<c4004fb3>] kernel_thread_helper+0x7/0x14 ======================= Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2008-01-25 15:05:47 -08:00
Greg Kroah-Hartman	197b12d679	Kobject: convert fs/* from kobject_unregister() to kobject_put() There is no need for kobject_unregister() anymore, thanks to Kay's kobject cleanup changes, so replace all instances of it with kobject_put(). Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman	0ff21e4663	kobject: convert kernel_kset to be a kobject kernel_kset does not need to be a kset, but a much simpler kobject now that we have kobj_attributes. We also rename kernel_kset to kernel_kobj to catch all users of this symbol with a build error instead of an easy-to-ignore build warning. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:24 -08:00
Greg Kroah-Hartman	bd35b93d80	kset: convert kernel_subsys to use kset_create Dynamically create the kset instead of declaring it statically. We also rename kernel_subsys to kernel_kset to catch all users of this symbol with a build error instead of an easy-to-ignore build warning. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman	3794491d0c	kobject: convert configfs to use kobject_create We don't need a kset here, a simple kobject will do just fine, so dynamically create the kobject and use it. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:11 -08:00
Greg Kroah-Hartman	3514faca19	kobject: remove struct kobj_type from struct kset We don't need a "default" ktype for a kset. We should set this explicitly every time for each kset. This change is needed so that we can make ksets dynamic, and cleans up one of the odd, undocumented assumption that the kset/kobject/ktype model has. This patch is based on a lot of help from Kay Sievers. Nasty bug in the block code was found by Dave Young <hidave.darkstar@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Dave Hansen	ce8d2cdf3d	r/o bind mounts: filesystem helpers for custom 'struct file's Why do we need r/o bind mounts? This feature allows a read-only view into a read-write filesystem. In the process of doing that, it also provides infrastructure for keeping track of the number of writers to any given mount. This has a number of uses. It allows chroots to have parts of filesystems writable. It will be useful for containers in the future because users may have root inside a container, but should not be allowed to write to somefilesystems. This also replaces patches that vserver has had out of the tree for several years. It allows security enhancement by making sure that parts of your filesystem read-only (such as when you don't trust your FTP server), when you don't want to have entire new filesystems mounted, or when you want atime selectively updated. I've been using the following script to test that the feature is working as desired. It takes a directory and makes a regular bind and a r/o bind mount of it. It then performs some normal filesystem operations on the three directories, including ones that are expected to fail, like creating a file on the r/o mount. This patch: Some filesystems forego the vfs and may_open() and create their own 'struct file's. This patch creates a couple of helper functions which can be used by these filesystems, and will provide a unified place which the r/o bind mount code may patch. Also, rename an existing, static-scope init_file() to a less generic name. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Peter Zijlstra	e0bf68ddec	mm: bdi init hooks provide BDI constructor/destructor hooks [akpm@linux-foundation.org: compile fix] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Nick Piggin	800d15a53e	implement simple fs aops Implement new aops for some of the simpler filesystems. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:42:55 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Joel Becker	631d1febab	configfs: config item dependancies. Sometimes other drivers depend on particular configfs items. For example, ocfs2 mounts depend on a heartbeat region item. If that region item is removed with rmdir(2), the ocfs2 mount must BUG or go readonly. Not happy. This provides two additional API calls: configfs_depend_item() and configfs_undepend_item(). A client driver can call configfs_depend_item() on an existing item to tell configfs that it is depended on. configfs will then return -EBUSY from rmdir(2) for that item. When the item is no longer depended on, the client driver calls configfs_undepend_item() on it. These API cannot be called underneath any configfs callbacks, as they will conflict. They can block and allocate. A client driver probably shouldn't calling them of its own gumption. Rather it should be providing an API that external subsystems call. How does this work? Imagine the ocfs2 mount process. When it mounts, it asks for a heart region item. This is done via a call into the heartbeat code. Inside the heartbeat code, the region item is looked up. Here, the heartbeat code calls configfs_depend_item(). If it succeeds, then heartbeat knows the region is safe to give to ocfs2. If it fails, it was being torn down anyway, and heartbeat can gracefully pass up an error. [ Fixed some bad whitespace in configfs.txt. --Mark ] Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:18:59 -07:00
Joel Becker	299894cc90	configfs: accessing item hierarchy during rmdir(2) Add a notification callback, ops->disconnect_notify(). It has the same prototype as ->drop_item(), but it will be called just before the item linkage is broken. This way, configfs users who want to do work while the object is still in the heirarchy have a chance. Client drivers will still need to config_item_put() in their ->drop_item(), if they implement it. They need do nothing in ->disconnect_notify(). They don't have to provide it if they don't care. But someone who wants to be notified before ci_parent is set to NULL can now be notified. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:11:01 -07:00

1 2

97 Коммитов